Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][broker] Require authRole is proxyRole to set originalPrincipal #19455

Merged
merged 17 commits into from
Feb 14, 2023

Conversation

michaeljmarshall
Copy link
Member

@michaeljmarshall michaeljmarshall commented Feb 7, 2023

Motivation

While working on #19270, I noticed we do not set any strict rules on which roles can supply the originalPrincipal or originalAuthData fields on the ConnectCommand:

// Original principal that was verified by
// a Pulsar proxy. In this case the auth info above
// will be the auth of the proxy itself
optional string original_principal = 7;
// Original auth role and auth Method that was passed
// to the proxy. In this case the auth info above
// will be the auth of the proxy itself
optional string original_auth_data = 8;
optional string original_auth_method = 9;

This PR's goal is to prevent connections where the originalPrincipal is set with an authRole that is not a proxy role. This is an invalid state that was allowed to persist before. It was not, strictly speaking, a vulnerability because the isTopicOperationAllowed validates that both the originalPrincipal and the authRole have permission to perform any operation the client attempts.

Additionally, in the admin http service, there is a risk where a proxy that is configured with a superuser token that is not also a proxy role, the call is authorized only with the proxy's auth data, which could result in excessive permissions.

This is technically a breaking change in that upgrading existing proxies will not work if the proxyRoles is not correctly configured in the broker.conf.

Modifications

  • Update the original principal validation to require that when originalPrincipal is set, the authRole (or authenticatedPrincipal) must be in the proxyRoles set. Because we run this check after authenticating the originalAuthData, we will correctly fail calls that pass and do not pass the originalAuthData in the ServerCnx.
  • Consolidate logic for validateOriginalPrincipal in the AuthorizationService.
  • Remove static modifier from several methods. This could break extensions, though it is unlikely because these methods are all HTTP server related.

Verifying this change

An existing test is updated to cover the added logic.

Does this pull request potentially affect one of the following parts:

  • The binary protocol

This change affects the binary protocol's usage without changing the binary protocol itself.

Documentation

If we accept this change, I'll follow up with a docs PR to make sure all documentation is up to date.

Matching PR in forked repository

PR in forked repository: michaeljmarshall#25

@michaeljmarshall michaeljmarshall added area/broker doc-required Your PR changes impact docs and you will update later. release/note-required labels Feb 7, 2023
@michaeljmarshall michaeljmarshall added this to the 3.0.0 milestone Feb 7, 2023
@michaeljmarshall michaeljmarshall self-assigned this Feb 7, 2023
@nodece
Copy link
Member

nodece commented Feb 8, 2023

Good work!

This is technically a breaking change in that upgrading existing proxies will not work if the proxyRoles is not correctly configured in the broker.conf.

I think we should add a paramter in the config file, and when this paramter is enabled, we perform the strict checks.

@michaeljmarshall
Copy link
Member Author

I think we should add a paramter in the config file, and when this paramter is enabled, we perform the strict checks.

My primary concern with making this requirement configurable is that it is prone to error. For me, the justification comes here:

if (isProxyRole(role)) {
CompletableFuture<Boolean> isRoleAuthorizedFuture = allowTenantOperationAsync(
tenantName, operation, role, authData);
CompletableFuture<Boolean> isOriginalAuthorizedFuture = allowTenantOperationAsync(
tenantName, operation, originalRole, authData);
return isRoleAuthorizedFuture.thenCombine(isOriginalAuthorizedFuture,
(isRoleAuthorized, isOriginalAuthorized) -> isRoleAuthorized && isOriginalAuthorized);
} else {
return allowTenantOperationAsync(tenantName, operation, role, authData);
}

A proxy that forwards an admin call without being configured as a proxyRole will only be authorized based on the role supplied by the proxy. Since these proxyRoles are often also superUsers, this is extremely problematic and easy to misconfigure, especially because everything will "work" when the proxy's auth role is a super user. However, it will work because the proxy is over provisioned and the misconfiguration could lead to elevated permissions by the client.

@michaeljmarshall
Copy link
Member Author

I added a commit to expand the scope of this PR so that both admin and binary endpoints have the same requirements.

@nicoloboschi
Copy link
Contributor

I think we should add a paramter in the config file, and when this paramter is enabled, we perform the strict checks.

My primary concern with making this requirement configurable is that it is prone to error. For me, the justification comes here:

if (isProxyRole(role)) {
CompletableFuture<Boolean> isRoleAuthorizedFuture = allowTenantOperationAsync(
tenantName, operation, role, authData);
CompletableFuture<Boolean> isOriginalAuthorizedFuture = allowTenantOperationAsync(
tenantName, operation, originalRole, authData);
return isRoleAuthorizedFuture.thenCombine(isOriginalAuthorizedFuture,
(isRoleAuthorized, isOriginalAuthorized) -> isRoleAuthorized && isOriginalAuthorized);
} else {
return allowTenantOperationAsync(tenantName, operation, role, authData);
}

A proxy that forwards an admin call without being configured as a proxyRole will only be authorized based on the role supplied by the proxy. Since these proxyRoles are often also superUsers, this is extremely problematic and easy to misconfigure, especially because everything will "work" when the proxy's auth role is a super user. However, it will work because the proxy is over provisioned and the misconfiguration could lead to elevated permissions by the client.

@michaeljmarshall good work!
IIUC if the proxy uses a role not included in proxyRoles the originalPrincipal authorization checks are skipped completely?
so if the proxy uses a super user which is not correctly set in the broker's proxyRoles, any client connected through the proxy is considered a superuser by the broker?

@michaeljmarshall
Copy link
Member Author

/pulsarbot rerun-failure-checks

@michaeljmarshall
Copy link
Member Author

There are some test failures that seem related to incorrect resources getting copied around. I deleted some of the github actions artifacts to try to make it work.

@tuteng
Copy link
Member

tuteng commented Mar 6, 2023

This seems to be a breaking change, I would tend to think it's a security enhancement (it's not a vulnerability), wouldn't it be more appropriate to enable it by default in some future version? Now to avoid the impact on user clusters, can we introduce an option to support disable or enable it. @michaeljmarshall cc/@nodece

@michaeljmarshall
Copy link
Member Author

@tuteng - the proxy is supposed to connect using one of the proxyRoles configured in the broker.conf. This is somewhat documented here https://pulsar.apache.org/docs/2.11.x/security-authorization/#proxy-roles. If the proxy connects with something other than a proxy role when using TLS authentication, only the proxy's role will be used to verify authorization, and that will likely result in accidental elevation of privileges. In my opinion, this is not a breaking change because it is preventing a state that shouldn't have existed in the first place. However, I guess that conclusion is up for debate.

@nodece
Copy link
Member

nodece commented Mar 9, 2023

This change is always positive, but we shouldn't affect the old version because many users don't care about the proxyRoles. Once the user upgrades the Pulsar version, this user gets an error about the proxy role.

I think this change can only be released in 3.0.

If you agree, please help revert this PR in the old branch!

@michaeljmarshall
Copy link
Member Author

This change is always positive, but we shouldn't affect the old version because many users don't care about the proxyRoles. Once the user upgrades the Pulsar version, this user gets an error about the proxy role.

I do not think we should revert this change from old branches. The risk to users is that misconfiguration leads to excessive permissions, as I documented. The solution is to use dedicated authentication data for a proxy so that it has a proxy role. I will reply on the mailing list as well.

@nodece
Copy link
Member

nodece commented Mar 10, 2023

I understand that, but for the sake of the user, I think this check is optional, and we can add a switch to control this check, then we update our documentation that how to enable this switch.

Default to enable this switch in the 3.0.0, disable this switch in the 2.8, 2.9, 2.10, 2.11.

@tuteng
Copy link
Member

tuteng commented Mar 10, 2023

@tuteng - the proxy is supposed to connect using one of the proxyRoles configured in the broker.conf. This is somewhat documented here https://pulsar.apache.org/docs/2.11.x/security-authorization/#proxy-roles. If the proxy connects with something other than a proxy role when using TLS authentication, only the proxy's role will be used to verify authorization, and that will likely result in accidental elevation of privileges. In my opinion, this is not a breaking change because it is preventing a state that shouldn't have existed in the first place. However, I guess that conclusion is up for debate.

I would like to confirm if it is possible to make it optional (enabled or disabled), in many user environments it can be trusted and there is no such risk, now it is enabled by default in all major versions and cannot be disabled by the user? @michaeljmarshall

@michaeljmarshall
Copy link
Member Author

in many user environments it can be trusted and there is no such risk

Can you clarify what "it" is here? What is the purpose of authorization if the environment is trusted?

@tuteng
Copy link
Member

tuteng commented Mar 10, 2023

in many user environments it can be trusted and there is no such risk

Can you clarify what "it" is here? What is the purpose of authorization if the environment is trusted?

it means is that the client application uses the proxy's super role to connect to the cluster
In a pulsar cluster there are not only super roles, but also non-super roles, so authorization is still required, and if a client has been assigned a super role, it is of course trusted

Before this change, if a client wanted to connect to the cluster using the super role, the role had to be configured in the proxy's superUserRoles, otherwise the proxy would not be able to authenticate it, after this change, if a client connected to the cluster using the super role, the role could not appear in superUserRoles of the proxy and proxyRoles of the broker, otherwise the connect also will fail, which seems to be an opposite behavior

I think this is a security enhancement, not a vulnerability (I'm sure you agree), but now that this change has been introduced on most major branches.I understand it doesn't make sense to use the proxy's superRoles, but it works correctly, and if a user performs a minor version upgrade without understanding the context, such as upgrading a cluster from 2.10.3 to 2.10.4 (ideally there should be no breaking changes), that will result in clients not being able to successfully connect to the cluster, which will cause some failures, which have actually happened in some users' test environments

I don't think it is a problem to introduce this change in a major release (e.g. 3.0) because it will give users enough time to understand it. Minor upgrades are frequent, so I don't think it is appropriate to introduce it on a minor release and not be configurable, which can very easily lead to unpredictable failures

@michaeljmarshall
Copy link
Member Author

In a pulsar cluster there are not only super roles, but also non-super roles, so authorization is still required

Sure, and that is why this PR is necessary for the older branches. If the proxy has a super user role that is not in the proxyRoles configured by the broker and the proxy is using mTLS to authenticate with the broker, all clients going through the proxy inherit the proxy's superuser role.

Before this change, if a client wanted to connect to the cluster using the super role, the role had to be configured in the proxy's superUserRoles, otherwise the proxy would not be able to authenticate it, after this change, if a client connected to the cluster using the super role, the role could not appear in superUserRoles of the proxy and proxyRoles of the broker, otherwise the connect also will fail, which seems to be an opposite behavior

This analysis does not match my understanding of the PR. The proxy's superUserRoles has nothing to do with this change. This PR only changes the broker's logic to require an originalPrincipal to be supplied, the role must be one of the proxyRoles in the broker.conf. The core logic is here:

} else if (StringUtils.isNotBlank(originalPrincipal)
&& !(allowNonProxyPrincipalsToBeEqual && originalPrincipal.equals(authenticatedPrincipal))) {
errorMsg = "cannot specify originalPrincipal when connecting without valid proxy role.";
}

It sounds like the problem you're encountering is that your proxy's role is not in the proxyRoles list. The tests modified by this PR support my understanding of this change.

I think this is a security enhancement, not a vulnerability (I'm sure you agree)

I disagree that this is only an enhancement. This change protects operators from dangerous misconfigurations.

The only way this is not a vulnerability is if we agree that a proxy is supposed to connect with a role in the proxyRoles. It is a vulnerability if we decide that the proxyRoles list is irrelevant because it's not the user misconfiguring things but rather pulsar doing the wrong thing.

liangyepianzhou added a commit to streamnative/pulsar-archived that referenced this pull request Mar 13, 2023
hangc0276 added a commit to hangc0276/pulsar that referenced this pull request Mar 15, 2023
hangc0276 added a commit to hangc0276/pulsar that referenced this pull request Mar 15, 2023
hangc0276 added a commit to hangc0276/pulsar that referenced this pull request Mar 15, 2023
michaeljmarshall added a commit that referenced this pull request Mar 20, 2023
…ior (#19845)

Relates to: #19455 #19830

### Motivation

When I added the requirement for the proxy to use a role in the `proxyRoles` set, I didn't add a test that checked the negative case. This new test was first added in #19830 with one small difference. The goal of this test is to ensure that authorization of the client role and the original role is handled correctly.

### Modifications

* Add new test class named `AuthorizationServiceTest`. We use `pass.proxy` and `fail.proxy` as proxy roles to simulate cases where the proxy's role passes and fails authorization, which is always possible.
* Add new mock authorization provider named `MockAuthorizationProvider`. The logic is to let any role that starts with `pass` be considered authorized.

### Verifying this change

This is a new test. It simply verifies the existing behavior to prevent future regressions.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping forked test since the new tests pass locally and there are no other modifications.
michaeljmarshall added a commit that referenced this pull request Apr 5, 2023
### Motivation

In #19455, I added a requirement that only the proxy role could supply an original principal. That check is only supposed to apply when the broker has authorization enabled. However, in one case, that was not the case. This PR does a check and returns early when authorization is not enabled in the broker.

See #19830 (comment) for additional motivation.

### Modifications

* Update the `PulsarWebResource#validateSuperUserAccessAsync` to only validate when authentication and authorization are enabled in the configuration.

### Verifying this change

This is a trivial change. It'd be good to add tests, but I didn't include them here because this is a somewhat urgent fix. There was one test that broke because of this change, so there is at least some existing coverage.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: michaeljmarshall#39
michaeljmarshall added a commit that referenced this pull request Apr 5, 2023
### Motivation

In #19455, I added a requirement that only the proxy role could supply an original principal. That check is only supposed to apply when the broker has authorization enabled. However, in one case, that was not the case. This PR does a check and returns early when authorization is not enabled in the broker.

See #19830 (comment) for additional motivation.

### Modifications

* Update the `PulsarWebResource#validateSuperUserAccessAsync` to only validate when authentication and authorization are enabled in the configuration.

### Verifying this change

This is a trivial change. It'd be good to add tests, but I didn't include them here because this is a somewhat urgent fix. There was one test that broke because of this change, so there is at least some existing coverage.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: michaeljmarshall#39

(cherry picked from commit 1a6c28d)
michaeljmarshall added a commit that referenced this pull request Apr 6, 2023
In #19455, I added a requirement that only the proxy role could supply an original principal. That check is only supposed to apply when the broker has authorization enabled. However, in one case, that was not the case. This PR does a check and returns early when authorization is not enabled in the broker.

See #19830 (comment) for additional motivation.

* Update the `PulsarWebResource#validateSuperUserAccessAsync` to only validate when authentication and authorization are enabled in the configuration.

This is a trivial change. It'd be good to add tests, but I didn't include them here because this is a somewhat urgent fix. There was one test that broke because of this change, so there is at least some existing coverage.

- [x] `doc-not-needed`

PR in forked repository: michaeljmarshall#39

(cherry picked from commit 1a6c28d)
michaeljmarshall added a commit that referenced this pull request Apr 6, 2023
In #19455, I added a requirement that only the proxy role could supply an original principal. That check is only supposed to apply when the broker has authorization enabled. However, in one case, that was not the case. This PR does a check and returns early when authorization is not enabled in the broker.

See #19830 (comment) for additional motivation.

* Update the `PulsarWebResource#validateSuperUserAccessAsync` to only validate when authentication and authorization are enabled in the configuration.

This is a trivial change. It'd be good to add tests, but I didn't include them here because this is a somewhat urgent fix. There was one test that broke because of this change, so there is at least some existing coverage.

- [x] `doc-not-needed`

PR in forked repository: michaeljmarshall#39

(cherry picked from commit 1a6c28d)
(cherry picked from commit 36f0db5)
michaeljmarshall added a commit to datastax/pulsar that referenced this pull request Apr 13, 2023
…ior (apache#19845)

Relates to: apache#19455 apache#19830

### Motivation

When I added the requirement for the proxy to use a role in the `proxyRoles` set, I didn't add a test that checked the negative case. This new test was first added in apache#19830 with one small difference. The goal of this test is to ensure that authorization of the client role and the original role is handled correctly.

### Modifications

* Add new test class named `AuthorizationServiceTest`. We use `pass.proxy` and `fail.proxy` as proxy roles to simulate cases where the proxy's role passes and fails authorization, which is always possible.
* Add new mock authorization provider named `MockAuthorizationProvider`. The logic is to let any role that starts with `pass` be considered authorized.

### Verifying this change

This is a new test. It simply verifies the existing behavior to prevent future regressions.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping forked test since the new tests pass locally and there are no other modifications.

(cherry picked from commit c6de57c)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request May 11, 2023
…#19989)

In apache#19455, I added a requirement that only the proxy role could supply an original principal. That check is only supposed to apply when the broker has authorization enabled. However, in one case, that was not the case. This PR does a check and returns early when authorization is not enabled in the broker.

See apache#19830 (comment) for additional motivation.

* Update the `PulsarWebResource#validateSuperUserAccessAsync` to only validate when authentication and authorization are enabled in the configuration.

This is a trivial change. It'd be good to add tests, but I didn't include them here because this is a somewhat urgent fix. There was one test that broke because of this change, so there is at least some existing coverage.

- [x] `doc-not-needed`

PR in forked repository: michaeljmarshall#39

(cherry picked from commit 1a6c28d)
(cherry picked from commit 36f0db5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants