Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(auth): Fine grained ownership policies #7499

Merged

Conversation

skrydal
Copy link
Contributor

@skrydal skrydal commented Mar 4, 2023

This PR introduces a new optional field for DataHubPolicyInfo - resourceOwnersType. It can only be used with resourceOwners flag set to True. Currently, any owner of the resource will match policy with resourceOwners, this change introduces possibility to specify which particular "types" (i.e. TECHNICAL_OWNER, BUSINESS_OWNER) of ownership will be considered for particular policy - making it possible to, for example, grant rights to change tags of datasets to only their technical owners but not business owners. This is an actual user requirement described here: https://feature-requests.datahubproject.io/p/authorization-policy-restricted-to-_eg_-technical_owners-owners

Beside changes in the PolicyEngine and other parts of backend also UI was amended to allow for new capability to be set as show below:
Screenshot from 2023-06-07 14-35-58
Policy view pop-up also shows new capabilities:
Screenshot from 2023-06-07 14-36-24

@github-actions github-actions bot added devops PR or Issue related to DataHub backend & deployment product PR or Issue related to the DataHub UI/UX labels Mar 4, 2023
@aditya-radhakrishnan
Copy link
Contributor

Thank you for the contribution! Let me discuss this with the core DataHub team and then get back to you :)

@skrydal
Copy link
Contributor Author

skrydal commented Mar 8, 2023

Thank you for the contribution! Let me discuss this with the core DataHub team and then get back to you :)

Appreciate it, let me know if you need any input from me.

@aditya-radhakrishnan
Copy link
Contributor

Hi there, after talking with the core DataHub team, we're thinking of doing the following:

We're actually introducing the ability to have custom Ownership Types very soon (cc @pedro93), and we want to make sure this contribution covers that case as well.

Would it be okay with you if we held off on merging this in until custom Ownership Types are supported? At that point either we can work with you on accounting for that case, or we can take over getting this to the finish line.

Let me know your thoughts and thank you so much for this contribution!

@skrydal
Copy link
Contributor Author

skrydal commented Mar 16, 2023

Hi there, after talking with the core DataHub team, we're thinking of doing the following:

We're actually introducing the ability to have custom Ownership Types very soon (cc @pedro93), and we want to make sure this contribution covers that case as well.

Would it be okay with you if we held off on merging this in until custom Ownership Types are supported? At that point either we can work with you on accounting for that case, or we can take over getting this to the finish line.

Let me know your thoughts and thank you so much for this contribution!

Hello @aditya-radhakrishnan, thank you for monitoring this issue. Let's wait for the custom Ownership Types then and please let me know once the PR can be updated.

@aditya-radhakrishnan
Copy link
Contributor

Sounds good, will definitely do so!

@skrydal
Copy link
Contributor Author

skrydal commented Mar 31, 2023

Hello @aditya-radhakrishnan I wanted to ask how is the dependency (custom Ownership Types) doing? It seems it is implemented by this PR: #7623 ? Is it blocked by something?

@@ -1622,6 +1622,9 @@ private void configurePolicyResolvers(final RuntimeWiring.Builder builder) {
})).dataFetcher("resolvedRoles", new LoadableTypeBatchResolver<>(dataHubRoleType, (env) -> {
final ActorFilter filter = env.getSource();
return filter.getRoles();
})).dataFetcher("resolvedOwnershipTypes", new LoadableTypeBatchResolver<>(ownershipType, (env) -> {
final ActorFilter filter = env.getSource();
return filter.getResourceOwnersTypesUrns();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm that we will resolve the entire "OwnershipType" entity from this? It appears that we are just mapping to the URNs themselves, as opposed to the OwnershipType objects that are required in the data model

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does, how else would UI work properly? Here is query and the output screensho (please let me know if I misunderstood you)t:
Screenshot from 2023-06-21 11-51-14

@@ -8093,6 +8097,8 @@ input ActorFilterInput {
"""
resourceOwners: Boolean!

resourceOwnersTypesUrns: [String!]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Toggle the "Owners" switch
const onToggleAppliesToOwners = (value: boolean) => {
setActors({
...actors,
resourceOwners: value,
resourceOwnersTypesUrns: value ? actors.resourceOwnersTypesUrns : null,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - we can remove the "urns" suffix in all of these names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@skrydal skrydal requested a review from jjoyce0510 June 21, 2023 11:40
@skrydal
Copy link
Contributor Author

skrydal commented Jun 21, 2023

@jjoyce0510 thank you for the review. I applied requested changes and provided the screenshot with a proof that ownership types resolution works. Is there anything left?

@anshbansal anshbansal added the community-contribution PR or Issue raised by member(s) of DataHub Community label Jun 23, 2023
@skrydal
Copy link
Contributor Author

skrydal commented Jun 26, 2023

I run some performance tests with this feature using locust and it is pretty much fine (note it was done on one host), sharing here (rather limited as everything ran on the same host):

[2023-06-26 12:14:46,550] skrydal-vm/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
[2023-06-26 12:14:46,555] skrydal-vm/INFO/locust.main: Starting Locust 2.15.1
[2023-06-26 12:14:46,555] skrydal-vm/INFO/locust.main: Run time limit set to 150 seconds
[2023-06-26 12:14:46,557] skrydal-vm/INFO/locust.runners: Ramping to 100 users at a rate of 2.00 per second
[2023-06-26 12:15:35,596] skrydal-vm/INFO/locust.runners: All users spawned: {"IngestUser": 100} (100 total users)
[2023-06-26 12:17:16,408] skrydal-vm/INFO/locust.main: --run-time limit reached, stopping test
[2023-06-26 12:17:17,413] skrydal-vm/INFO/locust.main: --autoquit time reached, shutting down
[2023-06-26 12:17:17,413] skrydal-vm/INFO/locust.main: Shutting down (exit code 0)
Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s
--------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|-----------
POST     /entities?action=ingest                                                        12375     0(0.00%) |     15       3     161     13 |   82.58        0.00
--------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|-----------
         Aggregated                                                                     12375     0(0.00%) |     15       3     161     13 |   82.58        0.00

Response time percentiles (approximated)
Type     Name                                                                                  50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100% # reqs
--------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------
POST     /entities?action=ingest                                                                13     15     17     19     25     32     44     56    100    130    160  12375
--------|--------------------------------------------------------------------------------|--------|------|------|------|------|------|------|------|------|------|------|------
         Aggregated                                                                             13     15     17     19     25     32     44     56    100    130    160  12375

@jjoyce0510 are there still any concerns about the feature?

Change copy in the details modal
Copy link
Collaborator

@jjoyce0510 jjoyce0510 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to address comments! The PR is very nearly ready. There is one thing I'd missed in my previous review, specifically the comment about moving the logic for "resolving" the specific owner types for the resource from the PolicyEngine into a ResolvedResourceSpec object.

This basically allows us to lazily do the lookup in a way that is not visible to the Policy Engine, keeping the responsibility of evaluating policies and fetching the facts required to evaluate a policy separate

Once this change is addressed, we will be ready to merge this PR!

Cheers
John

@@ -8130,6 +8140,11 @@ input ActorFilterInput {
"""
resourceOwners: Boolean!

"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding these!

Urn entityUrn = UrnUtils.getUrn(resourceSpec.getResource());
EnvelopedAspect ownershipAspect;
try {
EntityResponse response = _entityClient.getV2(entityUrn.getEntityType(), entityUrn,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final comment from me - I'd really prefer to not see this logic in the Policy Engine. This can be abstracted away to the ResolvedResourceSpec and handled internally.

The policy engine should simply ask for the ownership types for the resouce and get it back without knowing how that happens

@jjoyce0510
Copy link
Collaborator

Also note that I made a small copy change in one of the files, you may need to pull-then-push your changes!

@sgomezvillamor
Copy link
Contributor

Thanks @jjoyce0510 for the review

Correct me if I’m wrong basically, you are proposing to move this logic here

EntityResponse response = _entityClient.getV2(entityUrn.getEntityType(), entityUrn,
Collections.singleton(Constants.OWNERSHIP_ASPECT_NAME), _systemAuthentication);
if (response == null || !response.getAspects().containsKey(Constants.OWNERSHIP_ASPECT_NAME)) {
return Collections.emptySet();
}
ownershipAspect = response.getAspects().get(Constants.OWNERSHIP_ASPECT_NAME);
} catch (Exception e) {
log.error("Error while retrieving ownership aspect for urn {}", entityUrn, e);
return Collections.emptySet();
}

to the ResolvedResourceSpec class

https://github.com/datahub-project/datahub/blob/3077809e37cd9e55dc57bbb2e30de94c0b7d1550/metadata-auth/auth-api/src/main/java/com/datahub/authorization/ResolvedResourceSpec.java

by adding a sort of method Set<Pair<String,String>> get OwnerTypes

That makes sense and actually this is something we already considered when starting the implementation, however we found a limitation in the current implementation of the ResolverSpecs abstraction: the return type is limited to the FieldResolver.FieldValue, which is Set<String> as you can see here

/**
* Container for storing the field value, in case we need to extend this to have more types of field values
*/
@Value
@Builder
public static class FieldValue {
Set<String> values;
}
}

So, we are kindly requesting you to release the feature as it is in this PR and consider the refactor as pending technical debt that goes beyond the scope of this PR. There are several reasons why we think such a refactor should be managed in the future as part of the generalization of the FieldValue:

  • First of all and more importantly, this is not new tech debt but existing one

  • Mixing a heavy refactor and feature development in this PR would be very noisy, hard to review and risky as it would affect many other components.

  • Overall, our contributions on Authorization resulted on being delayed for many months (custom ownership dependency) and this new refactor is just adding more delay. Also, the review of this PR has been split on different and scattered feedback loops that make the contribution experience quite frustrating. Also and unfortunately, the last review was got yesterday and the author of the PR goes on vacation next week. Which means that either we hand over this to someone else in my team or we postpone again for another week, at least. So more delay again.

Thanks

@sgomezvillamor
Copy link
Contributor

sgomezvillamor commented Jul 1, 2023

Hi @jjoyce0510 , about the failing tests, there are two checks failing:

  • check datahub client jar fails with:

    Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0.
    Use '--warning-mode all' to show the individual deprecation warnings.
    See https://docs.gradle.org/6.9.2/userguide/command_line_interface.html#sec:command_line_warnings
    A problem occurred configuring root project 'datahub'.
    > Could not resolve all dependencies for configuration ':classpath'.
       > Could not determine artifacts for org.tomlj:tomlj:1.0.0
          > Could not get resource 'https://plugins.gradle.org/m2/org/tomlj/tomlj/1.0.0/tomlj-1.0.0.jar'.
             > Could not HEAD 'https://jcenter.bintray.com/org/tomlj/tomlj/1.0.0/tomlj-1.0.0.jar'.
                > Connect to jcenter.bintray.com:443 [jcenter.bintray.com/34.95.74.180] failed: connect timed out
    

    The error is not related to our updates. I don't have permissions to re-run this test, but I guess it will eventually succeed after some retries.

  • Run smoke tests (cypress1). The test failing here is the mutations/managed_ingestion.js... very unlikely to be related to our updates. 🤔 Actually, I just checked master and the same error happens there.

@jjoyce0510
Copy link
Collaborator

CI failures are unrelated. Will be ignoring those!

Thanks for the hard work on this PR, and apologies for the delay!

Cheers
John

@jjoyce0510 jjoyce0510 merged commit cfa864e into datahub-project:master Jul 3, 2023
46 of 48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-contribution PR or Issue raised by member(s) of DataHub Community devops PR or Issue related to DataHub backend & deployment product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants