Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Presto ranger integration #244

Closed
wants to merge 7 commits into from
Closed

Conversation

cryptoe
Copy link

@cryptoe cryptoe commented Feb 15, 2019

Porting project to prestoSQL with review comments addressed.

I was planning to use testing https://github.com/prestosql/presto/blob/master/presto-main/src/main/java/io/prestosql/server/testing/TestingPrestoServer.java which uses presto-main dependency and internally brings jersey 2.xx jars. As ranger client uses jersey 1.xx jars, I was not able to write test cases for the same. Need pointers to solve this unit test cases issue.
@stagraqubole @dain @martint

@cla-bot
Copy link

cla-bot bot commented Feb 15, 2019

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to cla@prestosql.io. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need the Corporate CLA signed.

@shubhamtagra
Copy link
Member

As discussed over slack, we could use product-tests to test the integration.

@dain dain changed the title Wip Presto ranger integration [WIP] Presto ranger integration Feb 20, 2019
@cla-bot cla-bot bot added the cla-signed label Mar 6, 2019
@cryptoe
Copy link
Author

cryptoe commented Mar 6, 2019

@stagraqubole Added the product tests and documentation.

return userGroups.getUserGroups(identity.getUser());
}

public String getRowLevelFilterExp(String catalogName, RangerPrestoResource resource, Identity identity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The row filter methods should be removed, they arent being used right now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column mask changes too

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acked

@Override
public SystemAccessControl create(Map<String, String> config)
{
RangerConfiguration rangerConfig = RangerConfiguration.getInstance();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should store away the pluginClassLoader in constructor and use it in create() method.

We can follow the example of ClassLoaderSafeConnectorMetadata to create a wrapper on top of RangerSystemAccessControl to use pluginClassloader for all SystemAccessControl functions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acked

@shubhamtagra
Copy link
Member

The product tests are using file based policy client which is not used outside of tests. I was hoping we could get a container image of ranger, set it up with some preset policies and use the real world RestClient of Ranger for the tests. But wait on the comments from @martint, @dain or @electrum before you put in effort on this.

@bolkedebruin
Copy link

I have created a boilerplate plugin that works with recent Ranger. It is part of the Ranger source code instead of Presto, as Ranger as the infrastructure for creating these type of plugins. Will work on integrating it into Ranger itself.

https://github.com/bolkedebruin/ranger/tree/presto_plugin

@bolkedebruin
Copy link

@Phelodas @cryptoe I have submitted the first iteration of the patch to Ranger: https://issues.apache.org/jira/browse/RANGER-2395 . We are testing it in our production environment.

Couple of points:

  • Does not use private APIs and thus doesn't require any user synchronization. It does uses standard Hadoop user integration (local system (ipa, nss), LDAP etc).
  • Builds against Ranger master
  • No row masking etc yet as Presto lacks support
  • Supports Kerberos
  • Standard auditing, to be seen if this needs to be extended

@sajjoseph
Copy link
Contributor

@cryptoe
Does this connector support Apache Ranger 1.2?

@bolkedebruin
Copy link

bolkedebruin commented Apr 14, 2019

@Phelodas @cryptoe @dain the plugin is in review with Apache Ranger. I fyou have some time to spare and would like to ensure its quality please have a look at:

@sajjoseph the one that is in review with Apache Ranger (note: not the one in this PR) does support recent versions of Ranger. It is applied against master however.

@sajjoseph
Copy link
Contributor

Thanks @bolkedebruin. I spent some time yesterday and got 244 working with ranger 1.2 version. I quickly checked the plugin added in ranger. I liked the brevity. I wonder how you are managing the dependency issue reported by @cryptoe in ranger plugin. Anyway, I had to dig a little with 244 PR to figure out how I can make everything work.

All the entries below might be obvious for others, but not for me.

  1. If your hive source (presto catalog) name is different than the usual "hive", you will have to edit the ranger-security.xml and ranger-audit.xml files and add appropriate changes.

For example, in ranger-security.xml, you will find the following:

     <property>
         <name>ranger.plugin.hive.policy.cache.dir</name>
         <value>/etc/ranger/hive/policycache</value>
     </property>

It should be changed to:

    <property>
        <name>ranger.plugin.newhive.policy.cache.dir</name>
        <value>/etc/ranger/newhive/policycache</value>
    </property>

if the name of the catalog was newhive and similar changes through out the xml files.

  1. User name in ranger is case sensitive. If presto's user used their id in lower case as an example, ranger will reject the request as the names won't match. Looks like there is a workaround for it in ranger - https://community.hortonworks.com/articles/145832/ranger-user-sync-issues-due-to-case-difference.html
  2. I had to disable shading of "org.apache.hadoop". There were multiple packages looking for org.apache.hadoop... based classes and they started throw exception (Class not found).

If there is interest, I can get a PR for ranger 1.2 added here.

Next step is to incorporate kerberos based authentication in the plugin (initial authentication into ranger at the time of loading the plugin. I know it supports it, but not documented at this time).

Hope it helps.

@bolkedebruin
Copy link

bolkedebruin commented Apr 14, 2019

The dependency issue is solved with a shim. That’s why is is now based on the source tree of Ranger rather than Presto as Ranger has the necessary infrastructure for this.

Furthermore, I don’t it’s a great idea to use the hive definitions for Presto and we made presto use its own, this also helps when the row filtering and column masking is used as you can than use ansi sql and no translation is required. The plugin in this Pr is also using private apis from Ranger for syncing users, which is not required. The plugin that is with Apache Ranger supports the latest release of presto.

In other words I don’t think the best place for the plugin is here (presto) but rather with Ranger.

Edit: both versions of the plugin already support Kerberos.

@kokosing kokosing added WIP and removed WIP labels May 8, 2019
@bolkedebruin
Copy link

Support for Presto has been merged into Ranger (no row level security yet, as Presto lacks support at the moment).

apache/ranger@43757e7

@sajjoseph
Copy link
Contributor

@cryptoe - I seem to have difficulty in getting the tag based policies working. For example, if we tag a certain table column and if there is a tag based policies in place, I expected the table suddenly come under the scrutiny of that policy and the defined access control checks take place.

If you can help me understand how the tag based policies will work with the current implementation, that will be great.

@jiegzhan
Copy link

jiegzhan commented Apr 6, 2020

You guys have any timelines to merge this PR? Or is this abandoned?

@bolkedebruin
Copy link

@jkegzhan it is part of Ranger 2.0. Presto > 321 require a newer version that hasnt been merged yet.

In other words it has been working for around a year already.

@bolkedebruin
Copy link

This issue should be closed

@electrum electrum closed this Apr 27, 2020
@tooptoop4
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

8 participants