Skip to content

[FEATURE] [Authz] Support authorization for hive persist function call#3633

Closed
packyan wants to merge 5 commits intoapache:masterfrom
packyan:feature_add_support_for_hive_function_call_authorization
Closed

[FEATURE] [Authz] Support authorization for hive persist function call#3633
packyan wants to merge 5 commits intoapache:masterfrom
packyan:feature_add_support_for_hive_function_call_authorization

Conversation

@packyan
Copy link
Copy Markdown
Contributor

@packyan packyan commented Oct 14, 2022

Why are the changes needed?

to close #3632

The Spark Sql Authz in Kyuubi currently supports authorize create function command , drop function command, and refresh function command.

But the authentication for function usage is not implemented, anyone can use all permanent functions in hive. This behavior needs to be restricted.

Since function expression may be optimized as constants by catalyst optimization rules, the function usage authorization should be injected before the optimization phase.

So before the optimization rules running, perform the collection of permanent function information involved in the LogicalPlan and construct them as functionPrivilegeObjects.

I think it's necessary to create a class like RuleAuthorization and inject it as a post-hoc resolution rule, dedicated to the authentication of function usage privileges.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Oct 15, 2022

Codecov Report

Merging #3633 (45a1016) into master (be82aca) will decrease coverage by 0.00%.
The diff coverage is 76.00%.

@@             Coverage Diff              @@
##             master    #3633      +/-   ##
============================================
- Coverage     52.86%   52.85%   -0.01%     
  Complexity       13       13              
============================================
  Files           496      499       +3     
  Lines         27971    28008      +37     
  Branches       3857     3862       +5     
============================================
+ Hits          14787    14805      +18     
- Misses        11788    11802      +14     
- Partials       1396     1401       +5     
Impacted Files Coverage Δ
.../kyuubi/plugin/spark/authz/PrivilegesBuilder.scala 71.42% <ø> (+0.27%) ⬆️
...rk/authz/ranger/RuleApplyPermanentViewMarker.scala 100.00% <ø> (ø)
.../spark/sql/hive/HiveFunctionPrivilegeBuilder.scala 56.00% <56.00%> (ø)
...spark/authz/ranger/RuleAuthorizationProvider.scala 81.08% <83.33%> (ø)
...ugin/spark/authz/ranger/RangerSparkExtension.scala 100.00% <100.00%> (ø)
...spark/authz/ranger/RuleFunctionAuthorization.scala 100.00% <100.00%> (ø)
...spark/authz/ranger/RuleRelationAuthorization.scala 100.00% <100.00%> (ø)
.../kyuubi/server/mysql/constant/MySQLErrorCode.scala 13.84% <0.00%> (-6.16%) ⬇️
...ache/kyuubi/server/mysql/MySQLCommandHandler.scala 76.84% <0.00%> (-4.22%) ⬇️
...ache/kyuubi/server/mysql/MySQLGenericPackets.scala 76.59% <0.00%> (-2.13%) ⬇️
... and 6 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@packyan
Copy link
Copy Markdown
Contributor Author

packyan commented Oct 16, 2022

cc @bowenliang123, interested in this feature? Maybe we can collaborate on it.

@bowenliang123
Copy link
Copy Markdown
Contributor

bowenliang123 commented Oct 16, 2022

Are you suggesting checking privileges in LogicalPlan in function usage?
Sounds good to have it to diving deep to check details.
Or you could separate the whole job into several subtasks in one umbrella issue and let the community share and contribute to them.

…_authorization

# Conflicts:
#	extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/ranger/RuleAuthorization.scala
@packyan packyan changed the title [FEATURE] [Authz] WIP: Support authorization of function usage [FEATURE] [Authz] Support authorization for hive persist function call Nov 12, 2022
@packyan
Copy link
Copy Markdown
Contributor Author

packyan commented Nov 12, 2022

cc @pan3793, @bowenliang123, @yaooqinn

@bowenliang123
Copy link
Copy Markdown
Contributor

@packyan Would you like to adapt the implementation to refactored with Serde layers in Authz as in #4044 ?

@packyan
Copy link
Copy Markdown
Contributor Author

packyan commented Jan 6, 2023

@packyan Would you like to adapt the implementation to refactored with Serde layers in Authz as in #4044 ?

Yes, I would working on it with guidance from the Kyuubi community to improve :)

@bowenliang123
Copy link
Copy Markdown
Contributor

bowenliang123 commented Jan 10, 2023

Hope to see the reworking soon.

Here's some hints for the new Serde design:

  • prevent using magic values of command / function / plan 's name in source code
  • define proper spec in package org.apache.kyuubi.plugin.spark.authz.gen, and regenerate spec jsons with JsonSpecFileGenerator
  • expose new extractors via SPI in resources/META-INF/services

And also,

  • try not to rename the key classes like RuleAuthorization if it's not necessary
  • small PR is good, and try to separate changes in several PRs
  • prevent introduce new dependencies , eg. spark-hive, use string and reflection for better compatibility

@packyan
Copy link
Copy Markdown
Contributor Author

packyan commented Jan 13, 2023

Hope to see the reworking soon.

Here's some hints for the new Serde design:

  • prevent using magic values of command / function / plan 's name in source code
  • define proper spec in package org.apache.kyuubi.plugin.spark.authz.gen, and regenerate spec jsons with JsonSpecFileGenerator
  • expose new extractors via SPI in resources/META-INF/services

And also,

  • try not to rename the key classes like RuleAuthorization if it's not necessary
  • small PR is good, and try to separate changes in several PRs
  • prevent introduce new dependencies , eg. spark-hive, use string and reflection for better compatibility

Thx for your prompt, I plan to divide the whole feature into several subtasks

  1. Introduce HiveFunctionPrivilegeBuilder refactored with Serde layers
  2. Introduce RuleFunctionAuthorization for persistent function calls authorization
  3. Increase test coverage For HiveFunctionPrivilegeBuilder
  4. Increase test coverage For RuleFunctionAuthorization

@bowenliang123
Copy link
Copy Markdown
Contributor

Good, you can create an umbrella issue and subtask issues, and the following small PRs.

Also cc @yaooqinn

@packyan
Copy link
Copy Markdown
Contributor Author

packyan commented Jan 13, 2023

closed this pr, and new small PRs will be tracked in #3632

@packyan packyan closed this Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Umbrella] [Authz] Support authorization for persistent hive function calls

3 participants