[Spark] [Authz] New Authz Plan Serde Layer by yaooqinn · Pull Request #3904 · apache/kyuubi

yaooqinn · 2022-12-05T12:59:20Z

Why are the changes needed?

This PR redesigns the authorization part of the Spark authz module around a new Authz Plan Serde Layer.

Motivation

Add a general layer to describe a command, so that we can add new commands, and users can add third-party commands, simply by following the specification.
Get rid of the Spark version checks. The built-in Spark commands frequently vary from version to version, which makes them hard to maintain at both compile time and runtime, and third-party commands are hard to check by Spark version.

Data structure

Overall, we introduce 2 general basic data structures:

CommandSpec: used to describe a command
- classname: the key the read side uses to find the spec for a particular command
- pre-defined operation type
- descriptors
Descriptor: used to describe an object, such as a table, database, or query
- fieldName: the object to get
- fieldExtractor: the method used to get the object; loaded via SPI
- sub-descriptors: such as columns in a table
- etc.
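
To make the two structures above more concrete, here is a minimal sketch of how they could be modeled as Scala case classes. This is an illustration only, not the classes added by this PR: the names `SpecSketch`, `ColumnDesc`, `QueryDesc`, `opType`, `com.example.MyInsertCommand`, `MyTableExtractor`, and `MyColumnExtractor` are assumptions; only `LogicalPlanQueryExtractor` appears in the spec file quoted in this thread.

```scala
// Hypothetical sketch; field and class names are illustrative assumptions.
object SpecSketch {
  // Describes where one object lives inside a command and how to read it.
  sealed trait Descriptor {
    def fieldName: String      // field on the command that holds the object
    def fieldExtractor: String // key of the extractor that interprets the field value
  }

  // Sub-descriptor, e.g. the columns of a table.
  final case class ColumnDesc(fieldName: String, fieldExtractor: String) extends Descriptor

  final case class TableDesc(
      fieldName: String,
      fieldExtractor: String,
      columnDesc: Option[ColumnDesc] = None) extends Descriptor

  final case class QueryDesc(fieldName: String, fieldExtractor: String) extends Descriptor

  // One spec per command class; classname is the lookup key on the read side.
  final case class CommandSpec(
      classname: String,
      opType: String,
      tableDescs: Seq[TableDesc] = Nil,
      queryDescs: Seq[QueryDesc] = Nil)

  // A made-up third-party command described purely as data.
  val exampleSpec: CommandSpec = CommandSpec(
    classname = "com.example.MyInsertCommand",
    opType = "QUERY",
    tableDescs = Seq(
      TableDesc("table", "MyTableExtractor", Some(ColumnDesc("columns", "MyColumnExtractor")))),
    queryDescs = Seq(QueryDesc("query", "LogicalPlanQueryExtractor")))
}
```

Because a spec refers to extractors by name rather than by code, a value like `exampleSpec` can be serialized to a configuration file and read back without compiling against a particular Spark version.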

SPI

Extractor: implementations for fieldExtractor
- key: the extractor's name, which the read side uses to look it up
- func: converts the field value into one of the general object types
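
As a rough illustration of the key/func pairing, the sketch below keeps extractors in a map keyed by name. It is an assumption-laden stand-in: the registry here is built by hand, whereas a real SPI setup would presumably discover implementations with `java.util.ServiceLoader`, and `ExtractorSketch`, `Table`, `TableExtractor`, and `StringTableExtractor` are hypothetical names chosen for this example.

```scala
// Hypothetical sketch: a hand-built, name-keyed registry standing in for SPI discovery.
object ExtractorSketch {
  // General object produced by an extractor.
  final case class Table(database: Option[String], table: String)

  // func: turns a raw field value into a general object, if it can.
  trait TableExtractor extends (Any => Option[Table]) {
    // key: the name a descriptor uses to refer to this extractor.
    def key: String
  }

  // Interprets a field holding "db.table" or "table" as a Table.
  class StringTableExtractor extends TableExtractor {
    val key: String = "StringTableExtractor"

    override def apply(v: Any): Option[Table] = v match {
      case s: String =>
        s.split('.') match {
          case Array(db, t) => Some(Table(Some(db), t))
          case Array(t) if t.nonEmpty => Some(Table(None, t))
          case _ => None
        }
      case _ => None
    }
  }

  // The read side looks extractors up by the name stored in the descriptor.
  val tableExtractors: Map[String, TableExtractor] =
    Seq[TableExtractor](new StringTableExtractor).map(e => e.key -> e).toMap
}
```

With this shape, `ExtractorSketch.tableExtractors("StringTableExtractor")("db.t")` yields a `Table` with the database filled in, and a bare `"t"` yields one with no database.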

Code Path

Write code path
- automatically generated default JSON configuration files
- custom JSON configuration files for third-party commands
Read code path
- Load JSON as maps
- RuleAuthorization -> PrivilegesBuilder.build -> get the command spec from the maps -> build privileges with the retrieved spec.
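
The read path can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: JSON parsing is skipped and the spec map is built in memory, field values are read with plain Java reflection, unknown commands simply yield no privileges, and `ReadPathSketch`, `FieldDesc`, `Privilege`, `DropThing`, `StringExtractor`, and the `"DROP"` operation type are all hypothetical.

```scala
// Hypothetical sketch of: look up spec by class name -> extract fields -> build privileges.
object ReadPathSketch {
  final case class FieldDesc(fieldName: String, fieldExtractor: String)
  final case class CommandSpec(classname: String, opType: String, tableDescs: Seq[FieldDesc])
  final case class Privilege(objectName: String, opType: String)

  // A made-up command; in practice this would be a Spark logical plan node.
  final case class DropThing(name: String)

  // Extractors keyed by name, as referenced from the descriptors.
  val stringExtractor: Any => Option[String] = {
    case s: String => Some(s)
    case _ => None
  }
  val extractors: Map[String, Any => Option[String]] = Map("StringExtractor" -> stringExtractor)

  // Stand-in for the maps loaded from the JSON spec files.
  val specs: Map[String, CommandSpec] = Seq(
    CommandSpec(classOf[DropThing].getName, "DROP", Seq(FieldDesc("name", "StringExtractor")))
  ).map(s => s.classname -> s).toMap

  // Read a field by name via Java reflection.
  def getField(obj: AnyRef, name: String): Any = {
    val f = obj.getClass.getDeclaredField(name)
    f.setAccessible(true)
    f.get(obj)
  }

  // Build privileges for a command from its spec; no spec means no privileges here.
  def build(cmd: AnyRef, specs: Map[String, CommandSpec]): Seq[Privilege] =
    specs.get(cmd.getClass.getName).toList.flatMap { spec =>
      spec.tableDescs.flatMap { d =>
        extractors(d.fieldExtractor)(getField(cmd, d.fieldName))
          .map(obj => Privilege(obj, spec.opType))
          .toList
      }
    }
}
```

Nothing in `build` mentions a concrete command class, which is the point of the design: supporting a new command only requires a new spec entry and, if needed, a new extractor.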

TODOs

Add back the ArcticCommand
Add delta command
Add ways for loading custom json configuration files
Add hudi commands
etc

How was this patch tested?

Add some test cases that check the changes thoroughly, including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run tests locally before making a pull request

bowenliang123 · 2022-12-06T03:48:01Z

extensions/spark/kyuubi-spark-authz/src/main/resources/table_command_spec.json

+    "fieldExtractor" : "LogicalPlanQueryExtractor"
+  } ]
+}, {
+  "classname" : "org.apache.spark.sql.catalyst.plans.logical.MergeIntoIcebergTable",


If Iceberg's commands are put in table_command_spec.json, which is used by PrivilegesBuilder, how can this list be extended to separate Iceberg support into its own plugin, or to cover more commands?

This PR does not cover this case completely; table_command_spec_custom.json may be used later to support custom third-party commands.

OK. Is it possible to put table_command_spec.json in META-INF, as a sample for follow-up third-party command plugins to expose their command specs and extractors in the same way?

codecov-commenter · 2022-12-06T07:53:45Z

Codecov Report

Merging #3904 (379e933) into master (730fd57) will decrease coverage by 0.12%.
The diff coverage is 88.28%.

❗ Current head 379e933 differs from pull request most recent head efafcba. Consider uploading reports for the commit efafcba to get more accurate results

@@             Coverage Diff              @@
##             master    #3904      +/-   ##
============================================
- Coverage     51.91%   51.78%   -0.13%     
  Complexity       13       13              
============================================
  Files           508      521      +13     
  Lines         28996    28763     -233     
  Branches       3982     3849     -133     
============================================
- Hits          15053    14896     -157     
+ Misses        12518    12501      -17     
+ Partials       1425     1366      -59

Impacted Files	Coverage Δ
...ache/kyuubi/plugin/spark/authz/OperationType.scala	`100.00% <ø> (+34.56%)`	⬆️
.../plugin/spark/authz/ranger/RuleAuthorization.scala	`80.95% <0.00%> (-0.45%)`	⬇️
...che/kyuubi/plugin/spark/authz/serde/Function.scala	`0.00% <0.00%> (ø)`
...gin/spark/authz/serde/functionTypeExtractors.scala	`73.07% <73.07%> (ø)`
...apache/kyuubi/plugin/spark/authz/serde/Table.scala	`75.00% <75.00%> (ø)`
.../kyuubi/plugin/spark/authz/serde/CommandSpec.scala	`80.00% <80.00%> (ø)`
...e/kyuubi/plugin/spark/authz/serde/Descriptor.scala	`82.43% <82.43%> (ø)`
...ache/kyuubi/plugin/spark/authz/serde/package.scala	`85.00% <85.00%> (ø)`
.../plugin/spark/authz/serde/functionExtractors.scala	`85.71% <85.71%> (ø)`
.../plugin/spark/authz/serde/databaseExtractors.scala	`90.00% <90.00%> (ø)`
... and 24 more

yaooqinn · 2022-12-06T09:35:42Z

cc @bowenliang123 @ulysses-you @pan3793 PTAL when you have time

ulysses-you · 2022-12-06T12:09:17Z

...yuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/serde/Descriptor.scala

+    val functionExtractor = functionExtractors(fieldExtractor)
+    var function = functionExtractor(functionVal)
+    if (function.database.isEmpty) {
+      function = function.copy(database = databaseDesc.map(_.getValue(v)))


Why doesn't TableDesc.getValue need to fill in the database?

I think TableDesc.getValue fills in the database of the Table in the TableExtractor, since the database is part of the identifier of a resolved table. For functions, the database is missing in some cases, so we try to get it from databaseDesc.
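
The fallback described in this reply can be reduced to a small pure function. The sketch below is a simplified restatement of the quoted diff, assuming a function object with an optional database; `FunctionDbSketch`, `Func`, and `withDatabaseFallback` are hypothetical names, and the database lookup is passed in as a by-name parameter rather than read from a descriptor.

```scala
// Hypothetical sketch of the database fallback for functions.
object FunctionDbSketch {
  final case class Func(database: Option[String], functionName: String)

  // Keep the function's own database if present; otherwise use the one
  // obtained elsewhere (evaluated only when actually needed).
  def withDatabaseFallback(fn: Func, databaseFromDesc: => Option[String]): Func =
    if (fn.database.isEmpty) fn.copy(database = databaseFromDesc) else fn
}
```

A function that already carries a database is returned unchanged, so the fallback only affects the unqualified case.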

...rk-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/gen/JsonSpecFileGenerator.scala

bowenliang123 · 2022-12-07T07:43:58Z

+1, LGTM, and thanks for the effort. Impressively, it passed all the existing auth tests in CI within several commits.

This is great work on redesigning and modernizing the Authz core implementation. It significantly improves the structure and clarity of rule authorization by layering the components into specs, descriptors, and extractors.

The idea and the components implemented for resource privilege checking could also be applied to row filtering, column masking, object filtering, and even other areas such as lineage.

bowenliang123 · 2022-12-07T08:03:05Z

fyi @jeanlyn

ulysses-you · 2022-12-07T10:49:29Z

late lgtm

[WIP][Extension][Spark] New Authz Plan Serde Layer

453541b

github-actions bot added kind:build module:extensions module:spark labels Dec 5, 2022

yaooqinn added this to the v1.7.0 milestone Dec 5, 2022

yaooqinn self-assigned this Dec 5, 2022

yaooqinn added 2 commits December 5, 2022 21:06

Merge branch 'master' into na

2bced47

[WIP][Extension][Spark] New Authz Plan Serde Layer

49dbb68

bowenliang123 reviewed Dec 6, 2022

yaooqinn added 3 commits December 6, 2022 12:46

[WIP][Extension][Spark] New Authz Plan Serde Layer

c32feef

[WIP][Extension][Spark] New Authz Plan Serde Layer

e47749d

[WIP][Extension][Spark] New Authz Plan Serde Layer

f56148e

yaooqinn added 3 commits December 6, 2022 16:05

[WIP][Extension][Spark] New Authz Plan Serde Layer

b45453a

[WIP][Extension][Spark] New Authz Plan Serde Layer

8926f04

[WIP][Extension][Spark] New Authz Plan Serde Layer

9a24be6

yaooqinn changed the title ~~[WIP][Extension][Spark] New Authz Plan Serde Layer~~ [Spark] New Authz Plan Serde Layer Dec 6, 2022

ulysses-you reviewed Dec 6, 2022

yaooqinn added 3 commits December 7, 2022 12:13

comments

b52ab41

style

379e933

ci

7d2b3e4

bowenliang123 reviewed Dec 7, 2022

...rk-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/gen/JsonSpecFileGenerator.scala (outdated, resolved)

bowenliang123 approved these changes Dec 7, 2022

sort

efafcba

yaooqinn closed this in 2540f44 Dec 7, 2022

yaooqinn mentioned this pull request Dec 7, 2022

[Bug] Restore ReplaceArcticData #3927

Open

5 tasks

yaooqinn deleted the na branch December 7, 2022 10:53

bowenliang123 mentioned this pull request Dec 9, 2022

[Bug][AuthZ] Don't have permissions to create UDF functions when not specify a database name #3925

Closed

5 tasks

bowenliang123 mentioned this pull request Jan 5, 2023

[KYUUBI #3698][Subtask] Set table owner for create table commands #3699

Closed

3 tasks

bowenliang123 changed the title ~~[Spark] New Authz Plan Serde Layer~~ [Spark] [Authz] New Authz Plan Serde Layer Feb 6, 2023

yaooqinn wants to merge 13 commits into apache:master from yaooqinn:na

4 participants
