Skip to content

Conversation

@AngersZhuuuu
Copy link
Contributor

@AngersZhuuuu AngersZhuuuu commented Nov 13, 2023

Why are the changes needed?

To close #5677
Typeof expression miss column information, in this pr we add a place holder to support this.

For sql

SELECT typeof(id), typeof(age) FROM default.t1 LIMIT 1

Without this pr, typeof after optimizer passed to PrivilegeBuild is just an attribute, miss the origin child expression info.

GlobalLimit 21
+- LocalLimit 21
   +- Project [int AS typeof(id)#27, int AS typeof(age)#28]
      +- Relation default.t1[id#18,age#19,part#20] parquet

When handle this plan's project list, it's an attribute and miss expression when expression result is a constant value, then we can't extract the TypeOf's child expression.
image

This is caused by TypeOf expression is foldable
image

It will be convert to a constant value after spark optimizer, then cause we miss the child expression, then can't extract columns by collectLeaves

In this pr we wrap the TypeOf by a non-foldable expression placeholder then we can get the expression contains column of table when mergeProjection
After this pr, the plan is like below

GlobalLimit 21
+- LocalLimit 21
   +- Project [typeofplaceholder(id#21) AS typeof(id)#30, typeofplaceholder(day#23) AS typeof(day)#31]
      +- HiveTableRelation [`default`.`table1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#21, scope#22, day#23], Partition Cols: []]

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

No

@AngersZhuuuu
Copy link
Contributor Author

ping @yaooqinn

@yaooqinn
Copy link
Member

What is the problem here?

@AngersZhuuuu
Copy link
Contributor Author

What is the problem here?

Without this pr, it will check the privilege of all table, not column

@yaooqinn
Copy link
Member

I see the id, age field in the project. Why not pull them out?

@AngersZhuuuu
Copy link
Contributor Author

AngersZhuuuu commented Nov 13, 2023

I see the id, age field in the project. Why not pull them out?

Yea
截屏2023-11-13 下午2 01 10

@AngersZhuuuu
Copy link
Contributor Author

Without this pr the UT's result is as below cc @yaooqinn

"Permission denied: user [someone] does not have [select] privilege on [default/table1]" did not contain "does not have [select] privilege on [default/table1/id,default/table1/day]"
ScalaTestFailureLocation: org.apache.kyuubi.plugin.spark.authz.ranger.HiveCatalogRangerSparkExtensionSuite at (RangerSparkExtensionSuite.scala:1316)
org.scalatest.exceptions.TestFailedException: "Permission denied: user [someone] does not have [select] privilege on [default/table1]" did not contain "does not have [select] privilege on [default/table1/id,default/table1/day]"
	at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
	at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)

@codecov-commenter
Copy link

codecov-commenter commented Nov 13, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (0ce697e) 61.44% compared to head (273cd61) 61.43%.
Report is 9 commits behind head on master.

Files Patch % Lines
...park/authz/rule/expression/TypeOfPlaceHolder.scala 66.66% 1 Missing and 1 partial ⚠️
.../plugin/spark/authz/rule/RuleEliminateTypeOf.scala 75.00% 0 Missing and 1 partial ⚠️
.../authz/rule/expression/RuleApplyTypeOfMarker.scala 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #5678      +/-   ##
============================================
- Coverage     61.44%   61.43%   -0.01%     
  Complexity       23       23              
============================================
  Files           603      607       +4     
  Lines         35664    35712      +48     
  Branches       4876     4891      +15     
============================================
+ Hits          21914    21941      +27     
- Misses        11373    11392      +19     
- Partials       2377     2379       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@yaooqinn
Copy link
Member

I see the id, age field in the project. Why not pull them out?

Yea 截屏2023-11-13 下午2 01 10

Why not pull them out?

@AngersZhuuuu
Copy link
Contributor Author

AngersZhuuuu commented Nov 13, 2023

I see the id, age field in the project. Why not pull them out?

Yea 截屏2023-11-13 下午2 01 10

Why not pull them out?

The name and exprId are not the same between the projection list and the relation 's output, then merge projection miss the column.

@yaooqinn
Copy link
Member

Why is typeof special, or how many are they?

@AngersZhuuuu
Copy link
Contributor Author

AngersZhuuuu commented Nov 13, 2023

Why is typeof special, or how many are they?

Since typeof's result only needs the expression's datatype, it doesn't need real value.
Only see one such case.

@yaooqinn
Copy link
Member

Since typeof's result only needs the expression's datatype, it doesn't need real value.

Shall we bypass it?

@AngersZhuuuu
Copy link
Contributor Author

Since typeof's result only needs the expression's datatype, it doesn't need real value.

Shall we bypass it?

But it need table's read privilege, can't just ignore, better extract the correct column?

@yaooqinn
Copy link
Member

It's much more like explain select a from src to me

@AngersZhuuuu
Copy link
Contributor Author

AngersZhuuuu commented Nov 14, 2023

It's much more like explain select a from src to me

But only when the query's result is empty, then this sql have the output. If there is no row, show an empty result

Explain command seems won't trigger privilege check. But when it execute the query, still will check the query's privilege
截屏2023-11-14 上午10 24 29

|typeof(substring(day, 1, 3))
|FROM $db1.$table1""".stripMargin).show()))(
s"does not have [select] privilege on [$db1/$table1/id,$db1/$table1/day]")
doAs(admin, sql(s"SELECT typeof(id), typeof(day) FROM $db1.$table1").show())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check values

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECT typeof(typeof)

typeof typeof(id + age)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about current?

@bowenliang123
Copy link
Contributor

Could you add the example for the plan in either the issue or the pr description to show the problem and the plan?

@AngersZhuuuu
Copy link
Contributor Author

Any more suggestion? cc @yaooqinn

@AngersZhuuuu
Copy link
Contributor Author

Could you add the example for the plan in either the issue or the pr description to show the problem and the plan?

How about current?

@AngersZhuuuu
Copy link
Contributor Author

Any more suggestion?

@bowenliang123
Copy link
Contributor

Could you add the example for the plan in either the issue or the pr description to show the problem and the plan?

How about current?

Sorry, it's still too difficult for me. I'm out of the context.
Show the SQL or situation first, then the plan, and the suggested solution.

@bowenliang123
Copy link
Contributor

If you don't have time next time, there's no need to ping me. Lately, I haven't been keeping up with your series of PRs. They always get merged quickly, and my suggestions are being skipped. Not enough information for me to understand. If the Kyuubi community is okay with this 'no-explanation-just-go' approach, I don't have any objections. I attribute all of my willingness to my own shortcomings in abilities.

@AngersZhuuuu
Copy link
Contributor Author

Could you add the example for the plan in either the issue or the pr description to show the problem and the plan?

How about current?

Sorry, it's still too difficult for me. I'm out of the context. Show the SQL or situation first, then the plan, and the suggested solution.

Add explain about root cause and solution.

@yaooqinn yaooqinn added this to the v1.9.0 milestone Nov 15, 2023
@yaooqinn yaooqinn closed this in 3c2291e Nov 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[TASK][EASY] Typeof expression miss column information

4 participants