Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TASK][EASY] Authz will still check source table when persist view contains a subquery #5417

Closed
3 of 4 tasks
Tracked by #5474
AngersZhuuuu opened this issue Oct 13, 2023 · 0 comments
Closed
3 of 4 tasks
Tracked by #5474
Assignees
Labels

Comments

@AngersZhuuuu
Copy link
Contributor

AngersZhuuuu commented Oct 13, 2023

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

          sql(s"CREATE TABLE $db.$v1 AS SELECT * FROM $view")
          sql(s"CREATE TABLE $db.$v2 AS SELECT * FROM $view")
          sql(
            s"""
               |CREATE VIEW $db.$my_view
               |AS
               |WITH temp AS (
               |    SELECT max(grade) max_grade
               |    FROM $db.$v2)
               |SELECT id as new_id FROM $db.$v1
               |WHERE grade = (SELECT max_grade FROM temp)
               |""".stripMargin)
          sql(
            s"SELECT * FROM $db.$my_view".stripMargin).explain(true)

For such a case, if the view have a subquery, spark will first execute subquery then it may still check the internal table of persist view.

== Analyzed Logical Plan ==
new_id: int
Project [new_id#54]
+- SubqueryAlias spark_catalog.test_default.my_view
   +- PermanentViewMarker `test_default`.`my_view`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      +- View (`test_default`.`my_view`, [new_id#54])
         +- Project [cast(new_id#51 as int) AS new_id#54]
            +- WithCTE
               :- CTERelationDef 1
               :  +- SubqueryAlias temp
               :     +- Aggregate [max(grade#57) AS max_grade#53]
               :        +- SubqueryAlias spark_catalog.test_default.v2
               :           +- HiveTableRelation [`test_default`.`v2`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#55, name#56, grade#57, sex#58], Partition Cols: []]
               +- Project [id#59 AS new_id#51]
                  +- Filter (grade#61 = scalar-subquery#52 [])
                     :  +- Project [max_grade#53]
                     :     +- SubqueryAlias temp
                     :        +- CTERelationRef 1, true, [max_grade#53]
                     +- SubqueryAlias spark_catalog.test_default.v1
                        +- HiveTableRelation [`test_default`.`v1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [id#59, name#60, grade#61, sex#62], Partition Cols: []]

Affects Version(s)

master, 1.8.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
@AngersZhuuuu AngersZhuuuu added kind:bug This is a clearly a bug priority:major labels Oct 13, 2023
AngersZhuuuu added a commit to AngersZhuuuu/incubator-kyuubi that referenced this issue Oct 13, 2023
AngersZhuuuu added a commit to AngersZhuuuu/incubator-kyuubi that referenced this issue Oct 17, 2023
yaooqinn pushed a commit that referenced this issue Oct 17, 2023
### _Why are the changes needed?_
Fix #5417
If there is is a view with in-subquery, authz will still request this in-subquery's interval privilege, it's not we want.
For view
```
CREATE VIEW db.view1
AS
WITH temp AS (
    SELECT max(scope) max_scope
    FROM db1.table1)
SELECT id as new_id FROM db1.table2
WHERE scope in (SELECT max_scope FROM temp)
```

When we query the view
```
SEELCT * FROM db.view1
```

Before this pr, since spark will first execute subquery, it will first request `[default/table1/scope]` then request `[default/view1/new_id]`

after this pr, it only request  `[default/view1/new_id]`

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_
no

Closes #5442 from AngersZhuuuu/KYUUBI-5417-FOLLOWUP.

Closes #5417

6919903 [Angerszhuuuu] Update RuleApplyPermanentViewMarker.scala
5097d80 [Angerszhuuuu] [KYUUBI #5417] should not check in-subquery in permanent view

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
yaooqinn pushed a commit that referenced this issue Oct 23, 2023
… check view's correct privilege

### _Why are the changes needed?_
To fix #5475

In issue #5417 we fixed the problem that AUTHZ will still check scalar-subquery/in-subquery in permanent will.
But we just ignore the check, the subquery still will run, in this PR,  we record the permanent view's visited column to check the permanent view's privilege to avoid extra execution effort.

For the test `[KYUUBI #5417] should not check scalar-subquery in permanent view` I print all the plan that pass to privilege builder as below
<img width="1398" alt="截屏2023-10-19 下午4 05 46" src="https://github.com/apache/kyuubi/assets/46485123/b136bb47-816c-4066-aba7-a74cbe323f7d">

before this pr
<img width="1310" alt="截屏2023-10-19 下午4 15 29" src="https://github.com/apache/kyuubi/assets/46485123/aa2e3cfe-bca7-493d-a364-b2c196c76c3a">

This two graph shows this pr deny the execution of subquery when we don't have  the veiw's privilege

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

Closes #5476 from AngersZhuuuu/KYUUBI-5475.

Closes #5475

e1f7920 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5475
3bfd9e6 [Angerszhuuuu] update
6b8c0e6 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5475
f7585a4 [Angerszhuuuu] Update PrivilegesBuilder.scala
faea9c6 [Angerszhuuuu] [KYUUBI #5475] Authz check permanent view's subquery should check view's correct privilege

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
davidyuan1223 pushed a commit to davidyuan1223/kyuubi that referenced this issue Oct 26, 2023
…should check view's correct privilege

### _Why are the changes needed?_
To fix apache#5475

In issue apache#5417 we fixed the problem that AUTHZ will still check scalar-subquery/in-subquery in permanent will.
But we just ignore the check, the subquery still will run, in this PR,  we record the permanent view's visited column to check the permanent view's privilege to avoid extra execution effort.

For the test `[KYUUBI apache#5417] should not check scalar-subquery in permanent view` I print all the plan that pass to privilege builder as below
<img width="1398" alt="截屏2023-10-19 下午4 05 46" src="https://github.com/apache/kyuubi/assets/46485123/b136bb47-816c-4066-aba7-a74cbe323f7d">

before this pr
<img width="1310" alt="截屏2023-10-19 下午4 15 29" src="https://github.com/apache/kyuubi/assets/46485123/aa2e3cfe-bca7-493d-a364-b2c196c76c3a">

This two graph shows this pr deny the execution of subquery when we don't have  the veiw's privilege

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

Closes apache#5476 from AngersZhuuuu/KYUUBI-5475.

Closes apache#5475

e1f7920 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5475
3bfd9e6 [Angerszhuuuu] update
6b8c0e6 [Angerszhuuuu] Merge branch 'master' into KYUUBI-5475
f7585a4 [Angerszhuuuu] Update PrivilegesBuilder.scala
faea9c6 [Angerszhuuuu] [KYUUBI apache#5475] Authz check permanent view's subquery should check view's correct privilege

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
pan3793 pushed a commit that referenced this issue Oct 29, 2023
…ll child nodes

### _Why are the changes needed?_
To close #5503

For lateral sql won't check table2's privilege
```
test("xxx") {
    val db1 = defaultDb
    val table1 = "table1"
    val table2 = "table2"
    val view1 = "view1"
    withCleanTmpResources(
      Seq((s"$db1.$table1", "table"), (s"$db1.$table2", "table"), (s"$db1.$view1", "view"))) {
      doAs(admin, sql(s"CREATE TABLE IF NOT EXISTS $db1.$table1 (id int, scope int)"))
      doAs(admin, sql(s"CREATE TABLE IF NOT EXISTS $db1.$table2 (id int, age int)"))
      doAs(admin, sql(
        s"""
           |SELECT t1.id, age
           |FROM $db1.$table1 t1,
           |LATERAL (
           |  SELECT *
           |  FROM $db1.$table2 t2
           |  WHERE t1.id = t2.id
           |)
           |""".stripMargin).explain(true))
    }
  }
```

It was caused by  #4529
![image](https://github.com/apache/kyuubi/assets/46485123/28cdc061-2c2a-44d3-a8d2-a07f5f6f058c)

But after we fix #5417 and #5415, we didn't need this change. We should remove it.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_
No

Closes #5540 from AngersZhuuuu/TEST_REVERT_4504.

Closes #5503

63fa037 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
f3118d2 [Angerszhuuuu] Update RuleAuthorization.scala
f6b6dcc [Angerszhuuuu] follow comment
2c7ae5d [Angerszhuuuu] Update RuleAuthorization.scala
bddc117 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
6bb0c39 [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
42c924f [Angerszhuuuu] Update RangerSparkExtensionSuite.scala
bbb2c31 [Angerszhuuuu] [KYUUBI #5503][AUTHZ] Check plan auth checked should not set tag to all child nodes

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
@pan3793 pan3793 changed the title [Bug] Authz will still check source table when persist view contains a subquery [TASK][EASY] Authz will still check source table when persist view contains a subquery Nov 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment