[SPARK-34970][3.0][SQL] Redact map-type options in the output of explain() #32085
…f explain()
The `explain()` method prints the arguments of tree nodes in logical/physical plans. The arguments could contain a map-type option that contains sensitive data. We should redact map-type options in the output of `explain()`. Otherwise, we will see sensitive data in explain output or Spark UI.
Data security.
Yes, redact the map-type options in the output of `explain()`
Unit tests
Closes apache#32066 from gengliangwang/redactOptions.
Authored-by: Gengliang Wang <ltnwgl@gmail.com>
Signed-off-by: Gengliang Wang <ltnwgl@gmail.com>
This is to backport #32066 to branch-3.0
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #137052 has finished for PR 32085 at commit
Thank you for making a backport, @gengliangwang.
The second test case fails like the following. Could you fix it?
[info] ExplainSuite:
08:38:16.800 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[info] - SPARK-34970: Redact Map type options in explain output (1 second, 409 milliseconds)
[info] - SPARK-34970: Redact CaseInsensitiveMap type options in explain output *** FAILED *** (1 second, 884 milliseconds)
[info] "== Parsed Logical Plan ==
[info] 'UnresolvedRelation [t]
[info]
[info] == Analyzed Logical Plan ==
[info] id: bigint
[info] SubqueryAlias spark_catalog.default.t
[info] +- Relation[id#xL] json
[info]
[info] == Optimized Logical Plan ==
[info] Relation[id#xL] json
[info]
[info] == Physical Plan ==
[info] FileScan json default.t[id#xL] Batched: false, DataFilters: [], Format: JSON, Location: InMemoryFileIndex[file:/Users/dongjoon/PRS/SPARK-PR-32085/sql/core/spark-warehouse/org.apache.spa..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:bigint>
[info]
[info] " did not contain "value" (ExplainSuite.scala:66)
@dongjoon-hyun yes.
Thank you, @gengliangwang.
Kubernetes integration test starting
Kubernetes integration test status failure
…ain()
### What changes were proposed in this pull request?
The `explain()` method prints the arguments of tree nodes in logical/physical plans. The arguments could contain a map-type option that contains sensitive data. We should redact map-type options in the output of `explain()`. Otherwise, we will see sensitive data in explain output or Spark UI.
### Why are the changes needed?
Data security.
### Does this PR introduce _any_ user-facing change?
Yes, redact the map-type options in the output of `explain()`
### How was this patch tested?
Unit tests
Closes #32085 from gengliangwang/redact3.0.
Authored-by: Gengliang Wang <ltnwgl@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Merged to branch-3.0.
Test build #137091 has finished for PR 32085 at commit
### What changes were proposed in this pull request?
The `explain()` method prints the arguments of tree nodes in logical/physical plans. The arguments could contain a map-type option that contains sensitive data. We should redact map-type options in the output of `explain()`. Otherwise, we will see sensitive data in explain output or Spark UI.
### Why are the changes needed?
Data security.
### Does this PR introduce _any_ user-facing change?
Yes, redact the map-type options in the output of `explain()`.
### How was this patch tested?
Unit tests
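
For illustration only, here is a minimal, Spark-free sketch of the idea described above: before a map-type option is rendered into plan text, any entry whose key or value looks sensitive has its value replaced with a placeholder. The object name `RedactSketch`, the helper `redactOptions`, the regex, and the placeholder string are all assumptions made for this example. This is not the code in the patch, and Spark reads its redaction pattern from configuration rather than hard-coding one.

```scala
import scala.util.matching.Regex

// Hypothetical helper, not Spark's implementation.
object RedactSketch {
  // Assumed pattern for "sensitive" entries; case-insensitive.
  val Sensitive: Regex = "(?i)secret|password|token".r

  // Assumed placeholder text shown instead of the real value.
  val Replacement: String = "*********(redacted)"

  // Replace the value of every entry whose key or value matches the pattern.
  def redactOptions(
      options: Map[String, String],
      pattern: Regex = Sensitive): Map[String, String] =
    options.map { case (k, v) =>
      val hit = pattern.findFirstIn(k).isDefined || pattern.findFirstIn(v).isDefined
      if (hit) k -> Replacement else k -> v
    }

  def main(args: Array[String]): Unit = {
    val options = Map("password" -> "MyPassWord", "token" -> "MyToken", "key" -> "value")
    // The rendered text keeps the non-sensitive entry and hides the other two values.
    println(redactOptions(options).mkString("[", ", ", "]"))
  }
}
```

Under these assumptions, a check in the spirit of the unit tests would assert that the rendered text still contains the non-sensitive `value` while no longer containing `MyPassWord` or `MyToken`.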