Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-42922][SQL] Move from Random to SecureRandom #40568

Closed
wants to merge 1 commit into from

Conversation

mridulm
Copy link
Contributor

@mridulm mridulm commented Mar 27, 2023

What changes were proposed in this pull request?

Most uses of Random in spark are either in testcases or where we need a pseudo random number which is repeatable.
Use SecureRandom, instead of Random for the cases where it impacts security.

Why are the changes needed?

Use of SecureRandom in more security sensitive contexts.
This was flagged in our internal scans as well.

Does this PR introduce any user-facing change?

Directly no.
Would improve security posture of Apache Spark.

How was this patch tested?

Existing unit tests

@github-actions github-actions bot added the SQL label Mar 27, 2023
@mridulm
Copy link
Contributor Author

mridulm commented Mar 27, 2023

+CC @srowen

@mridulm mridulm changed the title SPARK-42922: Move from Random to SecureRandom [SPARK-42922][SQL][Security]: Move from Random to SecureRandom Mar 27, 2023
@mridulm mridulm changed the title [SPARK-42922][SQL][Security]: Move from Random to SecureRandom [SPARK-42922][SQL]: Move from Random to SecureRandom Mar 27, 2023
@srowen
Copy link
Member

srowen commented Mar 27, 2023

I think it's fine. These do look like better usages of RNGs. Let's see what tests say.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-42922][SQL]: Move from Random to SecureRandom [SPARK-42922][SQL] Move from Random to SecureRandom Mar 27, 2023
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (if this is all).

Shall we prevent the future regression by adding some Checkstyle or Scalastyle rule? We can do that separately as a new JIRA.

@dongjoon-hyun
Copy link
Member

According to the Affected Version in JIRA, I also agree with backporting to the applicable release branches.

@srowen srowen closed this in 7444343 Mar 28, 2023
srowen pushed a commit that referenced this pull request Mar 28, 2023
### What changes were proposed in this pull request?

Most uses of `Random` in spark are either in testcases or where we need a pseudo random number which is repeatable.
Use `SecureRandom`, instead of `Random` for the cases where it impacts security.

### Why are the changes needed?

Use of `SecureRandom` in more security sensitive contexts.
This was flagged in our internal scans as well.

### Does this PR introduce _any_ user-facing change?

Directly no.
Would improve security posture of Apache Spark.

### How was this patch tested?

Existing unit tests

Closes #40568 from mridulm/SPARK-42922.

Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 7444343)
Signed-off-by: Sean Owen <srowen@gmail.com>
srowen pushed a commit that referenced this pull request Mar 28, 2023
### What changes were proposed in this pull request?

Most uses of `Random` in spark are either in testcases or where we need a pseudo random number which is repeatable.
Use `SecureRandom`, instead of `Random` for the cases where it impacts security.

### Why are the changes needed?

Use of `SecureRandom` in more security sensitive contexts.
This was flagged in our internal scans as well.

### Does this PR introduce _any_ user-facing change?

Directly no.
Would improve security posture of Apache Spark.

### How was this patch tested?

Existing unit tests

Closes #40568 from mridulm/SPARK-42922.

Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 7444343)
Signed-off-by: Sean Owen <srowen@gmail.com>
@srowen
Copy link
Member

srowen commented Mar 28, 2023

Merged to master/3.4/3.3

@mridulm
Copy link
Contributor Author

mridulm commented Mar 28, 2023

Thanks for the reviews everyone !
And thanks for merging it @srowen :-)

@LuciferYang
Copy link
Contributor

late LGTM

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
### What changes were proposed in this pull request?

Most uses of `Random` in spark are either in testcases or where we need a pseudo random number which is repeatable.
Use `SecureRandom`, instead of `Random` for the cases where it impacts security.

### Why are the changes needed?

Use of `SecureRandom` in more security sensitive contexts.
This was flagged in our internal scans as well.

### Does this PR introduce _any_ user-facing change?

Directly no.
Would improve security posture of Apache Spark.

### How was this patch tested?

Existing unit tests

Closes apache#40568 from mridulm/SPARK-42922.

Authored-by: Mridul Muralidharan <mridulatgmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 7444343)
Signed-off-by: Sean Owen <srowen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
6 participants