[SPARK-41415][3.2] SASL Request Retries #39645

Closed

akpatnam25

Add the ability to retry SASL requests. A metric to track SASL retries will be added soon.

We are seeing increased SASL timeouts internally, and this change would mitigate the issue. We already have this feature enabled for our 2.3 jobs, and we have seen failures decrease significantly.

No user-facing change.

Added unit tests and tested on a cluster to ensure the retries are triggered correctly.

Closes #38959 from akpatnam25/SPARK-41415.

Authored-by: Aravind Patnam <apatnam@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>

@github-actions github-actions bot added the CORE label Jan 18, 2023
@akpatnam25
Author

@dongjoon-hyun @mridulm

@akpatnam25
Author

Will backport SPARK-42090 once this is merged.

@AmplabJenkins

Can one of the admins verify this patch?

   */
  private synchronized boolean shouldRetry(Throwable e) {
    boolean isIOException = e instanceof IOException
-     || (e.getCause() != null && e.getCause() instanceof IOException);
+     || e.getCause() instanceof IOException;
Contributor


This looks like the main change from 3.3/master related to this diff.
This is fine.
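
For readers wondering why dropping the null check is safe: in Java, `instanceof` evaluates to false when its left operand is null, so the two forms of the condition in the snippet above should behave the same. The following is a minimal standalone sketch of that equivalence, not code from this PR; the class name `InstanceofNullCheck` and both helper methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;

// Hypothetical demo class; not part of Spark.
final class InstanceofNullCheck {

    // Condition with the explicit null check on the cause.
    static boolean withNullCheck(Throwable e) {
        return e instanceof IOException
            || (e.getCause() != null && e.getCause() instanceof IOException);
    }

    // Condition without the null check; instanceof is false for a null operand.
    static boolean withoutNullCheck(Throwable e) {
        return e instanceof IOException
            || e.getCause() instanceof IOException;
    }

    public static void main(String[] args) {
        Throwable[] samples = {
            new IOException("direct IOException"),
            new RuntimeException("wrapped", new IOException("cause")),
            new RuntimeException("no cause at all"),
            new RuntimeException("other cause", new IllegalStateException())
        };
        for (Throwable t : samples) {
            boolean a = withNullCheck(t);
            boolean b = withoutNullCheck(t);
            if (a != b) {
                throw new AssertionError("forms disagree for: " + t);
            }
            System.out.println(t.getMessage() + " -> " + b);
        }
    }
}
```

Both helpers agree on all four samples, including the exception with no cause, which is the case the removed null check guarded against.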

mridulm pushed a commit that referenced this pull request Jan 21, 2023
Closes #39645 from akpatnam25/SPARK-41415-backport-3.2.

Authored-by: Aravind Patnam <apatnam@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
@mridulm
Contributor

mridulm commented Jan 21, 2023

Merged to 3.2
Thanks for fixing this @akpatnam25 !

@mridulm mridulm closed this Jan 21, 2023
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
Closes apache#39645 from akpatnam25/SPARK-41415-backport-3.2.

Authored-by: Aravind Patnam <apatnam@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 1a26c7b)