[CELEBORN-1719] Introduce celeborn.client.spark.stageRerun.enabled with alternative celeborn.client.spark.fetch.throwsFetchFailure to enable spark stage rerun #2920

Closed
SteNicholas wants to merge 1 commit into apache:main from SteNicholas:CELEBORN-1719

@SteNicholas
Member

What changes were proposed in this pull request?

  1. Introduce celeborn.client.spark.stageRerun.enabled, with celeborn.client.spark.fetch.throwsFetchFailure kept as its alternative key, to enable Spark stage rerun.
  2. Change the default value of celeborn.client.spark.fetch.throwsFetchFailure from false to true, which enables Spark stage rerun by default.
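
The relationship between the two keys can be sketched as follows. This is a minimal illustration, not Celeborn's actual configuration code: the class `StageRerunConf`, the method `stageRerunEnabled`, and the lookup order (new key first, then the alternative key, then the default) are assumptions made for the example. Only the two key names and the new default of true come from this description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Celeborn.
final class StageRerunConf {
    static final String NEW_KEY = "celeborn.client.spark.stageRerun.enabled";
    static final String ALTERNATIVE_KEY = "celeborn.client.spark.fetch.throwsFetchFailure";
    static final boolean DEFAULT = true; // default after this PR, per the description

    // Assumed resolution order: new key, then alternative key, then default.
    static boolean stageRerunEnabled(Map<String, String> conf) {
        String value = conf.get(NEW_KEY);
        if (value == null) {
            value = conf.get(ALTERNATIVE_KEY);
        }
        return value == null ? DEFAULT : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(stageRerunEnabled(conf)); // neither key set: prints true

        conf.put(ALTERNATIVE_KEY, "false");
        System.out.println(stageRerunEnabled(conf)); // only the alternative key set: prints false

        conf.put(NEW_KEY, "true");
        System.out.println(stageRerunEnabled(conf)); // new key wins: prints true
    }
}
```

Under these assumptions, a job that already sets the older key keeps its behavior, while a job that sets neither key picks up the new default.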

Why are the changes needed?

Users cannot readily tell from its name that celeborn.client.spark.fetch.throwsFetchFailure controls whether stage rerun is enabled. The name only describes the mechanism: the client throws FetchFailedException instead of CelebornIOException. This PR therefore introduces celeborn.client.spark.stageRerun.enabled, with celeborn.client.spark.fetch.throwsFetchFailure as its alternative key, to enable Spark stage rerun.
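
The mechanism behind the option can be sketched like this. Spark resubmits the stage that produced shuffle data when a task fails with a fetch failure, so the exception type the client raises decides whether a rerun happens. The nested exception classes below are stand-ins defined for the example, not the real Spark or Celeborn types, and `toReportedFailure` is a hypothetical name.

```java
import java.io.IOException;

// Hypothetical sketch; the nested classes are stand-ins, not the real Spark or Celeborn types.
final class FetchFailureSketch {
    static class CelebornIOException extends IOException {
        CelebornIOException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static class FetchFailedException extends Exception {
        FetchFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Chooses which exception type to surface for a failed shuffle read.
    static Exception toReportedFailure(IOException cause, boolean stageRerunEnabled) {
        if (stageRerunEnabled) {
            // Surfaced as a fetch failure so the scheduler can rerun the stage.
            return new FetchFailedException("shuffle fetch failed", cause);
        }
        // Surfaced as an ordinary I/O error, which does not signal a fetch failure.
        return new CelebornIOException("shuffle fetch failed", cause);
    }

    public static void main(String[] args) {
        IOException cause = new IOException("chunk unavailable");
        System.out.println(toReportedFailure(cause, true).getClass().getSimpleName());
        System.out.println(toReportedFailure(cause, false).getClass().getSimpleName());
    }
}
```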

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI.

Contributor

@FMX left a comment

LGTM.

@ErikFang
Contributor

LGTM

@SteNicholas
Member Author

Merged to main (v0.6.0).


4 participants