Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45486][CONNECT] Make add_artifact request idempotent #43314

Closed
wants to merge 8 commits into from

Conversation

cdkrot
Copy link
Contributor

@cdkrot cdkrot commented Oct 10, 2023

What changes were proposed in this pull request?

  1. Make add_artifact request idempotent i.e. subsequent requests will succeed if the same content is provided. This makes retrying more safe.
  2. Fix existing error handling mechanism:

Before the update the error looks like that

>>> spark.addArtifact("tmp.py", pyfile=True)
>>> spark.addArtifact("tmp.py", pyfile=True) # fails
2023-10-09 15:55:30,352 82873 DEBUG __iter__ Will retry call after 60014.279746934524 ms sleep (error: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = ""
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"", grpc_status:2, created_time:"2023-10-09T15:55:30.351541+02:00"}"
>)
(this is also getting retried)

Now it looks:

>>> spark.addArtifact("abc.sh", file=True)
>>> spark.addArtifact("abc.sh", file=True) # passes
>>> # update file's content
>>> spark.addArtifact("abc.sh", file=True) # now fails
Traceback (most recent call last):
[...]
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.ALREADY_EXISTS
        details = "Duplicate Artifact: files/abc.sh. Artifacts cannot be overwritten."
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Duplicate Artifact: files/abc.sh. Artifacts cannot be overwritten.", grpc_status:6, created_time:"2023-10-10T01:25:38.231317+02:00"}"
>

Why are the changes needed?

Makes retrying more robust, adds user-friendly error (see above).

Does this PR introduce any user-facing change?

Mostly internal improvements

How was this patch tested?

Unit testing, testing against server

Was this patch authored or co-authored using generative AI tooling?

No

@cdkrot
Copy link
Contributor Author

cdkrot commented Oct 10, 2023

cc @grundprinzip, @HyukjinKwon -- (same change as previously discussed).

@HyukjinKwon
Copy link
Member

Mind retriggering the failed tests https://github.com/cdkrot/apache_spark/runs/17602108213 ?

@cdkrot
Copy link
Contributor Author

cdkrot commented Oct 12, 2023

Retriggered, thanks!

@@ -82,6 +83,40 @@ class SparkConnectArtifactManager(sessionHolder: SessionHolder) extends Logging
*/
def getSparkConnectPythonIncludes: Seq[String] = pythonIncludeList.asScala.toSeq

private def areFilesEqual(path1: Path, path2: Path): Boolean = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you re-use some functions from libs like Apache Common: IOUtils::contentEquals or FileUtils.contentEquals. Or there is some reason to don't do that. Please, clarify that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, let me see if we can use it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this, thanks for the feedback!

@HyukjinKwon
Copy link
Member

Merged to master.

@HyukjinKwon
Copy link
Member

The JIRA number was wrong. Let me switch the JIRA with SPARK-45486

HyukjinKwon pushed a commit that referenced this pull request Jan 16, 2024
### What changes were proposed in this pull request?

Make addArtifact API retrying on errors.

Note this is safe operation since addArtifact is idempotent operation (#43314)

### Why are the changes needed?

For the same reasons as we make other API retryable.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

Added test.

Testing by hand against custom spark server.

### Was this patch authored or co-authored using generative AI tooling?

No & never

Closes #44740 from cdkrot/SPARK-46723-addartifact.

Authored-by: Alice Sayutina <alice.sayutina@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants