Skip to content

Conversation

@itholic
Copy link
Contributor

@itholic itholic commented Nov 14, 2024

What changes were proposed in this pull request?

This PR proposes to support (add|remove|get|clear)Tag(s) for PySpark

Why are the changes needed?

For improving the compatibility between Spark Connect and Spark Classic. These tag-related APIs are currently only supported from Spark Connect.

Does this PR introduce any user-facing change?

New APIs are added

  • addTag
  • removeTag
  • getTags
  • clearTags

How was this patch tested?

Added UTs

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the DOCS label Nov 14, 2024
@itholic itholic changed the title [SPARK-50311][PYTHON] Support (add|remove|get|clear)Tag(s) for PySpark [SPARK-50311][PYTHON] Support (add|remove|get|clear)Tag(s) APIs for PySpark Nov 14, 2024
from pyspark.testing.connectutils import ReusedConnectTestCase


class JobCancellationParityTests(JobCancellationTestsMixin, ReusedConnectTestCase):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this suite from pyspark.sql.tests.connect.test_session and integrated with JobCancellationTests

@itholic itholic requested a review from HyukjinKwon November 14, 2024 12:23
@xinrong-meng
Copy link
Member

LGTM thank you!

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think you should directly set local properties via calling addTag etc instead of having Python thread local.

Python thread <> JVM thread mappings are maintained so using this approach should b efine.

you can throw an exception if PYSPARK_PIN_THREAD is enabled.

@itholic
Copy link
Contributor Author

itholic commented Nov 18, 2024

you can throw an exception if PYSPARK_PIN_THREAD is enabled.

Oh, if we can just throw an exception for PYSPARK_PIN_THREAD then the suggestions sounds reasonable to me. Let me address the comment. Thanks!

@HyukjinKwon HyukjinKwon marked this pull request as draft November 19, 2024 01:38
@itholic
Copy link
Contributor Author

itholic commented Nov 19, 2024

I think we might need resolve the bug from Scala side first.
I left some comment from https://github.com/apache/spark/pull/47815/files#r1847514875.
Also cc @xupefei

@itholic itholic marked this pull request as ready for review November 25, 2024 08:01
@itholic itholic closed this in 13c1da7 Nov 28, 2024
@itholic
Copy link
Contributor Author

itholic commented Nov 28, 2024

Merged to master.
Thanks @HyukjinKwon @xinrong-meng for the review!

itholic added a commit that referenced this pull request Nov 29, 2024
### What changes were proposed in this pull request?

This PR followups #48843 to remove remote_only decorator from supported API and update the test

### Why are the changes needed?

To check Connect parity properly

### Does this PR introduce _any_ user-facing change?

No, it's test-only

### How was this patch tested?

Updated the existing test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49012 from itholic/SPARK-50311-followup.

Authored-by: Haejoon Lee <haejoon.lee@databricks.com>
Signed-off-by: Haejoon Lee <haejoon.lee@databricks.com>
itholic added a commit that referenced this pull request Jan 3, 2025
### What changes were proposed in this pull request?

This is minor followup for #48843 to fix test name properly from Python file

### Why are the changes needed?

Use proper test name

### Does this PR introduce _any_ user-facing change?

No, it's test only

### How was this patch tested?

CI

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49199 from itholic/SPARK-50311-followup-2.

Authored-by: Haejoon Lee <haejoon.lee@databricks.com>
Signed-off-by: Haejoon Lee <haejoon.lee@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants