[SPARK-47202][PYTHON] Fix typo breaking datetimes with tzinfo #45301

Closed

@arzavj arzavj commented Feb 28, 2024

What changes were proposed in this pull request?

This PR fixes a bug caused by a typo: `convert_timestamp` called `pd.Timstamp` instead of `pd.Timestamp`.

Why are the changes needed?

This bug prevents users from passing `datetime.datetime` objects with `tzinfo` set when using `TimestampType`.
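
A minimal sketch of the failure mode, assuming only what the diff shows. This is not the actual PySpark converter: the function names and the `else` branch are hypothetical stand-ins. Python resolves `pd.Timstamp` only when that line executes, so naive datetimes convert normally and only tz-aware values fail.

```python
import datetime
from typing import Any

import pandas as pd


def convert_timestamp_buggy(value: Any) -> Any:
    # Hypothetical, simplified stand-in for the converter before the fix.
    if isinstance(value, datetime.datetime) and value.tzinfo is not None:
        ts = pd.Timstamp(value)  # typo: pandas has no attribute "Timstamp"
    else:
        ts = pd.Timestamp(value)  # assumed placeholder for the naive branch
    return ts


def convert_timestamp_fixed(value: Any) -> Any:
    # Same stand-in with the attribute name spelled correctly.
    return pd.Timestamp(value)


naive = datetime.datetime(2024, 2, 28, 12, 0)
aware = naive.replace(tzinfo=datetime.timezone.utc)

# The naive value never reaches the misspelled line, so it converts fine.
naive_result = convert_timestamp_buggy(naive)

# The tz-aware value hits the typo and fails at call time.
try:
    convert_timestamp_buggy(aware)
    buggy_error = None
except AttributeError as exc:
    buggy_error = exc

# With the spelling fixed, the tz-aware value converts and keeps its tzinfo.
aware_result = convert_timestamp_fixed(aware)
```

Because the error surfaces only on the tz-aware branch, code paths that exercise naive timestamps alone would not reveal it.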

Does this PR introduce any user-facing change?

No, just a bug fix.

How was this patch tested?

There ought to be CI that lints code and catches these simple errors at the time of opening the PR.

Was this patch authored or co-authored using generative AI tooling?

No

@arzavj
Contributor Author

arzavj commented Feb 28, 2024

@zhengruifeng @ueshin could you please review this?

@yaooqinn
Member

@arzavj
Contributor Author

arzavj commented Feb 28, 2024

I did enable it after it failed, but I don't know how to re-run the check.

@yaooqinn
Member

Try `git commit -am "ci" --allow-empty` and push once more.

@HyukjinKwon changed the title from "[SPARK-47202][PySpark] Fix typo breaking datetimes with tzinfo" to "[SPARK-47202][PYTHON] Fix typo breaking datetimes with tzinfo" on Feb 28, 2024
@@ -993,7 +993,7 @@ def convert_struct(value: Any) -> Any:

 def convert_timestamp(value: Any) -> Any:
     if isinstance(value, datetime.datetime) and value.tzinfo is not None:
-        ts = pd.Timstamp(value)
+        ts = pd.Timestamp(value)
Member

I am fine with this as is because the fix is pretty obvious, but it would be good to have a test in `python/pyspark/sql/tests/test_arrow.py`.

Member

+1 for the above comment about adding test coverage.

Contributor Author

Yeah, I considered adding a test for this, but what we really need here is a linter in CI to catch such typos.
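
To illustrate the kind of check being asked for, here is a rough sketch of a static pass that resolves `alias.attr` references against the imported module. It is an assumption-laden toy, not an existing Spark CI step: `find_missing_module_attrs` and `SAMPLE` are hypothetical names, and the sample deliberately misspells an attribute of the standard `datetime` module so the sketch runs without pandas. The same lookup would flag `pd.Timstamp`, since pandas has no such attribute.

```python
import ast
import importlib
from typing import Dict, List


def find_missing_module_attrs(source: str) -> List[str]:
    """Report `alias.attr` uses where `attr` does not exist on the imported module."""
    tree = ast.parse(source)

    # Map each local name bound by `import x` / `import x as y` to its module object.
    aliases: Dict[str, object] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for item in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c` to `a.b`.
                target = item.name if item.asname else item.name.split(".")[0]
                local = item.asname or target
                try:
                    aliases[local] = importlib.import_module(target)
                except ImportError:
                    continue  # module not installed here; nothing to check

    problems: List[str] = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
            and not hasattr(aliases[node.value.id], node.attr)
        ):
            problems.append(
                f"line {node.lineno}: {node.value.id}.{node.attr} does not exist"
            )
    return problems


# Hypothetical sample with a deliberate typo (`datetme`) on a rarely taken branch.
SAMPLE = '''
import datetime as dt

def convert(value):
    if isinstance(value, dt.datetime) and value.tzinfo is not None:
        return dt.datetme.fromtimestamp(0)
    return value
'''

report = find_missing_module_attrs(SAMPLE)
print(report)
```

This sketch ignores local variables that shadow a module alias and attributes created dynamically at runtime, so a real linter or type checker would need to handle more cases than this.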

Member

@ueshin ueshin left a comment

Thanks for the fix!

@HyukjinKwon
Member

Let me just merge this in and follow up with a test.

@HyukjinKwon
Member

Merged to master and branch-3.5.

HyukjinKwon pushed a commit that referenced this pull request Feb 28, 2024
This PR fixes a bug caused by a typo.

This bug prevents users from passing `datetime.datetime` objects with `tzinfo` set when using `TimestampType`.

No, just a bug fix.

There ought to be CI that lints code and catches these simple errors at the time of opening the PR.

No

Closes #45301 from arzavj/SPARK-47202.

Authored-by: Arzav Jain <arzavj@users.noreply.github.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit bf8396e)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@HyukjinKwon
Member

PTAL: #45308

HyukjinKwon added a commit that referenced this pull request Feb 28, 2024
… toPandas and createDataFrame with Arrow optimized

### What changes were proposed in this pull request?

This PR is a follow-up to #45301 that actually tests the change.

### Why are the changes needed?

To prevent a regression.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Manually ran the tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #45308 from HyukjinKwon/SPARK-47202-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon added a commit that referenced this pull request Feb 28, 2024
(cherry picked from commit 721c2a4)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
TakawaAkirayo pushed a commit to TakawaAkirayo/spark that referenced this pull request Mar 4, 2024
TakawaAkirayo pushed a commit to TakawaAkirayo/spark that referenced this pull request Mar 4, 2024
@arzavj
Contributor Author

arzavj commented Mar 5, 2024

@HyukjinKwon do you know when 3.5.2 is expected to be released so that I can pick up this bug fix?

ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024