Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-43570][SPARK-43571][PYTHON][TESTS] Enable DateOpsTests.[test_rsub|test_sub] for pandas 2.0.0. #41824

Closed
wants to merge 1 commit into from

Conversation

panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Jul 3, 2023

What changes were proposed in this pull request?

The pr aims to enable DateOpsTests.test_rsub & DateOpsTests.test_sub for pandas 2.0.0.

Why are the changes needed?

Improve UT coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

-Pass GA
-Manually test.

@panbingkun panbingkun changed the title [SPARK-43570][SPARK-43571] Enable DateOpsTests.[test_rsub|test_sub] for pandas 2.0.0. [SPARK-43570][SPARK-43571][PYTHON][TESTS] Enable DateOpsTests.[test_rsub|test_sub] for pandas 2.0.0. Jul 3, 2023
def test_sub(self):
self.assertRaises(TypeError, lambda: self.psser - "x")
self.assertRaises(TypeError, lambda: self.psser - 1)
self.assert_eq(
(self.pser - self.some_date).dt.days,
pd.to_timedelta(self.pser - self.some_date).dt.days,
self.psser - self.some_date,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we should also make the behaviour the same with pandas 2.0.0 at pandas API on Spark too. I forgot how we concluded this. Right @itholic ? and should upgrade migration guide too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, instead of just fixing it to work for testing, we need to modify the behavior to match pandas. However, we have decided to put this task on hold and address it in the next major release, Spark 4.0. For more details, please refer to the comment from SPARK-44101.

IMHO, I'd recommend temporarily holding off on any tasks related to pandas 2.0.0 until starting to prepare Spark 4.0 release to avoid potential mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, Let's to put this task on hold , I will convert it to draft.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Thanks for your cooperation! :-)

@panbingkun panbingkun marked this pull request as draft July 3, 2023 09:19
@itholic
Copy link
Contributor

itholic commented Sep 14, 2023

Oh, I just realized that this is already fixed from #42533.

But seems like the approach is a bit different, so please feel free to proceed this as a follow-up if you think we should need to address the current way?

@panbingkun
Copy link
Contributor Author

panbingkun commented Sep 18, 2023

Oh, I just realized that this is already fixed from #42533.

But seems like the approach is a bit different, so please feel free to proceed this as a follow-up if you think we should need to address the current way?

I'm good with fixing it in the current way, let me close it now.

@panbingkun panbingkun closed this Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants