feat!: Removes inplace option for `pandas.core.resample.Resampler.i… #58847

@cbpygit
Contributor

cbpygit commented May 27, 2024

…nterpolate`. Fixes / cleans up related test cases.

…. Adds breaking API change description to whatsnew.
@cbpygit
Contributor Author

cbpygit commented May 27, 2024

@MarcoGorelli somehow it did not auto-assign the reviewers. But in any case, here we are :)

@Aloqeely
Member

Do these need to be deprecated first? You can use the deprecate_kwarg decorator for that
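For readers unfamiliar with the pattern: a keyword deprecation usually wraps the function so that passing the keyword still works but emits a warning. The sketch below illustrates that general idea only. It does not reproduce pandas' own `deprecate_kwarg`, and both `deprecate_keyword` and the toy `interpolate` function are hypothetical names made up for this example.

```python
import functools
import warnings


def deprecate_keyword(name):
    """Hypothetical helper: warn whenever the keyword `name` is passed explicitly."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if name in kwargs:
                warnings.warn(
                    f"the {name!r} keyword of {func.__name__} is deprecated "
                    "and will be removed in a future version",
                    FutureWarning,
                    stacklevel=2,  # point the warning at the caller
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecate_keyword("inplace")
def interpolate(values, inplace=False):
    # Toy stand-in (not pandas code): fill each None with the previous value.
    filled = values if inplace else list(values)
    for i in range(1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    return None if inplace else filled
```

Calling `interpolate([1, None, 3])` stays silent, while `interpolate([1, None, 3], inplace=False)` emits a `FutureWarning` because the keyword was passed explicitly.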

@cbpygit
Contributor Author

cbpygit commented May 30, 2024

> Do these need to be deprecated first? You can use the deprecate_kwarg decorator for that

I don't know what the exact policy is. As far as I understand, a deprecation tells users that a future release will change the API. That does not apply here, because the inplace functionality is already broken. To inform users about the upcoming change and ease migration, we would need to add the deprecation to pandas 2, but I guess that would need a separate PR targeting the next 2.x release.

@MarcoGorelli
Member

> Do these need to be deprecated first?

Thanks @Aloqeely - in general, yes, this would require a deprecation. However, here the change is going to be breaking anyway (in order to fix a historic feature bug), so to be honest I'd be OK with removing inplace without a warning too
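For callers, removing `inplace` would mean binding the object that `interpolate` returns instead of expecting the resampled data to be modified in place. A minimal sketch, assuming only the behavior described in this PR; the series `s` and the 30-minute target frequency are made up for illustration:

```python
import pandas as pd

# Hourly observations (illustrative data, not from the PR).
index = pd.DatetimeIndex(
    ["2020-08-01 00:00", "2020-08-01 01:00", "2020-08-01 02:00"]
)
s = pd.Series([1.0, 2.0, 4.0], index=index)

# Upsample to 30 minutes and keep the returned Series;
# no inplace keyword is passed and `s` itself is left untouched.
upsampled = s.resample("30min").interpolate()

print(upsampled)
```

The original `s` keeps its three rows; the interpolated values live only in `upsampled`.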

@mroeschke added the Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate), inplace (Relating to inplace parameter or equivalent), and Resample (resample method) labels May 31, 2024
@cbpygit requested a review from mroeschke June 5, 2024 16:31
@cbpygit
Contributor Author

cbpygit commented Jun 18, 2024

@mroeschke can we merge this? 😄

@@ -146,7 +146,7 @@ def test_interpolate_downcast_reference_triggers_copy():

    msg = "Can not interpolate with method=pad"
    with pytest.raises(ValueError, match=msg):
        df.interpolate(method="pad", inplace=True, downcast="infer")
Member

It appears that a lot of interpolate usages are still modified in the PR. Only tests that used to call .resample(...).interpolate(inplace=) should have been changed.

Contributor Author

@mroeschke Sorry for the long delay. I checked this again and I think we need to clear up a misunderstanding. The wrong behavior is not limited to cases that use .resample(...). See this minimal example (adjusted from the test case I fixed in the other PR):

from pandas import DataFrame, testing as tm, Timestamp
import numpy as np

df = DataFrame(
    {
        "A": [1, 2, np.nan, 4],
        "B": [1, 4, 9, np.nan],
        "C": [
            Timestamp("2020-08-01 00:00:01"),
            Timestamp("2020-08-01 00:00:02"),
            Timestamp("2020-08-01 00:00:03"),
            Timestamp("2020-08-01 00:00:05"),
        ],
        "D": list("abcd"),
    }
)

result = df.set_index("C").interpolate()
expected = df.set_index("C")
expected.loc[Timestamp("2020-08-01 00:00:03"), "A"] = 2.66667
expected.loc[Timestamp("2020-08-01 00:00:05"), "B"] = 9
tm.assert_frame_equal(result, expected)

In pandas 2.* this fails with

---------------------------------------------------------------------------
AssertionError: DataFrame.iloc[:, 0] (column name="A") are different

DataFrame.iloc[:, 0] (column name="A") values are different (25.0 %)
[index]: [2020-08-01T00:00:01.000000000, 2020-08-01T00:00:02.000000000, 2020-08-01T00:00:03.000000000, 2020-08-01T00:00:05.000000000]
[left]:  [1.0, 2.0, 3.0, 4.0]
[right]: [1.0, 2.0, 2.66667, 4.0]
At positional index 2, first diff: 3.0 != 2.66667

This is not using resample. Can you please clarify?
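One possible reading of the difference, not confirmed anywhere in this thread: `interpolate()` defaults to `method="linear"`, which treats the values as equally spaced and ignores the index, whereas `method="time"` weights by the datetime index. The sketch below reuses column "A" from the example above as a standalone Series to show both results side by side; it is an illustration under that assumption, not a statement about what the PR intends.

```python
import numpy as np
import pandas as pd

index = pd.DatetimeIndex(
    [
        "2020-08-01 00:00:01",
        "2020-08-01 00:00:02",
        "2020-08-01 00:00:03",
        "2020-08-01 00:00:05",
    ]
)
a = pd.Series([1, 2, np.nan, 4], index=index, name="A")

# Default method="linear": values are treated as equally spaced.
linear = a.interpolate()

# method="time": the gap is weighted by the actual timestamps.
by_time = a.interpolate(method="time")

print(linear.iloc[2])   # 3.0
print(by_time.iloc[2])  # approximately 2.66667
```

With the timestamps at seconds 2, 3 and 5, the missing point sits one third of the way between the neighbors 2 and 4, which is where the 2.66667 in the expected frame comes from.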

Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions bot added the Stale label Jul 22, 2024
@cbpygit
Contributor Author

cbpygit commented Jul 22, 2024

> This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

I'll work on this; I just haven't been able to spend time on it.

@mroeschke
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke closed this Oct 29, 2024
Labels
inplace (Relating to inplace parameter or equivalent), Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate), Resample (resample method), Stale
Development

Successfully merging this pull request may close these issues.

BUG: In main, using resample().interpolate(inplace=True) raises an exception
4 participants