Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Series/DataFrame.append (#35407) #44539

Merged
merged 38 commits into from Dec 27, 2021

Conversation

neinkeinkaffee
Copy link
Contributor

@neinkeinkaffee neinkeinkaffee commented Nov 20, 2021

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok to filter warnings and have a single test to assert the deprecation warning

doc/source/whatsnew/v1.4.0.rst Outdated Show resolved Hide resolved
pandas/core/frame.py Outdated Show resolved Hide resolved
pandas/core/series.py Outdated Show resolved Hide resolved
pandas/tests/frame/methods/test_append.py Outdated Show resolved Hide resolved
@neinkeinkaffee
Copy link
Contributor Author

Thanks for the suggestions. I've tried to address them. Could you re-review and let me know your feedback? Cheers!

pandas/core/series.py Outdated Show resolved Hide resolved
pandas/tests/frame/methods/test_append.py Show resolved Hide resolved
pandas/tests/series/methods/test_append.py Show resolved Hide resolved
doc/source/whatsnew/v1.4.0.rst Outdated Show resolved Hide resolved
@jreback jreback added the Deprecate Functionality to remove in pandas label Nov 20, 2021
Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Web and docs is failing because of the warnings

@neinkeinkaffee
Copy link
Contributor Author

Question on the failing web and docs: looking at the failing doc build, some of the cases are due to direct calls to append, which I can change into concat. But others are indirect, e.g. doc/source/info_tutorials/getting_started/02_read_write.rst calls titanic.info() on a DataFrame, which calls frame.memory_usage which calls series.append. In such cases, it's probably preferred to change the series.append into a concat in frame.memory_usage? Or should I just add an :okwarning: to the ipython codeblock?

@jreback
Copy link
Contributor

jreback commented Nov 21, 2021

no all usages should be changes in tests and docs except those we r explicitly doing

@neinkeinkaffee
Copy link
Contributor Author

@jreback Just to make sure I don't misunderstand: given that frame.memory_usage is neither in doc nor test, I shouldn't change the call to series.append to concat there, but instead add a :okwarning: to the ipython codeblock? And likewise not change any other instances were append is called in the pandas code that's neither documentation nor test?

@jreback
Copy link
Contributor

jreback commented Nov 22, 2021

no u do not add okwarning anywhere

u need to fix this for example in the code not to output a warning (as this is tested in many tests)

@jreback
Copy link
Contributor

jreback commented Nov 25, 2021

when you push pls merge master

@neinkeinkaffee
Copy link
Contributor Author

Yes, will do!

@MarcoGorelli MarcoGorelli self-requested a review November 29, 2021 22:13
s = pd.Series([0, 1], index=["a", "b"]).set_flags(allows_duplicate_labels=False)
msg = "Index has duplicates."
with pytest.raises(pd.errors.DuplicateLabelError, match=msg):
func(s)
pd.concat([pd.Series(0, index=["a", "b"]), s])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you construct the series outside of the pytest.raises; makes clearer exactly what part of this line is raising

@@ -538,7 +538,8 @@ def test_partial_set_invalid(self):
# allow object conversion here
df = orig.copy()
df.loc["a", :] = df.iloc[0]
exp = orig._append(Series(df.iloc[0], name="a"))
s = Series(df.iloc[0], name="a")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"s" -> "ser" pls

@@ -179,16 +178,12 @@ def test_dups_index(self):
tm.assert_frame_equal(result.iloc[10:], df)

# append
result = df.iloc[0:8, :].append(df.iloc[8:])
result = concat([df.iloc[0:8, :], df.iloc[8:]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these redundant with the tests above? this may be better thought of as a targeted test for _append.

self.index.memory_usage(deep=deep), index=["Index"]
).append(result)
)
result = concat([index_memory_usage, result])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd rather keep using _append internally; should be more performant and avoids circular dependency

@@ -109,7 +110,8 @@ def test_append_sorts(self, sort):
df1 = DataFrame({"a": [1, 2], "b": [1, 2]}, columns=["b", "a"])
df2 = DataFrame({"a": [1, 2], "c": [3, 4]}, index=[2, 3])

with tm.assert_produces_warning(None):
# GH#35407
with tm.assert_produces_warning(FutureWarning):
result = df1.append(df2, sort=sort)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this file is about testing 'append', then these should just be changed to test '_append'

@jreback
Copy link
Contributor

jreback commented Dec 23, 2021

@neinkeinkaffee can you merge master

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, ping on green

While not especially efficient (since a new object must be created), you can
append a single row to a ``DataFrame`` by passing a ``Series`` or dict to
``append``, which returns a new ``DataFrame`` as above.
You can append a single row in form of a series to a ``DataFrame`` in-place
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't show L416, just make the alternative the thing

@jbrockmendel
Copy link
Member

closes #22957?

@neinkeinkaffee
Copy link
Contributor Author

@jreback I've changed the docs and removed the part about adding rows in-place. However, the Database / Linux_py38_IO pipeline currently fails with pytest: command not found, while the Azure pipelines (pandas-dev.pandas Windows py38_np18_2 and Windows py39_2), fail because of a DeprecationWarning that gets raised in pandas/tests/io/parser/test_header.py::test_no_header_prefix[pyarrow], which doesn't appear to be related to the changes I've made. Is this something known or should I dig into it?

@jreback
Copy link
Contributor

jreback commented Dec 27, 2021

@neinkeinkaffee if you can merge master again; just want to get a clean run here. we do have some occasional test failures on CI (though some recent PRs do address some of these)

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great job @neinkeinkaffee

@jreback jreback merged commit f6df343 into pandas-dev:master Dec 27, 2021
@jreback
Copy link
Contributor

jreback commented Dec 27, 2021

thanks @neinkeinkaffee yep very nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate Series / DataFrame.append How should DataFrame.append behave related to indexes types?
6 participants