BUG: Preserve column names in DataFrame.from_records when nrows=0 #61143
…if nrows == 0' to return Cls(columns=columns) in core/frame.py. - Added test to verify column preservation.
Thanks for the PR! When making a PR, follow these steps here:
https://pandas.pydata.org/pandas-docs/dev/development/contributing.html#making-a-pull-request
namely step 4.
def test_empty_df_preserve_col():
    rows = []
    df = pd.DataFrame.from_records(iter(rows), columns=['col_1', 'Col_2'], nrows=0)
    assert list(df.columns)==['col_1', 'Col_2']
    assert len(df) == 0
Can you follow the dev docs here: https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#writing-tests
Namely, search the current tests for from_records; that should give a good indication of where to place this test.
This file needs to be removed.
I have updated the PR with the latest changes based on feedback. Please review again.
Looks good! The change in One minor point: It looks like the test was also added as a new standalone file
Looking good! This will also need an entry in the whatsnew for v3.0.0, in the I/O section.
@@ -2780,6 +2780,12 @@ def test_construction_nan_value_timedelta64_dtype(self):
        )
        tm.assert_frame_equal(result, expected)

    def test_from_records_empty_iterator_with_preserve_columns(self):
Can you start the test with a comment referencing the issue, i.e. # GH#61140?
@@ -2780,6 +2780,12 @@ def test_construction_nan_value_timedelta64_dtype(self):
        )
        tm.assert_frame_equal(result, expected)

    def test_from_records_empty_iterator_with_preserve_columns(self):
Can you move this test to tests/frame/constructors/test_from_records.py?
    def test_from_records_empty_iterator_with_preserve_columns(self):

        rows = []
        df = pd.DataFrame.from_records(iter(rows), columns=["col_1", "Col_2"], nrows=0)
Can you name the variable result instead of df?
        assert list(df.columns) == ["col_1", "Col_2"]
        assert len(df) == 0
Instead of these two lines, can you check the entire result?
expected = DataFrame(...)
tm.assert_frame_equal(result, expected)
Description
Updates pandas/core/frame.py to preserve column names in empty DataFrames when nrows == 0. Changed from return Cls() to return Cls(columns=columns).
Closes #61140
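To make the effect of the one-line change concrete, the sketch below compares the two constructor calls directly, outside from_records. It uses pd.DataFrame where the description writes Cls, which is an assumption made for illustration; the column names are the ones from the test in this PR.

```python
import pandas as pd

columns = ["col_1", "Col_2"]

# Shape of the old early return: no arguments, so the column labels are dropped.
before = pd.DataFrame()

# Shape of the new early return: the labels are passed through to the empty frame.
after = pd.DataFrame(columns=columns)

print(list(before.columns))  # []
print(list(after.columns))   # ['col_1', 'Col_2']
print(len(after))            # 0
```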
Changes Made
Testing