Skip to content

Conversation

johnzangwill
Copy link
Contributor

Factored out multiple occurances of missing name logic into common method.
DataFrame.to_records() changed to use common method rather than missing value count when filling missing names.
Index.to_frame() left alone for the time being.

import pandas as pd
f = pd.DataFrame([1], index=pd.MultiIndex.from_arrays([[2],[3],[4]], names=[None, "a", None]))

Old behavior:

>>> pd.DataFrame(f.to_records())
   level_0  a  level_1  0
0        2  3        4  1

New behavior:

>>> pd.DataFrame(f.to_records())
   level_0  a  level_2  0
0        2  3        4  1

@johnzangwill johnzangwill marked this pull request as draft December 14, 2021 13:39
@johnzangwill johnzangwill marked this pull request as ready for review December 14, 2021 15:48
Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments, otherwise lgtm

df.index.names = ["A", None]
rs = df.to_records()
assert "level_0" in rs.dtype.fields
assert "level_1" in rs.dtype.fields
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you maybe add a test checking the whole result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cruel! But done.

- Bug in :meth:`DataFrame.shift` with ``axis=1`` and ``ExtensionDtype`` columns incorrectly raising when an incompatible ``fill_value`` is passed (:issue:`44564`)
- Bug in :meth:`DataFrame.diff` when passing a NumPy integer object instead of an ``int`` object (:issue:`44572`)
- Bug in :meth:`Series.replace` raising ``ValueError`` when using ``regex=True`` with a :class:`Series` containing ``np.nan`` values (:issue:`43344`)
- Bug in :meth:`DataFrame.to_records` missing names filled incorrectly (:issue:`44818`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you specify a bit more?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@johnzangwill johnzangwill requested a review from phofl December 14, 2021 18:59
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments, not 100% sold on the location (maybe @jbrockmendel can think of a better one), but ok. ping on green.

- Bug in :meth:`DataFrame.shift` with ``axis=1`` and ``ExtensionDtype`` columns incorrectly raising when an incompatible ``fill_value`` is passed (:issue:`44564`)
- Bug in :meth:`DataFrame.diff` when passing a NumPy integer object instead of an ``int`` object (:issue:`44572`)
- Bug in :meth:`Series.replace` raising ``ValueError`` when using ``regex=True`` with a :class:`Series` containing ``np.nan`` values (:issue:`43344`)
- Bug in :meth:`DataFrame.to_records` where an incorrect n was used when missing names were replaced by level_n (:issue:`44818`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add backticks around the n and level_n

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add backticks around the n and level_n

Done

return _builtin_table.get(arg, arg)


def fill_missing_names(names):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you type in and out here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and added version and more docstring.

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Dec 15, 2021
@jreback jreback added this to the 1.4 milestone Dec 15, 2021
@jbrockmendel
Copy link
Member

test location makes sense to me

@pep8speaks
Copy link

pep8speaks commented Dec 16, 2021

Hello @johnzangwill! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-12-16 13:28:03 UTC

@johnzangwill
Copy link
Contributor Author

not 100% sold on the location (maybe @jbrockmendel can think of a better one)

I suspect that @jreback was referring to the fact that I factored the logic into core/common.py.
It seemed a good place to me. My method is ref'ed in:

  • core/frame.py
  • core/indexes/multi.py
  • io/pytables.py
  • io/sql.py
  • io/json/_table_schema.py

and core/common is already imported as com into every one.
But if you have a better suggestion...

@johnzangwill johnzangwill requested a review from jreback December 16, 2021 15:21
@jreback jreback merged commit a769e38 into pandas-dev:master Dec 16, 2021
@jreback
Copy link
Contributor

jreback commented Dec 16, 2021

thanks @johnzangwill very nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Inconsistent conversion of missing column names

5 participants