Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix insertion of None value in MultiIndex #59069

Merged
merged 7 commits into from
Jun 25, 2024

Conversation

chaoyihu
Copy link
Contributor

@chaoyihu chaoyihu commented Jun 22, 2024

Copy link
Contributor

@yuanx749 yuanx749 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to also include other NaN in this fix? I think inserting np.nan would cause this bug too.

@@ -543,6 +543,7 @@ MultiIndex
^^^^^^^^^^
- :func:`DataFrame.loc` with ``axis=0`` and :class:`MultiIndex` when setting a value adds extra columns (:issue:`58116`)
- :meth:`DataFrame.melt` would not accept multiple names in ``var_name`` when the columns were a :class:`MultiIndex` (:issue:`58033`)
- :meth:`MultiIndex.insert` would not insert None value correctly at unified location of index -1 (:issue:`59003`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- :meth:`MultiIndex.insert` would not insert None value correctly at unified location of index -1 (:issue:`59003`)
- :meth:`MultiIndex.insert` would not insert ``None`` value correctly at unified location of index -1 (:issue:`59003`)

nitpick

def test_multiindex_insert_level_with_na(self):
# GH 59003
indices = [["one"], ["a"], ["yes"]]
df = DataFrame([0], index=indices).T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you construct this without .T?

df = DataFrame([0], index=indices).T
df["one", None, "yes"] = 1
expected = DataFrame([1], index=Index(["yes"])).T
tm.assert_frame_equal(df["one"][None], expected)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you test this without doing a chained indexing?

df["one", None, "yes"] = 1
expected = DataFrame([1], index=Index(["yes"])).T
tm.assert_frame_equal(df["one"][None], expected)
df["one", np.nan, "yes"] = 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use pytest.mark.parametrize with np.nan and None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I tried to stay close to the reproducible code example in the linked issue, but it's really not necessary. Now the test case looks more concise and to the point. Thanks for the suggestions.

@mroeschke mroeschke added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate MultiIndex labels Jun 24, 2024
@mroeschke mroeschke added this to the 3.0 milestone Jun 25, 2024
@mroeschke mroeschke merged commit d604aa8 into pandas-dev:main Jun 25, 2024
44 checks passed
@mroeschke
Copy link
Member

Thanks @chaoyihu

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate MultiIndex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: None value in MultiIndex can't be found
3 participants