DOC: Typo within Series.mask() docs - alignment is done between self and cond, not cond and other #61781

Open
@nickodell

Pandas version checks

  • I have checked that the issue still exists on the latest versions of the docs on main here

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.Series.mask.html#pandas.Series.mask

Similar issues exist for DataFrame.mask, Series.where, and DataFrame.where; they appear to use the same docstring with replacements.

Documentation problem

In this passage:

The mask method is an application of the if-then idiom. For each element in the calling DataFrame, if cond is False the element is used; otherwise the corresponding element from the DataFrame other is used. If the axis of other does not align with axis of cond Series/DataFrame, the misaligned index positions will be filled with True.

The last sentence of that passage is not correct. Here is an example in which other is not aligned with cond, because the d label in cond has no match in other. Even so, cond is not filled with True at that position.

import pandas as pd

a = pd.Series(['apple', 'banana', 'cherry', 'dango'], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

other = pd.Series(['asparagus', 'broccoli', 'carrot', 'dill'], index=['a', 'b', 'c', 'D'])
cond = b.lt(3)

print("Cond matches other?", cond.index == other.index)
print("Cond matches self?", cond.index == a.index)

a.mask(cond, other)

Output:

Cond matches other? [ True  True  True False]
Cond matches self? [ True  True  True  True]

a    asparagus
b     broccoli
c       cherry
d        dango
dtype: object

In this example, even though the d label in cond has no corresponding element in other, no replacement is made for item d, not even with an NA value.

Rather, the alignment is done between self and cond, not between other and cond.
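
A complementary check, offered as a sketch rather than as part of the original report: if cond is True at a label that other lacks, the result at that label should come out as NaN. That would suggest other is aligned to self on its own, and that a gap in other is not turned into a True in cond. The names cond_d and result are introduced here purely for illustration.

```python
import pandas as pd

a = pd.Series(['apple', 'banana', 'cherry', 'dango'], index=['a', 'b', 'c', 'd'])
other = pd.Series(['asparagus', 'broccoli', 'carrot', 'dill'], index=['a', 'b', 'c', 'D'])

# Hypothetical condition: fully aligned with self, True only at label 'd'.
cond_d = pd.Series([False, False, False, True], index=a.index)

# 'd' is masked, but other has no 'd' label (only 'D'),
# so the replacement value at 'd' is missing.
result = a.mask(cond_d, other)
print(result)
```

Labels a, b, and c keep their original values because cond_d is False there; only d is replaced, and the replacement is a missing value.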

Suggested fix for documentation

Proposed fix:

The mask method is an application of the if-then idiom. For each element in the calling DataFrame, if cond is False the element is used; otherwise the corresponding element from the DataFrame other is used. If the axis of self does not align with axis of cond Series/DataFrame, the misaligned index positions will be filled with True.

Here is an example which shows that this is correct. In the following code, cond and self are not aligned. The unaligned value in cond is treated as True.

import pandas as pd

a = pd.Series(['apple', 'banana', 'cherry', 'dango'], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'D'])

other = pd.Series(['asparagus', 'broccoli', 'carrot', 'dill'], index=['a', 'b', 'c', 'd'])
cond = b.lt(3)

print("Cond matches other?", cond.index == other.index)
print("Cond matches self?", cond.index == a.index)

a.mask(cond, other)

Output:

Cond matches other? [ True  True  True False]
Cond matches self? [ True  True  True False]

a    asparagus
b     broccoli
c       cherry
d         dill
dtype: object
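
The same behavior can be reproduced by aligning cond to self by hand. The following is a minimal sketch, assuming that reindexing cond to the labels of self and filling the gaps with True matches what mask does internally; aligned_cond, masked, masked_manual, and where_result are illustrative names, not pandas API. It also runs the where counterpart mentioned earlier, where a missing cond label behaves like False.

```python
import pandas as pd

a = pd.Series(['apple', 'banana', 'cherry', 'dango'], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'D'])
other = pd.Series(['asparagus', 'broccoli', 'carrot', 'dill'], index=['a', 'b', 'c', 'd'])
cond = b.lt(3)

# Assumption: mask behaves as if cond were reindexed to self's labels,
# with labels missing from cond (here 'd') treated as True.
aligned_cond = cond.reindex(a.index, fill_value=True)

masked = a.mask(cond, other)
masked_manual = a.mask(aligned_cond, other)
print(masked.equals(masked_manual))

# where() keeps values where cond is True; the missing 'd' label in cond
# acts like False, so 'd' is again taken from other.
where_result = a.where(cond, other)
print(where_result)
```

In both calls the element at d comes from other, which is consistent with the alignment happening between self and cond.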

Labels: Docs, Needs Triage