Description
Pandas version checks
- I have checked that the issue still exists on the latest versions of the docs on `main` here
Location of the documentation
pandas.DataFrame.__setitem__
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.__setitem__.html
pandas.core.indexing.IndexingMixin.loc
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html
User Guide: Indexing and Selecting Data
https://pandas.pydata.org/docs/user_guide/indexing.html
Documentation problem
Documentation Enhancement
The following behavior is not clearly explained in the documentation:
```python
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
df['b'] = pd.Series({1: 'b'})
print(df)
# Output:
#    a    b
# 0  1  NaN
# 1  2    b
# 2  3  NaN
```
- The Series is **reindexed** to match the DataFrame index.
- Values are inserted **by index label**, not by position.
- Missing labels yield **NaN**, and the order is adjusted accordingly.
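As a minimal sketch of the points above, the direct assignment can be compared with an explicit `Series.reindex` call, and with `Series.to_numpy()` for the case where positional assignment is actually wanted. The column names `b_explicit` and `c` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
s = pd.Series({1: "b"})

# Direct assignment: the Series is aligned on the DataFrame's index.
df["b"] = s

# The same alignment step written out explicitly.
df["b_explicit"] = s.reindex(df.index)

# To assign by position instead, drop the labels first.
# The number of values must then match the number of rows.
reversed_labels = pd.Series(["x", "y", "z"], index=[2, 1, 0])
df["c"] = reversed_labels.to_numpy()

print(df)
```

Columns `b` and `b_explicit` end up with the same values, while `c` keeps the original order of the values because the index labels were discarded before assignment.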
This behavior is:
- Not explained in the `__setitem__` documentation (which is missing entirely).
- Only mentioned vaguely in `.loc` docs, with no example.
- Absent from the "Indexing and Selecting Data" user guide for the case of assigning a Series with an unordered or partial index.
Suggested fix for documentation
- Add a docstring for `DataFrame.__setitem__` with a clear explanation that:
  > When assigning a Series, pandas aligns on index. Index labels of the DataFrame that are missing from the Series are filled with `NaN`, and Series values whose labels are not in the DataFrame index are dropped.
- Update the `.loc` documentation: include a note that when assigning a Series to `.loc[row_labels, col]`, pandas aligns the Series by index and not by order.
- Add an example in the User Guide under "Indexing and Selecting Data": assigning a Series with unordered/missing index keys to a DataFrame column.
Suggested example:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 3]})
    s = pd.Series({2: 'zero', 1: 'one', 0: 'two'})
    df['d'] = s
    print(df)
    # Output:
    #    a     d
    # 0  1   two
    # 1  2   one
    # 2  3  zero
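For the `.loc` note suggested above, a short sketch along these lines might make the label-based behavior concrete. The values and the variable names `aligned` and `positional` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

# The Series lists label 1 first and label 0 second.
s = pd.Series([10, 20], index=[1, 0])

# .loc matches on labels, so row 0 receives 20 and row 1 receives 10.
df.loc[[0, 1], "a"] = s
aligned = df["a"].tolist()
print(aligned)

# Converting to a NumPy array drops the labels, so values land in order.
df.loc[[0, 1], "a"] = s.to_numpy()
positional = df["a"].tolist()
print(positional)
```

The two assignments give different results from the same values, which is the kind of silent difference the proposed note is meant to warn about.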
📈 Why this is better:
The current documentation is incomplete and vague about how Series alignment works in assignments. This fix:
- Makes `__setitem__` behavior explicit and discoverable.
- Improves the `.loc` docs with better clarity and practical context.
- Adds real-world examples to the user guide to reduce silent bugs and confusion.
These improvements help all users—especially beginners—understand how pandas handles Series assignment internally.