## Axis indexes with duplicate values

Many pandas functions (such as reindex) require unique labels, but pandas does not force the labels of an axis index to be unique. Consider a small Series with duplicate index labels:

In [1]:
import pandas as pd
import numpy as np
from pandas import Series, DataFrame

In [3]:
obj = Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

obj

a    0
a    1
b    2
b    3
c    4
dtype: int64

The index’s is_unique property tells you whether its labels are unique:

In [6]:
obj.index.is_unique

False
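To illustrate why this check matters, here is a minimal sketch of what happens when reindex is called on the Series above. The variable name `reindex_failed` is only for illustration, and the exact wording of the error message may differ between pandas versions:

```python
import pandas as pd

obj = pd.Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

try:
    # reindex needs an unambiguous mapping from label to position,
    # which a duplicated index cannot provide
    obj.reindex(['a', 'b', 'c'])
    reindex_failed = False
except ValueError as exc:
    reindex_failed = True
    print(f"reindex failed: {exc}")
```

Checking `obj.index.is_unique` first lets you decide how to handle the duplicates before calling such functions.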

Data selection is one of the main things that behaves differently with duplicates. Indexing a label with multiple entries returns a Series, while a label with a single entry returns a scalar value:

In [9]:
obj['a'], obj['c']

(a    0
 a    1
 dtype: int64,
 4)
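Because the return type depends on how often a label occurs, code that follows the selection can become harder to write. One possible workaround, sketched here with illustrative variable names (`dup`, `single`, `counts`), is to select with a list of labels, which returns a Series regardless of the number of matches:

```python
import pandas as pd

obj = pd.Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

# Passing a list of labels always yields a Series
dup = obj.loc[['a']]     # two rows, both labelled 'a'
single = obj.loc[['c']]  # one row, still a Series rather than a scalar

# Count how many times each label occurs in the index
counts = obj.index.value_counts()

print(type(dup).__name__, len(dup))
print(type(single).__name__, len(single))
```

The `counts` Series can also be used to find which labels are duplicated before selecting.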

The same logic extends to indexing rows in a DataFrame:

In [18]:
df = DataFrame(np.random.randn(4,3), index=['a', 'a', 'b', 'b'])

df

          0         1         2
a  0.006244 -1.073056  -1.13904
a  1.387099  0.426501 -0.501315
b  1.303064 -0.600027  -1.12267
b -0.747367  1.417894  0.259119


In [16]:
df.loc['b']

          0         1         2
b -1.809294  0.530087 -1.657723
b -0.506407  0.327899  0.764833
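If duplicate row labels are not wanted, there are several ways to collapse them. The following is a sketch of two common options, not a recommendation from the original text: keeping the first row per label, or averaging the rows that share a label. The seeded generator and the names `rows_b`, `first_only`, and `averaged` are assumptions made for this example:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption;
# the cells above use np.random.randn without a seed)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 3)), index=['a', 'a', 'b', 'b'])

# Selecting a duplicated label returns a DataFrame with every matching row
rows_b = df.loc['b']

# Option 1: keep only the first row for each label
first_only = df[~df.index.duplicated(keep='first')]

# Option 2: aggregate the rows that share a label, here by taking the mean
averaged = df.groupby(level=0).mean()

print(rows_b.shape, first_only.shape, averaged.shape)
```

Which option is appropriate depends on whether the duplicate rows are redundant copies or distinct observations.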
