# Duplicate Index Values
* It is possible to have duplicate index values in a Pandas Series or DataFrame
* Accessing these indices by label with .loc[] returns all corresponding rows
* Duplicate index values are not advised, but there are edge cases where they are useful

In [2]:
import numpy as np
import pandas as pd

In [9]:
sales = [0, 5, 155, 0, 518]
items = ["coffee", "coffee", "tea", "coconut", "sugar"]

sales_series = pd.Series(sales, index=items, name="Sales")

sales_series # coffee appears as an index value twice

coffee       0
coffee       5
tea        155
coconut      0
sugar      518
Name: Sales, dtype: int64
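
One way to confirm that an index holds repeated labels is to inspect the index itself. The sketch below rebuilds the same sales data and uses `Index.is_unique` and `Index.duplicated()`; the variable names `has_unique_index` and `repeated_labels` are illustrative only.

```python
import pandas as pd

# Same data as the sales example in this notebook.
sales_series = pd.Series(
    [0, 5, 155, 0, 518],
    index=["coffee", "coffee", "tea", "coconut", "sugar"],
    name="Sales",
)

# is_unique is False as soon as any label repeats.
has_unique_index = sales_series.index.is_unique

# duplicated(keep=False) flags every occurrence of a repeated label,
# so the mask selects both "coffee" entries; unique() collapses them.
mask = sales_series.index.duplicated(keep=False)
repeated_labels = sales_series.index[mask].unique().tolist()

print(has_unique_index)
print(repeated_labels)
```

A check like this is cheap to run before relying on label-based lookups.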

In [5]:
sales_series.loc["coffee"] # returns both rows with the same label

coffee    0
coffee    5
Name: Sales, dtype: int64
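
A practical consequence of the lookup above is that the return type of `.loc[]` depends on whether the label is unique. The sketch below, using the same sales data, compares a unique label with a repeated one; `tea_sales` and `coffee_sales` are illustrative names.

```python
import pandas as pd

sales_series = pd.Series(
    [0, 5, 155, 0, 518],
    index=["coffee", "coffee", "tea", "coconut", "sugar"],
    name="Sales",
)

# Unique label: .loc returns a single scalar value.
tea_sales = sales_series.loc["tea"]

# Repeated label: .loc returns a Series containing every matching row.
coffee_sales = sales_series.loc["coffee"]

print(type(tea_sales).__name__)
print(type(coffee_sales).__name__)
```

Code that expects a scalar from `.loc[]` may therefore behave differently once a label appears more than once.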

### You can reset the index in a Pandas Series or DataFrame back to the default range of integers by using the .reset_index() method.
* By default, the existing index becomes a new column, so calling it on a Series returns a DataFrame.

In [12]:
sales_series.reset_index(drop=True) # drop=True discards the old index and replaces it with the default 0 to n-1 integer range

0      0
1      5
2    155
3      0
4    518
Name: Sales, dtype: int64

In [13]:
sales_series.reset_index() # with no arguments, the result is a DataFrame with the original index stored as a column

     index  Sales
0   coffee      0
1   coffee      5
2      tea    155
3  coconut      0
4    sugar    518
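
Because the original index has no name, the new column is labelled `index`. A minimal sketch of one way to get a more descriptive column is to name the index with `rename_axis()` before resetting it; the column name `"item"` is an assumption chosen for illustration.

```python
import pandas as pd

sales_series = pd.Series(
    [0, 5, 155, 0, 518],
    index=["coffee", "coffee", "tea", "coconut", "sugar"],
    name="Sales",
)

# Name the index first, then move it into a column of the same name.
sales_df = sales_series.rename_axis("item").reset_index()

print(sales_df.columns.tolist())
print(sales_df)
```

The duplicate labels are kept as ordinary column values, so no rows are lost.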


### More examples

In [16]:
my_series = pd.Series(range(5), index=["Day 0", "Day 0", "Day 0", "Day 2", "Day 2"])

my_series.index

Index(['Day 0', 'Day 0', 'Day 0', 'Day 2', 'Day 2'], dtype='object')

In [18]:
my_series["Day 0"] # duplicate index values make it difficult to work with specific data we're interested in.

Day 0    0
Day 0    1
Day 0    2
dtype: int64

In [22]:
my_series["Day 0"].iloc[1] # a second, positional lookup with .iloc[1] is needed to reach a specific row

1
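
When the repeated labels are not actually needed, two common ways to get back to one value per label are keeping only the first occurrence or aggregating the duplicates. The sketch below shows both on the same data; which one is appropriate depends on what the duplicates mean, and the names `first_only` and `totals` are illustrative.

```python
import pandas as pd

my_series = pd.Series(
    range(5), index=["Day 0", "Day 0", "Day 0", "Day 2", "Day 2"]
)

# Keep the first row for each label and discard later repeats.
first_only = my_series[~my_series.index.duplicated(keep="first")]

# Alternatively, combine the rows that share a label (here by summing).
totals = my_series.groupby(level=0).sum()

print(first_only)
print(totals)
```

Either result has a unique index, so a label lookup returns a single value again.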

In [23]:
my_series.reset_index() # with no argument the original index is moved into its own column

   index  0
0  Day 0  0
1  Day 0  1
2  Day 0  2
3  Day 2  3
4  Day 2  4


In [25]:
my_series.reset_index(drop=True) # resets the index to the default 0 to n-1 integer range

0    0
1    1
2    2
3    3
4    4
dtype: int64

In [26]:
my_series.reset_index(drop=True).loc[2]

2

In [29]:
my_series.reset_index(drop=True).loc[2:4]

2    2
3    3
4    4
dtype: int64
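
The slice above returns three rows because `.loc[]` slices by label and includes both endpoints. As a brief comparison, the sketch below runs the same slice through `.iloc[]`, which slices by position and excludes the stop value; `flat`, `by_label`, and `by_position` are illustrative names.

```python
import pandas as pd

my_series = pd.Series(
    range(5), index=["Day 0", "Day 0", "Day 0", "Day 2", "Day 2"]
)
flat = my_series.reset_index(drop=True)

# Label-based slice: labels 2, 3 and 4 are all included.
by_label = flat.loc[2:4]

# Position-based slice: positions 2 and 3 only, stop is excluded.
by_position = flat.iloc[2:4]

print(by_label.tolist())
print(by_position.tolist())
```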