Unlike the other elements of the pandas data structures (Series and DataFrame), Index objects are immutable: once declared, they cannot be changed. This immutability makes it safe to share an index between data structures.
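As a minimal sketch of what immutability means in practice, the snippet below tries to change one label in place and then replaces the whole index instead. The series and the variable name `in_place_change_worked` are illustrative and not part of the original notebook.

```python
import pandas as pd

ser = pd.Series([5, 0, 3], index=['red', 'blue', 'yellow'])

# Trying to change a single label in place fails, because Index is immutable.
try:
    ser.index[0] = 'black'
    in_place_change_worked = True
except TypeError:
    in_place_change_worked = False

# Assigning a whole new index to the Series is allowed.
ser.index = ['black', 'blue', 'yellow']
print(in_place_change_worked, list(ser.index))
```

Replacing the index as a whole leaves any other structure that shared the old Index object untouched.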

## Methods on Index
Some methods are available to get information about the index of a data structure.

In [1]:
import numpy as np
import pandas as pd

In [18]:
data = {'color' : ['blue','green','yellow','red','white'],
        'object' : ['ball','pen','pencil','paper','mug'],
        'price' : [1.2,1.0,0.6,0.9,1.7]}
frame = pd.DataFrame(data)

In [2]:
ser = pd.Series([5,0,3,8,4],index=['red','blue','yellow','white','green'])
ser

red       5
blue      0
yellow    3
white     8
green     4
dtype: int64

In [4]:
# Returns the index label of the lowest value
ser.idxmin()

'blue'

In [5]:
# Returns the index label of the highest value
ser.idxmax()

'white'
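A small sketch of how the returned labels can be used: looking them up in the same series gives back the minimum and maximum values. The names `lowest_label` and `highest_label` are illustrative.

```python
import pandas as pd

ser = pd.Series([5, 0, 3, 8, 4], index=['red', 'blue', 'yellow', 'white', 'green'])

lowest_label = ser.idxmin()   # label of the smallest value
highest_label = ser.idxmax()  # label of the largest value

# Looking the labels up again gives back the extreme values themselves.
print(lowest_label, ser[lowest_label])
print(highest_label, ser[highest_label])
```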

## Index with Duplicate Labels

In [6]:
serd = pd.Series(range(6), index=['white','white','blue','green', 'green','yellow'])
serd

white     0
white     1
blue      2
green     3
green     4
yellow    5
dtype: int64

In [7]:
serd['white']

white    0
white    1
dtype: int64

pandas provides the is_unique attribute on Index objects. It is True when every label is unique and False when the index of the data structure (Series or DataFrame) contains duplicate labels.

In [8]:
serd.index.is_unique

False

In [19]:
frame.index.is_unique

True
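Once is_unique reports duplicates, a natural next step is to find them. The sketch below assumes that keeping the first occurrence of each label is acceptable for your data; the names `dup_mask`, `dup_labels` and `deduped` are illustrative.

```python
import pandas as pd

serd = pd.Series(range(6), index=['white', 'white', 'blue', 'green', 'green', 'yellow'])

# Boolean mask: True for every repeated occurrence of a label after the first.
dup_mask = serd.index.duplicated(keep='first')

# Labels that occur more than once.
dup_labels = serd.index[dup_mask].unique().tolist()

# Keep only the first occurrence of each label.
deduped = serd[~dup_mask]

print(dup_labels)
print(deduped.index.is_unique)
```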

## Other Functionalities on Indexes

### Reindexing

In [10]:
ser = pd.Series([2,5,7,4], index=['one','two','three','four'])
ser

one      2
two      5
three    7
four     4
dtype: int64

To reindex this series, pandas provides the reindex() method. It creates a new Series object with the values of the original series rearranged according to the new sequence of labels.

During reindexing, you can change the order of the labels, drop some of them, or add new ones. For a new label, pandas adds NaN as the corresponding value.

In [12]:
ser.reindex(['three','four','five','six','one'])

three    7.0
four     4.0
five     NaN
six      NaN
one      2.0
dtype: float64
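If NaN is not wanted for the new labels, reindex() also accepts a fill_value argument. The sketch below uses 0 as the placeholder, which is only an assumption about what makes sense for the data.

```python
import pandas as pd

ser = pd.Series([2, 5, 7, 4], index=['one', 'two', 'three', 'four'])

# Labels missing from the original series get 0 instead of NaN.
filled = ser.reindex(['three', 'four', 'five', 'six', 'one'], fill_value=0)
print(filled)
```

Because no NaN is introduced, the values do not need to be converted to floating point as they were in the output above.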

In [13]:
# Filling (interpolation) while reindexing: the labels are ascending but not contiguous
ser3 = pd.Series([1,5,6,3],index=[0,3,5,6])
ser3

0    1
3    5
5    6
6    3
dtype: int64

In [14]:
ser3.reindex(np.arange(6), method='ffill')

0    1
1    1
2    1
3    5
4    5
5    6
dtype: int64

With method='ffill' (forward fill), each missing label takes the value of the closest preceding label in the original series. Labels 1 and 2 therefore get the value 1, which belongs to label 0.

If you want each missing label to take the value of the next label instead, use the bfill (backward fill) method.

In [16]:
ser3.reindex(np.arange(6), method='bfill')

0    1
1    5
2    5
3    5
4    6
5    6
dtype: int64
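The fill methods rely on the original index being sorted. The sketch below, using a made-up series named `unordered`, shows the error raised for an unsorted index and one way around it with sort_index().

```python
import pandas as pd

unordered = pd.Series([1, 2, 3], index=[3, 0, 5])

# Filling on an index that is not sorted is rejected by pandas.
try:
    unordered.reindex(range(6), method='ffill')
    fill_worked = True
except ValueError:
    fill_worked = False

# Sorting the index first makes the fill well defined.
ordered = unordered.sort_index()
filled = ordered.reindex(range(6), method='ffill')
print(fill_worked)
print(filled)
```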

In [20]:
frame

    color  object  price
0    blue    ball    1.2
1   green     pen    1.0
2  yellow  pencil    0.6
3     red   paper    0.9
4   white     mug    1.7


In [23]:
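# method='ffill' is applied along the columns too: 'colors' and 'new' are not in frame,
# so they are filled from the preceding existing column label, 'color'
frame.reindex(range(5), method='ffill', columns=['colors','price','new','object'])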

   colors  price     new  object
0    blue    1.2    blue    ball
1   green    1.0   green     pen
2  yellow    0.6  yellow  pencil
3     red    0.9     red   paper
4   white    1.7   white     mug


### Dropping
The drop() method returns a new object without the items that you want to delete.

In [24]:
ser = pd.Series(np.arange(4.), index=['red','blue','yellow','white'])
ser

red       0.0
blue      1.0
yellow    2.0
white     3.0
dtype: float64

In [27]:
ser.drop('yellow')

red      0.0
blue     1.0
white    3.0
dtype: float64

In [28]:
# The dropping is not reflected in the original data structure. It just returns the object without the dropped items.
ser.drop(['blue','white'])

red       0.0
yellow    2.0
dtype: float64
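Since drop() leaves the original untouched, the result has to be assigned to a name if you want to keep it. The following sketch also shows the errors='ignore' option for labels that may not exist; the label 'purple' is made up for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.arange(4.), index=['red', 'blue', 'yellow', 'white'])

# drop() returns a new Series; the original still contains 'yellow'.
without_yellow = ser.drop('yellow')
print('yellow' in ser.index, 'yellow' in without_yellow.index)

# To keep the change, assign the result back to a name.
ser = ser.drop('yellow')

# Dropping a missing label raises KeyError unless errors='ignore' is passed.
unchanged = ser.drop('purple', errors='ignore')
print(list(unchanged.index))
```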

In [30]:
frame = pd.DataFrame(np.arange(16).reshape((4,4)),index=['red','blue','yellow','white'],columns=['ball','pen','pencil','paper'])

In [31]:
frame

        ball  pen  pencil  paper
red        0    1       2      3
blue       4    5       6      7
yellow     8    9      10     11
white     12   13      14     15


In [32]:
frame.drop(['blue','yellow'])

       ball  pen  pencil  paper
red       0    1       2      3
white    12   13      14     15


To delete columns, you specify the column labels in the same way, but you must also indicate the axis from which to delete them, using the axis option (axis=1 refers to the columns).

In [33]:
frame.drop(['pencil','paper'],axis=1)

        ball  pen
red        0    1
blue       4    5
yellow     8    9
white     12   13
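As an alternative to the axis option, drop() also accepts the index and columns keywords, which some readers find easier to read. The sketch below rebuilds the same frame so that it runs on its own; the names `by_axis`, `by_keyword` and `rows_and_columns` are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape((4, 4)),
                     index=['red', 'blue', 'yellow', 'white'],
                     columns=['ball', 'pen', 'pencil', 'paper'])

# Two equivalent ways of dropping columns.
by_axis = frame.drop(['pencil', 'paper'], axis=1)
by_keyword = frame.drop(columns=['pencil', 'paper'])
print(by_axis.equals(by_keyword))

# Rows and columns can be dropped in a single call.
rows_and_columns = frame.drop(index=['blue'], columns=['paper'])
print(rows_and_columns)
```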


In [34]:
frame

        ball  pen  pencil  paper
red        0    1       2      3
blue       4    5       6      7
yellow     8    9      10     11
white     12   13      14     15


### Arithmetic and Data Alignment
Perhaps the most powerful feature of indexes is that pandas can align the indexes of two different data structures during arithmetic operations.

In [35]:
s1 = pd.Series([3,2,5,1],['white','yellow','green','blue'])
s2 = pd.Series([1,4,7,2,1],['white','yellow','black','blue','brown'])

In [36]:
# When a label is present in both objects, the values are added. Otherwise, the result is NaN.
s1 + s2

black     NaN
blue      3.0
brown     NaN
green     NaN
white     4.0
yellow    6.0
dtype: float64
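When NaN for unmatched labels is not the desired result, the add() method with fill_value can be used instead of the + operator. The sketch below treats a missing value as 0, which is an assumption that only suits data where an absent label means "nothing to add".

```python
import pandas as pd

s1 = pd.Series([3, 2, 5, 1], ['white', 'yellow', 'green', 'blue'])
s2 = pd.Series([1, 4, 7, 2, 1], ['white', 'yellow', 'black', 'blue', 'brown'])

# A label missing from one series is treated as 0 before the addition.
total = s1.add(s2, fill_value=0)
print(total)
```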

In [37]:
frame1 = pd.DataFrame(np.arange(16).reshape((4,4)), 
         index=['red','blue','yellow','white'],
         columns=['ball','pen','pencil','paper'])

In [40]:
frame2 = pd.DataFrame(np.arange(12).reshape((4,3)),
                       index=['blue','green','white','yellow'],
                       columns=['mug','pen','ball'])

In [41]:
frame1

        ball  pen  pencil  paper
red        0    1       2      3
blue       4    5       6      7
yellow     8    9      10     11
white     12   13      14     15


In [42]:
frame2

        mug  pen  ball
blue      0    1     2
green     3    4     5
white     6    7     8
yellow    9   10    11


In [43]:
frame1 + frame2

        ball  mug  paper   pen  pencil
blue     6.0  NaN    NaN   6.0     NaN
green    NaN  NaN    NaN   NaN     NaN
red      NaN  NaN    NaN   NaN     NaN
white   20.0  NaN    NaN  20.0     NaN
yellow  19.0  NaN    NaN  19.0     NaN
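The same fill_value idea applies to DataFrames through the add() method. In the sketch below, a cell present in only one frame is combined with 0, while a cell missing from both frames stays NaN. The name `total` is illustrative.

```python
import numpy as np
import pandas as pd

frame1 = pd.DataFrame(np.arange(16).reshape((4, 4)),
                      index=['red', 'blue', 'yellow', 'white'],
                      columns=['ball', 'pen', 'pencil', 'paper'])
frame2 = pd.DataFrame(np.arange(12).reshape((4, 3)),
                      index=['blue', 'green', 'white', 'yellow'],
                      columns=['mug', 'pen', 'ball'])

# Cells present in only one frame are added to 0 instead of producing NaN.
total = frame1.add(frame2, fill_value=0)
print(total)
```

Cells such as ('green', 'paper') remain NaN, because neither frame holds a value for that row and column pair.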
