# Essential Functionalities

## Reindexing

In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

In [2]:
obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj

d    4.5
b    7.2
a   -5.3
c    3.6
dtype: float64

In [4]:
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
obj2

a   -5.3
b    7.2
c    3.6
d    4.5
e    NaN
dtype: float64

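- Reindexing does not change the original Series.
- A new Series is returned, with the values rearranged to match the new index. Labels that were not present before (here `e`) get `NaN`.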

In [5]:
obj

d    4.5
b    7.2
a   -5.3
c    3.6
dtype: float64
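The two points above can be checked directly. This is a minimal sketch that rebuilds the same data; the names `original` and `reindexed` are chosen here for illustration and do not appear in the cells above.

```python
import numpy as np
import pandas as pd

original = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
reindexed = original.reindex(['a', 'b', 'c', 'd', 'e'])

# reindex returns a new object; the original keeps its labels and order
print(reindexed is original)       # False
print(list(original.index))        # ['d', 'b', 'a', 'c']

# the label 'e' did not exist before, so its value is missing (NaN)
print(np.isnan(reindexed['e']))    # True
```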

- `method='ffill'` (forward fill): when reindexing, each new label that has no value takes the last valid value before it. This is useful for ordered data such as time series.

In [7]:
obj3 = pd.Series(['blue','purple','yellow'], index=[0,2,4])
obj3

0      blue
2    purple
4    yellow
dtype: object

In [8]:
obj3.reindex(np.arange(6), method='ffill')

0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object
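To see what `method='ffill'` changes, it may help to compare it with a plain `reindex` on the same data. The sketch below assumes the same three colors as above; the names `colors`, `plain`, and `filled` are illustrative only.

```python
import numpy as np
import pandas as pd

colors = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Without a fill method, the new labels 1, 3 and 5 get missing values
plain = colors.reindex(np.arange(6))
print(plain.isna().tolist())    # [False, True, False, True, False, True]

# With method='ffill', each new label takes the value of the label before it
filled = colors.reindex(np.arange(6), method='ffill')
print(filled.tolist())          # ['blue', 'blue', 'purple', 'purple', 'yellow', 'yellow']
```

Note that filling with `method` expects the original index to be sorted (monotonic), which is the case here.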

- With a DataFrame, `reindex` can alter the (row) index, the columns, or both.
- When passed only a sequence, it reindexes the rows.

In [9]:
frame = pd.DataFrame(np.arange(9).reshape((3,3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
frame

   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8


In [10]:
frame2 = frame.reindex(index=['a','b','c','d'])
frame2

   Ohio  Texas  California
a   0.0    1.0         2.0
b   NaN    NaN         NaN
c   3.0    4.0         5.0
d   6.0    7.0         8.0
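The output above shows `0.0` where the original frame had `0`. That is because the new row `b` has no data, and `NaN` is a floating-point value, so the integer columns are converted to floats. A small sketch of this, rebuilding the same frame:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
frame2 = frame.reindex(index=['a', 'b', 'c', 'd'])

# The original columns hold integers (dtype kind 'i')
print([dt.kind for dt in frame.dtypes])     # ['i', 'i', 'i']

# After reindexing, row 'b' is all NaN, so every column becomes float (kind 'f')
print([dt.kind for dt in frame2.dtypes])    # ['f', 'f', 'f']
print(frame2.loc['b'].isna().all())         # True
```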


- The columns can be reindexed with the `columns` keyword.
- A column that is not listed in the new labels is dropped from the result, and a listed label that does not exist in the DataFrame becomes a column of `NaN`. For example, `Ohio` is dropped and `Utah` is all `NaN` below.
- Another way to reindex the columns is to pass the new labels together with the `axis` argument.

In [11]:
states = ['Texas', 'Utah', 'California']
frame.reindex(columns=states)

   Texas  Utah  California
a      1   NaN           2
c      4   NaN           5
d      7   NaN           8


In [12]:
frame

   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8


In [14]:
frame.reindex(states, axis='columns')

   Texas  Utah  California
a      1   NaN           2
c      4   NaN           5
d      7   NaN           8


In [15]:
frame.reindex(states, axis=1)

   Texas  Utah  California
a      1   NaN           2
c      4   NaN           5
d      7   NaN           8
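The three calls above (`columns=`, `axis='columns'`, and `axis=1`) are different spellings of the same operation. The following sketch compares them on the same frame; the names `by_keyword`, `by_axis_name`, and `by_axis_number` are illustrative only.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

by_keyword = frame.reindex(columns=states)
by_axis_name = frame.reindex(states, axis='columns')
by_axis_number = frame.reindex(states, axis=1)

# All three spellings produce the same result
print(by_keyword.equals(by_axis_name))      # True
print(by_keyword.equals(by_axis_number))    # True

# 'Ohio' was not requested, so it is absent; 'Utah' is new, so it is all NaN
print('Ohio' in by_keyword.columns)         # False
print(by_keyword['Utah'].isna().all())      # True
```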


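- Rows and columns can also be selected in a new order with the `loc` operator, and many users prefer this form. Unlike `reindex`, `loc` works only when every requested label already exists in the DataFrame; it does not insert missing data for new labels.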

In [16]:
frame.loc[['a', 'd', 'c'], ['California', 'Texas']]

   California  Texas
a           2      1
d           8      7
c           5      4
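One practical difference between `loc` and `reindex` is how they treat labels that do not exist. The sketch below assumes a recent pandas version (1.0 or later), where `loc` with a list containing a missing label raises `KeyError`; the names `with_missing` and `loc_failed` are illustrative only.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])

# reindex accepts a label that does not exist and fills the row with NaN
with_missing = frame.reindex(index=['a', 'b'])
print(with_missing.loc['b'].isna().all())   # True

# loc with the same labels raises KeyError, because 'b' is not in the index
try:
    frame.loc[['a', 'b']]
    loc_failed = False
except KeyError:
    loc_failed = True
print(loc_failed)                           # True
```

So `loc` is a good fit for reordering or subsetting existing labels, while `reindex` is the tool to reach for when new labels should be introduced.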
