# Essential Functionality
This section covers the basic mechanics of working with the data contained in a Series or DataFrame.

## Reindexing
The 'reindex' method creates a new object with the data conformed to a new index.

In [2]:
import numpy as np
import pandas as pd

In [11]:
# example
obj = pd.Series([3,2,5,2,1,5], index=['e','c','a','b','d','f'])
obj

e    3
c    2
a    5
b    2
d    1
f    5
dtype: int64

In [16]:
# reindex rearranges the data to match the new index;
# any label not present in the original index would get a missing value (NaN)
obj2 = obj.reindex(['a','b','c','d','e','f'])
obj2

a    5
b    2
c    2
d    1
e    3
f    5
dtype: int64
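The example above only reorders existing labels, so no missing values appear. As a minimal sketch of the other case, the snippet below reindexes with a label ('g') that is not in the original index; the names with_gap and filled are illustrative only. It also uses the fill_value option of reindex to substitute a constant instead of NaN.

```python
import numpy as np
import pandas as pd

obj = pd.Series([3, 2, 5, 2, 1, 5], index=['e', 'c', 'a', 'b', 'd', 'f'])

# 'g' is not in the original index, so its value comes back as NaN
# (the result is upcast to float to hold the missing value)
with_gap = obj.reindex(['a', 'b', 'g'])
print(with_gap)

# fill_value substitutes a constant for labels that were not present
filled = obj.reindex(['a', 'b', 'g'], fill_value=0)
print(filled)
```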

For ordered data like time series, it may be desirable to do some interpolation or filling of values when reindexing. The method option allows us to do this, using a method such as ffill, which forward-fills the values:

In [18]:
obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0,2,4])
obj3

0      blue
2    purple
4    yellow
dtype: object

In [26]:
obj3.reindex(range(6), method='ffill')

0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object

In [27]:
# bfill fills backward: each gap takes the next valid value.
# Index 5 has no value after it, so it stays NaN.
obj3.reindex(range(6), method='bfill')

0      blue
1    purple
2    purple
3    yellow
4    yellow
5       NaN
dtype: object
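Filling can also be capped. The sketch below assumes the limit option of reindex, which bounds how many consecutive new labels are filled from one source value; the longer range(8) and the names unlimited and limited are chosen only for illustration.

```python
import pandas as pd

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Without a limit, every trailing label is forward-filled from index 4
unlimited = obj3.reindex(range(8), method='ffill')

# With limit=1, at most one consecutive new label is filled per gap,
# so index 5 is filled while 6 and 7 remain missing
limited = obj3.reindex(range(8), method='ffill', limit=1)
print(limited)
```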

With DataFrame, reindex can alter either the (row) index, columns, or both. When passed only a sequence, it reindexes the rows in the result:

In [30]:
frame = pd.DataFrame(np.arange(9).reshape((3,3)),
                    index=['a','c','d'],
                    columns=['Ohio', 'Texas', 'California']
                    )
frame

   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8


In [33]:
frame2 = frame.reindex(['a','b','c','d'])
frame2

   Ohio  Texas  California
a   0.0    1.0         2.0
b   NaN    NaN         NaN
c   3.0    4.0         5.0
d   6.0    7.0         8.0


In [41]:
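states = ['Texas', 'Utah', 'California']
# columns can be reindexed with the columns keyword;
# 'Utah' is not in frame, so that column would be all NaN
frame.reindex(columns=states)
# only the last expression in a cell is displayed, so the output below comes from this call
frame.reindex(columns=['Ohio','Texas', 'California'])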

   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8
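Since reindex can alter rows, columns, or both, here is a minimal sketch that does both in one call using the index and columns keywords. It rebuilds the frame from above so it runs on its own; the name both is illustrative only.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

# Reindex rows and columns together; the new row 'b' and the new
# column 'Utah' have no source data, so they are filled with NaN
both = frame.reindex(index=['a', 'b', 'c', 'd'], columns=states)
print(both)
```

Note that 'Ohio' is absent from the result because it is not listed in states.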


## Dropping Entries from an Axis
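Dropping one or more entries from an axis is easy if you already have an index array or list without those entries. Otherwise, the drop method returns a new object with the indicated value or values deleted from an axis: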

In [50]:
ser = pd.Series(np.arange(5.), index=['a','b','c','d','e'])
ser

a    0.0
b    1.0
c    2.0
d    3.0
e    4.0
dtype: float64

In [51]:
# drop a single label
new_ser = ser.drop('c')
new_ser

a    0.0
b    1.0
d    3.0
e    4.0
dtype: float64

In [52]:
# drop several labels at once by passing a list
new_ser = ser.drop(['a', 'c'])
new_ser

b    1.0
d    3.0
e    4.0
dtype: float64

In [53]:
# drop returns a new object; the original Series is unchanged
ser

a    0.0
b    1.0
c    2.0
d    3.0
e    4.0
dtype: float64
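One detail worth knowing: drop raises a KeyError when a label is not found. The sketch below shows that behaviour and the errors='ignore' option, which skips missing labels instead; the label 'z' and the names raised and safe are made up for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])

# Dropping a label that does not exist raises KeyError by default
raised = False
try:
    ser.drop('z')
except KeyError:
    raised = True

# errors='ignore' drops the labels that exist and skips the rest
safe = ser.drop(['a', 'z'], errors='ignore')
print(safe)
```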

With a DataFrame, index values can be deleted from either axis. To illustrate this, consider the following example:


In [56]:
data = pd.DataFrame(np.arange(16).reshape((4,4)) + 1,
                   index=['Monday','Tuesday','Thursday', 'Saturday'],
                   columns=['Vince','Elroy','Kanye','Karnage'])
data

          Vince  Elroy  Kanye  Karnage
Monday        1      2      3        4
Tuesday       5      6      7        8
Thursday      9     10     11       12
Saturday     13     14     15       16


 - Calling drop with a label or a sequence of labels drops from the row labels by default; this is the same as passing axis=0.
 - To drop from the columns instead, pass axis=1 or axis='columns'.

In [60]:
data.drop('Thursday', axis=0)

          Vince  Elroy  Kanye  Karnage
Monday        1      2      3        4
Tuesday       5      6      7        8
Saturday     13     14     15       16


In [61]:
data.drop('Elroy', axis=1)

          Vince  Kanye  Karnage
Monday        1      3        4
Tuesday       5      7        8
Thursday      9     11       12
Saturday     13     15       16


In [62]:
# chain the two calls to drop a row and a column in one expression
data.drop('Thursday', axis=0).drop('Elroy', axis=1)

          Vince  Kanye  Karnage
Monday        1      3        4
Tuesday       5      7        8
Saturday     13     15       16
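As an alternative to chaining, drop also accepts index and columns keywords, so a row and a column can be removed in a single call. This is a minimal sketch that rebuilds the same data so it runs on its own; the names chained and trimmed are illustrative only.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)) + 1,
                    index=['Monday', 'Tuesday', 'Thursday', 'Saturday'],
                    columns=['Vince', 'Elroy', 'Kanye', 'Karnage'])

# Chained form: one call per axis
chained = data.drop('Thursday', axis=0).drop('Elroy', axis=1)

# Keyword form: name the row labels and column labels directly
trimmed = data.drop(index='Thursday', columns='Elroy')
print(trimmed)
```

Both forms return a new DataFrame and leave data untouched.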
