**2. Conditional selection and reindexing of a dataframe**

Conditionally selecting values in a dataframe works much like selecting values from a NumPy array: a comparison produces a boolean mask, and indexing with that mask keeps the matching entries.
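
For comparison, here is a minimal sketch of the same idea on a plain NumPy array. The array `arr` and its values are illustrative only.

```python
import numpy as np

# Illustrative scores; not part of the dataframe built in this section.
arr = np.array([8, 24, 67, 87])

mask = arr > 50        # boolean array: [False, False, True, True]
selected = arr[mask]   # keeps only the elements where the mask is True

print(mask)
print(selected)
```

Unlike a dataframe, a NumPy array indexed with a mask drops the non-matching elements instead of keeping their positions.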

In [3]:
import pandas as pd
import numpy as np

from numpy.random import randint
np.random.seed(100)

df = pd.DataFrame(randint(0, 100, (4, 4)),
                  index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
                  columns=['Match1', 'Match2', 'Match3', 'Final'])

df

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,8,24,67,87
Sehwag,79,48,10,94
Ganguly,52,98,53,66
David,98,14,34,24


In [4]:
df>50  # conditional mask

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,False,False,True,True
Sehwag,True,False,False,True
Ganguly,True,True,True,True
David,True,False,False,False


In [5]:
df[df>50]  # display scores where the players have scored greater than 50

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,,,67.0,87.0
Sehwag,79.0,,,94.0
Ganguly,52.0,98.0,53.0,66.0
David,98.0,,,
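
The scores in this output are floats because entries that fail the condition become `NaN`, which is a floating-point value, so the integer columns are upcast. As a rough sketch, the same result can be written with `DataFrame.where`, which also accepts a replacement value. The dataframe is rebuilt here from the values shown above so that the block runs on its own; the names `masked`, `same`, and `filled` are illustrative.

```python
import pandas as pd

# Same values as the dataframe in this section, written out explicitly.
df = pd.DataFrame(
    [[8, 24, 67, 87], [79, 48, 10, 94], [52, 98, 53, 66], [98, 14, 34, 24]],
    index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
    columns=['Match1', 'Match2', 'Match3', 'Final'],
)

masked = df[df > 50]           # entries that fail the condition become NaN
same = df.where(df > 50)       # equivalent spelling of the line above
filled = df.where(df > 50, 0)  # use 0 instead of NaN for failing entries

print(masked.dtypes)
print(filled)
```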


In [14]:
df[df['Match1']>50]  # display rows where players have scored ">50" in Match1

# df['Match1']>50 is a boolean Series (a row mask); indexing with it keeps only the rows where it is True

Unnamed: 0,Match1,Match2,Match3,Final
Sehwag,79,48,10,94
Ganguly,52,98,53,66
David,98,14,34,24


In [15]:
df[df['Match1']<10]  # display rows where players have scored "<10" in Match1

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,8,24,67,87


In [16]:
# display scores of players in Match2 who scored ">50" in Match1

df[df['Match1']>50]['Match2']

Sehwag     48
Ganguly    98
David      14
Name: Match2, dtype: int32

In [19]:
# display scores of players in Match2 & Match3 who scored ">50" in Match1

df[df['Match1']>50][['Match2', 'Match3']]

Unnamed: 0,Match2,Match3
Sehwag,48,10
Ganguly,98,53
David,14,34
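
The two selections above chain one indexing step after another. A single `.loc` call that takes the row mask and the column labels together gives the same result and is usually the safer form when values will later be assigned. The sketch below rebuilds the dataframe from the values shown in this section; `match2` and `both` are illustrative names.

```python
import pandas as pd

# Same values as the dataframe in this section, written out explicitly.
df = pd.DataFrame(
    [[8, 24, 67, 87], [79, 48, 10, 94], [52, 98, 53, 66], [98, 14, 34, 24]],
    index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
    columns=['Match1', 'Match2', 'Match3', 'Final'],
)

row_mask = df['Match1'] > 50

# Rows where Match1 > 50, single column -> Series
match2 = df.loc[row_mask, 'Match2']

# Rows where Match1 > 50, two columns -> DataFrame
both = df.loc[row_mask, ['Match2', 'Match3']]

print(match2)
print(both)
```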


For multiple conditions, we can combine them with the & (and) and | (or) operators, wrapping each condition in parentheses:

In [21]:
# Displaying scores of players who scored ">50" in Match1 and Final

df[(df['Match1']>50) & (df['Final']>50)]

Unnamed: 0,Match1,Match2,Match3,Final
Sehwag,79,48,10,94
Ganguly,52,98,53,66


In [23]:
# Displaying scores of players who scored ">50" in Match3 or Final

df[(df['Match3']>50) | (df['Final']>50)]

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,8,24,67,87
Sehwag,79,48,10,94
Ganguly,52,98,53,66
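
The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python, and the keywords `and` / `or` do not work element-wise on a Series. The sketch below illustrates both points and adds `~` for negation. The dataframe is rebuilt from the values shown in this section, and the variable names are illustrative.

```python
import pandas as pd

# Same values as the dataframe in this section, written out explicitly.
df = pd.DataFrame(
    [[8, 24, 67, 87], [79, 48, 10, 94], [52, 98, 53, 66], [98, 14, 34, 24]],
    index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
    columns=['Match1', 'Match2', 'Match3', 'Final'],
)

# Element-wise AND of two parenthesised conditions
high_both = df[(df['Match1'] > 50) & (df['Final'] > 50)]

# ~ inverts a mask: rows where Final is NOT greater than 50
not_high_final = df[~(df['Final'] > 50)]

# The keyword `and` asks for a single truth value of a whole Series,
# which pandas refuses to provide.
try:
    df[df['Match1'] > 50 and df['Final'] > 50]
    raised = False
except ValueError:
    raised = True

print(high_both)
print(not_high_final)
print(raised)
```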


**3. Resetting and setting new index of a dataframe**

In [24]:
df

Unnamed: 0,Match1,Match2,Match3,Final
Sachin,8,24,67,87
Sehwag,79,48,10,94
Ganguly,52,98,53,66
David,98,14,34,24


In [26]:
df.reset_index()  # returns a new dataframe with the default index 0, 1...; the old labels move to an 'index' column

Unnamed: 0,index,Match1,Match2,Match3,Final
0,Sachin,8,24,67,87
1,Sehwag,79,48,10,94
2,Ganguly,52,98,53,66
3,David,98,14,34,24
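
Note that `reset_index()` returns a new dataframe and leaves `df` itself unchanged unless the result is assigned back. As a small sketch, passing `drop=True` discards the old labels instead of keeping them as a column. The dataframe is rebuilt from the values shown above; `reset` and `dropped` are illustrative names.

```python
import pandas as pd

# Same values as the dataframe in this section, written out explicitly.
df = pd.DataFrame(
    [[8, 24, 67, 87], [79, 48, 10, 94], [52, 98, 53, 66], [98, 14, 34, 24]],
    index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
    columns=['Match1', 'Match2', 'Match3', 'Final'],
)

reset = df.reset_index()             # old labels kept in a column named 'index'
dropped = df.reset_index(drop=True)  # old labels discarded

print(list(df.index))       # df still has the player names as its index
print(list(reset.columns))
print(list(dropped.columns))
```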


In [27]:
ind = ['Rio', 'Dowman', 'Amad', 'Isak']  # a list of new players

In [28]:
df['Players'] = ind  # creating a new column by name 'Players'

In [29]:
df

Unnamed: 0,Match1,Match2,Match3,Final,Players
Sachin,8,24,67,87,Rio
Sehwag,79,48,10,94,Dowman
Ganguly,52,98,53,66,Amad
David,98,14,34,24,Isak


In [30]:
df.set_index('Players', inplace = True)  # setting player names as index

In [31]:
df

Players,Match1,Match2,Match3,Final
Rio,8,24,67,87
Dowman,79,48,10,94
Amad,52,98,53,66
Isak,98,14,34,24
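
With `inplace = True`, `set_index` modifies `df` directly and the old index labels are lost. A possible alternative, sketched below, is to assign the returned dataframe to a new name so that the original stays available. The dataframe is rebuilt from the values shown in this section, and `by_player` is an illustrative name.

```python
import pandas as pd

# Same values as the dataframe in this section, written out explicitly.
df = pd.DataFrame(
    [[8, 24, 67, 87], [79, 48, 10, 94], [52, 98, 53, 66], [98, 14, 34, 24]],
    index=['Sachin', 'Sehwag', 'Ganguly', 'David'],
    columns=['Match1', 'Match2', 'Match3', 'Final'],
)
df['Players'] = ['Rio', 'Dowman', 'Amad', 'Isak']

# Without inplace=True, set_index returns a new dataframe; df is untouched.
by_player = df.set_index('Players')

# Rows can now be looked up by the new labels.
amad_final = by_player.loc['Amad', 'Final']

print(by_player)
print(amad_final)
```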
