# Contents

- Basic Indexing Definition 
- 4 Dataframe Indexers 
    - [  ]  - Row operations
    - .loc[ ] - Selecting by Labels
    - .iloc[ ] - Selecting by Positions
    - .ix[ ] (Pending)
- Boolean Indexing

### What is Indexing in Pandas?

Indexing in pandas means selecting particular rows and columns of data from a DataFrame. That can be all of the rows and some of the columns, some of the rows and all of the columns, or some of each. Indexing is also known as subset selection.
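To make the three cases concrete, here is a minimal sketch on a tiny hand-built DataFrame. The frame `df` and the variable names are assumptions for illustration only; the values are borrowed from a few rows of the NBA data used in this notebook.

```python
import pandas as pd

# Hypothetical three-row stand-in for the NBA data.
df = pd.DataFrame(
    {
        "Name": ["Avery Bradley", "Jae Crowder", "John Holland"],
        "Team": ["Boston Celtics"] * 3,
        "Age": [25, 25, 27],
        "Weight": [180, 235, 205],
    }
)

all_rows_some_cols = df[["Name", "Age"]]           # all rows, two columns
some_rows_all_cols = df[0:2]                        # first two rows, all columns
some_of_each = df.loc[[0, 2], ["Name", "Weight"]]   # two rows, two columns

print(all_rows_some_cols.shape)
print(some_rows_all_cols.shape)
print(some_of_each.shape)
```

Each result is itself a DataFrame, so the selections can be chained or inspected with `.head()` like the full data set.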

In [74]:
import numpy as np
import pandas as pd

In [5]:
data = pd.read_csv('nba.csv')

In [7]:
data.head(5)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0,PG,25,2-Jun,180,Texas,7730337.0
1,Jae Crowder,Boston Celtics,99,SF,25,6-Jun,235,Marquette,6796117.0
2,John Holland,Boston Celtics,30,SG,27,5-Jun,205,Boston University,
3,R.J. Hunter,Boston Celtics,28,SG,22,5-Jun,185,Georgia State,1148640.0
4,Jonas Jerebko,Boston Celtics,8,PF,29,10-Jun,231,,5000000.0


### Goal: find a subset of the data above

#### Pandas Indexing

Pandas supports four types of multi-axis indexing:

- DataFrame[ ] : Also known as the indexing operator.
- DataFrame.loc[ ] : Used for labels. It is part of the suite of methods pandas provides for purely label-based indexing.
- DataFrame.iloc[ ] : Used for positions (integer-based).
- DataFrame.ix[ ] : Used for both labels and integer positions. It was deprecated in pandas 0.20 and removed in pandas 1.0, so use .loc[ ] or .iloc[ ] instead.

These four indexers help in getting elements, rows, and columns from a DataFrame.
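Because `.ix[ ]` is not available in pandas 1.0 and later, a mixed "row by label, column by position" lookup has to be expressed with `.loc[ ]` or `.iloc[ ]`. The sketch below shows one possible way to do that; the two-row frame `df` is a hypothetical stand-in, not the full data set.

```python
import pandas as pd

# Hypothetical two-row frame indexed by player name.
df = pd.DataFrame(
    {"Team": ["Boston Celtics", "Boston Celtics"], "Age": [25, 22], "Weight": [180, 220]},
    index=["Avery Bradley", "Marcus Smart"],
)

# Row by label, column by position: turn the position into a label for .loc
by_loc = df.loc["Marcus Smart", df.columns[2]]

# Same cell: turn the row label into a position for .iloc
by_iloc = df.iloc[df.index.get_loc("Marcus Smart"), 2]

print(by_loc, by_iloc)
```

Both lookups return the same cell; which form reads better depends on whether the rest of the code thinks in labels or in positions.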
 
Indexing a DataFrame using the indexing operator [ ]:
The term "indexing operator" refers to the square brackets following an object. The .loc and .iloc indexers also use square brackets to make selections.

The df.iloc indexer is very similar to df.loc but only uses integer locations to make its selections.
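The difference between the two is easiest to see when the index labels are integers that do not match the positions. The following is a small sketch on a hypothetical Series `s` (not part of the NBA data); it also shows that a `.loc` slice includes its end label while an `.iloc` slice excludes its stop position.

```python
import pandas as pd

# Hypothetical Series whose integer labels differ from the positions 0..3.
s = pd.Series(["a", "b", "c", "d"], index=[10, 20, 30, 40])

by_label = s.loc[20]       # the element labelled 20
by_position = s.iloc[2]    # the element at position 2

label_slice = s.loc[10:30]   # label slice: both endpoints included
pos_slice = s.iloc[0:2]      # position slice: stop excluded

print(by_label, by_position)
print(list(label_slice), list(pos_slice))
```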

### 1. The most basic indexing using [ ]

In [59]:
data['Name'].head(10) # The object type is Series 

0    Avery Bradley
1      Jae Crowder
2     John Holland
3      R.J. Hunter
4    Jonas Jerebko
5     Amir Johnson
6    Jordan Mickey
7     Kelly Olynyk
8     Terry Rozier
9     Marcus Smart
Name: Name, dtype: object

In [64]:
# SYNTAX: data[start:end:step]

data[1:3].head(5)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
1,Jae Crowder,Boston Celtics,99,SF,25,6-Jun,235,Marquette,6796117.0
2,John Holland,Boston Celtics,30,SG,27,5-Jun,205,Boston University,


In [65]:
# SYNTAX: data[start:end:step]

data[3:].head(5)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
3,R.J. Hunter,Boston Celtics,28,SG,22,5-Jun,185,Georgia State,1148640.0
4,Jonas Jerebko,Boston Celtics,8,PF,29,10-Jun,231,,5000000.0
5,Amir Johnson,Boston Celtics,90,PF,29,9-Jun,240,,12000000.0
6,Jordan Mickey,Boston Celtics,55,PF,21,8-Jun,235,LSU,1170960.0
7,Kelly Olynyk,Boston Celtics,41,C,25,Jul-00,238,Gonzaga,2165160.0


In [69]:
# SYNTAX: data[start:end:step]

data[::100]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0,PG,25,2-Jun,180,Texas,7730337.0
100,Chris Paul,Los Angeles Clippers,3,PG,31,Jun-00,175,Wake Forest,21468695.0
200,George Hill,Indiana Pacers,3,PG,30,3-Jun,188,IUPUI,8000000.0
300,Danny Green,San Antonio Spurs,14,SG,28,6-Jun,215,North Carolina,10000000.0
400,Kevin Garnett,Minnesota Timberwolves,21,PF,40,11-Jun,240,,8500000.0


In [70]:
# SYNTAX: data[start:end:step]

data[::-100]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
456,Jeff Withey,Utah Jazz,24,C,26,Jul-00,231,Kansas,947276.0
356,Aaron Gordon,Orlando Magic,0,PF,20,9-Jun,220,Arizona,4171680.0
256,Jason Terry,Houston Rockets,31,SG,38,2-Jun,185,Arizona,947276.0
156,Pau Gasol,Chicago Bulls,16,C,35,Jul-00,250,,7448760.0
56,Jahlil Okafor,Philadelphia 76ers,8,C,20,11-Jun,275,Duke,4582680.0


### Using method results inside the operator [ ]

In [94]:
data[data.isnull().any(axis = 1)].head(10)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
2,John Holland,Boston Celtics,30,SG,27,5-Jun,205,Boston University,
4,Jonas Jerebko,Boston Celtics,8,PF,29,10-Jun,231,,5000000.0
5,Amir Johnson,Boston Celtics,90,PF,29,9-Jun,240,,12000000.0
15,Bojan Bogdanovic,Brooklyn Nets,44,SG,27,8-Jun,216,,3425510.0
20,Sergey Karasev,Brooklyn Nets,10,SG,22,7-Jun,208,,1599840.0
32,Thanasis Antetokounmpo,New York Knicks,43,SF,23,7-Jun,205,,30888.0
34,Jose Calderon,New York Knicks,3,PG,34,3-Jun,200,,7402812.0
40,Kristaps Porzingis,New York Knicks,6,PF,20,3-Jul,240,,4131720.0
41,Kevin Seraphin,New York Knicks,1,C,26,10-Jun,278,,2814000.0
43,Sasha Vujacic,New York Knicks,18,SG,32,7-Jun,195,,947276.0


### Filtering

In [96]:
data[data["Name"] == "John Holland"]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
2,John Holland,Boston Celtics,30,SG,27,5-Jun,205,Boston University,


### 2. You can pass a *list* of columns to [ ] to select columns in that order

In [24]:
data[['Name','Weight']].head(10)

Unnamed: 0,Name,Weight
0,Avery Bradley,180
1,Jae Crowder,235
2,John Holland,205
3,R.J. Hunter,185
4,Jonas Jerebko,231
5,Amir Johnson,240
6,Jordan Mickey,235
7,Kelly Olynyk,238
8,Terry Rozier,190
9,Marcus Smart,220
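A detail worth noting here is that a single column name returns a Series, while a list of names returns a DataFrame, even when the list holds only one name. The sketch below illustrates this on a hypothetical two-row frame `df` built for the example.

```python
import pandas as pd

# Hypothetical two-row frame with the same column names as the NBA data.
df = pd.DataFrame(
    {"Name": ["Avery Bradley", "Jae Crowder"], "Age": [25, 25], "Weight": [180, 235]}
)

single = df["Name"]                  # one name -> Series
as_frame = df[["Name"]]              # list with one name -> DataFrame
reordered = df[["Weight", "Name"]]   # columns come back in the order listed

print(type(single).__name__, type(as_frame).__name__)
print(list(reordered.columns))
```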


### 3. Indexing using the labels of the index: loc[ ]

In [27]:
data.loc[0]

Name         Avery Bradley
Team        Boston Celtics
Number                   0
Position                PG
Age                     25
Height               2-Jun
Weight                 180
College              Texas
Salary         7.73034e+06
Name: 0, dtype: object

In [35]:
data.loc[9]

Name          Marcus Smart
Team        Boston Celtics
Number                  36
Position                PG
Age                     22
Height               4-Jun
Weight                 220
College     Oklahoma State
Salary         3.43104e+06
Name: 9, dtype: object

In [36]:
data.loc[[0,9]]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0,PG,25,2-Jun,180,Texas,7730337.0
9,Marcus Smart,Boston Celtics,36,PG,22,4-Jun,220,Oklahoma State,3431040.0


In [31]:
# Use the "Name" column as the index of the data set
indexed_data = pd.read_csv('nba.csv', index_col = "Name")

In [37]:
indexed_data.loc['Avery Bradley']

Team        Boston Celtics
Number                   0
Position                PG
Age                     25
Height               2-Jun
Weight                 180
College              Texas
Salary         7.73034e+06
Name: Avery Bradley, dtype: object

In [40]:
indexed_data.loc[['Avery Bradley', 'Marcus Smart']]

Name,Team,Number,Position,Age,Height,Weight,College,Salary
Avery Bradley,Boston Celtics,0,PG,25,2-Jun,180,Texas,7730337.0
Marcus Smart,Boston Celtics,36,PG,22,4-Jun,220,Oklahoma State,3431040.0


### Selecting two rows and three columns

In [44]:
indexed_data.loc[['Avery Bradley', 'Marcus Smart'], ['Position','Weight','Salary']]

Name,Position,Weight,Salary
Avery Bradley,PG,180,7730337.0
Marcus Smart,PG,220,3431040.0


### Selecting all of the rows and some columns

In [89]:
indexed_data.loc[:, ['Position','Weight','Salary']].head(5)

Name,Position,Weight,Salary
Avery Bradley,PG,180,7730337.0
Jae Crowder,SF,235,6796117.0
John Holland,SG,205,
R.J. Hunter,SG,185,1148640.0
Jonas Jerebko,PF,231,5000000.0


### Selecting some of the rows and some columns

In [90]:
data.loc[450:, ['Position','Weight','Salary']].head(5)

Unnamed: 0,Position,Weight,Salary
450,SF,226,2050000.0
451,SF,206,981348.0
452,PF,234,2239800.0
453,PG,203,2433333.0
454,PG,179,900000.0


### 4. Indexing using integer positions: iloc[ ]

In [51]:
# Selecting a row
indexed_data.iloc[0] 

Team        Boston Celtics
Number                   0
Position                PG
Age                     25
Height               2-Jun
Weight                 180
College              Texas
Salary         7.73034e+06
Name: Avery Bradley, dtype: object

In [54]:
# Selecting multiple rows and all columns
indexed_data.iloc[[0, 1, 2, 3]]

Name,Team,Number,Position,Age,Height,Weight,College,Salary
Avery Bradley,Boston Celtics,0,PG,25,2-Jun,180,Texas,7730337.0
Jae Crowder,Boston Celtics,99,SF,25,6-Jun,235,Marquette,6796117.0
John Holland,Boston Celtics,30,SG,27,5-Jun,205,Boston University,
R.J. Hunter,Boston Celtics,28,SG,22,5-Jun,185,Georgia State,1148640.0


In [56]:
# Selecting Multiple rows and columns 

indexed_data.iloc[[0, 1, 2, 3],[2,5,7] ]

Name,Position,Weight,Salary
Avery Bradley,PG,180,7730337.0
Jae Crowder,SF,235,6796117.0
John Holland,SG,205,
R.J. Hunter,SG,185,1148640.0


In [71]:
# Selecting all rows and only the columns at positions 2, 5, 7
indexed_data.iloc[:,[2,5,7] ].head(5)

Name,Position,Weight,Salary
Avery Bradley,PG,180,7730337.0
Jae Crowder,SF,235,6796117.0
John Holland,SG,205,
R.J. Hunter,SG,185,1148640.0
Jonas Jerebko,PF,231,5000000.0


### Series Indexing

In [77]:
S1 = pd.Series(np.random.randn(6),index = list('abcdef'))
S1

a    0.936408
b    0.386765
c    0.702319
d    0.793005
e   -0.739255
f   -0.725367
dtype: float64

In [81]:
S1.loc['a']

0.9364079175151515

In [83]:
S1.iloc[0]

0.9364079175151515

In [86]:
S1.loc['d':]

d    0.793005
e   -0.739255
f   -0.725367
dtype: float64

In [136]:
S1.loc[::-2]

f   -0.725367
d    0.793005
b    0.386765
dtype: float64

### Boolean Indexing 
- Using a single column’s values to select data.

In [138]:
data[data["Age"] > 38]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
102,Pablo Prigioni,Los Angeles Clippers,9,PG,39,3-Jun,185,,947726.0
261,Vince Carter,Memphis Grizzlies,15,SG,39,6-Jun,220,North Carolina,4088019.0
298,Tim Duncan,San Antonio Spurs,21,C,40,11-Jun,250,Wake Forest,5250000.0
304,Andre Miller,San Antonio Spurs,24,PG,40,3-Jun,200,Utah,250750.0
400,Kevin Garnett,Minnesota Timberwolves,21,PF,40,11-Jun,240,,8500000.0


In [140]:
data[data.Age > 38]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
102,Pablo Prigioni,Los Angeles Clippers,9,PG,39,3-Jun,185,,947726.0
261,Vince Carter,Memphis Grizzlies,15,SG,39,6-Jun,220,North Carolina,4088019.0
298,Tim Duncan,San Antonio Spurs,21,C,40,11-Jun,250,Wake Forest,5250000.0
304,Andre Miller,San Antonio Spurs,24,PG,40,3-Jun,200,Utah,250750.0
400,Kevin Garnett,Minnesota Timberwolves,21,PF,40,11-Jun,240,,8500000.0


In [141]:
data[data["Age"].isin([39,40])]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
102,Pablo Prigioni,Los Angeles Clippers,9,PG,39,3-Jun,185,,947726.0
261,Vince Carter,Memphis Grizzlies,15,SG,39,6-Jun,220,North Carolina,4088019.0
298,Tim Duncan,San Antonio Spurs,21,C,40,11-Jun,250,Wake Forest,5250000.0
304,Andre Miller,San Antonio Spurs,24,PG,40,3-Jun,200,Utah,250750.0
400,Kevin Garnett,Minnesota Timberwolves,21,PF,40,11-Jun,240,,8500000.0


In [151]:
data[data["College"] == "Utah"]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
75,Delon Wright,Toronto Raptors,55,PG,24,5-Jun,190,Utah,1509360.0
78,Andrew Bogut,Golden State Warriors,12,C,31,Jul-00,260,Utah,13800000.0
304,Andre Miller,San Antonio Spurs,24,PG,40,3-Jun,200,Utah,250750.0


In [156]:
data[data["College"].isnull()].head(5) 

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
4,Jonas Jerebko,Boston Celtics,8,PF,29,10-Jun,231,,5000000.0
5,Amir Johnson,Boston Celtics,90,PF,29,9-Jun,240,,12000000.0
15,Bojan Bogdanovic,Brooklyn Nets,44,SG,27,8-Jun,216,,3425510.0
20,Sergey Karasev,Brooklyn Nets,10,SG,22,7-Jun,208,,1599840.0
32,Thanasis Antetokounmpo,New York Knicks,43,SF,23,7-Jun,205,,30888.0


In [157]:
data[data["College"].isna()].head(5) 

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
4,Jonas Jerebko,Boston Celtics,8,PF,29,10-Jun,231,,5000000.0
5,Amir Johnson,Boston Celtics,90,PF,29,9-Jun,240,,12000000.0
15,Bojan Bogdanovic,Brooklyn Nets,44,SG,27,8-Jun,216,,3425510.0
20,Sergey Karasev,Brooklyn Nets,10,SG,22,7-Jun,208,,1599840.0
32,Thanasis Antetokounmpo,New York Knicks,43,SF,23,7-Jun,205,,30888.0


In [160]:
data[(data["College"].isna()) & (data["Age"] > 38)] 

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
102,Pablo Prigioni,Los Angeles Clippers,9,PG,39,3-Jun,185,,947726.0
400,Kevin Garnett,Minnesota Timberwolves,21,PF,40,11-Jun,240,,8500000.0
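Boolean masks can also be stored in variables and combined with `&` (and), `|` (or), and `~` (not); each comparison needs its own parentheses when written inline, because these operators bind more tightly than `>` or `==`. The sketch below is a minimal illustration on a hypothetical four-row frame `df` whose values are taken from rows shown in this notebook; the mask names are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row stand-in for the NBA data.
df = pd.DataFrame(
    {
        "Name": ["Pablo Prigioni", "Vince Carter", "Kevin Garnett", "Avery Bradley"],
        "Age": [39, 39, 40, 25],
        "College": [np.nan, "North Carolina", np.nan, "Texas"],
    }
)

no_college = df["College"].isna()   # True where College is missing
older = df["Age"] > 38              # True where Age is above 38

both = df[no_college & older]       # rows matching both conditions
either = df[no_college | older]     # rows matching at least one condition
has_college = df[~no_college]       # rows where College is present

# .loc accepts a mask for the rows and a label for the columns.
names_only = df.loc[no_college & older, "Name"]

print(len(both), len(either), len(has_college))
print(list(names_only))
```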
