### [How do I select multiple rows and columns from a pandas DataFrame?](https://www.youtube.com/watch?v=xvpNA7bC8cs)

In [1]:
import pandas as pd

In [2]:
ufo = pd.read_csv('http://bit.ly/uforeports')
ufo.head()

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00
4,New York Worlds Fair,,LIGHT,NY,4/18/1933 19:00


### loc - DataFrame indexer for selecting rows and columns by label
For rows, the labels are the index values; for columns, they are the column names.  
`loc` is indexed with square brackets `[]`, not called with parentheses `()`.   
E.g.: ufo.loc[`which rows we need`, `which columns we need`]
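
To make the "labels, not positions" point concrete, here is a minimal sketch on a toy DataFrame with string index labels. The data and the names `sightings`, `one_value` and `first_two` are made up for illustration and are not part of the UFO dataset.

```python
import pandas as pd

# Hypothetical toy data with string row labels
sightings = pd.DataFrame(
    {"City": ["Ithaca", "Holyoke", "Abilene"], "State": ["NY", "CO", "KS"]},
    index=["a", "b", "c"],
)

# Row label first, column label second
one_value = sightings.loc["b", "State"]

# A slice of row labels with all columns; the end label "b" is included
first_two = sightings.loc["a":"b", :]
```

With the default integer index used by `ufo`, the row labels happen to be 0, 1, 2, ..., which is why the calls in this notebook look positional even though they are label-based.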

In [3]:
# To select a single row
ufo.loc[0,:]

City                       Ithaca
Colors Reported               NaN
Shape Reported           TRIANGLE
State                          NY
Time               6/1/1930 22:00
Name: 0, dtype: object

In [4]:
# To select multiple contiguous rows
# Note that the slice includes both 0 and 3; unlike range, it does not exclude the end value
# So, loc is inclusive on both ends when you use slice notation
ufo.loc[0:3, :]

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00
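
As a small side-by-side check of the inclusive behaviour, the sketch below compares a `loc` slice with Python's `range` on a made-up five-row DataFrame (the name `df` and its values are assumptions for illustration).

```python
import pandas as pd

# Hypothetical data with the default integer index 0..4
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

label_slice = df.loc[0:3, :]       # labels 0, 1, 2 and 3
python_range = list(range(0, 3))   # 0, 1 and 2; the end value 3 is excluded

print(len(label_slice))   # 4
print(python_range)       # [0, 1, 2]
```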


In [5]:
## Gives the same result, but the form above is more readable because it states the column selection explicitly
ufo.loc[0:3]

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00


In [6]:
## To select multiple non-contiguous rows, pass a list of labels
ufo.loc[[3,5,6], :]

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
3,Abilene,,DISK,KS,6/1/1931 13:00
5,Valley City,,DISK,ND,9/15/1934 15:30
6,Crater Lake,,CIRCLE,CA,6/15/1935 0:00


In [7]:
## To select a single column (returns a Series)
ufo.loc[0:5, 'City']

0                  Ithaca
1             Willingboro
2                 Holyoke
3                 Abilene
4    New York Worlds Fair
5             Valley City
Name: City, dtype: object

In [8]:
## To select multiple columns
ufo.loc[0:5, ['City', 'State']]

Unnamed: 0,City,State
0,Ithaca,NY
1,Willingboro,NJ
2,Holyoke,CO
3,Abilene,KS
4,New York Worlds Fair,NY
5,Valley City,ND


In [9]:
## To select all columns from 'City' through 'State' (inclusive)
ufo.loc[0:5, 'City':'State']

Unnamed: 0,City,Colors Reported,Shape Reported,State
0,Ithaca,,TRIANGLE,NY
1,Willingboro,,OTHER,NJ
2,Holyoke,,OVAL,CO
3,Abilene,,DISK,KS
4,New York Worlds Fair,,LIGHT,NY
5,Valley City,,DISK,ND


#### Using loc with boolean conditions

In [10]:
## without loc
ufo[ufo['City']=='Ithaca']

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
4068,Ithaca,,CIGAR,NY,6/1/1979 19:00
5631,Ithaca,,OTHER,MI,6/1/1987 17:00
6961,Ithaca,,OTHER,NY,1/10/1993 0:30
7573,Ithaca,RED GREEN,LIGHT,NY,10/15/1994 18:00
9088,Ithaca,,,NY,2/16/1996 21:45
16537,Ithaca,,FLASH,MI,6/3/2000 22:35
17049,Ithaca,,TEARDROP,NY,7/30/2000 20:20


In [11]:
## without loc - chained indexing; this can sometimes cause problems, so loc is the better choice
## 2 operations are involved: a row filter, then a column selection
ufo[ufo['City']=='Ithaca']['State']

0        NY
4068     NY
5631     MI
6961     NY
7573     NY
9088     NY
16537    MI
17049    NY
Name: State, dtype: object
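
The problems with chained indexing mostly show up when assigning values. The following is a minimal sketch on made-up data (the names `df` and `mask` are assumptions); the chained form is left as a comment because, depending on the pandas version, it may raise a warning or silently modify a temporary copy.

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame(
    {"City": ["Ithaca", "Holyoke", "Ithaca"], "State": ["NY", "CO", "MI"]}
)

mask = df["City"] == "Ithaca"

# Chained form (not run here): df[mask]["State"] = "NY"
# The assignment may land on a temporary copy and leave df unchanged.

# Single loc call: rows and column are resolved together on df itself
df.loc[mask, "State"] = "NY"
```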

In [12]:
## Doing the same with loc is more flexible, as we can also specify which columns we need
## loc does this in a single operation, which is safer and often faster
ufo.loc[ufo['City']=='Ithaca',:]

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
4068,Ithaca,,CIGAR,NY,6/1/1979 19:00
5631,Ithaca,,OTHER,MI,6/1/1987 17:00
6961,Ithaca,,OTHER,NY,1/10/1993 0:30
7573,Ithaca,RED GREEN,LIGHT,NY,10/15/1994 18:00
9088,Ithaca,,,NY,2/16/1996 21:45
16537,Ithaca,,FLASH,MI,6/3/2000 22:35
17049,Ithaca,,TEARDROP,NY,7/30/2000 20:20


In [13]:
ufo.loc[ufo['City']=='Ithaca','State']

0        NY
4068     NY
5631     MI
6961     NY
7573     NY
9088     NY
16537    MI
17049    NY
Name: State, dtype: object
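
A boolean condition can be combined with any of the column selections shown earlier, not just a single column name. Here is a rough sketch on made-up data; `df`, `is_ithaca`, `subset` and `span` are illustrative names only.

```python
import pandas as pd

# Hypothetical toy data using a few of the UFO column names
df = pd.DataFrame(
    {
        "City": ["Ithaca", "Holyoke", "Ithaca"],
        "Shape Reported": ["TRIANGLE", "OVAL", "CIGAR"],
        "State": ["NY", "CO", "NY"],
    }
)

is_ithaca = df["City"] == "Ithaca"

# Boolean row filter plus a list of column names
subset = df.loc[is_ithaca, ["City", "State"]]

# Boolean row filter plus a column slice (inclusive on both ends)
span = df.loc[is_ithaca, "City":"Shape Reported"]
```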

***
***

### iloc - used for selecting rows and columns by integer position (i = integer)

Using iloc[:, x:y]: the slice includes x but excludes y
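
The difference between position and label is easier to see when the index is not 0, 1, 2, .... The sketch below uses a made-up DataFrame with index labels 10 to 40 (all names and values are assumptions for illustration).

```python
import pandas as pd

# Hypothetical data whose index labels differ from the row positions
df = pd.DataFrame(
    {
        "City": ["Ithaca", "Holyoke", "Abilene", "Valley City"],
        "State": ["NY", "CO", "KS", "ND"],
    },
    index=[10, 20, 30, 40],
)

# iloc: positions 0 and 1 (end position 2 excluded), column position 0 only
by_position = df.iloc[0:2, 0:1]

# loc: labels 10 through 30 (end label included), columns City through State
by_label = df.loc[10:30, "City":"State"]
```

Here `by_position` holds two rows and one column, while `by_label` holds three rows and two columns.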

In [14]:
ufo.iloc[0:6,0:4]

Unnamed: 0,City,Colors Reported,Shape Reported,State
0,Ithaca,,TRIANGLE,NY
1,Willingboro,,OTHER,NJ
2,Holyoke,,OVAL,CO
3,Abilene,,DISK,KS
4,New York Worlds Fair,,LIGHT,NY
5,Valley City,,DISK,ND


In [16]:
ufo.iloc[0:3,:]

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00


### To select multiple columns

In [18]:
ufo[['City', 'State']].head()

Unnamed: 0,City,State
0,Ithaca,NY
1,Willingboro,NJ
2,Holyoke,CO
3,Abilene,KS
4,New York Worlds Fair,NY


In [20]:
ufo.loc[0:2,:]   # preferred to ufo[0:2]; note that ufo[0:2] slices by position and returns only rows 0 and 1

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
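
One reason to prefer the explicit `loc` form is that a bare slice on the DataFrame is positional, so the two expressions do not return the same rows. A minimal sketch on made-up data (the names `df`, `plain` and `labeled` are assumptions):

```python
import pandas as pd

# Hypothetical data with the default integer index
df = pd.DataFrame({"City": ["Ithaca", "Willingboro", "Holyoke", "Abilene"]})

plain = df[0:2]            # positional slice: rows 0 and 1
labeled = df.loc[0:2, :]   # label slice: rows 0, 1 and 2

print(len(plain), len(labeled))   # 2 3
```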
