## V7: Pandas Part2:
### Introduction to the loc and iloc indexers

#### Pandas series:
    * brics["country"] is a pandas.core.series.Series, which is a 1D labelled array.
    
    * Whereas brics[["country"]] is a pandas.core.frame.DataFrame. We can also slice the DataFrame by rows, as in brics[1:4], which returns the rows at positions 1 to 3 (the end of the slice is excluded).

#### Square brackets:
    * Column access: brics[["country", "capital"]]
    * Row access: Only through slicing. brics[1:4]
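
The two bracket forms above can be sketched as follows. This is a minimal example that assumes a small stand-in for the `brics` DataFrame; the values are illustrative and are not taken from these notes.

```python
import pandas as pd

# Hypothetical stand-in for the brics DataFrame (illustrative values only).
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# A list of column names inside the brackets returns a DataFrame.
two_columns = brics[["country", "capital"]]

# An integer slice inside the brackets selects rows by position.
# The end of the slice is excluded, so 1:4 returns positions 1, 2 and 3.
middle_rows = brics[1:4]

print(type(brics["country"]).__name__)    # Series
print(type(brics[["country"]]).__name__)  # DataFrame
print(middle_rows.index.tolist())         # ['RU', 'IN', 'CH']
```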

#### loc (label-based):
    * Row access: brics.loc[["RU", "IN", "CH"]], where "RU" is the label of a row.
    * Column access: brics.loc[:, ["country", "capital"]]
    * Row & column access:
        brics.loc[
            ["RU", "IN", "CH"],
            ["country", "capital"]
            ]
            
#### iloc (integer-position based):
1. Here we pass integer positions instead of labels.
2. Otherwise it is very similar to loc.
3. The subset arguments are integer positions.
    * brics.iloc[[1, 2]] or brics.loc[["RU", "IN"]] (assuming the rows are ordered BR, RU, IN, CH, SA)
    * brics.iloc[[1, 2, 3], [0, 1]] or brics.loc[["RU", "IN", "CH"], ["country", "capital"]]
    * brics.iloc[:, [0, 1]] or brics.loc[:, ["country", "capital"]]
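
As a rough check that loc and iloc reach the same data, the sketch below compares label-based and position-based selections. The `brics` values and the row order (BR, RU, IN, CH, SA) are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the brics DataFrame (illustrative values only).
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Rows: labels on the left, integer positions on the right.
rows_match = brics.loc[["RU", "IN"]].equals(brics.iloc[[1, 2]])

# Columns: ":" keeps every row.
cols_match = brics.loc[:, ["country", "capital"]].equals(brics.iloc[:, [0, 1]])

# Rows and columns together.
block_match = brics.loc[["RU", "IN", "CH"], ["country", "capital"]].equals(
    brics.iloc[[1, 2, 3], [0, 1]]
)

print(rows_match, cols_match, block_match)
```

The only difference between each pair is how the rows and columns are addressed; the returned data is the same.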
    
    
    

### Example 1: Square Brackets (1)
In the video, you saw that you can index and select Pandas DataFrames in many different ways. The simplest, but not the most powerful way, is to use square brackets.

In the sample code, the same cars data is imported from a CSV file as a Pandas DataFrame. To select only the cars_per_cap column from cars, you can use:

    cars['cars_per_cap']
    cars[['cars_per_cap']]
The single bracket version gives a Pandas Series, the double bracket version gives a Pandas DataFrame.
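
A minimal sketch of that difference. To keep it self-contained, `cars` is built inline from the rows printed in this section instead of being read from cars.csv, with the country codes used as the row index (an assumption about how the file is loaded).

```python
import pandas as pd

# cars rebuilt inline from the rows shown in this section (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

as_series = cars["cars_per_cap"]   # single brackets: 1D Series
as_frame = cars[["cars_per_cap"]]  # double brackets: 2D DataFrame with one column

print(type(as_series).__name__, as_series.shape)  # Series (7,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (7, 1)
```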

#### Steps:
1. Use single square brackets to print out the cars_per_cap column of cars as a Pandas Series.
2. Use double square brackets to print out the cars_per_cap column of cars as a Pandas DataFrame.
3. Use double square brackets to print out a DataFrame with both the cars_per_cap and drives_right columns of cars, in this order.



In [1]:
# 1. Use single square brackets to print out the cars_per_cap column of cars as a Pandas Series.

import pandas as pd

cars_df = pd.read_csv('cars.csv')
print('Car DF:\n', cars_df)

print('\nUsing single brackets to print a Pandas Series:\n', cars_df['cars_per_cap'])

# 2. Use double square brackets to print out the cars_per_cap column of cars as a Pandas DataFrame.

print('\nUsing double square brackets:\n', cars_df[['cars_per_cap']])

# 3. Use double square brackets to print out a DataFrame with both the cars_per_cap and drives_right columns of cars.

print('\nUsing double square brackets:\n', cars_df[['cars_per_cap','drives_right']])

Car DF:
   Unnamed: 0  cars_per_cap        country  drives_right
0         US           809  United States          True
1        AUS           731      Australia         False
2        JPN           588          Japan         False
3         IN            18          India         False
4         RU           200         Russia          True
5        MOR            70        Morocco          True
6         EG            45          Egypt          True

Using single brackets to print a Pandas Series:
 0    809
1    731
2    588
3     18
4    200
5     70
6     45
Name: cars_per_cap, dtype: int64

Using double square brackets:
    cars_per_cap
0           809
1           731
2           588
3            18
4           200
5            70
6            45

Using double square brackets:
    cars_per_cap  drives_right
0           809          True
1           731         False
2           588         False
3            18         False
4           200          True
5            70          True
6            45          True

## Example 2: Square Brackets (2)
Square brackets can do more than just selecting columns. You can also use them to get rows, or observations, from a DataFrame. The following call selects the first five rows from the cars DataFrame:

    cars[0:5]

The result is another DataFrame containing only the rows you specified.

Pay attention: You can only select rows using square brackets if you specify a slice, like 0:4. Also, you're using the integer indexes of the rows here, not the row labels!
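
A short sketch of this behaviour, assuming `cars` is built inline from the rows shown in these notes with the country codes as row labels. It shows that an integer slice works on positions, while a single row label inside square brackets is looked up as a column name and fails.

```python
import pandas as pd

# cars rebuilt inline from the rows shown in these notes (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# An integer slice selects rows by position; the end is excluded.
first_three = cars[0:3]
print(first_three.index.tolist())  # ['US', 'AUS', 'JPN']

# A single value in square brackets is treated as a column name,
# so passing a row label raises a KeyError.
row_lookup_failed = False
try:
    cars["RU"]
except KeyError:
    row_lookup_failed = True
print("cars['RU'] raised KeyError:", row_lookup_failed)
```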

### Steps: 
1. Select the first 3 observations from cars and print them out.
2. Select the fourth, fifth and sixth observations, corresponding to row indexes 3, 4 and 5, and print them out.

In [2]:
# Selecting the first 3 observations from cars.

print('\nFirst 3 observations/rows from dataframe:\n', cars_df[0:3])

# Selecting the 4th, 5th and 6th observations.

print('\n\nSelecting 4th, 5th and 6th observations:\n', cars_df[3:6])


First 3 observations/rows from dataframe:
   Unnamed: 0  cars_per_cap        country  drives_right
0         US           809  United States          True
1        AUS           731      Australia         False
2        JPN           588          Japan         False


Selecting 4th, 5th and 6th observations:
   Unnamed: 0  cars_per_cap  country  drives_right
3         IN            18    India         False
4         RU           200   Russia          True
5        MOR            70  Morocco          True


## Example 3: loc and iloc (1)
With loc and iloc you can do practically any data selection operation on DataFrames you can think of. loc is label-based, which means that you have to specify rows and columns based on their row and column labels. iloc is integer index based, so you have to specify rows and columns by their integer index like you did in the previous exercise.

Try out the following commands in the IPython Shell to experiment with loc and iloc to select observations. Each pair of commands here gives the same result.

    cars.loc['RU']
    cars.iloc[4]

    cars.loc[['RU']]
    cars.iloc[[4]]

    cars.loc[['RU', 'AUS']]
    cars.iloc[[4, 1]]
    
As before, code is included that imports the cars data as a Pandas DataFrame.
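
The sketch below illustrates how the shape of the argument decides the return type: a single label or position gives a Series, a list gives a DataFrame. It assumes `cars` is built inline from the rows shown in these notes, so that it runs without cars.csv.

```python
import pandas as pd

# cars rebuilt inline from the rows shown in these notes (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

one_row = cars.loc["RU"]          # single label: the row comes back as a Series
one_row_frame = cars.loc[["RU"]]  # list with one label: a one-row DataFrame
two_rows = cars.loc[["RU", "AUS"]]  # rows come back in the order requested

print(type(one_row).__name__)        # Series
print(type(one_row_frame).__name__)  # DataFrame
print(two_rows.index.tolist())       # ['RU', 'AUS']
```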

### Steps:
1. Use loc or iloc to select the observation corresponding to Japan as a Series. The label of this row is JPN, the index is 2. Make sure to print the resulting Series.
2. Use loc or iloc to select the observations for Australia and Egypt as a DataFrame. You can find out about the labels/indexes of these rows by inspecting cars in the IPython Shell. Make sure to print the resulting DataFrame.
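
One possible way to approach step 2 without reading the positions off the printed table is to ask the index for them. This is a sketch, assuming `cars` is built inline from the rows shown in these notes; `Index.get_loc` and `Index.get_indexer` translate labels into integer positions.

```python
import pandas as pd

# cars rebuilt inline from the rows shown in these notes (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# Look up integer positions from row labels.
japan_position = cars.index.get_loc("JPN")
positions = cars.index.get_indexer(["AUS", "EG"])
print(japan_position, positions.tolist())  # 2 [1, 6]

# Australia and Egypt as a DataFrame, by label and by position.
by_label = cars.loc[["AUS", "EG"]]
by_position = cars.iloc[positions]
print(by_label)
```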

In [34]:
import pandas as pd

cars = pd.read_csv('cars.csv', index_col=0)

print('\n\n\n\nCar Details:\n', cars)

print('\n\n\n\nCountry Column as DataFrame:\n',type(cars[['country']]), '\n', cars[['country']] )

print('\n\n\n\nCountry column as series:\n',type(cars['country']), '\n', cars['country'] )


# Using loc[] with a single label: the output is a Series
print('\n\n\n\nJapan result using LOC as series:\n',type(cars.loc['JPN']), '\n', cars.loc['JPN'] )

# Using iloc[] with a single integer position: the output is a Series
print('\n\n\n\nJapan result using iLOC as series:\n',type(cars.iloc[2]), '\n', cars.iloc[2] )

# Using loc[[]] with a list of labels: the output is a DataFrame
print('\n\n\n\nJapan result using loc[[]] as DataFrame:\n',type(cars.loc[['JPN']]), '\n\n', cars.loc[['JPN']] )


# Using iloc[[]] with a list of integer positions: the output is a DataFrame
print('\n\n\n\nJapan result using iloc[[]] as DataFrame:\n',type(cars.iloc[[2]]), '\n\n', cars.iloc[[2]] )





Car Details:
      cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True




Country Column as DataFrame:
 <class 'pandas.core.frame.DataFrame'> 
            country
US   United States
AUS      Australia
JPN          Japan
IN           India
RU          Russia
MOR        Morocco
EG           Egypt




Country column as series:
 <class 'pandas.core.series.Series'> 
 US     United States
AUS        Australia
JPN            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object




Japan result using LOC as series:
 <class 'pandas.core.series.Series'> 
 cars_per_cap      588
country         Japan
drives_right    False
Name: JPN, dtype: object

## Example 4: loc and iloc (2)
loc and iloc also allow you to select both rows and columns from a DataFrame. To experiment, try out the following commands in the IPython Shell. Again, paired commands produce the same result.

    cars.loc['IN', 'cars_per_cap']
    cars.iloc[3, 0]

    cars.loc[['IN', 'RU'], 'cars_per_cap']
    cars.iloc[[3, 4], 0]

    cars.loc[['IN', 'RU'], ['cars_per_cap', 'country']]
    cars.iloc[[3, 4], [0, 1]]
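
A minimal sketch of the three result types these commands produce: a scalar, a Series, and a DataFrame. It assumes `cars` is built inline from the rows shown in these notes.

```python
import pandas as pd

# cars rebuilt inline from the rows shown in these notes (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# One row label and one column label: a single value.
value = cars.loc["IN", "cars_per_cap"]

# A list of rows and one column label: a Series.
column_slice = cars.loc[["IN", "RU"], "cars_per_cap"]

# A list of rows and a list of columns: a DataFrame.
sub_frame = cars.loc[["IN", "RU"], ["cars_per_cap", "country"]]

print(value)                         # 18
print(type(column_slice).__name__)   # Series
print(type(sub_frame).__name__)      # DataFrame
```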
    
### Steps: 
1. Print out the drives_right value of the row corresponding to Morocco (its row label is MOR)
2. Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns country and drives_right.



In [49]:
print('Cars: ',cars)

# Print out the drives_right value of the row corresponding to Morocco (its row label is MOR)
print('\n\n\n',type(cars.loc['MOR','drives_right'] ), '\n', cars.loc['MOR','drives_right'])

# 2. Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns country and drives_right.
print('\n\n\n',cars.loc[['RU', 'MOR'],['country', 'drives_right']])

Cars:       cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True



 <class 'numpy.bool_'> 
 True



      country  drives_right
RU    Russia          True
MOR  Morocco          True


## Example 5: loc and iloc (3)
It's also possible to select only columns with loc and iloc. In both cases, you simply put a slice going from beginning to end in front of the comma:

    cars.loc[:, 'country']
    cars.iloc[:, 1]

    cars.loc[:, ['country','drives_right']]
    cars.iloc[:, [1, 2]]
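
The column-only selections can be sketched as below. As an assumption for illustration, `cars` is built inline from the rows shown in these notes, and `columns.get_loc` is used to find a column's integer position instead of counting it by hand.

```python
import pandas as pd

# cars rebuilt inline from the rows shown in these notes (no CSV needed).
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# Integer position of a column, looked up from its label.
col_pos = cars.columns.get_loc("drives_right")
print(col_pos)  # 2

# ":" before the comma keeps every row.
as_series = cars.iloc[:, col_pos]   # single position: Series
as_frame = cars.iloc[:, [col_pos]]  # list of positions: DataFrame
both = cars.loc[:, ["cars_per_cap", "drives_right"]]

print(as_frame.shape, both.columns.tolist())
```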
    
### Steps:
1. Print out the drives_right column as a Series using loc or iloc.
2. Print out the drives_right column as a DataFrame using loc or iloc.
3. Print out both the cars_per_cap and drives_right columns as a DataFrame using loc or iloc.


In [69]:
print(cars,'\n\n')
# 1. Print out the drives_right column as a Series using loc or iloc.
print('\n\nSeries Using loc[]:\n',type(cars.loc[:,'drives_right']), '\n\n',cars.loc[:,'drives_right'])
print('\n\nSeries Using iloc[]:\n',type(cars.iloc[:,2]), '\n\n',cars.iloc[:,2])

# 2. Print out the drives_right column as a DataFrame using loc or iloc.
print('\n\nDataFrame Using loc[]:\n',type(cars.loc[:,['drives_right']]), '\n\n', cars.loc[:,['drives_right']])
print('\n\nDataFrame Using iloc[]:\n',type(cars.iloc[:,[2]]), '\n\n',cars.iloc[:,[2]])

# 3. Print out both the cars_per_cap and drives_right columns as a DataFrame using loc or iloc.
print('\n\nDataFrame with both the cars_per_cap and drives_right columns using iloc[]:\n',
      type(cars.iloc[:,[0,2]]), '\n\n',cars.iloc[:,[0,2]]
     )


     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JPN           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True 




Series Using loc[]:
 <class 'pandas.core.series.Series'> 

 US      True
AUS    False
JPN    False
IN     False
RU      True
MOR     True
EG      True
Name: drives_right, dtype: bool


Series Using iloc[]:
 <class 'pandas.core.series.Series'> 

 US      True
AUS    False
JPN    False
IN     False
RU      True
MOR     True
EG      True
Name: drives_right, dtype: bool


DataFrame Using loc[]:
 <class 'pandas.core.frame.DataFrame'> 

      drives_right
US           True
AUS         False
JPN         False
IN          False
RU           True
MOR          True
EG           True


DataFrame Using iloc[]:
 <class 'pand