# Pandas

The previous section covered NumPy, which is good at math operations on n-dimensional arrays of numbers. Its major drawback is that an array has a single data type, so it handles heterogeneous values poorly. Pandas DataFrames help here: they can store a different data type in each column, and values can be referenced by label, like keys in a Python dict, rather than only by integer index.

[Link to Official Documentation](http://pandas.pydata.org/pandas-docs/version/0.23/dsintro.html)
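
As a small sketch of the contrast described above (the sample values and column names are made up for illustration): a NumPy array built from mixed values coerces everything to one dtype, while a DataFrame keeps the columns separate and lets you look them up by label.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single dtype, so mixed values are coerced to strings.
mixed = np.array([1, "two", 3.0])
print(mixed.dtype.kind)  # 'U' means a Unicode string dtype

# A DataFrame stores each column separately (illustrative values).
df = pd.DataFrame({"name": ["a", "b"], "count": [1, 2], "ratio": [0.5, 1.5]})

# Values are addressed by column label, much like dictionary keys.
print(df["count"].tolist())  # [1, 2]
print(df["ratio"].tolist())  # [0.5, 1.5]
```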

## Series

A pandas Series is much like a one-dimensional NumPy ndarray, with the added ability to reference values by custom labels, like *keys* in a Python *dictionary*.

In [1]:
import numpy as np
import pandas as pd

In [10]:
#Example

series2 = pd.Series(data = [1,2,3], index = ['key1', 'key2', 'key3'])
series2

key1    1
key2    2
key3    3
dtype: int64
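
The sketch below shows the two ways of reaching into the example Series: by label, as with a dictionary, and by position, as with an array. The variable names `by_label`, `by_position` and `doubled` are introduced here only for illustration.

```python
import pandas as pd

series2 = pd.Series(data=[1, 2, 3], index=["key1", "key2", "key3"])

# Label-based lookup, like a dictionary key.
by_label = series2["key2"]  # equivalent to series2.loc["key2"]

# Position-based lookup, like a NumPy array.
by_position = series2.iloc[1]

# Arithmetic is vectorised and keeps the labels.
doubled = series2 * 2

print(by_label, by_position)  # 2 2
print(doubled["key3"])        # 6
```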

### Question 1

Convert a given dict to a pandas Series.

[**Hint:** Use **.Series**]

In [1]:
import pandas as pd
mydict = {'speed': [1,2,3,4,6]}
ser1 = pd.Series(data=mydict)
print(ser1)

speed    [1, 2, 3, 4, 6]
dtype: object
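
Note that the result above is a Series with a single element (the whole list) of dtype `object`, because each dict key becomes one label. If one element per number is wanted instead, two possible approaches are sketched below; the names `speed` and `scalars` are illustrative.

```python
import pandas as pd

mydict = {"speed": [1, 2, 3, 4, 6]}

# Option 1: pass the list itself and use the key as the Series name.
speed = pd.Series(mydict["speed"], name="speed")

# Option 2: a dict of scalars maps each key to one labelled element.
scalars = pd.Series({"a": 1, "b": 2, "c": 3})

print(len(speed))    # 5
print(scalars["b"])  # 2
```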


## Dataframes

A dataframe is a table with labeled columns which can hold different types of data in each column. 

In [2]:
# Example
d1 = {'a': [1,2,3], 'b': [3,4,5], 'c':[6,7,8] }
df1 = pd.DataFrame(d1)
df1

   a  b  c
0  1  3  6
1  2  4  7
2  3  5  8
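
As a brief illustration of what "labeled columns" means in practice, the sketch below selects columns of the example frame by label. A single label gives a Series, and a list of labels gives a smaller DataFrame. The names `col_a` and `sub` are illustrative.

```python
import pandas as pd

d1 = {"a": [1, 2, 3], "b": [3, 4, 5], "c": [6, 7, 8]}
df1 = pd.DataFrame(d1)

# A single label returns that column as a Series.
col_a = df1["a"]

# A list of labels returns a smaller DataFrame.
sub = df1[["a", "c"]]

print(type(col_a).__name__, type(sub).__name__)  # Series DataFrame
print(sub.shape)                                 # (3, 2)
```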


### Question 3

Select the second row of the dataframe df1 above.



In [4]:
print(df1.iloc[1])

a    2
b    4
c    7
Name: 1, dtype: int64


### Question 4

Select column c in the second row of df1.

[**Hint:** To select by label, use **df.loc[row, column]**. To select by integer position, use **df.iloc[row, column]**. Older tutorials mix labels and positions with **df.ix[row, column]**, but `.ix` was deprecated in pandas 0.20 and removed in pandas 1.0.]



In [10]:
df1['c'][1]

7
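
The chained lookup `df1['c'][1]` works for reading a value. The pandas documentation generally recommends a single indexer call instead, which also avoids ambiguity when assigning. A sketch of the equivalent single-call forms:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5], "c": [6, 7, 8]})

# Label-based: row label 1, column label 'c'.
by_label = df1.loc[1, "c"]

# Position-based: second row, third column.
by_position = df1.iloc[1, 2]

# Scalar access by label.
scalar = df1.at[1, "c"]

print(by_label, by_position, scalar)  # 7 7 7
```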

## Using Dataframes on a dataset

##### Using the mtcars dataset.

The questions below use the cars data from [Motor Trend Car Road Tests](http://stat.ethz.ch/R-manual/R-devel/library/datasets/html/mtcars.html)

The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). 


Details:

A data frame with 32 observations on 11 (numeric) variables.

| Index | Name | Description |
|-------|------|-------------|
| [, 1] | mpg | Miles/(US) gallon |
| [, 2] | cyl | Number of cylinders |
| [, 3] | disp | Displacement (cu.in.) |
| [, 4] | hp | Gross horsepower |
| [, 5] | drat | Rear axle ratio |
| [, 6] | wt | Weight (1000 lbs) |
| [, 7] | qsec | 1/4 mile time |
| [, 8] | vs | Engine (0 = V-shaped, 1 = straight) |
| [, 9] | am | Transmission (0 = automatic, 1 = manual) |
| [,10] | gear | Number of forward gears |
| [,11] | carb | Number of carburetors |

In [13]:
## Reading a dataset from a csv file using pandas.
mtcars = pd.read_csv('mtcars.csv')
mtcars.index = mtcars['model']
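
For readers without `mtcars.csv` at hand, the sketch below reads the same column layout from an in-memory string holding three rows of the dataset. It uses `set_index('model', drop=False)`, which has the same effect as the index assignment above (the `model` column is kept). The names `csv_text` and `cars` are illustrative, not part of the original notebook.

```python
import io
import pandas as pd

# Three rows of the mtcars data, inlined so the sketch runs without the file.
csv_text = """model,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
Mazda RX4,21.0,6,160.0,110,3.9,2.62,16.46,0,1,4,4
Datsun 710,22.8,4,108.0,93,3.85,2.32,18.61,1,1,4,1
Valiant,18.1,6,225.0,105,2.76,3.46,20.22,1,0,3,1
"""

# read_csv accepts a path or any file-like object.
cars = pd.read_csv(io.StringIO(csv_text))

# Use the car names as row labels; drop=False keeps 'model' as a column too.
cars = cars.set_index("model", drop=False)

print(cars.shape)                  # (3, 12)
print(cars.loc["Valiant", "mpg"])  # 18.1
```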


The following questions analyse this dataset using DataFrames.

### Question 5

Check the type and dimensions of the given dataset, mtcars.

[**Hint:** Use **type()** and **df.shape**]

In [14]:
print(mtcars.shape)

(32, 12)


In [15]:
print(type(mtcars))

<class 'pandas.core.frame.DataFrame'>
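
The shape above reports 12 columns because the text column `model` is counted alongside the 11 numeric variables. The sketch below, on a small stand-in frame built from two rows of the data, shows how `shape` unpacks and one way to count only the numeric columns.

```python
import pandas as pd

# Small stand-in frame; values are two rows of the mtcars data.
df = pd.DataFrame(
    {"model": ["Mazda RX4", "Valiant"], "mpg": [21.0, 18.1], "cyl": [6, 6]}
)

n_rows, n_cols = df.shape            # shape is a (rows, columns) tuple
print(n_rows, n_cols)                # 2 3
print(isinstance(df, pd.DataFrame))  # True

# Count only the numeric columns, ignoring the text 'model' column.
numeric_cols = df.select_dtypes(include="number").columns
print(len(numeric_cols))             # 2
```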


### Question 6

Check the first 10 lines and last 10 lines of the given dataset, mtcars.

[**Hint:** Use **.head()** and **.tail()**]

In [17]:
print(mtcars.head(10))
print('')
print(mtcars.tail(10))

                               model   mpg  cyl   disp   hp  drat     wt  \
model                                                                      
Mazda RX4                  Mazda RX4  21.0    6  160.0  110  3.90  2.620   
Mazda RX4 Wag          Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875   
Datsun 710                Datsun 710  22.8    4  108.0   93  3.85  2.320   
Hornet 4 Drive        Hornet 4 Drive  21.4    6  258.0  110  3.08  3.215   
Hornet Sportabout  Hornet Sportabout  18.7    8  360.0  175  3.15  3.440   
Valiant                      Valiant  18.1    6  225.0  105  2.76  3.460   
Duster 360                Duster 360  14.3    8  360.0  245  3.21  3.570   
Merc 240D                  Merc 240D  24.4    4  146.7   62  3.69  3.190   
Merc 230                    Merc 230  22.8    4  140.8   95  3.92  3.150   
Merc 280                    Merc 280  19.2    6  167.6  123  3.92  3.440   

                    qsec  vs  am  gear  carb  
model                                   
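
For reference, `head()` and `tail()` return five rows when called without an argument. A minimal sketch on a stand-in frame (the column name `n` is illustrative):

```python
import pandas as pd

# Stand-in frame with 12 rows so the row counts are easy to check.
df = pd.DataFrame({"n": range(12)})

print(len(df.head()))            # 5, the default
print(len(df.head(10)))          # 10
print(df.tail(3)["n"].tolist())  # [9, 10, 11]
```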

### Question 7

Print all the column labels in the given dataset - mtcars.

[**Hint:** Use **df.columns**]

In [19]:
print(mtcars.columns)

Index(['model', 'mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec', 'vs', 'am',
       'gear', 'carb'],
      dtype='object')
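
`df.columns` is an `Index` object rather than a plain list. The sketch below, on a one-row stand-in frame, shows converting it to a list and testing for a column name.

```python
import pandas as pd

df = pd.DataFrame({"model": ["Mazda RX4"], "mpg": [21.0], "cyl": [6]})

# Convert the Index to a plain Python list when one is needed.
labels = df.columns.tolist()
print(labels)               # ['model', 'mpg', 'cyl']

# Membership tests work directly on the Index.
print("mpg" in df.columns)  # True
print("hp" in df.columns)   # False
```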


### Question 8

Select the first 6 rows and first 3 columns of the mtcars dataframe.

**Hint:**  
`mtcars.iloc[:, :]` gives all rows and columns in the dataset.

In [30]:
print(mtcars.iloc[0:6,0:3])

                               model   mpg  cyl
model                                          
Mazda RX4                  Mazda RX4  21.0    6
Mazda RX4 Wag          Mazda RX4 Wag  21.0    6
Datsun 710                Datsun 710  22.8    4
Hornet 4 Drive        Hornet 4 Drive  21.4    6
Hornet Sportabout  Hornet Sportabout  18.7    8
Valiant                      Valiant  18.1    6
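
Positional slices with `iloc` are half-open, as in plain Python: the stop position is excluded, so a slice such as `0:6` returns six rows. A minimal sketch on a stand-in frame with illustrative column names:

```python
import pandas as pd

# Stand-in frame: 8 rows, 4 columns of illustrative integers.
df = pd.DataFrame({c: range(8) for c in ["w", "x", "y", "z"]})

first_six = df.iloc[0:6, 0:3]  # rows 0..5, columns 0..2
print(first_six.shape)         # (6, 3)

# Omitting the start is equivalent.
same = df.iloc[:6, :3]
print(first_six.equals(same))  # True
```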


### Question 9

Select rows from name **Mazda RX4** to **Valiant** in the mtcars dataset and display only mpg and cyl values of those cars. 

**Hint:** Use **df.loc[rows, columns]**

In [34]:
mtcars.index

Index(['Mazda RX4', 'Mazda RX4 Wag', 'Datsun 710', 'Hornet 4 Drive',
       'Hornet Sportabout', 'Valiant', 'Duster 360', 'Merc 240D', 'Merc 230',
       'Merc 280', 'Merc 280C', 'Merc 450SE', 'Merc 450SL', 'Merc 450SLC',
       'Cadillac Fleetwood', 'Lincoln Continental', 'Chrysler Imperial',
       'Fiat 128', 'Honda Civic', 'Toyota Corolla', 'Toyota Corona',
       'Dodge Challenger', 'AMC Javelin', 'Camaro Z28', 'Pontiac Firebird',
       'Fiat X1-9', 'Porsche 914-2', 'Lotus Europa', 'Ford Pantera L',
       'Ferrari Dino', 'Maserati Bora', 'Volvo 142E'],
      dtype='object', name='model')

In [38]:
mtcars.loc['Mazda RX4':'Valiant', ['mpg', 'cyl']]

                    mpg  cyl
model                       
Mazda RX4          21.0    6
Mazda RX4 Wag      21.0    6
Datsun 710         22.8    4
Hornet 4 Drive     21.4    6
Hornet Sportabout  18.7    8
Valiant            18.1    6
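
Unlike positional slices, label slices with `loc` include both endpoints, and a list of column labels restricts the result to those columns. The sketch below uses a small stand-in frame built from four rows of the data; the names `cars` and `subset` are illustrative.

```python
import pandas as pd

# Stand-in frame indexed by car name; values are rows of the mtcars data.
cars = pd.DataFrame(
    {"mpg": [21.0, 21.0, 22.8, 21.4], "cyl": [6, 6, 4, 6], "hp": [110, 110, 93, 110]},
    index=["Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive"],
)

# Label slices include BOTH endpoints, unlike iloc's half-open slices.
subset = cars.loc["Mazda RX4":"Datsun 710", ["mpg", "cyl"]]
print(subset.shape)           # (3, 2)
print(subset.index.tolist())  # ['Mazda RX4', 'Mazda RX4 Wag', 'Datsun 710']
```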
