_This introduction to pandas is derived from Data School's pandas Q&A, with my own notes and code added on top of what was provided._

## Reading a subset of columns or rows, iterating through a Series or DataFrame, dropping all non-numeric columns, and passing arguments

### 1. Reading a subset of columns or rows

In [1]:
import pandas as pd

In [2]:
link = 'http://bit.ly/uforeports'
ufo = pd.read_csv(link)

In [3]:
ufo.columns

Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time'], dtype='object')

In [4]:
# select columns by name (strings)
cols = ['City', 'State']

ufo = pd.read_csv(link, usecols=cols)

In [5]:
ufo.head()

                   City  State
0                Ithaca     NY
1           Willingboro     NJ
2               Holyoke     CO
3               Abilene     KS
4  New York Worlds Fair     NY


In [6]:
# select columns by position (integers)
cols2 = [0, 4]

ufo = pd.read_csv(link, usecols=cols2)

In [7]:
ufo.head()

                   City             Time
0                Ithaca   6/1/1930 22:00
1           Willingboro  6/30/1930 20:00
2               Holyoke  2/15/1931 14:00
3               Abilene   6/1/1931 13:00
4  New York Worlds Fair  4/18/1933 19:00


In [8]:
# read only the first n rows of the file
ufo = pd.read_csv(link, nrows=3)

In [9]:
ufo

          City  Colors Reported  Shape Reported  State             Time
0       Ithaca              NaN        TRIANGLE     NY   6/1/1930 22:00
1  Willingboro              NaN           OTHER     NJ  6/30/1930 20:00
2      Holyoke              NaN            OVAL     CO  2/15/1931 14:00
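
`usecols` and `nrows` can be combined in a single `read_csv` call. The sketch below is a minimal illustration that avoids the network: `csv_text` is a hypothetical inline copy of the three rows shown above, read through `io.StringIO` instead of the bit.ly link.

```python
import io
import pandas as pd

# Hypothetical inline sample: the three rows shown above, written as CSV text.
csv_text = """City,Colors Reported,Shape Reported,State,Time
Ithaca,,TRIANGLE,NY,6/1/1930 22:00
Willingboro,,OTHER,NJ,6/30/1930 20:00
Holyoke,,OVAL,CO,2/15/1931 14:00
"""

# Select two columns by name and read only the first two rows.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=['City', 'State'], nrows=2)

# Select the same two columns by position (City is 0, State is 3).
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 3], nrows=2)

print(by_name)
print(by_name.equals(by_position))
```

Both calls should produce the same two-column, two-row DataFrame, since names and positions refer to the same columns of the file.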


### 2. Iterating through a Series or DataFrame

In [11]:
# intuitive method: loop over the Series directly
for c in ufo.City:
    print(c)

Ithaca
Willingboro
Holyoke


In [12]:
# pandas method: iterrows() yields an (index, row) pair for each row,
# where row is a Series holding that row's values
for index, row in ufo.iterrows():
    print(index, row.City, row.State)

0 Ithaca NY
1 Willingboro NJ
2 Holyoke CO
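
As an alternative that is not part of the original Q&A, `DataFrame.itertuples()` yields one namedtuple per row, with the index available as `row.Index`. The sketch below assumes a small hypothetical frame, `ufo_sample`, built from the three rows above; the pandas documentation describes `itertuples()` as generally faster than `iterrows()`.

```python
import pandas as pd

# Hypothetical stand-in for the 3-row ufo DataFrame used above.
ufo_sample = pd.DataFrame({
    'City': ['Ithaca', 'Willingboro', 'Holyoke'],
    'State': ['NY', 'NJ', 'CO'],
})

rows = []
for row in ufo_sample.itertuples():
    # Each row is a namedtuple: row.Index is the index label,
    # and each column is an attribute.
    rows.append((row.Index, row.City, row.State))
    print(row.Index, row.City, row.State)
```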


### 3. Dropping non-numeric columns from a DataFrame

In [13]:
link = 'http://bit.ly/drinksbycountry'
drinks = pd.read_csv(link)

In [14]:
# two columns (country and continent) are non-numeric: their dtype is object
drinks.dtypes

country                          object
beer_servings                     int64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
continent                        object
dtype: object

In [17]:
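import numpy as np
# select_dtypes returns a new DataFrame with only the numeric columns;
# drinks itself is left unchanged
drinks.select_dtypes(include=[np.number]).dtypes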

beer_servings                     int64
spirit_servings                   int64
wine_servings                     int64
total_litres_of_pure_alcohol    float64
dtype: object
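
`select_dtypes` also accepts an `exclude` argument, so the same call can be turned around to keep only the non-numeric columns. The sketch below is a minimal illustration on a hypothetical two-row frame, `drinks_sample`, whose values are made up; only the column names come from the dataset above.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row sample with the same kinds of columns as drinks.
drinks_sample = pd.DataFrame({
    'country': ['Country A', 'Country B'],
    'beer_servings': [0, 89],
    'total_litres_of_pure_alcohol': [0.0, 4.9],
    'continent': ['Asia', 'Europe'],
})

# Keep only numeric columns; the string 'number' is equivalent to np.number here.
numeric_only = drinks_sample.select_dtypes(include='number')

# Keep everything that is NOT numeric.
non_numeric = drinks_sample.select_dtypes(exclude=[np.number])

print(list(numeric_only.columns))
print(list(non_numeric.columns))
```

To actually drop the columns, assign the result back, for example `drinks_sample = drinks_sample.select_dtypes(include='number')`.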

### 4. Passing arguments: when to use a list or a string

In [19]:
drinks.describe(include='all')

        country  beer_servings  spirit_servings  wine_servings  total_litres_of_pure_alcohol  continent
count       193          193.0            193.0          193.0                         193.0        193
unique      193            NaN              NaN            NaN                           NaN          6
top     Bahrain            NaN              NaN            NaN                           NaN     Africa
freq          1            NaN              NaN            NaN                           NaN         53
mean        NaN     106.160622        80.994819      49.450777                      4.717098        NaN
std         NaN     101.143103        88.284312      79.697598                      3.773298        NaN
min         NaN            0.0              0.0            0.0                           0.0        NaN
25%         NaN           20.0              4.0            1.0                           1.3        NaN
50%         NaN           76.0             56.0            8.0                           4.2        NaN
75%         NaN          188.0            128.0           59.0                           7.2        NaN


In [21]:
# here include takes a list of dtypes instead of the string 'all'
# in Jupyter, press Shift+Tab inside the parentheses to see which arguments a method accepts
list_include = ['object', 'float64']
drinks.describe(include=list_include)

        country  total_litres_of_pure_alcohol  continent
count       193                         193.0        193
unique      193                           NaN          6
top     Bahrain                           NaN     Africa
freq          1                           NaN         53
mean        NaN                      4.717098        NaN
std         NaN                      3.773298        NaN
min         NaN                           0.0        NaN
25%         NaN                           1.3        NaN
50%         NaN                           4.2        NaN
75%         NaN                           7.2        NaN
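
The two calls above can be compared side by side: `'all'` is a special string value, while specific dtypes go in a list. The sketch below is a minimal illustration on a hypothetical two-row frame, `drinks_sample`, with made-up values; it checks only which columns each call summarises.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row sample with the same kinds of columns as drinks.
drinks_sample = pd.DataFrame({
    'country': ['Country A', 'Country B'],
    'beer_servings': [0, 89],
    'total_litres_of_pure_alcohol': [0.0, 4.9],
    'continent': ['Asia', 'Europe'],
})

# The string 'all' summarises every column, numeric or not.
all_cols = drinks_sample.describe(include='all')

# A list restricts the summary to the listed dtypes.
floats_only = drinks_sample.describe(include=['float64'])
numeric = drinks_sample.describe(include=[np.number])

print(list(all_cols.columns))
print(list(floats_only.columns))
print(list(numeric.columns))
```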
