# The Top 5 Machine Learning Libraries in Python

Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.

# Lesson 1: Intro to pandas data structures

The first of our top five libraries is pandas, an open source Python library for data analysis. It adds labeled, table-like data structures to Python, along with tools for loading, cleaning, and summarizing data.

In [1]:
# Let's bring in pandas
import pandas as pd

The cell above imports the entire library under the conventional alias pd.

If you only need a few names, you can use the from keyword to import just those, as in the cell below.

Interestingly, most tutorials import all of pandas as pd, and the rest of this lesson does the same.

In [2]:
from pandas import DataFrame

At the very basic level, Pandas objects can be thought of as enhanced versions of NumPy structured arrays in which the rows and columns are identified with labels rather than simple integer indices.
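To make the idea of labels versus integer indices concrete, here is a minimal sketch. The names and values are made up for illustration and do not come from any dataset in this lesson.

```python
import numpy as np
import pandas as pd

# A plain NumPy array is addressed by integer position only.
heights = np.array([1.62, 1.80, 1.75])
second_by_position = heights[1]

# The same values in a Series carry labels (hypothetical names).
labeled = pd.Series(heights, index=["ana", "ben", "cleo"])
second_by_label = labeled["ben"]

# Positional access is still available through .iloc.
second_by_iloc = labeled.iloc[1]

print(second_by_position, second_by_label, second_by_iloc)
```

All three lookups return the same element. The label-based one is usually easier to read and keeps working if the rows are reordered.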

There are three core Pandas data structures. They are:

Series - A pandas Series is a one-dimensional array of indexed data.

DataFrame - The DataFrame can be thought of either as a generalization of a NumPy array, or as a specialization of a Python dictionary.

Index - The Index object is an interesting structure in itself, and it can be thought of as an ordered set.
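The three structures above can be sketched together in a few lines. The numbers below are invented placeholders, and the variable names are only for illustration.

```python
import pandas as pd

# Series: one-dimensional values plus an index of labels.
population = pd.Series({"CA": 39, "TX": 29, "NY": 20})  # made-up values
area = pd.Series({"CA": 424, "TX": 696, "NY": 141})     # made-up values

# DataFrame: here built like a dictionary mapping column names to Series.
states = pd.DataFrame({"population": population, "area": area})

# Index: the row labels, which support set-like operations.
idx = states.index
common = idx.intersection(pd.Index(["NY", "TX", "FL"]))

print(states)
print(list(common))
```

Building the DataFrame from a dictionary of Series shows the "specialized dictionary" view: each key becomes a column, and the Series are aligned on their shared index.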

In [3]:
# Let's create a Series.
# A Series is a one-dimensional, array-like object.

s = pd.Series([3,5,5,9,6])
s

0    3
1    5
2    5
3    9
4    6
dtype: int64

In [4]:
s.head()

0    3
1    5
2    5
3    9
4    6
dtype: int64

Methods are called with parentheses, while attributes are accessed without them.

So head() in the example above is a method.
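As a small illustration of the difference, the sketch below reuses the same values as the Series created earlier and compares a method call with two attribute lookups.

```python
import pandas as pd

s = pd.Series([3, 5, 5, 9, 6])

# head is a method: it is called with parentheses and can take arguments.
first_two = s.head(2)

# shape and dtype are attributes: they are read without parentheses.
n_rows = s.shape
kind = s.dtype

# callable() tells the two apart.
print(callable(s.head), callable(s.shape))
```

Leaving the parentheses off a method, as in `s.head`, returns the method object itself rather than running it, which is a common source of confusing output.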

Let's read some tabular data into our workspace using pandas.

When you hear the phrase tabular data, think of an Excel spreadsheet: rows and named columns.

We will read the data into a DataFrame and then manipulate it.

In [5]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')

In [6]:
ufo.head(10)

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |
| 5 | Valley City | NaN | DISK | ND | 9/15/1934 15:30 |
| 6 | Crater Lake | NaN | CIRCLE | CA | 6/15/1935 0:00 |
| 7 | Alma | NaN | DISK | MI | 7/15/1936 0:00 |
| 8 | Eklutna | NaN | CIGAR | AK | 10/15/1936 17:00 |
| 9 | Hubbard | NaN | CYLINDER | OR | 6/15/1937 0:00 |
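The cell above calls pd.read_table with sep=','. For comma-separated data, pd.read_csv is the more common spelling, since a comma is its default separator. The sketch below checks that the two give the same result on a three-row sample copied from the output above. It reads from an in-memory string (an assumption made so the example runs without a network connection) rather than from the URL.

```python
import io
import pandas as pd

# Three rows copied from the UFO output above, as CSV text.
csv_text = """City,Colors Reported,Shape Reported,State,Time
Ithaca,,TRIANGLE,NY,6/1/1930 22:00
Willingboro,,OTHER,NJ,6/30/1930 20:00
Holyoke,,OVAL,CO,2/15/1931 14:00
"""

# read_csv splits on commas by default.
sample = pd.read_csv(io.StringIO(csv_text))

# read_table defaults to tabs, so the separator must be given explicitly.
same = pd.read_table(io.StringIO(csv_text), sep=",")

print(sample.equals(same))
print(sample["Colors Reported"].isna().all())
```

Empty fields in the file are read as missing values, which pandas displays as NaN.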


In [7]:
ufo['State']

0        NY
1        NJ
2        CO
3        KS
4        NY
5        ND
6        CA
7        MI
8        AK
9        OR
10       CA
11       AL
12       SC
13       IA
14       MI
15       CA
16       CA
17       GA
18       TN
19       AK
20       NE
21       LA
22       LA
23       KY
24       WV
25       CA
26       WV
27       NM
28       NM
29       UT
         ..
18211    MA
18212    CA
18213    CA
18214    TX
18215    TX
18216    CA
18217    CO
18218    TX
18219    CA
18220    CA
18221    NH
18222    PA
18223    SC
18224    OK
18225    CA
18226    CA
18227    CA
18228    TX
18229    IL
18230    CA
18231    CA
18232    WI
18233    AK
18234    CA
18235    AZ
18236    IL
18237    IA
18238    WI
18239    WI
18240    FL
Name: State, Length: 18241, dtype: object

In [8]:
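# Create a new column by joining two string columns.
# A new column must be assigned with bracket notation, not dot notation.
ufo['Location'] = ufo.City + ', ' + ufo.State
ufo.head()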

|   | City | Colors Reported | Shape Reported | State | Time | Location |
|---|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 | Ithaca, NY |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 | Willingboro, NJ |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 | Holyoke, CO |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 | Abilene, KS |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 | New York Worlds Fair, NY |
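The column creation above works element by element: each City string is joined with the matching State string. A minimal sketch on a tiny hand-made frame (the missing City in the last row is an assumption added to show what happens with gaps) looks like this:

```python
import numpy as np
import pandas as pd

# Small stand-in for the UFO data; the last City is deliberately missing.
df = pd.DataFrame({
    "City": ["Ithaca", "Holyoke", np.nan],
    "State": ["NY", "CO", "KS"],
})

# String concatenation is applied row by row across the two columns.
df["Location"] = df["City"] + ", " + df["State"]

print(df)
```

When either piece is missing, the combined value is missing as well, so the new column can have fewer non-null entries than the State column.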


In [9]:
ufo.describe()

|   | City | Colors Reported | Shape Reported | State | Time | Location |
|---|---|---|---|---|---|---|
| count | 18216 | 2882 | 15597 | 18241 | 18241 | 18216 |
| unique | 6476 | 27 | 27 | 52 | 16145 | 8029 |
| top | Seattle | RED | LIGHT | CA | 11/16/1999 19:00 | Seattle, WA |
| freq | 187 | 780 | 2803 | 2529 | 27 | 187 |


In [10]:
ufo.shape

(18241, 6)

In [11]:
ufo.dtypes

City               object
Colors Reported    object
Shape Reported     object
State              object
Time               object
Location           object
dtype: object

In [12]:
ufo.columns

Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time',
       'Location'],
      dtype='object')
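The four inspection calls above (describe, shape, dtypes, columns) can be tried on a small frame where the answers are easy to check by eye. The data below is a hand-made sample, not the UFO file.

```python
import pandas as pd

# Hand-made sample with one missing shape.
df = pd.DataFrame({
    "State": ["NY", "NJ", "NY", "CA"],
    "Shape Reported": ["TRIANGLE", "OTHER", "LIGHT", None],
})

# For text columns, describe() reports count, unique, top, and freq.
summary = df.describe()

print(summary)
print(df.shape)          # (rows, columns)
print(df.dtypes)         # one dtype per column
print(list(df.columns))  # column labels
```

Here count skips missing values, top is the most frequent value, and freq is how often it appears. Note that describe() and head() are methods, while shape, dtypes, and columns are attributes.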

In [13]:
# Reload the original data to start again without the Location column.
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [14]:
# Drop a single column. axis=1 means columns; inplace=True modifies ufo directly.
ufo.drop('Colors Reported', axis=1, inplace=True)
ufo.head()

|   | City | Shape Reported | State | Time |
|---|---|---|---|---|
| 0 | Ithaca | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | LIGHT | NY | 4/18/1933 19:00 |


In [15]:
# Drop several columns at once by passing a list of names.
ufo.drop(['State','Time'], axis=1, inplace=True)
ufo.head()

|   | City | Shape Reported |
|---|---|---|
| 0 | Ithaca | TRIANGLE |
| 1 | Willingboro | OTHER |
| 2 | Holyoke | OVAL |
| 3 | Abilene | DISK |
| 4 | New York Worlds Fair | LIGHT |
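The drop calls above use inplace=True, which changes ufo itself. An alternative worth knowing is to leave the original untouched and keep the returned copy. The sketch below uses a tiny hand-made frame and the columns= keyword, which should be equivalent to passing the names with axis=1.

```python
import pandas as pd

# Small stand-in for the UFO data.
df = pd.DataFrame({
    "City": ["Ithaca", "Holyoke"],
    "Colors Reported": [None, None],
    "State": ["NY", "CO"],
})

# Without inplace=True, drop returns a new DataFrame.
trimmed = df.drop(columns=["Colors Reported"])

# The axis=1 spelling selects the same column.
same = df.drop("Colors Reported", axis=1)

print(list(trimmed.columns))
print(list(df.columns))  # the original still has all three columns
```

Keeping the original frame around avoids having to reload the file when you want the dropped columns back.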


In [16]:
# Reload the original data again to restore the dropped columns.
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [17]:
# Sort the State column in descending (Z to A) order and show the first 25 values.
ufo.State.sort_values(ascending=False).head(25)

12079    WY
11490    WY
11333    WY
4866     WY
3326     WY
3328     WY
16594    WY
1177     WY
378      WY
5065     WY
7684     WY
6116     WY
10729    WY
12072    WY
16637    WY
14618    WY
7491     WY
12063    WY
14240    WY
14586    WY
15667    WY
7485     WY
14747    WY
1461     WY
1442     WY
Name: State, dtype: object

In [18]:
# Sort the whole DataFrame by a single column.
ufo.sort_values('City').head()

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 1761 | Abbeville | NaN | DISK | SC | 12/10/1968 0:30 |
| 4553 | Aberdeen | NaN | CYLINDER | WA | 6/15/1981 22:00 |
| 16167 | Aberdeen | NaN | VARIOUS | OH | 3/29/2000 3:00 |
| 14703 | Aberdeen | NaN | TRIANGLE | WA | 9/30/1999 21:00 |
| 389 | Aberdeen | ORANGE | CIRCLE | SD | 11/15/1956 18:30 |


In [19]:
# Sort by City first, then break ties by State.
ufo.sort_values(['City','State']).head(25)

|   | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 1761 | Abbeville | NaN | DISK | SC | 12/10/1968 0:30 |
| 2297 | Aberdeen | NaN | TRIANGLE | MD | 8/18/1972 1:30 |
| 9404 | Aberdeen | NaN | DISK | MD | 6/15/1996 13:30 |
| 16167 | Aberdeen | NaN | VARIOUS | OH | 3/29/2000 3:00 |
| 389 | Aberdeen | ORANGE | CIRCLE | SD | 11/15/1956 18:30 |
| 4553 | Aberdeen | NaN | CYLINDER | WA | 6/15/1981 22:00 |
| 12294 | Aberdeen | NaN | FIREBALL | WA | 10/4/1998 4:42 |
| 14703 | Aberdeen | NaN | TRIANGLE | WA | 9/30/1999 21:00 |
| 17809 | Aberdeen | GREEN | FIREBALL | WA | 10/29/2000 17:25 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
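Sorting by more than one column, as in the last cell, can be checked on a handful of rows. The sketch below uses a few City and State pairs taken from the output above and also shows the ascending parameter accepting one flag per sort key.

```python
import pandas as pd

# A few City/State pairs from the UFO output, in scrambled order.
df = pd.DataFrame({
    "City": ["Aberdeen", "Abilene", "Aberdeen", "Abbeville", "Aberdeen"],
    "State": ["WA", "KS", "MD", "SC", "OH"],
})

# Sort by City, then by State within each City.
by_city_state = df.sort_values(["City", "State"])

# One flag per key: City ascending, State descending.
mixed = df.sort_values(["City", "State"], ascending=[True, False])

print(by_city_state)
print(mixed)
```

sort_values returns a new DataFrame and keeps the original row labels, which is why the index values in the sorted output look shuffled. With a single sort key, rows that tie on that key are not guaranteed to keep any particular order unless a stable sort is requested, so adding a second key makes the result predictable.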
