# Let's get started
This is what I learnt about the pandas library from the free Udemy course 'A Gentle Introduction to the Top Python Libraries used in Applied Machine Learning'.

## Part 1 - Basics
### Importing a library
The word import means to bring a library into Python.

In [4]:
import pandas as pd

### Creating a series
Let's create a series. A Series is a one-dimensional, array-like object.

In [10]:
s = pd.Series([3,5,5,9,6,8,10,4,2])
s

0     3
1     5
2     5
3     9
4     6
5     8
6    10
7     4
8     2
dtype: int64
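
A Series pairs every value with an index label, which is the left-hand column in the output above. Here is a minimal sketch of how to look at the two parts separately; the `labelled` Series and its "a", "b", "c" labels are made up purely for illustration and are not part of the course material.

```python
import pandas as pd

# Same values as the Series created above.
s = pd.Series([3, 5, 5, 9, 6, 8, 10, 4, 2])

# The index holds the labels (0, 1, 2, ... when we don't supply any).
print(s.index)

# The values on their own, as a NumPy array.
print(s.to_numpy())

# Number of elements in the Series.
print(len(s))

# Hypothetical custom labels, for illustration only.
labelled = pd.Series([3, 5, 5], index=["a", "b", "c"])
print(labelled["b"])
```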

### Checking out head
I want to see the first 5 observations.

In [13]:
s.head()

0    3
1    5
2    5
3    9
4    6
dtype: int64
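
head() returns 5 rows when we give it no argument, but it also accepts a number. A small sketch, using the same Series as above, of asking for a different number of rows and of looking at the end of the data with tail():

```python
import pandas as pd

s = pd.Series([3, 5, 5, 9, 6, 8, 10, 4, 2])

# First three observations instead of the default five.
first_three = s.head(3)
print(first_three)

# tail() works the same way from the other end.
last_two = s.tail(2)
print(last_two)
```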

## Part 2 - Importing CSV

### Pull data from a URL
Let's read some comma-separated data from a URL.

In [109]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
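
Since the file is comma-separated, pd.read_csv should do the same job without the sep argument, because a comma is its default separator. The sketch below avoids the network by reading from a short in-memory string; `csv_text` is a stand-in with a few rows of the UFO data, not the real file.

```python
import io

import pandas as pd

# Stand-in for the remote file: a few rows of the UFO data.
csv_text = """City,Colors Reported,Shape Reported,State,Time
Ithaca,,TRIANGLE,NY,6/1/1930 22:00
Willingboro,,OTHER,NJ,6/30/1930 20:00
Holyoke,,OVAL,CO,2/15/1931 14:00
"""

# read_csv splits on commas by default, so sep=',' is not needed.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)

# Empty fields in the file are read as missing values (NaN).
print(sample["Colors Reported"].isna())
```

With the real data, the equivalent call would be pd.read_csv('http://bit.ly/uforeports').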

### View the head

In [111]:
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


### View observations (rows) of an attribute

In [113]:
ufo["State"]

0        NY
1        NJ
2        CO
3        KS
4        NY
         ..
18236    IL
18237    IA
18238    WI
18239    WI
18240    FL
Name: State, Length: 18241, dtype: object
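
There are two ways to pick out a column: brackets, as above, and dot notation. A minimal sketch on a hypothetical three-row sample (`ufo_sample` is made up from the first rows of the data) showing where each one works:

```python
import pandas as pd

# Hypothetical three-row sample of the UFO data.
ufo_sample = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke"],
    "Shape Reported": ["TRIANGLE", "OTHER", "OVAL"],
    "State": ["NY", "NJ", "CO"],
})

# Bracket and dot notation return the same Series.
states = ufo_sample["State"]
same_states = ufo_sample.State

# A column name with a space only works with brackets.
shapes = ufo_sample["Shape Reported"]

# A list of names inside the brackets returns a DataFrame.
subset = ufo_sample[["City", "State"]]
print(subset)
```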

### Concatenate two or more attributes

In [115]:
ufo["Location"] = ufo.City + ":" + ufo.State
ufo["Location"]

0                      Ithaca:NY
1                 Willingboro:NJ
2                     Holyoke:CO
3                     Abilene:KS
4        New York Worlds Fair:NY
                  ...           
18236              Grant Park:IL
18237             Spirit Lake:IA
18238             Eagle River:WI
18239             Eagle River:WI
18240                    Ybor:FL
Name: Location, Length: 18241, dtype: object
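
One thing to watch with the + operator: if either piece is missing, the result for that row is missing too. The sketch below uses a made-up sample with one missing city (I have not checked which rows of the real data this applies to) and compares + with the str.cat method, which lets us choose a placeholder through na_rep.

```python
import pandas as pd

# Made-up sample with one missing city, for illustration only.
df = pd.DataFrame({
    "City": ["Ithaca", None, "Holyoke"],
    "State": ["NY", "NJ", "CO"],
})

# With +, a missing City makes the whole Location missing.
plus = df["City"] + ":" + df["State"]
print(plus)

# str.cat joins the same columns and fills gaps with na_rep.
cat = df["City"].str.cat(df["State"], sep=":", na_rep="?")
print(cat)
```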

In [117]:
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time | Location |
|---|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 | Ithaca:NY |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 | Willingboro:NJ |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 | Holyoke:CO |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 | Abilene:KS |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 | New York Worlds Fair:NY |


### Checking the shape of our data
The shape attribute gives us the number of rows and columns as a (rows, columns) tuple.

In [119]:
ufo.shape

(18241, 6)
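
Because shape is a plain tuple, we can unpack it into two variables. A quick sketch on a hypothetical three-row sample (not the full dataset):

```python
import pandas as pd

# Hypothetical three-row sample, not the full 18241-row dataset.
df = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke"],
    "State": ["NY", "NJ", "CO"],
})

# shape is an attribute, so there are no parentheses after it.
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# len() gives the number of rows only.
print(len(df))
```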

### Let's check the data types

In [121]:
ufo.dtypes

City               object
Colors Reported    object
Shape Reported     object
State              object
Time               object
Location           object
dtype: object
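
Every column above is object, which for this data means plain text, including Time. If we wanted real dates we could convert the column with pd.to_datetime. This is only a sketch on three sample values; the format string is my assumption based on how the timestamps look in the head() output.

```python
import pandas as pd

# Three Time values copied from the head() output.
df = pd.DataFrame({
    "Time": ["6/1/1930 22:00", "6/30/1930 20:00", "2/15/1931 14:00"],
})

# Assumed layout: month/day/year hour:minute.
df["Time"] = pd.to_datetime(df["Time"], format="%m/%d/%Y %H:%M")
print(df.dtypes)

# Once converted, the .dt accessor exposes the date parts.
years = df["Time"].dt.year.tolist()
print(years)
```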

### Let's check the columns

In [127]:
ufo.columns

Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time',
       'Location'],
      dtype='object')

## Part 3 - Remove columns and sort data

In [129]:
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time | Location |
|---|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 | Ithaca:NY |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 | Willingboro:NJ |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 | Holyoke:CO |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 | Abilene:KS |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 | New York Worlds Fair:NY |


### Remove one column

In [131]:
ufo.drop('Shape Reported', axis=1, inplace=True)

In [133]:
ufo.head()

|  | City | Colors Reported | State | Time | Location |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | NY | 6/1/1930 22:00 | Ithaca:NY |
| 1 | Willingboro | NaN | NJ | 6/30/1930 20:00 | Willingboro:NJ |
| 2 | Holyoke | NaN | CO | 2/15/1931 14:00 | Holyoke:CO |
| 3 | Abilene | NaN | KS | 6/1/1931 13:00 | Abilene:KS |
| 4 | New York Worlds Fair | NaN | NY | 4/18/1933 19:00 | New York Worlds Fair:NY |
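
With inplace=True the column is gone from ufo for good. An alternative, sketched below on a hypothetical two-row sample, is to leave inplace off and keep the result in a new variable, so the original stays intact. The columns= keyword is another way of writing axis=1.

```python
import pandas as pd

# Hypothetical two-row sample of the UFO data.
df = pd.DataFrame({
    "City": ["Ithaca", "Willingboro"],
    "Shape Reported": ["TRIANGLE", "OTHER"],
    "State": ["NY", "NJ"],
})

# Without inplace=True, drop returns a new DataFrame.
trimmed = df.drop(columns="Shape Reported")

print(list(trimmed.columns))  # the column is gone here
print(list(df.columns))       # but still present in the original
```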


### Remove more than one column

In [138]:
ufo.drop(['State', 'Time'], axis=1, inplace=True)
ufo.head()

|  | City | Colors Reported | Location |
|---|---|---|---|
| 0 | Ithaca | NaN | Ithaca:NY |
| 1 | Willingboro | NaN | Willingboro:NJ |
| 2 | Holyoke | NaN | Holyoke:CO |
| 3 | Abilene | NaN | Abilene:KS |
| 4 | New York Worlds Fair | NaN | New York Worlds Fair:NY |
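
A notebook gotcha worth knowing: running a drop cell a second time raises a KeyError, because the columns no longer exist. The sketch below reproduces that on a hypothetical two-row sample and shows the errors="ignore" option, which skips names that are already gone.

```python
import pandas as pd

# Hypothetical two-row sample of the UFO data.
df = pd.DataFrame({
    "City": ["Ithaca", "Willingboro"],
    "State": ["NY", "NJ"],
    "Time": ["6/1/1930 22:00", "6/30/1930 20:00"],
})

# First run: both columns are removed.
df.drop(["State", "Time"], axis=1, inplace=True)

# Second run: the columns are already gone, so pandas raises KeyError.
rerun_error = None
try:
    df.drop(["State", "Time"], axis=1, inplace=True)
except KeyError as exc:
    rerun_error = exc
print(type(rerun_error).__name__)

# errors="ignore" makes the call safe to repeat.
df.drop(columns=["State", "Time"], errors="ignore", inplace=True)
print(list(df.columns))
```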


### Sort data
#### Sort one column only

In [141]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [147]:
ufo.State.sort_values(ascending=False).head(10)

1461     WY
1442     WY
2437     WY
5065     WY
1985     WY
14586    WY
12063    WY
12072    WY
12079    WY
2267     WY
Name: State, dtype: object
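
sort_values returns a sorted copy and leaves the original column as it was. A small sketch with the five State values from the head() output, showing both sort directions:

```python
import pandas as pd

# The five State values from the head() output.
states = pd.Series(["NY", "NJ", "CO", "KS", "NY"], name="State")

# Ascending (A to Z) is the default.
ascending = states.sort_values()
print(ascending)

# ascending=False reverses the order (Z to A).
descending = states.sort_values(ascending=False)
print(descending)

# The original Series is unchanged.
print(states.tolist())
```

Notice that each value keeps its original index label after sorting, which is why the index in the output above looks shuffled.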

#### Sort the entire table based on a single column

In [149]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [153]:
ufo.sort_values('City').head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 1761 | Abbeville | NaN | DISK | SC | 12/10/1968 0:30 |
| 17809 | Aberdeen | GREEN | FIREBALL | WA | 10/29/2000 17:25 |
| 2297 | Aberdeen | NaN | TRIANGLE | MD | 8/18/1972 1:30 |
| 9404 | Aberdeen | NaN | DISK | MD | 6/15/1996 13:30 |
| 389 | Aberdeen | ORANGE | CIRCLE | SD | 11/15/1956 18:30 |
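
After sorting, the rows carry their old index labels with them. If we would rather have them numbered 0, 1, 2, ... again, reset_index(drop=True) does that. A minimal sketch on a hypothetical four-row sample:

```python
import pandas as pd

# Hypothetical four-row sample of the UFO data.
df = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene"],
    "State": ["NY", "NJ", "CO", "KS"],
})

# Sort the whole table by City; each row keeps its old label.
by_city = df.sort_values("City")
print(by_city)

# Renumber the rows from 0 and discard the old labels.
renumbered = by_city.reset_index(drop=True)
print(renumbered)
```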


#### Sort by two columns

In [160]:
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')
ufo.head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [164]:
ufo.sort_values(['City', 'State']).head()

|  | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 1761 | Abbeville | NaN | DISK | SC | 12/10/1968 0:30 |
| 2297 | Aberdeen | NaN | TRIANGLE | MD | 8/18/1972 1:30 |
| 9404 | Aberdeen | NaN | DISK | MD | 6/15/1996 13:30 |
| 16167 | Aberdeen | NaN | VARIOUS | OH | 3/29/2000 3:00 |
| 389 | Aberdeen | ORANGE | CIRCLE | SD | 11/15/1956 18:30 |
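
When we sort by two columns, the second one only breaks ties in the first. Each column can also get its own direction by passing a list to ascending. A sketch on four rows taken from the outputs above (City and State only):

```python
import pandas as pd

# Four City/State pairs taken from the sorted outputs above.
df = pd.DataFrame({
    "City": ["Aberdeen", "Aberdeen", "Abbeville", "Aberdeen"],
    "State": ["WA", "MD", "SC", "SD"],
})

# City A to Z, and within the same City, State Z to A.
mixed = df.sort_values(["City", "State"], ascending=[True, False])
print(mixed)
```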


## Summary
After we successfully install Python, we open Jupyter Notebook by navigating to the command prompt and typing jupyter notebook.

The word import means to "bring in."

Markdown cells let us write formatted text, and they also accept HTML. This gives us an easy way to annotate our notebooks.

We can use Ctrl + Enter to execute the contents of a cell.

The pound sign (#) allows us to comment our work. It's a way to document our work inside a code cell.

When we use the as keyword we are creating an alias. It's just a way to reference our library more simply.

A pandas Series is a one-dimensional array of indexed data.

Once we've declared a variable and placed data in it, we can view the first X rows using variable.head(X). With no argument, head() returns the first 5 rows.

A function takes parameters and returns a value. A "method" is a specific type of function: it must be part of a "class", so it has access to the class's member variables. A function is usually discrete and all variables must be passed in.

Recall that method calls end with parentheses. So, head() is a method.

Keep in mind that in a Series, and in most other objects in Python, counting starts at 0. So, in most data sets index 0 refers to the first row.

Once you understand the code, go back and try to document each cell.
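
To tie a few of these points together, here is a short sketch that shows an alias, a function, a method, an attribute, and zero-based counting side by side. The Series values are just the first five from Part 1.

```python
import pandas as pd  # "pd" is the alias created by the as keyword

s = pd.Series([3, 5, 5, 9, 6])

# len() is a function: the Series is passed in as a parameter.
print(len(s))

# head() is a method: it belongs to the Series and is called with parentheses.
print(s.head(2))

# shape is an attribute, not a method, so it has no parentheses.
print(s.shape)

# Counting starts at 0, so position 0 holds the first value.
first = s.iloc[0]
print(first)
```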