## How To Install Pandas for Data Analysis 

Open Terminal and type the following command: "pip3 install pandas"

## To Import Pandas

In [2]:
import pandas as pd
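A quick way to confirm that the installation worked is to print the installed version. This is only a minimal check; the version number you see depends on your environment.

```python
import pandas as pd

# If the import succeeds, pandas is installed; the version string
# printed here will differ from machine to machine.
print(pd.__version__)
```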

## Series

A Series is a one-dimensional array-like object containing a sequence of values and an associated index.

In [3]:
example = pd.Series([1,2,3,4,5])

In [4]:
example

0    1
1    2
2    3
3    4
4    5
dtype: int64

The left side of the Series shows the Index.
The right side shows the Values.
dtype stands for data type; here it is int64 (64-bit integers).

In [5]:
example.index # shows the range of index 

RangeIndex(start=0, stop=5, step=1)

In [6]:
example.values # shows the values of the Series

array([1, 2, 3, 4, 5])
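The index does not have to be the default range. As a minimal sketch, the labels `a` to `e` below are made up for illustration and are not part of the example above; they show that a value can be looked up either by label or by position.

```python
import pandas as pd

# Hypothetical labels 'a' to 'e', chosen only for this illustration.
labelled = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

print(labelled.index)    # an Index of string labels instead of a RangeIndex
print(labelled["c"])     # look up a value by its label
print(labelled.iloc[2])  # look up the same value by its position
```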

Let's create another example of a series

In [7]:
fruit = pd.Series(["Apple","Orange","Blackberry","Mango"])

In [8]:
fruit

0         Apple
1        Orange
2    Blackberry
3         Mango
dtype: object

You can also create a Series from a Python dict.

In [9]:
test = {"Alice":90,"Bob":70,"Carlos":50} # python dict

In [10]:
obj = pd.Series(test) # change dict into Series

In [11]:
obj

Alice     90
Bob       70
Carlos    50
dtype: int64

Notice that the index of the Series is now made of names rather than a range of numbers.
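Because the index holds names, values can be retrieved by name. The sketch below rebuilds the same Series and shows a few common lookups; the threshold of 60 is an arbitrary value chosen for illustration.

```python
import pandas as pd

test = {"Alice": 90, "Bob": 70, "Carlos": 50}
obj = pd.Series(test)

print(obj["Bob"])       # value stored under the label "Bob"
print(obj[obj > 60])    # boolean filter; the matching labels are kept
print("Alice" in obj)   # membership tests check the index, not the values
```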

Let's say we have a list of students (including those who have not taken the exam).

In [12]:
name = ['Alice','Bob','Carlos','Danielle','Evan']

We can use the name list as the index of our Series:

In [13]:
obj2 = pd.Series(test, index=name)

In [14]:
obj2

Alice       90.0
Bob         70.0
Carlos      50.0
Danielle     NaN
Evan         NaN
dtype: float64

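Note that Danielle and Evan have no scores because none were provided, so pandas shows NaN (not a number) for them. The dtype also changes from int64 to float64 because NaN is a floating-point value.

To check whether a value is null: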

In [15]:
obj2.isnull()

Alice       False
Bob         False
Carlos      False
Danielle     True
Evan         True
dtype: bool
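Once the missing values are identified, they are usually either dropped or replaced. The following is a minimal sketch of both options; replacing a missing score with 0 is only an assumption made for illustration, not a recommendation.

```python
import pandas as pd

test = {"Alice": 90, "Bob": 70, "Carlos": 50}
name = ["Alice", "Bob", "Carlos", "Danielle", "Evan"]
obj2 = pd.Series(test, index=name)

print(obj2.notnull())   # the inverse of isnull()
print(obj2.dropna())    # keep only the students who have a score
print(obj2.fillna(0))   # assumption: treat a missing score as 0
```

Both `dropna()` and `fillna()` return a new Series and leave `obj2` unchanged.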

## DataFrame

A DataFrame represents a rectangular table of data and contains an ordered collection of columns. It can be thought of as a dictionary of Series sharing the same index.
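The "dictionary of Series" view can be made concrete with a small sketch. The variable names `history`, `maths`, and `scores` are chosen here for illustration only.

```python
import pandas as pd

names = ["Alice", "Bob", "Carlos"]
history = pd.Series([90, 70, 50], index=names)
maths = pd.Series([60, 70, 90], index=names)

# Each dictionary key becomes a column; the shared index becomes the rows.
scores = pd.DataFrame({"History": history, "Maths": maths})

print(scores)
print(type(scores["History"]))  # selecting one column gives back a Series
```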

Using the student score example:

In [16]:
data = {'Name': ['Alice','Bob','Carlos','Danielle','Evan'],
       'History': [90,70,50,60,80],
        'Maths':[60,70,90,80,50]}
frame = pd.DataFrame(data)

In [17]:
frame

| | History | Maths | Name |
|---|---|---|---|
| 0 | 90 | 60 | Alice |
| 1 | 70 | 70 | Bob |
| 2 | 50 | 90 | Carlos |
| 3 | 60 | 80 | Danielle |
| 4 | 80 | 50 | Evan |


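Older pandas versions sort the columns alphabetically when given a dictionary, while recent versions keep the dictionary's insertion order. To specify the order of columns explicitly: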

In [18]:
pd.DataFrame(data,columns=['Name','History','Maths'])

| | Name | History | Maths |
|---|---|---|---|
| 0 | Alice | 90 | 60 |
| 1 | Bob | 70 | 70 |
| 2 | Carlos | 50 | 90 |
| 3 | Danielle | 60 | 80 |
| 4 | Evan | 80 | 50 |


If you pass a column name that is not contained in the dictionary, that column is filled with null (NaN) values.

In [19]:
pd.DataFrame(data,columns=['Name','History','Maths','Biology'])

| | Name | History | Maths | Biology |
|---|---|---|---|---|
| 0 | Alice | 90 | 60 | NaN |
| 1 | Bob | 70 | 70 | NaN |
| 2 | Carlos | 50 | 90 | NaN |
| 3 | Danielle | 60 | 80 | NaN |
| 4 | Evan | 80 | 50 | NaN |
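An empty column like this can be filled in later by assigning to it. In the sketch below, the Biology scores are invented for illustration, and `frame2` is a name introduced here so the original `frame` is not affected.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Carlos', 'Danielle', 'Evan'],
        'History': [90, 70, 50, 60, 80],
        'Maths': [60, 70, 90, 80, 50]}
frame2 = pd.DataFrame(data, columns=['Name', 'History', 'Maths', 'Biology'])

# Every Biology entry is missing right after construction.
all_missing = frame2['Biology'].isnull().all()
print(all_missing)

# Hypothetical scores; the list length must match the number of rows.
frame2['Biology'] = [75, 65, 85, 55, 95]
print(frame2)
```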


Return to the original DataFrame:

In [20]:
frame

| | History | Maths | Name |
|---|---|---|---|
| 0 | 90 | 60 | Alice |
| 1 | 70 | 70 | Bob |
| 2 | 50 | 90 | Carlos |
| 3 | 60 | 80 | Danielle |
| 4 | 80 | 50 | Evan |


In [21]:
frame.History 

0    90
1    70
2    50
3    60
4    80
Name: History, dtype: int64

In [22]:
frame.Maths

0    60
1    70
2    90
3    80
4    50
Name: Maths, dtype: int64

In [23]:
frame.loc[0] # shows the row whose index label is 0

History       90
Maths         60
Name       Alice
Name: 0, dtype: object
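Attribute access such as `frame.History` only works when the column name is a valid Python identifier. As a minimal sketch, the bracket form below works for any column name, and `loc` and `iloc` can select single cells or rows.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Carlos', 'Danielle', 'Evan'],
        'History': [90, 70, 50, 60, 80],
        'Maths': [60, 70, 90, 80, 50]}
frame = pd.DataFrame(data)

print(frame['History'])           # same result as frame.History
print(frame[['Name', 'Maths']])   # a list of names returns a DataFrame
print(frame.loc[0, 'Name'])       # row label 0, column 'Name'
print(frame.iloc[-1])             # last row, selected by position
```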

## To load external files with Pandas

In [28]:
df = pd.read_csv("testscore.csv")

In [29]:
df

| | Name | History | Math | Biology |
|---|---|---|---|---|
| 0 | Alice | 100 | 60 | 70 |
| 1 | Bob | 70 | 70 | 50 |
| 2 | Carlos | 60 | 70 | 30 |
| 3 | Danielle | 50 | 50 | 60 |
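If you do not have `testscore.csv` at hand, the same call can be tried on in-memory text. This sketch assumes the file is plain comma-separated text with a header row matching the table above; the actual file contents are not shown in this tutorial.

```python
import io
import pandas as pd

# Assumed contents of testscore.csv, reconstructed from the table above.
csv_text = """Name,History,Math,Biology
Alice,100,60,70
Bob,70,70,50
Carlos,60,70,30
Danielle,50,50,60
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first two rows
print(df.shape)     # (number of rows, number of columns)
```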


In [32]:
df['Biology'] = 0

In [33]:
df

| | Name | History | Math | Biology |
|---|---|---|---|---|
| 0 | Alice | 100 | 60 | 0 |
| 1 | Bob | 70 | 70 | 0 |
| 2 | Carlos | 60 | 70 | 0 |
| 3 | Danielle | 50 | 50 | 0 |
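Assigning a single value to a column, as above, sets that value on every row. The sketch below rebuilds the table in memory (so the CSV file is not needed) and also adds a derived column; the `Total` column is a hypothetical addition for illustration.

```python
import pandas as pd

# Values copied from the table loaded from testscore.csv.
df = pd.DataFrame({"Name": ["Alice", "Bob", "Carlos", "Danielle"],
                   "History": [100, 70, 60, 50],
                   "Math": [60, 70, 70, 50],
                   "Biology": [70, 50, 30, 60]})

df["Biology"] = 0                         # a scalar is broadcast to all rows
df["Total"] = df["History"] + df["Math"]  # hypothetical column built from others

print(df)
```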


In [43]:
import os

In [46]:
#df.to_csv("newtest.csv")
#os.listdir('Users/SUH/panda101')
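The commented-out lines above write the DataFrame to a CSV file and list a directory. A minimal, self-contained sketch of that round trip follows; it writes to a temporary directory rather than the path above, and uses a small two-row DataFrame made up for illustration. By default `to_csv` also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back; passing `index=False` avoids that.

```python
import os
import tempfile

import pandas as pd

# Small stand-in DataFrame; the values are for illustration only.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "History": [100, 70]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "newtest.csv")
    df.to_csv(path, index=False)   # index=False leaves out the row labels
    files = os.listdir(tmp)        # confirm the file was created
    restored = pd.read_csv(path)   # read it back before the folder is removed

print(files)
print(restored)
```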