# pandas Series and DataFrame

## pandas
**pandas** is an open source library that provides data structures and data analysis tools for Python programmers.

In [1]:
import pandas as pd

## Series
The pandas **Series** is a one-dimensional labeled array, similar to a Python list, in which every value has an index label.

In [3]:
airports = pd.Series([
    'Seattle-Tacoma',
    'Dulles',
    'London Heathrow',
    'Schiphol',
    'Changi',
    'Pearson',
    'Narita'
])

# In a notebook, the value of the last expression in a cell is displayed
# automatically, so typing the variable name is enough to examine it.
# You can also call print(airports) explicitly.
airports

0     Seattle-Tacoma
1             Dulles
2    London Heathrow
3           Schiphol
4             Changi
5            Pearson
6             Narita
dtype: object

You can reference an individual value in a Series using its index.

In [4]:
airports[2]

'London Heathrow'
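The lookup above works because the default index labels (0, 1, 2, ...) happen to match the positions. As a minimal sketch of the difference, the Series below uses string labels instead; the airport codes and the name `airports_by_code` are illustrative and not part of the original notebook. `.loc` looks a value up by label and `.iloc` by integer position.

```python
import pandas as pd

# Illustrative Series with string labels instead of the default 0..n-1 index
airports_by_code = pd.Series(
    ['Seattle-Tacoma', 'Dulles', 'London Heathrow'],
    index=['SEA', 'IAD', 'LHR'],
)

by_label = airports_by_code.loc['LHR']     # look up by index label
by_position = airports_by_code.iloc[2]     # look up by integer position

print(by_label)
print(by_position)
```

Both lookups return the same value here, because 'LHR' is the label of the third element.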

You can use a loop to iterate through all the values in a Series.

In [5]:
for value in airports:
    print(value) 

Seattle-Tacoma
Dulles
London Heathrow
Schiphol
Changi
Pearson
Narita
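If the index label is needed alongside each value, one option is `Series.items()`, which yields (label, value) pairs. The sketch below uses a shortened copy of the airports Series; the `pairs` list is there only to collect the results for inspection.

```python
import pandas as pd

# Shortened copy of the airports Series, for illustration only
airports = pd.Series(['Seattle-Tacoma', 'Dulles', 'London Heathrow'])

pairs = []
for label, value in airports.items():
    # label is the index label, value is the element stored under it
    pairs.append((label, value))
    print(label, value)
```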


## DataFrame
Most of the time when working with pandas, we are dealing with two-dimensional data arranged in rows and columns.

The pandas **DataFrame** stores two-dimensional data, with a label for every row and every column.

In [4]:
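# Column labels taken from the output shown below
airports = pd.DataFrame([
    ['Seatte-Tacoma', 'Seattle', 'USA'],
    ['Dulles', 'Washington', 'USA'],
    ['London Heathrow', 'London', 'United Kingdom'],
    ['Schiphol', 'Amsterdam', 'Netherlands'],
    ['Changi', 'Singapore', 'Singapore'],
    ['Pearson', 'Toronto', 'Canada'],
    ['Narita', 'Tokyo', 'Japan']
    ],
    columns=['Bam', 'Rogers', 'Irine']
)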

airports

|   | Bam | Rogers | Irine |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |


Use the **columns** parameter to specify names for the columns when you create the DataFrame.

In [5]:
airports = pd.DataFrame([
    ['Seatte-Tacoma', 'Seattle', 'USA'],
    ['Dulles', 'Washington', 'USA'],
    ['London Heathrow', 'London', 'United Kingdom'],
    ['Schiphol', 'Amsterdam', 'Netherlands'],
    ['Changi', 'Singapore', 'Singapore'],
    ['Pearson', 'Toronto', 'Canada'],
    ['Narita', 'Tokyo', 'Japan']
    ],
    columns=['Name', 'City', 'Country']
)

airports 

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |
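To see what the **columns** parameter changes, the sketch below builds the same rows twice, once without column names and once with them. The two-row `rows` list is a shortened stand-in for the airports data. When no names are given, pandas labels the columns with integers starting at 0.

```python
import pandas as pd

# Two rows are enough to show the difference
rows = [
    ['Dulles', 'Washington', 'USA'],
    ['Narita', 'Tokyo', 'Japan'],
]

unnamed = pd.DataFrame(rows)                                    # default integer column labels
named = pd.DataFrame(rows, columns=['Name', 'City', 'Country'])  # explicit column labels

print(list(unnamed.columns))
print(list(named.columns))
```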


In [14]:
# Each dictionary key becomes a column label, and each list holds that column's values
list_of_names = {
    "rogers" : ["Rogers", 22, "Gulu", "Solomon Island"],
    "bam" : ["Bam J", 15, "Kampala", "Uganda"],
    "irine" : ["Irine", 13, "Mokono", "Uganda"],
    "hellen" : ["Hellen", 41, "Pakwach", "Canda"],
    "Jerry" : ["Jerry", 60, "Boro", "USA"]
}

df = pd.DataFrame(list_of_names)
df

|   | rogers | bam | irine | hellen | Jerry |
|---|---|---|---|---|---|
| 0 | Rogers | Bam J | Irine | Hellen | Jerry |
| 1 | 22 | 15 | 13 | 41 | 60 |
| 2 | Gulu | Kampala | Mokono | Pakwach | Boro |
| 3 | Solomon Island | Uganda | Uganda | Canda | USA |
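In the output above each person ends up as a column, because dictionary keys are treated as column labels. If one row per person is wanted instead, one possible approach is `pd.DataFrame.from_dict` with `orient="index"`. The sketch below uses a shortened dictionary, and the column names `Name`, `Age`, and `Town` are assumptions, since the original does not name these fields.

```python
import pandas as pd

# Shortened version of the dictionary, for illustration only
people = {
    "rogers": ["Rogers", 22, "Gulu"],
    "bam": ["Bam J", 15, "Kampala"],
}

# orient="index" turns each key into a row label instead of a column label.
# The column names are hypothetical; the source does not name these fields.
by_row = pd.DataFrame.from_dict(
    people,
    orient="index",
    columns=["Name", "Age", "Town"],
)

print(by_row)
```

With this layout each column holds one kind of value, so `Age` is stored as a numeric column rather than being mixed with strings.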


In [15]:
airports

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |


In [16]:
# head() returns the first five rows by default
airports.head()

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |


In [18]:
# Pass a number to head() to choose how many rows to return
airports.head(2)

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |


In [19]:
airports.head(3)

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |


In [20]:
# tail() returns the last five rows by default
airports.tail()

|   | Name | City | Country |
|---|---|---|---|
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |


In [21]:
airports.tail(2)

|   | Name | City | Country |
|---|---|---|---|
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |
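The `head` and `tail` calls above can be summarized with a small sketch that only counts the rows returned. The seven-row frame `df` and its column `n` are placeholders made up for illustration; note that asking for more rows than exist simply returns the whole frame.

```python
import pandas as pd

# Placeholder frame with seven rows, matching the row count of airports
df = pd.DataFrame({'n': range(7)})

default_head = len(df.head())     # first five rows by default
short_tail = len(df.tail(2))      # last two rows
oversized = len(df.head(10))      # more than available: all seven rows

print(default_head, short_tail, oversized)
```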


In [22]:
# shape is a (rows, columns) tuple
airports.shape

(7, 3)

In [24]:
# info() summarizes the index, column names, non-null counts, dtypes, and memory usage
airports.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   Name     7 non-null      object
 1   City     7 non-null      object
 2   Country  7 non-null      object
dtypes: object(3)
memory usage: 296.0+ bytes


In [25]:
# dtypes lists the data type of each column
airports.dtypes

Name       object
City       object
Country    object
dtype: object
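Every column in `airports` holds text, so all three report the same dtype. As a rough sketch of how `shape` and `dtypes` behave with mixed data, the frame below adds a numeric column; the `Runways` column and its values are illustrative and not taken from the original notebook.

```python
import pandas as pd

# Illustrative frame with one text column and one numeric column
df = pd.DataFrame({
    'City': ['Seattle', 'Tokyo'],
    'Runways': [3, 2],   # made-up values, for illustration only
})

rows, cols = df.shape    # shape unpacks into (number of rows, number of columns)
print(rows, cols)
print(df.dtypes)         # one dtype per column

# Check a column's dtype programmatically rather than by reading the printout
is_numeric = pd.api.types.is_integer_dtype(df['Runways'])
print(is_numeric)
```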

In [26]:
# Select a single column by name; the result is a Series
airports["Name"]

0      Seatte-Tacoma
1             Dulles
2    London Heathrow
3           Schiphol
4             Changi
5            Pearson
6             Narita
Name: Name, dtype: object

In [31]:
# Pass a list of column names to select several columns; the result is a DataFrame
airports[["City", "Name"]]

|   | City | Name |
|---|---|---|
| 0 | Seattle | Seatte-Tacoma |
| 1 | Washington | Dulles |
| 2 | London | London Heathrow |
| 3 | Amsterdam | Schiphol |
| 4 | Singapore | Changi |
| 5 | Toronto | Pearson |
| 6 | Tokyo | Narita |
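The two selections above return different kinds of objects, which the following sketch makes explicit. It uses a shortened two-row copy of the airports data: single brackets with one name give a Series, while double brackets (a list of names) give a DataFrame, even when the list holds only one name.

```python
import pandas as pd

# Shortened copy of the airports data, for illustration only
airports = pd.DataFrame(
    [['Dulles', 'Washington', 'USA'], ['Narita', 'Tokyo', 'Japan']],
    columns=['Name', 'City', 'Country'],
)

one_column = airports['Name']              # Series
two_columns = airports[['City', 'Name']]   # DataFrame, columns in the order requested
one_column_frame = airports[['Name']]      # DataFrame with a single column

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(type(one_column_frame).__name__)
```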


In [33]:
# iloc selects by integer position; this slice returns the first three rows
airports.iloc[:3]

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |


In [36]:
# iloc[row, column]: the value in the third row and third column
airports.iloc[2, 2]

'United Kingdom'
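As a further sketch of positional selection, `iloc` also accepts negative positions and lists of column positions. The three-row frame below is a shortened copy of the airports data, used only for illustration.

```python
import pandas as pd

# Shortened copy of the airports data, for illustration only
airports = pd.DataFrame(
    [
        ['Dulles', 'Washington', 'USA'],
        ['London Heathrow', 'London', 'United Kingdom'],
        ['Narita', 'Tokyo', 'Japan'],
    ],
    columns=['Name', 'City', 'Country'],
)

last_name = airports.iloc[-1, 0]      # last row, first column
subset = airports.iloc[0:2, [0, 2]]   # first two rows, first and third columns

print(last_name)
print(subset)
```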

In [37]:
# A bare colon selects everything: all rows and all columns
airports.iloc[:, :]

|   | Name | City | Country |
|---|---|---|---|
| 0 | Seatte-Tacoma | Seattle | USA |
| 1 | Dulles | Washington | USA |
| 2 | London Heathrow | London | United Kingdom |
| 3 | Schiphol | Amsterdam | Netherlands |
| 4 | Changi | Singapore | Singapore |
| 5 | Pearson | Toronto | Canada |
| 6 | Narita | Tokyo | Japan |


In [38]:
# All rows of the second column (position 1)
airports.iloc[:, 1]

0       Seattle
1    Washington
2        London
3     Amsterdam
4     Singapore
5       Toronto
6         Tokyo
Name: City, dtype: object

In [40]:
# loc selects by label: all rows of the column labeled "Name"
airports.loc[:, "Name"]

0      Seatte-Tacoma
1             Dulles
2    London Heathrow
3           Schiphol
4             Changi
5            Pearson
6             Narita
Name: Name, dtype: object
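One difference between `iloc` and `loc` is easy to miss when the row labels are the default integers: an `iloc` slice excludes its end position, while a `loc` slice includes its end label. The sketch below illustrates this on a shortened five-row copy of the airports data.

```python
import pandas as pd

# Shortened copy of the airports data, for illustration only
airports = pd.DataFrame(
    [
        ['Dulles', 'Washington', 'USA'],
        ['London Heathrow', 'London', 'United Kingdom'],
        ['Schiphol', 'Amsterdam', 'Netherlands'],
        ['Changi', 'Singapore', 'Singapore'],
        ['Pearson', 'Toronto', 'Canada'],
    ],
    columns=['Name', 'City', 'Country'],
)

by_position = airports.iloc[:3]   # positions 0, 1, 2 (end excluded)
by_label = airports.loc[:3]       # labels 0, 1, 2, 3 (end included)

print(len(by_position), len(by_label))

# loc also takes a row label and a column label together
country = airports.loc[2, 'Country']
print(country)
```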