# Import Pandas

To use the Pandas library, you need to import it into your coding environment. For these exercises, our coding environment is this Jupyter notebook. By convention, Pandas is imported under the alias ``pd``.

In [82]:
# import the pandas library and use the alias 'pd'
import pandas as pd

# Series

A series is a data type provided by Pandas. It is like a one-dimensional list with an index attached, and its elements can be of any data type.
Pandas provides a constructor, ``pd.Series()``, that creates a series object from a Python list, tuple, or dictionary.

In [83]:
# create the s1 series from the list: [10,20,30,40,50]
# (the variable is not called 'list', because that would shadow the built-in type)
values_list = [10,20,30,40,50]
s1 = pd.Series(values_list)
print(s1)

# additional practice
# create a series called s_tuple from the tuple: ("apple", "banana", "cherry")
tuple_list = ("apple", "banana", "cherry")
s_tuple = pd.Series(tuple_list)
print(s_tuple)

# create a series called s_dictionary from the dictionary: {"brand": "Ford", "model": "Mustang", "year": 1964}
dictionary_list = {"brand": "Ford", "model": "Mustang", "year": 1964}
s_dictionary = pd.Series(dictionary_list)
print(s_dictionary)

0    10
1    20
2    30
3    40
4    50
dtype: int64
0     apple
1    banana
2    cherry
dtype: object
brand       Ford
model    Mustang
year        1964
dtype: object
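
The ``dtype`` line at the bottom of each output reflects what the series holds. As a small sketch (the variable names ``ints`` and ``mixed`` are made up for this illustration), a series of whole numbers gets an integer dtype, while a series that mixes types falls back to the generic ``object`` dtype:

```python
import pandas as pd

# a series built only from integers gets an integer dtype
ints = pd.Series([10, 20, 30])

# mixing integers, strings, and floats falls back to the generic object dtype
mixed = pd.Series([10, "apple", 2.5])

print(ints.dtype)
print(mixed.dtype)
```

The exact integer width printed for ``ints`` can depend on the platform and library versions, so treat it as illustrative.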


In [84]:
# fill in the blank with the series values and index values
print("The s1 series values are:", [10, 20, 30, 40, 50])
print("The s1 index values are:", [0, 1, 2, 3, 4])
print("\n")
print("The s_tuple series values are:", ["apple", "banana", "cherry"])
print("The s_tuple index values are:", [0, 1, 2])
print("\n")
print("The s_dictionary series values are:", ["Ford", "Mustang", 1964])
print("The s_dictionary index values are:", ["brand", "model", "year"])


The s1 series values are: [10, 20, 30, 40, 50]
The s1 index values are: [0, 1, 2, 3, 4]


The s_tuple series values are: ['apple', 'banana', 'cherry']
The s_tuple index values are: [0, 1, 2]


The s_dictionary series values are: ['Ford', 'Mustang', 1964]
The s_dictionary index values are: ['brand', 'model', 'year']
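
Instead of typing the values and index labels by hand, they can also be read from the series itself. The sketch below rebuilds ``s_dictionary`` so that it runs on its own; ``values`` and ``labels`` are names chosen only for this example.

```python
import pandas as pd

s_dictionary = pd.Series({"brand": "Ford", "model": "Mustang", "year": 1964})

# .tolist() converts the series values into a plain Python list
values = s_dictionary.tolist()

# .index holds the labels; .tolist() converts them as well
labels = s_dictionary.index.tolist()

print("The s_dictionary series values are:", values)
print("The s_dictionary index values are:", labels)
```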


Index values in a series are generated by default. We can also define custom index labels when creating a series.

Below is a series of marks (grades), with the associated subjects defined as custom row indexes.

In [85]:
# create the s2 series from the list [80,93,78,85,97] and with labeled indexes ['English','Science','Social','Tamil','Maths']
grades = [80,93,78,85,97]
subjects = ['English','Science','Social','Tamil','Maths']

# pass the labels explicitly through the index parameter
s2 = pd.Series(grades, index=subjects)

print(s2)

English    80
Science    93
Social     78
Tamil      85
Maths      97
dtype: int64


In [86]:
# print the s2 series
print(s2)

English    80
Science    93
Social     78
Tamil      85
Maths      97
dtype: int64


We can provide a name for the whole series, as well as a name for the indexes.

In [87]:
# assign the s2 series name to be "Student Marks"
s2.name = "Student Marks"
print(s2)

# assign the s2 index name to be "Subject"
s2.index.name = "Subject"
print(s2)

English    80
Science    93
Social     78
Tamil      85
Maths      97
Name: Student Marks, dtype: int64
Subject
English    80
Science    93
Social     78
Tamil      85
Maths      97
Name: Student Marks, dtype: int64


In [88]:
# print the s2 series again to see the newly assigned name and index name
print("Series index name 'Subject' and name 'Student Marks'")
print(s2)

Series index name 'Subject' and name 'Student Marks'
Subject
English    80
Science    93
Social     78
Tamil      85
Maths      97
Name: Student Marks, dtype: int64
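
As an alternative to assigning ``.name`` and ``.index.name`` after the fact, both names can be set without modifying an existing series in place. This is only a sketch of one option; ``s2_named`` is a name invented for the example.

```python
import pandas as pd

grades = [80, 93, 78, 85, 97]
subjects = ["English", "Science", "Social", "Tamil", "Maths"]

# the name parameter sets the series name at construction time
s2 = pd.Series(grades, index=subjects, name="Student Marks")

# rename_axis returns a new series whose index is named "Subject";
# the original s2 is left unchanged
s2_named = s2.rename_axis("Subject")

print(s2_named.name)
print(s2_named.index.name)
```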


## Subsetting the Series

Values inside the series can be accessed using the ``[ ]`` square bracket operator.

- A series can be sliced by integer position; the end position is excluded.
- A series can also be sliced by string index labels; the end label is included.

In [89]:
# return the first element in the s2 series
# .iloc selects by position; plain s2[0] on a string-labeled series is deprecated
first_element = s2.iloc[0]
print(first_element)

80


In [90]:
# slicing using default integer index
slice_integer_index = s2[0:3]

print(slice_integer_index)

# return the elements from position 1 through position 3 (the end position, 4, is excluded)
element_index = s2[1:4]

print(element_index)

Subject
English    80
Science    93
Social     78
Name: Student Marks, dtype: int64
Subject
Science    93
Social     78
Tamil      85
Name: Student Marks, dtype: int64


In [91]:
# slicing using string index (the end label 'Tamil' is included)
slice_string_index = s2['Science':'Tamil']
print(slice_string_index)

# return the element associated with the index 'Tamil'
element_associated_index = s2['Tamil']

print(element_associated_index)

Subject
Science    93
Social     78
Tamil      85
Name: Student Marks, dtype: int64
85
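
The two slicing styles above can also be written with explicit indexers, which makes it clear whether positions or labels are meant. A minimal sketch, with ``by_position`` and ``by_label`` as made-up names:

```python
import pandas as pd

s2 = pd.Series([80, 93, 78, 85, 97],
               index=["English", "Science", "Social", "Tamil", "Maths"])

# .iloc selects by integer position; the end position (3) is excluded
by_position = s2.iloc[1:3]

# .loc selects by label; the end label ('Tamil') is included
by_label = s2.loc["Science":"Tamil"]

print(by_position.tolist())  # [93, 78]
print(by_label.tolist())     # [93, 78, 85]
```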


## Creating a Series from a Dictionary

- A Python dictionary is a data structure that maps keys to values.
- A dictionary contains key-value pairs.
- A series is similar to a dictionary in that it maps index labels to values.

Let's see how to create a series from a dictionary.

I have a dictionary that represents fruits and their prices. Let's create a series from this dictionary.

In [92]:
# create a dictionary of fruit names and their prices. Use the values from the article
fruit_prices ={'Apple':0.50, 'Banana': 0.30, 'Cherry': 2.50, 'Watermelon':6.75, 'Grapes': 3.25}
fruit_series = pd.Series(fruit_prices)
fruit_series.name = "Fruit Prices"
print(fruit_series)

Apple         0.50
Banana        0.30
Cherry        2.50
Watermelon    6.75
Grapes        3.25
Name: Fruit Prices, dtype: float64


In [93]:
# Let's convert this dictionary into a series
# use the pandas .Series() constructor
fruit_series = pd.Series(fruit_prices)
fruit_series.name = "Fruit Prices"
print(fruit_series)

Apple         0.50
Banana        0.30
Cherry        2.50
Watermelon    6.75
Grapes        3.25
Name: Fruit Prices, dtype: float64


In [94]:
# print the series
print(fruit_series)

Apple         0.50
Banana        0.30
Cherry        2.50
Watermelon    6.75
Grapes        3.25
Name: Fruit Prices, dtype: float64


In [95]:
# Slice the series and retrieve price of grapes
grapes_price = fruit_series['Grapes']
print(f"price of Grapes: ${grapes_price}")

price of Grapes: $3.25
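
The dictionary analogy goes a little further: a series supports several dictionary-style operations on its index labels. The sketch below assumes the same fruit prices as above; the lookup of "Mango" is a hypothetical missing label used only to show the fallback.

```python
import pandas as pd

fruit_prices = {'Apple': 0.50, 'Banana': 0.30, 'Cherry': 2.50,
                'Watermelon': 6.75, 'Grapes': 3.25}
fruit_series = pd.Series(fruit_prices)

# the 'in' operator checks the index labels, like dictionary keys
has_cherry = 'Cherry' in fruit_series

# .get returns a default instead of raising an error for a missing label
mango_price = fruit_series.get('Mango', default=None)

# .to_dict converts the series back into a plain dictionary
round_trip = fruit_series.to_dict()

print(has_cherry, mango_price)
print(round_trip)
```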


# DataFrame

## Create a DataFrame from a Series object

Let's create a DataFrame from the marks series we created earlier.

In [96]:
# create a dataframe from the s2 series
import pandas as pd
grades = [80,93,78,85,97]
subjects = ['English', 'Science', 'Social', 'Tamil', 'Maths']
s2 = pd.Series(grades, index=subjects)
s2.name = "Student"
df = pd.DataFrame(s2)
print(df)


         Student
English       80
Science       93
Social        78
Tamil         85
Maths         97


In [97]:
# print the dataframe
print(df)


         Student
English       80
Science       93
Social        78
Tamil         85
Maths         97
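
Another way to get the same one-column table is the series' own ``to_frame`` method. This is a sketch of an alternative, not a replacement for the cell above; the column name "Marks" is made up for the example.

```python
import pandas as pd

grades = [80, 93, 78, 85, 97]
subjects = ['English', 'Science', 'Social', 'Tamil', 'Maths']
s2 = pd.Series(grades, index=subjects, name="Student")

# to_frame turns the series into a one-column DataFrame;
# the series name becomes the column name
df = s2.to_frame()

# passing name= overrides the column name
df_renamed = s2.to_frame(name="Marks")

print(df.columns.tolist())
print(df_renamed.columns.tolist())
```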


## Create a DataFrame from a Dictionary

Let's say we have two series, holding the heights and weights of the same set of people, and we want to put them together in a table.

Here we can create a dictionary from both series and pass it to the Pandas ``pd.DataFrame()`` constructor.

In [98]:
# Create height series using [5.3, 6.2,5.8,5.0,5.5] for values, ['Person 1','Person 2','Person 3','Person 4','Person 5'] for indexes
import pandas as pd
values = [5.3, 6.2,5.8,5.0,5.5]
indexes = ['Person 1','Person 2','Person 3','Person 4','Person 5']
heights = pd.Series(values, index=indexes)
heights.name = "Height (ft)"

print(heights)


# Create weight series (in kgs) using [65,89,75,60,59] for values and the same indexes as height
weight_values = [65,89,75,60,59]
weights = pd.Series(weight_values, index=indexes)
weights.name = "weights (kg)"
weights.index.name = "Person"

print(weights)

Person 1    5.3
Person 2    6.2
Person 3    5.8
Person 4    5.0
Person 5    5.5
Name: Height (ft), dtype: float64
Person
Person 1    65
Person 2    89
Person 3    75
Person 4    60
Person 5    59
Name: weights (kg), dtype: int64


In [100]:
# Create a dataframe from the height and weight series above
import pandas as pd
values = [5.3, 6.2,5.8,5.0,5.5]
indexes = ['Person 1','Person 2','Person 3','Person 4','Person 5']
heights = pd.Series(values, index=indexes)
heights.name = "Height (ft)"
weight_values = [65,89,75,60,59]
weights = pd.Series(weight_values, index=indexes)
weights.name = "Weights (kg)"
df = pd.DataFrame({'Height': heights, 'Weight': weights})

print(df)

# print the dataframe
print(df)

          Height  Weight
Person 1     5.3      65
Person 2     6.2      89
Person 3     5.8      75
Person 4     5.0      60
Person 5     5.5      59
          Height  Weight
Person 1     5.3      65
Person 2     6.2      89
Person 3     5.8      75
Person 4     5.0      60
Person 5     5.5      59
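
The rows line up because both series share the same index labels. When the labels only partly overlap, Pandas aligns on the labels and fills the gaps with ``NaN``. The data below is a shortened, hypothetical variant of the series above, chosen only to show this behaviour:

```python
import pandas as pd

# hypothetical data: 'Person 1' has no weight, 'Person 4' has no height
heights = pd.Series([5.3, 6.2, 5.8], index=['Person 1', 'Person 2', 'Person 3'])
weights = pd.Series([89, 75, 60], index=['Person 2', 'Person 3', 'Person 4'])

# rows are matched by index label, not by position
df = pd.DataFrame({'Height': heights, 'Weight': weights})

print(df)
print(df.isna().sum())  # number of missing values per column
```

Each column ends up with one missing value, and a label that appears in either series gets its own row.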
