# Pandas Demos

This notebook contains live code for the Pandas examples from the [SN1 textbook](https://dawsoncollege.gitlab.io/thinkcs-sn1/thinkcspy.html).

Press "Run All" above, or run each code block separately in order.

## Importing Pandas

In [None]:
# Import pandas under its conventional alias, pd

import pandas as pd

## DataFrames

### Creating DataFrames

In [None]:
# Creating a DataFrame from a dictionary of lists that will become columns

cities = {
    "City": ["Montreal", "Toronto", "Vancouver"],
    "Province": ["QC", "ON", "BC"],
    "Population": [1762949, 2794356, 662248]
}

city_df = pd.DataFrame(cities)

city_df  # a notebook displays the value of a cell's last expression, so no print() is needed

In [None]:
# Creating a DataFrame from a dictionary whose keys and values each become a column

score = {
    "Alice": 21,
    "Bob": 12,
    "Charlie": 15
}

score_df = pd.DataFrame(score.items())  # the two columns are labelled 0 and 1 by default

score_df  # last expression in the cell, so the notebook displays it without print()
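
The default column labels `0` and `1` are not very descriptive. A minimal sketch of one way to name them, using the `columns` parameter of `pd.DataFrame`. The labels `"Name"` and `"Score"` are illustrative choices, not names taken from the textbook.

```python
import pandas as pd

score = {"Alice": 21, "Bob": 12, "Charlie": 15}

# "Name" and "Score" are hypothetical labels chosen for this sketch
named_score_df = pd.DataFrame(list(score.items()), columns=["Name", "Score"])

print(named_score_df)
```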

In [None]:
# Creating a DataFrame from a CSV

iris_df = pd.read_csv('iris.csv')

iris_df  # last expression in the cell, so the notebook displays it without print()
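
If `iris.csv` is not in the notebook's folder, `pd.read_csv` raises a `FileNotFoundError`. The sketch below reads the same kind of data from an in-memory string instead, which may help when experimenting. The column layout is an assumption based on the column names used later in this notebook (`sepal_width` is a guess), and the three rows are illustrative only.

```python
import io
import pandas as pd

# Illustrative rows in the assumed layout of iris.csv (not read from the real file)
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# read_csv accepts any file-like object, so StringIO stands in for a file on disk
demo_df = pd.read_csv(io.StringIO(csv_text))

print(demo_df.shape)
print(demo_df.dtypes)
```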

### `.head`, `.tail`, and `.sample`

In [None]:
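# First 5 rows (the default)
iris_df.head()

In [None]:
# Last 5 rows (the default)
iris_df.tail()

In [None]:
# One randomly chosen row
iris_df.sample()

In [None]:
# Four randomly chosen rows
iris_df.sample(4)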
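
`.head` and `.tail` also accept a row count, and `.sample` accepts a `random_state` seed so that the "random" rows are the same on every run. A small sketch on a made-up DataFrame (`numbers_df` is a hypothetical name used only here):

```python
import pandas as pd

numbers_df = pd.DataFrame({"n": range(10)})

first_three = numbers_df.head(3)  # rows 0, 1, 2
last_two = numbers_df.tail(2)     # rows 8, 9

# The same seed gives the same selection, which makes results reproducible
pick_a = numbers_df.sample(4, random_state=1)
pick_b = numbers_df.sample(4, random_state=1)

print(pick_a.equals(pick_b))
```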

### Cleaning Data

In [None]:
# Drop every row that has at least one missing value (returns a new DataFrame)
new_iris_df = iris_df.dropna()
# or iris_df.dropna(inplace=True) to modify the original instead

In [None]:
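# Replace missing values in the petal_width column with 0.8 (returns a new DataFrame)
new_iris_df = iris_df.fillna({"petal_width": 0.8})
# or iris_df.fillna({"petal_width": 0.8}, inplace=True) to modify the original instead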
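
To see what these two methods do, it can help to try them on a tiny DataFrame where the missing values are easy to spot. The data below is made up for illustration, and `petal_df` is a hypothetical name.

```python
import pandas as pd

# None becomes NaN (a missing value) in a numeric column
petal_df = pd.DataFrame({
    "petal_length": [1.4, 4.7, None, 6.0],
    "petal_width": [0.2, None, 1.5, 2.5],
})

print(petal_df.isna().sum())  # number of missing values per column

dropped_df = petal_df.dropna()                     # keeps only the complete rows
filled_df = petal_df.fillna({"petal_width": 0.8})  # fills petal_width only

print(dropped_df)
print(filled_df)
```

Note that `filled_df` still has a missing `petal_length`, because the dictionary passed to `fillna` only names `petal_width`.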

### Selecting and Filtering

In [None]:
# Selecting a single column (returns a Series)

iris_df["petal_length"]

In [None]:
# Selecting multiple columns (returns a DataFrame)

iris_df[["petal_length", "petal_width"]]
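
The difference between single and double brackets can be checked with `type`. A brief sketch on made-up data (`flowers_df` is a hypothetical name):

```python
import pandas as pd

flowers_df = pd.DataFrame({
    "petal_length": [1.4, 4.7],
    "petal_width": [0.2, 1.4],
})

one_column = flowers_df["petal_length"]       # single brackets: a Series
one_column_df = flowers_df[["petal_length"]]  # double brackets: a DataFrame with one column

print(type(one_column).__name__)
print(type(one_column_df).__name__)
```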

In [None]:
# Filtering (filter condition on separate line)

filter_condition = iris_df["sepal_length"] > 5.0
long_sepal_df = iris_df[filter_condition]

long_sepal_df

In [None]:
# Filtering (filter condition on same line)

setosa_df = iris_df[iris_df["species"] == "setosa"]

setosa_df
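
A filter condition is itself a Series of `True`/`False` values, one per row, and indexing with it keeps the rows marked `True`. A minimal sketch on made-up data (`sepal_df` and `mask` are hypothetical names):

```python
import pandas as pd

sepal_df = pd.DataFrame({"sepal_length": [4.9, 5.0, 5.8, 6.3]})

mask = sepal_df["sepal_length"] > 5.0  # boolean Series, one value per row
long_df = sepal_df[mask]               # keeps only the rows where mask is True

print(mask)
print(long_df)
```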

In [None]:
# Slicing rows by position: rows 3 through 9 (the end position, 10, is excluded)

iris_df[3:10]
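
Slicing a DataFrame with plain integers selects rows by position, just like slicing a list. The sketch below compares it with `.iloc`, which makes the "by position" part explicit. The data is made up, and `letters_df` is a hypothetical name.

```python
import pandas as pd

letters_df = pd.DataFrame({"letter": list("abcdefgh")})

sliced = letters_df[3:6]          # rows at positions 3, 4, 5
same_rows = letters_df.iloc[3:6]  # the same rows, selected explicitly by position

print(sliced)
print(sliced.equals(same_rows))
```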

## Analyzing Data

### Overview

In [None]:
# Prints each column's name, non-null count, and data type
iris_df.info()

In [None]:
iris_df.describe() # returns a new DataFrame of summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
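
Because `.describe` returns a DataFrame, individual statistics can be looked up by row label and column name with `.loc`. A short sketch on made-up data (`widths_df` and `summary_df` are hypothetical names):

```python
import pandas as pd

widths_df = pd.DataFrame({"petal_width": [0.5, 1.0, 1.5, 2.0]})

summary_df = widths_df.describe()

# Rows are labelled by statistic, columns by the original column names
mean_width = float(summary_df.loc["mean", "petal_width"])
row_count = int(summary_df.loc["count", "petal_width"])

print(mean_width, row_count)
```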

### Statistical Calculations

In [None]:
# float() converts the NumPy float that pandas returns into a plain Python float
float(iris_df["petal_width"].max())

In [None]:
float(iris_df["petal_width"].min())

In [None]:
float(iris_df["petal_width"].mean())

In [None]:
float(iris_df["petal_width"].median())

In [None]:
iris_modes = iris_df["petal_width"].mode()
# Note that this returns a Series rather than a single number, as there could be more than one mode

for mode in iris_modes:
    print(mode)
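
A case with two modes shows why `.mode` returns a Series. The values below are made up so that `0.2` and `0.4` each appear twice.

```python
import pandas as pd

widths = pd.Series([0.2, 0.2, 0.4, 0.4, 1.0])

modes = widths.mode()  # a Series holding every value tied for most frequent

print(type(modes).__name__)
print(modes.tolist())
```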

In [None]:
# Count how many rows belong to each species, most frequent first
iris_df["species"].value_counts()
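
`.value_counts` can also report proportions instead of raw counts by passing `normalize=True`. A small sketch on a made-up Series:

```python
import pandas as pd

species = pd.Series(["setosa", "setosa", "virginica", "setosa"])

counts = species.value_counts()                # number of rows per value
shares = species.value_counts(normalize=True)  # fraction of rows per value

print(counts)
print(shares)
```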