# Tutorial: Pandas

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/insop/ML_crash_course/blob/main/tutorial_pandas.ipynb)

[Pandas](https://pandas.pydata.org/) is a Python library for handling structured (tabular) data, such as the contents of a `csv` file. This notebook reviews some of the basics of Pandas usage.

We will use day-by-day COVID-19 case data from Italy for this tutorial.

In [None]:
# download the CSV file and save it locally
from urllib.request import urlretrieve

italy_covid_url = 'https://gist.githubusercontent.com/aakashns/f6a004fa20c84fec53262f9a8bfee775/raw/f309558b1cf5103424cef58e2ecb8704dcd4d74c/italy-covid-daywise.csv'

urlretrieve(italy_covid_url, 'italy-covid-daywise.csv')

In [None]:
import pandas as pd

# load the CSV file into a pandas DataFrame
covid_df = pd.read_csv('italy-covid-daywise.csv')
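
If the download is not available, the loading step can be tried on a small in-memory sample instead. The sketch below is only an illustration: the values are made up, and the `new_tests` column is a hypothetical addition used to show how empty fields are read as missing values.

```python
import io

import pandas as pd

# Hypothetical sample data (made-up values, not the real dataset).
# 'new_tests' is an assumed column name, left partly empty on purpose.
csv_text = """date,new_cases,new_tests
2020-03-01,240,
2020-03-02,561,
2020-03-03,347,5000
"""

# read_csv accepts a file path or any file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)          # (number of rows, number of columns)
print(sample_df.dtypes)         # column types inferred by pandas
print(sample_df.isna().sum())   # count of missing values per column
```

Empty fields in the CSV become `NaN`, which is why a column with gaps is read as floating point rather than integer.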

In [None]:
type(covid_df)

In [None]:
# check the loaded data

covid_df

In [None]:
covid_df.head()

The `.describe()` method shows a statistical summary of each numeric column: count, mean, standard deviation, min, max, and quartiles.

In [None]:
# show the statistical summary (count, mean, std, min, quartiles, max)
covid_df.describe()
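
To see what `.describe()` computes, it can help to run it on a column small enough to check by hand. This is a minimal sketch on made-up numbers, not on the Italy dataset.

```python
import pandas as pd

# Hypothetical values chosen so the statistics are easy to verify by hand.
sample_df = pd.DataFrame({"new_cases": [10, 20, 30, 40]})

# describe() returns a DataFrame indexed by statistic name
summary = sample_df.describe()
print(summary)

# Individual statistics can be read from the summary or computed directly
print(summary.loc["mean", "new_cases"])
print(sample_df["new_cases"].median())
```

The result of `.describe()` is itself a DataFrame, so a single statistic can be picked out with `.loc`.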

In [None]:
# show the row with 'new_cases' value of -148
covid_df[covid_df['new_cases'] == -148]

In [None]:
# select a single column (returns a pandas Series)
covid_df["new_cases"]  # or covid_df.new_cases

In [None]:
# select a single row by its index label (returns a pandas Series)
covid_df.loc[172]
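
`.loc` looks rows up by index label, not by position. With the default index the two coincide, so the difference is easy to miss. The sketch below uses a made-up frame with a non-default index to contrast `.loc` with the position-based `.iloc`.

```python
import pandas as pd

# Hypothetical data with index labels 10, 11, 12 instead of 0, 1, 2
sample_df = pd.DataFrame(
    {
        "date": ["2020-03-01", "2020-03-02", "2020-03-03"],
        "new_cases": [240, 561, 347],
    },
    index=[10, 11, 12],
)

by_label = sample_df.loc[11]      # row whose index label is 11
by_position = sample_df.iloc[0]   # first row, whatever its label is

print(by_label["new_cases"])
print(by_position["new_cases"])

# A single cell: row label first, then column name
print(sample_df.loc[11, "new_cases"])
```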

In [None]:
# create a new DataFrame containing only the selected columns

cases_df = covid_df[['date', 'new_cases']]

In [None]:
cases_df

In [None]:
# filter rows with a boolean mask

high_new_cases = covid_df.new_cases > 1000
covid_df[high_new_cases]  # or covid_df[covid_df.new_cases > 1000]

In [None]:
case_gt_1000_df = covid_df[high_new_cases]
case_gt_1000_df.describe()
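
Boolean masks can also be combined. As a rough sketch on made-up values: conditions are joined with `&` (and) or `|` (or), and each condition needs its own parentheses. `DataFrame.query` is an alternative way to write the same kind of filter as a string.

```python
import pandas as pd

# Hypothetical values, not taken from the real dataset
sample_df = pd.DataFrame(
    {
        "date": ["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04"],
        "new_cases": [240, 1561, -148, 2347],
    }
)

# Combine two conditions; the parentheses around each one are required
mask = (sample_df["new_cases"] > 1000) & (sample_df["new_cases"] < 2000)
filtered = sample_df[mask]
print(filtered)

# Rows with a negative count, which usually indicate a data correction
negative = sample_df[sample_df["new_cases"] < 0]
print(negative)

# The same kind of filter written with query()
above_1000 = sample_df.query("new_cases > 1000")
print(above_1000)
```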

## Plotting example

In [None]:
import matplotlib.pyplot as plt

# plot the data
plt.plot(cases_df.date, cases_df.new_cases)

In [None]:
# draw a scatter plot of the same data
plt.scatter(cases_df.date, cases_df.new_cases)
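
When read from a CSV file, the `date` column typically holds plain strings, so the x-axis treats each date as a separate text label. One possible refinement, sketched here on made-up values, is to convert the column with `pd.to_datetime` before plotting.

```python
import pandas as pd

# Hypothetical values, not taken from the real dataset
sample_df = pd.DataFrame(
    {
        "date": ["2020-03-01", "2020-03-02", "2020-03-03"],
        "new_cases": [240, 561, 347],
    }
)

# Convert the text column to real datetime values
sample_df["date"] = pd.to_datetime(sample_df["date"])

print(sample_df["date"].dtype)

# Datetime columns support date arithmetic and the .dt accessor
span = sample_df["date"].max() - sample_df["date"].min()
print(span)
print(sample_df["date"].dt.day.tolist())
```

The converted column can be passed to `plt.plot` or `plt.scatter` in the same way as above, and matplotlib will then space the points along a proper time axis.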


## Credits

This notebook is a condensed version of the excellent [Pandas tutorial from Jovian](https://jovian.ai/learn/data-analysis-with-python-zero-to-pandas/lesson/lesson-4-analyzing-tabular-data-with-pandas).

## References

- [Pandas tutorial from Jovian](https://jovian.ai/learn/data-analysis-with-python-zero-to-pandas/lesson/lesson-4-analyzing-tabular-data-with-pandas)