# Using Pandas Package to Analyze Dataframes

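This notebook takes you through the process of reading in text data (comma-separated values, also known as .csv) using the pandas module. By the end of this exercise you should be able to read a CSV file into a dataframe, index it by date, select a season, compute seasonal totals for each year, and plot the result.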

In [1]:
import pandas as pd  ## Main package used for dataframes in Python

# help(pd)       ## Will provide more details about the package

In [45]:
## Read in some data

# help(pd.read_csv)    ## Query the function for reading csv files

df = pd.read_csv('../data/pr_AFR-44_CCCma-CanESM2_historical_SMHI-RCA4__mon_1951-2005_Emali.csv', sep=',')#;df

In [46]:
## Convert the Dates column to Pandas Datetime object and make that column an index

df['Dates'] = pd.to_datetime(df['Dates'])      ## Convert to a Datetime object
df = df.set_index('Dates')#;df                  ## Setting the 'Dates' column as an index
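The read, convert, and set-index steps above can also be folded into the `read_csv` call itself. The sketch below is a minimal illustration on an in-memory stand-in for the file; the column name `pr` and the values are assumptions made up for the example, not the contents of the Emali file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file: a 'Dates' column plus one value column.
# The column name 'pr' and the numbers are invented for illustration.
csv_text = """Dates,pr
1951-01-16,12.5
1951-02-15,30.1
1951-03-16,88.4
"""

# parse_dates converts 'Dates' to datetimes; index_col makes it the index.
demo = pd.read_csv(io.StringIO(csv_text), parse_dates=["Dates"], index_col="Dates")

# With a DatetimeIndex, attributes such as .month and .year become available.
print(type(demo.index).__name__)
print(demo.index.month.tolist())
```

Either route should give a dataframe indexed by date; the two-step version in the cell above makes each step easier to inspect.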

In [43]:
## Select the months in March to May season using a condition

df_mam = df.loc[(df.index.month >= 3) & (df.index.month <= 5)]#;df_mam
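The same kind of month condition can be written with `isin`, which may be convenient when several seasons are needed. This is a sketch on synthetic monthly data; the `seasons` dictionary, the column name `pr`, and the values are assumptions for illustration only.

```python
import pandas as pd

# Synthetic monthly data covering two years (values are made up).
idx = pd.date_range("2000-01-01", periods=24, freq="MS")
demo = pd.DataFrame({"pr": range(1, 25)}, index=idx)

# Month numbers for each season; names follow the abbreviations used in this notebook.
seasons = {"mam": [3, 4, 5], "jjas": [6, 7, 8, 9], "ond": [10, 11, 12]}

# One boolean mask per season, built from the month of each index entry.
subsets = {name: demo[demo.index.month.isin(months)]
           for name, months in seasons.items()}

for name, subset in subsets.items():
    print(name, len(subset))
```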

In [51]:
## Group by year and get the annual MAM rainfall totals. Use mean for temperature

ann_mam = df_mam.groupby(df_mam.index.year).sum()#;ann_mam

# Rename the single data column to 'mam'

ann_mam.columns = ['mam']#;ann_mam
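To see what the group-by step does, and how `sum` differs from `mean`, here is a small sketch on synthetic data. The column names `value` and `mam_total` and the numbers are assumptions chosen so the result is easy to check by hand.

```python
import pandas as pd

# Synthetic monthly data: 10.0 for every month of 2000, 20.0 for every month of 2001.
idx = pd.date_range("2000-01-01", periods=24, freq="MS")
demo = pd.DataFrame({"value": [10.0] * 12 + [20.0] * 12}, index=idx)

# Keep March, April and May only.
mam = demo[demo.index.month.isin([3, 4, 5])]

# Group the three seasonal months of each year together.
totals = mam.groupby(mam.index.year).sum()    # suited to rainfall totals
means = mam.groupby(mam.index.year).mean()    # suited to temperature

# Give the totals column a descriptive name.
totals.columns = ["mam_total"]

print(totals)
print(means)
```

Each year contributes three months, so the totals are three times the monthly value while the means equal it.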

## Plotting the inter-annual variability in MAM season

In [53]:
import matplotlib.pyplot as plt

In [None]:
# Create figure and plot space
fig, ax = plt.subplots(figsize=(10, 10))

# Plot the years on the x-axis and the MAM totals on the y-axis
ax.plot(ann_mam.index.values,
       ann_mam['mam'],
       color='purple')

# Set title and labels for axes
ax.set(xlabel="Years",
       ylabel="Precipitation (mm)",
       title="Total Rainfall for MAM Season")

# Rotate tick marks on x-axis
plt.setp(ax.get_xticklabels(), rotation=45)

plt.show()
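A figure like the one above can be written to disk with `fig.savefig`. The following is a minimal sketch with made-up numbers; the file name `mam_totals_demo.png` is hypothetical, and the `plots` folder is created if it does not exist yet.

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Output folder for the figure; created on first use.
out_dir = Path("plots")
out_dir.mkdir(parents=True, exist_ok=True)

# Made-up seasonal totals, for illustration only.
years = [2001, 2002, 2003, 2004]
totals = [310.0, 275.5, 402.1, 350.8]

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(years, totals, color="purple", linestyle="-", label="MAM")
ax.set(xlabel="Years",
       ylabel="Precipitation (mm)",
       title="Total Rainfall for MAM Season (demo data)")
ax.legend()

# Hypothetical file name; the .png extension selects the output format.
out_file = out_dir / "mam_totals_demo.png"
fig.savefig(out_file, dpi=150, bbox_inches="tight")
plt.close(fig)
```

In a script, saving before any call to `plt.show()` is the safer order, since the figure may no longer be available once the window is closed.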

## Exercise 

- Repeat the process above and add the JJAS and OND seasons to the same plot using a different color and line type.
- Repeat the process for Emali station but add the different climate models (i.e., four climate models) to the same plot (remember that you cannot use the same variable name to hold different values, so use df1, df2, and so on for multiple datasets/models).
- Calculate the anomalies (i.e., the deviation of each MAM/OND/JJAS value from the 1981-2010 long-term mean) and plot them.
- Repeat the process but plot the annual cycle (i.e., the long-term mean rainfall amount for each month versus the months).
- Load the temperature data and repeat the above.
- Save the plots in png format in the ./plots folder.
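As a hint for the anomaly and annual-cycle items, the sketch below shows one possible way to compute both on synthetic data. The column name `pr`, the random values, and the date range are assumptions for the example; with real data, the baseline years should be adjusted to the period the loaded file actually covers.

```python
import numpy as np
import pandas as pd

# Synthetic monthly rainfall for 1981-2010 (random values, for illustration only).
idx = pd.date_range("1981-01-01", "2010-12-01", freq="MS")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"pr": rng.gamma(2.0, 30.0, size=len(idx))}, index=idx)

# Seasonal (MAM) total for each year.
mam = demo[demo.index.month.isin([3, 4, 5])]
ann_mam = mam.groupby(mam.index.year).sum()

# Anomaly = each year's total minus the mean over the baseline years.
baseline = ann_mam.loc[1981:2010, "pr"].mean()
anomalies = ann_mam["pr"] - baseline

# Annual cycle = long-term mean for each calendar month (index 1 to 12).
annual_cycle = demo.groupby(demo.index.month).mean()

print(anomalies.head())
print(annual_cycle)
```

Because the baseline here spans the whole synthetic record, the anomalies average to zero; that will not generally hold when the baseline is only part of the record.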

Use the cells below for the exercises.