# Graphing/plotting data
***
This notebook shows how to use pandas to load and plot data, using two different types of MeteoSwiss data as examples.

During the Data Visualisation session we discussed what good graphs can look like. Here we put that into practice in Python with the following packages:

- `pandas` - for loading the data into a DataFrame.
- `matplotlib` - the basic framework for plotting with Python.
- `seaborn` - mainly a statistical plotting package, but also sets some nicer aesthetic defaults.
- `datetime` - for setting the limits of the x axis.

More help on plotting can be found in the matplotlib cheatsheets at https://matplotlib.org/cheatsheets/. You can also ask the teachers for help.


In [None]:
# Load the necessary packages.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt
import os

In [None]:
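# Make a function to download a file from a URL, unless we already have it locally.
def dl(fn, url):
    if os.path.exists(fn):
        print('File already downloaded.')
    else:
        print('Downloading...')
        import requests
        response = requests.get(url)
        response.raise_for_status()  # Stop with an error if the download failed
        with open(fn, "wb") as f:    # 'with' closes the file once it is written
            f.write(response.content)
        print('Done.')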

## Monthly data downloaded from the MeteoSwiss website

In [None]:
# First download the data.
file_monthly = "climate-reports-tables-homogenized_PAY_monthly.txt"
dl(file_monthly, "https://www.dropbox.com/scl/fi/p950wesj4qgilbmmlv72h/climate-reports-tables-homogenized_PAY_monthly.txt?rlkey=vh053tzybs3095yh3d9f9uajo&dl=1")

In [None]:
# Load the data.
data = pd.read_table(
    file_monthly, 
    skiprows=27, # Skip the first 27 rows
    sep=r'\s+', # The data are separated by at least one whitespace character (\s)
    )
data.index = pd.to_datetime(data[['Year', 'Month']].assign(DAY=1))
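
The last line of the cell above builds a date index from the `Year` and `Month` columns. As a minimal sketch of what it does, here is the same step on a tiny made-up table (the `demo` DataFrame and its values are hypothetical, not MeteoSwiss data):

```python
import pandas as pd

# Hypothetical miniature table with the same Year/Month layout as the monthly file.
demo = pd.DataFrame({
    'Year': [2020, 2020, 2021],
    'Month': [11, 12, 1],
    'Temperature': [5.1, 2.3, 0.4],
})

# pd.to_datetime can assemble dates from year/month/day columns. The table has
# no day column, so we add one that places every value on the 1st of its month.
demo.index = pd.to_datetime(demo[['Year', 'Month']].assign(DAY=1))
print(demo.index)
```

Putting every monthly value on the first day of its month is only a convention; it gives each row a proper date so that plotting and resampling work.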

In [None]:
# Look at the data.
data

Here's an example of how to plot these monthly data.

In [None]:
# Using seaborn, we say that we want a figure to look suitable for use in a paper.
sns.set_context('paper')

# Set up the figure, and the axis within the figure.
fig, ax = plt.subplots(figsize=(6,3.5))
# ax = plt.subplot(111) # We want just a single axis.

# The line commented out below is the simple way of plotting, but pandas can
# convert the dates to its own units, which makes it awkward to set the
# x limits with dates afterwards.
#data.Temperature.plot(ax=ax)

# Plot the data
ax.plot(data.index, data.Temperature)

# x axis settings
ax.set_xlabel('Date')
ax.set_xlim(dt.date(1965, 1, 1), dt.date(2022, 1, 1))

# y axis settings
ax.set_ylabel('Air Temperature (degrees C)')

# I've disabled the gridlines in the background completely
ax.grid(visible=False)

# This command gets rid of more borders and ticks.
sns.despine()


# We can save this figure as a PNG file which you can then use in software like Word and PowerPoint.
plt.savefig('PAY_monthly.png', dpi=300)
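
The x axis settings above can be taken a step further. As a rough sketch, assuming a synthetic monthly series (the `dates` and `temps` values are invented for illustration), this shows how to zoom in on a period with `set_xlim` and control where the year ticks appear using `matplotlib.dates`:

```python
import datetime as dt
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly series (made-up values, for illustration only).
dates = pd.date_range('1965-01-01', '2021-12-01', freq='MS')
month = dates.month.to_numpy()
temps = 9 + 8 * np.sin(2 * np.pi * (month - 4) / 12)

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.plot(dates, temps)

# Zoom in on one decade by passing dates to set_xlim.
ax.set_xlim(dt.date(2000, 1, 1), dt.date(2010, 1, 1))

# Put a major tick every 2 years and label each tick with the 4-digit year.
ax.xaxis.set_major_locator(mdates.YearLocator(2))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

ax.set_xlabel('Date')
ax.set_ylabel('Air Temperature (degrees C)')
```

The tick spacing of 2 years is an arbitrary choice here; pick whatever keeps the labels readable for your time span.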

## Files from the MeteoSwiss computer

The files that you create on the computer with the MeteoSwiss program have a different format to the monthly website files. Below is an example that works with daily temperature and precipitation data downloaded from the Delemont station in the Jura for 2022.

In [None]:
file_daily = 'meteoswiss_delemont_2022_daily.dat'
dl(file_daily, 'https://www.dropbox.com/scl/fi/217a9t2j4vfz0frivd0go/meteoswiss_delemont_2022_daily.dat?rlkey=50t2krkwr94ftqbgc5wnkk2hg&dl=1')

In [None]:
met = pd.read_fwf(
    file_daily,                                # Put your filename here
    skiprows=8,                                # Skip the first 8 rows
    sep=r'\s+',                                # The data are separated by at least one whitespace character (\s)
    encoding='windows-1252',                   # Text encoding of the file. Don't worry about what this means! It is not usually needed.
    parse_dates={'date':['JAHR', 'MO', 'TG']}, # Convert Year, Month, Day into Python dates
    index_col='date',                          # Set pandas DataFrame 'coordinate' (index) to the date
    decimal=','                                # These MeteoSwiss files use the comma as the decimal mark
    )

# Let's get rid of columns that we don't want.
# STA = Station identifier
# HH = Hour
# MM = Minute
met = met.drop(labels=['STA','HH','MM'], axis='columns')

# Next, you can rename the columns to something helpful.
# The example file I'm using has two columns named 211 and 236, which are temperature and precipitation.
# Modify this according to your needs.
met = met.rename({'211':'temperature', '236':'precipitation'}, axis='columns')

# Finally, you can make a plot,
# here directly with the built-in .plot() method of pandas.
met['temperature'].plot()
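
To see what the `decimal=','` option in the cell above does, here is a small sketch on made-up text. It assumes a simplified, whitespace-separated sample and uses `pd.read_csv` rather than `pd.read_fwf`, so it is not the exact MeteoSwiss layout:

```python
import io
import pandas as pd

# Made-up sample that uses a comma as the decimal mark.
sample = io.StringIO(
    "JAHR MO TG 211 236\n"
    "2022 1 1 3,4 0,0\n"
    "2022 1 2 -1,2 5,6\n"
)

# With decimal=',' pandas reads "3,4" as the number 3.4 instead of as text.
parsed = pd.read_csv(sample, sep=r'\s+', decimal=',')
print(parsed.dtypes)
```

Without `decimal=','`, columns `211` and `236` would be loaded as text, and plotting or averaging them would not work as expected.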

In [None]:
met

We can see that we have two columns, named `temperature` and `precipitation`.
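
Both columns can be shown in one figure. The following is only a sketch: it uses synthetic daily values (`demo_met`, invented here) in place of `met`, and it assumes precipitation is in millimetres, which you should check against your own file.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily data standing in for `met` (made-up values).
rng = np.random.default_rng(0)
days = pd.date_range('2022-01-01', '2022-12-31', freq='D')
seasonal = 10 - 10 * np.cos(2 * np.pi * days.dayofyear.to_numpy() / 365)
demo_met = pd.DataFrame({
    'temperature': seasonal + rng.normal(0, 2, len(days)),
    'precipitation': rng.gamma(0.3, 8, len(days)),
}, index=days)

# Temperature as a line on the left-hand axis.
fig, ax_t = plt.subplots(figsize=(6, 3.5))
ax_t.plot(demo_met.index, demo_met['temperature'], color='tab:red')
ax_t.set_ylabel('Air Temperature (degrees C)')

# Precipitation as bars on a second y axis that shares the same x axis.
ax_p = ax_t.twinx()
ax_p.bar(demo_met.index, demo_met['precipitation'],
         width=1.0, color='tab:blue', alpha=0.5)
ax_p.set_ylabel('Precipitation (mm)')  # Unit is an assumption
```

Two y axes can be hard to read, so two stacked panels (`plt.subplots(nrows=2, sharex=True)`) are often a clearer alternative.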

## Bonus material

### Summary statistics

In [None]:
met.describe()

### Annual time series from the monthly data

This works in a very similar way to xarray. The only difference is that we don't need to provide the 'time' coordinate name, because pandas resamples along the DataFrame's date index.

In [None]:
annual_t = data.resample('1YS').mean()
annual_t.head()
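
To make the resampling step concrete, here is a minimal sketch on a made-up series (`demo_monthly` is hypothetical) where the result is easy to check by hand:

```python
import pandas as pd

# Two years of made-up monthly values: 1, 2, ..., 24.
demo_monthly = pd.Series(
    range(1, 25),
    index=pd.date_range('2020-01-01', periods=24, freq='MS'),
    dtype=float,
)

# 'YS' groups the values by calendar year and labels each group with
# the first day of that year; .mean() then averages within each group.
demo_annual = demo_monthly.resample('YS').mean()
print(demo_annual)
# 2020-01-01 -> 6.5 (mean of 1..12), 2021-01-01 -> 18.5 (mean of 13..24)
```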

In [None]:
fig, ax = plt.subplots()
ax.plot(annual_t.index, annual_t.Temperature)
sns.despine()

2022 is incomplete, so its annual mean covers only part of the year and comes out higher than it should. Let's remove it and re-plot.

In [None]:
annual_t_clean = annual_t.loc[:'2021']  # Keep everything up to and including 2021
fig, ax = plt.subplots()
ax.plot(annual_t_clean.index, annual_t_clean.Temperature)
ax.set_ylabel('Air Temperature (degrees C)')
sns.despine()
plt.savefig('PAY_annual.png', dpi=300)
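
Instead of hard-coding the last complete year, incomplete years can also be detected by counting the months in each year. This is a sketch on a made-up series (`partial_monthly` is hypothetical), not the MeteoSwiss data:

```python
import pandas as pd

# Made-up monthly series that stops in March of its last year.
partial_monthly = pd.Series(
    10.0,
    index=pd.date_range('2020-01-01', '2022-03-01', freq='MS'),
)

# Count how many monthly values each year contains.
months_per_year = partial_monthly.resample('YS').count()

# Keep only the annual means of years with all 12 months present.
annual_mean = partial_monthly.resample('YS').mean()
annual_complete = annual_mean[months_per_year == 12]
print(annual_complete)
```

This also catches years with gaps in the middle of the record, which slicing by date alone would miss.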

## Map plotting tips/tricks

The code in this section shows how to plot a raster (more on rasters later) **and change the label of the colorbar**.

In [None]:
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

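# Create a 50x60 raster with random values using numpy.
# (Named raster_values so that it does not overwrite the `data` DataFrame loaded earlier.)
raster_values = np.random.rand(50, 60)

# Create an xarray DataArray
raster = xr.DataArray(raster_values,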
                      dims=["y", "x"],
                      coords={"y": np.arange(50),
                              "x": np.arange(60)}
                     )

# Plot the raster using xarray's built-in plot function
raster.plot(cmap="viridis", 
            cbar_kwargs={'label':'my new label'}
           )
plt.title("50x60 Raster")
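
Another way to get a meaningful colorbar label is to store it with the data. As a sketch, assuming made-up values and a hypothetical variable (`temp`), xarray builds the default colorbar label from the `long_name` and `units` attributes when they are present:

```python
import numpy as np
import xarray as xr

# Made-up raster with descriptive attributes attached to the data itself.
temp = xr.DataArray(
    np.random.rand(50, 60),
    dims=["y", "x"],
    coords={"y": np.arange(50), "x": np.arange(60)},
    attrs={"long_name": "Air temperature", "units": "degC"},
)

# No cbar_kwargs needed: the label is taken from the attributes.
mesh = temp.plot(cmap="viridis")

# Read the label back from the colorbar to check what was used.
cbar_label = mesh.colorbar.ax.get_ylabel()
print(cbar_label)
```

Setting the attributes once means every later plot of `temp` is labelled consistently, while `cbar_kwargs` remains useful for one-off changes.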
