### Modules used in this notebook
`xarray`, `cfgrib`, `matplotlib`

# Climate: C.001 - Loading climate data and subsetting

Weather and climate data are large and can have many dimensions. For example, climate model data would generally have the dimensions [time, latitude, longitude]. For this reason file types like `.csv` and `.dat` are not suitable, and different formats are used. The most common of these are NetCDF (`.nc`) and GRIB (`.grib`).
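
To make the idea of a three-dimensional dataset concrete, here is a minimal sketch that builds a small synthetic array with the dimensions [time, latitude, longitude]. The grid, the name `demo_wind` and the random values are invented for illustration and are not taken from the ERA5 file used later.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical coarse grid and one day of hourly time steps (illustration only)
time = pd.date_range("2019-03-01", periods=24, freq="h")
lat = np.arange(60.0, 29.0, -10.0)   # 60, 50, 40, 30
lon = np.arange(-10.0, 31.0, 10.0)   # -10, 0, 10, 20, 30

# Random numbers stand in for a real physical variable
rng = np.random.default_rng(0)
values = rng.normal(size=(time.size, lat.size, lon.size))

da = xr.DataArray(
    values,
    dims=("time", "latitude", "longitude"),
    coords={"time": time, "latitude": lat, "longitude": lon},
    name="demo_wind",
)

print(da.dims)   # names of the three dimensions
print(da.shape)  # number of points along each dimension
print(da.size)   # total number of values stored
```

Even this toy array would need one table row per combination of time, latitude and longitude if it were flattened into a `.csv` file, which is why array-based formats are preferred for real datasets.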

To read these files you will need some particular Python libraries. There are multiple options (e.g. `xarray`, `cf-python`), but for this example `cfgrib` and `eccodes` are needed to read GRIB files.

> Q1: What is the GRIB format? https://en.wikipedia.org/wiki/GRIB

After reading this you should understand how this file type differs from the kind of data files you could load into software like Excel.

For the purposes of this example, the data that we are using has been downloaded from the Climate Data Store in advance: https://cds.climate.copernicus.eu/#!/home

We will be using data from the ERA5 reanalysis today. For information on what a reanalysis is in broad terms see this page: https://climate.copernicus.eu/climate-reanalysis.

Take this opportunity to look through the extra documentation we provided for more information on reanalysis and other types of weather and climate data: https://research.reading.ac.uk/met-energy/wp-content/uploads/sites/53/2021/09/energymet_education_videos_links_checked.pdf

Please explore the climate data store website in your free time. There are good examples of how to download the data to your machine of choice using the 'cdsapi'.


# Opening the file with xarray

xarray is a powerful open-source library designed to access and manipulate multi-dimensional data. With the cfgrib engine, developed by ECMWF, we can access GRIB data using the ecCodes library that was previously downloaded.

> Q2: What is the structure of an xarray dataset? https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataset

Run the code below to import the xarray library and open the dataset. The file naming convention here tells us some information (e.g. that the data is from ERA5 and probably from March 2019), but all this information can be checked once the data is opened.


In [None]:
import xarray as xr
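# engine='cfgrib' makes explicit that the GRIB file is read through cfgrib/ecCodes
d = xr.open_dataset('../data/era5-u100_v100_201903.grib', engine='cfgrib')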
d
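
Once the dataset is open, its contents can also be checked programmatically rather than by reading the printed summary. The sketch below uses a small synthetic stand-in for `d` (the grid, the values and the `source` attribute are invented) so that it runs without the GRIB file; the same attributes are available on the real dataset.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the ERA5 file: same variable and dimension names, invented values
time = pd.date_range("2019-03-01", periods=6, freq="h")
lat = np.array([60.0, 50.0, 40.0])
lon = np.array([0.0, 10.0])
shape = (time.size, lat.size, lon.size)
rng = np.random.default_rng(1)

demo = xr.Dataset(
    data_vars={
        "u100": (("time", "latitude", "longitude"), rng.normal(size=shape)),
        "v100": (("time", "latitude", "longitude"), rng.normal(size=shape)),
    },
    coords={"time": time, "latitude": lat, "longitude": lon},
    attrs={"source": "synthetic example"},
)

print(list(demo.data_vars))        # names of the variables held in the dataset
print(dict(demo.sizes))            # length of each dimension
print(demo.time.values[[0, -1]])   # first and last time stamps
print(demo.attrs)                  # global metadata
```

With the real file, replacing `demo` with `d` lets you confirm, for example, which period the time axis actually covers.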

## Basic plotting

Working with these large climate datasets can be much easier if you have some skills to visualise the data. Above we see the data has three dimensions: time, latitude and longitude. So we may want to:
- plot timeseries at given points.
- plot latitude-longitude slices of data at a given time point (or averaged over many time points).  
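
As a sketch of the second option, averaging over the time dimension collapses the data to a single latitude-longitude field that can be drawn as one map. The example uses synthetic data in place of `d['u100']` (grid and values are invented) and the non-interactive `Agg` backend so that it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for d['u100'] (invented grid and values)
time = pd.date_range("2019-03-01", periods=48, freq="h")
lat = np.arange(60.0, 29.0, -5.0)    # 60, 55, ..., 30
lon = np.arange(-10.0, 31.0, 5.0)    # -10, -5, ..., 30
rng = np.random.default_rng(2)
u100 = xr.DataArray(
    rng.normal(loc=5.0, scale=3.0, size=(time.size, lat.size, lon.size)),
    dims=("time", "latitude", "longitude"),
    coords={"time": time, "latitude": lat, "longitude": lon},
    name="u100",
)

# Average over all time steps, leaving a 2D latitude-longitude field
time_mean = u100.mean(dim="time")

# A 2D DataArray is drawn as a single map with a colourbar
time_mean.plot(x="longitude", y="latitude", cmap="viridis")
plt.title("Time mean of synthetic u100")
```

The same `.mean(dim="time")` call can be applied to a subset of the real data before plotting.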


> Q3: What are the basic plotting features provided by xarray? https://docs.xarray.dev/en/stable/user-guide/plotting.html

Take a few minutes to explore this and the examples within. There are many types of visualisations that are possible.

Run the code below to make simple maps of the first day of the data, one for each time step. Note that using the built-in xarray functions means that you don't have to write lines of code to define the colourbars etc.

These maps are made using the pcolormesh function from matplotlib. See this link for more examples: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.pcolormesh.html




In [None]:
import matplotlib.pyplot as plt

(
    d['u100']
    .sel(time='2019-03-01')  # all time steps on 1 March 2019
    .plot(
        x="longitude",
        y="latitude",
        col="time",  # one panel per time step
        col_wrap=4,  # four panels per row
        cmap=plt.cm.viridis,
    )
)

You may wish to visualise many time slices at once. This can be done by selecting multiple time slices and specifying the number of columns.

Run the lines of code below then experiment with changing the parameters to produce different plots.

In [None]:
import matplotlib.pyplot as plt
(
    d['u100']
    .isel(time=slice(0, 48, 8))  # every 8th time step from the first 48
    .plot(
        x="longitude", 
        y="latitude", 
        col="time", 
        col_wrap=3, cmap=plt.cm.PiYG)
)

Have a look through the xarray visualisation gallery for other types of plots you could create with this data: https://docs.xarray.dev/en/stable/examples/visualization_gallery.html

## Subsetting the data in time

There are many ways you may wish to subset a dataset, and xarray provides some handy built-in features to do this.
The main one used here is the `sel` function, which is documented here: https://docs.xarray.dev/en/stable/generated/xarray.DataArray.sel.html

> Q4: Can you add in an extra block of code that extracts all of the data on the 14th March from 8pm to midnight?

> Q5: Can you then make a map to visualise your subset?


In [None]:
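# Only the last expression in a cell is displayed; assign the others to variables to inspect them
# single day
d.sel(time = '2019-03-12')
# a period of several days (both end dates are included)
d.sel(time = slice('2019-03-12', '2019-03-15'))
# only the hours between 10 and 14
d.sel(time=(d.time.dt.hour >= 10) & (d.time.dt.hour <= 14))
# latitudes between 60 and 30 (a subset in space rather than time)
d.sel(latitude=slice(60, 30))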
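
The behaviour of these time selections is easiest to see by counting the time steps that come back. A minimal sketch on a synthetic hourly time axis for March 2019 (the values are invented; only the `time` coordinate matters here):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hourly time axis for the whole of March 2019 with placeholder values
time = pd.date_range("2019-03-01", "2019-03-31 23:00", freq="h")
demo = xr.Dataset(
    {"u100": ("time", np.arange(time.size, dtype=float))},
    coords={"time": time},
)

# A date string without a time of day selects every time step on that day
one_day = demo.sel(time="2019-03-12")

# Label-based slices include both end points
several_days = demo.sel(time=slice("2019-03-12", "2019-03-15"))

# A boolean mask keeps the chosen hours on every day of the month
midday = demo.sel(time=(demo.time.dt.hour >= 10) & (demo.time.dt.hour <= 14))

print(one_day.sizes["time"])       # 24 hourly steps
print(several_days.sizes["time"])  # 96 steps: four full days
print(midday.sizes["time"])        # 155 steps: 5 hours on each of 31 days
```

Note that, unlike ordinary Python slicing, the slice ending at `'2019-03-15'` includes the whole of the 15th.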

## Subsetting the data in space

Instructions similar to those described above can be used to subset the data in space using the `sel` function.
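
As a sketch of what this looks like for a latitude-longitude box, the example below uses a small synthetic grid (coordinates and values are invented). It assumes that latitude is stored from north to south, as the earlier `slice(60, 30)` call suggests for this file, so the slice bounds must be given in that same order.

```python
import numpy as np
import xarray as xr

# Synthetic grid: latitude descending (north to south), longitude ascending
lat = np.arange(70.0, 29.0, -5.0)   # 70, 65, ..., 30
lon = np.arange(-10.0, 31.0, 5.0)   # -10, -5, ..., 30
rng = np.random.default_rng(3)
u100 = xr.DataArray(
    rng.normal(size=(lat.size, lon.size)),
    dims=("latitude", "longitude"),
    coords={"latitude": lat, "longitude": lon},
    name="u100",
)

# Box from 60N to 40N and 0E to 20E; bounds follow the order of each coordinate
box = u100.sel(latitude=slice(60, 40), longitude=slice(0, 20))
print(box.latitude.values)
print(box.longitude.values)

# Giving the latitude bounds in ascending order on a descending axis selects nothing
wrong_order = u100.sel(latitude=slice(40, 60))
print(wrong_order.sizes["latitude"])
```

If a spatial selection unexpectedly comes back empty, checking the order of the coordinate values is a good first step.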


> Q6: Can you print out the latitudes and longitudes of the dataset above?

> Q7: Can you select a point in space and subset the timeseries from this point?

> Q8: Can you plot the timeseries using the .plot() function?
