In [1]:
%%capture
!uv pip install s3fs zarr ctime fsspec

*Workshop 4. Doing research with hydrological data*


# Practical 1: Uncertainty in rainfall estimates

## Contents
- Load in rain gauge and grid data
- Remove rain gauge data from dodgy gauges
- Compare estimations of events, and the differences


## Objectives:
- Understand the uncertainties in rainfall estimates between gridded rainfall and rain gauge data
-


# Introduction
Rainfall is famously difficult to measure/observe. But the best appraoches we currently use are rain gauges, rain radar. Satelitte-derived precipitation is much less accurate  

[Map of UK Gauge station]  

Also, a basic [interactive map](https://thomasjkeel.github.io/UK-Rain-Gauge-Network/gauges.html)  

We will be using the CEH-GEAR dataset for UK rainfall, but there is also the HadUK-Grid (link), which is more common, but uses a more uncertain spatial interpolation method.

> *Gridded rainfall products provide a best guess estimate of rainfall*

[image of gridding]  


In [2]:
# Load required libraries
import fsspec
import zarr

import pandas as pd
import polars as pl
import xarray as xr

import matplotlib.pyplot as plt

## 1. Load data

**Data sources:**
- rain gauge and gridded rainfall data - from JASMIN object-store

In [3]:
fdri_fs = fsspec.filesystem("s3", asynchronous=True, anon=True, endpoint_url="https://fdri-o.s3-ext.jc.rl.ac.uk")
gear_daily_zstore = zarr.storage.FsspecStore(fdri_fs, path="geardaily/GB/geardaily_fulloutput_yearly_100km_chunks.zarr")
gear_daily = xr.open_zarr(gear_daily_zstore, decode_times=True, decode_cf=True)
gear_daily

Unnamed: 0,Array,Chunk
Bytes,6.69 MiB,78.12 kiB
Shape,"(1251, 701)","(100, 100)"
Dask graph,104 chunks in 2 graph layers,104 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray
"Array Chunk Bytes 6.69 MiB 78.12 kiB Shape (1251, 701) (100, 100) Dask graph 104 chunks in 2 graph layers Data type float64 numpy.ndarray",701  1251,

Unnamed: 0,Array,Chunk
Bytes,6.69 MiB,78.12 kiB
Shape,"(1251, 701)","(100, 100)"
Dask graph,104 chunks in 2 graph layers,104 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,6.69 MiB,78.12 kiB
Shape,"(1251, 701)","(100, 100)"
Dask graph,104 chunks in 2 graph layers,104 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray
"Array Chunk Bytes 6.69 MiB 78.12 kiB Shape (1251, 701) (100, 100) Dask graph 104 chunks in 2 graph layers Data type float64 numpy.ndarray",701  1251,

Unnamed: 0,Array,Chunk
Bytes,6.69 MiB,78.12 kiB
Shape,"(1251, 701)","(100, 100)"
Dask graph,104 chunks in 2 graph layers,104 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,310.23 GiB,27.85 MiB
Shape,"(47481, 1251, 701)","(365, 100, 100)"
Dask graph,13624 chunks in 2 graph layers,13624 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray
"Array Chunk Bytes 310.23 GiB 27.85 MiB Shape (47481, 1251, 701) (365, 100, 100) Dask graph 13624 chunks in 2 graph layers Data type float64 numpy.ndarray",701  1251  47481,

Unnamed: 0,Array,Chunk
Bytes,310.23 GiB,27.85 MiB
Shape,"(47481, 1251, 701)","(365, 100, 100)"
Dask graph,13624 chunks in 2 graph layers,13624 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,310.23 GiB,27.85 MiB
Shape,"(47481, 1251, 701)","(365, 100, 100)"
Dask graph,13624 chunks in 2 graph layers,13624 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray
"Array Chunk Bytes 310.23 GiB 27.85 MiB Shape (47481, 1251, 701) (365, 100, 100) Dask graph 13624 chunks in 2 graph layers Data type float64 numpy.ndarray",701  1251  47481,

Unnamed: 0,Array,Chunk
Bytes,310.23 GiB,27.85 MiB
Shape,"(47481, 1251, 701)","(365, 100, 100)"
Dask graph,13624 chunks in 2 graph layers,13624 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray


In [4]:
severn_rain_gauge_data = pl.read_csv('s3://rain-gauge/hourly_severn_rain_gauge_data.csv', storage_options={'endpoint_url': "https://fdri-o.s3-ext.jc.rl.ac.uk", 'anon': True}, try_parse_dates=True)
severn_rain_gauge_metadata = pl.read_csv('s3://rain-gauge/hourly_severn_rain_gauge_metadata.csv', storage_options={'endpoint_url': "https://fdri-o.s3-ext.jc.rl.ac.uk", 'anon': True})

In [5]:
severn_rain_gauge_data.head()

ID,DATETIME,PRECIPITATION
i64,datetime[μs],f64
89714,1978-10-01 00:00:00,0.0
89714,1978-10-02 00:00:00,2.0
89714,1978-10-03 00:00:00,0.0
89714,1978-10-04 00:00:00,0.0
89714,1978-10-05 00:00:00,0.0


In [6]:
severn_rain_gauge_metadata.head()

ID,SRC_ID,NAME,COUNTRY_CODE,EASTING,NORTHING,HYDROMETRIC_AREA,ELEVATION,GEOG_PATH
i64,i64,str,str,i64,i64,i64,i64,str
89714,2913,"""STRONGFORD W WKS""","""GB-GBN""",387932,339157,28,95,"""/BI/UK/GB/ENG/STS/"""
90358,2918,"""SUGNALL HALL""","""GB-GBN""",379831,331185,28,143,"""/BI/UK/GB/ENG/STS/"""
90359,2919,"""ECCLESHALL, SUGNALL HALL""","""GB-GBN""",379800,331200,28,137,"""/BI/UK/GB/ENG/STS/"""
90492,2920,"""WALTON HALL GARDENS""","""GB-GBN""",384900,328500,28,99,"""/BI/UK/GB/ENG/STS/"""
90537,2921,"""WHITMORE P STA""","""GB-GBN""",379900,340100,28,121,"""/BI/UK/GB/ENG/CHS/"""


# 2. Format the data
Most of data science is about data cleaning and formatting. This is especially true for those using environmental data

### 2.1 Quality controlling rain gauge dataset
Data from the rain gauge data is not Quality-controlled


In [7]:
# severn_rain_gauge_data.loc[severn_rain_gauge_data["ID"] == 90537]

In [8]:
one_gauge = severn_rain_gauge_data.filter(pl.col("ID") == 90537)

In [9]:
one_gauge_monthly_sums = one_gauge.group_by_dynamic('DATETIME', every='1m').agg(pl.col('PRECIPITATION').mean())

In [10]:
one_gauge_monthly_sums

DATETIME,PRECIPITATION
datetime[μs],f64
1961-01-01 00:00:00,2.0
1961-01-02 00:00:00,5.1
1961-01-03 00:00:00,1.5
1961-01-04 00:00:00,0.0
1961-01-05 00:00:00,10.9
…,…
1989-03-27 00:00:00,7.3
1989-03-28 00:00:00,0.1
1989-03-29 00:00:00,0.0
1989-03-30 00:00:00,0.0


#### 🤨 Tasks

???
Replace the ??? below with your answer

# 3. Compare gridded rainfall product (CEH-GEAR) with rain gauge data



## 3.1 Join one gauge to one grid cell

In [11]:
# Join one gauge to one grid

#### 🤨 Tasks

???
Replace the ??? below with your answer

## 3.2 Join multiple surrounding grid cells

#### 🤨 Tasks

???
Replace the ??? below with your answer

## 3.3. Case Study: Examining unseen rain gauge data
Unseen rain gauge at Carreg Wen near Plynlimon 

In [16]:
carreg_daily_new = pl.read_csv('s3://rain-gauge/carreg_wen_daily_rainfall.csv', storage_options={'endpoint_url': "https://fdri-o.s3-ext.jc.rl.ac.uk", 'anon': True}, try_parse_dates=True)
carreg_daily_new.head()

ID,DATETIME,PRECIPITATION
i64,datetime[μs],f64
420649,1976-01-03 09:00:00,0.5
420649,1976-01-04 09:00:00,62.0
420649,1976-01-05 09:00:00,12.5
420649,1976-01-06 09:00:00,5.0
420649,1976-01-07 09:00:00,3.5


#### 🤨 Tasks

???
Replace the ??? below with your answer

## ❗❗ FURTHER TASKS ❗❗  
Feel free to stop at this point, but below are some additional and more advanced topics and tasks requiring more of your own input. We will provide help.

Task 1. Look at another catchment  
Task 2. Use CHESS-SCAPE data  
Task 3. How does this effect estimates of floods?  


## Extra task: How does uncertainty about gridded products affect estimations of flood events

In [17]:
## Here are the dates for some high-flow event in the Upper River Severn
high_flow_dates = ["XXXX-XX-XX", ...]

## Extra task: using CHESS-SCAPE climate projection data for precipitation and temperature

In [None]:
# We are accessing TASMAX & PRCPT for the Ensemble member #01 from the catalogue
fs = fsspec.filesystem("s3", asynchronous=True, anon=True, endpoint_url="https://chess-scape-o.s3-ext.jc.rl.ac.uk")
tmax_zstore = zarr.storage.FsspecStore(fs, path="ens01-year100kmchunk/tmax_01_year100km.zarr")
pr_zstore = zarr.storage.FsspecStore(fs, path="ens01-year100kmchunk/pr_01_year100km.zarr")

chess_tmax = xr.open_zarr(tmax_zstore, decode_times=True, decode_cf=True, consolidated=False)
chess_pr = xr.open_zarr(pr_zstore, decode_times=True, decode_cf=True, consolidated=False)

# Additional Reading

- <https://github.com/NERC-CEH/FDRI-comparing-rainfall-data-in-upper-severn/tree/main>  

- <https://github.com/NERC-CEH/FDRI-high-altitude-rainfall-and-floods>