# Extract Met Office climate model data to a point shapefile

The following notebook demonstrates how I have extracted climate model data (e.g. rainfall) from NetCDF files to shapefile points. The goal of this process is to obtain monthly values of these variables for each plot, as in the other notebook in this repo, which extracts NDVI from Sentinel-2 for the same purpose. There are doubtless more efficient ways to do this for Met data, but the aim was to use a similar method to that used for the EO-based data.

**Ensure you are running the correct kernel. The kernel name shown at the top right should be [conda env:eot].**

**If not, Kernel > Change kernel > eot from the menu.** 

The points in question here are Countryside Survey data, which may be accessed via the CEH Environmental Data Centre (https://eip.ceh.ac.uk/).

Please obtain your own CSS points file and replace the shapefile string as appropriate.

In [None]:
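import os
import geopandas as gpd  # needed for gpd.read_file further down
from glob2 import glob
from src.downloader import dload, dloadbatch, setup_sesh
from src.met_tseries import met_time_series
from tqdm import tqdm
# NOTE: tseries_group is called later in this notebook but is not imported here;
# import it from whichever module in this repo defines it.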

## Download

First, we must obtain the relevant data. The download is scripted here to save the tedium of clicking through multiple pages to get it.

We start by setting up our credentials for the download by running the `setup_sesh` function.

**This will only work if you have registered with CEDA - ensure you have done so and replace the strings below with your own username and password**

In [None]:
setup_sesh("mycedausrname", "mypasswrd")

Make a dir for it to go in.

In [None]:
metdir = 'metdwn'
# create the download directory if it does not already exist
os.makedirs(metdir, exist_ok=True)

A lazy way to obtain the URLs we need for each climate variable: take the rainfall URL as a template and swap in each variable name. For an explanation of what each variable is, see here:

https://www.metoffice.gov.uk/research/climate/maps-and-data/data/haduk-grid/datasets

We can reuse the climate variable names later too...

In [None]:
# the rainfall one as a template
rain_url = ('https://dap.ceda.ac.uk/badc/ukmo-hadobs/data/insitu/MOHC/'
 'HadOBS/HadUK-Grid/v1.0.2.1/1km/rainfall/mon/v20200731/'
 'rainfall_hadukgrid_uk_1km_mon_201601-201612.nc')
# list of vars
clim_vars = ['groundfrost', 'hurs', 'psl', 'pv', 'sfcWind', 'sun', 'tas']

# final list of urls
dwnlist = [rain_url.replace('rainfall', c) for c in clim_vars]

# add the rainfall
dwnlist.append(rain_url)
clim_vars.append('rainfall')

In [None]:
# sanity.....
print(dwnlist[0], clim_vars[0], '\n', dwnlist[7], clim_vars[7])
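
Note that `str.replace` swaps every occurrence of `'rainfall'`, so the directory component and the file name change together. A minimal, self-contained sketch of the same substitution follows; the helper name `build_urls` is hypothetical and introduced only for illustration.

```python
# Template URL copied from the cell above.
RAIN_URL = ('https://dap.ceda.ac.uk/badc/ukmo-hadobs/data/insitu/MOHC/'
            'HadOBS/HadUK-Grid/v1.0.2.1/1km/rainfall/mon/v20200731/'
            'rainfall_hadukgrid_uk_1km_mon_201601-201612.nc')


def build_urls(template, variables, placeholder='rainfall'):
    """Return {variable: url}, swapping every `placeholder` occurrence."""
    return {v: template.replace(placeholder, v) for v in variables}


urls = build_urls(RAIN_URL, ['tas', 'sun'])
# Both the folder and the file name now carry the new variable name.
print(urls['tas'])
```

Keying the result by variable name avoids relying on two parallel lists staying in the same order.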

Now we can download the data - this also returns a list of the file paths to be used later. This may take a wee while......

In [None]:
imlist = dloadbatch(dwnlist, metdir)
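
Before moving on, it may be worth confirming that every file actually arrived. The sketch below assumes, as stated above, that the download step returns local file paths; `missing_files` is a hypothetical helper, not part of this repo.

```python
import os


def missing_files(paths):
    """Return the entries of `paths` that are not existing, non-empty files."""
    return [p for p in paths
            if not (os.path.isfile(p) and os.path.getsize(p) > 0)]


# Hypothetical usage with the list returned by the download step:
# problems = missing_files(imlist)
# if problems:
#     print('Download failed or incomplete for:', problems)
```

An empty list means all files are present; a failed login would typically show up here as missing or zero-length files.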

## Data extraction

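Using the CSS file that you will have obtained in advance, loop through the NetCDF files, adding each climate variable to the shapefile. The same path is passed twice below, so the results are written back to the same file.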

In [None]:
inShp = "mycss.shp"

# zip has no length, so pass total= to give tqdm a complete progress bar
for d, p in tqdm(zip(imlist, clim_vars), total=len(imlist)):
    met_time_series(d, inShp, inShp, p)
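
To give a feel for what sampling a grid at a point involves, here is a conceptual sketch of a nearest-cell lookup on a toy grid. This is an illustration only and is not claimed to be how `met_time_series` works internally; all names and values below are made up.

```python
def nearest_index(coords, value):
    """Index of the entry in `coords` closest to `value`."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - value))


def sample_grid(grid, xs, ys, x, y):
    """Value of the cell whose centre is nearest to the point (x, y).

    `grid` is indexed as grid[row][col], with rows following `ys`
    and columns following `xs`.
    """
    return grid[nearest_index(ys, y)][nearest_index(xs, x)]


# Toy grid with 1 km spacing (cell centres in metres) and made-up values.
xs = [500.0, 1500.0, 2500.0]
ys = [500.0, 1500.0]
grid = [[10.0, 11.0, 12.0],
        [20.0, 21.0, 22.0]]

print(sample_grid(grid, xs, ys, x=1400.0, y=1600.0))
```

For a monthly file the same lookup would be repeated for each time step, giving one value per month per point.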
    

### Read in the newly created file and extract the variables

For reference, the attribute names are cut-down versions of the entries in the `clim_vars` list above (shapefile field names are limited to 10 characters), which may be obtained with a list comprehension. These names can then be used to extract a df for a single variable. For example,

```python
names[7]
```
....being rain

In [None]:
names = [c[0:4] for c in clim_vars]
names
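
Since the names are truncated to four characters, two variables could in principle end up with the same prefix. A quick, self-contained check (the variable list is copied from above, including the appended `'rainfall'`):

```python
from collections import Counter

# Same variable list as above, including the appended 'rainfall'.
clim_vars = ['groundfrost', 'hurs', 'psl', 'pv', 'sfcWind', 'sun', 'tas',
             'rainfall']
names = [c[:4] for c in clim_vars]

# Any prefix appearing more than once would make two variables
# indistinguishable in the attribute table.
clashes = [n for n, count in Counter(names).items() if count > 1]
print(names, clashes)
```

If `clashes` is non-empty after adding further variables, a longer slice would be needed.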

In [None]:
ndf = gpd.read_file(inShp)

### Extract a particular group from the shapefile

```python
tseries_group(ndf, name, other_inds=None)
```
Where:
- ```name``` = string - the attribute to extract ('rain' in the case below)
- ```other_inds``` = list -  additional attributes to extract for later indexing (not used here)
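
Before extracting a group, it can help to see which fields match a given name. The sketch below assumes the per-month fields begin with the truncated variable name; this is an assumption, so inspect `ndf.columns` to confirm. `columns_with_prefix` and the example column names are hypothetical.

```python
def columns_with_prefix(columns, prefix):
    """Return the column names that start with `prefix`, in original order."""
    return [c for c in columns if str(c).startswith(prefix)]


# Hypothetical attribute names, for illustration only; inspect
# `ndf.columns` to see the real ones in your file.
example_columns = ['PLOT_ID', 'rain-01', 'rain-02', 'tas-01', 'geometry']
print(columns_with_prefix(example_columns, 'rain'))
```

With the real data, something like `columns_with_prefix(ndf.columns, names[7])` would list the candidate rainfall fields.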



In [None]:
rain = tseries_group(ndf, names[7], other_inds=None)

Check the df

In [None]:
rain.head()