### Modules used in this notebook
`xarray`, `cfgrib`, `matplotlib`, `pandas`, `numpy`

# Climate: C.003 - From speed to power

Weather and climate data is large and can have many dimensions. Climate model data, for example, generally has the dimensions [time, latitude, longitude]. For this reason file types like .csv and .dat are not suitable, and other formats are used instead. The most common of these are .pp, .netcdf and .grib.

To read these files you will need some particular Python libraries. There are multiple options (e.g. `xarray`, `cfpython`), but for this example `cfgrib` and `eccodes` are needed to read GRIB files.

> Q1: What is the GRIB format? https://en.wikipedia.org/wiki/GRIB

After reading this you should be happy with how the file type differs from the type of data files you could load into software like Excel.

For this exercise we will be creating wind power data, so let's also load in data from a wind turbine power curve, here for the Vestas V110 wind turbine.

# Opening the file with xarray
`xarray` is a powerful open-source library designed to access and manipulate multi-dimensional data. With the `cfgrib` engine, [developed by ECMWF](https://github.com/ecmwf/cfgrib), we can access GRIB data using the `ecCodes` library that was installed previously.

> Q2: What is the structure of an `xarray` dataset? https://docs.xarray.dev/en/stable/user-guide/data-structures.html#dataset
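
As a rough illustration of the dataset structure, the sketch below builds a small synthetic `xarray.Dataset` with the dimensions [time, latitude, longitude]. The coordinate values, the random data and the name `toy` are all made up for this example; only the variable names `u100` and `v100` mirror the file used later.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical coordinates: 4 daily time steps on a 3 x 5 grid.
time = pd.date_range("2019-03-01", periods=4, freq="D")
latitude = np.array([57.0, 56.75, 56.5])
longitude = np.array([23.0, 23.25, 23.5, 23.75, 24.0])

rng = np.random.default_rng(0)
shape = (time.size, latitude.size, longitude.size)
dims = ("time", "latitude", "longitude")

# A Dataset groups several variables that share the same coordinates.
toy = xr.Dataset(
    data_vars={
        "u100": (dims, rng.normal(size=shape)),  # synthetic values, not ERA5
        "v100": (dims, rng.normal(size=shape)),
    },
    coords={"time": time, "latitude": latitude, "longitude": longitude},
    attrs={"note": "synthetic example"},
)

print(dict(toy.sizes))       # dimension name -> length
print(list(toy.data_vars))   # variable names
print(toy["u100"].dims)      # dimensions of a single variable
```

Each variable is an array labelled by named dimensions, which is what allows selection by latitude, longitude or time rather than by integer position.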


Run the code below to import the xarray library and open the dataset.
The file naming convention here tells us some information (e.g. that the data is from ERA5 and probably from March 2019), but all of this can be checked once the data is opened.


In [None]:
import xarray as xr
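# Forward slashes avoid invalid escape sequences such as '\d' and work on all platforms
d = xr.open_dataset('../data/era5-u100_v100_201903.grib', engine='cfgrib')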
d

## Calculating wind speed
`u100` and `v100` are, respectively, the west-east (zonal) and south-north (meridional) components of the wind at 100 m. The wind speed is the magnitude of this vector, $\sqrt{u^2 + v^2}$.


<div>
<img src="https://disc.gsfc.nasa.gov/media/image/07af14c37a0a44e482feea5975e1731f/windspeed-diagram.png" width="500"/>
</div>

Run the line of code below to calculate the 100 m wind speed with xarray, and then reprint the open dataset to see the new field within it.


In [None]:
d['ws100'] = (d['u100']**2 + d['v100']**2)**(1/2)
d
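
The same calculation can be checked on a few hand-picked numbers. This is a minimal sketch on synthetic values (the arrays `u` and `v` are invented for the example), comparing the power form used above with NumPy's `np.hypot` helper.

```python
import numpy as np

# Synthetic zonal (u) and meridional (v) wind components in m/s.
u = np.array([3.0, -4.0, 0.0])
v = np.array([4.0, 3.0, -5.0])

speed_power = (u**2 + v**2) ** 0.5   # same form as the notebook cell
speed_hypot = np.hypot(u, v)         # equivalent NumPy helper

print(speed_power)
print(speed_hypot)
```

The sign of each component carries the direction, so it drops out of the speed.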

## Extracting a time-series

We may want to calculate time series of wind or solar power at a particular location. To do this we need some knowledge of the area covered within the data file (see above).

Run the following lines of code for an example of how to extract a time series by selecting the nearest grid point to a location of interest, and plotting this out.

Note we are using our new 100 m wind speed field created in the previous example.

> Q4: Can you adapt the code below to extract some data from an operational wind farm location?

In [None]:
sel_lat = 56.84
sel_lon = 23.88

single_nearest = d.sel(latitude=sel_lat, longitude=sel_lon, method='nearest')
print(single_nearest)

# Plot the 100 m wind speed time series at the selected grid point
single_nearest['ws100'].plot()
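
To make the nearest-grid-point selection less of a black box, here is a rough NumPy sketch of the same idea on a made-up grid. The `latitude` and `longitude` arrays are hypothetical, and this is not how xarray implements `method='nearest'` internally; it only illustrates the outcome.

```python
import numpy as np

# Hypothetical 0.25-degree grid around the location of interest.
latitude = np.array([57.5, 57.25, 57.0, 56.75, 56.5])
longitude = np.array([23.5, 23.75, 24.0, 24.25])
sel_lat, sel_lon = 56.84, 23.88

# Index of the grid value closest to the requested coordinate.
i_lat = int(np.abs(latitude - sel_lat).argmin())
i_lon = int(np.abs(longitude - sel_lon).argmin())

print(latitude[i_lat], longitude[i_lon])  # 56.75 24.0
```

The selected point is therefore up to half a grid spacing away from the requested location, which is worth remembering when comparing against site measurements.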

# Conversion to wind power

Now that the data is loaded and we have a time series, we need a few more Python libraries to load in the power curve.
Here we use `pandas` to read the CSV file, and then load in `numpy` and `matplotlib` to explore the data further.

Run the following two blocks of code to load and plot the wind turbine power curve.

Note that the names of the two fields, wind speed `ws` and capacity factor `cf`, had to be known in advance. You can see these in the printed header of the file.

> Q5: What characteristics do you expect to see in a wind farm power curve? Are these present in the curve you can see plotted below?

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
# A multi-character delimiter needs the Python parsing engine
cv = pd.read_csv('https://raw.githubusercontent.com/hcbloomfield19/UREAD_energy_models_demo_scripts/main/Vestas_v110_2000MW_ECEM_turbine.csv', names=['ws', '', 'cf'], delimiter='  ', engine='python')
print(cv.head())
plt.plot(cv['ws'], cv['cf'])
plt.xlabel('100m wind speed (m s$^{-1}$)')
plt.ylabel('Capacity factor')

Above we see that this turbine starts generating if hub-height winds are > 3 m/s and then has a 'ramping region' until around 11 m/s. The turbine then reaches rated power and produces the same amount of generation for all wind speeds until the cut-out threshold.

Note that this curve represents an individual wind turbine and does not try to model the interactions between wind turbines (due to turbulent wakes etc.). It also does not account for any forced outages or efficiency reductions.
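
The regions described above can be mimicked with an idealised curve. The sketch below is illustrative only: the 3 m/s cut-in and 11 m/s rated speed come from the text, while the 25 m/s cut-out, the cubic ramp shape and the function name `idealised_capacity_factor` are assumptions, not properties of the Vestas data.

```python
import numpy as np

def idealised_capacity_factor(ws, cut_in=3.0, rated=11.0, cut_out=25.0):
    """Idealised power curve; cut_out and the cubic ramp are assumptions."""
    ws = np.asarray(ws, dtype=float)
    # Ramping region: power grows with the cube of wind speed (assumed shape).
    ramp = (ws**3 - cut_in**3) / (rated**3 - cut_in**3)
    # Zero below cut-in, capped at rated power (capacity factor 1) above it.
    cf = np.where(ws < cut_in, 0.0, np.clip(ramp, 0.0, 1.0))
    # The turbine shuts down at and above the cut-out threshold.
    return np.where(ws >= cut_out, 0.0, cf)

speeds = np.array([0.0, 3.0, 7.0, 11.0, 20.0, 30.0])
print(idealised_capacity_factor(speeds))
```

Comparing a curve like this with the plotted one is one way to approach Q5: the real curve has the same four regions but a smoother transition into rated power.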


To convert from wind speed to wind power a function is defined below.
This interpolates the power curve onto a fine wind speed grid (501 points between 0 and 50 m/s) and then uses the numpy digitize function https://numpy.org/doc/stable/reference/generated/numpy.digitize.html to assign each wind speed to a bin on that grid. The capacity factor for that wind speed is the mean of the curve values at the two edges of its bin.

Note this is part of the method used in Bloomfield et al. (2020) https://rmets.onlinelibrary.wiley.com/doi/full/10.1002/met.1858, but many wind power models (e.g. renewables.ninja) use a similar style of method with some added complexity.
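
Before reading the function, it may help to see the interpolate-then-digitize idea on a toy curve. This is a minimal sketch: the coarse curve (`curve_ws`, `curve_cf`) and the sample speeds are invented, and the last line uses direct interpolation with `np.interp` purely as a cross-check, not as part of the method described above.

```python
import numpy as np

# Hypothetical coarse power curve (not the Vestas data).
curve_ws = np.array([0.0, 3.0, 11.0, 25.0, 26.0, 50.0])
curve_cf = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])

# Interpolate the curve onto a fine grid with 0.1 m/s spacing.
fine_ws = np.linspace(0, 50, 501)
fine_cf = np.interp(fine_ws, curve_ws, curve_cf)

speeds = np.array([2.04, 7.04, 12.55, 60.0])

# np.digitize returns the index of the upper edge of each speed's bin.
idx = np.digitize(speeds, fine_ws, right=False)
# Speeds beyond the grid would index off the end, so clamp to the last bin.
idx[idx == len(fine_ws)] = len(fine_ws) - 1

# Mean of the curve values at the lower and upper bin edges.
binned_cf = 0.5 * (fine_cf[idx - 1] + fine_cf[idx])

# Cross-check against interpolating the coarse curve directly.
direct_cf = np.interp(speeds, curve_ws, curve_cf)

print(idx)
print(binned_cf)
print(direct_cf)
```

With 0.1 m/s bins the two approaches agree closely on this toy curve; the binned version simply snaps each speed to the midpoint value of its bin.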

In [None]:
def convert_to_windpower(wind_speed_data, power_curve_data):
    """Convert an array of wind speeds (m/s) to capacity factors using a power curve."""
    # Power curve columns as NumPy arrays
    power_curve_w = np.array(power_curve_data['ws'])
    power_curve_p = np.array(power_curve_data['cf'])

    # Interpolate the curve onto a fine grid: 0 to 50 m/s, 501 points
    pc_winds = np.linspace(0, 50, 501)
    pc_power = np.interp(pc_winds, power_curve_w, power_curve_p)

    # np.digitize returns the index of the upper edge of each wind speed's
    # bin (indexing starts from 1), so the lower edge is at index - 1
    reshaped_speed = wind_speed_data.flatten()
    bin_index = np.digitize(reshaped_speed, pc_winds, right=False)
    # Make sure the bins don't go off the end (power is zero by then anyway)
    bin_index[bin_index == len(pc_winds)] = len(pc_winds) - 1

    # Capacity factor is the mean of the curve values at the two bin edges
    wind_power_flattened = 0.5 * (pc_power[bin_index - 1] + pc_power[bin_index])

    # Restore the shape of the input array
    wind_power_cf = np.reshape(wind_power_flattened, np.shape(wind_speed_data))

    return wind_power_cf


Run the line of code below to convert the wind speed data into capacity factors.
Note this is a two-step process:
1. passing the `ws100` values through the wind power function, and
2. creating a new field called `cf` in the dataset to hold the result.

In [None]:
d['cf'] = (['time', 'latitude', 'longitude'],  convert_to_windpower(d['ws100'].values, cv) )
d
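
The `(dimensions, values)` tuple used above is the part that is easy to get wrong, so here is a small sketch of the same pattern on a synthetic dataset. The dataset `toy` and the placeholder calculation (wind speed divided by 16, clipped to [0, 1]) are hypothetical stand-ins for the real data and for `convert_to_windpower`.

```python
import numpy as np
import xarray as xr

# Synthetic dataset: constant 8 m/s wind on a 2 x 3 x 4 grid.
toy = xr.Dataset(
    {"ws100": (("time", "latitude", "longitude"), np.full((2, 3, 4), 8.0))}
)

# Step 1: work on the raw NumPy values (a placeholder calculation here).
values = np.clip(toy["ws100"].values / 16.0, 0.0, 1.0)

# Step 2: attach the result under a new name, reusing the source dimensions
# so the new array must have the same shape as ws100.
toy["cf"] = (toy["ws100"].dims, values)

print(toy["cf"].dims, float(toy["cf"].mean()))
```

Reusing `toy["ws100"].dims` rather than typing the dimension names avoids a mismatch if the file stores its dimensions in a different order.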

# Visualising the Capacity factor data

There are various methods that can be used to visualise the capacity factor data. An example of this is shown below using the inbuilt plotting functions from within xarray.

This line of code selects a set of time slices, defines the coordinates and then defines the plotting parameters.

> Q6: Can you plot the capacity factor map for the day on which the whole area experiences the highest average generation, and for the day with the lowest average generation? (Hint: see the 'Box Average' tutorial as a starting point.)

In [None]:

d['cf'].isel(time=slice(0,48, 8)).plot(x="longitude", y="latitude", col="time", col_wrap=3, cmap=plt.cm.viridis)




In [None]:
d['cf'].isel(time=d['cf'].mean(dim=['latitude', 'longitude']).argmax('time')).plot(x="longitude", y="latitude")
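
The selection in the last cell can be unpacked with plain NumPy. The sketch below uses a tiny invented array with shape (time, latitude, longitude) and an unweighted spatial mean, as in the cell above; it finds the time steps with the highest and the lowest area-average capacity factor.

```python
import numpy as np

# Synthetic capacity factors: 3 time steps on a 2 x 2 grid.
cf = np.array([
    [[0.2, 0.4], [0.1, 0.3]],   # area mean 0.25
    [[0.8, 0.9], [0.7, 0.6]],   # area mean 0.75
    [[0.0, 0.1], [0.1, 0.0]],   # area mean 0.05
])

# Average over the two spatial axes, leaving one value per time step.
area_mean = cf.mean(axis=(1, 2))

i_high = int(area_mean.argmax())  # time index of highest average generation
i_low = int(area_mean.argmin())   # time index of lowest average generation

print(i_high, i_low)  # 1 2
```

In xarray the equivalent of `argmin` is available with the same call pattern as the `argmax` used above.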