# KSA206: Polar Observations and Modelling

# Week 6 - Comparing ACCESS to observations

It is important to remember that a model is not the real ocean, and that how well it represents a given region or process varies. For this reason, a model study usually includes a "model validation": a comparison between the model and any available observations in the study region. Such observations are very scarce in the Southern Ocean, as you may remember from our `1_Introduction_to_EN422.ipynb` exercise.

There are several elements to take into account in a model validation:
 - Models are usually initialised with observations - we tell them what the ocean looks like at the beginning and then let them simulate unconstrained. This means that many models end up "drifting" away from that initial state. This is not necessarily a bad thing, it just happens. 
 - Models and observations are not going to have the same dimensions/coordinates, so some manipulation and interpolation will be needed.
 - As usual, careful with units! You want to be comparing apples to apples, not apples to oranges. 
 - After you've done your comparison, you need to use your judgement to interpret whether the model is doing adequately *for the purpose of your study*.
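
As a small illustration of the units point above, here is a minimal sketch using made-up numbers (not taken from the course datasets). It assumes a model temperature stored in Kelvin and observations in degrees Celsius; the variable names are hypothetical.

```python
import numpy as np

KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in Kelvin

# Made-up values, for illustration only
model_temp_K = np.array([275.15, 274.15, 273.65])  # model output in Kelvin
obs_temp_C = np.array([1.5, 1.0, 0.75])            # observations in degrees Celsius

# Convert the model to the units of the observations before differencing
model_temp_C = model_temp_K - KELVIN_OFFSET
bias = model_temp_C - obs_temp_C

# Skipping the conversion gives a "bias" of roughly 273 degrees
wrong_bias = model_temp_K - obs_temp_C
```

A difference of a few hundred degrees is an easy mistake to spot; mismatches between salinity or temperature conventions are subtler, so it is worth checking the `units` attributes of both datasets before comparing.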

## ACCESS-OM2 validation against I09S

We will do a model validation using the temperature and salinity observations from the I09S section from Week 4.

In [120]:
import gsw
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

## Load data

Let's load our model temperature and salinity, and the EasyOcean product.

In [121]:
model = xr.open_dataset('data/access-om2_temp_salt_avg_2000-2018.nc')
obs = xr.open_dataset('../Week4/data/i09s.nc')

In [122]:
model

In [123]:
obs

Our first obstacle is the spatial coordinates. Specifically, the model's longitude dimension, `xt_ocean`, goes from -280 to 80. We need to convert it to the -180 to 180 range in order to select 114.7, which is the longitude of I09S.

Doing this is relatively simple. We just need to shift the extent, which can be done by selecting all the longitudes smaller than -180 and adding 360 to them. We can do that using `xr.where()`:

In [124]:
new_longitude = xr.where(model['xt_ocean']<-180, model['xt_ocean']+360, model['xt_ocean'])

Now we can replace `xt_ocean` with this `new_longitude`:

In [125]:
model['xt_ocean'] = new_longitude.values

And sort so that the longitudes are in ascending order:

In [126]:
model = model.sortby('xt_ocean')
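
The same shift can also be written with a modulo operation. The sketch below uses a synthetic dataset (the `toy` name and its values are made up for illustration) to check that both approaches give the same longitudes; it is an alternative to the steps above, not a replacement for them.

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree longitude axis spanning -280 to 79, like `xt_ocean`
lon = np.arange(-280.0, 80.0, 1.0)
toy = xr.Dataset(
    {"temp": ("xt_ocean", np.arange(lon.size, dtype=float))},
    coords={"xt_ocean": lon},
)

# Modulo form of the same shift: maps any longitude into [-180, 180)
wrapped = ((toy["xt_ocean"] + 180) % 360) - 180

# assign_coords returns a new dataset rather than modifying `toy` in place
toy_wrapped = toy.assign_coords(xt_ocean=wrapped.values).sortby("xt_ocean")

# The xr.where version gives the same coordinate values (before sorting)
where_version = xr.where(
    toy["xt_ocean"] < -180, toy["xt_ocean"] + 360, toy["xt_ocean"]
)
```

The modulo form has the advantage of working for any starting range, while the `xr.where` form makes the logic more explicit.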

And done! We have shifted our dimensions accordingly. Now let's rename them to `latitude` and `longitude`:

In [127]:
model = model.rename({'xt_ocean':'longitude', 'yt_ocean':'latitude'})

In [128]:
model

Now let's select the I09S longitude:

In [129]:
model_i09s = model.sel(longitude = 114.7, method = 'nearest')
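
Note that `method = 'nearest'` returns the closest grid point, which is generally not exactly 114.7. A minimal sketch on a synthetic 1-degree grid (the grid and the `toy` dataset are assumptions for illustration, not the actual ACCESS-OM2 grid) shows how to check which longitude was actually selected:

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree grid with cell centres at -179.5, -178.5, ..., 179.5
lon = np.arange(-179.5, 180.0, 1.0)
toy = xr.Dataset(
    {"temp": ("longitude", np.zeros(lon.size))},
    coords={"longitude": lon},
)

target = 114.7
section = toy.sel(longitude=target, method="nearest")

# The coordinate of the result is the grid longitude actually used
picked = float(section["longitude"])
offset = abs(picked - target)
```

For a validation it is worth keeping this offset in mind: the model section and the observed section are close, but not exactly co-located.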

In [130]:
model_i09s

In [131]:
obs

Our second obstacle is that our model has a vertical coordinate that is depth, and the observations come in a vertical coordinate that is pressure. We will convert the pressures from the observations to depths. 

<h4 style="color: red;">Question 1</h4>

Use `gsw` to convert pressures to depth levels - name the variable `depth_levels`. Use a mean latitude of -50.

*Answer here*

In [132]:
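# z_from_p expects positive pressures (dbar) and returns height, which is
# negative below the surface; flip the sign of the result to get positive depths
depth_levels = -gsw.z_from_p(obs['pressure'], -50)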

Now that you have calculated the depth levels, let's replace and rename in our obs dataset:

In [133]:
obs = obs.rename({'pressure':'depth'})
obs['depth'] = depth_levels.values

And rename the model's `st_ocean` to `depth`:

In [134]:
model_i09s = model_i09s.rename({'st_ocean':'depth'})

Now we are ready to interpolate to the same depth levels and lat/lon locations. But we need to make a choice: do we interpolate the model to the obs, or the obs to the model? 

You will notice that the model's resolution is significantly lower than that of the observations. This means that if we interpolate the model to match the observations, we will be artificially "creating" data points, which is not ideal. On the other hand, to interpolate the observations onto the model grid we would just need to subsample.

In other words, you introduce fewer "errors" by lowering the resolution than by increasing it. So we will interpolate the obs to the model's locations!

In [135]:
obs_interp = obs.interp(latitude = model_i09s['latitude'].values, depth = model_i09s['depth'].values)
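
To see what this interpolation does along a single dimension, here is a minimal `numpy` sketch with a made-up temperature profile and hypothetical model levels. It uses `np.interp` as a 1-D stand-in for linear interpolation; the values are not taken from I09S or ACCESS-OM2.

```python
import numpy as np

# Synthetic "observed" profile on a fine 10 m grid (made-up linear profile)
obs_depth = np.arange(0.0, 1001.0, 10.0)   # 0, 10, ..., 1000 m
obs_temp = 10.0 - 0.008 * obs_depth        # degrees Celsius

# Hypothetical coarse "model" levels, the last one deeper than the observations
model_depth = np.array([5.0, 55.0, 305.0, 955.0, 1500.0])

# Linear interpolation onto the model levels, with NaN outside the observed range
obs_on_model = np.interp(
    model_depth, obs_depth, obs_temp, left=np.nan, right=np.nan
)
```

The last level lies below the deepest observation, so it comes back as NaN rather than an extrapolated value. Expect similar gaps in `obs_interp` wherever the model grid extends beyond the observed latitudes or depths.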

In [136]:
obs_interp

In [137]:
model_i09s

<h4 style="color: red;">Question 2</h4>

Calculate the average I09S section (time mean), calculate the difference with the model's cross section and plot temperature and salinity differences. We will refer to this as "temperature and salinity biases". Take into account the following:

 - The model has conservative temperature in Kelvin and practical salinity.
 - Use `.squeeze()` to remove redundant dimensions when needed.

You should get a figure like this:

<p align="center">
<img src="images/model_validation.png" width="70%"/>
</p>

*Answer here*