# Loading and Preprocessing the Data

## ERA5 (Ekman Upwelling Index)
I am working with the mean turbulent surface stress $\big[\frac{N}{m^2}\big]$ or [Pa] in the eastward and northward directions, available from the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview). I download the data with the Copernicus tool that resamples it from hourly to daily resolution: [Download Link](https://cds.climate.copernicus.eu/apps/user-apps/app-c3s-daily-era5-statistics?dataset=reanalysis-era5-single-levels&product_type=reanalysis&variable_e5sl=eastward_turbulent_surface_stress&pressure_level_e5sl=-&statistic=daily_mean&year_e5sl=2020&month=01&frequency=1-hourly&time_zone=UTC%2B00:00&grid_e5=0.25/0.25&area.lat:record:list:float=36&area.lat:record:list:float=45&area.lon:record:list:float=-20&area.lon:record:list:float=-5).
- Period: 01/12/1981-31/01/2024 (daily)
- Resolution: 0.25° x 0.25°

### Processing
1. Download data with ERA5_download.py
    - conda activate IbUpPy3.9.12
    - python3 ERA5_download_surface_stress.py
2. Load the data
    - after the download, the data are stored in individual files: one northward and one eastward file per year
    - load and combine the datasets
    - save as MTSS.nc
3. Add land mask
    - get ERA5 land sea mask: Download_ERA5_land_sea_mask.ipynb
    - add to MTSS
4. Resample the data
    - the data have daily resolution; resample to weekly resolution to match the UI SST data, which are already weekly
    - target format: weekly mean Sat-Fri & time stamp Tue
    - save the resampled data as MTSS_weekly.nc
    - also calculate std and save as MTSS_weekly_std.nc
5. (Ekman transport -> calculate when needed)
    - calculate Ekman transport from the wind stress data
    - calculate the Ekman upwelling index (i.e. the westward component of the Ekman transport)
    - (save as UI_Ek.nc -> don't)
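
Step 5 can be sketched as follows. This is only a minimal illustration and not the contents of `calc_upwelling_index` in my_functions.py. It assumes the standard Ekman relations $Q_x = \tau_y / (\rho f)$ and $Q_y = -\tau_x / (\rho f)$, a constant seawater density of 1025 kg/m³, and a roughly north-south coastline, so that the upwelling index is the westward transport $-Q_x$. The function name `ekman_upwelling_index` and the constants are illustrative choices.

```python
import numpy as np

OMEGA = 7.2921e-5   # Earth's rotation rate [rad/s]
RHO_SEA = 1025.0    # assumed constant seawater density [kg/m^3]

def ekman_upwelling_index(tau_x, tau_y, lat):
    """Return (Q_x, Q_y, UI) in m^2/s from surface stress in N/m^2.

    Hypothetical helper: UI is the westward Ekman transport (-Q_x),
    so positive values mean offshore transport for a west-facing coast.
    """
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat))  # Coriolis parameter [1/s]
    q_x = np.asarray(tau_y) / (RHO_SEA * f)    # eastward transport
    q_y = -np.asarray(tau_x) / (RHO_SEA * f)   # northward transport
    return q_x, q_y, -q_x

# southward stress of 0.1 N/m^2 at 40°N (upwelling-favourable off western Iberia)
q_x, q_y, ui = ekman_upwelling_index(tau_x=0.0, tau_y=-0.1, lat=40.0)
print(float(ui))
```

With these assumptions a southward stress gives a positive index (offshore transport), which matches the sign convention used for the SST index.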

## SST Upwelling Index
I downloaded the data from the [CoastNET geoportal](http://geoportal.coastnet.pt). This product is calculated from SST data obtained from [CoRTAD](https://www.ncei.noaa.gov/products/coral-reef-temperature-anomaly-database).
- Period: 04/01/1982 - 09/11/2021 (weekly)
- Resolution: lat: ~0.04166° and lon: 5.019 - 0.04166° (lon res does not really matter because this index calculates the difference between the temperature on the mid-shelf and at 15°W)

### Processing
1. Download the data and save as UI_SST_CoastNET.nc
2. Load the data
3. Convert the temperature from Kelvin to °C for more intuitive understanding
4. Change the sign of the index
    - the index is calculated by subtracting the 15°W temperatures from the mid-shelf temperatures ($T_{mid-shelf} \ - \ T_{15^{\circ}W} \ = \ UI_{SST}$), so it is negative during upwelling
    - multiply by -1 so that positive values indicate upwelling
    - save as UI_SST.nc
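
The sign convention in steps 3 and 4 can be checked on a tiny synthetic dataset. The variable names `Tmid`, `Toff15W` and `UI` follow the CoastNET file; the temperature values are made up for illustration. Note that converting both temperatures from Kelvin to °C does not change their difference, so only the sign of the index needs to be touched.

```python
import numpy as np
import xarray as xr

# tiny stand-in for UI_SST_CoastNET.nc (made-up values, in Kelvin)
ds = xr.Dataset(
    {
        "Tmid": ("time", np.array([288.15, 290.15])),
        "Toff15W": ("time", np.array([291.15, 290.65])),
    }
)
ds["UI"] = ds.Tmid - ds.Toff15W  # original sign: negative during upwelling

# Kelvin -> °C shifts both temperatures equally, so the difference is unchanged
ds["Tmid"] = ds.Tmid - 273.15
ds["Toff15W"] = ds.Toff15W - 273.15

# flip the sign so that positive values indicate upwelling
ds["UI"] = ds.UI * -1
print(ds.UI.values)
```

After the flip the index equals `Toff15W - Tmid`, i.e. it is positive when the mid-shelf water is colder than the water at 15°W.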

## ECCO2 
### SST and SSH
Download the [ECCO2 data](https://ecco.jpl.nasa.gov/drive/files/ECCO2/cube92_latlon_quart_90S90N/)
- Period: 01/01/1992 - 31/12/2023
- Resolution: 0.25° x 0.25° (1440 x 720 global grid)

1. Download the data with Download_ECCO2.txt
    - Download_ECCO2.txt is an executable file
    - execute it in the terminal by running ./Download_ECCO2.txt
2. Load the data
3. Select research area (35°N to 45°N, 20°W (340°E) to 5°W (355°E))
4. Resample the data
    - the data have daily resolution; resample to weekly resolution to match the UI SST data, which are already weekly
    - target format: weekly mean Sat-Fri & time stamp Tue
    - save the resampled data as ECCO2_weekly.nc
    - also calculate std and save as ECCO2_weekly_std.nc
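
The research area is given in both longitude conventions because the ECCO2 files use 0° to 360°E. A minimal sketch of the conversion between the two conventions, using hypothetical helper names `to_0_360` and `to_180`:

```python
import numpy as np

def to_0_360(lon_deg):
    """Map longitudes from [-180, 180) to [0, 360)."""
    return np.mod(lon_deg, 360.0)

def to_180(lon_deg):
    """Map longitudes from [0, 360) back to [-180, 180)."""
    return (np.asarray(lon_deg) + 180.0) % 360.0 - 180.0

# western and eastern bounds of the research area: 20°W and 5°W
west, east = to_0_360(np.array([-20.0, -5.0]))
print(west, east)  # 340.0 355.0
```

These are the bounds used in `slice(340, 355)` when cutting the ECCO2 files.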
      
## IBI SSH
Downloaded the data for the Iberian Peninsula from the [Copernicus Marine Service](https://data.marine.copernicus.eu/product/IBI_MULTIYEAR_PHY_005_002/description).
- Period: 01/01/1993 - 28/12/2021
- Resolution: 0.083° x 0.083°

### Processing
1. Download the data with Download_SSH.ipynb
    - rename latitude to lat and longitude to lon
2. Load the data
3. Rename lat, lon, time
4. Cut research area from global dataset
5. Resample the data
    - the data have daily resolution; resample to weekly resolution to match the UI SST data, which are already weekly
    - target format: weekly mean Sat-Fri & time stamp Tue
    - save the resampled data as SSH_weekly.nc
    - also calculate std and save as SSH_weekly_std.nc
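
The weekly format (mean over Sat-Fri, time stamp on Tue) can be verified on synthetic daily data. In this sketch a pandas Series stands in for the xarray dataset, which is an assumption made to keep the example light; the `W-SAT`, `closed` and `label` arguments are the same ones passed to the xarray `resample` call. `ddof=0` is set explicitly to match the default of xarray's `.std()`, since pandas defaults to `ddof=1`.

```python
import numpy as np
import pandas as pd

# two full Sat-Fri weeks of synthetic daily data (2020-01-04 is a Saturday)
days = pd.date_range("2020-01-04", periods=14, freq="D")
daily = pd.Series(np.arange(14, dtype=float), index=days)

# 'W-SAT' puts the bin edges on Saturdays; closed/label 'left' gives [Sat, next Sat)
weekly = daily.resample("W-SAT", closed="left", label="left")
weekly_mean = weekly.mean()
weekly_std = weekly.std(ddof=0)

# shift the label from Saturday to the following Tuesday
weekly_mean.index = weekly_mean.index + pd.Timedelta(days=3)
weekly_std.index = weekly_std.index + pd.Timedelta(days=3)

print(weekly_mean)
```

Each weekly value is the mean of the seven days from Saturday to Friday, and the time stamps fall on 2020-01-07 and 2020-01-14, both Tuesdays.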
  
## CoRTAD SST
Download data from [NOAA](https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.nodc:NCEI-CoRTADv6)
- Period: 01/05/1982 - 27/12/2022
- Resolution: 0.04165649° x 0.04165649°

### Processing
1. Download cortadv6_FilledSST.nc via the HTTPS download link on the NOAA website and save it as CoRTAD_global.nc
2. Drop the dimension 'nv'
3. Select research area (35°N to 45°N, 20°W (340°E) to 5°W (355°E))
4. No resampling needed: this is the weekly dataset that all other datasets are matched to
5. Convert K to °C
    - Save as CoRTAD_weekly.nc
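
When selecting the research area (step 3), label-based slicing in xarray follows the order of the coordinate: on a latitude axis that runs from north to south, the slice bounds have to be given from 45 to 35. The sketch below shows both cases on a coarse synthetic grid; the grid spacing and the all-zero data are made up for illustration.

```python
import numpy as np
import xarray as xr

lat = np.arange(30.0, 50.1, 2.5)   # ascending: 30, 32.5, ..., 50
lon = np.arange(-25.0, 0.1, 2.5)   # -25, -22.5, ..., 0
sst = xr.DataArray(
    np.zeros((lat.size, lon.size)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# ascending latitude: slice from south to north
box_asc = sst.sel(lat=slice(35, 45), lon=slice(-20, -5))

# descending latitude: the slice bounds must be reversed as well
sst_desc = sst.sortby("lat", ascending=False)
box_desc = sst_desc.sel(lat=slice(45, 35), lon=slice(-20, -5))

print(box_asc.lat.values, box_desc.lat.values)
```

Both selections cover the same box; only the order of the latitude values differs. Giving the bounds in the wrong order for the axis returns an empty selection rather than an error, so it is worth checking the shape of the result.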

In [1]:
## import packages
import xarray as xr
import numpy as np
import glob
import os
import my_functions
import warnings
warnings.filterwarnings('ignore') # ignore runtime warning for SSH.resample(...).std()

In [3]:
## set directory to where datasets are stored
os.chdir("/Users/marie-louisekorte/Documents/Uni Leipzig/Lisbon/Data.nosync/")
#os.chdir("/Volumes/Jamie/")

## Load the data

### ERA5

In [None]:
## load ERA5 mean turbulent surface stress
# load data in chunks to avoid jupyter lab crashing
MTSS_19th = xr.merge([xr.open_dataset(f) for f in glob.glob('ERA5/Surface_stress/Turbulent_mean/*_19*.nc')])

In [None]:
MTSS_20th_N = xr.merge([xr.open_dataset(f) for f in glob.glob('ERA5/Surface_stress/Turbulent_mean/N_20*.nc')])

In [None]:
MTSS_20th_E = xr.merge([xr.open_dataset(f) for f in glob.glob('ERA5/Surface_stress/Turbulent_mean/E_20*.nc')])

In [None]:
# merge datasets  (and drop empty coordinate "realization")
MTSS = xr.merge([MTSS_19th, MTSS_20th_N, MTSS_20th_E])
MTSS = MTSS.drop_vars(["realization"])

In [None]:
# save as netcdf 
MTSS.to_netcdf("MTSS.nc")

In [None]:
## load ERA5 land sea mask
LSM = xr.open_dataset('Land_sea_mask.nc')

### UI SST

In [4]:
## load UI SST
UI_SST = xr.open_dataset('UI_SST_CoastNET.nc')

### ECCO2 SST

In [18]:
## load ECCO2 SST
for year in np.arange(1992, 2024):
    for month in np.arange(1, 13):
        ds = xr.open_dataset(f'ECCO2/ECCO2_SST/SST.1440x720.{year}{month:02d}.nc')
        ds = ds.sel(LATITUDE_T = slice(35, 45), LONGITUDE_T = slice(340, 355))
        if ((year == 1992) and (month == 1)):
            ECCO2_SST = ds
        ECCO2_SST = xr.merge([ECCO2_SST, ds])

In [9]:
## load ECCO2 SSH
for year in np.arange(1992, 2024):  # start in 1992 so that the first file initialises ECCO2_SSH
    for month in np.arange(1, 13):
        ds = xr.open_dataset(f'ECCO2/ECCO2_SSH/SSH.1440x720.{year}{month:02d}.nc')
        ds = ds.sel(LATITUDE_T = slice(35, 45), LONGITUDE_T = slice(340, 355))
        if ((year == 1992) and (month == 1)):
            ECCO2_SSH = ds
        ECCO2_SSH = xr.merge([ECCO2_SSH, ds])

### IBI SSH

In [71]:
## load SSH
SSH = xr.open_dataset('SSH_daily.nc')

### CoRTAD SST

In [3]:
SST = xr.open_dataset('cortadv6_FilledSST.nc')
SST = SST.drop_dims(["nv"])

## Process data

In [4]:
# set wd to where I want the datasets to be saved
os.chdir("/Users/marie-louisekorte/Documents/Uni Leipzig/Lisbon/Data.nosync/")

### ERA5

In [None]:
## process ERA5 - add land sea mask
# add land-sea mask to the MTSS dataset
MTSS['lsm'] = LSM.lsm
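
Once the mask is part of the dataset, it can be used to keep ocean points only. The sketch below uses synthetic stand-ins for MTSS and the mask. It assumes that the mask has no time dimension (a mask with a single time step could be reduced first, e.g. with `.squeeze(drop=True)`) and that values below 0.5 count as sea; the threshold is an assumption of this example.

```python
import numpy as np
import xarray as xr

lat = [40.0, 40.25]
lon = [-9.5, -9.25, -9.0]
time = np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[ns]")

# synthetic stand-ins for MTSS and the land-sea mask (1 = land, 0 = sea)
stress = xr.Dataset(
    {"metss": (("time", "lat", "lon"), np.ones((2, 2, 3)))},
    coords={"time": time, "lat": lat, "lon": lon},
)
mask = xr.DataArray(
    [[0.0, 0.2, 1.0], [0.0, 0.7, 1.0]],
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# a time-independent mask broadcasts over every time step
stress["lsm"] = mask

# keep ocean points only; the 0.5 threshold is an assumption of this sketch
ocean_stress = stress.metss.where(stress.lsm < 0.5)
print(int(ocean_stress.isel(time=0).count()))
```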

In [None]:
## process ERA5 - weekly resample
# resample ERA5 data to the same weekly resolution as the SST upwelling index (weekly mean Sat-Fri & time stamp Tue)
# time = 'W-SAT' -> weekly bins with edges on Saturdays (default is Sunday)
# closed = 'left' -> interval is [start_date, end_date), i.e. start_date is included and end_date is excluded
# label = 'left' -> the time stamp from the start of the interval is assigned
MTSS_weekly_mean = MTSS.resample(time = 'W-SAT', closed = 'left', label = 'left').mean() 
# change time label -> I add 3 days to the time coordinate to move the time stamp from Sat to Tue, matching the SST upwelling index format
MTSS_weekly_mean['time'] = MTSS_weekly_mean.time + np.timedelta64(3, 'D')

# save as netcdf
MTSS_weekly_mean.to_netcdf("MTSS_weekly.nc")

# same for std
#MTSS_weekly_std =  MTSS.resample(time = 'W-SAT', closed = 'left', label = 'left').std() 
#MTSS_weekly_std['time'] = MTSS_weekly_std.time + np.timedelta64(3, 'D')
#MTSS_weekly_std.to_netcdf("MTSS_weekly_std.nc")

In [None]:
## process ERA5 - calculate Ekman upwelling index
# calculate upwelling index from wind stress dataset -> use my upwelling function (from my_functions.py)
# UI_Ek = my_functions.calc_upwelling_index(MTSS, MTSS.lat, MTSS.lon, MTSS.metss, MTSS.mntss)

# save as netcdf
# UI_Ek.to_netcdf("UI_Ek.nc")

### SST UI

In [5]:
## process SST UI - convert Kelvin to °C
# also update the attributes
UI_SST['Tmid'] = UI_SST.Tmid - 273.15
UI_SST.Tmid.attrs.update({"name" : "sea_surface_skin_temperature", "units" : "degree Celsius °C"})
UI_SST['Toff15W'] = UI_SST.Toff15W - 273.15
UI_SST.Toff15W.attrs.update({"name" : "sea_surface_skin_temperature", "units" : "degree Celsius °C"})

In [6]:
## process SST UI - change sign of index
UI_SST['UI'] = UI_SST.UI * -1
UI_SST.UI.attrs.update({"name" : "difference in sea_surface_skin_temperature", "units" : " degree Celsius °C", "method" : " Toff15 - Tmid", "info" : " > 2°C upwelling event"})

# save as netcdf
UI_SST.to_netcdf("UI_SST.nc")

### ECCO2 SST & SSH

In [19]:
## process ECCO2 - merge SST and SSH, rename lat, lon and time
ECCO2 = xr.merge([ECCO2_SST, ECCO2_SSH])
ECCO2 = ECCO2.rename({'LATITUDE_T' : 'lat', 'LONGITUDE_T' : 'lon', 'TIME' : 'time'})

In [20]:
## process ECCO2 SST & SSH -> resample to weekly res
ECCO2_weekly_mean = ECCO2.resample(time = 'W-SAT', closed = 'left', label = 'left').mean() 
ECCO2_weekly_mean['time'] = ECCO2_weekly_mean.time + np.timedelta64(3, 'D')

# save as netcdf
ECCO2_weekly_mean.to_netcdf("ECCO2_weekly.nc")

## same for std
#ECCO2_weekly_std =  ECCO2.resample(time = 'W-SAT', closed = 'left', label = 'left').std() 
#ECCO2_weekly_std['time'] = ECCO2_weekly_std.time + np.timedelta64(3, 'D')
#ECCO2_weekly_std.to_netcdf("ECCO2_weekly_std.nc")

### IBI SSH


In [72]:
## process SSH -> resample to weekly res
SSH_weekly_mean = SSH.resample(time = 'W-SAT', closed = 'left', label = 'left').mean() 
SSH_weekly_mean['time'] = SSH_weekly_mean.time + np.timedelta64(3, 'D')

# save as netcdf
SSH_weekly_mean.to_netcdf("SSH_weekly.nc")

## same for std
#SSH_weekly_std =  SSH.resample(time = 'W-SAT', closed = 'left', label = 'left').std() 
#SSH_weekly_std['time'] = SSH_weekly_std.time + np.timedelta64(3, 'D')
#SSH_weekly_std.to_netcdf("SSH_weekly_std.nc")


### CoRTAD SST

In [7]:
## process CoRTAD SST 
# reverse order of lat
SST = SST.reindex(lat=list(reversed(SST.lat)))

# select research area
SST = SST.sel(lat = slice(45, 35), lon = slice(-20, -5))

# convert Kelvin to °C and update the attributes
SST['SST'] = SST.FilledSST - 273.15
SST.SST.attrs.update({"standard_name" : "sea_surface_skin_temperature", "units" : " degree Celsius °C", "info" : "FilledSST - 273.15"})

# save as netcdf
SST.to_netcdf("CoRTAD_weekly.nc")
