# 0. Installing MITgcm in the ECCO configuration
The ECCO configuration of MITgcm is the basis for all of our results. The forcing data matrices can be calculated directly from the model input files, so they do not require installing or running the model. The model is needed, however, to obtain the adjoint sensitivity matrices and, ultimately, the DPC--EDF decomposition, and to test the model response to EDF patterns.

Full instructions on installing the model can be found in `ECCO_adjoint_instructions.pdf`.

# 1. Obtaining ECCO forcing files and calculating associated data matrices and EOF/PC decomposition

As the DPC--EDF method requires information from both a forcing data matrix and an adjoint sensitivity matrix, we begin by obtaining the forcing data matrix directly from the ECCOv4r4 forcing files. These files are in the binary "MDS" format used by MITgcm, with one file per year at 6-hourly frequency (1460 entries for non-leap years, 1464 entries for leap years).
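The record counts quoted above follow from four records per day. As a quick sanity check (a minimal sketch; the 1992 to 2017 year range is taken from the loading log shown later in this section, and the helper name `records_in_year` is ours), the total number of records should equal the time dimension of the data matrices built below:

```python
import calendar

RECORDS_PER_DAY = 4  # 6-hourly forcing: 24 h / 6 h


def records_in_year(year):
    """Number of 6-hourly records expected in one yearly forcing file."""
    days = 366 if calendar.isleap(year) else 365
    return days * RECORDS_PER_DAY


# Year range taken from the loading log later in this section.
years = range(1992, 2018)
total_records = sum(records_in_year(y) for y in years)

print(records_in_year(1993), records_in_year(1992), total_records)
# 1460 1464 37988
```

The total of 37988 matches the number of rows reported for the heat flux data matrix.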

We will eventually use these files to re-run the (flux-forced) ECCO state estimate, so it is worth downloading all ECCOv4r4 forcing files. The process is described in Section 2.5.1 of `ECCO_adjoint_instructions.pdf`, and is recapped here. The download requires a podaac account and 210 GB of storage space. The following commands will obtain the necessary files on Linux-based systems. Replace `<YOUR_USERNAME>` with your podaac username.

```
wget -r --no-parent --user <YOUR_USERNAME> --ask-password \
https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/other
mv ecco.jpl.nasa.gov/drive/files/Version4/Release4/other/ .
rm -r ecco.jpl.nasa.gov/
```

We also need the file `ECCO-GRID.nc`, which contains information about the LLC90 grid on which the state estimate is run.

```
wget -r --no-parent --user <YOUR_USERNAME> --ask-password \
https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/nctiles_grid/ECCO-GRID.nc
mv ecco.jpl.nasa.gov/drive/files/Version4/Release4/nctiles_grid/ECCO-GRID.nc .
rm -r ecco.jpl.nasa.gov/
```

We will use the function `get_ecco_forcing` from the attached `DPC_functions.py` to load the variables used in part to force the flux-forced simulation: the net heat flux `hflux` (Wm⁻², positive out of the ocean; requested as `'TFLUX'` in the call below), and the wind stress components `oceTAUX` and `oceTAUY` (Nm⁻²; oriented along the model grid's x and y directions, not zonal and meridional). This function returns the raw forcing and a climatology calculated from it. The function `forcing_anom` takes these two outputs and returns the anomaly data matrix used to calculate EOFs and PCs.
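To illustrate what a climatology and an anomaly data matrix look like, here is a minimal sketch on synthetic data. It assumes the climatology is the mean over years at each record of the year and ignores leap years. The helpers `toy_climatology` and `toy_anomaly` are hypothetical stand-ins, and the actual `get_ecco_forcing` and `forcing_anom` in `DPC_functions.py` may compute these quantities differently.

```python
import numpy as np


def toy_climatology(raw, records_per_year):
    """Mean over years at each record of the year (hypothetical helper)."""
    n_years = raw.shape[0] // records_per_year
    by_year = raw.reshape(n_years, records_per_year, -1)
    return by_year.mean(axis=0)


def toy_anomaly(raw, clim):
    """Subtract the repeating climatology from the raw series (hypothetical helper)."""
    n_years = raw.shape[0] // clim.shape[0]
    return raw - np.tile(clim, (n_years, 1))


rng = np.random.default_rng(0)
records_per_year, n_years, n_space = 8, 3, 5

# Synthetic forcing: a repeating seasonal cycle plus noise, shape (time, space).
seasonal = np.sin(2 * np.pi * np.arange(records_per_year) / records_per_year)
raw = np.tile(seasonal, n_years)[:, None] + rng.normal(
    size=(n_years * records_per_year, n_space)
)

clim = toy_climatology(raw, records_per_year)  # shape (records_per_year, space)
anom = toy_anomaly(raw, clim)                  # shape (time, space)
print(anom.shape)
```

By construction, the anomalies average to zero across years at every record of the year, which is the property the EOF calculation relies on.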

In [None]:
from DPC_functions import *

## Get the heat flux data matrix

In [2]:
# forcingdir='/where/you/downloaded/ECCO/files/other/flux-forced/forcing/'
forcingdir='/glade/work/dafydd/ECCOv4r4_input/other/flux-forced/forcing/'
hflux_X,hflux_X_clim=get_ecco_forcing('TFLUX',forcing_dir=forcingdir,show_progress=True)
hflux_X=forcing_anom(hflux_X,hflux_X_clim)

Year :1992, 4.53e-06  seconds elapsed
Year :1993, 3.42  seconds elapsed
Year :1994, 6.51  seconds elapsed
Year :1995, 9.63  seconds elapsed
Year :1996, 12.9  seconds elapsed
Year :1997, 16  seconds elapsed
Year :1998, 19.1  seconds elapsed
Year :1999, 22  seconds elapsed
Year :2000, 24.5  seconds elapsed
Year :2001, 27.2  seconds elapsed
Year :2002, 30  seconds elapsed
Year :2003, 33  seconds elapsed
Year :2004, 36.1  seconds elapsed
Year :2005, 39.8  seconds elapsed
Year :2006, 43.6  seconds elapsed
Year :2007, 46.3  seconds elapsed
Year :2008, 49  seconds elapsed
Year :2009, 51.8  seconds elapsed
Year :2010, 56.3  seconds elapsed
Year :2011, 59.4  seconds elapsed
Year :2012, 62.6  seconds elapsed
Year :2013, 65.6  seconds elapsed
Year :2014, 68.4  seconds elapsed
Year :2015, 71.1  seconds elapsed
Year :2016, 74  seconds elapsed
Year :2017, 77.7  seconds elapsed


## Reduce the spatial dimensions of the data matrix to the Atlantic in [-35,80]°N 
We use the indexing variable `Ti`, which is calculated in `DPC_functions`. This reduces the heat flux anomaly data matrix to shape (37988, 10469) [time x space].

In [3]:
hflux_X=hflux_X.reshape(-1,13*90*90)[:,Ti]
print(hflux_X.shape)

(37988, 10469)
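The reshape-and-index pattern above, and its inverse (scattering a reduced vector back onto the full grid, for example to plot an EOF), can be sketched as follows. This is an illustration only: `toy_index` is an arbitrary random subset of grid points standing in for `Ti`, whose actual construction lives in `DPC_functions`.

```python
import numpy as np

n_time, n_faces, n_y, n_x = 4, 13, 90, 90  # LLC90: 13 tiles of 90 x 90 points

rng = np.random.default_rng(0)
field = rng.normal(size=(n_time, n_faces, n_y, n_x))

# Hypothetical selection: a random subset of flattened grid indices,
# standing in for the Atlantic index `Ti`.
toy_index = np.flatnonzero(rng.random(n_faces * n_y * n_x) < 0.1)

# Flatten space, then keep only the selected points: shape (time, selected points).
data_matrix = field.reshape(-1, n_faces * n_y * n_x)[:, toy_index]

# Inverse operation: put one row back on the full grid, NaN elsewhere.
restored = np.full(n_faces * n_y * n_x, np.nan)
restored[toy_index] = data_matrix[0]
restored = restored.reshape(n_faces, n_y, n_x)

print(data_matrix.shape == (n_time, toy_index.size))
```

The same scatter step applies to any column of the EOF matrix, since the EOFs live on the reduced set of points.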


In [4]:
np.save(forcingdir+'hflux_anomaly_data_matrix.npy',hflux_X)

## Calculate EOFs as the eigendecomposition of the covariance matrix C = XᵀX/N (N = number of time records)

In [5]:
%%time
hflux_C = ( 1/len(hflux_X) ) * (hflux_X.T).dot(hflux_X)
hflux_λ,hflux_EOFs=la.eigsh(hflux_C,k=10469)



CPU times: user 4min 6s, sys: 1.65 s, total: 4min 7s
Wall time: 4min 14s


In [6]:
%%time
hflux_PCs=hflux_X.dot(hflux_EOFs)/np.sqrt(hflux_λ*len(hflux_X))

CPU times: user 1min 24s, sys: 1.49 s, total: 1min 25s
Wall time: 1min 28s
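The normalisation used in the cell above makes the PCs orthonormal in time, so that the data matrix is recovered as `X = sqrt(N) * PCs * diag(sqrt(λ)) * EOFsᵀ`. The following small sketch checks this on synthetic data. Note that it swaps in `np.linalg.eigh` for `la.eigsh` and uses a tiny random matrix, so it illustrates the algebra rather than reproducing the calculation above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_space = 200, 6
X = rng.normal(size=(n_time, n_space))
X -= X.mean(axis=0)  # anomalies: remove the time mean at every point

C = X.T @ X / n_time                  # (space x space) covariance matrix
lam, eofs = np.linalg.eigh(C)         # eigh returns eigenvalues in ascending order
lam, eofs = lam[::-1], eofs[:, ::-1]  # reorder so the leading mode comes first

pcs = X @ eofs / np.sqrt(lam * n_time)  # same normalisation as the cell above

# Rebuild X from the modes and compute the variance fraction of each mode.
X_rebuilt = (pcs * np.sqrt(lam * n_time)) @ eofs.T
explained = lam / lam.sum()

print(np.allclose(pcs.T @ pcs, np.eye(n_space)), np.allclose(X_rebuilt, X))
```

Because the number of time records exceeds the number of spatial points here, as it does for the real heat flux matrix, all eigenvalues are positive and the division by `sqrt(λ)` is well defined.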


In [13]:
np.save(forcingdir+'hflux_eigenvalues.npy' ,hflux_λ)
np.save(forcingdir+'hflux_eigenvectors.npy',hflux_EOFs)

## As above, for wind stress
We concatenate the two variables `oceTAUX` and `oceTAUY` into a single data matrix:

In [8]:
%%time
#forcingdir='</where/you/downloaded/ECCO/files>/other/flux-forced/forcing/'
forcingdir='/glade/work/dafydd/ECCOv4r4_input/other/flux-forced/forcing/'
taux_X,taux_X_clim=get_ecco_forcing('oceTAUX',forcing_dir=forcingdir,show_progress=False)
taux_X=forcing_anom(taux_X,taux_X_clim)
taux_X=taux_X.reshape(-1,13*90*90)[:,Ui]

tauy_X,tauy_X_clim=get_ecco_forcing('oceTAUY',forcing_dir=forcingdir,show_progress=False)
tauy_X=forcing_anom(tauy_X,tauy_X_clim)
tauy_X=tauy_X.reshape(-1,13*90*90)[:,Vi]

tauxy_X=np.hstack([taux_X,tauy_X])

CPU times: user 1min 53s, sys: 1min 31s, total: 3min 25s
Wall time: 3min 56s
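Because the two components are stacked side by side, each EOF of the joint matrix is a single vector whose first block belongs to the x-oriented stress and whose second block belongs to the y-oriented stress. A minimal sketch with toy sizes (not the real point counts, and using `np.linalg.eigh` on synthetic data) shows how a joint EOF splits back into its two parts:

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_u, n_v = 50, 4, 3  # toy sizes, not the real U and V point counts
taux_toy = rng.normal(size=(n_time, n_u))
tauy_toy = rng.normal(size=(n_time, n_v))

# Stack the two variables along the space axis: shape (n_time, n_u + n_v).
joint = np.hstack([taux_toy, tauy_toy])

C = joint.T @ joint / n_time
lam, eofs = np.linalg.eigh(C)

# Each joint EOF (a column of eofs) splits into its x and y components by row.
eof_x, eof_y = eofs[:n_u, :], eofs[n_u:, :]
print(joint.shape, eof_x.shape, eof_y.shape)
# (50, 7) (4, 7) (3, 7)
```

Treating the two components jointly means each mode describes a coherent vector wind stress pattern, rather than separate patterns for each component.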


In [None]:
np.save(forcingdir+'tauxy_anomaly_data_matrix.npy',tauxy_X)

In [9]:
%%time
tauxy_C = ( 1/len(tauxy_X) ) * (tauxy_X.T).dot(tauxy_X)
tauxy_λ,tauxy_EOFs=la.eigsh(tauxy_C,k=20280)
tauxy_PCs=tauxy_X.dot(tauxy_EOFs)/np.sqrt(tauxy_λ*len(tauxy_X))



CPU times: user 31min 44s, sys: 11.2 s, total: 31min 55s
Wall time: 32min 58s


In [15]:
np.save(forcingdir+'tauxy_eigenvalues.npy' ,tauxy_λ)
np.save(forcingdir+'tauxy_eigenvectors.npy',tauxy_EOFs)