## CESM2 - LARGE ENSEMBLE (LENS2)

#### by Mauricio Rocha and Dr. Gustavo Marques

#### The objective of this notebook is to cut out the South Atlantic region while keeping part of the Tropical Atlantic, so that the Intertropical Convergence Zone (ITCZ) region is included.

## Imports

In [None]:
import intake
import intake_esm
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import numpy as np
import fsspec
import cmocean
import cartopy
import cartopy.feature as cfeature
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter
import pop_tools
import sys
from distributed import Client
from ncar_jobqueue import NCARCluster
sys.path.append('/glade/u/home/mauricio/south_atlantic_heat_balance/functions')
import util
from cartopy.util import add_cyclic_point
from misc import get_ij

## Improve the workflow using clusters 

### We tested different numbers of workers and amounts of memory to decide which configuration gave the best computational performance, measured as follows: how long does it take to compute and plot the SST average over the first 10 members (`member_id=slice(0, 10)`, end index exclusive) and the first 10000 days?
##### Line of code used for the test: sst_mean = ds.SST.isel(member_id=slice(0, 10)).isel(time=slice(0,10000)).mean(dim=["member_id","time"]).plot()
#### Results:
###### * 40 workers and 20 GB of memory took 14 s.
###### * 70 workers and 5 GB of memory took 15.2 s.
###### * 10 workers and 35 GB of memory took 35 s.
###### * 40 workers and 3 GB of memory did not work because the memory was too low.
###### * 40 workers and 6 GB of memory took 14.4 s.
###### * 50 workers and 0.5 GB of memory did not work because the memory was too low.
###### * 50 workers and 1 GB of memory did not work because the memory was too low.
###### * 60 workers and 6 GB of memory took 13.7 s and was the WINNER!
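
The comparison above can also be written as a small lookup. This is only a sketch that re-enters the timings listed above (failed runs are recorded as `None`) and picks the fastest configuration. The dictionary name `timings` and its layout are illustrative and not part of the original notebook.

```python
# Timings copied from the list above, keyed by (workers, memory in GB).
# None marks a run that failed because the memory was too low.
timings = {
    (40, 20): 14.0,
    (70, 5): 15.2,
    (10, 35): 35.0,
    (40, 3): None,
    (40, 6): 14.4,
    (50, 0.5): None,
    (50, 1): None,
    (60, 6): 13.7,
}

# Keep only the runs that finished, then take the one with the smallest time.
completed = {cfg: sec for cfg, sec in timings.items() if sec is not None}
best_cfg = min(completed, key=completed.get)
best_workers, best_mem_gb = best_cfg

print(f"fastest: {best_workers} workers, {best_mem_gb} GB "
      f"-> {completed[best_cfg]} s")
```

Note that the three fastest runs (13.7 s, 14 s and 14.4 s) are within one second of each other, so a single timing per configuration may not separate them reliably.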

In [None]:
cluster = NCARCluster(cores=2, # The number of cores you want
                      processes=1, # How many processes
                      resource_spec='select=1:ncpus=1:mem=6GB') # Specify resources
cluster.scale(60) # Workers
client = Client(cluster)
client

## Data Ingest

### Path

In [None]:
%%time
catalog = intake.open_esm_datastore(
    '/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/glade-cesm2-le.json'
)

In [None]:
catalog.df

### What does the variable look like?

In [None]:
cat_subset = catalog.search(component='ocn',
                            variable='SST',
                            frequency='day_1')
# To list every variable available for this component and frequency, run instead:
# catalog.search(component='ocn', frequency='day_1').df.variable.unique()

In [None]:
%%time
dset_dict_raw = cat_subset.to_dataset_dict()

In [None]:
ds = dset_dict_raw['ocn.historical.pop.h.nday1.cmip6.SST']   # daily

In [None]:
ds

##### The dataset summary shows that the data are split into 850 chunks of 1.67 GB each. The total size is 1.35 TB, and each member takes 27 GB.
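
As a rough cross-check of those sizes, the arithmetic can be sketched as below. Everything in the first comment is an assumption rather than something stated in this notebook: a 384 x 320 POP_gx1v7 grid, 4-byte floats, a 365-day no-leap calendar, daily output for 1850-2014, and 50 historical members (the number implied by 1.35 TB / 27 GB). Under those assumptions the result is about 27.6 GiB per member and about 1.35 TiB in total, which is consistent with the figures above if the summary reports binary units.

```python
# Assumed (not stated in the notebook): POP_gx1v7 horizontal grid of
# 384 x 320 points, float32 values, a 365-day no-leap calendar,
# daily output for 1850-2014, and 50 historical members.
nlat, nlon = 384, 320
bytes_per_value = 4
days = (2014 - 1850 + 1) * 365
members = 50

GIB = 1024 ** 3  # bytes in one binary gigabyte

per_member_gib = days * nlat * nlon * bytes_per_value / GIB
total_tib = per_member_gib * members / 1024

print(f"days per member: {days}")
print(f"size per member: {per_member_gib:.1f} GiB")
print(f"total size:      {total_tib:.2f} TiB")
```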

## Import the POP grid

##### If you choose the ocean component of LENS2, you will need to import the POP grid. For the other components, you can use the ensemble's own grid.

##### In ds, TLONG and TLAT have missing values (NaNs), so we need to override them with the values from pop_grid, which does not have missing values.

In [None]:
# Read the POP 1-degree grid from pop_tools
# We will use the variables TLONG and TLAT
pop_grid = pop_tools.get_grid('POP_gx1v7')
ds['TLONG'] = pop_grid.TLONG   # Longitudes
ds['TLAT'] = pop_grid.TLAT     # Latitudes
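
Before overwriting coordinates, it can be worth checking that the two sources agree wherever the file does have values. The sketch below illustrates that check on toy NumPy arrays. The names `tlat_file` and `tlat_grid` and their values are hypothetical stand-ins for `ds.TLAT` and `pop_grid.TLAT`, not real model data.

```python
import numpy as np

# Toy 2 x 3 latitude field: coordinates as read from the file (with gaps)
# and the complete coordinates from the grid definition.
tlat_file = np.array([[-60.0, np.nan, -60.0],
                      [-59.5, -59.5, np.nan]])
tlat_grid = np.array([[-60.0, -60.0, -60.0],
                      [-59.5, -59.5, -59.5]])

# Compare only the points where the file coordinate is defined.
valid = ~np.isnan(tlat_file)
n_missing = int((~valid).sum())
agrees = bool(np.allclose(tlat_file[valid], tlat_grid[valid]))

# Overwrite only if the two agree wherever the file has values.
tlat_fixed = tlat_grid.copy() if agrees else tlat_file

print(f"missing points: {n_missing}, grids agree: {agrees}")
```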

## Test

#### Here we run a test to evaluate the number of workers and the amount of memory. We plot the SST averaged over the first 10 members (`member_id=slice(0, 10)`, end index exclusive) and over the first 10000 time steps.

In [None]:
%%time
#SST_mean=ds.SST.isel(time=slice(0,10000)).mean(dim="time")[1,:,:].compute()
#SST_mean.plot()
# Plot directly; the result of .plot() is a matplotlib object, so it is not stored in sst_mean
ds.SST.isel(member_id=slice(0, 10), time=slice(0, 10000)).mean(dim=["member_id", "time"]).plot()

In [None]:
sst_mean = ds.SST.isel(member_id=slice(0, 10)).isel(time=slice(0,10000)).mean(dim=['member_id','time'])
sst_mean
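
The reduction in this cell can be illustrated on a toy NumPy array. This is a sketch with made-up values and a hypothetical (member_id, time, nlat, nlon) layout. It shows that `slice(0, 10)` selects 10 members, since the end index is exclusive, and that averaging over both leading dimensions leaves a 2-D map. `np.nanmean` is used here as a stand-in for xarray's `mean`, which skips NaNs in floating-point data by default.

```python
import warnings

import numpy as np

rng = np.random.default_rng(0)

# Toy SST array with dimensions (member_id, time, nlat, nlon).
sst = rng.uniform(-2.0, 30.0, size=(12, 50, 4, 5))
sst[:, :, 0, 0] = np.nan  # a "land" point, missing for every member and time

# First 10 members and first 40 time steps (slice ends are exclusive).
subset = sst[0:10, 0:40]

# Average over members and time, ignoring NaNs. The all-NaN land point
# triggers a RuntimeWarning and stays NaN, so the warning is silenced.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    sst_mean_toy = np.nanmean(subset, axis=(0, 1))

print(subset.shape, sst_mean_toy.shape)
```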

## Map 

In [None]:
%%time
plt.figure(figsize=(10,6));
ax = plt.axes(projection=ccrs.Robinson());
#pc = ds.SST.isel(time=0, member_id=0).plot.pcolormesh(ax=ax,
pc = sst_mean.plot.pcolormesh(ax=ax,
                              transform=ccrs.PlateCarree(),
                              cmap=cmocean.cm.balance,
                              x='TLONG',
                              y='TLAT',
                              vmin=-3,
                              vmax=30,
                              cbar_kwargs={"orientation": "horizontal"})
ax.coastlines()
ax.gridlines(draw_labels=True);

In [None]:
#dsp = util.pop_add_cyclic(ds)
#dsp

## Center the South Atlantic

### Concatenation 

In [None]:
#sw_lo, sw_la = get_ij(-80, -60, dsp)   # southwest (lon, lat)
#se_lo, se_la = get_ij(30, -60, dsp)  # southeast (lon, lat)
#nw_lo, nw_la = get_ij(-80, 30, dsp)   # northwest (lon, lat)
#ne_lo, ne_la = get_ij(30, 30, dsp)  # northeast (lon, lat)
#print('Southwest Edge (indices): sw_lo = {}, sw_la = {}'.format(sw_lo,sw_la))
#print('Southeast Edge (indices): se_lo = {}, se_la = {}'.format(se_lo,se_la))
#print('Northwest Edge (indices): nw_lo = {}, nw_la = {}'.format(nw_lo,nw_la))
#print('Northeast Edge (indices): ne_lo = {}, ne_la = {}'.format(ne_lo,ne_la))

In [None]:
#area = dsp.TAREA.isel(nlon = slice(sw_lo,se_lo),nlat = slice(sw_la,ne_la))
#start = "1960-01-01" # first time
#end   = "2015-01-01" # last time
#area

In [None]:
#%%time
#ds_remapped = ds.SST.isel(nlon = slice(sw_lo,se_lo), nlat = slice(sw_la,ne_la),
#                            ).sel(time = slice(start,end)).weighted(area).mean(dim=['time']).load()

In [None]:
#xr.concat([ds.SST[:,:,20:250,290:320], ds.SST[:,:,20:250,0:60]], dim='nlon')

In [None]:
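# Join the two longitude blocks on either side of the grid seam
# (nlon 290-319 to the west, nlon 0-59 to the east) into one continuous region.
sa_ds = xr.combine_nested([
    [ds.isel(nlat=slice(20, 280), nlon=slice(290, 320)),
     ds.isel(nlat=slice(20, 280), nlon=slice(0, 60))]],
    concat_dim=['nlat', 'nlon']
)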
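
To see why the two index blocks are joined in this order, here is a minimal NumPy sketch on a toy longitude axis. The starting longitude of 320.5°E and the uniform spacing are assumptions made for illustration only and are not read from the POP grid. The point is that a region crossing the seam of the index space becomes continuous once the last columns are placed before the first ones.

```python
import numpy as np

nlon = 320
dlon = 360.0 / nlon  # uniform spacing, assumed for this toy axis

# Hypothetical longitude axis whose first column sits at 320.5 degrees East,
# so the index seam (nlon = 319 -> 0) falls in the middle of the Atlantic.
lon = (320.5 + dlon * np.arange(nlon)) % 360.0

# Same index blocks as in the cell above: last 30 columns, then first 60.
west = lon[290:320]
east = lon[0:60]
region = np.concatenate([west, east])

# Express longitudes in the -180..180 convention to check continuity.
region_180 = np.where(region > 180.0, region - 360.0, region)

print(region_180[0], region_180[-1], region.size)
```

With these toy values the joined axis increases monotonically from west to east, which is what the plot of `sa_ds` relies on.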

In [None]:
%%time
plt.figure(figsize=(10,6));
ax = plt.axes(projection=ccrs.Robinson());
pc = sa_ds.SST.isel(time=0, member_id=0).plot.pcolormesh(ax=ax,
                              transform=ccrs.PlateCarree(),
                              cmap=cmocean.cm.balance,
                              x='TLONG',
                              y='TLAT',
                              vmin=-3,
                              vmax=30,
                              cbar_kwargs={"orientation": "horizontal"})
ax.coastlines()
ax.gridlines(draw_labels=True);