# Lab 2 Part 1

In this lab, we'll work with [ECCO](https://www.ecco-group.org/products-ECCO-V4r4.htm), a state estimate, which is a type of model that combines observations and dynamical equations to estimate the state of the climate. ECCO is a state estimate of the ocean between 1992 and 2018. Unlike the simple ocean model we developed at the end of Lab 1, ECCO accounts for horizontal variation, and it includes many more variables besides temperature. The major objective of this lab is to learn to use model output to answer questions about ocean and climate dynamics.

In this first part, we'll learn how to retrieve and plot ECCO data. **Quantitative** students will also learn how to select data at specific coordinates. **Qualitative** students will use these plots to discuss features of ocean circulation.

**Both tracks are asked to save some plots. Create a separate document for these plots and give each plot a figure number and a descriptive caption. Refer to the figures by their figure number in the documents that you turn in, whether that is a Jupyter notebook (quantitative) or a written document (qualitative).**

In [None]:
# we need to update our environment to include the packages below
!pip install ipympl
!pip install ecco_v4_py

In [None]:
# load packages
%matplotlib ipympl
import math
import os
import requests
import datetime
import xgcm
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
import ipywidgets as widgets
from platform import system
from netrc import netrc
from urllib import request
from http.cookiejar import CookieJar
from io import StringIO
from warnings import filterwarnings
from matplotlib.colors import Normalize
import ecco_v4_py as ecco
from ecco_v4_py import vector_calc, scalar_calc
filterwarnings("ignore", category=FutureWarning)

## Introduction to ECCO (all)

### Variables

ECCO includes many variables in its data, which are listed in the following three documents:

- Most variables have [monthly and daily averages](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/varlist/v4r4_nctiles_monthly_varlist.txt), recorded between 1992 and 2018. In this lab, we'll work only with the 2017 data.
- A few of these variables also have [daily snapshots](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/varlist/v4r4_nctiles_snapshots_varlist.txt), recorded for the same time period. These may differ slightly from the daily averages, but they should be pretty close.
- The remaining variables are [time series data and grid parameters](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/varlist/v4r4_tseries_grid_varlist.txt).
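
To see why a daily snapshot and a daily average can differ slightly, here is a minimal sketch using a made-up temperature signal (not ECCO data). The function `synthetic_temperature` and its numbers are assumptions chosen purely for illustration: a slow trend plus a small daily cycle. The snapshot is the value at a single instant, and the average is taken over 24 hourly samples.

```python
import numpy as np

def synthetic_temperature(t_days):
    """Made-up signal (deg C): slow trend plus a small daily cycle. Not ECCO data."""
    return 15.0 + 0.1 * t_days + 0.5 * np.sin(2 * np.pi * t_days)

# 24 hourly sample times spanning one day, in units of days
hours = np.arange(24) / 24.0

snapshot = synthetic_temperature(0.0)                 # value at a single instant
daily_average = synthetic_temperature(hours).mean()   # mean over the whole day

print(f"snapshot:      {snapshot:.3f}")
print(f"daily average: {daily_average:.3f}")
print(f"difference:    {daily_average - snapshot:.3f}")
```

With these made-up numbers the two values differ by only a few hundredths of a degree.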

The variables are grouped into datasets with descriptive names like `ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4`. These are already downloaded on Oscar (to save space and time). Each file contains a number of different variables. Beside each variable name in the links above is a description of what that data represents, and its units are given in parentheses. (For example, `SSH` is dynamic sea surface height anomaly, and its units are meters.)

Run the command below to see the contents of the data directory.

In [None]:
!ls /oscar/data/eeps1400_24fall/DATA/ECCO_V4r4_PODAAC

**Task:** Take a look at the following potential research questions about the ocean. Next to each one, write which variables might be needed to answer it.

- *Which coastal areas may be at risk of flooding in the future?* 
- *What is the volume flux into the Atlantic Ocean?* 
- *How is the ocean warming over time?* 
- *What factors affect ocean salinity?* 

Variables are recorded on a large grid covering the entire globe, which is composed of thirteen tiles. Twelve of these tiles are mostly aligned with latitude and longitude lines, although six of them are rotated 90 degrees. The remaining tile (Tile 6) is a 'cap' over the North Pole.

<figure>
    <img src="Lab2Images/fig1.png" width="500"/>
    <figcaption>
        <a href="https://ecco-v4-python-tutorial.readthedocs.io/fields.html#tile-native-lat-lon-cap-90-grid"> Fig. 1: ECCO Grid Tiles</a>
    </figcaption>
</figure>

Most variables are also provided at 50 depth levels. Altogether, each of these variables is recorded across five dimensions: time, tile number, x- and y-coordinates within each tile, and depth.

**Task:** Answer the following question: Which way does north point on each tile?

- Tiles 0-5: 
- Tile 6: 
- Tiles 7-12: 

In [None]:
# The code below stores information about the tiles so that we can create more legible maps

# Information to generate an xgcm grid
tile_connections = {'tile': {
    0: {'X': ((12, 'Y', False), (3, 'X', False)), 'Y': (None, (1, 'Y', False))},
    1: {'X': ((11, 'Y', False), (4, 'X', False)), 'Y': ((0, 'Y', False), (2, 'Y', False))},
    2: {'X': ((10, 'Y', False), (5, 'X', False)), 'Y': ((1, 'Y', False), (6, 'X', False))},
    3: {'X': ((0, 'X', False), (9, 'Y', False)), 'Y': (None, (4, 'Y', False))},
    4: {'X': ((1, 'X', False), (8, 'Y', False)), 'Y': ((3, 'Y', False), (5, 'Y', False))},
    5: {'X': ((2, 'X', False), (7, 'Y', False)), 'Y': ((4, 'Y', False), (6, 'Y', False))},
    6: {'X': ((2, 'Y', False), (7, 'X', False)), 'Y': ((5, 'Y', False), (10, 'X', False))},
    7: {'X': ((6, 'X', False), (8, 'X', False)), 'Y': ((5, 'X', False), (10, 'Y', False))},
    8: {'X': ((7, 'X', False), (9, 'X', False)), 'Y': ((4, 'X', False), (11, 'Y', False))},
    9: {'X': ((8, 'X', False), None), 'Y': ((3, 'X', False), (12, 'Y', False))},
    10: {'X': ((6, 'Y', False), (11, 'X', False)), 'Y': ((7, 'Y', False), (2, 'X', False))},
    11: {'X': ((10, 'X', False), (12, 'X', False)), 'Y': ((8, 'Y', False), (1, 'X', False))},
    12: {'X': ((11, 'X', False), None), 'Y': ((9, 'Y', False), (0, 'X', False))}
}}

# subplots[i] is the index of tile #i in the array of subplots
subplots = [(3, 0), (2, 0), (1, 0), (3, 1), (2, 1), (1, 1), (0, 0),
            (1, 2), (2, 2), (3, 2), (1, 3), (2, 3), (3, 3)]
# rotations[i] is the orientation of tile #i, as a multiple of 90 degrees
rotations = [0, 0, 0, 0, 0, 0, 3, 1, 1, 1, 1, 1, 1]

# Trigonometry for multiples of 90 degrees 
def cos90(angle):
    if angle % 4 == 0: return 1
    elif angle % 4 == 2: return -1
    else: return 0
def sin90(angle):
    if angle % 4 == 1: return 1
    elif angle % 4 == 3: return -1
    else: return 0
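
The plotting code later in this notebook uses `cos90` and `sin90` to turn velocity vectors on the rotated tiles. The following is a minimal sketch of that rotation, with the two helpers repeated so the block runs on its own. The name `rotate_vector` is hypothetical and introduced here only for illustration.

```python
def cos90(angle):
    """Cosine of (angle * 90 degrees), returned as an exact integer."""
    if angle % 4 == 0: return 1
    elif angle % 4 == 2: return -1
    else: return 0

def sin90(angle):
    """Sine of (angle * 90 degrees), returned as an exact integer."""
    if angle % 4 == 1: return 1
    elif angle % 4 == 3: return -1
    else: return 0

def rotate_vector(u, v, angle):
    """Hypothetical helper: same formula the plotting code applies to (u, v)."""
    u_new = u * cos90(angle) + v * sin90(angle)
    v_new = v * cos90(angle) - u * sin90(angle)
    return u_new, v_new

# Rotate a unit vector pointing along +x by 0, 90, 180, and 270 degrees
for angle in range(4):
    print(angle, rotate_vector(1, 0, angle))
```

With this sign convention, each step of `angle` turns the vector a quarter turn clockwise, and four steps return it to where it started.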

### Retrieving data

**Running on OSCAR:** We have provided some code to make reading and plotting ECCO data easier. Run the following cells to set it up. If everything works correctly, you should see `Setup complete` after a few seconds. (Each time you open this notebook, you need to re-run these cells.)

In [None]:
# This is the path to the data
downloads = '/oscar/data/eeps1400_24fall/DATA/ECCO_V4r4_PODAAC'

In [None]:
# Plotting codes
# Information to look up a variable by name
all_variables = ['global_mean_barystatic_sea_level_anomaly', 'global_mean_sterodynamic_sea_level_anomaly', 'global_mean_sea_level_anomaly', 'Pa_global', 'xoamc', 'yoamc', 'zoamc', 'xoamp', 'yoamp', 'zoamp', 'mass', 'xcom', 'ycom', 'zcom', 'sboarea', 'xoamc_si', 'yoamc_si', 'zoamc_si', 'mass_si', 'xoamp_fw', 'yoamp_fw', 'zoamp_fw', 'mass_fw', 'xcom_fw', 'ycom_fw', 'zcom_fw', 'mass_gc', 'xoamp_dsl', 'yoamp_dsl', 'zoamp_dsl', 'CS', 'SN', 'rA', 'dxG', 'dyG', 'Depth', 'rAz', 'dxC', 'dyC', 'rAw', 'rAs', 'drC', 'drF', 'PHrefC', 'PHrefF', 'hFacC', 'hFacW', 'hFacS', 'maskC', 'maskW', 'maskS', 'DIFFKR', 'KAPGM', 'KAPREDI', 'SSH', 'SSHIBC', 'SSHNOIBC', 'ETAN', 'EXFatemp', 'EXFaqh', 'EXFuwind', 'EXFvwind', 'EXFwspee', 'EXFpress', 'EXFtaux', 'EXFtauy', 'oceTAUX', 'oceTAUY', 'EXFhl', 'EXFhs', 'EXFlwdn', 'EXFswdn', 'EXFqnet', 'oceQnet', 'SIatmQnt', 'TFLUX', 'EXFswnet', 'EXFlwnet', 'oceQsw', 'SIaaflux', 'EXFpreci', 'EXFevap', 'EXFroff', 'SIsnPrcp', 'EXFempmr', 'oceFWflx', 'SIatmFW', 'SFLUX', 'SIacSubl', 'SIrsSubl', 'SIfwThru', 'SIarea', 'SIheff', 'SIhsnow', 'sIceLoad', 'SIuice', 'SIvice', 'ADVxHEFF', 'ADVyHEFF', 'DFxEHEFF', 'DFyEHEFF', 'ADVxSNOW', 'ADVySNOW', 'DFxESNOW', 'DFyESNOW', 'oceSPflx', 'oceSPDep', 'MXLDEPTH', 'OBP', 'OBPGMAP', 'PHIBOT', 'UVEL', 'VVEL', 'WVEL', 'THETA', 'SALT', 'RHOAnoma', 'DRHODR', 'PHIHYD', 'PHIHYDcR', 'UVELMASS', 'VVELMASS', 'WVELMASS', 'Um_dPHdx', 'Vm_dPHdy', 'ADVx_TH', 'ADVy_TH', 'ADVr_TH', 'DFxE_TH', 'DFyE_TH', 'DFrE_TH', 'DFrI_TH', 'ADVx_SLT', 'ADVy_SLT', 'ADVr_SLT', 'DFxE_SLT', 'DFyE_SLT', 'DFrE_SLT', 'DFrI_SLT', 'oceSPtnd', 'UVELSTAR', 'VVELSTAR', 'WVELSTAR', 'GM_PsiX', 'GM_PsiY']
all_datasets = ['GMSL_TIME_SERIES', 'GMAP_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'GEOMETRY_LLC0090GRID', 'OCEAN_3D_MIX_COEFFS_LLC0090GRID', 'SSH_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'STRESS_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'SEA_ICE_CONC_THICKNESS_LLC0090GRID', 'SEA_ICE_VELOCITY_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID', 'MIXED_LAYER_DEPTH_LLC0090GRID', 'OBP_LLC0090GRID', 'OCEAN_VEL_LLC0090GRID', 'TEMP_SALINITY_LLC0090GRID', 'DENS_STRAT_PRESS_LLC0090GRID', 'OCEAN_3D_VOLUME_FLUX_LLC0090GRID', 'OCEAN_3D_MOMENTUM_TEND_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'BOLUS_LLC0090GRID', 'OCEAN_BOLUS_STREAMFUNCTION_LLC0090GRID']
datasets = pd.Series(['GMSL_TIME_SERIES', 'GMSL_TIME_SERIES', 'GMSL_TIME_SERIES', 'GMAP_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'SBO_CORE_TIME_SERIES', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'GEOMETRY_LLC0090GRID', 'OCEAN_3D_MIX_COEFFS_LLC0090GRID', 'OCEAN_3D_MIX_COEFFS_LLC0090GRID', 'OCEAN_3D_MIX_COEFFS_LLC0090GRID', 'SSH_LLC0090GRID', 'SSH_LLC0090GRID', 'SSH_LLC0090GRID', 'SSH_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'ATM_STATE_LLC0090GRID', 'STRESS_LLC0090GRID', 'STRESS_LLC0090GRID', 'STRESS_LLC0090GRID', 'STRESS_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'HEAT_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 
'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'FRESH_FLUX_LLC0090GRID', 'SEA_ICE_CONC_THICKNESS_LLC0090GRID', 'SEA_ICE_CONC_THICKNESS_LLC0090GRID', 'SEA_ICE_CONC_THICKNESS_LLC0090GRID', 'SEA_ICE_CONC_THICKNESS_LLC0090GRID', 'SEA_ICE_VELOCITY_LLC0090GRID', 'SEA_ICE_VELOCITY_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID', 'SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID', 'SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID', 'MIXED_LAYER_DEPTH_LLC0090GRID', 'OBP_LLC0090GRID', 'OBP_LLC0090GRID', 'OBP_LLC0090GRID', 'OCEAN_VEL_LLC0090GRID', 'OCEAN_VEL_LLC0090GRID', 'OCEAN_VEL_LLC0090GRID', 'TEMP_SALINITY_LLC0090GRID', 'TEMP_SALINITY_LLC0090GRID', 'DENS_STRAT_PRESS_LLC0090GRID', 'DENS_STRAT_PRESS_LLC0090GRID', 'DENS_STRAT_PRESS_LLC0090GRID', 'DENS_STRAT_PRESS_LLC0090GRID', 'OCEAN_3D_VOLUME_FLUX_LLC0090GRID', 'OCEAN_3D_VOLUME_FLUX_LLC0090GRID', 'OCEAN_3D_VOLUME_FLUX_LLC0090GRID', 'OCEAN_3D_MOMENTUM_TEND_LLC0090GRID', 'OCEAN_3D_MOMENTUM_TEND_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_TEMPERATURE_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 
'OCEAN_3D_SALINITY_FLUX_LLC0090GRID', 'BOLUS_LLC0090GRID', 'BOLUS_LLC0090GRID', 'BOLUS_LLC0090GRID', 'OCEAN_BOLUS_STREAMFUNCTION_LLC0090GRID', 'OCEAN_BOLUS_STREAMFUNCTION_LLC0090GRID'],
                     index=all_variables)
timings = pd.Series(['Daily', 'Snapshot', 'Snapshot', 'None', 'None', 'All', 'Daily', 'Daily', 'Daily', 'Daily', 'All', 'All', 'Daily', 'Daily', 'Daily', 'All', 'Daily', 'All', 'Daily', 'Daily', 'Daily', 'Daily', 'Daily', 'Daily', 'Daily'],
                    index=all_datasets)
granule_prefixes = pd.Series(['GLOBAL_MEAN_SEA_LEVEL', 'GLOBAL_MEAN_ATM_SURFACE_PRES', 'SBO_CORE_PRODUCTS', 'GRID_GEOMETRY', 'OCEAN_3D_MIXING_COEFFS', 'SEA_SURFACE_HEIGHT', 'ATM_SURFACE_TEMP_HUM_WIND_PRES', 'OCEAN_AND_ICE_SURFACE_STRESS', 'OCEAN_AND_ICE_SURFACE_HEAT_FLUX', 'OCEAN_AND_ICE_SURFACE_FW_FLUX', 'SEA_ICE_CONC_THICKNESS', 'SEA_ICE_VELOCITY', 'SEA_ICE_HORIZ_VOLUME_FLUX', 'SEA_ICE_SALT_PLUME_FLUX', 'OCEAN_MIXED_LAYER_DEPTH', 'OCEAN_BOTTOM_PRESSURE', 'OCEAN_VELOCITY', 'OCEAN_TEMPERATURE_SALINITY', 'OCEAN_DENS_STRAT_PRESS', 'OCEAN_3D_VOLUME_FLUX', 'OCEAN_3D_MOMENTUM_TEND', 'OCEAN_3D_TEMPERATURE_FLUX', 'OCEAN_3D_SALINITY_FLUX', 'OCEAN_BOLUS_VELOCITY', 'OCEAN_BOLUS_STREAMFUNCTION'],
                             index=all_datasets)

# So that you don't have to remember whether to put dimension names in quotes or not
i, i_g, j, j_g, k, k_u, k_l, k_p1, tile, XC, YC, XG, YG, Z, Zp1, Zu, Zl, XC_bnds, YC_bnds, Z_bnds, tile, time = 'i', 'i_g', 'j', 'j_g', 'k', 'k_u', 'k_l', 'k_p1', 'tile', 'XC', 'YC', 'XG', 'YG', 'Z', 'Zp1', 'Zu', 'Zl', 'XC_bnds', 'YC_bnds', 'Z_bnds', 'tile', 'time'

# Used to select i, j, i_g, and j_g for quiver plots to space out data
skip = range(2, 88, 5)

def adjust_timing(variable: str, timing: str) -> str:
    dataset = datasets[variable]
    if timing not in {'None', 'Monthly', 'Daily', 'Snapshot'}:
        raise ValueError(str(timing) + ' is not a valid timing (select either Monthly, Daily, or Snapshot)')
    elif timing == 'Snapshot' and timings[dataset] == 'Daily':
        raise ValueError('No snapshots available for ' + str(variable))
    elif timing in {'Monthly', 'Daily'} and timings[dataset] == 'Snapshot':
        raise ValueError('No monthly or daily averages available for ' + str(variable))
    elif timings[dataset] == 'None':
        return 'None'
    elif timing == 'None' and timings[dataset] == 'Snapshot':
        return 'Snapshot'
    elif timing == 'None' and timings[dataset] in {'Daily', 'All'}:
        return 'Monthly'
    else:
        return timing

def get_granule(granule: str, directory: str) -> str:
    file = os.path.join(directory, os.path.basename(granule))
    if not os.path.isfile(file):
        print('File not downloaded: ' + granule)
    return file

def ecco_dataset(dataset: str, start: datetime.date = None, end: datetime.date = None, timing: str = 'None'):
    short_timing_names = {'None': '', 'Monthly': '_MONTHLY', 'Daily': '_DAILY', 'Snapshot': '_SNAPSHOT'}
    long_timing_names = {'None': '', 'Monthly': '_mon_mean', 'Daily': '_day_mean', 'Snapshot': '_snap'}
    if timing not in short_timing_names:
        raise ValueError('Unrecognized timing: ' + str(timing))
    shortname = 'ECCO_L4_' + dataset + short_timing_names[timing] + '_V4R4'
    if 'LLC0090' in dataset:
        if timing == 'Monthly':
            start = datetime.date(start.year, start.month, 1)
            dates = [date.strftime('_%Y-%m') for date in pd.date_range(start, end, freq='MS')]
        elif timing == 'Daily':
            dates = [date.strftime('_%Y-%m-%d') for date in pd.date_range(start, end)]
        elif timing == 'Snapshot':
            dates = [date.strftime('_%Y-%m-%dT000000') for date in pd.date_range(start, end)]
        elif timing == 'None':
            dates = ['']
        longnames = [granule_prefixes[dataset] + long_timing_names[timing] + date + '_ECCO_V4r4_native_llc0090.nc'
                    for date in dates]
    else:
        longnames = [granule_prefixes[dataset] + long_timing_names[timing] + '_ECCO_V4r4_1D.nc']
    granules = ['https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/' + shortname + '/' + longname
                for longname in longnames]
    granule_dir = downloads + '/' + shortname
    try: os.mkdir(granule_dir)
    except FileExistsError: pass
    files = [get_granule(granule, granule_dir) for granule in granules]
    array = xr.open_mfdataset(files, data_vars='minimal', coords='minimal', compat='override')
    if timing in {'Monthly', 'Daily', 'Snapshot'}:
        # Label each time step with its date (YYYY-MM-DD) rather than a full timestamp
        times = pd.DatetimeIndex(array.time)
        return array.assign_coords(time=[str(time)[:10] for time in times])
    else:
        return array

def ecco_variable(variable: str, start: datetime.date = None, end: datetime.date = None, timing: str = 'None'):
    if variable not in all_variables:
        raise ValueError(str(variable) + ' is not an ECCO variable')
    timing = adjust_timing(variable, timing)
    if timing != 'None' and start is None and 'LLC0090' in datasets[variable]:
        raise ValueError('Enter a date to retrieve \'' + str(variable) + '\'')
    if type(start) == str:
        if len(start) == 7:
            start += '-01'
        start = datetime.datetime.strptime(start, '%Y-%m-%d')
    if type(end) == str:
        if len(end) == 7:
            end += '-01'
        end = datetime.datetime.strptime(end, '%Y-%m-%d')
    if end is None:
        end = start
    if timing == 'Monthly':
        if start < datetime.datetime(2017, 1, 1) or end > datetime.datetime(2017, 12, 31):
            raise ValueError('Monthly averages are only available for 2017')
    elif timing == 'Daily' or timing == 'Snapshot':
        if start < datetime.datetime(2017, 1, 1) or end > datetime.datetime(2017, 12, 31):
            raise ValueError('Daily averages and snapshots are only available for January 2017 and July 2017')
    return ecco_dataset(datasets[variable], start, end, timing)[variable]

def print_value(array):
    if len(array.dims) > 1:
        dims = ', '.join(array.dims)
        raise ValueError('To get a single value, select or average along the remaining dimensions: ' + dims)
    else:
        value = array.values.item()
        if math.isnan(value):
            print('No value found (location is outside the bounds of the ocean)')
        else:
            if 'long_name' in array.attrs:
                print(array.long_name[:-1] + ': ' + str(value))
            else:
                print(value)
        for coord in {'XC', 'XG'}:
            if coord in array.coords:
                longitude = array[coord].values.item()
                print('Longitude: ' + str(abs(round(longitude, 3))) + ('°W' if longitude < 0 else '°E'))
                break
        for coord in {'YC', 'YG'}:
            if coord in array.coords:
                latitude = array[coord].values.item()
                print('Latitude: ' + str(abs(round(latitude, 3))) + ('°S' if latitude < 0 else '°N'))
                break
        for coord in {'Z', 'Zl', 'Zu', 'Zp1'}:
            if coord in array.coords:
                depth = array[coord].values.item()
                print('Depth: ' + str(round(-depth, 3)) + ' meters')
                break

def bounds(bottom, top): return range(bottom, top + 1)

geometry = ecco_dataset('GEOMETRY_LLC0090GRID')
xgcm_grid = xgcm.Grid(geometry, periodic=False, face_connections=tile_connections)

def interpolate(array, dim):
    if dim not in array.dims: raise ValueError(str(dim) + ' is not a dimension of the given variable')
    if dim in {'i', 'i_g', 'XC', 'XG'}: dim = 'X'
    elif dim in {'j', 'j_g', 'YC', 'YG'}: dim = 'Y'
    elif dim in {'k', 'k_u', 'k_l', 'k_p1', 'Z', 'Zp1', 'Zu', 'Zl'}: dim = 'Z'
    else: raise ValueError('Cannot interpolate along ' + str(dim))
    tiled = ('tile' in array.dims)
    if not tiled: array = array.expand_dims('tile')
    coords = array.isel(tile=0).coords if 'tile' in array.dims else array.coords
    nanarray = xr.DataArray(coords=coords).expand_dims(tile)
    tile_arrays = [array.sel(tile=tile) if tile in array.tile else nanarray for tile in range(13)]
    interp = xgcm_grid.interp(xr.concat(tile_arrays, dim='tile').load(), dim).sel(tile=array.tile)
    if tiled: return interp
    else: return interp.squeeze('tile')

def difference(array, dim):
    if dim not in array.dims: raise ValueError(str(dim) + ' is not a dimension of the given variable')
    if dim in {'i', 'i_g', 'XC', 'XG'}: dim = 'X'
    elif dim in {'j', 'j_g', 'YC', 'YG'}: dim = 'Y'
    elif dim in {'k', 'k_u', 'k_l', 'k_p1', 'Z', 'Zp1', 'Zu', 'Zl'}: dim = 'Z'
    else: raise ValueError('Cannot calculate difference along ' + str(dim))
    tiled = ('tile' in array.dims)
    if not tiled: array = array.expand_dims('tile')
    coords = array.isel(tile=0).coords if 'tile' in array.dims else array.coords
    nanarray = xr.DataArray(coords=coords).expand_dims(tile)
    tile_arrays = [array.sel(tile=tile) if tile in array.tile else nanarray for tile in range(13)]
    diff = xgcm_grid.diff(xr.concat(tile_arrays, dim='tile').load(), dim).sel(tile=array.tile)
    if tiled: return diff
    else: return diff.squeeze('tile')

def colormap(data: xr.DataArray):
    cmin = np.nanpercentile(data, 10)
    cmax = np.nanpercentile(data, 90)
    if cmin < 0 and cmax > 0:
        cmax = np.nanpercentile(np.abs(data), 90)
        cmin = -cmax
        cmap = 'RdBu_r'
    else:
        cmap = 'viridis'

    return cmap, cmin, cmax

def update_plot(fig, data, x, y, selection):
    names = data.data_vars.keys()
    title = widgets.Text(description='Plot title:')
    adjust_widgets = [title]
    if 'c' in data.data_vars:
        clabel = widgets.Text(description='Units:')
        cmap = widgets.Dropdown(description='Color map:', options=[
            ('viridis', 'viridis'), ('inferno', 'inferno'), ('cividis', 'cividis'), ('gray', 'binary'), ('gray (inverted)', 'gray'),
            ('pale', 'pink'), ('heat', 'gist_heat'), ('red-blue', 'RdBu_r'), ('seismic', 'seismic'), ('spectral', 'Spectral'),
        ])
        adjust_widgets.extend([clabel, cmap])
        if {'u', 'v'} <= data.data_vars.keys():
            acolor = widgets.Dropdown(description='Arrow color:', options=[('Black', 'k'), ('White', 'w')], value='k')
            adjust_widgets.append(acolor)
    display(widgets.HBox(adjust_widgets))

    fig.clf()
    for (dim, val) in selection.items():
        data = data.sel({dim: val})
    tiles = data.tile if ('tile' in data.dims and len(data.tile) > 1) else None
    variables = dict(data.data_vars)
    for (name, var) in variables.items():
        for dim in {'i_g', 'j_g', 'k_u', 'k_l', 'k_p1'}:
            if dim in var.dims:
                variables[name] = interpolate(var, dim)
    if 'c' in variables:
        cmap.value, cmin, cmax = colormap(variables['c'])
    if {'u', 'v'} <= variables.keys():
        x_skip, y_skip = math.ceil(len(geometry[x]) / 20), math.ceil(len(geometry[y]) / 20)
        quiver_x, quiver_y = geometry[x][(x_skip//2)::x_skip], geometry[y][(y_skip//2)::y_skip]
        uvmax = max(np.nanpercentile(np.abs(variables['u']), 90), np.nanpercentile(np.abs(variables['v']), 90))
    if tiles is not None:
        axes = fig.subplots(4, 4)
        fig.set_size_inches(12.5, 10.1)
        fig.subplots_adjust(wspace=0, hspace=0)
        for ax in axes.ravel():
            ax.axis('off')
        axes = [axes[row][col] for (row, col) in subplots]
        title.observe(lambda change: fig.suptitle(change['new']), names='value')
        meshes, quivers = [], []
        for tile, ax in enumerate(axes):
            if tile not in tiles: continue
            ax.axis('on')
            ax.set_aspect('equal')
            ax.get_xaxis().set_visible(False)
            ax.get_yaxis().set_visible(False)
            if 'c' in variables:
                c_rotated = np.rot90(variables['c'].sel(tile=tile).load(), rotations[tile])
                meshes.append(ax.pcolormesh(geometry[x], geometry[y], c_rotated, cmap=cmap.value, vmin=cmin, vmax=cmax))
            if {'u', 'v'} <= variables.keys():
                # Rotate head of each vector around the tile to the correct orientation
                u_rotated = np.rot90(variables['u'].sel({'tile': tile, x: quiver_x, y: quiver_y}), rotations[tile])
                v_rotated = np.rot90(variables['v'].sel({'tile': tile, x: quiver_x, y: quiver_y}), rotations[tile])
                # Rotate tail of each vector around the head by the same amount
                u_adjusted = u_rotated * cos90(rotations[tile]) + v_rotated * sin90(rotations[tile])
                v_adjusted = v_rotated * cos90(rotations[tile]) - u_rotated * sin90(rotations[tile])
                quivers.append(ax.quiver(quiver_x, quiver_y, u_adjusted, v_adjusted, scale=20*uvmax, width=0.006, clip_on=False))
        if 'c' in variables:
            cbar = fig.colorbar(meshes[0], ax=axes)
            cbar.set_label(clabel.value)
            clabel.observe(lambda change: cbar.set_label(change['new']), names='value')
            cmap.observe(lambda change: [mesh.set_cmap(change['new']) for mesh in meshes], names='value')
            if {'u', 'v'} <= variables.keys():
                [quiver.set_color(acolor.value) for quiver in quivers]
                acolor.observe(lambda change: [quiver.set_color(change['new']) for quiver in quivers], names='value')
    else:
        ax = fig.subplots()
        fig.set_size_inches(6.5, 5)
        ax.set_xlabel('Tile x-coordinate')
        ax.set_ylabel('Tile y-coordinate')
        title.observe(lambda change: ax.set_title(change['new']), names='value')
        if 'c' in names:
            mesh = ax.pcolormesh(geometry[x], geometry[y], variables['c'], cmap=cmap.value, vmin=cmin, vmax=cmax)
            cbar = fig.colorbar(mesh)
            cbar.set_label(clabel.value)
            clabel.observe(lambda change: cbar.set_label(change['new']), names='value')
            cmap.observe(lambda change: mesh.set_cmap(change['new']), names='value')
        if {'u', 'v'} <= names:
            quiver_u = variables['u'].sel({x: quiver_x, y: quiver_y})
            quiver_v = variables['v'].sel({x: quiver_x, y: quiver_y})
            quiver = ax.quiver(quiver_x, quiver_y, quiver_u, quiver_v, scale=20*uvmax, width=0.006)
            if 'c' in names:
                quiver.set_color(acolor.value)
                acolor.observe(lambda change: quiver.set_color(change['new']), names='value')

def plot(c: xr.DataArray, u: xr.DataArray = None, v: xr.DataArray = None):
    for (x, x_name) in [(c, 'c'), (u, 'u'), (v, 'v')]:
        if x is not None:
            x.name = x_name
    data = xr.merge([x for x in (c, u, v) if x is not None])
    area_options = [('All tiles', -1)] + [('Tile ' + str(tile), tile) for tile in data.tile.values]
    area = widgets.Dropdown(options = area_options, description = 'Plot area:')
    plot_widgets = [area]
    for (k_dim, Z_dim) in {('k', 'Z'), ('k_l', 'Zl'), ('k_u', 'Zu'), ('k_p1', 'Zp1')}:
        if k_dim in data.dims:
            depth = widgets.SelectionSlider(options=[(str(int(-k)) + ' m', i) for (i, k) in enumerate(data[Z_dim].values)], description='Depth:')
            plot_widgets.append(depth)
            break
        else:
            k_dim, Z_dim = None, None
    if 'time' in data.dims:
        date = widgets.SelectionSlider(options=[(str(t)[:10], t) for t in data.time.values], description='Date:')
        plot_widgets.append(date)
    plot_button = widgets.Button(description='Plot')
    clear_button = widgets.Button(description='Clear plot')
    output = widgets.Output()
    fig = plt.figure()
    fig.set_size_inches(0.01, 0.01)

    def on_plot_button(_):
        output.clear_output()
        selection = {}
        if area.value >= 0:
            selection['tile'] = area.value
        if k_dim is not None:
            selection[k_dim] = depth.value
        if 'time' in data.dims:
            selection['time'] = date.value
        with output: update_plot(fig, data, 'i', 'j', selection)

    def on_clear_button(_):
        output.clear_output()
        fig.clf()
        fig.set_size_inches(0.01, 0.01)

    plot_button.on_click(on_plot_button)
    clear_button.on_click(on_clear_button)
    display(widgets.HBox(plot_widgets), widgets.HBox([plot_button, clear_button]), output)
    plt.show()

def plot_utility():
    plt.close()
    color = widgets.Text(description='Color plot:', value='THETA')
    quiver_x = widgets.Text(description='Arrow plot x:', value='UVELMASS')
    quiver_y = widgets.Text(description='Arrow plot y:', value='VVELMASS')
    hbox1 = widgets.HBox([color, quiver_x, quiver_y])
    start = widgets.DatePicker(description='Start date:', value=datetime.datetime(2017, 1, 1))
    end = widgets.DatePicker(description='End date:', value=datetime.datetime(2017, 1, 10))
    timing = widgets.Dropdown(options=['Monthly', 'Daily', 'Snapshot'], value='Daily', description='Timing:')
    hbox2 = widgets.HBox([start, end, timing])
    load_button = widgets.Button(description='Load data')
    clear_button = widgets.Button(description='Clear data')
    load_status = widgets.Label(value='')
    hbox3 = widgets.HBox([load_button, clear_button, load_status])
    output = widgets.Output()
    
    def on_load_button(_):
        if (quiver_x.value == '') ^ (quiver_y.value == ''):
            load_status.value = 'Enter both x- and y-components for the arrow plot'
        elif not (color.value or quiver_x.value or quiver_y.value):
            load_status.value = 'Enter variable names above'
        elif not (start.value and end.value):
            load_status.value = 'Enter start and end dates'
        elif start.value > end.value:
            load_status.value = 'Start date must be before end date'
        elif start.value < np.datetime64('1992-01-01'):
            load_status.value = 'Start date must not be before 1992'
        elif end.value >= np.datetime64('2018-01-01'):
            load_status.value = 'End date must not be after 2017'
        else:
            load_status.value = ''
            c, x, y = None, None, None
            monthly = True
            if color.value:
                try:
                    c = ecco_variable(color.value, start.value, end.value, timing.value)
                except ValueError as e:
                    load_status.value = str(e)
                    return
            if quiver_x.value and quiver_y.value:
                try:
                    x = ecco_variable(quiver_x.value, start.value, end.value, timing.value)
                    y = ecco_variable(quiver_y.value, start.value, end.value, timing.value)
                except ValueError as e:
                    load_status.value = str(e)
                    return
            output.clear_output()
            with output: plot(c, x, y)

    def on_clear_button(_):
        output.clear_output()
    
    load_button.on_click(on_load_button)
    clear_button.on_click(on_clear_button)
    display(hbox1, hbox2, hbox3, output)

print('Setup complete')

In order to read a file, you can use the `xr.open_dataset` function. This is part of the `xarray` package, which is for reading gridded data. You need to give this function the path to the file. Let's start by looking at the daily average temperature. 

In [None]:
dataset = # fill in the folder where the daily temperature data is stored
file = # fill in the file for the temperature on a particular day
path = downloads + '/' + dataset + '/' + file
ds_daily = xr.open_dataset(path)

We have saved the data as `ds_daily`. If you write this in a code block, it will print information about that dataset. Note that there are multiple variables and dimensions in this dataset.

In [None]:
ds_daily

Now let's just look at the temperature data, which is `THETA`. Note below how we select one variable from a dataset.
If we only select one variable we get a reduced number of dimensions and attributes.

The top line is the most important: it includes a list of the dimensions along which that variable varies, along with how large each dimension is. In the following example, note how all five dimensions are included:
- `time`, which consists of 1 day
- `k`, which consists of 50 depth levels
- `tile`, which has one option for each of the 13 tiles
- `i` and `j`, which select x- and y-coordinates within each tile

At the bottom of the description is a dropdown menu called Attributes, which shows you more information about how to interpret that variable. 

In [None]:
ds_daily['THETA']
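
As a minimal sketch of what selecting one variable does, the example below builds a small synthetic dataset (made-up values and sizes, not ECCO data) in which two variables have different dimensions. The variable name `toy_surface` is hypothetical and only serves to contrast with `THETA`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an ECCO file: tiny sizes, random values.
rng = np.random.default_rng(0)
toy = xr.Dataset(
    {
        "THETA": (("time", "k", "tile", "j", "i"),
                  rng.normal(10.0, 5.0, (1, 4, 2, 3, 3))),
        # Hypothetical surface-only variable: it has no depth dimension.
        "toy_surface": (("time", "tile", "j", "i"),
                        rng.normal(0.0, 0.1, (1, 2, 3, 3))),
    },
    coords={"k": np.arange(4), "tile": np.arange(2)},
)

theta = toy["THETA"]              # one variable, returned as a DataArray
print(type(theta).__name__)
print(theta.dims)                 # only the dimensions THETA varies along
print(dict(theta.sizes))          # size of each of those dimensions
print(toy["toy_surface"].dims)    # a different variable can have fewer dimensions
```

The same pattern applies to `ds_daily`: the dimensions listed for a single variable are only the ones that variable actually uses.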

The coordinates shown in bold above are index coordinates, while the other coordinates are not indices. For example, observe that `k` is the index in the depth direction, but `Z` holds the depth labels in that direction. We can obtain the values of the `Z` coordinate by selecting that variable. Below we first read the values of `Z` and then plot them as a function of the index `k`.

In [None]:
ds_daily['THETA']['Z'].values

In [None]:
fig = plt.figure()
ds_daily['THETA']['Z'].plot()
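
The distinction between index and non-index coordinates can be sketched on a toy array. The depth values below are made up and are not the ECCO vertical grid. One possible way to select by a non-index coordinate is to swap it in as the dimension first; this is an illustration, not necessarily how the lab's helper functions work.

```python
import numpy as np
import xarray as xr

toy = xr.DataArray(
    np.zeros((4, 3)),
    dims=("k", "i"),
    coords={
        "k": np.arange(4),                                  # index coordinate
        "i": np.arange(3),                                  # index coordinate
        "Z": ("k", np.array([-5.0, -15.0, -25.0, -35.0])),  # label attached to k
    },
    name="toy_temperature",
)

print(sorted(toy.indexes))   # coordinates that .sel can use directly
print(toy["Z"].dims)         # Z varies along k but is not itself an index

# To select by depth label, make Z the dimension, then use nearest-neighbor lookup.
nearest = toy.swap_dims({"k": "Z"}).sel(Z=-14.0, method="nearest")
print(int(nearest["k"]), float(nearest["Z"]))
```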

**Task**: Make an observation about the vertical grid of this model. Are the cells uniform thickness? Drawing on the "Heart of the Machine" lectures, speculate about why or why not.

If we want to read more than one timestep, we use the `xr.open_mfdataset` function, which opens a list of files as a single dataset:

In [None]:
folder = downloads+'/ECCO_L4_TEMP_SALINITY_LLC0090GRID_MONTHLY_V4R4'
files = os.listdir(folder) # list all files in the folder (each month of 2017)
paths = []
for file in files:
    paths.append(os.path.join(folder, file)) # make a list of all files with their complete path
ds_monthly = xr.open_mfdataset(paths)
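
`os.listdir` returns every entry in a folder, in no guaranteed order. The sketch below shows one way to keep only NetCDF files and sort them. The helper name `list_netcdf_paths` and the file names are hypothetical, and the demonstration runs on a temporary folder rather than the real download folder.

```python
import os
import tempfile

def list_netcdf_paths(folder):
    """Return the sorted full paths of the .nc files in `folder` (hypothetical helper)."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.endswith(".nc")
    )

# Demonstrate on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp_folder:
    for name in ["toy_2017-02.nc", "toy_2017-01.nc", "notes.txt"]:
        open(os.path.join(tmp_folder, name), "w").close()
    toy_paths = list_netcdf_paths(tmp_folder)
    toy_names = [os.path.basename(p) for p in toy_paths]

print(toy_names)
```

A list built this way could be passed to `xr.open_mfdataset` in place of `paths`.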

Examine the temperature variable in the dataset below and note that the time dimension has changed.

In [None]:
ds_monthly['THETA']

Using `.sel`, we can *select* a variable along each of its dimensions. The following code selects temperature along the dimensions `time`, `k`, and `tile`. Notice how those dimensions no longer appear in bold in the output because we've selected a specific point along each of them. We pass `method = 'nearest'` to request the nearest available point along each dimension.

**Task**: What time step is selected in the output below?

In [None]:
ds_monthly['THETA'].sel(time = '2017-01-01', k = 10, tile = 4,method = 'nearest')
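
To see how nearest-neighbor selection behaves, here is a small sketch on a toy array. The timestamps and values are made up for illustration and should not be read as the ECCO time axis.

```python
import numpy as np
import xarray as xr

# Toy time axis with three made-up timestamps and four depth levels.
times = np.array(["2017-01-16", "2017-02-15", "2017-03-16"], dtype="datetime64[ns]")
toy = xr.DataArray(
    np.arange(3 * 4, dtype=float).reshape(3, 4),
    dims=("time", "k"),
    coords={"time": times, "k": np.arange(4)},
)

# The requested date is not on the axis, so the closest timestamp is used.
picked = toy.sel(time=np.datetime64("2017-02-01"), k=2, method="nearest")

print(picked["time"].values)   # the timestamp that was actually selected
print(picked.dims)             # no dimensions remain after selecting along both
print(float(picked))
```

Checking the `time` coordinate of the result, as done here, is a quick way to confirm which time step a nearest-neighbor selection returned.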

### Plotting ECCO data

We can plot a specific tile, depth, and time by selecting on those dimensions and plotting using the built-in `xarray` plotting function.

In [None]:
fig = plt.figure()
ds_monthly['THETA'].sel(tile = 2,k = 0,time = '2017-01-01',method = 'nearest').plot()

**Task:** Modify the code above to make and save the following plots. 

- Plot sea surface temperature for a tile off the coast of Africa.
- Plot sea surface salinity for a tile off the coast of Africa.
- Plot temperature at the depth level nearest to 100 m in a tile off the coast of North America.

In [None]:
# TO-DO sea surface temperature for a tile off the coast of Africa

In [None]:
# TO-DO sea surface salinity for a tile off the coast of Africa

In [None]:
# TO-DO temperature at the depth level nearest to 100 m in a tile off the coast of North America

To help with plotting global data, we have provided a `plot` function that creates a global map. Running the code block below makes a widget appear. Above the plot, you can enter names for the axes, add a title, and change some properties of the plot. Keep in mind that figures made with the plot utility are not saved when you exit the notebook, so it's important to **manually save all the images you create**.

The plot function takes one or three inputs. Below we demonstrate the function with one input, where a single variable is provided. This variable should have x, y, z, and time dimensions. If three inputs are provided, the second and third inputs are drawn as the components of a vector field.

**Task:** Generate a plot using the line below, and add an appropriate title and units. Save the image to your computer using either Shift + Right click (regular right click won't work) or with the Save icon that appears on the left side when you hover over the plot.

In [None]:
plot(ds_daily['THETA'])

## Heat transport and circulation (Qual.)

### Circulation features

To complete the qualitative track labs, you can use the provided function `ecco_variable` to load variables for the remainder of this lab assignment. This function has four inputs:

- the name of the variable
- a start date, expressed in ISO date format (YYYY-MM-DD)
- an end date
- whether you want a monthly average (`Monthly`), a daily average (`Daily`), or a snapshot (`Snapshot`).

In the example below, we load monthly temperature and velocity data. Remember that the plot function takes one or three inputs. Below we demonstrate the function with three inputs. The first input is plotted in the background with shading. The second and third inputs are drawn as velocity vectors: the second input is the east-west component and the third input is the north-south component.

In [None]:
temperature_C = ecco_variable('THETA', '2017-01', '2017-12', 'Monthly')
velocity_x = ecco_variable('UVEL', '2017-01', '2017-12', 'Monthly')
velocity_y = ecco_variable('VVEL', '2017-01', '2017-12', 'Monthly')
fig = plt.figure()
plot(temperature_C, velocity_x, velocity_y)

**Task**: Use the plotting utility above to plot the monthly average for at least two different months. Describe the dominant circulation features and relate these features to concepts discussed in class. Do these circulation features vary depending on time of year? (approximately 200 words)

**Task**: Modify the plot above to plot the daily average sea surface temperature and velocity rather than the monthly average. Save plots for at least two different days. Compare the daily and monthly plots and note any differences and similarities between them. (approximately 200 words)

In [None]:
# to-do plot atmospheric pressure and surface winds. These are external forcing fields for ECCO.
pressure_C = # load the monthly average pressure
wind_x = # load the east-west wind
wind_y = # load the north-south wind
plot(pressure_C, wind_x, wind_y)

**Task**: What does it mean that atmospheric variables are external forcing?

**Task**: Choose a particular region (for example, the upwelling on the eastern boundary of the South Pacific, off the west coast of South America) and discuss the effects of the observed arrangement of sea surface temperature and atmospheric pressure on:
- weather?
- fisheries?

### Discussion

Discussion during the first part of the lab will center around the figure below. We will discuss how climate dynamics result in heat storage in different parts of the climate system.
<figure>
    <img src="Lab2Images/IPCC_AR6_WGI_Figure_9_6.png" width="700"/>
    <figcaption>
        <a href="https://www.ipcc.ch/report/ar6/wg1/chapter/chapter-9/"> IPCC AR6 Fig. 9.6: Ocean heat content (OHC) and its changes with time. (a) Time series of global OHC anomaly relative to a 2005–2014 climatology in the upper 2000 m of the ocean. Shown are observations (Ishii et al., 2017; Baggenstos et al., 2019; Shackleton et al., 2020), model-observation hybrids (Cheng et al., 2019; Zanna et al., 2019), and multi-model means from the Coupled Model Intercomparison Project Phase 6 (CMIP6) historical (29 models) and Shared Socio-economic Pathway (SSP) scenarios (label subscripts indicate number of models per SSP). (b–g) Maps of OHC across different time periods, in different layers, and from different datasets/experiments. Maps show the CMIP6 ensemble bias and observed (Ishii et al., 2017) trends of OHC for (b, c) 0–700 m for the period 1971–2014, and (e, f) 0–2000 m for the period 2005–2017. CMIP6 ensemble mean maps show projected rate of change 2015–2100 for (d) SSP5-8.5 and (g) SSP1-2.6 scenarios. Also shown are the projected change in 0–700 m OHC for (d) SSP1-2.6 and (g) SSP5-8.5 in the CMIP6 ensembles, for the period 2091–2100 versus 2005–2014. No overlay indicates regions with high model agreement, where ≥80% of models agree on the sign of change. Diagonal lines indicate regions with low model agreement, where <80% of models agree on the sign of change (see Cross-Chapter Box Atlas.1 for more information). Further details on data sources and processing are available in the chapter data table (Table 9.SM.9).</a>
    </figcaption>
</figure>

## Quantifying circulation (Quant.)

### Circulation features

The plot function takes one or three inputs. With three inputs, the first input is plotted in the background with shading (pcolor plot). The second and third inputs are drawn as velocity vectors: the second input is the east-west component and the third input is the north-south component.

**Task**: Create a plot of sea surface temperature with surface ocean velocity vectors for each month of the year. Save all of the figures to a folder and note changes in circulation throughout the year in at least 3 bullet points.

### Arithmetic on `xarray` arrays

We can do arithmetic on `xarray` arrays by treating them just like regular variables. For example, the following plot shows temperature in kelvin:

In [None]:
temperature_K = ds_daily['THETA'] + 273.15
plot(temperature_K)
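
Arithmetic also works between arrays with different dimensions, because `xarray` lines them up by dimension name. The sketch below uses made-up numbers on a tiny grid. The names `toy_area` and `toy_thickness` are hypothetical stand-ins, not ECCO variables.

```python
import numpy as np
import xarray as xr

# Scalar arithmetic: every element is shifted by the same constant.
toy_temperature_C = xr.DataArray([[2.0, 10.0], [15.0, 25.0]], dims=("j", "i"))
toy_temperature_K = toy_temperature_C + 273.15

# Arithmetic between arrays: dimensions are matched by name and broadcast.
toy_area = xr.DataArray([[1.0, 2.0], [3.0, 4.0]], dims=("j", "i"))   # per-cell area
toy_thickness = xr.DataArray([10.0, 20.0, 30.0], dims=("k",))        # per-layer thickness
toy_volume = toy_area * toy_thickness                                # one value per (j, i, k)

print(float(toy_temperature_K.min()))
print(dict(toy_volume.sizes))
print(float(toy_volume.sum()))
```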

**Task:** With this in mind, one of the ECCO variables is seawater density anomaly. See if you can plot actual seawater density, rather than just the anomaly. (Hint: check the variable's attributes)

In [None]:
# TO-DO: Plot seawater density -- you can use any date in 2017. 

Density depends on temperature and salinity according to the following graph:

<figure>
    <img src="Lab2Images/fig2.png" width="500"/>
    <figcaption>
        <a href="https://www.researchgate.net/figure/Temperature-salinity-graph-showing-lines-of-constant-density-isopycnals-for-seawater-at_fig3_335607252">
            Fig. 2: Density vs. temperature and salinity
        </a>
    </figcaption>
</figure>

**Task:** Verify this relationship using ECCO data. First, using ECCO variables, write a formula that approximates density using only temperature and salinity. Then, subtract the approximate density from the real density to make a plot that shows where the approximation is more or less correct. You should verify that the approximation is mostly correct by checking that most of the data lies between -1 kg/m3 and +1 kg/m3. Does the accuracy of this approximation depend on depth?

In [None]:
# TO-DO: Compare density to temperature and salinity -- you should use the same date as before

### Grid Geometry

ECCO uses an Arakawa C-grid, which is described in *Computing the Climate*, p. 147-148. This type of grid is actually a combination of overlapping grids; each component grid has a name and is used by only some variables in the model. The *tracer grid* is used by scalar quantities like temperature, salinity and density (these quantities are sometimes called *tracers*). The *u-grid* and *v-grid* are used by vector quantities like velocity and heat flux. According to the specifications of the Arakawa C-grid, the u-grid should be staggered in the *x*-direction versus the tracer grid, while the v-grid should be staggered in the *y*-direction. Lastly, the *g-grid* is used by a few quantities like vorticity, and it's staggered in both the *x*- and *y*-directions versus the tracer grid.

**Task:** Draw a diagram of the Arakawa C-grid. In the next cell, answer the following questions about the C-grid:

- Referring to Figure 5.13 in *Computing the Climate*, which component grid should be used by velocity in the *x*-direction? 
- How is the g-grid staggered relative to the v-grid?

The component grid used by an ECCO variable is reflected in the names of its dimensions. For example, consider the velocity variables `UVELMASS` and `VVELMASS`.

**Task:** Make a color plot of `UVELMASS`. What does this variable represent? Explain why its value changes suddenly along certain tile boundaries.

**Task:** In the next cell, answer the following questions:

- How and why do the dimensions differ between the two components of horizontal velocity?
- Based on the Arakawa C-grid, infer the meanings of the four dimensions `i`, `j`, `i_g`, and `j_g`. What do their coordinates indicate?
- Which dimensions would be used for a variable recorded on the g-grid?
- On your diagram of the C-grid, show what the coordinates of these four dimensions look like. (You may need to make multiple diagrams to avoid too many overlapping lines.)

One final note: variables can also be staggered in the vertical direction, such as vertical velocity. This is indicated with the dimension `k_l` replacing `k` (depth). In ECCO, these variables are always horizontally aligned with the tracer grid, and never the u-, v- or g-grids.

Up until now, we've only looked at time-varying quantities like temperature and velocity. But as we start performing more calculations using ECCO variables, it will be necessary to use the [grid parameter variables](https://raw.githubusercontent.com/ECCO-GROUP/ECCO-v4-Python-Tutorial/master/varlist/v4r4_tseries_grid_varlist.txt). Understanding the C-grid is crucial to understanding what these variables mean.

**Task:** Read through the variables in the `ECCO_L4_GEOMETRY_LLC0090GRID_V4R4` dataset, and answer the following questions:

- On your diagram of the C-grid, depict each of the following variables: `dxC`, `dxG`, `dyC`, `dyG`, `rA`, `rAw`, `rAs`, `rAz`, `CS`, `SN`
- Using the plot utility, plot arrows that point north from every point on all tiles. Save this plot to your folder.
- Write two different ways to calculate the volume of a tracer grid cell.
- Write two different ways to calculate the volume of a u-grid cell.
- Write two different ways to calculate the volume of a grid cell staggered vertically.

### Selecting along dimensions

We can take differences between slices of the model. For example, the following code plots the difference in temperature between two different depths.

In [None]:
plot(ds_monthly['THETA'].sel(k = 5) - ds_monthly['THETA'].sel(k = 0))

Once you've selected along every dimension and gotten a single value, you can read the value using the `.values` attribute. (Using the regular `print` function will print information about the whole array, not just the value you're looking for.)

In [None]:
temperature_reading = ds_monthly['THETA'].sel(time = '2017-01-01', k = 10, tile = 4, i = 10, j = 20,method = 'nearest')
temperature_reading.values
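
As a small illustration on made-up data, `.values` on a fully selected array returns a zero-dimensional NumPy array, while `.item()` returns a plain Python number, which can be handier for printing or further arithmetic.

```python
import numpy as np
import xarray as xr

toy = xr.DataArray(
    np.array([[1.5, 2.5, 3.5], [4.5, 5.5, 6.5]]),
    dims=("k", "i"),
    coords={"k": [0, 1], "i": [0, 1, 2]},
)

toy_reading = toy.sel(k=1, i=2)   # a single element, still wrapped as a DataArray
as_array = toy_reading.values     # zero-dimensional NumPy array
as_float = toy_reading.item()     # plain Python float

print(as_array.shape)
print(as_float)
```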

**Task:** Print a temperature near the bottom of the ocean at the North Pole on January 1st, 2017.

In [None]:
# TO-DO: Print temperature in degrees C

We can also average along dimensions with `.mean`; other operations like `.sum`, `.min` (minimum), and `.max` (maximum) work similarly. The following code plots the temperature anomaly relative to the time average.

In [None]:
fig = plt.figure()
plot(ds_monthly['THETA'] - ds_monthly['THETA'].mean(time))
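
The anomaly calculation can be checked by hand on a toy array. In this sketch the dimension name is passed as a string, and the numbers are made up so that the result is easy to verify.

```python
import xarray as xr

# Two time steps at two locations (made-up values).
toy = xr.DataArray([[1.0, 4.0], [3.0, 8.0]], dims=("time", "i"))

toy_time_mean = toy.mean("time")    # average over time at each location
toy_anomaly = toy - toy_time_mean   # broadcast the mean back along time

print(toy_time_mean.values.tolist())
print(toy_anomaly.values.tolist())
print(toy_anomaly.mean("time").values.tolist())   # anomalies average to zero over time
```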

**Task:** Add comments describing each line of the following calculation, and indicate what the value printed at the end means.

In [None]:
# TO-DO: Annotate
volume = ecco_variable('rA') * ecco_variable('drF') 
temp_volume = temperature_C.sel(time = '2017-01-01') * volume 
total_volume = volume.sum(tile).sum(i).sum(j).sum(k) 
total_temp_volume = temp_volume.sum(tile).sum(i).sum(j).sum(k) 
avg_temp = total_temp_volume / total_volume 
print_value(avg_temp) 

When selecting along a dimension, we can also choose a range of coordinates with the `bounds` function:

In [None]:
plot(temperature_C.sel(time = '2017-01-01', tile = bounds(3, 9)))
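
`bounds` is a helper provided by this notebook's setup code. In plain `xarray`, a range of coordinates is usually selected with Python's built-in `slice`, which `.sel` treats as inclusive at both ends. Whether `bounds(3, 9)` behaves exactly like `slice(3, 9)` is an assumption here; the sketch only shows the `slice` form on a toy array.

```python
import numpy as np
import xarray as xr

# Toy array with one value per tile, for 13 tiles.
toy = xr.DataArray(np.arange(13), dims=("tile",), coords={"tile": np.arange(13)})

# Label-based slicing with .sel includes both endpoints.
toy_subset = toy.sel(tile=slice(3, 9))

print(toy_subset["tile"].values.tolist())
```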

We can also mask certain parts of the data using the `where` function. Below we also demonstrate use of the ECCO built-in plotting function.

In [None]:
lat = 26
ones = xr.ones_like(ds_monthly.YC)
dome_maskC = ones.where(ds_monthly.YC>=lat,0)

In [None]:
plt.figure(figsize=(12,6))
ecco.plot_proj_to_latlon_grid(ds_monthly.XC,ds_monthly.YC,dome_maskC,
                              projection_type='robin',cmap='viridis',user_lon_0=0,show_colorbar=True);
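
The masking pattern above can be combined with `.sum` to average over just the masked region. The sketch below uses a made-up 2 x 2 grid and treats all cells as equal in size, which is a simplification: real ECCO grid cells differ in area, so a careful average would weight by cell size.

```python
import xarray as xr

# Made-up latitudes and northward velocities on a tiny grid.
toy_lat = xr.DataArray([[10.0, 20.0], [30.0, 40.0]], dims=("j", "i"))
toy_v = xr.DataArray([[1.0, 2.0], [3.0, 5.0]], dims=("j", "i"))

# Mask of ones where the condition holds and zeros elsewhere.
toy_mask = xr.ones_like(toy_lat).where(toy_lat >= 26, 0)

# Unweighted mean over the masked region: sum of masked values / number of masked cells.
masked_mean = float((toy_v * toy_mask).sum() / toy_mask.sum())

# Alternative: leave NaN outside the region and let .mean skip the NaN values.
nan_mean = float(toy_v.where(toy_lat >= 26).mean())

print(toy_mask.values.tolist())
print(masked_mean, nan_mean)
```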

**Task:** Using `.where` and `.sum`, find the approximate average northward velocity of the Atlantic Ocean on 1 January 2017. You'll do this by using masks that are provided by the ECCO team. This will greatly simplify any grid geometry considerations! First look at the locations that are available.

In [None]:
print(ecco.get_available_basin_names())

We will load the ECCO grid and join it to the dataset file. This will allow us to make good use of the ECCO tools that consider the grid geometry. 

In [None]:
ds_grid = xr.open_mfdataset(downloads+'/ECCO_L4_GEOMETRY_LLC0090GRID_V4R4/GRID_GEOMETRY_ECCO_V4r4_native_llc0090.nc')
ecco_ds = xr.merge((ds_grid,ds_monthly))

Make a 2D mask for the Atlantic by selecting the first depth level using the code below. You will likely need to download the basins data to your individual folder, which the code below will do the first time that you run it.

In [None]:
from os.path import join, exists
try:
    # maskW and maskS come from the grid file merged into ecco_ds above
    atl_maskW = ecco.get_basin_mask(basin_name='atl', mask=ecco_ds.maskW.isel(k=0))
    atl_maskS = ecco.get_basin_mask(basin_name='atl', mask=ecco_ds.maskS.isel(k=0))
except Exception:
    # depending on how ecco_v4_py is downloaded/installed,
    # the basin mask file may not be in the location expected by ecco_v4_py.
    # This will download the file from the ECCOv4-py GitHub online.
    basin_path = join('./', 'ECCOv4-py', 'binary_data')
    if not exists(join(basin_path, 'basins.meta')):
        os.makedirs(basin_path, exist_ok=True)
        url_base = "https://github.com/ECCO-GROUP/ECCOv4-py/raw/master/binary_data/"
        for name in ('basins.data', 'basins.meta'):
            source_file = requests.get(url_base + name, allow_redirects=True)
            # the with-block closes the file once it has been written
            with open(join(basin_path, name), 'wb') as target_file:
                target_file.write(source_file.content)
    atl_maskW = ecco.get_basin_mask(basin_name='atl', mask=ecco_ds.maskW.isel(k=0),
                                    basin_path=basin_path)
    atl_maskS = ecco.get_basin_mask(basin_name='atl', mask=ecco_ds.maskS.isel(k=0),
                                    basin_path=basin_path)

In [None]:
plt.figure(figsize=(12,6))
ecco.plot_proj_to_latlon_grid(ecco_ds.XC,ecco_ds.YC,atl_maskW,
                              projection_type='robin',cmap='viridis',user_lon_0=-30,show_colorbar=True);

In [None]:
# TO-DO: Find average northward velocity of the Atlantic Ocean