In [2]:
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Hints
 - you don't *have* to use xarray to do these problems, but many of them are much easier if you do. See the Example lab for some ways to do this if you are unfamiliar. 
 - xarray can restrict an operation to particular months with the ```where``` method, e.g. to get the mean of all the Januaries in ds:

    ``` ds.where(ds.time.dt.month==1).mean()```

- you can take means and standard deviations in xarray easily using ```.mean()``` and ```.std()```. Note that if you don't specify a dimension, the mean or standard deviation is taken over all dimensions.
    
- you can use pandas to make really nice tables with little titles, see the next cell for a little example

In [3]:
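# Build a 3x3 table of random numbers with labelled rows and columns
dfo = pd.DataFrame(np.random.randn(3, 3),
                   columns=['1', '2', '3'],
                   index=['A', 'B', 'C'])

# Format every entry in scientific notation and put a caption on the table
dfo.style.format('{:1.2e}').set_caption('Example Table')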

|   | 1 | 2 | 3 |
|---|---|---|---|
| A | 1.11 | 0.00737 | 0.765 |
| B | -0.484 | 0.337 | 1.14 |
| C | -0.743 | 0.378 | -0.495 |
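
To make the month-selection and ```.mean()``` / ```.std()``` hints above concrete, here is a minimal sketch on a synthetic dataset. The variable name `tas`, the grid, and the values are made up for illustration; they are not taken from the CanESM files.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly data: 3 years x 4 latitudes x 5 longitudes
# (hypothetical values, not CanESM output).
time = pd.date_range("2000-01-01", periods=36, freq="MS")
lat = np.linspace(-60.0, 60.0, 4)
lon = np.linspace(0.0, 288.0, 5)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), rng.normal(280.0, 5.0, size=(36, 4, 5)))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Replace every non-January time step with NaN; .mean() skips NaNs by default.
jan = ds.where(ds.time.dt.month == 1)
jan_mean_all = jan.mean()            # no dimension given: one number per variable
jan_mean_map = jan.mean(dim="time")  # average over time only: a (lat, lon) map
jan_std_map = jan.std(dim="time")    # spread across the three Januaries

# Alternative: drop the other months entirely instead of masking them.
jan_drop = ds.where(ds.time.dt.month == 1, drop=True)
```

Passing `dim="time"` is usually what you want for a map; leaving it out collapses latitude and longitude as well.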


# Even More Hints {-}

- The CanESM files all have the format ```CanESM_VARIABLE_historical_r1i1p1f1.nc```, where VARIABLE is a code for one of the different variables. If you need to figure out which variable it is, you can use ```ncdump -h``` or open the file in xarray to see what's inside.

- the CanESM data is from a climate model run which was run using 

- If you're not familiar with xarray yet, it's probably easier to load each variable as a separate dataset, e.g.

``` ds_Clh=xr.open_dataset(f'{indir}/CanESM_hfls_1970_r1i1p1f1.nc') ```
    
- You can load all the data you need at once using the ```xr.open_mfdataset``` function, like so:

```ds_C = xr.open_mfdataset(f'{indir}/Can*1970*.nc') ```

Note that if you do this, the data are loaded lazily, so results are not computed until you ask for them. For instance, if you want to add two variables together you will have to do something like

``` ds_C['X'] = (ds_C.A+ds_C.B).compute()```
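
A minimal sketch of that pattern, using a small in-memory dataset so that it runs without any files. The names `A`, `B`, and `X` are the placeholders from the hint above, and the values are invented. On in-memory data ```.compute()``` simply returns the result; on a dataset opened with ```xr.open_mfdataset``` it is the step that triggers the actual calculation.

```python
import numpy as np
import xarray as xr

# Small stand-in dataset (hypothetical values); real data would come
# from xr.open_mfdataset.
ds_C = xr.Dataset(
    {
        "A": (("time", "lat"), np.ones((3, 2))),
        "B": (("time", "lat"), 2.0 * np.ones((3, 2))),
    }
)

# Add the two variables, evaluate the result, and store it under a new name.
# New variables are added to a Dataset with dictionary-style assignment.
ds_C["X"] = (ds_C.A + ds_C.B).compute()

print(ds_C["X"].values)
```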

- For question 1, the final output will be extremely similar to Figure 5.1 in Hartmann, although it's probably better to use kg/s/m^2 instead of cm/year as the units. 

- For now and the rest of this course, you can assume that the latent heat of vaporization is a constant set to 2.5 x 10^6 J/kg. In reality it varies with temperature, but the variations are small compared to some of the other approximations that we are making. 

- The _1970 versions of the files are the same as the ones without the _1970, but the time series runs from 1970-2015 instead of 1850-2015. You should use the _1970 versions to make the calculations faster. 

- If you need to area weight your data, you should take a look at the example lab.

- For the CERES data there is a file called ```CERES_EBAF_v4.1_2000_2014_climatology.nc```, which has all of the variables in it. I have kept the CMIP6 naming conventions in that file, although NASA does not use them by default. 

- The TOA output is similar to (some of) the results that you will find in Figure 2.4 of Hartmann and the discussion that follows it. 
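
For the latent heat hint above: with a constant latent heat of vaporization, a latent heat flux in W/m^2 converts to an evaporation rate in kg/s/m^2 by a single division. The sketch below assumes the flux is already in W/m^2 and uses made-up values; the names `L_V`, `latent_heat_flux`, and `evaporation` are illustrative only.

```python
import numpy as np

L_V = 2.5e6  # latent heat of vaporization in J/kg, treated as a constant

# Made-up latent heat fluxes in W/m^2 (not model output).
latent_heat_flux = np.array([25.0, 100.0, 150.0])

# (J/s/m^2) / (J/kg) = kg/s/m^2
evaporation = latent_heat_flux / L_V

print(evaporation)
```

The same division works unchanged on an xarray variable, since the constant broadcasts over every grid point and time step.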
