# Lab 2: Global energy balance data

The goal of this lab is to read in, plot, and interpret data about Earth's global radiation balance.  Our science questions are:
- Is the Earth system gaining energy (net positive energy balance) or losing energy (net negative energy balance)?
- What is the spatial pattern of this energy balance?  e.g. Is it consistent everywhere, or are some places net positive and some net negative?
- What, if anything, is the seasonal influence on the energy balance?

To address these questions we are going to analyse data from the [CERES](https://climatedataguide.ucar.edu/climate-data/ceres-ebaf-clouds-and-earths-radiant-energy-systems-ceres-energy-balanced-and-filled) (Clouds and the Earth's Radiant Energy System) mission.   On an upcoming assignment, you will use the same dataset to address the role of clouds in the Earth system.  Get excited!

**You can download the files** [here for TOA](https://cluster.klima.uni-bremen.de/~fmaussion/teaching/climate/CERES_EBAF-TOA_Ed4.1_Clim-2005-2015.nc),  [here for Surface](https://cluster.klima.uni-bremen.de/~fmaussion/teaching/climate/CERES_EBAF-Surface_Ed4.1_Clim-2005-2015.nc).

***
## Part 1: Demonstration and skills practice

First, let's import the tools we need, drawing on some of the packages we set up in your environment.

In [None]:
# Import the tools we are going to need today:
import matplotlib.pyplot as plt  # plotting library
import numpy as np  # numerical library
import xarray as xr  # netCDF library
import cartopy  # Map projections library
import cartopy.crs as ccrs  # Projections list
# Some defaults:
plt.rcParams['figure.figsize'] = (12, 5)  # Default plot size

### 1. Reading in and inspecting data
Let's say you have a global dataset you want to analyse.  Most of today's climate data is stored in the NetCDF format (`*.nc`). NetCDF files are binary files, which means that *you can't just open them in a text editor*. You need a special reader to open them.

There are many ways to read data in to Python, but for our purposes, you will almost always use a built-in function from [_xarray_](http://xarray.pydata.org) or [_pandas_](https://pandas.pydata.org/).  Today we focus on xarray.

To read data in to Python, you need to specify the file path and then use a read-in command.  We use `xr.open_dataset` here.

In [None]:
## Here I downloaded the file to a "data" folder which I set up 
## in a folder close to this notebook
fpath = r'/Users/lizz/Documents/GitHub/climdyn-labs/data/CERES_EBAF-TOA_Ed4.1_Clim-2005-2015.nc'
## The variable name "ds" stands for "dataset"
ds = xr.open_dataset(fpath)

You'll have to give an absolute or relative path to the file for this to work. For example, `r'C:\PATH\TO\FILE\CERES_EBAF-TOA_Ed4.1_Clim-2005-2015.nc'` on Windows.

**Windows users: don't forget to add the `r` before the path, which allows you to use backslashes in the string.**
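
If you would rather not type platform-specific path strings, one option is Python's standard `pathlib` module. The sketch below assumes the file sits in a folder named `data` inside the current working directory; that layout is only an example, so adjust it to wherever you saved the file.

```python
from pathlib import Path

# Assumption: the file sits in a "data" folder inside the current working directory.
data_dir = Path.cwd() / "data"
fpath = data_dir / "CERES_EBAF-TOA_Ed4.1_Clim-2005-2015.nc"

print(fpath)           # the full (absolute) path that Python will look for
print(fpath.exists())  # True only if the file is really at that location
```

If `fpath.exists()` prints `False`, the path is wrong or the file is somewhere else. A `Path` object can be passed to `xr.open_dataset` in place of a string.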

In [None]:
# Inspect it
ds

The NetCDF dataset consists of various elements:
- The *dimensions* specify the number of elements of each data coordinate. Their names should be understandable and specific.
- The *attributes* provide some information about the file (metadata).
- The *variables* contain the actual data. In our file there are five variables. All have the dimensions [month, lat, lon], so we can expect an array of size [12, 180, 360].
- The *coordinates* locate the data in space and time.

#### *Exercise:*
Describe how to find the filepath of a data file you have downloaded to your personal computer.

*(...your response here...)*

***
### 2. Closer inspection: summary statistics

Xarray provides us easy tools to analyse multidimensional climate data.  These come in the form of *attributes* and *functions* associated with the Dataset *object*.  We access these with a dot, `ds.<attribute>.<function_name>()`, as shown in Lab 1. 

First let's compute the time average of the TOA Shortwave Flux over the year:

In [None]:
sw_avg = ds.toa_sw_all_clim.mean(dim='month')

What did we just do? From the NetCDF Dataset, we took the toa_sw_all_clim variable (`ds.toa_sw_all_clim`) and we applied the function `.mean()` to it. So an equivalent formulation, using named variables `sw` and `sw_avg`, could be:

In [None]:
# Equivalent code:
sw = ds.toa_sw_all_clim
sw_avg = sw.mean(dim='month')

Let's inspect `sw_avg`.

In [None]:
sw_avg

So `sw_avg` is a 2-dimensional array of dimensions [lat, lon]. Note that the month dimension has disappeared.

When we applied the `mean()` function, we added an argument (called a **keyword argument**): `dim='month'`. With this argument, we told the function to compute the average *over the month dimension* (so the result no longer has a month dimension).
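
To see the effect of `dim` without the real file, here is a minimal sketch on a tiny made-up array. The name `toy` and all of its values are invented for illustration; only the dimension names mirror the CERES variable.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the CERES variable: 12 months x 3 latitudes x 4 longitudes.
# The values are made up purely to show how dimensions behave.
toy = xr.DataArray(
    np.arange(12 * 3 * 4, dtype=float).reshape(12, 3, 4),
    dims=("month", "lat", "lon"),
    coords={
        "month": np.arange(1, 13),
        "lat": [-60.0, 0.0, 60.0],
        "lon": [0.0, 90.0, 180.0, 270.0],
    },
)

toy_avg = toy.mean(dim="month")  # average over the month dimension only

print(toy.dims)       # dimensions before averaging
print(toy_avg.dims)   # the month dimension is gone
print(toy_avg.shape)  # one value per (lat, lon) pair
```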

Let's remove this keyword and compute the mean again:

In [None]:
sw.mean()

Ha! We now have an array without dimensions: a single element array, also called a **scalar**. This is the total average over all the dimensions. We'll come back to this later.

*Note: scalar output is quite verbose in xarray. You can print simpler scalars on screen with the .item() method:*

In [None]:
sw.mean().item()

#### *Exercise:* 
Based on what is shown above, which of the following are the same?
- `ds.toa_sw_all_clim`
- `ds.toa_sw_clr_c_clim`
- `ds.toa_sw_all_clim.mean(dim='month')`
- `sw`
- `sw_avg`

*(...your response here...)*

#### *Exercise*:
What should we expect from the following commands?

    sw.mean(dim='lon')
    sw.mean(dim='month').mean(dim='lon')
    sw.mean(dim=['month', 'lon'])

Discuss with your partner and then write code to check.

In [None]:
## [your code here]

#### *Exercise*:
Compute the maximum and minimum values of top-of-atmosphere outgoing shortwave radiation. ([hint](http://xarray.pydata.org/en/stable/generated/xarray.DataArray.min.html))

In [None]:
## [your code here]

***
### 3. Global plots

### Spatial data
We are now going to use xarray's built-in `.plot` capability to plot the time-averaged Top of Atmosphere Shortwave Flux on a map:

In [None]:
## Define the map projection (how is the spherical Earth transformed to 2D view)
ax = plt.axes(projection=ccrs.EqualEarth())
## ax is an empty Matplotlib plot. We now plot the relevant variable onto ax, using xarray
ds.toa_sw_all_clim.mean(dim='month').plot(ax=ax, transform=ccrs.PlateCarree()) 
#sw_avg.plot(ax=ax, transform=ccrs.PlateCarree()) ## Equivalent: the named variable sw_avg we set up above is already averaged over months
## the keyword "transform" tells the function in which projection the data is stored 
ax.coastlines(); ax.gridlines(); # Add gridlines and coastlines to the plot

We are looking at the average TOA outgoing shortwave flux, expressed in W m$^{-2}$. Such time averages are often written with a bar on top of them:

$\overline{SW_{TOA}} = temporal\_mean(SW_{TOA})$

### Plotting 1D (zonal) averages

Xarray will also easily plot 1D data. In this case, we are going to compute the zonal average of `sw_avg`. "Zonal average" means an average along a latitude circle, i.e. over all longitudes. It is often written with `[]` or `<>` in formulas:

$\left[ \overline{SW_{TOA}} \right] = zonal\_mean(temporal\_mean(SW_{TOA}))$

Note that the two averaging operators commute, so you can take the averages in either order:

$\left[ \overline{SW_{TOA}} \right] = \overline{\left[ SW_{TOA} \right]}$
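
As a quick numerical check of this identity, the sketch below takes the two averages in both orders on a made-up array. It uses plain NumPy, where dimensions are addressed by position (`axis`) rather than by name, and it assumes a complete grid with no missing values; with gaps in the data the two orders need not agree.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
# Made-up field with dimensions (month, lat, lon); these are not real CERES values.
field = rng.uniform(50.0, 150.0, size=(12, 18, 36))

time_then_zonal = field.mean(axis=0).mean(axis=1)  # average over months, then longitudes
zonal_then_time = field.mean(axis=2).mean(axis=0)  # average over longitudes, then months

# Both results are profiles along latitude and should match to rounding error.
print(time_then_zonal.shape)
print(np.allclose(time_then_zonal, zonal_then_time))
```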

With xarray, we can compute an average and plot it immediately, in one line of code:

In [None]:
sw_avg.mean(dim='lon').plot();

#### *Exercise*:
Interpret the global plots.  Where are the highest and lowest values found?  Why do you think they are where they are?

*(...your interpretation here...)*

***
### 4. Closer inspection: the `sel` command

We have seen that taking a mean over one dimension removes that dimension from our data, as in the zonal average above, which reduced a 2D map to a 1D profile.  Another common way to narrow down our data is to select a slice of interest.  For example, we might want to inspect January values only.

Xarray can select slices along a particular dimension using the built-in `sel` function.

In [None]:
sw_jan = sw.sel(month=1)
sw_jan.plot()

#### *Exercise*:
Use `sel` to select and plot the average outgoing shortwave radiation along a line of longitude that includes Greenland.  
Greenland is located at 42.5 degrees West, or 317.5 degrees in the 0-to-360 longitude units of the CERES dataset.

In [None]:
## your code here

***
### 5. Mathematical note: Arithmetic on a sphere
If we want to compute something like a global average (the scalar computed in section 2 above, for example) then we need to account for the fact that the Earth is spherical.  The area within each latitude-longitude grid box is largest at the equator and smallest at the poles.  That means that, if we simply add up all grid boxes with equal weight to take an average, polar values will be over-represented and equatorial values will be under-represented.

We can correct for this by assuming the Earth is a sphere (not exactly true, but close enough).  Then, we weight each grid-box value by the cosine of its latitude, which is proportional to the area of the grid box.  This is a [weighted mean](https://docs.xarray.dev/en/latest/examples/area_weighted_temperature.html).
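
For reference, the geometry behind the cosine weighting can be written out as follows. The symbols are introduced here only for illustration: $R$ is Earth's radius, $\phi_i$ is the latitude of grid row $i$, $\Delta\phi$ and $\Delta\lambda$ are the grid spacings in latitude and longitude (in radians), and $F_{ij}$ is the value in the grid box at latitude index $i$ and longitude index $j$. On a sphere, the area of a small grid box is approximately

$$A_{ij} \approx R^2 \cos\phi_i \, \Delta\phi \, \Delta\lambda$$

so the area-weighted global mean is

$$F_{\mathrm{global}} = \frac{\sum_{i,j} F_{ij} \, A_{ij}}{\sum_{i,j} A_{ij}} = \frac{\sum_{i,j} F_{ij} \cos\phi_i}{\sum_{i,j} \cos\phi_i}$$

The second equality holds on a regular grid, where $R^2 \, \Delta\phi \, \Delta\lambda$ is the same for every box and cancels, leaving only the cosine of latitude.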

First, we make a weight array:

In [None]:
weight = np.cos(np.deg2rad(ds.lat))
weight = weight / weight.sum()

In [None]:
weight.plot();

`weight` is an array of 180 elements, normalised so that its sum is 1. This is exactly what we need to compute a weighted average! First we average over the longitudes (another *zonal average*; this is valid because all points along a latitude circle have the same weight), and then we multiply by the weights and sum over latitude to get the weighted average.

In [None]:
zonal_sw_avg = sw_avg.mean(dim='lon')  # important! Always average over longitudes first
# this averaging is needed so that the arithmetic below makes sense 
# (multiply two arrays of 180 elements together)
weighted_sw_avg = np.sum(zonal_sw_avg * weight)
weighted_sw_avg.item()
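
As a sanity check on this two-step recipe, the sketch below repeats it on made-up numbers and compares the result with a direct weighted average in which every grid box gets its own weight. It uses plain NumPy on a synthetic 180 x 360 grid; the field values are random and are not CERES data.

```python
import numpy as np

# Synthetic 1-degree grid with the same layout as described above (180 lats x 360 lons).
lat = np.arange(-89.5, 90.0, 1.0)  # grid-box centre latitudes
lon = np.arange(0.5, 360.0, 1.0)   # grid-box centre longitudes

rng = np.random.default_rng(seed=1)
field = rng.uniform(0.0, 400.0, size=(lat.size, lon.size))  # made-up values

# Two-step recipe: zonal mean first, then a weighted sum over latitude.
weight = np.cos(np.deg2rad(lat))
weight = weight / weight.sum()
two_step = np.sum(field.mean(axis=1) * weight)

# Direct route: give every grid box its own cos(lat) weight.
weights_2d = np.broadcast_to(weight[:, np.newaxis], field.shape)
direct = np.average(field, weights=weights_2d)

print(two_step, direct)
print(np.isclose(two_step, direct))
```

The two routes agree because the weight depends only on latitude, so it can be applied before or after the average over longitude.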

If this seems like a pain to remember, I have good news for you: Xarray also has a built-in function to help with weighted averages.

In [None]:
sw_weighted = sw_avg.weighted(weight)
sw_weighted

In [None]:
weighted_mean = sw_weighted.mean()
weighted_mean

#### *Exercise*:
- Compute a weighted average of the incoming solar radiation.  
- Compare it with the raw (unweighted) average of incoming solar radiation.  
- Assess: Which is closer to the value shown in the literature ([Trenberth, Fasullo & Kiehl 2009](https://www2.cgd.ucar.edu/staff/trenbert/trenberth.papers/TFK_bams09.pdf), Figure 1)?

In [None]:
## your code here

***

## Part 2: Access and process the data

### Lab Procedure
1. Read and inspect the data.  Answer questions 1-3 on the paper lab sheet.
2. Conduct a closer inspection with summary statistics.  Compute the mean of at least one component -- for simplicity choose outgoing longwave radiation. Determine whether it has the correct order of magnitude by comparing with the Trenberth, Fasullo, and Kiehl figure.
3. Make global plots of each component (solar_clim, shortwave, longwave)
4. Compute the **energy balance**: incoming (solar_clim) minus outgoing (shortwave and longwave).
5. Address the science questions by:
    - Computing the *area weighted average* of the energy balance
    - Producing a global plot of net energy balance
    - Producing two seasonally offset plots of net energy balance, selecting a specific month (e.g. months 1 and 7)

In [None]:
## Add cells of code and markdown as needed here to complete the lab

***
### Endnotes
- Data availability: You downloaded a copy of the data files stored on a remote server.  You can find the official source of the EBAF-TOA and the EBAF-Surface data products [on this webpage](https://ceres.larc.nasa.gov/data/). The files used here are climatologies (i.e. monthly averages over 2005-2015). The data quality summary of these data (PDF) can be found [here](https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_EBAF_Ed4.1_DQS.pdf), and more accessible publications can be found [here for TOA](https://journals.ametsoc.org/doi/pdf/10.1175/JCLI-D-17-0208.1) and [here for Surface](https://journals.ametsoc.org/doi/pdf/10.1175/JCLI-D-17-0523.1).
- Development: The content of this lab is based on Fabien Maussion's Physics of the Climate System notebooks ([landing page](https://fabienmaussion.info/climate_system/welcome.html); [original notebook](https://fabienmaussion.info/climate_system/week_02/01_Lesson_NetCDF_Data.html)).
- This lab was last updated by Lizz Ultee, 20 Feb 2024.