# GPGN268 - Geophysical Data Analysis
## Data Story 02 - Global Warming
#### Due: March 28, 2023 at 11:59 pm


For this Data Story you will look at [Argo](https://argo.ucsd.edu/) profiles and climatology data. Profiles from two floats are available on Canvas (float `5901429` and float `1901487`). The climatology can be downloaded through the [Scripps Argo website](https://sio-argo.ucsd.edu/RG_Climatology.html) by selecting the file "2004-2018 RG Argo Temperature Climatology" (you will need ~3 GB of space for that).


You may discuss this assignment with your peers, but everyone should submit their assignment individually. If anyone has *significantly* contributed to your work, helped you figure something important out, etc., list them as a collaborator below with a short description of their input. **Collaborating will not impact your grade**. Please be honest.



### Preparation

- Navigate to the GPGN268-CORE directory and do a `git pull` to get this notebook. 

- Then navigate to your `ds02-global-warming` directory and copy this notebook to your notebooks directory.

```
$ cd ~/work/classes/GPGN268/coursework-lastname/ds02-global-warming/
$ cp ~/work/classes/GPGN268/GPGN268-CORE/assignments/DS02-global-warming.ipynb notebooks
```

- Launch Jupyter Lab. Remember to activate the GPGN268 conda environment first.

```
$ conda activate GPGN268
$ jupyter lab
```

- Using the left navigation toolbar in Jupyter Lab, go to the `notebooks` directory and rename this notebook to `dev.ipynb`. This will be where you develop the code for your data story (try things out, make draft figures, etc.). **You will not** turn in (i.e., push to GitHub) the `dev.ipynb` file.

- Create another notebook called `ds02-global-warming.ipynb`. This is where you will put the final version of your Data Story, with polished text, and clean and well-documented code.

- Copy the text below onto the first cell (Markdown) of your `ds02-global-warming.ipynb` notebook and fill it out with your name and date.

```markdown
# GPGN268 - Geophysical Data Analysis
## Data Story 02 - Global Warming

**Student:** Blaster the Burro 
**Collaborators:**
- Yoda helped me figure out how to use the force
- Obi-Wan provided input on my code to plot resistivity
**Date:** May the 4th, 2078
```

- Complete the tasks below. Use this notebook (`dev.ipynb`) to explore and follow the instructions. After you are done with the final version of your assignment, git add `ds02-global-warming.ipynb`, commit, and push to GitHub.

## Part I – Introducing your story. 

Watch this [YouTube video](https://www.youtube.com/watch?v=3VUNhHABcB0) and browse the [Argo program website](https://argo.ucsd.edu/). Using the resources above and any other material that comes up in your research, write a couple of paragraphs introducing your Data Story. Some questions to keep in mind are:

- What are Argo floats, how do they work, and what do they measure?
- What data are you going to be looking at and where did it come from?
- What is this type of data used for? What are some scientific/societal applications of this type of observation?
- How is the data structured (time series, spatial data, 1D, 2D), what is the format of the data, and what are the advantages/disadvantages of this particular format?
- What are some common tools used for analyzing and interpreting this type of data?

## Part II - Float analysis
### Task 2.1 – Reading and processing float data
The float data comes with lots of variables that will not be useful for our analysis. In class, we saw how to select specific variables from an `xarray.Dataset` and how to manipulate/organize the variables. The code below illustrates how you would process one file.

```python
# Import xarray and load the data
import xarray as xr

ds_raw = xr.load_dataset('float_file1.nc')
# List of variables that we will use
variables = ['TEMP_ADJUSTED','PSAL_ADJUSTED',
             'LATITUDE', 'LONGITUDE', 'JULD']
# Select only these variables from the whole dataset
ds = ds_raw[variables]
# Rename the variable 'JULD' to 'time' and make time a dimension
# (in place of 'N_PROF')
ds = ds.rename({'JULD':'time'}).swap_dims({'N_PROF':'time'})
# Change variable names to names that are cleaner and easier to type
ds = ds.rename({'TEMP_ADJUSTED':'temperature',
                'PSAL_ADJUSTED':'salinity',
                'LATITUDE':'latitude',
                'LONGITUDE':'longitude'})
# Define a common pressure coordinate based on the average
# pressure at each level across all profiles
mean_pressure = ds_raw.PRES_ADJUSTED.mean(dim='N_PROF')
# Create a new variable "pressure" in the dataset and specify the units
ds['pressure'] = mean_pressure
ds['pressure'].attrs['units'] = 'dbar'
# Make "pressure" one of the dimensions (in place of 'N_LEVELS')
ds = ds.swap_dims({'N_LEVELS':'pressure'})
```

Use the code above (after making any necessary adjustments) to read the data from one float. After you test for one float, write a function that takes the path to an Argo profile netCDF file as input and returns the cleaned-up dataset. For example:

```python
ds29 = read_float_data('path_to_5901429_prof.nc')
ds87 = read_float_data('path_to_1901487_prof.nc')
```

Document your function and use expressive programming.
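
The `rename` and `swap_dims` steps can be tried out on a tiny synthetic dataset before touching a real float file. The sketch below is only an illustration: the values are made up, and the sizes (3 profiles, 4 levels) are assumptions chosen to keep the output readable. Only the variable and dimension names follow the Argo conventions used above.

```python
import numpy as np
import xarray as xr

# Toy stand-in for an Argo profile file: 3 profiles x 4 levels (made-up values)
n_prof, n_levels = 3, 4
toy_raw = xr.Dataset(
    {
        "TEMP_ADJUSTED": (("N_PROF", "N_LEVELS"),
                          10.0 + np.random.rand(n_prof, n_levels)),
        "PRES_ADJUSTED": (("N_PROF", "N_LEVELS"),
                          np.tile([5.0, 50.0, 100.0, 200.0], (n_prof, 1))),
        "JULD": ("N_PROF",
                 np.array(["2010-01-01", "2010-01-11", "2010-01-21"],
                          dtype="datetime64[ns]")),
    }
)

# Keep temperature and time, then make time the profile dimension
toy = toy_raw[["TEMP_ADJUSTED", "JULD"]]
toy = toy.rename({"JULD": "time", "TEMP_ADJUSTED": "temperature"})
toy = toy.swap_dims({"N_PROF": "time"})

# Use the profile-mean pressure as the vertical coordinate
toy["pressure"] = toy_raw.PRES_ADJUSTED.mean(dim="N_PROF")
toy["pressure"].attrs["units"] = "dbar"
toy = toy.swap_dims({"N_LEVELS": "pressure"})

print(toy.temperature.dims)  # dimension names after both swaps
```

Printing `toy` before and after each `swap_dims` call is a quick way to see which names are dimensions and which are plain variables.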

### Task 2.2 – Visualizing float temperature and salinity sections

- Use `matplotlib` to plot sections of temperature and salinity for each float as in the figure below:

![](media/float_profiles.png)

- Properly label your axes and variables. Work on this problem in parts: try plotting each variable individually before combining them into subplots. Here is some code to help you get things into subplots:

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(12,8))
[ax1, ax2, ax3, ax4] = axes.flatten()
# Note you will need to transpose the temperature (using .T) to plot
cs1 = ax1.pcolormesh(ds29.time, ds29.pressure, ds29.temperature.T, cmap='magma')
ax1.invert_yaxis()
ax1.set_ylabel('....................')
ax1.set_title("................", y=1.4, fontsize=20)
cbar1 = plt.colorbar(cs1, ax=ax1,
                     label='................',
                     orientation='..............',
                     location = '.............')

```
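
The transpose in the snippet above is needed because `pcolormesh(x, y, C)` expects `C` with shape `(len(y), len(x))`, while the cleaned-up temperature has dimensions `(time, pressure)`. Here is a minimal sketch with made-up numbers (the array sizes and values are assumptions, not float data):

```python
import matplotlib.pyplot as plt
import numpy as np

time = np.arange(5)                  # 5 hypothetical profiles
pressure = np.linspace(0, 1000, 8)   # 8 hypothetical pressure levels (dbar)
# Synthetic temperature with shape (time, pressure): decays with depth
temperature = 20.0 * np.exp(-pressure / 300.0) * np.ones((time.size, 1))

fig, ax = plt.subplots()
# C must have shape (len(y), len(x)) = (8, 5), hence the transpose
mesh = ax.pcolormesh(time, pressure, temperature.T, cmap="magma")
ax.invert_yaxis()  # pressure increases downward
fig.colorbar(mesh, ax=ax, label="Temperature (made-up values)")
```

Passing `temperature` without `.T` here would raise an error because the array shape would not match the coordinates.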

### Task 2.3 – Float trajectories

- Use `cartopy` and `matplotlib` to plot the trajectories of each float, like in the figure below:

![](media/float_trajectories.png)


### Task 2.4 – Interpreting ocean temperature and salinity

- Based on the temperature and salinity sections that you produced for Task 2.2 and the float trajectories for Task 2.3, describe and interpret your results. Below are some guiding questions; however, I'm not looking for bullet point answers to each of them, but instead a coherent description of your results in paragraph form, as you would do in a report/paper.

    - What story can you tell from your sections? 
    - How do temperature and salinity vary with time, depth, and geographical location?
    - What are some main differences between the observations from float `5901429` and float `1901487`? 
    - Is that consistent with what you would expect? In what way?


### Task 2.5 – Average temperature and salinity profiles

Often in data analysis, it helps to have two different variables (with different units and scales) shown on the same plot. We will use the mean temperature and salinity profiles to learn how to do that.

Based on our map from Task 2.3, float `1901487` didn't move much, but float `5901429` went on a long ride. To gain insight into how ocean properties vary in polar regions, let's select profiles from float `5901429` only at places south of 60$^\circ$S.

```python
south_temp = ds29.temperature[ds29.latitude<-60]
south_sal = .......................
```
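
Boolean selection along the time dimension can be checked on a small example first. The sketch below uses made-up latitudes and temperatures (all values are assumptions) and shows that the mask keeps only the profiles located south of 60°S:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Four hypothetical profiles, two of them south of 60S
toy = xr.Dataset(
    {
        "temperature": (("time", "pressure"),
                        np.array([[5.0, 3.0], [1.0, 0.5], [0.8, 0.2], [6.0, 4.0]])),
        "latitude": ("time", np.array([-45.0, -62.0, -65.0, -50.0])),
    },
    coords={
        "time": pd.date_range("2010-01-01", periods=4, freq="10D"),
        "pressure": [10.0, 100.0],
    },
)

# The boolean mask has one value per profile (time step)
is_south = toy.latitude < -60
south_temp = toy.temperature[is_south]

print(south_temp.sizes)              # only the southern profiles remain
print(south_temp.mean(dim="time"))   # one mean value per pressure level
```

Because the mask is one-dimensional along `time`, it selects whole profiles and leaves the `pressure` dimension untouched.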

Now we would like to plot the time-averaged temperature and salinity for float `5901429` south of 60$^\circ$S on the same plot. For that, we are going to use matplotlib's [twiny](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.axes.Axes.twiny.html) function, which creates a second x-axis that shares the same y-axis.

```python
fig, ax1 = plt.subplots(figsize=(4, 6))

color = 'tomato'
ax1.plot(south_temp.mean(..........), ............, color=color)
ax1.set_xlabel('..............', color=color)
ax1.set_ylabel('..................')
ax1.set_ylim([0, 800])
ax1.invert_yaxis()
ax1.tick_params(axis='x', labelcolor=color)
ax1.grid(ls='dotted', which='both')

# instantiate a second axes that shares the same y-axis
ax2 = ax1.twiny()
color = 'steelblue'
ax2.plot(................, ....................)
ax2.set_xlabel('................', color=color)
ax2.tick_params(axis='x', labelcolor=color)
```

Modify the code above to produce the plot below (remember to add comments so you understand what each command is doing).

<img src="media/float_twiny.png" width="400">



## Part III - Climatology analysis

For this part of your Data Story, we will be looking at a product that represents a [climatology](https://en.wikipedia.org/wiki/Climatology) of Argo float data from 2004 through 2018. These are monthly averaged temperature measurements from all available Argo floats, gridded on a $1^\circ \times 1^\circ$ grid. You can read more about this product [here](https://sio-argo.ucsd.edu/RG_Climatology.html).


### Task 3.1 – Reading and processing climatology data

- If you try to load the climatology file, you will get an error. Explain (in writing) what this error is and how to fix it. 
- Load the data into a dataset.

```python
ds = xr.open_dataset('path_to_climatology_RG_ArgoClim_Temperature_2019.nc', ..........)
```

Now, you will see that the variable 'TIME' has units of "months since 2004-01-01 00:00:00" and one of its attributes is "time_origin". We would like our time to be in [numpy.datetime64](https://numpy.org/doc/stable/reference/arrays.datetime.html) format so we can easily do operations with our data. We will have to manually create a time variable and substitute it into our dataset.

```python
import numpy as np

# Create a variable with the date of our first data point (origin)
t0 = np.datetime64("2004-01")
# Create a sequence representing the number
# of months since 2004-01 ranging from 0 to 179
months = range(len(ds.TIME))
# Create an array of dates where you add 0...179 months
# to the origin date t0
time = np.array([t0 + np.timedelta64(m, "M") for m in months])
```

After running the steps above, check what the variable `time` is so you understand what we just did.
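
To see why this works, note that `np.datetime64("2004-01")` has month resolution, so adding a `np.timedelta64` in months steps through the calendar correctly. Here is a quick standalone check, assuming 180 monthly steps (in the notebook this number comes from `len(ds.TIME)`):

```python
import numpy as np

t0 = np.datetime64("2004-01")   # origin, stored with month ("M") resolution
n_months = 180                  # assumed here; in the notebook use len(ds.TIME)
time = np.array([t0 + np.timedelta64(m, "M") for m in range(n_months)])

print(time.dtype)          # month-resolution datetime dtype
print(time[0], time[-1])   # first and last month of the record
```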

Now we will do some cleanup on our dataset and add the correct time variable:

```python
ds = ds.rename({'TIME':'time'})
ds = ds.rename({'ARGO_TEMPERATURE_MEAN':'temp_mean',
                'ARGO_TEMPERATURE_ANOMALY':'temp_anom',
                'PRESSURE':'pressure',
                'LATITUDE':'latitude',
                'LONGITUDE':'longitude'})
# Replace the old values with the time array that we created
ds['time'] = time
```

- Look at what `ds` is and write a short explanation of what this data structure represents.

#### 🤯 Bonus on hvplot

Python has an interactive plotting library called [hvplot](https://hvplot.holoviz.org/). We can quickly plot the depth-averaged (pressure-averaged) temperature anomaly by running:

```python
import hvplot.xarray

depth_ave_temp = ds.temp_anom.mean(dim='pressure')
depth_ave_temp.hvplot('longitude', 'latitude', cmap='RdBu_r', clim=(-3, 3))
```

### Task 3.2 – Seasonal Averages

Like `pandas`, `xarray` has a very clever way of grouping data to perform operations. We would like to plot maps of the pressure-averaged temperature anomaly, averaged over each season. We will use the method [groupby](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.groupby.html) to perform this operation. Here, we're grouping the data by season and computing the average.

```python
seasonal_temp = depth_ave_temp.groupby("time.season").mean()
```

This works for other time periods too. You could try using "time.year" for a yearly average, for example.
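
As a sanity check of what `groupby("time.season")` does, the sketch below builds a made-up monthly series (an assumption for illustration, not the climatology) in which each value is simply its month number, so the seasonal means can be verified by hand:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of monthly data whose value equals the month number (1..12)
time = pd.date_range("2004-01-01", periods=24, freq="MS")
toy = xr.DataArray(time.month.values.astype(float),
                   coords={"time": time}, dims="time")

# Group all time steps by season label and average within each group
seasonal = toy.groupby("time.season").mean()
print(seasonal)

# DJF averages December, January, and February: (12 + 1 + 2) / 3
print(float(seasonal.sel(season="DJF")))
```

The `time` dimension is replaced by a `season` dimension with one entry per season label.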

- Look at the variable `seasonal_temp`, its dimensions and coordinates, to understand what happened.

- Use `matplotlib`, `cartopy`, and the code snippets below to make seasonal maps of temperature anomaly.

- Describe what is plotted on the maps and interpret your results.


```python

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

# Define limits for the colorbar
vmin = -1
vmax = 1
# Creates four subplots (2 rows and 2 columns)
fig, axes = plt.subplots(2, 2, figsize=(8, 6), subplot_kw={'projection': ccrs.PlateCarree()})
axes = axes.flatten()

for i, s in enumerate(seasonal_temp.season):
    ax = axes[i]
    cs = ax.pcolormesh(....., ....., seasonal_temp.sel(season=....),
                       transform = ccrs.PlateCarree(),
                       vmin=vmin,vmax=vmax,cmap='RdBu_r')
    ax.set_title("{}".format(s.values))
    ax.coastlines()
    gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
                  linewidth=1, color='gray',
                  alpha=0.5, linestyle='dotted')
    gl.top_labels = False
    gl.right_labels = False

cax = fig.add_axes([0.99, 0.35, 0.02, 0.4])
cbar = plt.colorbar(cs, cax=cax,label='.............')
fig.subplots_adjust(bottom=0.25, top=0.9, left=0.05, right=0.95,
                    wspace=0.3, hspace=0.01)
```


![](media/seasonal_averages.png)

### Task 3.3 – Ocean Warming

- Now, we will look at the temporal evolution of the global mean ocean temperature over the 15 years covered by the climatology. What do you see in the data?

```python
global_mean = depth_ave_temp.mean(dim=[...., ....])
global_mean.plot()
```

In order to minimize the fluctuations and focus on the long-term evolution of the temperature, we will apply a technique that is commonly used in the geosciences: smoothing. We will use `xarray`'s [rolling](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.rolling.html) method to smooth the data with a 12-month running mean (equivalent to a 12-point window along the time dimension in our case).

```python
running_mean = global_mean.rolling(time=12, center=True).mean()
```
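
To build intuition for what the 12-point running mean does, the sketch below applies it to a synthetic monthly series made of a linear trend plus a seasonal cycle (both invented for illustration). The 12-month window averages out the seasonal cycle and leaves the trend. The ends of the series become NaN because a full window is not available there.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series: linear trend + 12-month seasonal cycle (made-up numbers)
time = pd.date_range("2004-01-01", periods=180, freq="MS")
months = np.arange(time.size)
slope_per_month = 0.001
series = slope_per_month * months + 0.5 * np.sin(2 * np.pi * months / 12)
toy = xr.DataArray(series, coords={"time": time}, dims="time")

# Same call as in the notebook: 12-point centered running mean
smooth = toy.rolling(time=12, center=True).mean()

print(int(smooth.isnull().sum()))             # NaN points at the two ends
print(float(toy.std()), float(smooth.std()))  # the smoothed series varies much less
```

When plotting, matplotlib simply skips the NaN values, so the smoothed curve is slightly shorter than the original one.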

- Use `matplotlib` to reproduce the figure below.

![](media/global_warming.png)


The early 2000s were marked by what scientists call a "hiatus", during which the trend in ocean warming flattened. Around 2012, the temperature anomaly started to pick up again.

- Using any method of your choice (including eyeballing), estimate the temperature trend between 2012 and 2018 and give your final answer in units of **degrees Celsius per decade**. Do some research to check if this is comparable to values described in the literature/web. Include links to your references in your answer.
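
One possible way to estimate a trend (among many, and not necessarily the method intended here) is a least-squares straight-line fit with `numpy.polyfit`. The sketch below runs on a synthetic series with a known, made-up trend, so its numbers are not a result from the Argo data, and all variable names are hypothetical. The slope comes out in degrees per year and is multiplied by 10 to get degrees per decade.

```python
import numpy as np

# Synthetic monthly anomalies for 2012-2018 with a known trend (made-up values)
true_trend_per_decade = 0.2                 # degC per decade, assumed for this demo only
years = 2012 + np.arange(7 * 12) / 12.0     # time in fractional years
elapsed = years - years[0]                  # years since the start of the fit window
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.001, years.size)
anomaly = (true_trend_per_decade / 10.0) * elapsed + noise

# Degree-1 polynomial fit returns [slope, intercept]; slope is in degC per year
slope_per_year, intercept = np.polyfit(elapsed, anomaly, deg=1)
trend_per_decade = 10.0 * slope_per_year

print(f"Estimated trend: {trend_per_decade:.2f} degC per decade")
```

With the real data, the time coordinate would first need to be converted to a numeric axis (for example, fractional years) and any NaN values left by the running mean removed before fitting.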