## This Jupyter notebook will show you how to handle 3-dimensional data (cube: lat x lon x time)

## We will load daily Chl-a data from 2009 for the Portuguese Coast:

First, we need to import several modules/libraries that are essential for nearly all scientific work in Python.

In [None]:
import os #change folders
import numpy as np # perform calculations and basic math
import matplotlib.pyplot as plt # plot data
import pandas as pd # work with dataframes,tables, spreadsheets, etc.
import netCDF4 as nc4 # work with netcdf files, the standard file for satellite 2D and 3D data
import cartopy #work with geographical projections and maps
#import datetime # this library is also useful for working with dates, convert dates, etc.

## First, let's load the 3D dataset using the netCDF4 module.

In [None]:
# Let's open the file with the daily Chl-a images for 2009
file = 'chl_2009.nc' #write the name of the file
chl_2009 = nc4.Dataset(file, mode='r') #open the file in python
print(chl_2009.variables) # check variables

## Notice what has changed:
### * Chl-a now has an extra dimension with a length of 365 elements
### * We now have a new variable: time (with 365 elements)

Let's try printing the new variable

In [None]:
print(chl_2009['time'][:]) #Date in DDMMYYYY

In [None]:
# Extracting variables
lon = np.array(chl_2009['longitude'])
lat = np.array(chl_2009['latitude'])
chl = np.array(chl_2009['Chl-a'])
dates = np.array(chl_2009['time'])
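
Depending on how the file was written, netCDF4 may return missing pixels as a masked array rather than as NaN. The sketch below assumes that case and shows one way to turn masked cells into NaN so that `np.nanmean` can skip them later. The small array and the fill value of -999.0 are made up for illustration; they are not taken from `chl_2009.nc`.

```python
import numpy as np

# Synthetic stand-in for one small Chl-a image; -999.0 is an assumed fill value
raw = np.array([[0.5, -999.0], [1.5, 2.0]])
masked = np.ma.masked_equal(raw, -999.0)  # what a masked read might look like

# Replace masked cells with NaN so that np.nanmean ignores them
chl_demo = np.ma.filled(masked.astype(float), np.nan)

print(chl_demo)
print(np.nanmean(chl_demo))  # mean of the three valid pixels only
```

If your data already contains NaN where pixels are missing, this step is not needed.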

## We now have satellite images for each day of the year 2009!
## That means we have 365 2D datasets similar to what we saw in the previous exercise

### Let's try:
* Calculating the yearly mean Chl-a map
* Calculating the average Chl-a for March
* Checking how March deviates from the yearly average

In [None]:
# Calculating the yearly average!
chl_2009mean = np.nanmean(chl, 2)
# numpy.nanmean handles missing data
# If you don't have missing data, you can use just numpy.mean
# The 2 is the axis along which the mean is calculated. We want to average over time,
# and time is the third dimension (LAT X LON X TIME). Remember, in Python we start counting at 0!
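
To see what averaging along axis 2 does, here is a minimal sketch on a tiny synthetic cube (2 x 2 pixels, 3 time steps). The values are made up; only the behaviour of `np.mean` and `np.nanmean` matters.

```python
import numpy as np

# Synthetic cube with shape (lat, lon, time) = (2, 2, 3); one pixel has a gap
cube = np.array([[[1.0, 2.0, 3.0], [4.0, np.nan, 6.0]],
                 [[0.0, 0.0, 3.0], [2.0, 2.0, 2.0]]])

time_mean = np.nanmean(cube, axis=2)  # average over time, NaN ignored
plain_mean = np.mean(cube, axis=2)    # same, but a single NaN spoils the pixel

print(time_mean.shape)  # (2, 2)
print(time_mean)
print(plain_mean)
```

The time dimension disappears and a single 2D map remains. The pixel with the gap gets a value from `np.nanmean` but stays NaN with `np.mean`.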

Plot it using Cartopy and Matplotlib!

In [None]:
plt.figure(figsize=(6,6))
map = plt.axes(projection=cartopy.crs.PlateCarree())
map.coastlines(resolution='10m', color='black', linewidth=1) #add a coastline
map.set_extent([-15, -6, 36, 45]) # set the extent of the map to avoid blank spaces
map.add_feature(cartopy.feature.NaturalEarthFeature(category='physical', name='land', #add different color to land
                                                    scale='10m',
                                                    facecolor=cartopy.feature.COLORS['land']))
f1 = map.pcolormesh(lon, lat, np.log10(chl_2009mean), vmin=np.log10(0.1),
                    vmax=np.log10(10), cmap=plt.cm.jet)
gl = map.gridlines(draw_labels=True, alpha=0.5, linestyle='dotted', color='black') # Add gridlines
plt.xticks(fontsize=14) #increase size of ticks
plt.yticks(fontsize=14)
cbar = plt.colorbar(f1, ticks=[np.log10(0.1), np.log10(0.5), np.log10(1), np.log10(3), np.log10(10)]) #add a colorbar
cbar.ax.set_yticklabels(['0.1', '0.5', '1', '3', '10'], fontsize=14)
cbar.set_label(r'Chlorophyll $\it{a}$ (mg.m$^{-3}$)', fontsize=14) #add a label to the colorbar (raw string so that \i is not read as an escape)

## Now let's calculate the average Chl-a for March 2009

### First, we have to find which images correspond to March by looking at the time variable!

### NASA 2009 Day of the Year Calendar: https://asd.gsfc.nasa.gov/Craig.Markwardt/doy2009.html



In [None]:
# Let's try it - remember that indices start at 0
print(dates[59]) # 01-03-2009
print(dates[89]) # 31-03-2009

# Getting chl-a data just for March

chl_March2009 = chl[:, :, 59:90] # We write 90 instead of 89 because we want to include 89 (31 March)
print(chl_March2009.shape) # Check shape to see if it's ok!
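
Instead of looking the indices up in a calendar, they can be computed with the standard `datetime` module. This is a minimal sketch that assumes the time axis has exactly one entry per calendar day, starting on 1 January, with no days missing. The helper name `month_slice` is hypothetical.

```python
from datetime import date

def month_slice(year, month):
    """Return a slice with the 0-based day-of-year indices of one month."""
    first = date(year, month, 1)
    start = first.timetuple().tm_yday - 1  # 0-based index of the first day
    # First day of the following month (January of the next year after December)
    next_first = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    n_days = (next_first - first).days
    return slice(start, start + n_days)

march = month_slice(2009, 3)
print(march)  # slice(59, 90, None)
```

Under that assumption, `chl[:, :, march]` selects the same images as `chl[:, :, 59:90]`.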

In [None]:
## Let's calculate and plot the average Chl-a during March 2009
chl_March2009_mean = np.nanmean(chl_March2009, 2)
plt.figure(figsize=(6,6))
map = plt.axes(projection=cartopy.crs.PlateCarree())
map.coastlines(resolution='10m', color='black', linewidth=1) #add a coastline
map.set_extent([-15, -6, 36, 45]) # set the extent of the map to avoid blank spaces
map.add_feature(cartopy.feature.NaturalEarthFeature(category='physical', name='land', #add different color to land
                                                    scale='10m',
                                                    facecolor=cartopy.feature.COLORS['land']))
f1 = map.pcolormesh(lon, lat, np.log10(chl_March2009_mean), vmin=np.log10(0.1),
                    vmax=np.log10(10), cmap=plt.cm.jet)
gl = map.gridlines(draw_labels=True, alpha=0.5, linestyle='dotted', color='black') # Add gridlines
plt.xticks(fontsize=14) #increase size of ticks
plt.yticks(fontsize=14)
cbar = plt.colorbar(f1, ticks=[np.log10(0.1), np.log10(0.5), np.log10(1), np.log10(3), np.log10(10)]) #add a colorbar
cbar.ax.set_yticklabels(['0.1', '0.5', '1', '3', '10'], fontsize=14)
cbar.set_label(r'Chlorophyll $\it{a}$ (mg.m$^{-3}$)', fontsize=14) #add a label to the colorbar (raw string so that \i is not read as an escape)

## March 2009 appears to have much higher Chl-a concentrations than the average for 2009
## Let's calculate and plot the difference between the two (anomaly)

In [None]:
# Calculate the difference. When the shapes of the two arrays match, you can just subtract them
chl_March2009_anomaly = chl_March2009_mean - chl_2009mean
# Now let's plot the differences
plt.figure(figsize=(6,6))
map = plt.axes(projection=cartopy.crs.PlateCarree())
map.coastlines(resolution='10m', color='black', linewidth=1) #add a coastline
map.set_extent([-15, -6, 36, 45]) # set the extent of the map to avoid blank spaces
map.add_feature(cartopy.feature.NaturalEarthFeature(category='physical', name='land', #add different color to land
                                                    scale='10m',
                                                    facecolor=cartopy.feature.COLORS['land']))
f1 = map.pcolormesh(lon, lat, chl_March2009_anomaly, cmap=plt.cm.jet)
gl = map.gridlines(draw_labels=True, alpha=0.5, linestyle='dotted', color='black') # Add gridlines
plt.xticks(fontsize=14) #increase size of ticks
plt.yticks(fontsize=14)
cbar = plt.colorbar(f1) #add a colorbar
#cbar.ax.set_yticklabels(['0.1', '0.5', '1', '3', '10'], fontsize=14)
cbar.set_label(r'Chlorophyll $\it{a}$ anomaly (mg.m$^{-3}$)', fontsize=14) #add a label to the colorbar


## Notice how the colorbar is not correctly aligned: 0 should be at the center

## Plus, since we are looking at the difference between March and the entire year, let's choose another colormap/palette that is more suitable.

### We can see all colormaps that matplotlib offers here: https://matplotlib.org/stable/tutorials/colors/colormaps.html

In [None]:
# Now let's plot the differences
plt.figure(figsize=(6,6))
map = plt.axes(projection=cartopy.crs.PlateCarree())
map.coastlines(resolution='10m', color='black', linewidth=1) #add a coastline
map.set_extent([-15, -6, 36, 45]) # set the extent of the map to avoid blank spaces
map.add_feature(cartopy.feature.NaturalEarthFeature(category='physical', name='land', #add different color to land
                                                    scale='10m',
                                                    facecolor=cartopy.feature.COLORS['land']))
f1 = map.pcolormesh(lon, lat, chl_March2009_anomaly, cmap=plt.cm.seismic, vmin=-1.5, vmax=1.5)
gl = map.gridlines(draw_labels=True, alpha=0.5, linestyle='dotted', color='black') # Add gridlines
plt.xticks(fontsize=14) #increase size of ticks
plt.yticks(fontsize=14)
cbar = plt.colorbar(f1) #add a colorbar
#cbar.ax.set_yticklabels(['0.1', '0.5', '1', '3', '10'], fontsize=14)
cbar.set_label(r'Chlorophyll $\it{a}$ anomaly (mg.m$^{-3}$)', fontsize=14) #add a label to the colorbar
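
The limits of -1.5 and 1.5 were chosen by hand. As a possible alternative, symmetric limits can be derived from the data so that 0 always sits at the center of a diverging colormap. The sketch below uses a small synthetic anomaly array (made-up values), not the real `chl_March2009_anomaly`.

```python
import numpy as np

# Synthetic anomaly map with made-up values and one missing pixel
anomaly_demo = np.array([[-0.4, 0.1], [2.0, np.nan]])

# Largest absolute anomaly (NaN ignored), used for both ends of the color scale
limit = np.nanmax(np.abs(anomaly_demo))
vmin, vmax = -limit, limit

print(vmin, vmax)  # -2.0 2.0
```

These values could then be passed to `pcolormesh` as `vmin=vmin, vmax=vmax`. A single extreme pixel will stretch the scale, so hand-picked limits such as the ones used above can still be the better choice.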

### Finally, let's go back to the beginning and convert the 3D data into a 1D timeseries.

## Again, let's use the 100 pixels (10 X 10) in the upper left corner (as in the previous notebook)

If you remember, our chlorophyll-a data for 2009 now has the following dimensions: Lat X Lon X Time (216 X 216 X 365)

Therefore, we want to extract the first 10 pixels of latitude and longitude and keep the entire time dimension

In [None]:
chl_2009_subset = chl[0:10, 0:10, :] #notice how we leave : in the third dimension (time)
print(chl_2009_subset.shape) # 10 * 10 * 365 pixels

# Calculate the spatial average within this 10 X 10 pixel box to get a 1D dataset
chl_2009_subset_1D = np.nanmean(chl_2009_subset, (0,1)) # Now we want to average spatially (first and second dimension)
print(chl_2009_subset_1D.shape) # (365,) - one spatially averaged value per day
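
On days when every pixel in the box is missing (for example under full cloud cover), `np.nanmean` returns NaN and emits a `RuntimeWarning`. The sketch below, on a synthetic 2 x 2 x 3 subset with made-up values, shows one way to count the valid pixels per day and average only the days that have data.

```python
import numpy as np

# Synthetic subset with shape (lat, lon, time) = (2, 2, 3); day 1 is fully missing
subset = np.array([[[1.0, np.nan, 3.0], [3.0, np.nan, 5.0]],
                   [[np.nan, np.nan, 4.0], [2.0, np.nan, 4.0]]])

# Number of valid (non-NaN) pixels on each day
n_valid = np.sum(~np.isnan(subset), axis=(0, 1))

# Start with NaN everywhere and fill in only the days that have data
series = np.full(subset.shape[2], np.nan)
has_data = n_valid > 0
series[has_data] = np.nanmean(subset[:, :, has_data], axis=(0, 1))

print(n_valid)  # valid pixels per day
print(series)   # spatial mean per day, NaN where nothing was observed
```

Keeping `n_valid` around is also useful for judging how much each daily mean can be trusted.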

# And plot using what we learned from the first jupyter notebook of the class!

In [None]:
plt.figure(figsize=(12,6))
#plt.plot(pixel1_chla, c='r', label='Pixel 1')
#plt.plot(chl_2009_subset_1D, c='b', linestyle='--', label='Pixel 2')
plt.plot(chl_2009_subset_1D, c='r', linestyle='-', marker='o', markerfacecolor='k', markeredgecolor='k')
plt.xlabel('Date', fontsize=20)
plt.ylabel(r'Chl-$\it{a}$ (mg.m$^{-3}$)', fontsize=20)
plt.yticks(fontsize=16)
plt.xticks(ticks= [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334],
           labels=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], fontsize=16)
#plt.xlim(0,len(pixel1_chla))
#plt.ylim(0, 2)
plt.title(r'2009 Chl-$\it{a}$', fontsize=26)
#plt.legend(loc=0, fontsize=14)
#plt.tight_layout()
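
The tick positions in `plt.xticks` are the 0-based indices of the first day of each month. They can be computed rather than typed by hand, which also handles leap years. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
import calendar
from itertools import accumulate

year = 2009
# Number of days in each month of the chosen year
days_per_month = [calendar.monthrange(year, m)[1] for m in range(1, 13)]

# Index of the first day of each month: running total of the preceding months
month_starts = [0] + list(accumulate(days_per_month))[:-1]
month_labels = [calendar.month_abbr[m] for m in range(1, 13)]

print(month_starts)
```

For 2009 this gives the same twelve positions as the list used in the plot, so `plt.xticks(ticks=month_starts, labels=month_labels)` would be an equivalent call.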

## Notice the missing data and the spring bloom.