In [1]:
# Import necessary packages here, e.g.
import numpy as np               # for numerical operations
import matplotlib.pyplot as plt  # for plotting

# Add other packages as needed, for example:
# import pandas as pd            # for data manipulation

# The Influence of El Niño on the Primary Production Rate off the West Coast of South America

**Yiyang Liu**  

OCEAN 215 Final Project  
Autumn 2024

## Introduction

<!-- Include your motivation and any background information needed to understand your research question and data analysis. Clearly state your research question and hypothesis. -->

El Niño is a well-known and influential oceanographic phenomenon whose root cause is a weakening of the trade winds.
When the trade winds weaken, warm surface water moves back east. This warms the surface of the eastern tropical Pacific Ocean and weakens the Humboldt Current (roughly 4°S to 45°S), which in turn weakens the upwelling along the west coast of South America.

Without nutrients supplied from depth, fewer phytoplankton grow off the coast.
My hypothesis is that El Niño causes a decline in the phytoplankton population, and in the primary production rate, off the west coast of South America, because the warmer surface water weakens upwelling and reduces the nutrient supply.

## Data sources:
<!-- List each data source used in your project. For each dataset, include: 
     - Where you accessed it from 
     - Date you accessed/downloaded it 
     - Data collection process
     - Relevant variables (including unit) 
     - Spatial and temporal coverage and resolution 
     - Description of any obstacles or challenges you faced in obtaining the dataset
     - URL to the dataset if available -->

- Dataset 1
  - Downloaded from: GHRSST Level 4 AVHRR_OI Global Blended Sea Surface Temperature Analysis (GDS2), from NCEI
  - Data collection process: satellite measurements of the ocean surface
  - Relevant variables included: 
    - Sea surface temperature (K)
  - Spatial coverage and resolution: global, 0.1-degree grid
  - Obstacles to data access: data for each day had to be accessed individually, so I had to acquire 32 data files and merge them together using pandas
  - Temporal coverage and resolution: all-time and ongoing
  - URL to dataset: https://simonscmap.com/catalog/datasets/Near_Real_Time_SST_AVHRR_OI 
<br>

- Dataset 2
  - Downloaded from: Global Ocean Biogeochemistry Analysis and Forecast
  - Data collection process: satellite measurements of the ocean surface
  - Relevant variables included: 
    - Chlorophyll, CHL (mg/m^3)
  - Spatial coverage and resolution: global, 0.5-degree grid
  - Obstacles to data access: it was difficult to turn the data into an animation
  - Temporal coverage and resolution: 2011-2019
  - URL to dataset: https://data.marine.copernicus.eu/product/GLOBAL_ANALYSISFORECAST_BGC_001_028/description
<br>

## 1. Sea Surface Temperature in an El Niño Year and a Normal Year

<!-- In this cell, introduction the data set(s) you are working with, specify which aspect of your research question this section addresses, and describe the figure you will produce. -->


In [2]:
# load data file(s)
import numpy as np
import pandas as pd
import xarray as xr
import os 
import matplotlib.pyplot as plt
import pycmap
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER

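path = "MUR-JPL-L4-GLOB-v4.1_4.1-20241204_043114/"
# Sort the listing so the daily files are processed in date order, and keep only NetCDF files
fl_list_tem = sorted(f for f in os.listdir(path) if f.endswith('.nc'))
print(fl_list_tem)

# Compute the mean SST of each daily file
average_temperature_all = []
for deployment in fl_list_tem:
    # Build the file path from `path` instead of repeating the folder name
    with xr.open_dataset(os.path.join(path, deployment)) as ds:
        # .item() converts the 0-d DataArray into a plain float
        average_temperature = ds['analysed_sst'].mean().item()
    average_temperature_all.append(average_temperature)
# as part of commenting your code throughout this section, discuss any problems you encountered and how you solved them
# Problem encountered: data for each day had to be accessed individually, so I had to acquire 32 data files and merge them together using pandas. The result was huge and needed to be organized.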

In [3]:
# perform data cleaning steps
print(average_temperature_all)
max_tem = max(average_temperature_all)
print(max_tem)
filepath = 'MUR-JPL-L4-GLOB-v4.1_4.1-20241204_043114/20151001090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc'
read_file = xr.open_dataset(filepath)
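# Drop latitude rows and longitude columns that contain no data at all
cleaned_read_file = read_file.dropna(dim='lat', how='all')
cleaned_read_file = cleaned_read_file.dropna(dim='lon', how='all')  # chain on the cleaned dataset so the 'lat' step is not discarded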

# Coarsen the latitude and longitude dimensions to 0.5 degree resolution
cleaned_read_file = cleaned_read_file.coarsen(lat=50, lon=50, boundary='trim').mean() #the boundary parameter in the coarsen method determines how to handle the edges of the data
# trim: Drops the excess entries that do not fit into the window size, which can be useful for avoiding errors or unnecessary padding

# Define the range of latitude and longitude
lat_range = slice(-30, -5)
lon_range = slice(-110, -70)

# Select the data within the specified range
cleaned_read_file = cleaned_read_file.sel(lat=lat_range, lon=lon_range)
display(cleaned_read_file)



filepath_normal = 'MUR-JPL-L4-GLOB-v4.1_4.1-20131001-20131031/20131001090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc'
read_file_normal = xr.open_dataset(filepath_normal)
display(read_file_normal)

# Drop latitude rows and longitude columns that contain no data at all
cleaned_read_file_normal = read_file_normal.dropna(dim='lat', how='all')
cleaned_read_file_normal = cleaned_read_file_normal.dropna(dim='lon', how='all')

# Coarsen the latitude and longitude dimensions to 0.5 degree resolution
# (done after both dropna steps so the coarsened result is not overwritten)
cleaned_read_file_normal = cleaned_read_file_normal.coarsen(lat=50, lon=50, boundary='trim').mean() #the boundary parameter in the coarsen method determines how to handle the edges of the data
# trim: Drops the excess entries that do not fit into the window size, which can be useful for avoiding errors or unnecessary padding

# Define the range of latitude and longitude
lat_range_normal = slice(-30, -5)
lon_range_normal = slice(-110, -70)

# Select the data within the specified range
cleaned_read_file_normal = cleaned_read_file_normal.sel(lat=lat_range_normal, lon=lon_range_normal)
display(cleaned_read_file_normal)
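
The coarsening step above averages non-overlapping 50 × 50 blocks of grid cells and trims any leftover rows or columns. The sketch below illustrates the same idea in plain NumPy on a small synthetic array. The function name `block_mean` and the demo array are hypothetical, the sketch does not skip missing values, and it is not meant to describe how xarray implements `coarsen` internally.

```python
import numpy as np

def block_mean(field, window):
    """Average non-overlapping window x window blocks of a 2D array.

    Rows and columns that do not fill a complete block are trimmed,
    which mirrors the effect of boundary='trim'.
    """
    n_lat, n_lon = field.shape
    # Trim the edges so both dimensions are exact multiples of the window
    n_lat_trim = (n_lat // window) * window
    n_lon_trim = (n_lon // window) * window
    trimmed = field[:n_lat_trim, :n_lon_trim]
    # Reshape so each block gets its own pair of axes, then average over them
    blocks = trimmed.reshape(n_lat_trim // window, window,
                             n_lon_trim // window, window)
    return blocks.mean(axis=(1, 3))

# Synthetic 5 x 7 field; with window=2 the last row and last column are trimmed
demo = np.arange(35, dtype=float).reshape(5, 7)
coarse = block_mean(demo, 2)
print(coarse.shape)
print(coarse)
```

Each output cell is the mean of one block, so a larger window gives a smoother but coarser field.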


In [4]:
# perform data analysis
# Calculate the average temperature along the longitude dimension
avg_temp_per_lat = cleaned_read_file['analysed_sst'].mean(dim='lon')

# Find the index of the latitude with the highest average temperature
max_temp_index = avg_temp_per_lat.argmax().item()

# Get the latitude value with the highest average temperature
max_temp_lat = avg_temp_per_lat['lat'].isel(lat=max_temp_index).item()
max_temp_value = avg_temp_per_lat.isel(lat=max_temp_index).item()

display(max_temp_lat)
display(max_temp_value)
selected_lat = cleaned_read_file.sel(lat=max_temp_lat, method='nearest')
display(selected_lat)


# the normal compared data
avg_temp_per_lat_normal = cleaned_read_file_normal['analysed_sst'].mean(dim='lon')

max_temp_index_normal = avg_temp_per_lat_normal.argmax().item()

max_temp_lat_normal = avg_temp_per_lat_normal['lat'].isel(lat=max_temp_index_normal).item()
max_temp_value_normal = avg_temp_per_lat_normal.isel(lat=max_temp_index_normal).item()

display(max_temp_lat_normal)
display(max_temp_value_normal)

selected_lat_normal = cleaned_read_file_normal.sel(lat=max_temp_lat_normal, method='nearest')

In [5]:
# make and display the first figure
lon = selected_lat['lon']
tem = selected_lat['analysed_sst']

lon_normal = selected_lat_normal['lon']
tem_normal = selected_lat_normal['analysed_sst']
display(selected_lat_normal)

plt.figure(figsize=(10, 6))
plt.plot(lon, tem.isel(time=0), label='SST',marker='o')
plt.plot(lon_normal, tem_normal.isel(time=0), label='SST_normal',marker='.',color='lightskyblue')

plt.xlabel('Longitude')
plt.ylabel('SST (K)')
plt.title('SST vs Longitude')
plt.legend()
plt.grid(color='gray', linestyle='--', linewidth=0.5)

# Show the plot
plt.show()

<!-- Describe how this figure helps address your research question. What patterns or insights can you observe? -->
**Figure 1 Caption:** This figure fills in the temperature values that are missing from the reference graph and shows how much the sea surface temperature increased compared to normal conditions.
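
To put a number on how much the temperature increased, the two longitude profiles can be subtracted. The sketch below is one possible way to do this with NumPy, assuming both profiles are 1D arrays in kelvin with increasing longitudes. The function name `sst_difference` and the synthetic values are illustrative only and are not results from this project.

```python
import numpy as np

def sst_difference(lon_event, sst_event, lon_ref, sst_ref):
    """Return the event-minus-reference SST difference on the event longitudes.

    The reference profile is linearly interpolated onto the event longitudes,
    so the two profiles do not need to share the same grid.
    Both longitude arrays must be sorted in increasing order.
    """
    sst_ref_on_event = np.interp(lon_event, lon_ref, sst_ref)
    return sst_event - sst_ref_on_event

# Synthetic profiles (made-up numbers, not project data)
lon_event = np.array([-110.0, -100.0, -90.0, -80.0])
sst_event = np.array([299.0, 298.5, 298.0, 297.0])
lon_ref = np.array([-110.0, -90.0, -70.0])
sst_ref = np.array([297.0, 296.0, 295.0])

diff = sst_difference(lon_event, sst_event, lon_ref, sst_ref)
print(diff)         # difference in K at each event longitude
print(diff.mean())  # mean difference along the transect
```

A temperature difference in kelvin is numerically the same as a difference in degrees Celsius, so no unit conversion is needed for the result.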

## 2. Chlorophyll and Nitrate Concentrations in 2015

<!-- In the introduction below, specify the data set(s) you are working with in this section, which aspect of your research question this section addresses, and the figure you will produce -->


In [6]:
# if a new dataset, load data and perform data cleaning steps
# otherwise, continue to data analysis and figure creation
# Import packages
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
import pycmap

import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER

In [7]:
# perform data analysis
key = "91445b4d-7eb0-45d4-ab1a-23ddc877977d"                     
#call the CMAP API using your unique key
api = pycmap.API(token=key)
# call the API to pull data into a dataframe named pisces
pisces=api.query(
         '''
         SELECT [time], lat, lon, depth, CHL, NO3 FROM tblPisces_NRT
         WHERE 
         [time] BETWEEN '2015-01-01' AND '2015-12-31' AND 
         lat BETWEEN -35 AND -5 AND 
         lon BETWEEN -110 AND -70 AND
         depth BETWEEN 0 AND 1
         '''
         ) 
# create a pandas dataframe and convert to xarray
df_rows = pd.DataFrame(pisces).set_index(["time", "lat", "lon", "depth"])  # since this is gridded data, it is useful to transform it from a table into an array
ds = xr.Dataset.from_dataframe(df_rows)
display(ds)
# select the first depth level for chl and no3; the time dimension is kept for the animation
chl = ds['CHL'].isel(depth=0)  # append .mean('time') here to average over time instead
no3 = ds['NO3'].isel(depth=0)
display(chl)
del pisces
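
The table-to-array conversion above can be pictured as placing each row of a long table into the grid cell given by its coordinates. The NumPy sketch below shows that idea for a single time step and depth. The function name `table_to_grid` and the sample values are hypothetical, and this is a conceptual illustration rather than a description of what `xr.Dataset.from_dataframe` does internally.

```python
import numpy as np

def table_to_grid(lat, lon, values):
    """Place long-format (lat, lon, value) rows onto a 2D lat x lon grid.

    Cells with no matching row are left as NaN.
    """
    # np.unique returns the sorted axis values and, for each row,
    # the index of its coordinate along that axis
    lat_axis, lat_idx = np.unique(lat, return_inverse=True)
    lon_axis, lon_idx = np.unique(lon, return_inverse=True)
    grid = np.full((lat_axis.size, lon_axis.size), np.nan)
    grid[lat_idx, lon_idx] = values
    return lat_axis, lon_axis, grid

# Three made-up rows of a long table (not project data)
lat_rows = np.array([-10.0, -10.0, -5.0])
lon_rows = np.array([-80.0, -75.0, -80.0])
chl_rows = np.array([0.2, 0.4, 0.1])

lat_axis, lon_axis, grid = table_to_grid(lat_rows, lon_rows, chl_rows)
print(lat_axis, lon_axis)
print(grid)
```

The one cell without a matching row stays NaN, which is also how gaps appear after converting a table with missing coordinate combinations.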

In [8]:
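# make and display figure
# Imports needed by this cell that were not loaded earlier
import matplotlib.ticker as mticker
from matplotlib.animation import FuncAnimation
from matplotlib.colors import LogNorm

# Extract data dimensions
lat = chl['lat'].values
lon = chl['lon'].values
chl_vals = chl.values
no3_vals = no3.values
time = chl['time'].values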

# Define the region needed
region_extent = [-110, -70, -30, -5]  # [lon_min, lon_max, lat_min, lat_max]

assert len(chl_vals.shape) == 3, "chl_vals must be 3D (time, lat, lon)"
assert len(time) == chl_vals.shape[0], "Number of time steps must match"

# Function to initialize the map
def init():
    ax.set_extent(region_extent, crs=ccrs.PlateCarree())  # Set map zoom
    ax.coastlines(resolution='110m', color='black')
    ax.add_feature(cfeature.LAND, color='lemonchiffon')
    pcolormesh = ax.pcolormesh(
        lon, lat, chl_vals[0, :, :],
        cmap="Greens",
        transform=ccrs.PlateCarree(),
        norm=LogNorm(vmin=0.01, vmax=1),
    )
    # Add gridlines and labels
    gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--')
    gl.top_labels = False
    gl.right_labels = False
    gl.xlocator = mticker.FixedLocator([-110, -100, -90, -80, -70])
    gl.ylocator = mticker.FixedLocator([-30, -20, -10, -5])
    gl.xlabel_style = {'size': 10}
    gl.ylabel_style = {'size': 10}

    # Add title and colorbar once
    plt.title(f'Chlorophyll Map - Time: {str(time[0])[:10]}', fontsize=14)
    cbar = plt.colorbar(pcolormesh, ax=ax, orientation='vertical', shrink=0.9, pad=0.05)
    cbar.set_label('Chlorophyll Concentration (mg/m³)', fontsize=12)
    return pcolormesh,

# Function to update the map for each frame
def update(frame):
    ax.clear()
    ax.set_extent(region_extent, crs=ccrs.PlateCarree())  # Set map zoom
    ax.coastlines(resolution='110m', color='black')
    ax.add_feature(cfeature.LAND, color='lemonchiffon')
    pcolormesh = ax.pcolormesh(
        lon, lat, chl_vals[frame, :, :],
        cmap="Greens",
        transform=ccrs.PlateCarree(),
        norm=LogNorm(vmin=0.01, vmax=1),
    )
    gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--')
    gl.top_labels = False
    gl.right_labels = False
    gl.xlocator = mticker.FixedLocator([-110, -100, -90, -80, -70])
    gl.ylocator = mticker.FixedLocator([-30, -20, -10, -5])
    gl.xlabel_style = {'size': 10}
    gl.ylabel_style = {'size': 10}

    # Update title
    plt.title(f'Chlorophyll Map - Time: {str(time[frame])[:10]}', fontsize=14)
    return pcolormesh,

# Set up the figure and axis with adjusted layout
fig = plt.figure(figsize=(9, 5))  # Adjust figure size to balance map and colorbar
ax = plt.axes(projection=ccrs.PlateCarree())
plt.subplots_adjust(right=0.85)  # Leave space for the colorbar on the right

# Create animation
ani = FuncAnimation(fig, update, frames=len(time), init_func=init, blit=False)

# Save the animation as a GIF
ani.save('chlorophyll_animation_custom_lat_lon_2013.gif', writer='pillow', fps=10) #writer='pillow' tells matplotlib to use the Pillow library for creating an animated GIF
plt.close()


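##### plotting the nitrate map

# 2D coordinate grids used by the nitrate pcolormesh calls below
lon2d, lat2d = np.meshgrid(lon, lat)

assert len(no3_vals.shape) == 3, "no3_vals must be 3D (time, lat, lon)"
assert len(time) == no3_vals.shape[0],  "Number of time steps must match"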

# Function to initialize the map
def init():
    global cbar
    ax.set_extent(region_extent, crs=ccrs.PlateCarree())  # Set map zoom
    ax.coastlines(resolution='110m', color='black')
    ax.add_feature(cfeature.LAND, color='lemonchiffon')
    pcolormesh = ax.pcolormesh(
        lon2d, lat2d, no3_vals[0, :, :],
        cmap="Reds",
        transform=ccrs.PlateCarree(),
        norm=LogNorm(vmin=1, vmax=40),
    )
    # Add gridlines with customized labels
    gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--')
    gl.top_labels = False
    gl.right_labels = False
    gl.xlocator = mticker.FixedLocator([-110, -100, -90, -80, -70])
    gl.ylocator = mticker.FixedLocator([-30, -20, -10, -5])
    gl.xlabel_style = {'size': 10}
    gl.ylabel_style = {'size': 10}

    # Add title and colorbar once
    plt.title(f'NO3 Map - Time: {str(time[0])[:10]}', fontsize=14)
    cbar = plt.colorbar(pcolormesh, ax=ax, orientation='vertical', shrink=0.9, pad=0.05)
    cbar.set_label('NO3 Concentration (mmol/m³)', fontsize=12)
    return pcolormesh,

# Function to update the map for each frame
def update(frame):
    ax.clear()
    ax.set_extent(region_extent, crs=ccrs.PlateCarree())  # Set map zoom
    ax.coastlines(resolution='110m', color='black')
    ax.add_feature(cfeature.LAND, color='lemonchiffon')
    pcolormesh = ax.pcolormesh(
        lon2d, lat2d, no3_vals[frame, :, :],
        cmap="Reds",
        transform=ccrs.PlateCarree(),
        norm=LogNorm(vmin=1, vmax=40),  # same range as init() so the colorbar stays valid
    )
    # Add gridlines with customized label
    gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True, linewidth=1, color='gray', alpha=0.5, linestyle='--')
    gl.top_labels = False
    gl.right_labels = False
    gl.xlocator = mticker.FixedLocator([-110, -100, -90, -80, -70])
    gl.ylocator = mticker.FixedLocator([-30, -20, -10, -5])
    gl.xlabel_style = {'size': 10}
    gl.ylabel_style = {'size': 10}

    # Update title
    plt.title(f'NO3 Map - Time: {str(time[frame])[:10]}', fontsize=14)
    return pcolormesh,

# Set up the figure and axis with adjusted layout
fig = plt.figure(figsize=(9, 5))  # Adjust figure size to balance map and colorbar
ax = plt.axes(projection=ccrs.PlateCarree())
plt.subplots_adjust(right=0.85)  # Leave space for the colorbar on the right

# Create animation
ani = FuncAnimation(
    fig, update, frames=len(time), init_func=init, blit=False
)

# Save the animation as a GIF
ani.save('NO3_animation_custom_lat_lon_2015.gif', writer='pillow', fps=10)
plt.close()

<!-- Describe how this figure helps address your research question. What patterns or insights can you observe? -->
**Figure 2 Caption:** These animations show how chlorophyll and nitrate change over time. Both decreased noticeably near the coastline, but they appear to spread farther out into the ocean, giving a wider and more even distribution.
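
One way to check the impression of a wider, more even distribution is to compute a spatial coefficient of variation (standard deviation divided by mean) for every time step, where a lower value means a more uniform field. The sketch below works under that assumption. The function name `spatial_cv` and the synthetic data are hypothetical, and the grid cells are not weighted by area, which is a simplification.

```python
import numpy as np

def spatial_cv(field):
    """Spatial coefficient of variation for each time step.

    field has shape (time, lat, lon); NaN cells (for example land) are ignored.
    """
    mean = np.nanmean(field, axis=(1, 2))
    std = np.nanstd(field, axis=(1, 2))
    return std / mean

# Two synthetic time steps: one concentrated along one edge, one spread evenly
concentrated = np.array([[0.1, 0.1, 1.7],
                         [0.1, 0.1, 1.7]])
even = np.full((2, 3), 0.6)
field = np.stack([concentrated, even])

cv = spatial_cv(field)
print(cv)  # the first value is larger than the second
```

Plotting this value against time for the El Niño year and for a normal year would show whether the distribution really becomes more even.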

## Discussion
**Interpretation of Results:**
<!-- Summarize what you learned from each data analysis section/figure. Discuss key insights and conclusions regarding your research question. Do your results support or contradict your hypothesis? -->
During El Niño, a weaker Humboldt Current and weaker upwelling seem to spread the nutrients out, and with them the phytoplankton. The phytoplankton population decreased noticeably near the coastline, but it is hard to tell whether the phytoplankton population or the primary production decreased at a larger scale. The result is a wider and more even distribution.

**Limitations and Future Work:**
<!-- Identify limitations in your analysis. Discuss any factors that may have impacted the validity or reliability of your results (e.g., data quality, sample size, assumptions). -->
<!-- What next steps could you or another researcher take to continue investigating this research question? Suggest ideas for further research, data collection, or alternative methodologies that could enhance understanding of the topic. -->
A better approach would be to compute the difference in chlorophyll between normal and El Niño periods and plot that difference, which would show the relevant change more clearly.
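
The difference map suggested here could be computed roughly as follows. This is a minimal sketch, assuming the chlorophyll fields for the two periods are available as NumPy arrays of shape (time, lat, lon) on the same grid. The function name `chl_difference` and the synthetic inputs are hypothetical.

```python
import numpy as np

def chl_difference(chl_event, chl_normal):
    """Time-mean chlorophyll difference (event minus normal) at each grid cell.

    Both inputs have shape (time, lat, lon) and share the same lat/lon grid;
    the number of time steps may differ. NaN time steps are ignored.
    """
    if chl_event.shape[1:] != chl_normal.shape[1:]:
        raise ValueError("both fields must share the same lat/lon grid")
    return np.nanmean(chl_event, axis=0) - np.nanmean(chl_normal, axis=0)

# Synthetic example (made-up values, not project data)
chl_normal = np.full((3, 2, 2), 0.5)
chl_event = np.full((4, 2, 2), 0.3)
chl_event[:, 0, 1] = 0.7  # one cell is higher during the event period

diff_map = chl_difference(chl_event, chl_normal)
print(diff_map)  # negative values mean less chlorophyll during the event
```

The resulting 2D array could be drawn with `pcolormesh` and a diverging colormap such as `RdBu_r` centered on zero, so that increases and decreases are easy to tell apart.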

## References  

1. NOAA National Centers for Environmental Information. (2020). *Daily L4 Optimally Interpolated SST (OISST) In situ and AVHRR Analysis*. Ver. 2.1. PO.DAAC, CA, USA. Dataset accessed [YYYY-MM-DD] at https://doi.org/10.5067/GHAAO-4BC21
2. Joiner, J., Y. Yoshida, P. Koehler, C. Frankenberg, and N.C. Parazoo. (2023). *L2 Daily Solar-Induced Fluorescence (SIF) from MetOp-A GOME-2, 2007-2018, V2*. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/2292
