# **Ocean acidification**
---

**Content creators:** C. Gabriela Mayorga Adame

**Content reviewers:** Jenna Pearson, Abigail Bodner

**Content editors:** Zane Mitrevica, Natalie Steinemann

**Production editors:** TBD

**Our 2023 Sponsors:** TBD

In [None]:
# @title #**Project background** 
#This will be a short video introducing the content creator(s) and motivating the research direction of the template.
#The Tech team will add code to format and display the video

In this project, you will analyse ocean model and observational data from global databases, extract variables such as pH, CO2, and temperature, and investigate ocean acidification in your region of interest. You will also explore the relationships between these variables and their impact on marine ecosystems.

# **Project template**
<p align='center'><a href="https://github.com/ClimatematchAcademy/course-content/blob/main/projects/template-images/ocean_acidification_template_map.svg"><img src="https://github.com/ClimatematchAcademy/course-content/blob/main/projects/template-images/ocean_acidification_template_map.svg?raw=True" alt="Ocean acidification" vw="100" vh="75" /></a></p>

# **Data exploration notebook**
## **Project setup**



Please run the following cells!
    



In [None]:
# Imports

import random
import numpy as np
import matplotlib.pyplot as plt

Click on the data source below to jump to the respective section:
- [NOAA ocean pH, acidity, and Revelle Factor](#datasource1)
- [Copernicus](#datasource2)


## **NOAA ocean pH, acidity, and Revelle Factor**

### **Global surface ocean pH, acidity, and Revelle Factor on a 1x1 degree global grid from 1770 to 2100 (National Centers for Environmental Information, NOAA Accession 0206289)**

This dataset contains the spatial distribution of surface ocean **pH (total hydrogen scale)**, acidity (or hydrogen ion activity, unit: nmol/kg, or 10^-9 mol/kg) and Revelle Factor (a measure of the ocean's buffer capacity, unitless) on a 1x1 degree global grid (Longitude: [20.5:1:379.5], Latitude: [-89.5:1:89.5]) for all 12 months of the years from 1770 to 2100. The data product combines a recent observational seawater carbon dioxide (CO2) data product, i.e., the 6th version of the Surface Ocean CO2 Atlas (1991-2018, ~23 million observations), with temporal trends at individual locations of the global ocean from a robust Earth System Model (ESM2M), to provide a high-resolution, regionally varying view of global surface ocean pH, acidity, and the Revelle Factor. The climatology extends from the pre-Industrial era (1770 C.E.) to the end of this century under historical atmospheric CO2 concentrations (pre-2005) and the Representative Concentration Pathways (RCP2.6, RCP4.5, RCP6.0 and RCP8.5, post-2005) of the Intergovernmental Panel on Climate Change (IPCC)’s 5th Assessment Report (AR5).

**Citation:** 
Jiang, L.-Q., B. R. Carter, R. A. Feely, S. Lauvset, and A. Olsen (2019), Surface ocean pH and buffer capacity: past, present and future, Scientific Reports, 9:18624, doi:10.1038/s41598-019-55039-4.
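
The link between the two quantities stored in this dataset, pH and acidity, can be made concrete with a short sketch. It assumes the usual definition pH = -log10([H+]) with [H+] in mol/kg. The function names and the example pH values are illustrative only; they are not part of the dataset or of any library.

```python
import numpy as np


def ph_to_acidity_nmol_per_kg(ph):
    """Convert pH to hydrogen ion concentration in nmol/kg.

    Assumes pH = -log10([H+]) with [H+] in mol/kg, so that
    [H+] in nmol/kg equals 10 ** (9 - pH).
    """
    return 10.0 ** (9.0 - np.asarray(ph, dtype=float))


def acidity_change_percent(ph_start, ph_end):
    """Percent change in [H+] when pH moves from ph_start to ph_end."""
    ph_start = np.asarray(ph_start, dtype=float)
    ph_end = np.asarray(ph_end, dtype=float)
    ratio = 10.0 ** (ph_start - ph_end)
    return (ratio - 1.0) * 100.0


# Illustrative values, not taken from the dataset.
print(ph_to_acidity_nmol_per_kg(8.0))    # 10 nmol/kg
print(acidity_change_percent(8.2, 8.1))  # roughly 26 % more hydrogen ions
```

Because the pH scale is logarithmic, a seemingly small drop of 0.1 pH units corresponds to an increase in hydrogen ion concentration of about a quarter.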


In [None]:
# Dataset-specific imports

!pip install netCDF4
import netCDF4 as nc



In [None]:
# Code to retrieve and load the data
!wget https://www.ncei.noaa.gov/data/oceans/ncei/ocads/data/0206289/Surface_pH_1770_2100/Surface_pH_1770_2000.nc



--2023-05-14 00:49:36--  https://www.ncei.noaa.gov/data/oceans/ncei/ocads/data/0206289/Surface_pH_1770_2100/Surface_pH_1770_2000.nc
Resolving www.ncei.noaa.gov (www.ncei.noaa.gov)... 205.167.25.171, 205.167.25.177, 205.167.25.172, ...
Connecting to www.ncei.noaa.gov (www.ncei.noaa.gov)|205.167.25.171|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 150351486 (143M) [application/x-netcdf]
Saving to: ‘Surface_pH_1770_2000.nc’


2023-05-14 00:49:38 (68.1 MB/s) - ‘Surface_pH_1770_2000.nc’ saved [150351486/150351486]



These are the files of the climate change projections under various scenarios (RCP 2.6 to 8.5) for those feeling adventurous: 
*   Surface_pH_2010_2100_RCP26.nc 
*   Surface_pH_2010_2100_RCP45.nc	 
*   Surface_pH_2010_2100_RCP60.nc 
*   Surface_pH_2010_2100_RCP85.nc 

To load one of them, replace the filename in the download URL above and in the loading cells that follow.
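
As a sketch of that substitution, the snippet below builds the download URL for one scenario file. It assumes the projection files sit in the same directory as `Surface_pH_1770_2000.nc`, which the source does not confirm. `scenario_url` is a hypothetical helper written for this example.

```python
# Directory copied from the wget command above.
BASE_URL = (
    "https://www.ncei.noaa.gov/data/oceans/ncei/ocads/data/0206289/"
    "Surface_pH_1770_2100/"
)

# File names as listed above, keyed by RCP scenario.
SCENARIO_FILES = {
    "RCP26": "Surface_pH_2010_2100_RCP26.nc",
    "RCP45": "Surface_pH_2010_2100_RCP45.nc",
    "RCP60": "Surface_pH_2010_2100_RCP60.nc",
    "RCP85": "Surface_pH_2010_2100_RCP85.nc",
}


def scenario_url(scenario):
    """Return the assumed download URL for one RCP scenario file."""
    if scenario not in SCENARIO_FILES:
        raise ValueError(
            f"unknown scenario {scenario!r}; choose from {sorted(SCENARIO_FILES)}"
        )
    return BASE_URL + SCENARIO_FILES[scenario]


url = scenario_url("RCP85")
print(url)
```

In a notebook, the resulting string can then be passed to the download command, for example with `!wget $url`.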

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


The two datasets used in this project are in netCDF format. Loading them is very straightforward and no pre-processing is required. 

We can load and inspect the **surface pH** as follows:



In [None]:
# Print the variable names and array shapes of the dataset
with nc.Dataset('Surface_pH_1770_2000.nc', 'r') as nc_fid:
    vnames = nc_fid.variables.keys()
    print(vnames)
    lat = nc_fid['Latitude'][:]  # [:] reads the data into memory
    lon = nc_fid['Longitude'][:]
    ph = nc_fid['pH'][:]
    yrs = nc_fid['Year'][:]  # read now; the variable is unusable once the file is closed
    # Month is just 1 to 12, so there is no need to retrieve it as a variable
    print(ph.shape, lat.shape, lon.shape, yrs.shape)

dict_keys(['Longitude', 'Latitude', 'Month', 'Year', 'pH'])
(24, 12, 180, 360) (180, 360) (180, 360) (24,)


In [None]:
import xarray as xr
ds1 = xr.open_dataset('Surface_pH_1770_2000.nc')
ds1
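
The cells above load the pH field but stop short of reducing or plotting it. The sketch below shows one possible way to go from the loaded arrays to a regional mean and a map. It assumes that `ph` has dimensions (year, month, lat, lon), that `lat` and `lon` are two-dimensional (as the printed shapes suggest), and that missing values are masked or NaN. The helper names `wrap_longitude`, `regional_mean`, and `plot_ph_map` and the example region are hypothetical, and the demo runs on synthetic stand-in arrays rather than the real file.

```python
import numpy as np


def wrap_longitude(lon):
    """Map longitudes (e.g. 20.5..379.5) onto the -180..180 range."""
    return (np.asarray(lon, dtype=float) + 180.0) % 360.0 - 180.0


def regional_mean(field, lat, lon, lat_bounds, lon_bounds):
    """Average the last two (lat, lon) axes of `field` inside a box.

    This is an unweighted mean over grid cells; for large regions an
    area weighting (e.g. by cos(latitude)) would be more appropriate.
    """
    lon180 = wrap_longitude(lon)
    box = (
        (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
        & (lon180 >= lon_bounds[0]) & (lon180 <= lon_bounds[1])
    )
    # Turn masked values into NaN so that nanmean can skip them.
    data = np.ma.filled(np.ma.asarray(field, dtype=float), np.nan)
    return np.nanmean(data[..., box], axis=-1)


def plot_ph_map(ph2d, lat, lon, title="Surface pH"):
    """Draw one (lat, lon) slice; matplotlib is imported only when called."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    mesh = ax.pcolormesh(lon, lat, ph2d, shading="auto")
    fig.colorbar(mesh, ax=ax, label="pH (total scale)")
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    ax.set_title(title)
    return fig


# Synthetic stand-in with the same layout as the arrays loaded above.
lat_demo, lon_demo = np.meshgrid(
    np.arange(-89.5, 90.0, 1.0), np.arange(20.5, 380.0, 1.0), indexing="ij"
)
ph_demo = np.full((2, 12, 180, 360), 8.1)
ph_demo[1] -= 0.1  # the second "year" is uniformly 0.1 pH units lower

means = regional_mean(ph_demo, lat_demo, lon_demo, (30, 45), (-80, -60))
print(means.shape)  # one value per (year, month)
```

With the real arrays, the equivalent calls would be along the lines of `regional_mean(ph, lat, lon, (30, 45), (-80, -60))` and `plot_ph_map(ph[-1, 0], lat, lon)`, provided the assumed dimension order holds.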

## **Copernicus**

Copernicus is the Earth observation component of the European Union’s Space programme, looking at our planet and its environment to benefit all European citizens. It offers information services that draw from satellite Earth Observation and in-situ (non-space) data.

The European Commission manages the Programme. It is implemented in partnership with the Member States, the European Space Agency (ESA), the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), the European Centre for Medium-Range Weather Forecasts (ECMWF), EU Agencies and Mercator Océan.

Vast amounts of global data from satellites and ground-based, airborne, and seaborne measurement systems provide information to help service providers, public authorities, and other international organisations improve European citizens' quality of life and beyond. The information services provided are free and openly accessible to users.

**Source**: https://www.copernicus.eu/en/about-copernicus

### **cams-global-ghg-reanalysis-egg4-monthly** 

https://www.copernicus.eu/en/access-data/copernicus-services-catalogue/cams-global-ghg-reanalysis-egg4-monthly


From this dataset we will use **CO2 column-mean molar fraction** from the single-level chemical vertical integrals variables and **Sea Surface Temperature** from the single-level meteorological variables (in case you need to download them directly from the catalogue).


This dataset is part of the ECMWF Atmospheric Composition Reanalysis focusing on long-lived greenhouse gases: carbon dioxide (CO2) and methane (CH4). The emissions and natural fluxes at the surface are crucial for the evolution of the long-lived greenhouse gases in the atmosphere. In this dataset the CO2 fluxes from terrestrial vegetation are modelled in order to simulate the variability across a wide range of scales from diurnal to inter-annual. The CH4 chemical loss is represented by a climatological loss rate and the emissions at the surface are taken from a range of datasets.

Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using a model of the atmosphere based on the laws of physics and chemistry. This principle, called data assimilation, is based on the method used by numerical weather prediction centres and air quality forecasting centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way to allow for the provision of a dataset spanning back more than a decade. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product.

**Source & further information:** https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-global-ghg-reanalysis-egg4-monthly?tab=overview


In [None]:
# TODO: SSTyCO2_CAMS_Copernicus_data.nc must be made available in the working directory before the cells below can run
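
Because the file is not fetched automatically, one way to fail early with a clearer message is to look for it in a few candidate folders before opening it. This is only a sketch: `find_data_file` is a hypothetical helper, and the Google Drive folder shown is an assumed location, not one confirmed by the project.

```python
from pathlib import Path


def find_data_file(filename, search_dirs):
    """Return the first existing path to `filename`, or None if absent."""
    for folder in search_dirs:
        candidate = Path(folder) / filename
        if candidate.is_file():
            return candidate
    return None


# Assumed locations: the working directory and a mounted Google Drive.
CANDIDATE_DIRS = [".", "/content/drive/MyDrive"]

cams_path = find_data_file("SSTyCO2_CAMS_Copernicus_data.nc", CANDIDATE_DIRS)
if cams_path is None:
    print("SSTyCO2_CAMS_Copernicus_data.nc not found; download it from the catalogue first.")
else:
    print(f"Found data file at {cams_path}")
```

The returned path can be passed straight to `nc.Dataset` or `xr.open_dataset` in place of the bare filename.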

We can load and inspect the **sea surface temperature** and **CO2 concentration** as follows:

In [None]:
with nc.Dataset('SSTyCO2_CAMS_Copernicus_data.nc', 'r') as nc_fid2:
    vnames2 = nc_fid2.variables.keys()
    print(vnames2)
    lat2 = nc_fid2['latitude'][:]  # [:] reads the data into memory
    lon2 = nc_fid2['longitude'][:]
    time2 = nc_fid2['time'][:]  # read from nc_fid2, not the already closed nc_fid
    co2 = nc_fid2['tcco2'][:]
    sst = nc_fid2['sst'][:]
    print(co2.shape, sst.shape, lat2.shape, lon2.shape, time2.shape)

FileNotFoundError: ignored

In [None]:
ds2 = xr.open_dataset('SSTyCO2_CAMS_Copernicus_data.nc')
ds2
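
Once both variables are loaded, a common first step towards relating them is to reduce each field to an area-weighted mean time series and compare the two. The sketch below assumes fields with dimensions (time, latitude, longitude) on a regular grid with a one-dimensional `latitude`, which the source does not confirm for this file. The function name `area_weighted_mean` and the synthetic demo data are illustrative only.

```python
import numpy as np


def area_weighted_mean(field, lat_deg):
    """Mean over (lat, lon) with cos(latitude) weights, ignoring NaNs."""
    data = np.ma.filled(np.ma.asarray(field, dtype=float), np.nan)
    # On a regular lat/lon grid, cell area is proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat_deg))[:, np.newaxis] * np.ones(data.shape[-2:])
    valid = np.isfinite(data)
    weighted_sum = np.where(valid, data * weights, 0.0).sum(axis=(-2, -1))
    weight_total = np.where(valid, weights, 0.0).sum(axis=(-2, -1))
    return weighted_sum / weight_total


# Synthetic fields that both rise linearly in time (NOT real data).
lat_demo = np.arange(-89.5, 90.0, 1.0)
months = np.arange(24)
sst_demo = 290.0 + 0.01 * months[:, None, None] + np.zeros((24, 180, 360))
co2_demo = 400.0 + 0.2 * months[:, None, None] + np.zeros((24, 180, 360))

sst_ts = area_weighted_mean(sst_demo, lat_demo)
co2_ts = area_weighted_mean(co2_demo, lat_demo)

# Pearson correlation between the two mean time series.
r = np.corrcoef(sst_ts, co2_ts)[0, 1]
print(sst_ts.shape, round(float(r), 3))
```

A high correlation between two series that share a long-term trend does not by itself show a causal link, so removing the trend or the seasonal cycle first is often worth trying.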

# **Further reading**

Missing