[Jupyter Book](https://geo-smart.github.io/oceanography/intro.html) and [GitHub repo](https://github.com/geo-smart/oceanography).


> Author's note: This notebook establishes a science basis that the subsequent chapters will
address (by dint of laborious data wrangling). The chapter concludes with a list of subsequent
grouped chapters to establish a roadmap.


# Ocean Science



> But I now leave my cetological System standing thus unfinished, even as the great Cathedral of Cologne was left, with the crane still standing upon the top of the uncompleted tower. For small erections may be finished by their first architects; grand ones, true ones, ever leave the copestone to posterity. God keep me from ever completing anything. This whole book is but a draught—nay, but the draught of a draught. Oh, Time, Strength, Cash, and Patience! <br><br> -Herman Melville



```{figure} ../img/revelle.jpg
---
height: 300px
name: directive-fig
---
Research Vessel Revelle (Scripps)
```

## Science basis


The purpose of this Jupyter Book is to present research ideas in oceanography relative 
to observational data from sensors. The underlying intent is to provide a reproducible 
methodology; but this tends to take up all the oxygen in the room. So before diving in
let us frame out the science, starting with an ambitious question:
<br><br><br>

${\Large \textrm{How stable is the epipelagic ocean?}}$

<br>

### The *epipelagic ocean* defined

Before tackling what is meant by *stability*
let us define the term 'epipelagic ocean'. *Epipelagic* is more or
less synonymous with *sunlit* or *photic* and it is applied as a modifier to *zone*. 
That is, the upper 200 meter layer of the water column is the epipelagic zone; 
this being the maximal depth of downwelling sunlight, which is in turn the 
energy source for primary production. We are considering the ecosystem occupied 
primarily by phytoplankton in the upper ocean, an engine that powers the food
web by converting solar energy to chemical energy.


Moving from *epipelagic zone* to *epipelagic ocean*: The epipelagic zone changes in 
nature from one place to another; so how stable is it *everywhere*? Alas we have as 
our observational starting point a mere three observing sites, located fairly close 
to one another in the northeast Pacific. The good news is that each
of these sites gathers an unprecedented amount of data by means of "shallow profiler"
sensing platforms.


### Limits on observation


Placing shallow profilers in vast quantities would be prohibitively expensive.
Working with three of them however puts us in a position to comment on *stability* in 
two ways. First we will observe the state of the upper ocean at unprecedented fine scale.
Second we hope to identify key features of the water column that might be amenable to other 
sensor programs, particularly ARGO drifters and satellite remote sensing of the sea surface.


### The meaning of *stability*


Stability is not a simple reifiable value. Here we define it very broadly.
A list of interpretive stability parameters and dimensions includes:


- depth dimension: Through the upper 200 meters on scales of centimeters to tens of meters
- physical stability: temperature, density of water, available light
- chemical stability: salinity, dissolved oxygen, inorganic carbon
- biological stability: nutrient concentration (nitrates), particulate distribution, fluorescence, ...
- lateral structure, scales from meters to mesoscale (hundreds of kilometers)
- time scales: minutes to days to seasonal to annual to multi-year climatology
- perturbation in relation to larger phenomena
    - sea state, storms, temperature, upwelling, eddies, currents, terrigenous influence (runoff)
- perturbation in relation to small phenomena such as plankton lensing
- stability of stratified sub-structure
    - mixed layer depth, multiple clines (pycnocline, thermocline etcetera), lower epipelagic
- intrinsic signal consistency, for example presence/absence of inversion layers

    
From an empirical perspective these dimensions of stability can in many cases be seen as
forms of standard deviation. But that is looking ahead to strategy. 
Hopefully by defining *stability* in this wide multidimensional sense we are open to
a great deal of interest in the *structure* of the upper ocean.
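
To make the standard deviation idea concrete, here is a minimal sketch that treats stability at each depth as the spread of a sensor value across several profiles. The temperature values, the depth bins and the function name `depth_binned_spread` are invented for illustration; they are not taken from the profiler data.

```python
from statistics import mean, pstdev

# Hypothetical temperature profiles (deg C), one list per profile,
# all sampled at the same depth bins. Values are invented.
depth_bins_m = [10, 50, 100, 150, 200]
profiles = [
    [12.1, 10.4, 8.9, 8.1, 7.6],
    [12.9, 10.6, 8.8, 8.1, 7.6],
    [11.5, 10.1, 9.0, 8.0, 7.5],
    [13.3, 10.9, 8.7, 8.2, 7.6],
]

def depth_binned_spread(profiles):
    """Return a (mean, standard deviation) pair per depth bin across profiles."""
    columns = list(zip(*profiles))  # regroup the values by depth bin
    return [(mean(col), pstdev(col)) for col in columns]

spread = depth_binned_spread(profiles)
for depth, (avg, sd) in zip(depth_bins_m, spread):
    print(f"{depth:4d} m  mean {avg:5.2f}  std {sd:4.2f}")
```

In this toy data the spread is largest near the surface and smallest at 200 meters, which is one simple way a "stability versus depth" statement could be phrased.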


### The utility of *coincidence*


*Coincidence* here refers to data structure that is present in multiple sensor streams. 


The water column is well understood to be stratified. The upper layer is
the *mixed layer*, below that is a transitional layer, and below this is the upper 
part of the *pelagic layer* which constitutes the bulk of the water column. In this 
work we can begin with a null hypothesis that the sensor data will show these 
three strata with repeatable consistency. 


Shortly we commence looking at depth charts that show the epipelagic depth on the vertical axis
and one or more sensor parameters on the horizontal. Such a chart is informally referred to as 
a profile. Two things are readily apparent in profiles: first, they tend to have a consistent
shape; second, they occasionally exhibit anomalous structure. An anomaly in a single sensor
type, say temperature, is interesting in its own right. However an anomaly that is present and
persistent across multiple sensor types (say temperature, salinity and Chlorophyll-A fluorescence)
indicates ocean structure on some physical and temporal scale. 
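
As a rough sketch of what *coincidence* could mean in code: flag the depth bins where a new profile departs from a set of baseline profiles, one sensor at a time, and then keep only the bins flagged in every sensor. The function names, the two-sigma threshold and all of the numbers are assumptions made for this example.

```python
from statistics import mean, pstdev

def anomaly_mask(profile, baseline_profiles, n_sigma=2.0):
    """Flag depth bins where `profile` departs from the baseline mean
    by more than n_sigma baseline standard deviations."""
    mask = []
    for i, value in enumerate(profile):
        history = [p[i] for p in baseline_profiles]
        mu, sd = mean(history), pstdev(history)
        mask.append(sd > 0 and abs(value - mu) > n_sigma * sd)
    return mask

def coincident(masks):
    """Keep only the depth bins that are flagged in every sensor."""
    return [all(flags) for flags in zip(*masks)]

# Invented baselines and new profiles at three depth bins
baseline_temp = [[12.0, 10.0, 8.0], [12.2, 10.2, 8.1], [11.8, 9.8, 7.9]]
baseline_salt = [[32.0, 33.0, 33.8], [32.1, 33.1, 33.9], [31.9, 32.9, 33.7]]
new_temp = [12.6, 10.1, 9.0]   # departs at the first and third bins
new_salt = [32.05, 33.0, 33.3]  # departs at the third bin only

temp_mask = anomaly_mask(new_temp, baseline_temp)
salt_mask = anomaly_mask(new_salt, baseline_salt)
print(coincident([temp_mask, salt_mask]))  # [False, False, True]
```

Only the third depth bin survives: the temperature anomaly in the first bin has no counterpart in salinity, so it is not treated as coincident structure.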

## Agenda



- Getting our feet wet
    - Ocean Science (this chapter): Establish a hierarchy of research questions and terminology
    - Data: Structure, necessity of profile metadata, sensors-to-measurements
    - Epipelargosy: A sense of the structure of the epipelagic water column
    - Anomaly and Coincidence: 
    - Annotation:
- Other observation systems
    - ARGO:
    - GLODAP:
    - MODIS:
    - ROMS:
- Bio-optics
    - Spectrophotometer
    - PAR and spectral irradiance
- Digging in to the stability question
    - Temperature:
- Appendices: Technical background
    - shallow profiler technical
    - documentation
    - issues


Additional themes to bring in

- Methods
- Development history (2017 to present)
- Workflows
- Reproducibility
- Troubleshooting
    


## Where?


The ocean is water and salt; as well as being a reservoir of carbon compounds that
are the basis for and byproducts of ocean life. Our objective is to explore observational
data from the ocean with the idea of developing insights into the physical, chemical
and biological processes. And *where* is this happening? Again our vertical *where?* 
is the upper 200 meters of the ocean, the epipelagic or surface waters illuminated by sunlight.
Our map-plane *where?* is three point-like locations off the coast of Oregon state.


- Oregon Offshore
- Oregon Slope Base
- Axial Base






## Science framework


This section continues in the theme of defining terms and concepts central to 
the question of epipelagic stability. 


### Ocean chemistry


Let's begin with a table of molecules.

    
| Mass (Daltons) | Substance
|---|---
|1|Hydrogen ion H<sup>+</sup>
|17|Hydroxide ion OH<sup>-</sup>
|44|carbon dioxide CO<sub>2</sub>
|62|carbonic acid H<sub>2</sub>CO<sub>3</sub>
|61|bicarbonate anion HCO<sub>3</sub><sup>-</sup>
|60|carbonate CO<sub>3</sub><sup>2-</sup>
|180|glucose C<sub>6</sub>H<sub>12</sub>O<sub>6</sub>
|894|chlorophyll C<sub>55</sub>H<sub>72</sub>MgN<sub>4</sub>O<sub>5</sub>
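
Masses like these can be recomputed from atomic weights. The sketch below does so using standard atomic weights rounded to three decimals; the dictionary layout and the function name `molar_mass` are choices made for this example.

```python
# Standard atomic weights, rounded to three decimals
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Mg": 24.305}

# Element counts per molecule or ion (the electron mass is ignored for ions)
FORMULAS = {
    "hydrogen ion": {"H": 1},
    "hydroxide ion": {"O": 1, "H": 1},
    "carbon dioxide": {"C": 1, "O": 2},
    "carbonic acid": {"H": 2, "C": 1, "O": 3},
    "bicarbonate": {"H": 1, "C": 1, "O": 3},
    "carbonate": {"C": 1, "O": 3},
    "glucose": {"C": 6, "H": 12, "O": 6},
    "chlorophyll": {"C": 55, "H": 72, "Mg": 1, "N": 4, "O": 5},
}

def molar_mass(formula):
    """Sum atomic weight times count over the elements in a formula."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())

masses = {name: molar_mass(formula) for name, formula in FORMULAS.items()}
for name, mass in masses.items():
    print(f"{name:15s} {mass:7.1f}")
```

Carbon dioxide comes out at about 44 Daltons, which is the value used further on when converting between mass of carbon and mass of CO2.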


### Ocean structure


- The ocean is 3700 meters in depth on average
- Coastal ocean waters over the continental shelves are about six times as productive as the deep ocean
- The photic zone is about 200 meters in depth, implying that over 90% of the ocean is perpetually in darkness
- Comment on heat capacity of seawater versus atmosphere
- Ocean water temperature (away from the poles) decreases with depth
    - Geothermal heat at the sea floor
- Ocean water salinity increases with depth
- Ocean water can hold more dissolved oxygen (DO) as temperature decreases
    - \[DO\] is affected by other factors such as biological respiration
- Carbon dioxide concentration is a complex topic
    - A more appropriate term to use is carbonate chemistry
- Productivity primarily refers to photosynthesis by phytoplankton
    - Photosynthesis is bounded on the low side by limited availability of nutrients and sunlight
    - Photosynthesis is bounded on the high side by corresponding saturation
- comment on what counts as nutrients
    - sub-comment on coastal productivity
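
The "over 90% in darkness" figure follows from the two depths given in the list. A quick check, under the simplifying assumption that the ocean is a uniform slab at its mean depth:

```python
mean_ocean_depth_m = 3700  # mean depth, from the list above
photic_depth_m = 200       # photic zone depth, from the list above

# Assumes a uniform slab ocean, so depth fraction equals volume fraction
photic_fraction = photic_depth_m / mean_ocean_depth_m
dark_fraction = 1 - photic_fraction

print(f"Sunlit fraction: {100 * photic_fraction:.1f}%")
print(f"Dark fraction:   {100 * dark_fraction:.1f}%")
```

Under that assumption roughly 95% of the ocean by volume never sees sunlight.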
    
    
Science questions given below are framed jointly with presumptive statements and some 
fragmentary responses. The resulting science framework, presented in a hierarchical 
format, is intended to establish project scope.


- Is shallow profiler data reliably interpretable?
    - Sensor by sensor: Can 30-day-span mean signals be used to flag anomalies?
    - Supposing yes: Characterize anomaly signals in three dimensions { sensor, depth, time }
    - Can the mixed layer depth be measured as a synthetic time series dataset?
- Productivity estimation in terms of available light
    - Can PAR, spectral irradiance and inferred mixed layer produce a productivity estimate?
    - What presumptions are well-founded for diel migration?
    - Can vertical current speed observations contribute?
    - Can Endurance Offshore sonar data give independent productivity estimates?
    - Does ROMS data include a model estimate of productivity?
    - Does ocean color (satellite) data provide an independent estimate of productivity?
        - Supposing yes during times of good visibility
            - Does eddy structure in time series give a testable signal?
            - In relation: Does north/east ADCP data give testable scale and structure?
            - What work exists for similar comparisons using ARGO BGC drifters?
- Satellite validation, extrapolation
    - Are satellite-derived SST, chlor-a, MSLA validated by in situ sensors? 
    - Supposing yes: Spatio-temporal structural spectra?
- Can the photic zone water column characterization be extended downward?
    - Deep profilers (sites with depth 500m, 2800m, 2400m)
    - ARGO
    - Gliders
    - Supposing yes: Is this relevant to understanding the carbon cycle? Other cycles?
- Are these methods able to identify terrigenous influences (e.g. the Columbia River)?
- Are these methods able to identify upwelling signals?
- Tie together the Endurance offshore (top of shelf) to Oregon Slope Base (bottom of shelf)
    - Are observations correlated? (Particularly 'beyond' seasonal in some sense)
- FDOM and particulate backscatter?
- Spectrophotometer?
- Hydrophone?


### Microbial ecology and global carbon


- DOM is dissolved organic matter
    - small organic molecules not functional within organisms
    - CDOM is an older term for color-DOM (has some spectral signature)
    - FDOM indicates fluorescent, hence measurable by fluorometry in some degree
- metabolites are products of metabolic processes
- energy consumption dependent on iron, nitrates, phosphorus; temperature mediation 
- Carbon pools measured in Gigatons (one billion x one thousand kilograms) 
    - or equivalently in Petagrams of Carbon PgC
    - Distinct from the mass of greenhouse gases: 44/12 times larger
        - CO2 has a molecular weight of 44 whereas Carbon usually has an atomic weight of 12 
    - Earth system science considers cycling of matter and energy
        - Exchange of carbon between reservoirs is expressed in terms of rates of transfer
            - for example PgC per year
    - Earth carbon pools include ocean, atmosphere, lithosphere, soil, peat, living creatures... 
    - Carbon transfer mechanisms include 
        - primary production
        - greenhouse gas (GHG) emission by humans 
        - carbon dioxide moving from the atmosphere into the ocean. 
            - GHG transfer to the atmosphere from the lithosphere is about 9 PgC / year
            - combining fuel burning with land-use changes such as slash-and-burn clearcutting
            - The ocean biological pump and solubility pump combine 
            - to move about 11 PgC into the ocean's interior per year 
                - ...a few pieces of a more complex picture.
        - Below I calculate the mass of dissolved organic matter in the ocean
            - The approximate value is given as 1,000 PgC
            - The calculation arrives at 645 PgC         
- Inorganic carbon: Simplest carbon compounds
    - the ocean-atmosphere interface facilitates dissolving of atmospheric carbon dioxide in the ocean
    - However carbon dioxide molecules dissolved in the ocean are subject to modification ('*carbonate chemistry*')
        - Atmospheric CO2 has a half-life of 60 years...
            - whereas dissolved CO2 in the ocean has a half-life of minutes
                - $CO_2$ carbon dioxide from the atmosphere, dissolved in the ocean transforms into
                - $H_2CO_3$ carbonic acid which dissociates into
                    - $HCO_3^-$ bicarbonate ions and
                    - $H^+$ hydrogen ions 
                        - which lower the pH of the ocean
                            - historically from 8.15 in 1950 to 8.05 in 2020
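
Two pieces of arithmetic in the list above are worth spelling out: the 44/12 factor between mass of carbon and mass of CO2, and what a pH change of 0.1 means for hydrogen ion concentration. This sketch uses the rounded masses and the pH values quoted above; the function names are invented for the example.

```python
CO2_MOLAR_MASS = 44.0   # rounded, as in the text
C_ATOMIC_MASS = 12.0    # rounded, as in the text

def pgc_to_gt_co2(pgc):
    """Convert a mass of carbon (PgC) to the equivalent mass of CO2 (Gt)."""
    return pgc * CO2_MOLAR_MASS / C_ATOMIC_MASS

def hydrogen_ion_ratio(ph_before, ph_after):
    """Ratio of [H+] after to [H+] before, using pH = -log10 [H+]."""
    return 10 ** (ph_before - ph_after)

emissions_gt_co2 = pgc_to_gt_co2(9)     # the 9 PgC per year quoted above
ratio = hydrogen_ion_ratio(8.15, 8.05)  # the pH values quoted above

print(f"9 PgC per year is about {emissions_gt_co2:.0f} Gt of CO2 per year")
print(f"[H+] increased by about {100 * (ratio - 1):.0f}%")
```

Because pH is a logarithmic scale, the seemingly small drop of 0.1 corresponds to roughly a 26% increase in hydrogen ion concentration.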

## Observation methods

- shallow profiler: run sensors through the water column with a sampling rate / agenda
    - curtain plotting
    - associated platform data, particularly current measurement
- gliders
- predictive models
- drifters
- satellite remote sensing
- synthetic datasets

## From Data to Insight


### Remove the following to a technical section (e.g. Data chapter)


- NetCDF is the primary data file format
    - Consists of a two-level hierarchy
        - Top level: Groups (may or may not be present)
        - Second level: Subdivided into Dimensions, Coordinates, Data Variables, Indices (?), and Metadata (?) 
- Python is the operative programming language
    - XArray is the Python library used to parse and manipulate NetCDF data
        - The central data structure in XArray is the DataArray
        - DataArrays are often bundled together to form Datasets
        - Both DataArrays and Datasets as objects include parsing and filtering methods
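
A minimal sketch of the DataArray and Dataset relationship follows, assuming `numpy` and `xarray` are installed. It builds a small Dataset in memory rather than reading a file; the variable names, units and values are invented and do not reflect the actual profiler files.

```python
import numpy as np
import xarray as xr

# Invented coordinates: three profiles in time, five depth bins
time = np.array(["2021-01-01T00", "2021-01-01T03", "2021-01-01T06"],
                dtype="datetime64[ns]")
depth = np.array([10.0, 50.0, 100.0, 150.0, 200.0])

# A DataArray: values plus named dimensions, coordinates and metadata
temperature = xr.DataArray(
    np.array([[12.1, 10.4, 8.9, 8.1, 7.6],
              [12.9, 10.6, 8.8, 8.1, 7.6],
              [11.5, 10.1, 9.0, 8.0, 7.5]]),
    dims=("time", "depth"),
    coords={"time": time, "depth": depth},
    name="temperature",
    attrs={"units": "degC"},
)
salinity = xr.DataArray(
    np.array([[32.0, 32.8, 33.5, 33.8, 33.9],
              [32.1, 32.9, 33.5, 33.8, 33.9],
              [31.9, 32.7, 33.6, 33.8, 33.9]]),
    dims=("time", "depth"),
    coords={"time": time, "depth": depth},
    name="salinity",
    attrs={"units": "psu"},
)

# A Dataset bundles DataArrays that share coordinates
ds = xr.Dataset({"temperature": temperature, "salinity": salinity})

# Filtering by coordinate label: keep depths from 0 to 100 meters inclusive
upper = ds.sel(depth=slice(0, 100))

# Reducing along a named dimension: the time-mean temperature profile
mean_profile = ds["temperature"].mean(dim="time")
print(upper)
print(mean_profile.values)
```

For data on disk the starting point would instead be `xr.open_dataset` applied to a NetCDF file, after which the same selection and reduction methods apply.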

```{figure} ../img/ABCOST_signals_vs_depth_and_time.png
---
height: 500px
name: directive-fig
---
Salinity, Temperature, Dissolved Oxygen and Bio-optical signals with depth
```

## Metabolic energy for an apex predator


```{figure} ../img/Sphyrna_mokarran.png
---
height: 600px
name: directive-fig
---
Apex predator: Great hammerhead shark (Sphyrna mokarran)
```
    
Sunlight is not a direct energy source for 
[Sphyrna mokarran](https://en.wikipedia.org/wiki/Great_hammerhead)
(nor for Homo sapiens). How many stages of predation are below the hammerhead shark apex? 

    
- Hammerhead shark
    - Bluespotted stingray ([Neotrygon kuhlii](https://en.wikipedia.org/wiki/Kuhl%27s_maskray))
        - Butterfly chiton ([Cryptoconchus porosus](https://en.wikipedia.org/wiki/Cryptoconchus_porosus))
            - Benthic (shallow sea floor) diatoms
                - which convert sunlight to chemical energy by photosynthesis


Photosynthesis happens in organelles using a pigment called chlorophyll, producing carbohydrates that store 
energy. The molecular basis of this process is carbon dioxide and other carbonate molecules plus water. 
Molecular oxygen is a by-product of the process.


Carbonate molecules dissolved in ocean water are
considered inorganic and are not usable as an energy supply. Carbohydrate molecules are built from
these carbonate molecules and they *are* usable as an energy supply (by both producers like the
diatom and by consumers like the Hammerhead.) The conversion from inorganic to organic
molecules via sunlight is the key energy transformation at the base of the food web. Carbon is
ubiquitous in the ocean; but it is always undergoing change in molecular form from lower to
higher stored energy and back again.

## Introducing Python code 

```python
# how much dissolved organic carbon in the ocean? Supposed to be about 1000 GT; and 38,000 GT inorganic
# 100 - 500 umoles carbon per kg seawater near the surface
# This is reduced by a factor of 5 - 10 in below-surface waters
surface_carbon_conc = 280e-6
depth_attenuation = (1./7.)
C_gm_per_mole = 12

radius_earth_meters = 6378000
pi_approx = 3.141592654
ocean_percent = 71
ocean_mean_depth_meters = 3700
kg_per_m3 = 1000

ocean_mass_kg = 4. * pi_approx * radius_earth_meters**2 * (ocean_percent / 100.) * ocean_mean_depth_meters * kg_per_m3

GTons_per_gm = 1e-15
carbon_gm_per_kg = surface_carbon_conc * depth_attenuation * C_gm_per_mole

gm_C_in_ocean = ocean_mass_kg * carbon_gm_per_kg

PgC = GTons_per_gm * gm_C_in_ocean

print("Mass of earth's oceans:", '{:0.2e}'.format(ocean_mass_kg), 'kg')
print("Carbon grams per kilogram seawater:", round(carbon_gm_per_kg, 6))
print("Grams of dissolved organic carbon in the ocean:", '{:0.2e}'.format(gm_C_in_ocean))
print("Dissolved organic carbon mass, earth's oceans, Gtons or Petagrams:", round(PgC, 1), 'PgC')
print("  (Gigatons and Petagrams are equivalent: 1 gram being 1 millionth of a metric ton")
print("   and Giga is 1 millionth of Peta.)")
```

```
Mass of earth's oceans: 1.34e+21 kg
Carbon grams per kilogram seawater: 0.00048
Grams of dissolved organic carbon in the ocean: 6.45e+17
Dissolved organic carbon mass, earth's oceans, Gtons or Petagrams: 644.6 PgC
  (Gigatons and Petagrams are equivalent: 1 gram being 1 millionth of a metric ton
   and Giga is 1 millionth of Peta.)
```


### Carbon pools

* Ocean 38,000 PgC
  * Dissolved organic carbon (size 0.22 to 0.70 microns): 1000 PgC
  * Inorganic carbon (dissolved CO2 and related carbonates): 37,000 PgC
* Earth biomass: 600 PgC
* Atmosphere: 800 PgC
* Soil + peat: 1500 PgC (1000 PgC organic)

### Carbon transport

* Marine autotrophs: 50 PgC/a
* Terrestrial primary production 50 PgC/a
* Lithosphere to atmosphere (human activity) 10 PgC/a
* Atmosphere to ocean interior (Biological and Solubility Pumps): 11 PgC/a 


Note that the biological pump operates at about the same scale as the marine carbon pump, 
and that these numbers are about one fifth of marine primary production. On this basis we can 
make the case that biological activity is an important component of the global carbon cycle. 

* Carbon is 1, 1, 4, 50 respectively life, atmosphere, soil, ocean. 1 = 600 Gton.
* Where the edge is
    * System models are vague. For example what drives coastal productivity?
    * How is decreasing ocean pH impacting ecologies?
    * What is the data trying to tell us (deluge problem)
* What you bring: Imagination, enthusiasm, perseverance
    * Even as an aware person you can advocate for science education
* What you can develop: Math, computing skills (domain context of course!)

Other programs

* ARGO
* Estuary modeling
* Currents and ecosystems
* Metagenomics

# Ocean Science Data Interpretation


This book concerns *ocean science* via *data interpretation*. 
We begin by building a model of how the ocean works, eventually
turning to data to refine that model. 


- The ocean is illuminated from above by sunlight that penetrates
to a depth of about 200 meters
    - Beneath this upper layer the ocean is always dark, regardless of time of day
- The upper 200 meters of the ocean is the *photic zone*
    - Sunlight is available as an energy source
    - Phytoplankton ('plant-like plankton') make use of this energy
        - They are the base level of the food web
    - Phytoplankton use available chemicals and sunlight to store energy
        - The primary storage medium is carbohydrates


Now let's consider sources of ocean data


- In the US: The National Science Foundation supports an ocean observatory
    - OOI = the ***Ocean Observatories Initiative*** 
    - OOI is subdivided into components called **arrays**
        - Arrays are operated cooperatively but somewhat independently
        - Arrays have, in turn, various collections of sensors and support hardware 
        - A single array can span an area of several thousand square kilometers
        - There are seven arrays in the OOI program
        - Two arrays were built in the southern hemisphere
            - The southern hemisphere arrays began operation in 2014
                - They have since been decommissioned
        - Five more arrays were built in the northern hemisphere
            - The Regional Cabled Array (Pacific ocean)
            - The Global Station Papa Array (Pacific ocean)
            - The Coastal Endurance Array (Pacific ocean)
            - The Coastal Pioneer Array (Atlantic ocean)
            - The Global Irminger Sea Array (Atlantic ocean)


The Regional Cabled Array (RCA) is our focus here


- The RCA is located off the Oregon coast 
- It extends several hundred kilometers off-shore
- The RCA extends beyond the continental shelf out into the deep ocean


The RCA features three shallow profilers designed to observe the photic zone 
in great detail.
These shallow profilers are **platforms** tethered to the sea floor by means of two 
long cables. Each platform is positively buoyant, positioned at a depth of 200 meters 
below the ocean surface. The platform has both power and a data connection to shore. 


The three shallow profiler sites, with their sea floor depths, are:

- Oregon Offshore, 500m depth: Outer edge of the continental shelf off of central Oregon
- Oregon Slope Base, 3100m depth: Further out at the base of the continental shelf off central Oregon
- Axial Base, 2100m depth: At the Juan de Fuca plate boundary, at the base of Axial Seamount




The **profiler** rests in a cradle on the **platform**. 
It is also positively buoyant. Platform and Profiler together look like this: 



```{figure} ../img/shallowprofilerinsitu.png
---
height: 500px
name: directive-fig
---
Shallow profiler deployed at 200 meters depth, eastern Pacific ocean
```


Under normal circumstances
this profiler is allowed to rise to near the surface (depth of approximately 10 meters) 
nine times each day. 
This is accomplished by means of a single cable on a winch. As the profiler 
ascends its "upward facing"
sensors acquire data. Once the profiler reaches the top of the profile 
it is winched back down again. 


Mean time in minutes for...

    
```
Ascent:    67
Descent:   45      (exception: local noon and midnight descents are about an hour longer)
Rest:      45
```


Ascent data are
considered more pristine; 
although pH and pCO2 are unique in that they are recorded on *descent*.


The table below shows data available from the profiler and its 200m-depth retaining platform.
Pressure, density, salinity, temperature and depth are interrelated. In particular, pressure 
in decibars and depth in meters are very nearly the same. Charts of sensor value against depth 
effectively treat profiles as "instantaneous" snapshots of the upper water column. 
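
The near-equality of decibars and meters can be checked with a simple hydrostatic estimate. This is a sketch under stated assumptions: constant seawater density, constant gravity, and pressure measured relative to the sea surface. It is not the formula used in the profiler data processing, which is not described here.

```python
G = 9.81                # gravitational acceleration in m/s^2, assumed constant
RHO_SEAWATER = 1025.0   # seawater density in kg/m^3, assumed constant
PA_PER_DBAR = 1.0e4     # one decibar is 10,000 pascals

def depth_from_pressure(pressure_dbar):
    """Approximate depth in meters from sea pressure in decibars,
    using the hydrostatic relation p = rho * g * z."""
    return pressure_dbar * PA_PER_DBAR / (RHO_SEAWATER * G)

for p in (10, 100, 200):
    print(f"{p:4d} dbar is about {depth_from_pressure(p):6.1f} m")
```

With these constants one decibar corresponds to about 0.99 meters, so across the upper 200 meters the two scales differ by roughly a meter.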