Code covering various topics in oceanography
Historical estimates of codfish landings from Alexander 2009, making use of modern data and values of MSY/SSB from NOAA/NMSF by Mayo 2008. Their data suggest that there truly were significantly more fish in the sea during the 19th century, rapidly decreasing during the industrial age.
Data from Russell (1928) The Vertical Distribution of Marine Macroplankton. VI. Further Observations on Diurnal Changes, presumably in the English channel not far from Plymouth. Original figures had violin plots, but the X-axis was not uniform. Some errors appear to be in the original data from miscounting for the totals in the table. Six examples are shown, but all are plotted in the extended pdf version.
Data for OSD (Ocean Station Data) were downloaded from NOAA WOD, as of March 2021, containing data for 3069708 casts/stations for 28987253 total stops.
This is in a completely useless .csv format that is NOT a table, broken down into 13 files (ocldb1616358314.25649.OSD.csv.gz
up to ocldb1616358314.25649.OSD13.csv.gz
. Here, a python parser converts it into a giant table. The process took 16 minutes on my laptop and requires 22Gb RAM.
compile_wod_csv_to_real_table.py -c *.csv.gz > ocldb1616358314.25649.OSD_all.all_vars.tab
As of 2022, the dataset is downloadable here.
The table headers are listed below, and most are self explanatory. stop
refers to the bottle order in a single cast, with 0 being the first. Note that the variable names mostly keep format of the OSD data, including the 10 character limit (e.g. Temperatur
).
cast_id cruise_id orig_station_id orig_cruise_id latitude longitude year month day country country_acc_number stop depth CFC11 DeltaC13 DeltaC14 Nitrate pH CFC12 Chlorophyl Alkalinity Pressure Argon Temperatur CFC113 tCO2 Silicate Oxygen Salinity Oxy18 Tritium Neon DeltaHe3 Phosphate pCO2 Helium Ammonia
Units should be:
Depth m
Pressure dbar
Temperatur degrees C
Salinity PSS
Oxygen umol/kg
Phosphate umol/kg
Nitrate umol/kg
Silicate umol/kg
Ammonia umol/l
Chlorophyl ug/l
tCO2 mM
DeltaC14 per mille
DeltaC13 per mille
Oxy18 per mille
Alkalinity meq/l
CFC11 pmol/kg
CFC12 pmol/kg
CFC113 pmol/kg
Helium nmol/kg
DeltaHe3 percent
Tritium TU
Neon nmol/kg
Argon nmol/kg
Loading the entire table into R then requires 14Gb RAM.
wod_data_file = "~/project/WOD_select/ocldb1616358314.25649.OSD_all.all_vars.tab"
wod_data = read.table(wod_data_file, header=TRUE, sep="\t")
wod_summary = summary(wod_data)
wod_summary
Some basic filtering can be applied to simplify the dataset. To take only the first, or shallowest bottle, set stop==0
. This would be looking at surface values of nearly all measurements, of a total of 3069708 observations.
first_stop_only = filter(wod_data, stop == 0)
Taking all shallow water measurements is a larger set, since there are many 10m or 20m samples, this leaves 12134025 samples.
surface_data_only = filter(wod_data, abs(depth) < 50)
In general, the data are messy and need substantial post-processing. For example, most casts have temperature. However, this column contains a few negative values (below -2, which would be the temperature of brine-excluded polar water), and values between 50 and 100, which are likely Fahrenheit, instead of hydrothermal vents.
> table( round(wod_data[["Temperatur"]]) )
-100 -60 -48 -44 -39 -34 -33 -22 -16 -15 -12 -11 -10 -8
7 1 1 1 2 1 1 1 1 1 4 2 1 1
-7 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7
1 2 4 23 258491 718538 1000463 1075674 1522455 1735250 1978899 1625205 1600309 1521408
8 9 10 11 12 13 14 15 16 17 18 19 20 21
1386334 1192817 1029452 914095 899285 972681 996685 848622 844351 759436 741297 599957 539063 455230
22 23 24 25 26 27 28 29 30 31 32 33 34 35
443248 374590 369995 329949 332909 319404 300566 214482 64114 5055 948 337 149 86
36 37 38 39 40 42 43 45 47 48 49 50 51 52
19 21 90 8 5 4 1 1 3 2 4 5 7 2
53 55 56 57 58 59 60 61 62 63 64 65 66 67
2 2 14 9 8 1 2 1 9 1 7 1 8 2
68 70 72 74 76 77 78 79 80 81 82 83 87 88
5 6 4 3 9 1 9 1 7 2 11 2 2 2
90 93 94 96 98 99 100 105 107 110 116 117 118 127
1 1 1 1 1 1 12 2 1 2 1 2 1 1
131 138 266 270 311 1000
1 1 1 1 1 1
This code makes a map of global surface nitrate. The highest values appear to be due to river inputs. The southern ocean is also noticeably darker than much of the rest of the world, as a well known HNCL region.
library(ggplot2)
library(dplyr)
worldpolygons = map_data("world")
first_stop_w_nitrate = filter(wod_data, stop==0, !is.na(Nitrate))
wnit_gg = ggplot(worldpolygons) +
coord_cartesian(expand = c(0,0)) +
labs(x=NULL, y=NULL) +
theme(axis.text = element_blank(),
axis.ticks = element_blank(),
legend.position=c(0.75,0.75) ) +
geom_polygon( aes(x=long, y = lat, group = group), fill="#aaaaaa", colour="#ffffff") +
scale_colour_gradient(low = "#e7e1ef", high = "#8e1236", trans="log10", na.value="#f7f4f9" ) +
geom_point(data=first_stop_w_nitrate, aes( x=longitude, y=latitude, colour=Nitrate), size=0.5 )
ggsave(file="~/git/oceanography_scripts/images/WOD_OSD_surface_nitrate.png", wnit_gg, device="png", width=12, height=6, dpi=90)
Using the matlab datasets from Tang 2019, on distribution of diazotrophs in the oceans.
Replots of some of the data from Wang 2019. I do not remember the data being available online with the paper, but were sent from the lead author. They can be downloaded here.
Version 1.0 of an interactive ShinyApp map using leaflet showing sea level rise around the world.
Plot of Secchi disk data from NOAA National Centers for Environmental Information. This was in the format of a .csv file containing 463875 casts, and required little post-processing.
Plot of the CTD from MBARI ROVs
Rscript ../mbari_ctd_plotter.R mbari_dive_d420_ctd.txt
Plot of Phanerozoic oxygen level, based on various models ( Bergman 2004 COPSE and Berner 2006 GEOCARBSULF ). These data were hacked out of the paper, though an updated version of the model code is here by Lenton 2018.
Plot of Temperature-Salinity diagram from WOCE Station P17N in the North Pacific on 1-June-1993
Rscript ts_diagram_example.R