# Mauna Loa Weekly Atmospheric CO<sub>2</sub> Data

For some exercises we use measurements of the atmospheric CO<sub>2</sub> concentration from Mauna Loa, Hawaii (Keeling & Whorf [2004](https://cdiac.ess-dive.lbl.gov/trends/co2/sio-keel-flask/sio-keel-flaskmlo_c.html)).

The data ships with the [statsmodels](http://www.statsmodels.org/stable/index.html) package as [weekly data](http://www.statsmodels.org/devel/datasets/generated/co2.html). To make it directly available, this notebook exports it to a CSV file and a NetCDF file.


In [None]:
import pandas as pd
from statsmodels.datasets import co2

In [None]:
# load the data as a pandas DataFrame (weekly datetime index, one "co2" column)
data = co2.load_pandas().data

In [None]:
with open("co2.csv", "w") as fid:

    # write header
    fid.write("Mauna Loa Weekly Atmospheric CO2 Data\n")
    fid.write(" - units: ppm\n")
    fid.write(" - Keeling & Whorf 2004\n")
    fid.write(" - Obtained from statsmodels\n")
    fid.write(" - http://www.statsmodels.org/devel/datasets/generated/co2.html\n")

    # write data
    data.to_csv(fid, sep=",")

In [None]:
# load the data again to check that all went well;
# header=5 skips the five free-text lines, so row 5 (zero-based) supplies the column names
d = pd.read_csv("co2.csv", index_col=0, parse_dates=True, header=5)

print(d.head(n=7))

In [None]:
! head co2.csv
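
The round trip above depends on `header=5` matching the number of free-text lines written before the table. The following is a minimal in-memory sketch of that mechanism, not a replacement for the cells above: the three-row frame `small`, its values, and the placeholder header lines are all hypothetical and only stand in for the real file.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for the weekly series (illustrative values only).
small = pd.DataFrame(
    {"co2": [316.0, 317.5, 318.0]},
    index=pd.to_datetime(["1958-03-29", "1958-04-05", "1958-04-12"]),
)

buffer = io.StringIO()
# Five free-text lines, mirroring the number of header lines in co2.csv.
for line in ["title", " - units", " - reference", " - source", " - url"]:
    buffer.write(line + "\n")
# The CSV table (column names plus data rows) follows the free text.
small.to_csv(buffer, sep=",")

buffer.seek(0)
# header=5: the column names sit on zero-based row 5; earlier rows are ignored.
restored = pd.read_csv(buffer, index_col=0, parse_dates=True, header=5)

print(restored)
```

If a line is added to or removed from the free-text header, the `header` argument has to change with it; otherwise a comment line or a data row ends up being read as the column names.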

## Also save as NetCDF

In [None]:
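# convert to an xarray Dataset and rename the default "index" dimension to "time"
ds = data.to_xarray()
ds = ds.rename(dict(index="time"))

# record the units on the variable
ds.co2.attrs = dict(units="ppm")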

In [None]:
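# annual means of the weekly values, indexed by a new "year" dimension
co2_annual = ds.co2.groupby("time.year").mean("time")
co2_annual.attrs = dict(units="ppm")

# store the annual series next to the weekly one
ds = ds.assign(co2_annual=co2_annual)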
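
To see what the `groupby("time.year")` step produces, here is a small sketch on synthetic data. The two-year series `weekly` and its constant values are assumptions made up for illustration; only the grouping call mirrors the cell above. It also shows that, for floating-point data, xarray's `mean` skips missing values by default.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic weekly (Saturday) time axis covering 2000 and 2001.
time = pd.date_range("2000-01-01", periods=104, freq="W-SAT")

# Constant value per year so the expected annual means are obvious.
values = np.where(time.year == 2000, 370.0, 372.0)
values[3] = np.nan  # one missing week; skipped by the mean

weekly = xr.DataArray(values, coords={"time": time}, dims="time", name="co2")

# Same grouping as above: one mean per calendar year.
annual = weekly.groupby("time.year").mean("time")

print(annual)
```

The result no longer has a `time` dimension; it is indexed by `year`. Because missing weeks are skipped silently, a year with only partial coverage still yields a mean, which may not be representative of the full year.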

In [None]:
ds.attrs = dict(
    data="Mauna Loa Weekly Atmospheric CO2 Data",
    source="statsmodels (http://www.statsmodels.org/devel/datasets/generated/co2.html)",
    reference="Keeling & Whorf 2004",
)

In [None]:
ds.to_netcdf("co2.nc", format="NETCDF4_CLASSIC")

In [None]:
! ncdump -h co2.nc