# Station metadata

In [None]:
import pandas as pd

import uscrn

## Load

We use {func}`uscrn.load_meta` to load the station metadata from NCEI.

In [None]:
%%time

meta = uscrn.load_meta()

In [None]:
meta

In [None]:
meta.info()

In [None]:
list(meta.attrs)

## Examine

(A bit.)

First, we look at the status breakdown.

In [None]:
meta.status.value_counts()

In [None]:
meta.operation.value_counts()

Most, but not all, sites are in the US.

In [None]:
meta.country.value_counts()

In [None]:
meta.query("country != 'US'")

Technically, there are a few different networks.

In [None]:
meta.network.value_counts()

In [None]:
meta.query("operation == 'Operational'").network.value_counts()

In [None]:
meta.query("operation == 'Operational' and not wban.isnull()").network.value_counts()

👆 This should be the number of sites we get when we use {func}`uscrn.get_data` (for a time period in which the currently reported operational status is accurate).
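
One way to check this expectation is to compare the WBANs in the filtered metadata with the WBANs that actually appear in a loaded dataset. The sketch below uses small made-up frames in place of the real ones. The helper name `compare_sites` is hypothetical, and it assumes that the frame returned by the data loader identifies sites with a `wban` column, which should be verified against the real output.

```python
import pandas as pd


def compare_sites(meta: pd.DataFrame, data: pd.DataFrame) -> dict:
    """Compare expected sites (from metadata) with sites present in the data.

    Hypothetical helper; assumes both frames have a `wban` column.
    """
    # Expected: operational sites that have a WBAN
    is_expected = (meta["operation"] == "Operational") & meta["wban"].notnull()
    expected = set(meta.loc[is_expected, "wban"])
    # Found: unique WBANs in the loaded data
    found = set(data["wban"].dropna())
    return {
        "n_expected": len(expected),
        "n_found": len(found),
        "missing": sorted(expected - found),
        "unexpected": sorted(found - expected),
    }


# Made-up stand-ins for `meta` and for the result of a data load
toy_meta = pd.DataFrame(
    {
        "wban": ["00001", "00002", None, "00004"],
        "operation": ["Operational", "Operational", "Operational", "Closed"],
    }
)
toy_data = pd.DataFrame({"wban": ["00001", "00001", "00002"]})

result = compare_sites(toy_meta, toy_data)
print(result)
```

With the real frames, a non-empty `missing` or `unexpected` list would suggest that the reported operational status does not match the requested time period.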

Sites with a `closing` date in the past are not marked as operational (good).

In [None]:
meta[meta.closing < pd.Timestamp.now()].operation.value_counts()

Alaska has the most sites.

In [None]:
meta.state.value_counts().head(10)

There is a range of elevations.

In [None]:
meta.elevation.plot.hist()

Most of the sites that are high above sea level are in NM/CO/UT/AZ. CO/WY/UT/NM/NV are the five highest states overall, in that order.

In [None]:
meta.query("elevation > 4000").state.value_counts().head(5)
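
The "highest states overall" ranking can be reproduced with a group-by. The sketch below reads "overall" as mean site elevation per state, which is an assumption, and uses a made-up frame with illustrative elevations rather than the real `meta`.

```python
import pandas as pd

# Made-up stand-in for `meta` (elevations are illustrative, not real)
toy_meta = pd.DataFrame(
    {
        "state": ["CO", "CO", "WY", "AK", "AK", "FL"],
        "elevation": [9000.0, 7000.0, 7500.0, 100.0, 500.0, None],
    }
)

# Mean site elevation per state, highest first.
# Missing elevations are skipped by `mean`, and states with no
# recorded elevation sort to the end.
mean_elev = (
    toy_meta.groupby("state")["elevation"]
    .mean()
    .sort_values(ascending=False)
)
top = mean_elev.head(3)
print(top)
```

Applied to the real `meta`, `.head(5)` would give the top five states under this definition.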

Most sites have elevation recorded.

In [None]:
meta[meta.elevation.isnull()]
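
To put a number on "most", the missing values can be counted directly. This is a minimal sketch on a made-up frame; the same two expressions should work on `meta.elevation`.

```python
import pandas as pd

# Made-up stand-in for `meta` with one missing elevation
toy_meta = pd.DataFrame({"elevation": [1200.0, None, 350.0, 4800.0]})

# Number of sites without an elevation, and the fraction that have one
n_missing = int(toy_meta["elevation"].isnull().sum())
frac_recorded = toy_meta["elevation"].notnull().mean()
print(f"{n_missing} missing; {frac_recorded:.0%} recorded")
```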

Some sites don't have a WBAN (but do have a station ID). Presumably these sites aren't included in the data archives.

In [None]:
# Every site should have a station ID
assert meta.station_id.notnull().all()
meta[meta.wban.isnull()]