[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightbandtech/nnja-ai/blob/main/example_notebooks/adpsfc_example.ipynb)

In [1]:
# Uncomment the following line to install the package
#!pip install git+https://github.com/brightbandtech/nnja-ai.git

## Navigating ADPSFC data
The ADPSFC datasets, which hold surface station observations, are rich in data and, as a result, a bit difficult to dig through. Here we'll show a few ways to explore the dataset and find the most useful variables.

In [None]:
import pandas as pd
from nnja import DataCatalog

catalog = DataCatalog(mirror="gcp_brightband")
date = pd.to_datetime("2021-01-01").tz_localize("UTC")
metar_dataset = catalog["conv-adpsfc-NC000001"].sel(time=date)
ds = metar_dataset.load_dataset(backend="pandas")

Loading manifest for dataset 'conv-adpsfc-NC000001'...


In [None]:
def get_mnenomic_from_variable_name(variable_name):
    # A variable name can be e.g. PRSSQ1.CHPT, whose mnemonic is CHPT (the last bit after the dot).
    # It can also be just CHPT, which is already the mnemonic.
    # It can also be something like PRSSQ1.CHPT01, whose mnemonic is CHPT (dropping the 01),
    # or CHPT01, whose mnemonic is CHPT too.
    # Tricky case: if ".." is in the variable name, the mnemonic may include the second dot,
    # so e.g. SDFFH..REHOVI has mnemonic .REHOVI.
    # If the double dot is elsewhere, we don't care.
    mnemonic = variable_name  # default: a name without a dot is already the mnemonic
    if "." in variable_name:
        parts = variable_name.split(".")
        mnemonic = parts[-1]
        if parts[-2] == "":
            mnemonic = "." + mnemonic
    if mnemonic[-2:].isdigit():
        mnemonic = mnemonic[:-2]
    return mnemonic
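
The naming rules listed in the comments are easier to check against concrete inputs. The sketch below restates them in a small stand-alone helper, `mnemonic_of`, which is a hypothetical name used only for this illustration and is not part of `nnja`. It assumes the replication suffix is always exactly two digits, as in the examples from the comments.

```python
import re


def mnemonic_of(variable_name: str) -> str:
    """Hypothetical restatement of the comment rules, for illustration only."""
    # Keep the text after the last dot; a name with no dot is returned whole.
    head, sep, tail = variable_name.rpartition(".")
    # If the last dot is the second half of a "..", the mnemonic keeps that dot.
    mnemonic = "." + tail if sep and head.endswith(".") else tail
    # Drop a trailing two-digit suffix such as "01".
    return re.sub(r"\d{2}$", "", mnemonic)


for name in ["PRSSQ1.CHPT", "CHPT", "PRSSQ1.CHPT01", "CHPT01", "SDFFH..REHOVI"]:
    print(f"{name} -> {mnemonic_of(name)}")
```

Each of the first four names maps to `CHPT`, and the double-dot case maps to `.REHOVI`.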

In [None]:
code_and_flag_tables = {}
for varname, var in metar_dataset.variables.items():
    if var.is_code_or_flag_table:
        c_or_f = "code" if "code table" in var.extra_metadata else "flag"
        code_and_flag_tables[varname] = {
            "table": var.extra_metadata[f"{c_or_f} table"],
            "link": var.extra_metadata[f"{c_or_f} table link"],
        }
code_and_flag_tables

{'CORN': {'table': '033215',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#033215'},
 'RCPTIM.RCTS': {'table': '008202',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#008202'},
 'RPSEC1.ITSO': {'table': '002193',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#002193'},
 'RPSEC1.TOST': {'table': '002001',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#002001'},
 'RPSEC1.INPC': {'table': '013194',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#013194'},
 'WNDSQ1.TIWM': {'table': '002002',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#002002'},
 'WNDSQ1.QMWN': {'table': '033195',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#033195'},
 'TMPSQ1.QMAT': {'table': '033193',
  'link': 'https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#033193'},
 'TMPSQ1.QMDD':
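
The nested dictionary is hard to scan once it grows. As a minimal sketch, the entries can be printed as aligned rows sorted by table number. The `sample_tables` dict below is a stand-in holding three entries copied from the output above; in the notebook you would presumably iterate over the full `code_and_flag_tables` instead.

```python
# Three entries copied from the output above (stand-in for code_and_flag_tables).
sample_tables = {
    "CORN": {
        "table": "033215",
        "link": "https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#033215",
    },
    "RPSEC1.TOST": {
        "table": "002001",
        "link": "https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#002001",
    },
    "WNDSQ1.QMWN": {
        "table": "033195",
        "link": "https://www.nco.ncep.noaa.gov/sib/jeff/CodeFlag_0_STDv31_LOC7.html#033195",
    },
}

# Pad variable names to a common width so the table numbers line up.
width = max(len(name) for name in sample_tables)
rows = [
    f"{name:<{width}}  {entry['table']}"
    for name, entry in sorted(sample_tables.items(), key=lambda item: item[1]["table"])
]
print("\n".join(rows))
```

Sorting by table number keeps related descriptors next to each other, which can make it quicker to find the right anchor in the linked code and flag table page.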

## Subsetting to primary variables
A dataset like ADPSFC mixes useful variables (such as measured air temperature, "TMPSQ1.TMDB") with less useful ones (such as the usually unreported estimated rate of ice accretion, "ICESQ1.ROIA"). Similarly, there are key descriptor fields (e.g. LAT and LON) and less valuable ones (e.g. "TMPSQ1.MSST", the method of water temperature and/or salinity measurement). All of these fields are included in the NNJA-AI dataset. A quick way to narrow down the variables to explore is the 'category' field, which we have added to help filter out the less salient fields.

In [None]:
print(len(metar_dataset.variables))
key_vars = [
    varname
    for varname, var in metar_dataset.variables.items()
    if var.category in ["primary descriptors", "primary data"]
]
print(len(key_vars))

# Load only the key variables selected above
ds = metar_dataset.sel(variables=key_vars).load_dataset(backend="pandas")
ds.head()

87
31


| | OBS_TIMESTAMP | LAT | LON | SELV | RPID | RPSEC1.HOVI | WNDSQ1.WDIR | WNDSQ1.WSPD | WNDSQ1.WNDSQ2.MXGS | TMPSQ1.TMDB | ... | PCPSQ1.TP06 | PCPSQ1.PCPSQ2.TP01 | PCPSQ1.PCPSQ2.TP03 | PCPSQ1.PCPSQ2.TP12 | PCPSQ1.PCPSQ2.TP24 | TOCC | PPWSQ1.PRWE | PPWSQ1.PSW1 | PPWSQ1.PSW2 | OBS_DATE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-01-01 00:00:00+00:00 | 45.42 | 12.38 | 6.0 | 16101 | | 350.0 | 1.0 | | 278.25 | ... | 0.0 | | | | | | | | | 2021-01-01 |
| 1 | 2021-01-01 00:00:00+00:00 | 8.08 | -80.95 | 88.0 | 78795 | 8000.0 | 0.0 | 0.0 | | 301.15 | ... | 0.0 | | | | | 100.0 | 15.0 | 2.0 | 2.0 | 2021-01-01 |
| 2 | 2021-01-01 00:00:00+00:00 | 47.17 | 20.23 | 85.0 | 12860 | 6000.0 | 260.0 | 3.0 | | 276.75 | ... | 6.0 | | | | | 100.0 | 60.0 | 6.0 | 6.0 | 2021-01-01 |
| 3 | 2021-01-01 00:00:00+00:00 | 10.0 | -83.05 | 4.0 | 78767 | 15000.0 | 80.0 | 3.6 | | 300.15 | ... | 0.0 | | | | | 88.0 | 5.0 | 2.0 | 2.0 | 2021-01-01 |
| 4 | 2021-01-01 00:00:00+00:00 | 14.9 | -85.93 | 442.0 | 78714 | 13000.0 | 90.0 | 1.5 | | 298.85 | ... | 0.0 | | | | 4.2 | 88.0 | 15.0 | 4.0 | 4.0 | 2021-01-01 |
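
Before filtering on 'category', it can help to see how many variables fall into each category. The sketch below counts them with `collections.Counter`. It uses stand-in objects rather than the real `metar_dataset.variables`, and the label "secondary data" is a made-up placeholder, since only "primary descriptors" and "primary data" appear in this notebook.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in for metar_dataset.variables; only the `category` attribute is modelled.
# "secondary data" is a hypothetical label used for illustration.
variables = {
    "LAT": SimpleNamespace(category="primary descriptors"),
    "LON": SimpleNamespace(category="primary descriptors"),
    "TMPSQ1.TMDB": SimpleNamespace(category="primary data"),
    "ICESQ1.ROIA": SimpleNamespace(category="secondary data"),
}

# Tally variables per category, most common first.
category_counts = Counter(var.category for var in variables.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count}")
```

With the real dataset, the same `Counter` expression over `metar_dataset.variables.values()` should list whatever category labels the catalog actually defines.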


In [None]:
# plot map of temperatures for a day