# QC flags

By default, the QC flags are applied: for each numeric data column that has an associated QC flag column, values whose QC flag is not "0" are set to NaN.
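The masking rule described above can be illustrated with plain pandas. This is only a minimal sketch on toy data, not the library's actual implementation, and the column names `solar_rad` and `solar_rad_flag` are hypothetical stand-ins for a data column and its QC flag column.

```python
import pandas as pd

# Toy data; the column names and values are made up for illustration.
toy = pd.DataFrame(
    {
        "solar_rad": [120.0, 135.0, 9999.0, 140.0],
        "solar_rad_flag": ["0", "0", "3", "0"],
    }
)

# Keep values whose flag is "0"; all other values become NaN.
masked = toy["solar_rad"].where(toy["solar_rad_flag"] == "0")
print(masked)
```

Only the third value, whose flag is "3", is replaced with NaN; the flag column itself is left unchanged.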

See {doc}`select-sites` for more information about selecting sites and
{doc}`daily` / {func}`uscrn.get_data` and {doc}`nrt` / {func}`uscrn.get_nrt_data` for more information about loading data.

In [None]:
import pandas as pd

import uscrn

In [None]:
station_id = "1045"  # Boulder, CO

# Load the same station-year twice: with QC applied (the default) and without
df = uscrn.get_data(2019, "hourly", station_id=station_id, n_jobs=1)
df_no_qc = uscrn.get_data(2019, "hourly", station_id=station_id, apply_qc=False)

In [None]:
# Variables that have an associated QC flag column
qc_vns = [k for k, v in df.attrs["attrs"].items() if v["qc_flag_name"]]

# Count the occurrences of each flag value, one row per variable
counts = []
for vn in qc_vns:
    fn = df.attrs["attrs"][vn]["qc_flag_name"]
    counts.append(df[fn].value_counts().convert_dtypes().rename(vn))

counts = pd.DataFrame(counts)
counts

In [None]:
# Pick the variable with the fewest "0" flags, i.e. the most values masked by QC
vn = counts.sort_values(by="0").iloc[0].name

pd.concat(
    [
        df[vn].isnull().value_counts().rename("qc"),
        df_no_qc[vn].isnull().value_counts().rename("no qc"),
    ],
    axis=1,
)

In [None]:
df.sur_temp_type.value_counts()

## IR surface measurement type

Near-real-time (NRT) data are (presumably) more likely to include uncorrected values.

In [None]:
df = uscrn.get_nrt_data((-4, None), "hourly", n_jobs=2)

In [None]:
df.sur_temp_type.value_counts()

In [None]:
# Stations (by WBAN) with at least one record of type "U"
wbans = sorted(df.query("sur_temp_type == 'U'").wban.unique())
print(wbans)
print(len(wbans))
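To see how many records of each type a station has, rather than only which stations have any "U" records, one option is a cross-tabulation. The sketch below uses a toy frame with made-up WBAN values in place of the NRT data; only the `wban` and `sur_temp_type` column names are taken from the cells above.

```python
import pandas as pd

# Toy stand-in for the NRT frame; the WBAN values and records are made up.
toy = pd.DataFrame(
    {
        "wban": ["00001", "00001", "00002", "00002", "00002"],
        "sur_temp_type": ["C", "U", "C", "C", "C"],
    }
)

# Rows: station; columns: measurement type; cells: number of records.
type_counts = pd.crosstab(toy["wban"], toy["sur_temp_type"])
print(type_counts)

# Stations with at least one "U" record (assumes "U" occurs in the data,
# otherwise the "U" column would not exist).
wbans_with_u = sorted(type_counts.index[type_counts["U"] > 0])
print(wbans_with_u)
```

The same `pd.crosstab` call should work on the real frame by passing `df["wban"]` and `df["sur_temp_type"]`.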