Skip to content

bcgov/envair

Repository files navigation

bcgov/envair: BC air quality data retrieval and analysis tool

bcgovr

img License

Overview

bcgov/envair is an R package developed by the BC Ministry of Environment and Climate Change Strategy Knowledge Management Branch/Environmental and Climate Monitoring Section (ENV/KMB/ECMS) air quality monitoring unit. Package enables R-based retrieval and processing of air quality monitoring data. Output is now compatible with the popular openair package.

Installation

You can install envair directly from this GitHub repository. To proceed, you will need the remotes package:

install.packages("remotes")

Next, install and load the envair package using remotes::install_github():

remotes::install_github("bcgov/envair")
library(envair)

Features

  • Retrieve data from the Air Quality Data Archive by specifying parameter (pollutant) or station. Functions have the option to add Transboundary Flow Exceptional Event (TFEE) flags. Data archive is located in the ENV’s FTP server: ftp://ftp.env.gov.bc.ca/pub/outgoing/AIR/

  • Generate annual metrics, data captures and statistical summaries following the Guidance Document on Achievement Determination and the Canadian Ambient Air Quality Standards

  • Most of data processing functions can automatically process a parameter as input, or a dataframe of air quality data.

  • Retrieve archived and current ventilation index data with options to generate a kml map.

Functions

  • importBC_data() Retrieves station or parameter data from specified year/s between 1980 and yesterday.

    • for you can specify one or several parameters, or name of one or several stations
    • if station is specified, output returns a wide table following the format of the openair package. It also renames all columns into lowercase letters, changes scalar wind speed to ws, and vector wind direction to wd. It also shifts the datetime to the time-beginning format
    • if parameter is specified, output displays data from all air quality monitoring stations that reported this data.
    • use flag_TFEE = TRUE to add a new boolean (TRUE/FALSE) column called flag_tfee. This option only works when you enter parameter (not station) in the parameter_or_station.
    • use merge_Stations = TRUE to merge data from monitoring stations and corresponding alternative stations, especially in locations where the monitoring station was relocated. This function may also change the name of the air quality monitoring station.
    • set pad = TRUE to pad missing dates, set use_ws_vector = TRUE to use vector wind speed instead of scalar, and set use_openairformat = FALSE to produce the original non-openair output.
  • importBC_data_avg() Retrieves pollutant (parameter) data and performs statistical averaging based on the specified averaging_type

    • function can retrieve 24-hour averages (24-hr), daily 1-hour maximum (d1hm), daily 8-hour maximum (d8hm), rolling 8-hour values (8-hr)
    • it can also make annual summaries such as 98th percentile of daily 1-hour maximum, annual mean of 24-hour values. To perform annual summaries, the averaging_type should include “annual <averaging/percentile> <1-hour or 24-hour or dxhm>”
      • function can calculate the number of times a certain value has been exceeded

        List of possible values for the averaging_type.
        Type of averaging averaging_type= Syntax Output description
        1-hr “1-hr” Outputs hourly data. No averaging done.
        Daily Average “24-hr” Outputs the daily (24-hour) values.
        Rolling 8-hour “8-hr” Outputs hourly values that were calculated from rolling 8-hour average.
        Daily 1-hour maximum “d1hm” Outputs daily values of the highest 1-hour concentration
        Daily 8-hour maximum “d8hm” Outputs the daily 8-hour maximum for each day
        Annual mean of 1-hour values “annual mean 1-hr” Outputs the average of all hourly values
        Annual mean of daily values

        “annual mean 24-hr”

        “annual mean <avging>”

        Outputs average of all daily values.
        Annual 98th percentile of 1-hour values

        “annual 98p 1-hr”

        “annual <xxp> <avging>

        Outputs the 98th percentile of the 1-hour values.
        4th Highest daily 8-hour maximum

        “annual 4th d8hm”

        “annual <rank> <avging>

        Outputs the 4th highest daily 8-hour maximum.
        Number of daily values exceeding 28 µg/m3

        “exceed 28 24-hr”

        “exceed <value> <avging>

        Outputs the number of days where the 28 µg/m3 is exceeded
        Number of exceedance to d8hm of 62ppb

        “exceed 62 d8hm”

        “exceed <value> <avging>

        Outputs the number of days where the daily 8-hour maximum exceeds 62 ppb

        List of possible values for the averaging_type.

  • get_stats() Retrieves a statistical summary based on the default for the pollutant. Currently applies to PM2.5, O3, NO2, and SO2. Output includes data captures, annual metrics, and exceedances.

  • get_captures() Calculates the data captures for a specified pollutant or dataframe. Output is a long list of capture statistics such as hourly, daily, quarterly, and annual sumamaries.

  • listBC_stations() Lists details of all air quality monitoring stations (active or inactive)

  • list_parameters() Lists the parameters that can be imported by importBC_data()

  • importECCC_forecast() Retrieves AQHI, PM2.5, PM10, O3, and NO2 forecasts from the ECCC datamart

  • get_venting_summary() Summarizes the ventilation index, counting the number of GOOD, FAIR, or POOR days for the month

  • GET_VENTING_ECCC() Retrieve the venting index FLCN39 from Environment and Climate Change Canada datamart or from the B.C.’s Open Data Portal

  • ventingBC_kml() Creates a kml or shape file based on the 2019 OBSCR rules. This incorporates venting index and sensitivity zones.

Usage and Examples

importBC_data()


Retrieving air quality data, include TFEE adjustment and combine related stations

Use flag_TFEE = TRUE an merge_Stations = TRUE to produce a result that defines the TFEE and merges station and instruments, as performed during the CAAQS-reporting process.

library(envair)
df_data <- importBC_data('pm25',years = 2015:2017, flag_TFEE = TRUE,merge_Stations = TRUE)

knitr::kable(df_data[1:4,])
Using openair package function on BC ENV data.

By default, this function produces openair-compatible dataframe output. This renames WSPD_VECT,WDIR_VECT into ws and wd, changes pollutant names to lower case characters (e.g., pm25,no2,so2), and shifts the date from time-ending to time-beginning format. To use, specify station name and year/s. For a list of stations, use listBC_stations() function. If no year is specified, function retrieves latest data, typically the unverfied data from start of year to current date.

library(openair)
PG_data <- importBC_data('Prince George Plaza 400',2010:2012)
pollutionRose(PG_data,pollutant='pm25')

Other features for station data retrieval
  • To import without renaming column names, specify use_openairformat = FALSE. This also keeps date in time-ending format
  • By default, vector wind direction and scalar wind speeds are used
  • To use vector wind speed, use use_ws_vector = TRUE
  • Station name is not case sensitive, and works on partial text match
  • Multiple stations can be specified c(‘Prince George’,‘Kamloops’)
  • For non-continuous multiple years, use c(2010,2011:2014)
importBC_data('Prince George Plaza 400',2010:2012,use_openairformat = FALSE)
importBC_data('Kamloops',2015)
importBC_data(c('Prince George','Kamloops'),c(2010,2011:2014))
importBC_data('Trail',2015,pad = TRUE)              
Retrieve parameter data

Specify parameter name to retrieve data. Note that these are very large files and may use up your computer’s resources. List of parameters can be found using the list_parameters() function.

pm25_3year <- importBC_data('PM25',2010:2012)

importBC_data_avg()


Retrieving the annual average of daily values for multiple parameters

The function is capable of processing multiple parameters, and multiple years, but it can only do one averaging type. The averaging type can be a simple averaging (e.g., 24-hour, 8-hour, or combined averaging (e.g., annual 98p d1hm, annual mean 24-hour). Check the table above for a comprehensive list of averaging_type.

#user can specify the parameter, enter parameter name as input
annual_mean <- importBC_data_avg(c('pm25','o3'), years = 2015:2018, averaging_type = 'annual mean 24-hr')

#or if you already have a dataframe, you can use the dataframe as input to for the statistical summary
df_input <- importBC_data(param = c('pm25','o3'), years = 2015:2018)
annual_mean <- importBC_data_avg(df_input,averaging_type = 'annual mean 24-hr')

get_stats()


Calculate the annual metrics of PM2.5

The function will calculate statistical summaries , data captures, and number of exceedances based on the CAAQS metrics and values. The function will only perform a year-by-year calculation so the results are not the actual CAAQS metrics, but can be used to derive it. For ozone, it creates Q2 + Q3

#example retrieves stat summaries
stats_result <- get_stats(param = 'o3', years = 2016,add_TFEE = TRUE, merge_Stations = TRUE)

get_captures()


Calculate the data captures of PM2.5

The function will create a summary of data captures for the parameter or for dataframe. You can specify the parameter or, if available, use an air quality dataframe as input.

#you can use the parameter as input
data_captures <- get_captures(param = c('pm25','o3'), years = 2015:2018,merge_Stations = TRUE)

#or you can use a dataframe 
air_data <- importBC_data(c('pm25','o3'), years = 2015:201,merge_Stations = TRUE)
data_captures <- get_captures(param = air_data, years = 2015:2018)

listBC_stations()


produces a dataframe that lists all air quality monitoring station details. if year is specified, it retrieves the station details from that year. Note that this entry may not be accurate since system has not been in place to generate these station details.

listBC_stations()
listBC_stations(2016)
STATION_NAME_FULL STATION_NAME EMS_ID NAPS_ID SERIAL CITY LAT LONG ELEVATION STATUS_DESCRIPTION OWNER REGION STATUS OPENED CLOSED NOTES SERIAL_CODE CGNDB AIRZONE
100 Mile House 100 Mile House M116006 NA 374 100 Mile House 51.6542 -121.375 1000 NON OPERATIONAL ENV 05 - Cariboo INACTIVE 1992-11-11 NA N/A UNKNOWN N/A Central Interior
100 Mile House BCAC 100 Mile House BCAC E218444 NA 228 100 MIle House 51.6461 -121.937 0 NON OPERATIONAL ENV 05 - Cariboo INACTIVE 2010-02-16 NA N/A UNKNOWN N/A Central Interior
Abbotsford A Columbia Street Abbotsford A Columbia Street E289309 NA 428 Abbotsford 49.0215 -122.3266 65 METRO VANCOUVER MVRD 02 - Lower Mainland ACTIVE 2012-07-25 NA N/A UNKNOWN N/A Lower Fraser Valley
Abbotsford A Columbia Street Met Abbotsford A Columbia Street E289309 NA 429 Abbotsford 49.0215 -122.3266 65 METRO VANCOUVER MVRD 02 - Lower Mainland ACTIVE 2012-07-25 NA N/A UNKNOWN N/A Lower Fraser Valley

list_parameters()


produce a vector string of available parameters that can be retrieved with importBC_data()

GET_VENTING_ECCC()


produces a dataframe containing the recent venting index.

GET_VENTING_ECCC()
GET_VENTING_ECCC('2019-11-08')
GET_VENTING_ECCC((dates = seq(from = lubridate::ymd('2021-01-01'),
        to = lubridate::ymd('2021-05-01'), by = 'day')))
VENTING_INDEX_ABBREV DATE_ISSUED CURRENT_VI CURRENT_VI_DESC CURRENT_WSPD CURRENT_MIX_HEIGHT TODAY_VI TODAY_VI_DESC TODAY_WSPD TODAY_MIX_HEIGHT TOMORROW_VI TOMORROW_VI_DESC TOMORROW_WSPD TOMORROW_MIX_HEIGHT NAME REGION LAT LONG
100 MILE 2024-01-12 73 GOOD 23 1686 48 FAIR 16 1474 23 POOR 15 1124 100 Mile House CENTRAL INTERIOR 51.63915 -121.2945
ATLIN 2024-01-12 11 POOR 10 742 11 POOR 10 745 15 POOR 10 817 Atlin NORTHERN BC 59.57000 -133.7000
BELLA COOLA 2024-01-12 37 FAIR 13 480 34 FAIR 21 283 21 POOR 11 256 Bella Coola COAST 52.38000 -126.7500
BURNS LAKE 2024-01-12 10 POOR 6 748 20 POOR 8 987 25 POOR 10 1026 Burns Lake CENTRAL INTERIOR 54.23142 -125.7597

importECCC_forecast()


  • Retrieves forecasts and model data from ECCC
  • parameters include AQHI, PM25, NO2, O3, PM10
importECCC_forecast('no2')

ventingBC_kml()


  • creates a kml object based on the 2019 OBSCR rules
  • directory to save kml file can be specified. File will be saved in that directory as Venting_Index_HD.kml.
ventingBC_kml()
ventingBC_kml('C:/temp/')