R Package for Accessing EPA AQS Data
This package provides an R interface for the EPA Air Quality System (AQS) API. Information about the API, including Terms of Service, is available at https://aqs.epa.gov/aqsweb/documents/data_api.html.
The aqsr package can be installed by running:
devtools::install_github("jpkeller/aqsr")
The AQS API requires an email address and key for all queries. The key
is not used for authentication (as a password would be), but it is used
for identification. Sign up using the aqs_signup() function, and your
key will be emailed to you.
Once an email address and key are registered, assign them to a list in
the working environment using create_user(). For example:
myuser <- create_user(email="myemail@mydomain",
                      key="mykeyhere")
Alternatively, the email address and key can be stored as the
environment variables AQS_EMAIL and AQS_KEY, respectively, to avoid
directly storing the values in code that might be part of a public
repository. To do this, add the lines
AQS_EMAIL="myemail@mydomain"
AQS_KEY="mykeyhere"
to the .Renviron file in your home directory. Calling create_user()
without any arguments will then read these values:
myuser <- create_user() # Use stored credentials for user
Care should still be taken to avoid storing the resulting object in a public repository.
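Before relying on the stored credentials, it can help to confirm that both environment variables are actually visible to the R session. The following is a minimal sketch using only base R; missing_aqs_env() is a hypothetical helper written for illustration and is not part of aqsr.

```r
# Hypothetical helper (not part of aqsr): return the names of the
# credential environment variables that are unset or empty.
missing_aqs_env <- function(vars = c("AQS_EMAIL", "AQS_KEY")) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_aqs_env()
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "),
          ". Add them to ~/.Renviron and restart R.")
} else {
  message("AQS_EMAIL and AQS_KEY are both set.")
}
```

Changes to .Renviron are only read when R starts, so a restart is needed after editing the file.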
All functions for requesting data from the API require that the list
generated by create_user() be provided in the aqs_user argument.
A full list of services provided by the AQS API can be accessed by
calling list_services(). These services include sampleData, signup,
list, and metaData, among others. The endpoints for each service are
listed in list_endpoints(), and the variables required for each endpoint
are listed in list_required_vars(). For example:
# List all services
list_services()
## [1] "signup" "metaData"
## [3] "list" "monitors"
## [5] "sampleData" "dailyData"
## [7] "annualData" "qaBlanks"
## [9] "qaCollocated" "qaFlowRateVerifications"
## [11] "qaFlowRateAudits" "qaOnePointQcRawData"
## [13] "qaPepAudits"
# List endpoints for "dailyData" service
list_endpoints(service="dailyData")
## [1] "bySite" "byCounty" "byState" "byBox" "byCBSA"
# List variables needed for obtaining data using the "byCounty" endpoint
list_required_vars(endpoint="byCounty")
## [1] "param" "bdate" "edate" "state" "county"
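The list of required variables can be used to check a planned query before it is sent. The sketch below copies the "byCounty" variables from the output above into a plain character vector, so it runs without API access; missing_vars() is a hypothetical helper, not an aqsr function.

```r
# Required variables for the "byCounty" endpoint, copied from the
# list_required_vars() output shown above.
required <- c("param", "bdate", "edate", "state", "county")

# Hypothetical helper (not part of aqsr): names in `required` that are
# absent from a named list of query arguments.
missing_vars <- function(args, required) {
  setdiff(required, names(args))
}

# A query that forgot to specify the county
query <- list(param = "42602", bdate = "20170101", edate = "20170110",
              state = "53")
missing_vars(query, required)
## [1] "county"
```

In practice, the `required` vector could be taken directly from list_required_vars() rather than typed by hand.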
Information on parameter codes and required input for defining data
requests can be obtained from the API using aqs_list(),
aqs_list_parameters(), and the related functions.
For example, to list the available parameter classes (groups of parameters), use:
aqs_list_classes(aqs_user=myuser)
##     code                     value_represented
## 2   AIRNOW MAPS              The parameters represented on AirNow maps (88101, 88502, and 44201)
## 27  ALL                      Select all Parameters Available
## 3   AQI POLLUTANTS           Pollutants that have an AQI Defined
## 4   CORE_HAPS                Urban Air Toxic Pollutants
## 5   CRITERIA                 Criteria Pollutants
## 6   CSN DART                 List of CSN speciation parameters to populate the STI DART tool
## 7   FORECAST                 Parameters routinely extracted by AirNow (STI)
## 8   HAPS                     Hazardous Air Pollutants
## 9   IMPROVE CARBON           IMPROVE Carbon Parameters
## 10  IMPROVE_SPECIATION       PM2.5 Speciated Parameters Measured at IMPROVE sites
## 11  MET                      Meteorological Parameters
## 12  NATTS CORE HAPS          The core list of toxics of interest to the NATTS program.
## 13  NATTS REQUIRED           Required compounds to be collected in the National Air Toxics Network
## 14  PAMS                     Photochemical Assessment Monitoring System
## 15  PAMS_VOC                 Volatile Organic Compound subset of the PAMS Parameters
## 16  PM COARSE                PM between 2.5 and 10 micrometers
## 17  PM10 SPECIATION          PM10 Speciated Parameters
## 18  PM2.5 CONT NONREF        PM2.5 Continuous, Nonreference Methods
## 19  PM2.5 MASS/QA            PM2.5 Mass and QA Parameters
## 20  SCHOOL AIR TOXICS        School Air Toxics Program Parameters
## 21  SPECIATION               PM2.5 Speciated Parameters
## 22  SPECIATION CARBON        PM2.5 Speciation Carbon Parameters
## 23  SPECIATION CATION/ANION  PM2.5 Speciation Cation/Anion Parameters
## 24  SPECIATION METALS        PM2.5 Speciation Metal Parameters
## 25  UATMP CARBONYL           Urban Air Toxics Monitoring Program Carbonyls
## 26  UATMP VOC                Urban Air Toxics Monitoring Program VOCs
To list the codes for the criteria air pollutants, use:
aqs_list_parameters(myuser, pc="CRITERIA")
## code value_represented
## 2 14129 Lead (TSP) LC
## 21 42101 Carbon monoxide
## 3 42401 Sulfur dioxide
## 4 42602 Nitrogen dioxide (NO2)
## 5 44201 Ozone
## 6 81102 PM10 Total 0-10um STP
## 7 85129 Lead PM10 LC FRM/FEM
## 8 88101 PM2.5 - Local Conditions
The primary functions for requesting measurements stored in the AQS
database are aqs_annualData(), aqs_dailyData(), and aqs_sampleData().
Variations of each function exist for queries targeting a specific
criterion, e.g. aqs_annualData_byState(). The underlying function that
queries the API is aqs_get(), which can be called directly if desired.
The following example requests all PM2.5 measurements (parameter code
88101) for California (state code 06) from January 1, 2017 through
January 4, 2017:
# Sample Data -- By State
s1 <- aqs_sampleData(aqs_user=myuser,
                     endpoint="byState",
                     state="06",
                     bdate="20170101",
                     edate="20170104",
                     param="88101")
dim(s1)
## [1] 6915 27
s1[1:2,]
## state_code county_code site_number parameter_code poc latitude longitude
## 2 06 075 0005 88101 3 37.76595 -122.399
## 21000 06 075 0005 88101 3 37.76595 -122.399
## datum parameter date_local time_local date_gmt time_gmt
## 2 WGS84 PM2.5 - Local Conditions 2017-01-01 00:00 2017-01-01 08:00
## 21000 WGS84 PM2.5 - Local Conditions 2017-01-01 01:00 2017-01-01 09:00
## sample_measurement units_of_measure sample_duration
## 2 11 Micrograms/cubic meter (LC) 1 HOUR
## 21000 6 Micrograms/cubic meter (LC) 1 HOUR
## sample_frequency detection_limit uncertainty qualifier method_type
## 2 HOURLY 5 NA <NA> FEM
## 21000 HOURLY 5 NA <NA> FEM
## method_code method
## 2 170 Met One BAM-1020 Mass Monitor w/VSCC - Beta Attenuation
## 21000 170 Met One BAM-1020 Mass Monitor w/VSCC - Beta Attenuation
## state county date_of_last_change cbsa_code
## 2 California San Francisco 2017-03-22 41860
## 21000 California San Francisco 2017-03-22 41860
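The bdate and edate arguments in the request above are strings in YYYYMMDD form. When the dates already exist as Date objects, or need to be computed, they can be formatted with base R. This is a small sketch; aqs_date() is a hypothetical helper, not part of aqsr.

```r
# Hypothetical helper (not part of aqsr): format a date as YYYYMMDD.
aqs_date <- function(x) format(as.Date(x), "%Y%m%d")

bdate <- aqs_date("2017-01-01")
edate <- aqs_date(as.Date("2017-01-01") + 3)  # three days after bdate

# bdate is "20170101" and edate is "20170104", matching the strings
# passed to aqs_sampleData() above.
c(bdate = bdate, edate = edate)
```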
The next example requests all NO2 measurements (parameter code 42602)
for King County, WA from January 1, 2017 through January 10, 2017.
First, we find the state code for Washington:
state_fips <- aqs_list_states(myuser)
tail(state_fips, 10)
## code value_represented
## 47 51 Virginia
## 48 53 Washington
## 49 54 West Virginia
## 50 55 Wisconsin
## 51 56 Wyoming
## 52 66 Guam
## 53 72 Puerto Rico
## 54 78 Virgin Islands
## 55 80 Country Of Mexico
## 56 CC Canada
From this list, we see that the state code for Washington is "53". Now
we find the code for King County:
wa_counties <- aqs_list_counties(myuser, state="53")
head(wa_counties)
## code value_represented
## 2 001 Adams
## 210 003 Asotin
## 3 005 Benton
## 4 007 Chelan
## 5 009 Clallam
## 6 011 Clark
From the full results (only the first six rows are shown above), we see
that the King County code is "033". We now request the sample data:
s2 <- aqs_sampleData(aqs_user=myuser,
                     endpoint="byCounty",
                     state="53",
                     county="033",
                     bdate="20170101",
                     edate="20170110",
                     param="42602")
s2[1:2,]
## state_code county_code site_number parameter_code poc latitude longitude
## 2 53 033 0030 42602 1 47.59722 -122.3197
## 2100 53 033 0030 42602 1 47.59722 -122.3197
## datum parameter date_local time_local date_gmt time_gmt
## 2 WGS84 Nitrogen dioxide (NO2) 2017-01-01 00:00 2017-01-01 08:00
## 2100 WGS84 Nitrogen dioxide (NO2) 2017-01-01 01:00 2017-01-01 09:00
## sample_measurement units_of_measure sample_duration sample_frequency
## 2 9.5 Parts per billion 1 HOUR HOURLY
## 2100 NA Parts per billion 1 HOUR HOURLY
## detection_limit uncertainty qualifier
## 2 0.05 NA <NA>
## 2100 0.05 NA AI - Insufficient Data (cannot calculate).
## method_type method_code
## 2 FRM 599
## 2100 FRM 599
## method state county
## 2 Instrumental - Chemiluminescence Teledyne API 200 EU/501 Washington King
## 2100 Instrumental - Chemiluminescence Teledyne API 200 EU/501 Washington King
## date_of_last_change cbsa_code
## 2 2017-04-25 42660
## 2100 2017-04-25 42660
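The county code used in this request can also be looked up programmatically rather than by reading through the printed list. The sketch below uses a small stand-in data frame with the same two columns shown in the aqs_list_counties() output, so it runs without API access; it assumes both columns are character vectors.

```r
# Stand-in for the data frame returned by aqs_list_counties(); only a
# few rows are reproduced, using codes quoted in this document.
wa_counties <- data.frame(
  code = c("001", "003", "033"),
  value_represented = c("Adams", "Asotin", "King"),
  stringsAsFactors = FALSE
)

# Select the code whose county name matches exactly
king_code <- wa_counties$code[wa_counties$value_represented == "King"]
king_code
## [1] "033"
```

The same pattern should apply to the state list, since it is shown with the same code and value_represented columns.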
We could also request data for a single site in King County, here site
530330030 (state 53, county 033, site number 0030):
s3 <- aqs_sampleData(aqs_user=myuser,
                     endpoint="bySite",
                     state="53",
                     county="033",
                     site="0030",
                     bdate="20170201",
                     edate="20170210",
                     param="42602")
s3[1:2,]
## state_code county_code site_number parameter_code poc latitude longitude
## 2 53 033 0030 42602 1 47.59722 -122.3197
## 241 53 033 0030 42602 1 47.59722 -122.3197
## datum parameter date_local time_local date_gmt time_gmt
## 2 WGS84 Nitrogen dioxide (NO2) 2017-02-01 00:00 2017-02-01 08:00
## 241 WGS84 Nitrogen dioxide (NO2) 2017-02-01 01:00 2017-02-01 09:00
## sample_measurement units_of_measure sample_duration sample_frequency
## 2 20 Parts per billion 1 HOUR HOURLY
## 241 NA Parts per billion 1 HOUR HOURLY
## detection_limit uncertainty qualifier
## 2 0.05 NA <NA>
## 241 0.05 NA AI - Insufficient Data (cannot calculate).
## method_type method_code
## 2 FRM 599
## 241 FRM 599
## method state county
## 2 Instrumental - Chemiluminescence Teledyne API 200 EU/501 Washington King
## 241 Instrumental - Chemiluminescence Teledyne API 200 EU/501 Washington King
## date_of_last_change cbsa_code
## 2 2017-05-09 42660
## 241 2017-05-09 42660
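The bySite request above takes the state, county, and site number as separate arguments, while site identifiers are often written as a single 9-character string such as 530330030. Assuming that string is the 2-digit state code, 3-digit county code, and 4-digit site number concatenated (as it is in this example), it can be split with base R. split_site_id() is a hypothetical helper, not part of aqsr.

```r
# Hypothetical helper (not part of aqsr): split a 9-character site ID
# into state (2), county (3), and site (4) components.
split_site_id <- function(id) {
  stopifnot(is.character(id), all(nchar(id) == 9))
  list(state  = substr(id, 1, 2),
       county = substr(id, 3, 5),
       site   = substr(id, 6, 9))
}

parts <- split_site_id("530330030")
# parts$state is "53", parts$county is "033", parts$site is "0030"
parts
```

Keeping the ID as a character string matters here: a numeric value would lose the leading zeros in codes such as "06" or "0030".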
Data can also be requested by metropolitan area, specifically by Core-Based Statistical Area (CBSA). The available CBSA codes can be listed with:
aqs_list(myuser, endpoint="cbsas")
From this output (hidden here for brevity), we see that the CBSA code
for the greater Atlanta, GA region is 12060. We can now request all
PM2.5 observations from March 1, 2017 in the Atlanta-Sandy
Springs-Roswell CBSA:
s4 <- aqs_sampleData(aqs_user=myuser,
                     endpoint="byCBSA",
                     cbsa="12060",
                     bdate="20170301",
                     edate="20170301",
                     param="88101")
s4[1:2,]
## state_code county_code site_number parameter_code poc latitude longitude
## 2 13 089 0002 88101 3 33.6878 -84.2905
## 26 13 089 0002 88101 3 33.6878 -84.2905
## datum parameter date_local time_local date_gmt time_gmt
## 2 NAD83 PM2.5 - Local Conditions 2017-03-01 00:00 2017-03-01 05:00
## 26 NAD83 PM2.5 - Local Conditions 2017-03-01 01:00 2017-03-01 06:00
## sample_measurement units_of_measure sample_duration
## 2 NA Micrograms/cubic meter (LC) 1 HOUR
## 26 NA Micrograms/cubic meter (LC) 1 HOUR
## sample_frequency detection_limit uncertainty qualifier
## 2 HOURLY 5 NA AN - Machine Malfunction.
## 26 HOURLY 5 NA AN - Machine Malfunction.
## method_type method_code
## 2 FEM 170
## 26 FEM 170
## method state county
## 2 Met One BAM-1020 Mass Monitor w/VSCC - Beta Attenuation Georgia DeKalb
## 26 Met One BAM-1020 Mass Monitor w/VSCC - Beta Attenuation Georgia DeKalb
## date_of_last_change cbsa_code
## 2 2018-01-31 12060
## 26 2018-01-31 12060
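The data frames returned above store dates and times in separate columns and mark unavailable measurements as NA. A possible first cleaning step is sketched below on a stand-in data frame built from two of the s2 rows shown earlier, so that it runs without API access. It assumes the date and time columns are returned as character strings.

```r
# Stand-in for two rows of s2, using values from the output above.
s2_demo <- data.frame(
  date_gmt = c("2017-01-01", "2017-01-01"),
  time_gmt = c("08:00", "09:00"),
  sample_measurement = c(9.5, NA),
  stringsAsFactors = FALSE
)

# Combine the date and time columns into a single UTC timestamp
s2_demo$datetime_gmt <- as.POSIXct(
  paste(s2_demo$date_gmt, s2_demo$time_gmt),
  format = "%Y-%m-%d %H:%M", tz = "UTC"
)

# Keep only rows with a reported measurement
s2_valid <- s2_demo[!is.na(s2_demo$sample_measurement), ]
nrow(s2_valid)
## [1] 1
```

Before dropping rows, it may be worth inspecting the qualifier column, since it records why a measurement is missing (for example, "AN - Machine Malfunction.").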