R interface for the NASS QUICK STATS API

rnassqs (R NASS QuickStats)

rnassqs is an R package for accessing NASS Quick Stats data through the NASS API. It is fairly low level and does not include much scaffolding or setup. Some things may change, but at this point it is relatively stable.

Installing

Install like any R package from GitHub:

library(devtools)
install_github('potterzot/rnassqs')

API Key

To use the NASS Quick Stats API you need an API key. There are several ways to give the rnassqs package your API key. You can set the variable explicitly and pass it to functions, like so:

params <- list(...)                    # parameters for query 
api_key <- "<your api key here>"       # api key
data <- nassqs(params, key = api_key)  # query and return data

Alternatively, you can set the API key as an environment variable, either by adding it to your .Renviron like so:

NASSQS_TOKEN="<your api key here>"

or by setting it explicitly in the console by calling rnassqs::nassqs_token(). This will prompt you to enter the api key if not set, and return the value of the api key if it is set. If you do not set the key and you are running the session interactively, R will ask you for the key when you try to issue a query.
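As a minimal sketch of the environment-variable route, the variable can also be set and checked for the current session with base R alone. This assumes the package reads NASSQS_TOKEN from the environment, as the .Renviron line above implies; the key shown is a placeholder, not a real key.

```r
# Set the key for the current session only (placeholder value, not a real key).
Sys.setenv(NASSQS_TOKEN = "<your api key here>")

# Read it back; Sys.getenv() returns "" when the variable is unset.
key <- Sys.getenv("NASSQS_TOKEN")
key_is_set <- nzchar(key)
```

A value set with Sys.setenv() lasts only for the current R session, whereas an entry in .Renviron is read each time R starts.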

Usage

See the examples in inst/examples for quick recipes to download data.

The most basic level of access is nassqs_GET(), which accepts any combination of query parameters. For example, to mirror the request shown in the NASS API documentation, you can do:

library(rnassqs)
# Query parameters: corn records for Virginia from 2012 onward
params = list("commodity_desc"="CORN", "year__GE"=2012, "state_alpha"="VA")
req = nassqs_GET(params=params, key=your_api_key)  # raw API response
qsJSON = nassqs_parse(req)                         # parse the response

Note that you can request data for multiple values of the same parameter by repeating the parameter name, as follows:

params = list("commodity_desc"="CORN", "year__GE"=2012, "state_alpha"="VA", "state_alpha"="WA")
req = nassqs_GET(params=params, key=your_api_key)
qsJSON = nassqs_parse(req)
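Repeating a name works because R lists may carry duplicate names. The sketch below uses only base R to inspect such a list; it illustrates how R stores the repeated parameter, not how the package encodes it in the request.

```r
params <- list("commodity_desc" = "CORN",
               "year__GE" = 2012,
               "state_alpha" = "VA",
               "state_alpha" = "WA")

# Names are kept in order, duplicates included.
param_names <- names(params)

# `[[` returns only the first element with a matching name...
first_state <- params[["state_alpha"]]

# ...so compare names directly to collect every value.
states <- unlist(params[names(params) == "state_alpha"], use.names = FALSE)
```

Because `[[` and `$` see only the first match, any code that post-processes such a list should select by comparing names, as in the last line.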

NASS does not allow GET requests that pull more than 50,000 records in one request. The function will inform you if you try to do that. It will also inform you if you’ve requested a set of parameters for which there are no records.
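One way to stay under the record limit is to split a large query into smaller ones, for example one per year. The following is a sketch only: it builds the per-year parameter lists with base R and leaves the actual requests commented out, since they need a key and network access. It assumes that "year" is accepted as a parameter name (the base of "year__GE" above) and that the per-year results have matching columns.

```r
base_params <- list("commodity_desc" = "CORN", "state_alpha" = "VA")
years <- 2012:2014

# One parameter list per year; each could be passed to nassqs() separately.
params_by_year <- lapply(years, function(y) c(base_params, list("year" = y)))

# Requires a key and network access, so not run here:
# results <- lapply(params_by_year, nassqs)
# df <- do.call(rbind, results)
```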

Handling inequalities and operators other than “=”

The NASS API handles other operators by modifying the variable name. The API can accept the following modifications:

  • __LE: <=
  • __LT: <
  • __GT: >
  • __GE: >=
  • __LIKE: like
  • __NOT_LIKE: not like
  • __NE: not equal
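The suffix convention in the list above can be wrapped in a small helper. The function qs_op() below is hypothetical and not part of rnassqs; it only pastes a suffix from that list onto a parameter name. Whether the API accepts both a lower and an upper bound on the same variable in one request is an assumption here.

```r
# Hypothetical helper: append an operator suffix to a parameter name.
qs_op <- function(name,
                  op = c("LE", "LT", "GT", "GE", "LIKE", "NOT_LIKE", "NE")) {
  op <- match.arg(op)  # rejects suffixes that are not in the list
  paste0(name, "__", op)
}

# Build a parameter list with computed names using setNames().
params <- setNames(
  list("CORN", 2000, 2010),
  c("commodity_desc", qs_op("year", "GE"), qs_op("year", "LE"))
)
param_names <- names(params)
```

Using match.arg() means a mistyped operator fails in R before any request is sent.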

For example, to request corn yields in Virginia for all years since 2000, you would use something like:

params = list("commodity_desc"="CORN", 
              "year__GE"=2000, 
              "state_alpha"="VA", 
              "statisticcat_desc"="YIELD")
df = nassqs(params=params)  # returns data as a data frame

You could also use the helper function nassqs_yield():

nassqs_yield(list("commodity_desc"="CORN", "agg_level_desc"="NATIONAL")) #gets US yields

Alternatives

NASS also provides a daily tarred and gzipped file of its entire dataset. At the time of writing it is approaching 1 GB. You can download that file from the NASS FTP site:

ftp://ftp.nass.usda.gov/quickstats

The FTP site also hosts separate files for the NASS census (conducted every 5 years; the next is 2017) and for individual sectors (CROPS, ECONOMICS, ANIMALS & PRODUCTS). At the time of this writing, sector-specific files for ENVIRONMENTAL and DEMOGRAPHICS are not available.