allow ct_search() to pull all monthly data for an entire year
ChrisMuir committed Mar 19, 2018
1 parent 993e462 commit 8f6f3bf
Showing 10 changed files with 262 additions and 187 deletions.
12 changes: 6 additions & 6 deletions NEWS.md
@@ -3,15 +3,15 @@ comtradr 0.1.0.09000

## NEW FEATURES

* For function `ct_search`, expanded the valid input types for args `start_date` and `end_date` (see [issue #10](https://github.com/ropensci/comtradr/issues/10) for details).
* Modifications to `ct_search()` to allow for pulling all monthly data for an entire year in a single query (issue [#14](https://github.com/ropensci/comtradr/issues/14)).
* For function `ct_search()`, expanded the valid input types for args `start_date` and `end_date` (issue [#10](https://github.com/ropensci/comtradr/issues/10)).

## BUG FIXES

* The updates generated by function `ct_update_databases()` are now properly preserved between R sessions ([issue](https://github.com/ropensci/comtradr/issues/11)).
* Passing `"services"` to arg `type` within function `ct_search()` now uses commodity scheme `EB02` by default (previously this would throw an error, fixes [issue #6](https://github.com/ropensci/comtradr/issues/6)).
* When using commodity scheme `EB02` within function `ct_search()`, passing `"TOTAL"` to arg `commod_codes` no longer returns zero results (fixes [issue #7](https://github.com/ropensci/comtradr/issues/7)).
* `ct_commodity_lookup()` no longer returns zero results when passing all caps input to arg `search_terms` (fixes [issue #9](https://github.com/ropensci/comtradr/issues/9)).

* The updates generated by function `ct_update_databases()` are now properly preserved between R sessions (issue [#11](https://github.com/ropensci/comtradr/issues/11)).
* Passing `"services"` to arg `type` within function `ct_search()` now uses commodity scheme `EB02` by default (previously this would throw an error, fixes issue [#6](https://github.com/ropensci/comtradr/issues/6)).
* When using commodity scheme `EB02` within function `ct_search()`, passing `"TOTAL"` to arg `commod_codes` no longer returns zero results (issue [#7](https://github.com/ropensci/comtradr/issues/7)).
* `ct_commodity_lookup()` no longer returns zero results when passing all caps input to arg `search_terms` (issue [#9](https://github.com/ropensci/comtradr/issues/9)).


comtradr 0.1.0
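
The NEWS entry above allows one full year of monthly data per query, so a multi-year monthly pull still needs one query per year. A minimal sketch of that loop follows. The helper `monthly_by_year()` and its `search_fn` argument are hypothetical and not part of comtradr; `search_fn` exists only so the loop can be exercised without hitting the API.

```r
# Hypothetical helper (not part of comtradr): fetch monthly data for several
# years by issuing one single-year query per year and row-binding the results.
# `search_fn` defaults to comtradr::ct_search, but any function that accepts
# start_date, end_date and freq and returns a data frame will work.
monthly_by_year <- function(years, ..., search_fn = comtradr::ct_search) {
  pieces <- lapply(years, function(yr) {
    # Passing the same year as start and end requests all months of that year.
    search_fn(..., start_date = yr, end_date = yr, freq = "monthly")
  })
  do.call(rbind, pieces)
}

# Usage against the real API (not run here):
# df <- monthly_by_year(2012:2013,
#                       reporters = "USA",
#                       partners = "Germany",
#                       trade_direction = "imports")
```

Each year is a separate request, so the usual per-query limits on reporters, partners and commodity codes still apply to every call.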
147 changes: 97 additions & 50 deletions R/ct_search.R
@@ -56,8 +56,11 @@
#' these three may use the catch-all input "All".
#' \item For the same group of three ("reporters", "partners", date range),
#' if the input is not "All", then the maximum number of input values
#' for each is five (for date range, if not using "all", then the
#' "start_date" and "end_date" must at most span five months or five years).
#' for each is five. For date range, if not using "All", then the
#' "start_date" and "end_date" must not span more than five months or five
#' years. There is one exception to this rule: if arg "freq" is "monthly",
#' then a single year can be passed to "start_date" and "end_date" and the
#' API will return all of the monthly data for that year.
#' \item For param "commod_codes", if not using input "All", then the maximum
#' number of input values is 20 (although "All" is always a valid input).
#' }
@@ -105,8 +108,8 @@
#' ex_2 <- ct_search(reporters = "Canada",
#' partners = "All",
#' trade_direction = "all",
#' start_date = "2011",
#' end_date = "2015",
#' start_date = 2011,
#' end_date = 2015,
#' commod_codes = shrimp_codes)
#' nrow(ex_2)
#'
@@ -199,21 +202,7 @@ ct_search <- function(reporters, partners,
if (any(c(start_date, end_date) %in% c("all", "All", "ALL"))) {
date_range <- "all"
} else {
start_date <- as.character(start_date)
end_date <- as.character(end_date)
if (freq == "M") {
date_range <- validate_date_inputs(start_date, end_date, freq)
date_range <- seq.Date(date_range$start_date, date_range$end_date,
by = "month") %>%
format(format = "%Y%m")
} else if (freq == "A") {
date_range <- validate_date_inputs(start_date, end_date, freq)
date_range <- seq.Date(date_range$start_date, date_range$end_date,
by = "year") %>%
format(format = "%Y")
}

date_range <- paste(date_range, collapse = ",")
date_range <- get_date_range(start_date, end_date, freq)
}

## Transformations to reporters.
@@ -457,41 +446,99 @@ execute_api_request <- function(url) {
}


#' Validate date input args
#' Get Date Range
#'
#' @return List of validated and transformed dates (start date and end date).
#' @return Date range as a single string, comma sep.
#' @noRd
validate_date_inputs <- function(start_date, end_date, freq) {
get_date_range <- function(start_date, end_date, freq) {
start_date <- as.character(start_date)
end_date <- as.character(end_date)

if (freq == "A") {
# For annual date ranges.
s_date <- as.Date(start_date, format = "%Y-%m-%d")
if (is.na(s_date)) {
s_date <- as.Date(paste0(start_date, "-01-01"), format = "%Y-%m-%d")
}
e_date <- as.Date(end_date, format = "%Y-%m-%d")
if (is.na(e_date)) {
e_date <- as.Date(paste0(end_date, "-01-01"), format = "%Y-%m-%d")
}
if (any(is.na(s_date), is.na(e_date))) {
stop("if 'freq' is 'annual', args 'start_date' & 'end_date' must",
" either be 'all' or be strings that have format 'yyyy-mm-dd' or",
" 'yyyy'", call. = FALSE)
}
# Date range when freq is "annual" (date range by year).
start_date <- convert_to_date(start_date)
end_date <- convert_to_date(end_date)
date_range <- seq.Date(start_date, end_date, by = "year") %>%
format(format = "%Y")
} else if (freq == "M") {
# For monthly date ranges.
s_date <- as.Date(start_date, format = "%Y-%m-%d")
if (is.na(s_date)) {
s_date <- as.Date(paste0(start_date, "-01"), format = "%Y-%m-%d")
}
e_date <- as.Date(end_date, format = "%Y-%m-%d")
if (is.na(e_date)) {
e_date <- as.Date(paste0(end_date, "-01"), format = "%Y-%m-%d")
}
if (any(is.na(s_date), is.na(e_date))) {
stop("if 'freq' is 'monthly', args 'start_date' & 'end_date' must",
" either be 'all' or be strings that have format 'yyyy-mm-dd' or",
" 'yyyy-mm'", call. = FALSE)
# Date range when freq is "monthly".
sd_year <- is_year(start_date)
ed_year <- is_year(end_date)
if (sd_year && ed_year) {
# If start_date and end_date are both years ("yyyy") and are identical,
# return the single year as the date range.
if (identical(start_date, end_date)) {
return(start_date)
} else {
stop("Cannot get more than a single year's worth of monthly data ",
"in a single query", call. = FALSE)
}
} else if (!sd_year && !ed_year) {
# If neither start_date nor end_date are years, get date range by month.
start_date <- convert_to_date(start_date)
end_date <- convert_to_date(end_date)
date_range <- seq.Date(start_date, end_date, by = "month") %>%
format(format = "%Y%m")
} else {
# Between start_date and end_date, if one is a year and the other isn't,
# throw an error.
stop("If arg 'freq' is 'monthly', 'start_date' and 'end_date' must ",
"have the same format", call. = FALSE)
}
}
return(list("start_date" = s_date, "end_date" = e_date))

# If the derived date range is longer than five elements, throw an error.
if (length(date_range) > 5) {
stop("If specifying years/months, cannot search more than five ",
"consecutive years/months in a single query", call. = FALSE)
}

return(paste(date_range, collapse = ","))
}


#' Given a numeric or character date, convert to an object with class "Date".
#'
#' @return Object of class "Date" (using base::as.Date()).
#' @noRd
convert_to_date <- function(date_obj) {
# Convert to char.
#date_obj <- as.character(date_obj)
# Convert to Date.
if (is_year(date_obj)) {
date_obj <- as.Date(paste0(date_obj, "-01-01"), format = "%Y-%m-%d")
} else if (is_year_month(date_obj)) {
date_obj <- as.Date(paste0(date_obj, "-01"), format = "%Y-%m-%d")
} else {
date_obj <- as.Date(date_obj, format = "%Y-%m-%d")
}
# If conversion to Date failed, throw error.
if (is.na(date_obj)) {
stop(sprintf(
paste("arg '%s' must be a date with one of these formats:\n",
"int: yyyy\n",
"char: 'yyyy'\n",
"char: 'yyyy-mm'\n",
"char: 'yyyy-mm-dd'"),
deparse(substitute(date_obj))
), call. = FALSE)
}

date_obj
}


#' Is input a year string or not.
#'
#' @noRd
is_year <- function(x) {
grepl("^\\d{4}$", x)
}


#' Is input a year-month string or not.
#'
#' @noRd
is_year_month <- function(x) {
grepl("^\\d{4}-\\d{2}$", x)
}
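
The rules that `get_date_range()` enforces in the diff above can be summarised in a standalone sketch. This is a simplified base-R approximation rather than the package code: the names `date_range_sketch()` and `to_date()` are hypothetical, the error messages are shortened, and the `%>%` pipe is replaced with nested calls so the block runs without extra packages.

```r
# Standalone sketch of the date-range rules (base R only, hypothetical names).
is_year <- function(x) grepl("^\\d{4}$", x)

# Pad "yyyy" or "yyyy-mm" to a full date string, then convert to class Date.
to_date <- function(x) {
  if (is_year(x)) {
    x <- paste0(x, "-01-01")
  } else if (grepl("^\\d{4}-\\d{2}$", x)) {
    x <- paste0(x, "-01")
  }
  as.Date(x, format = "%Y-%m-%d")
}

date_range_sketch <- function(start_date, end_date, freq) {
  start_date <- as.character(start_date)
  end_date <- as.character(end_date)

  if (freq == "M") {
    sd_year <- is_year(start_date)
    ed_year <- is_year(end_date)
    if (sd_year && ed_year) {
      # Two bare years: only a single, identical year is allowed.
      if (!identical(start_date, end_date)) {
        stop("only one year of monthly data per query", call. = FALSE)
      }
      return(start_date)
    }
    if (xor(sd_year, ed_year)) {
      stop("'start_date' and 'end_date' must share a format", call. = FALSE)
    }
  }

  by <- if (freq == "A") "year" else "month"
  fmt <- if (freq == "A") "%Y" else "%Y%m"
  rng <- format(seq.Date(to_date(start_date), to_date(end_date), by = by), fmt)

  # Explicit ranges are capped at five elements.
  if (length(rng) > 5) {
    stop("cannot span more than five years/months", call. = FALSE)
  }
  paste(rng, collapse = ",")
}

date_range_sketch(2011, 2015, "A")            # "2011,2012,2013,2014,2015"
date_range_sketch("2012-03", "2012-07", "M")  # "201203,201204,201205,201206,201207"
date_range_sketch(2012, 2012, "M")            # "2012"
```

The single-year monthly case returns the bare year instead of twelve `yyyymm` values, which is why it bypasses the five-element cap.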
4 changes: 2 additions & 2 deletions README.Rmd
@@ -82,8 +82,8 @@ shrimp_codes <- ct_commodity_lookup("shrimp", return_code = TRUE, return_char =
example2 <- ct_search(reporters = "Thailand",
partners = "All",
trade_direction = "exports",
start_date = "2007-01-01",
end_date = "2011-01-01",
start_date = 2007,
end_date = 2011,
commod_codes = shrimp_codes)
# Inspect the output
12 changes: 6 additions & 6 deletions README.md
@@ -91,8 +91,8 @@ str(example1)
#> $ fob_trade_value_usd : logi NA NA NA NA NA NA ...
#> $ flag : int 0 0 0 0 0 0 0 0 0 0 ...
#> - attr(*, "url")= chr "https://comtrade.un.org/api/get?max=50000&type=C&freq=A&px=HS&ps=all&r=156&p=410,842,484&rg=2&cc=TOTAL&fmt=json&head=H"
#> - attr(*, "time_stamp")= POSIXct, format: "2017-11-01 21:39:54"
#> - attr(*, "req_duration")= num 11.8
#> - attr(*, "time_stamp")= POSIXct, format: "2018-03-18 13:20:03"
#> - attr(*, "req_duration")= num 1.12
```

**Example 2**: Return all exports related to shrimp from Thailand to all other countries, for years 2007 thru 2011
@@ -108,8 +108,8 @@ shrimp_codes <- ct_commodity_lookup("shrimp", return_code = TRUE, return_char =
example2 <- ct_search(reporters = "Thailand",
partners = "All",
trade_direction = "exports",
start_date = "2007-01-01",
end_date = "2011-01-01",
start_date = 2007,
end_date = 2011,
commod_codes = shrimp_codes)

# Inspect the output
@@ -151,8 +151,8 @@ str(example2)
#> $ fob_trade_value_usd : logi NA NA NA NA NA NA ...
#> $ flag : int 0 0 0 0 0 0 0 0 0 0 ...
#> - attr(*, "url")= chr "https://comtrade.un.org/api/get?max=50000&type=C&freq=A&px=HS&ps=2007,2008,2009,2010,2011&r=764&p=all&rg=2&cc=0"| __truncated__
#> - attr(*, "time_stamp")= POSIXct, format: "2017-11-01 21:40:06"
#> - attr(*, "req_duration")= num 13.7
#> - attr(*, "time_stamp")= POSIXct, format: "2018-03-18 13:20:07"
#> - attr(*, "req_duration")= num 3.14
```

[![ropensci\_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
57 changes: 33 additions & 24 deletions inst/doc/comtradr-vignette.R
@@ -10,42 +10,51 @@ knitr::opts_chunk$set(comment = "#>", collapse = TRUE, fig.width = 9, fig.height
## ------------------------------------------------------------------------
library(comtradr)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
q <- ct_search(reporters = "USA",
partners = c("Germany", "France", "Japan", "Mexico"),
trade_direction = "imports")

# API calls return a tidy data frame.
str(q)

## ---- eval = FALSE-------------------------------------------------------
## ---- eval = FALSE--------------------------------------------------------------------------------------------------------------------------
# q <- ct_search(reporters = "USA",
# partners = c("Germany", "France", "Japan", "Mexico"),
# trade_direction = "imports",
# start_date = "2010-01-01",
# end_date = "2014-01-01")
# start_date = 2010,
# end_date = 2014)

## ---- eval = FALSE-------------------------------------------------------
## ---- eval = FALSE--------------------------------------------------------------------------------------------------------------------------
# # Get all monthly data for a single year (API max of 12 months per call).
# q <- ct_search(reporters = "USA",
# partners = c("Germany", "France", "Japan", "Mexico"),
# trade_direction = "imports",
# start_date = "2012-03-01",
# end_date = "2012-07-01",
# start_date = 2012,
# end_date = 2012,
# freq = "monthly")
#
# # Get monthly data for a specific span of months (API max of five months per call).
# q <- ct_search(reporters = "USA",
# partners = c("Germany", "France", "Japan", "Mexico"),
# trade_direction = "imports",
# start_date = "2012-03",
# end_date = "2012-07",
# freq = "monthly")

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_country_lookup("korea", "reporter")
ct_country_lookup("bolivia", "partner")

## ---- eval = FALSE-------------------------------------------------------
## ---- eval = FALSE--------------------------------------------------------------------------------------------------------------------------
# q <- ct_search(reporters = "Rep. of Korea",
# partners = "Bolivia (Plurinational State of)",
# trade_direction = "all")

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_commodity_lookup("tomato")

## ---- eval = FALSE-------------------------------------------------------
## ---- eval = FALSE--------------------------------------------------------------------------------------------------------------------------
# tomato_codes <- ct_commodity_lookup("tomato",
# return_code = TRUE,
# return_char = TRUE)
@@ -55,13 +64,13 @@ ct_commodity_lookup("tomato")
# trade_direction = "all",
# commod_codes = tomato_codes)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
q <- ct_search(reporters = "USA",
partners = c("Germany", "France", "Mexico"),
trade_direction = "all",
commod_codes = c("0702", "070200", "2002", "200210", "200290"))

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
# The url of the API call.
attributes(q)$url
# The date-time of the API call.
@@ -70,35 +79,35 @@ attributes(q)$time_stamp
# The total duration of the API call, in seconds.
attributes(q)$req_duration

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_country_lookup(c("Belgium", "vietnam", "brazil"), "reporter")

ct_commodity_lookup(c("tomato", "trout"), return_char = TRUE)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_commodity_lookup(c("tomato", "trout"), return_char = FALSE)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_commodity_lookup(c("tomato", "sldfkjkfdsklsd"), verbose = TRUE)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_update_databases()

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
ct_commodity_db_type()

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
# Column headers returned from function ct_search
colnames(q)

## ------------------------------------------------------------------------
## -------------------------------------------------------------------------------------------------------------------------------------------
# Apply polished column headers
q <- ct_use_pretty_cols(q)

# Print new column headers.
colnames(q)

## ---- warning = FALSE, message = FALSE-----------------------------------
## ---- warning = FALSE, message = FALSE------------------------------------------------------------------------------------------------------
library(ggplot2)

# Comtrade api query.
Expand All @@ -122,7 +131,7 @@ ggplot(df, aes(Year, `Trade Value usd`, color = factor(`Partner Country`),
labs(title = "Total Value (USD) of Chinese Exports, by Year") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))

## ---- warning = FALSE, message = FALSE-----------------------------------
## ---- warning = FALSE, message = FALSE------------------------------------------------------------------------------------------------------
library(ggplot2)
library(dplyr)

@@ -135,8 +144,8 @@ shrimp_codes <- ct_commodity_lookup("shrimp",
df <- ct_search(reporters = "Thailand",
partners = "All",
trade_direction = "exports",
start_date = "2007-01-01",
end_date = "2011-01-01",
start_date = 2007,
end_date = 2011,
commod_codes = shrimp_codes)

# Apply polished col headers.