add process_barcharts() function
mstrimas committed Apr 8, 2024
1 parent ec0c70d commit 299fe6f
Showing 73 changed files with 601 additions and 124 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: auk
Title: eBird Data Extraction and Processing in R
-Version: 0.7.0
+Version: 0.7.1
Authors@R:
c(person(given = "Matthew",
family = "Strimas-Mackey",
1 change: 1 addition & 0 deletions NAMESPACE
@@ -92,6 +92,7 @@ export(ebird_species)
export(filter_repeat_visits)
export(format_unmarked_occu)
export(get_ebird_taxonomy)
+export(process_barcharts)
export(read_ebd)
export(read_sampling)
importFrom(magrittr,"%>%")
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,3 +1,7 @@
+# auk 0.7.1
+
+- added a helper function for processing bar chart data from eBird `process_barcharts()`
+
# auk 0.7.0

- update for 2023 eBird taxonomy
88 changes: 88 additions & 0 deletions R/process_barcharts.R
@@ -0,0 +1,88 @@
#' Process eBird bar chart data
#'
#' eBird bar charts show the frequency of detection for each week for all
#' species within a region. These can be accessed by visiting any region or
#' hotspot page and clicking the "Bar Charts" link in the left column. As an
#' example, these [bar charts for
#' Guatemala](https://ebird.org/barchart?r=GT&yr=all&m=) list all the species
#' (as well as non-species taxa) that have been observed in eBird in Guatemala
#' and, for each species, the width of the green bar reflects the frequency of
#' detections on eBird checklists within the region (referred to as detection
#' frequency). Detection frequency is provided for each of 4 "weeks" of each
#' month (although these are not strictly 7-day weeks, since most months have
#' more than 28 days). The data underlying the bar charts can be downloaded via a
#' link at the bottom right of the page; however, the text file that's
#' downloaded is in a challenging format to work with. This function is designed
#' to read these text files and return a nicely formatted data frame for use in
#' R.
#'
#' @param filename character; path to the bar chart data text file downloaded
#' from the eBird website.
#'
#' @return This function returns a data frame in long format where each row
#' provides data for one species in one week. `detection_frequency` gives the
#' proportion of checklists in the region that reported the species in the
#' given week and `n_detections` gives the estimated number of detections,
#' calculated as detection frequency multiplied by the number of checklists
#' and rounded to the nearest integer. The total number of checklists in each
#' week used to estimate detection frequency is provided as a data frame
#' stored in the `sample_sizes` attribute.
#'
#' @export
#' @family helpers
#' @examples
#' # example bar chart data for Svalbard
#' f <- system.file("extdata/barchart-sample.txt", package = "auk")
#' # import and process barchart data
#' barchart <- process_barcharts(f)
#' head(barchart)
#'
#' # the sample sizes for each week can be accessed with
#' attr(barchart, "sample_sizes")
process_barcharts <- function(filename) {
  stopifnot(is.character(filename), file.exists(filename))

  l <- readLines(filename)
  l <- l[l != ""]

  # column headers
  month_week <- tidyr::expand_grid(month = tolower(month.abb), week = seq_len(4))
  week_vars <- paste(month_week$month, month_week$week, sep = "_")

  # number of checklists per week
  ss_row <- which(stringr::str_detect(l, "Sample Size:\t"))
  if (length(ss_row) != 1) {
    stop("The barchart data is in an unexpected format and cannot be read. ",
         "This function can only process unmodified data downloaded directly ",
         "from the eBird website.")
  }
  ss <- stringr::str_remove(l[ss_row], "Sample Size:\t")
  ss <- as.integer(stringr::str_split_1(ss, "\t")[seq_len(48)])
  ss <- dplyr::bind_cols(month_week, n_checklists = ss)

  # detection frequency
  detfrq <- l[seq(ss_row + 1, length(l))]
  cn <- c("common_name", week_vars, "blank")
  ct <- c("c", rep("d", times = length(cn) - 2), "c")
  ct <- paste(ct, collapse = "")
  detfrq <- readr::read_tsv(I(detfrq), col_names = cn, col_types = ct)
  detfrq$blank <- NULL
  # transform to long
  detfrq <- tidyr::pivot_longer(detfrq, cols = -"common_name",
                                values_to = "detection_frequency")
  detfrq <- tidyr::separate(detfrq, col = "name", into = c("month", "week"))
  detfrq$week <- as.integer(detfrq$week)
  detfrq$name <- NULL

  # add in species codes
  tax <- auk::ebird_taxonomy
  tax <- tax[, c("species_code", "common_name", "scientific_name")]
  detfrq <- dplyr::inner_join(tax, detfrq, by = "common_name")

  # add in num detections
  detfrq <- dplyr::inner_join(detfrq, ss, by = c("month", "week"))
  detfrq$n_detections <- round(detfrq$n_checklists * detfrq$detection_frequency)
  detfrq$n_checklists <- NULL
  detfrq <- dplyr::as_tibble(detfrq)

  attr(detfrq, "sample_sizes") <- ss
  return(detfrq)
}
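
The parsing steps in `process_barcharts()` can be illustrated with a small base R sketch. This is not the package implementation: it avoids the tidyverse dependencies, and the two input lines (`ss_line`, `sp_line`), the helper `split_tabs()`, and all numeric values are made up for illustration. The layout is assumed from how the function reads the file: a `Sample Size:` row followed by one tab-separated row per taxon, each with 48 weekly values and a trailing tab.

```r
# Sketch only: invented data laid out the way process_barcharts() parses it
# (tab-separated, 48 weekly values, trailing tab). Not real eBird data.
month_week <- data.frame(
  month = rep(tolower(month.abb), each = 4),
  week = rep(seq_len(4), times = 12)
)
# column labels jan_1, jan_2, ..., dec_4 (week varies fastest)
week_vars <- paste(month_week$month, month_week$week, sep = "_")

n_checklists <- rep(c(10L, 20L, 40L, 50L), times = 12)
freq <- rep(c(0, 0.25, 0.5, 0.1), times = 12)
ss_line <- paste0("Sample Size:\t", paste(n_checklists, collapse = "\t"), "\t")
sp_line <- paste0("Snow Bunting\t", paste(freq, collapse = "\t"), "\t")

# hypothetical helper: split a tab-delimited line into its fields
split_tabs <- function(x) strsplit(x, "\t", fixed = TRUE)[[1]]

# sample sizes: drop the row label, keep the 48 weekly counts
ss <- sub("Sample Size:\t", "", ss_line, fixed = TRUE)
ss <- as.integer(split_tabs(ss)[seq_len(48)])

# detection frequencies: first field is the name, the next 48 are weekly values
fields <- split_tabs(sp_line)
long <- data.frame(
  common_name = fields[1],
  month_week,
  detection_frequency = as.numeric(fields[2:49])
)

# approximate number of detections, as in the function above
long$n_detections <- round(ss * long$detection_frequency)
head(long, 4)
```

Because `n_detections` is reconstructed from the reported frequency and the weekly checklist count, it is presumably only as precise as the frequencies stored in the downloaded file.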
2 changes: 1 addition & 1 deletion README.md
@@ -298,7 +298,7 @@ eBird Basic Dataset files can be read with `read_ebd()`:
#> $ locality_type : chr [1:398] "H" "H" "H" "H" ...
#> $ latitude : num [1:398] 26.9 26.6 58.8 58.8 25.5 ...
#> $ longitude : num [1:398] -99.3 -99.1 -122.9 -122.9 -100.3 ...
-#> $ observation_date : Date[1:398], format: "2011-11-14" "2011-11-14" "2011-06-14" "2011-06-15" ...
+#> $ observation_date : Date[1:398], format: "2011-11-14" "2011-11-14" "2011-06-14" ...
#> $ time_observations_started: chr [1:398] "06:45:00" "08:15:00" "10:30:00" "07:00:00" ...
#> $ observer_id : chr [1:398] "obsr554038" "obsr146271" "obsr12384" "obsr12384" ...
#> $ sampling_event_identifier: chr [1:398] "S21633922" "S9118288" "S22036612" "S22036670" ...
9 changes: 2 additions & 7 deletions cran-comments.md
@@ -1,11 +1,6 @@
-# auk 0.7.0
+# auk 0.7.1

-- update for 2023 eBird taxonomy
-- no need to restart after setting AWK and EBD paths
-- retain breeding codes in `auk_zerofill()`
-- changes to conform with deprecation of `.data$` in tidyselect expressions
-- changes to package-level documentation in roxygen2
-- removed non-ASCII characters from datasets
+- added a helper function for processing bar chart data from eBird `process_barcharts()`

# Test environments

3 changes: 3 additions & 0 deletions data-raw/barchart.R
@@ -0,0 +1,3 @@
f_src <- "~/data/ebird/auk/ebird_SJ__2014_2024_1_12_barchart.txt"
f_dst <- "inst/extdata/barchart-sample.txt"
file.copy(f_src, f_dst)
2 changes: 1 addition & 1 deletion docs/404.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/CONDUCT.html

2 changes: 1 addition & 1 deletion docs/CONTRIBUTING.html

2 changes: 1 addition & 1 deletion docs/LICENSE.html

14 changes: 7 additions & 7 deletions docs/articles/auk.html

2 changes: 1 addition & 1 deletion docs/articles/development.html

2 changes: 1 addition & 1 deletion docs/articles/index.html

2 changes: 1 addition & 1 deletion docs/authors.html

4 changes: 2 additions & 2 deletions docs/index.html

7 changes: 6 additions & 1 deletion docs/news/index.html

2 changes: 1 addition & 1 deletion docs/pkgdown.yml
@@ -4,5 +4,5 @@ pkgdown_sha: ~
articles:
auk: auk.html
development: development.html
-last_built: 2024-03-26T15:34Z
+last_built: 2024-04-08T16:16Z

2 changes: 1 addition & 1 deletion docs/reference/auk-package.html