Support dplyr::collect() for running galah queries #183

Closed
mjwestgate opened this issue Feb 26, 2023 · 1 comment

@mjwestgate
Collaborator

{galah} 1.5.1 added dplyr verbs as alternatives to various functions, e.g.:

galah_call() |>
  filter(year == 2022) |>
  group_by(basisOfRecord) |>
  count()

The major verb not yet included is collect(), which could substitute for the atlas_ functions if galah_call() gained a type argument or similar, e.g.:

galah_call(type = "occurrences") |>
  filter(year == 2022) |>
  select(group = c("basic", "events")) |>
  collect()

This syntax is most closely analogous to the {gbifdb} package.
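
To make the proposal concrete, here is a minimal base-R sketch of the lazy pattern it implies: each verb only records what the user asked for, and nothing is evaluated until the final collect step. This is not galah's implementation. The names `toy_call()`, `toy_filter()`, `toy_select()` and `toy_collect()` are hypothetical, and the `records` data frame is made-up data standing in for a remote atlas.

```r
# Made-up local data standing in for a remote atlas (assumption).
records <- data.frame(
  year = c(2021, 2022, 2022, 2023),
  basisOfRecord = c("HUMAN_OBSERVATION", "PRESERVED_SPECIMEN",
                    "HUMAN_OBSERVATION", "HUMAN_OBSERVATION")
)

# Start a query: nothing is evaluated yet, we only record intent.
toy_call <- function(type = "occurrences") {
  structure(list(type = type, filter = NULL, select = NULL),
            class = "toy_request")
}

# Store the filter expression unevaluated (a simple form of NSE).
toy_filter <- function(.query, expr) {
  .query$filter <- substitute(expr)
  .query
}

# Store the requested column names.
toy_select <- function(.query, ...) {
  .query$select <- c(...)
  .query
}

# Only here is the stored query applied to the data.
toy_collect <- function(.query) {
  out <- records
  if (!is.null(.query$filter)) {
    keep <- eval(.query$filter, envir = out, enclos = parent.frame())
    out <- out[keep, , drop = FALSE]
  }
  if (!is.null(.query$select)) {
    out <- out[, .query$select, drop = FALSE]
  }
  out
}

result <- toy_call(type = "occurrences") |>
  toy_filter(year == 2022) |>
  toy_select("basisOfRecord") |>
  toy_collect()
```

Because the `type` is fixed when the query object is created, the final step needs no extra arguments to know what kind of data to return, which is the appeal of putting `type` on galah_call().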

@mjwestgate mjwestgate added the enhancement New feature or request label Feb 26, 2023
@mjwestgate mjwestgate self-assigned this Feb 27, 2023
@mjwestgate
Collaborator Author

Another valuable step here would be adding compute() prior to collect(), as this more accurately mirrors what is actually happening; i.e. compute() sets up a job to process on the selected atlas, and collect() retrieves it once complete.

mjwestgate added a commit that referenced this issue May 1, 2023
- `compute` passes a query to the desired atlas, and returns a status URL
- `collect` checks whether that query is complete, and gets data if it is
- only enabled for living atlases so far (not GBIF)
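
The two-stage behaviour described in this commit can be illustrated with a small base-R simulation. It is only a sketch under stated assumptions: `toy_atlas`, `toy_compute()`, `toy_status()` and `toy_collect()` are hypothetical names, the "atlas" is an in-memory environment rather than a web service, and the example.org status URL is a placeholder.

```r
# Hypothetical in-memory "atlas" that tracks submitted jobs (assumption).
toy_atlas <- new.env()
toy_atlas$jobs <- list()

# compute: submit the query and return a handle containing a status URL.
toy_compute <- function(query) {
  id <- as.character(length(toy_atlas$jobs) + 1)
  toy_atlas$jobs[[id]] <- list(query = query, polls_left = 2)
  structure(
    list(status_url = paste0("https://example.org/status/", id), id = id),
    class = "toy_job"
  )
}

# Simulated server progress: each status check brings the job closer to done.
toy_status <- function(job) {
  entry <- toy_atlas$jobs[[job$id]]
  entry$polls_left <- max(entry$polls_left - 1, 0)
  toy_atlas$jobs[[job$id]] <- entry
  if (entry$polls_left == 0) "complete" else "in_progress"
}

# collect: return data if the job is complete, otherwise hand back the job.
toy_collect <- function(job) {
  if (toy_status(job) != "complete") {
    message("Job not finished yet; call toy_collect() again later")
    return(job)
  }
  data.frame(query = toy_atlas$jobs[[job$id]]$query, status = "complete")
}

job <- toy_compute("year == 2022")
first <- toy_collect(job)   # job still running: the handle is returned
second <- toy_collect(job)  # job finished: a data frame is returned
```

Returning the job handle when the query is still running lets the caller retry later without resubmitting the query.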
mjwestgate added a commit that referenced this issue May 18, 2023
New method integrates code for LAs and GBIF
daxkellie added a commit that referenced this issue Jun 1, 2023
* Updated `url_GET` (required by current `compute_counts()` implementation) to use `url_build_internal()` rather than `build_fq_url()`
* Preserve correct field name as column name after `count()` query
mjwestgate added a commit that referenced this issue Jun 9, 2023
- new function `parse_quosures_basic` for capturing NSE in `group_by`, `select` etc (#154, #169)
- ensure `dplyr` and `galah` versions of same functions are in same files, and share help files
- remove remaining cache functions
- add `collapse`, `compute` and `collect` help file (#183)
mjwestgate added a commit that referenced this issue Jun 19, 2023
…e`) #183

There is no clear `compute` stage for species lists, so this is skipped
mjwestgate added a commit that referenced this issue Jun 22, 2023
Removed 'what' argument from `collect()` etc, replaced by `type` argument which is now passed to `galah_call()` instead.
mjwestgate added a commit that referenced this issue Jun 23, 2023
… files (#183)

New function `download_media()` replaces previous behaviour
mjwestgate added a commit that referenced this issue Jul 19, 2023
Includes new functions `request_data()` (aka `galah_call()`) and `request_metadata()` to underpin piped queries. `query_API()` now requires an object with api-relevant names (`url`, `header` etc)
mjwestgate added a commit that referenced this issue Jul 19, 2023
- `show_all_fields()` works again
- build new function `check_api_key()`
- `collapse_datasets()` correctly named
- fix bug in object definition within `request_data()`
- allow extraction of named slots in `query_API()`
- get correct fields in `collect_layers()`
mjwestgate added a commit that referenced this issue Jul 27, 2023
- `request_metadata(type = "taxa")` is now supported with `collapse()` and `collect()`
- `search_taxa()`, `galah_identify()`, `identify.data_request()` and `identify.metadata_request()` are all supported