Should there be a way to only apply a subset of the attributes in a profile? #146

Open
shandiya opened this issue Jun 21, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@shandiya
Contributor

Currently, there's no way to select the attributes of a profile within a call to galah_filter(). For instance, if I apply the ALA general profile, there's no way to also get absence records since the profile excludes these. It would be nice to be able to modify the attributes included in these profiles.

@shandiya shandiya added the enhancement New feature or request label Jun 21, 2022
@mjwestgate
Collaborator

This is a good idea. Looking at the biocache, it appears that this is possible. For example, a biocache search for Litoria (available here) simply adds &qualityProfile=ALA to the URL to apply the default profile (galah does this already). But you can tailor this using &disableQualityFilter. To support absences while leaving the rest of the filters 'on', the relevant query would be &disableQualityFilter=occurrence-status.
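
As a rough illustration of the query fragment described above, the base R sketch below assembles the two parameters. `build_profile_query()` is a hypothetical helper, not part of galah, and repeating `disableQualityFilter` once per field is an assumption about how the biocache would treat more than one exclusion.

```r
# Hypothetical helper (not a galah function): builds the quality-profile
# part of a biocache query string.
build_profile_query <- function(profile, disable = character()) {
  encode <- function(x) {
    vapply(x, utils::URLencode, character(1), reserved = TRUE, USE.NAMES = FALSE)
  }
  params <- paste0("qualityProfile=", encode(profile))
  if (length(disable) > 0) {
    # Assumption: one disableQualityFilter parameter per excluded filter.
    params <- c(params, paste0("disableQualityFilter=", encode(disable)))
  }
  paste(params, collapse = "&")
}

# Apply the ALA profile but switch off the occurrence-status filter
build_profile_query("ALA", disable = "occurrence-status")
```

The resulting string would be appended to the search URL after the existing query parameters.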

One problem with this is that search_profile_attributes() doesn't currently return the field names needed to make this query work (occurrence-status in this case). Fortunately, this is easily fixed, because the API already returns this information. We could support this kind of functionality within the new galah_data_profile() function, currently on the development branch of galah (#130).

The final question is how this should appear to the user. Setting up NSE is pretty easy. Without checking I can't be sure, but something like this might work:

galah_call() |>
  galah_filter(year == 2022) |>
  galah_data_profile(ALA, -occurrence-status) |>
  atlas_counts()

Is that a good solution? Or is it a bit messy?
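
One caveat with the proposed syntax: R parses an unquoted `-occurrence-status` as a subtraction between two names, not as a single negated field, so the hyphenated name would probably need backticks (or to be passed as a string). The check below uses only base R and does not depend on galah.

```r
# Unquoted: parsed as (-occurrence) - status, i.e. two separate symbols
unquoted <- quote(-occurrence-status)
all.vars(unquoted)
# "occurrence" "status"

# Backticked: a single non-syntactic name with a unary minus in front
quoted <- quote(-`occurrence-status`)
all.vars(quoted)
# "occurrence-status"
```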

@daxkellie
Contributor

I think that this solution is pretty tidy as long as users find it intuitive to think of data profile filters like columns (e.g. dplyr::select(-col1))
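
To make the select()-style idea concrete, here is a minimal base R sketch of how negated field names could be captured without evaluating them. `capture_profile()` is a hypothetical stand-in, not galah's implementation, and it assumes hyphenated field names are wrapped in backticks.

```r
# Hypothetical sketch of NSE capture; not part of galah.
capture_profile <- function(profile, ...) {
  # Grab the unevaluated expressions supplied in `...`
  exprs <- as.list(substitute(list(...)))[-1]
  excluded <- vapply(exprs, function(e) {
    # Accept only a unary minus applied to a single name, e.g. -`field-name`
    is_negated <- is.call(e) && length(e) == 2 &&
      identical(e[[1]], as.name("-")) && is.name(e[[2]])
    if (!is_negated) {
      stop("Expected arguments of the form -`field-name`", call. = FALSE)
    }
    as.character(e[[2]])
  }, character(1))
  list(
    profile = as.character(substitute(profile)),
    disable = unname(excluded)
  )
}

res <- capture_profile(ALA, -`occurrence-status`)
res$profile  # "ALA"
res$disable  # "occurrence-status"
```

With this approach an unquoted `-occurrence-status` raises an error rather than being silently misread, which may be the friendlier behaviour for users.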
