handle country groupings more effectively #1

Open
5 tasks
alistaire47 opened this issue Jul 12, 2017 · 0 comments

It would be useful to handle containment and country groupings more effectively, e.g. when moving from countries to regions:

passport::as_country_name(c('FR', 'DE', 'SG'), 'continent')
#> Multiple unique values aggregated to single output
#> [1] "EU" "EU" "AS"

passport::as_country_name(c('FR', 'DE', 'SE', 'SG'), 'en_un_subregion')
#> Multiple unique values aggregated to single output
#> [1] "Western Europe"     "Western Europe"     "Northern Europe"   
#> [4] "South-eastern Asia"

Issues with current implementation:

  • Regions are obviously not countries.
  • Changes are not reversible, as indicated by the message.
    • Apart from introducing NAs, reversibility should be a goal for the conversion functions.
  • For group membership (EU, UN, G20, etc.) it would be nice to have vectors for filtering and generating logical columns.
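
To make the reversibility point concrete, here is a minimal base R sketch. The `continent_of` lookup is a hypothetical stand-in, not the package's actual internal data. Because several countries map to the same continent, nothing in the output says which country it came from:

```r
# Hypothetical lookup standing in for the package's internal data
continent_of <- c(FR = 'EU', DE = 'EU', SG = 'AS')

x <- c('FR', 'DE', 'SG')
continents <- unname(continent_of[x])
continents
#> [1] "EU" "EU" "AS"

# Going back is ambiguous: one continent matches several countries
candidates <- split(names(continent_of), continent_of)
candidates[['EU']]
#> [1] "FR" "DE"
```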

Some of these are already in countries (the groupings above, UN SIDS, UN development status, etc.), and CLDR has some data, though there are certainly more country groups that should be added.

Use cases:

  1. Converting to a superset, maybe with a new function as_region
    • Are all groupings regions?
    • "Region" has a lot of definitions, so this name may clash with other packages
    • Is as_ the right prefix? It's conversion, yes, but levels are being aggregated.
  2. Adding a logical column for whether a country is in a group
    • Could have its own function, but if groups are exposed as vectors (via an accessor?), users can just use %in%
  3. Filtering to countries in a group, e.g. the OECD countries
    • Requires that groups be exposed and in the same format as the existing country data
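
A rough sketch of what use cases 2 and 3 could look like if a group were exposed as a plain iso2c vector. The `benelux` vector and the data frame are illustrative assumptions, not existing package data:

```r
# Benelux members as iso2c codes; `benelux` is an illustrative name
benelux <- c('BE', 'NL', 'LU')

df <- data.frame(
    iso2c = c('FR', 'BE', 'SG', 'NL'),
    value = c(10, 20, 30, 40),
    stringsAsFactors = FALSE
)

# Use case 2: logical membership column
df$in_benelux <- df$iso2c %in% benelux

# Use case 3: filter to group members
benelux_rows <- df[df$iso2c %in% benelux, ]
benelux_rows$iso2c
#> [1] "BE" "NL"
```

Both cases only work if the group vector uses the same code format as the column it is compared against, which is why the format question matters.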

Addressing 2 and 3 requires group vectors, but does it make more sense to

  • make an accessor function that returns a vector of the specified group converted to the specified format, or
  • add a bunch of iso2c vectors of groups as package data which can be converted with existing conversion functions, or
  • both?
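
For comparison, the accessor option might look something like the sketch below. Everything here is hypothetical: the `country_group` name, the `groups` list, and the tiny `iso2c_to_iso3c` table, which only stands in for a real conversion step. In the package itself the conversion would presumably be delegated to the existing conversion functions.

```r
# Hypothetical group data, stored once as iso2c
groups <- list(benelux = c('BE', 'NL', 'LU'))

# Tiny stand-in for a real conversion table
iso2c_to_iso3c <- c(BE = 'BEL', NL = 'NLD', LU = 'LUX')

# Hypothetical accessor: return a group's members in the requested format
country_group <- function(group, to = c('iso2c', 'iso3c')) {
    to <- match.arg(to)
    if (!group %in% names(groups)) {
        stop('Unknown group: ', group)
    }
    members <- groups[[group]]
    switch(to,
           iso2c = members,
           iso3c = unname(iso2c_to_iso3c[members]))
}

country_group('benelux')
#> [1] "BE" "NL" "LU"
country_group('benelux', to = 'iso3c')
#> [1] "BEL" "NLD" "LUX"
```

Storing the groups once as iso2c and converting on request would cover the "both" option as well, since the stored vectors could also be exported as data.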

To codify, the TODO:

  • Reproducibly generate vectors of country groupings (necessary even if they're not exported)
  • Aggregate group vectors into internal data structure?
  • Make group accessor and/or document group data exposed
  • Separate regions from existing data; deduplicate new data (alts are irrelevant)
  • Write as_region (or whatever it ends up going by)