as_epi_df construction conveniences #456

Open
dsweber2 opened this issue Jun 5, 2024 · 2 comments

dsweber2 commented Jun 5, 2024

A couple of things I realized would be convenient when trying to use as_epi_df:

  • allow for common time_value aliases. E.g. if a column named date (or with a name containing "date") is present and is the only such column, just assume that the user means for that column to be used as the time_value. Maybe emit an informational message that this is happening.
  • if no geo_value is present, assume that all values have the same geo_value at a national scale
  • give an alias argument that allows the user to input e.g. geo_value = some_col, time_value = date. This is basically just allowing them to skip writing a rename immediately before as_epi_df
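
A rough sketch of how the first two conveniences might behave, in base R. Both helpers (`guess_time_value_col`, `prepare_for_epi_df`) and the `"national"` placeholder are hypothetical names made up for illustration; none of this is existing epiprocess behavior.

```r
# Hypothetical helper, not part of epiprocess: guess which column to treat as
# time_value, based only on column names.
guess_time_value_col <- function(df, pattern = "date") {
  if ("time_value" %in% names(df)) {
    return("time_value")
  }
  candidates <- grep(pattern, names(df), ignore.case = TRUE, value = TRUE)
  if (length(candidates) != 1L) {
    stop(
      "Could not guess a unique time_value column; found ",
      length(candidates), " candidate(s)."
    )
  }
  # Tell the user that a guess was made.
  message("Using column `", candidates, "` as time_value.")
  candidates
}

# Hypothetical pre-processing step that would run before as_epi_df().
prepare_for_epi_df <- function(df, default_geo = "national") {
  time_col <- guess_time_value_col(df)
  names(df)[names(df) == time_col] <- "time_value"
  if (!"geo_value" %in% names(df)) {
    # The placeholder value is an assumption; the issue does not settle on one.
    df$geo_value <- default_geo
  }
  df
}

cases <- data.frame(
  report_date = as.Date("2024-06-01") + 0:2,
  cases = c(10, 12, 9)
)
prepared <- prepare_for_epi_df(cases)
names(prepared)
```

If two columns matched (say `date` and `end_date`), this sketch would stop with an error rather than pick one, which is one reading of "present and unique" above.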

dsweber2 commented Jun 5, 2024

@lcbrooks @nmdefries would like your opinion at some point on whether we ought to do this, and at what rough priority


brookslogan commented Jun 5, 2024

My thoughts:

  • Guessing a time_value column: sounds reasonable; not sure if this should be based on names or classes or both though. cli_informing sounds good.
  • Inventing single geo_value: less sure about this one. We won't know which nation to guess, so I'd imagine it'd need to be some other special value like "unspecified" [and same sort of thing with geo_type, maybe "custom"?]. Plus we should check for unique time values to make sure we're not just missing some key vars. Really, I think we should be checking epikey-times are unique identifiers all the time, but especially in this case.
  • Avoiding rename: sounds nice. For the geo_value = some_col, time_value = date interface, we should use the tidyselect package. We should match whatever our decision is here in #186 ("Promote" other_keys to be printed, a constructor parameter, and more clearly documented) / #446 ([enh] promote other_keys), though that needs to accommodate 0 / >1 selections as well; that is probably possible just by using a common tidyselect command and removing a length check on the result.
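
To make the last two points concrete, here is a minimal sketch of a tidyselect-based interface combined with a uniqueness check. The wrapper `select_epi_cols` is a hypothetical name, not an epiprocess function, and the sketch assumes the rlang and tidyselect packages are installed. It uses `tidyselect::eval_select()` to resolve each selection to a column position and a length check to insist on exactly one column per argument.

```r
# Hypothetical wrapper, not the epiprocess API: resolve column selections with
# tidyselect, rename them, and check that (geo_value, time_value) pairs are
# unique identifiers of rows.
select_epi_cols <- function(df, geo_value, time_value) {
  geo_pos <- tidyselect::eval_select(rlang::enquo(geo_value), df)
  time_pos <- tidyselect::eval_select(rlang::enquo(time_value), df)
  # This length check is what would be dropped for other_keys, which may
  # select 0 or >1 columns.
  if (length(geo_pos) != 1L || length(time_pos) != 1L) {
    rlang::abort("`geo_value` and `time_value` must each select exactly one column.")
  }
  names(df)[geo_pos] <- "geo_value"
  names(df)[time_pos] <- "time_value"
  if (anyDuplicated(df[c("geo_value", "time_value")]) > 0L) {
    rlang::abort("Rows are not uniquely identified by geo_value and time_value.")
  }
  df
}

df <- data.frame(
  state = c("ca", "ca", "ny"),
  date = as.Date(c("2024-06-01", "2024-06-02", "2024-06-01")),
  cases = c(10, 12, 9)
)
renamed <- select_epi_cols(df, geo_value = state, time_value = date)
names(renamed)
```

A duplicated (state, date) row would make this sketch abort, which is the kind of signal that a key column may be missing.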

Priority: seems like one of many individually low- to medium-priority papercuts that keep being neglected. If it's been annoying you recently, seems like a good time to handle it.
