Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Informative messages for some side-effect functions #213

Closed
4 tasks done
mattwarkentin opened this issue Nov 12, 2020 · 9 comments
Closed
4 tasks done

Informative messages for some side-effect functions #213

mattwarkentin opened this issue Nov 12, 2020 · 9 comments
Assignees

Comments

@mattwarkentin
Copy link
Contributor

mattwarkentin commented Nov 12, 2020

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.
  • Format your code according to the tidyverse style guide.

Proposal

Hi @wlandau,

Any interest in providing suppressible informative messages when using some of the side-effect functions like:

  • tar_delete()
  • tar_deduplicate()
  • tar_invalidate()
  • tar_prune()

This is maybe most useful when pruning and deduplicating, since it is a little harder to reason about whether anything will be done at all.

For invalidation and deletion, it's a bit more obvious what will happen since the user has to specify names, but since tidyselect syntax is supported, might still be informative to know what targets were ultimately invalidated or deleted.

Just a thought.

@wlandau
Copy link
Collaborator

wlandau commented Nov 12, 2020

I will think about it. I would probably go with these:

  • tar_delete()
  • tar_invalidate()
  • tar_prune()
  • tar_load()
  • tar_workspace()

Alternatively we could just write helpers to introspect the data store.

  • tar_list(): basically list.files("_targets/objects/")
  • tar_workspaces(): list.files("_targets/workspaces/")`

Between that and tar_meta(), I think we have all the information covered.

@wlandau
Copy link
Collaborator

wlandau commented Nov 12, 2020

tar_deduplicate() is a funny one because tar_meta() always reads deduplicated data (without modifying storage) so you would have to count the lines in _targets/meta/meta to check if deduplication actually happened. But tar_make() always deduplicates the file when it starts, so this is something users rarely have to worry about anyway.

@mattwarkentin
Copy link
Contributor Author

mattwarkentin commented Nov 12, 2020

Regarding the possibility of tar_list() (or maybe tar_objects() for consistency), do you think it should return literal objects (i.e. including branches), or just the root target names?

Does tar_prune() prune whole targets only, or is it able to prune branches that are no longer needed too? Seems like the former. This is probably relevant to what tar_list() might return.

@wlandau
Copy link
Collaborator

wlandau commented Nov 12, 2020

Regarding the possibility of tar_list() (or maybe tar_objects() for consistency), do you think it should return literal objects (i.e. including branches), or just the root target names?

I like tar_objects() more, now that you mention it. We could probably support a logical branches argument to choose whether we want branch names. tar_outdated() supports this.

Does tar_prune() prune whole targets only, or is it able to prune branches that are no longer needed too? Seems like the former. This is probably relevant to what tar_list() might return.

tar_prune() does account for branches in its reasoning:

https://github.com/wlandau/targets/blob/b6c338e77f85177dbcebc52261bc986ea6574231/R/tar_prune.R#L45-L61

@mattwarkentin
Copy link
Contributor Author

mattwarkentin commented Nov 12, 2020

Okay, good to know. I think I like the idea of tar_objects(names = NULL, branches = FALSE). I think this function would give enough information to allow the user to introspect for themselves what the implications would be when calling tar_delete(), tar_invalidate(), tar_prune(), or tar_load(), right?

tar_prune() is the only one I'm not sure of. tar_objects() would list all of the objects in cache, but what information lets the user know which objects still belong to the current pipeline? tar_meta() %>% filter(type %in% c("stem", "branches"), format != "file")??

@wlandau
Copy link
Collaborator

wlandau commented Nov 12, 2020

Okay, good to know. I think I like the idea of tar_objects(names = NULL, branches = FALSE). I think this function would give enough information to allow the user to introspect for themselves what the implications would be when calling tar_delete(), tar_invalidate(), tar_prune(), or tar_load(), right?

Actually, on reflection I think branches = FALSE would raise some unnecessarily difficult issues. We need the metadata to reason about branches, and stuff could be present in _targets/objects/ and not the metadata due to tar_invalidate(). Plus, we would need to think about whether representation in tar_objects() means all the branches are present or any branches are present, and that could throw some users off. tar_objects() as list.files("_targets/objects/") (+ safeguards) seems clearer.

tar_prune() is the only one I'm not sure of. tar_objects() would list all of the objects in cache, but what information lets the user know which objects still belong to the current pipeline? tar_meta() %>% filter(type %in% c("stem", "branches"), format != "file")??

tar_manifest() gets all the targets in the pipeline, which can be left-joined with tar_meta() to get branch info.

@wlandau
Copy link
Collaborator

wlandau commented Nov 13, 2020

Just implemented tar_objects() and tar_workspaces().

@wlandau wlandau closed this as completed Nov 13, 2020
@mattwarkentin
Copy link
Contributor Author

@wlandau
Copy link
Collaborator

wlandau commented Nov 13, 2020

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants