Helper for downstream batched targets #50

Closed
3 tasks done
wlandau opened this issue May 11, 2021 · 3 comments

@wlandau
Member

wlandau commented May 11, 2021

Prework

  • I understand and agree to the code of conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a "trouble" or "other" issue so we can discuss your use case and search for existing solutions first.

Proposal

tar_rep() establishes a batching scheme, but it is then up to the user to write custom code to iterate over reps within batches. If possible, it would be nice to have a helper that automatically processes the iterations of an individual batch. This sort of thing is so general that I am not sure it is possible. Sketch:

# _targets.R
library(targets)
library(tarchetypes)
list(
  tar_rep(data1, simulate_data(), batches = 40, reps = 25),
  tar_rep(data2, simulate_data(), batches = 40, reps = 25),
  tar_rep(data3, simulate_data()),
  tar_target(analysis, tar_map_reps(analyze_data(data1, data2, data3)), pattern = map(data1, data2))
)

tar_map_reps() should automatically detect which targets are batched (e.g. data1 and data2 but not data3) and infer how to iterate over the batches given the data types (lists or data frames).
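
For context, here is a minimal sketch of the hand-written iteration such a helper would replace, for the data frame case only. It assumes the batch carries tar_batch and tar_rep bookkeeping columns, as tar_rep() adds for data frame output. The batch is mocked with base R so the example runs without a pipeline, and analyze_data() is a hypothetical stand-in for the user's function.

```r
# Mock of one batch as tar_rep() might return it for a data frame command:
# 3 reps of 2 rows each, plus tar_batch and tar_rep bookkeeping columns.
batch <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  tar_batch = 1L,
  tar_rep = rep(1:3, each = 2)
)

# Hypothetical stand-in for the user's analysis function.
analyze_data <- function(data) {
  data.frame(mean_x = mean(data$x))
}

# Manual iteration: split the batch by rep, analyze each slice,
# and re-attach the rep index so results stay traceable.
slices <- split(batch, batch$tar_rep)
results <- lapply(names(slices), function(rep_id) {
  out <- analyze_data(slices[[rep_id]])
  out$tar_rep <- as.integer(rep_id)
  out
})
analysis <- do.call(rbind, results)
```

A list-valued batch would need a different slicing step (for example, indexing list elements instead of splitting rows), which is part of why inferring the iteration from the data type is hard to generalize.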

@wlandau wlandau self-assigned this May 11, 2021
@wlandau
Member Author

wlandau commented May 20, 2021

I no longer like the above helper tar_map_reps(). I propose a target factory called tar_rep_map() that abstracts away all the dynamic branching and accepts multiple batched targets.

# _targets.R
library(targets)
library(tarchetypes)
list(
  tar_rep(data1, simulate_data(), batches = 40, reps = 25),
  tar_rep(data2, simulate_data(), batches = 40, reps = 25),
  tar_rep(data3, simulate_data()),
  tar_rep_map(analysis, analyze_data(data1, data2, data3), data1, data2) # Use ... to declare batched targets data1 and data2.
)
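
To make the intended semantics concrete, here is a rough base R sketch of what evaluating one batch with several batched inputs could look like. map_reps_sketch() is a hypothetical name and is not part of tarchetypes. It assumes every batched input is a data frame with a tar_rep column and that all batched inputs share the same rep indices, while unbatched inputs are passed through whole.

```r
# Hypothetical helper, not part of tarchetypes: run `fun` once per rep,
# slicing every batched data frame by its tar_rep column and passing
# unbatched inputs through unchanged.
map_reps_sketch <- function(fun, batched, unbatched = list()) {
  rep_ids <- sort(unique(batched[[1]]$tar_rep))
  results <- lapply(rep_ids, function(rep_id) {
    slices <- lapply(batched, function(data) {
      data[data$tar_rep == rep_id, , drop = FALSE]
    })
    # Named lists let do.call() match arguments by name.
    out <- do.call(fun, c(slices, unbatched))
    out$tar_rep <- rep_id
    out
  })
  do.call(rbind, results)
}

# Mock inputs: data1 and data2 are batched (2 reps each), data3 is not.
data1 <- data.frame(x = c(1, 2, 3, 4), tar_batch = 1L, tar_rep = c(1L, 1L, 2L, 2L))
data2 <- data.frame(y = c(10, 20, 30, 40), tar_batch = 1L, tar_rep = c(1L, 1L, 2L, 2L))
data3 <- data.frame(offset = 100)

# Hypothetical stand-in for the user's analysis function.
analyze_data <- function(data1, data2, data3) {
  data.frame(total = sum(data1$x) + sum(data2$y) + data3$offset)
}

analysis <- map_reps_sketch(
  analyze_data,
  batched = list(data1 = data1, data2 = data2),
  unbatched = list(data3 = data3)
)
```

Declaring the batched targets explicitly, as the trailing arguments in the proposal do, avoids having to guess which inputs should be sliced.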

@wlandau
Member Author

wlandau commented May 21, 2021

Another idea: tar_batch(), a target factory to create a batching structure on an existing return value without having to go through tar_rep().

@wlandau
Member Author

wlandau commented May 21, 2021

> Another idea: tar_batch(), a target factory to create a batching structure on an existing return value without having to go through tar_rep().

Thought about it, but tar_batch() and a more generalized tar_rep_map() are extremely brittle. Let's stick with operations directly downstream of tar_rep() targets.
