Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature request: handle errors and warnings together peacfully #8

Closed
MatthieuStigler opened this issue Nov 21, 2018 · 8 comments
Closed

Comments

@MatthieuStigler
Copy link

Hi

Using safely or quietly, one faces a weird trade-off: do I want to allow for errors or warnings? This raises two issues:

  1. One uses safely and has warnings: cannot catch them
  2. One uses quietly and has an error: function fails

Ideally, on could just run one single function let's call its peacefully , that handles all of these ,i.e. return always slots like quietly plus error. I don't think this is available currently? Not sure whether this should be implemented at the purrr or collateral package though? Maybe start within collateral, and safely push to purrr later one? ;-)

@MishaCivey
Copy link

MishaCivey commented Nov 22, 2018

I'm not sure it's a trade-off. Two functions do two very different things, so it makes sense to separate them.

But if you do want to have this function, it's just one compose away :)

peacefully <- purrr::compose(purrr::safely, purrr::quietly)

You can then use it in place of safely/quietly, but keep in mind that output from this function will be two-layered list with first level being result, error, and result (if everything works) will be a list with all the bits and pieces from quietly. You can swap them around, if you want then levels change, but idea stays the same.

It will even have nice property that your results will always be purrr::pluck(x, c("result", "result") away, regardless of the order.

@MatthieuStigler
Copy link
Author

MatthieuStigler commented Nov 22, 2018

Thanks for the reply!

I mean it's a trade-off in the sense that I need to decide ex-ante whether my function is going to throw an error (use safely) or a warning (use quietly). When do we know this in advance?

So actually, it's not a trade-off, since in a typical workflow I will need to use both: safely, then if no errors occurs, replace it by quietly. So that's why I feel a peacefully function that it side-effect-stable would make sense.

And thanks for the code, very helpful! Now I would like to have it as a flat list (i.e. component result, output, warnings, message, error), I faced two issues:

  1. flatten() is not output-stable: it will not return a NULL slot, try list(result= list(result = 1, warnings = NA), error = NULL) %>% flatten
  2. didn't manage to include flatten within the compose call? I tried: purrr::compose(flatten, purrr::safely, purrr::quietly)(log)(1) but this doesn't work...

Any idea how I could solve 1 or 2? Thanks a lot!!!

use case

Use case would be for a case like:

data_frame(value = list("a", -1, 0, 1)) %>% 
  mutate(output = map_peacefully(value, log ))

Where I would show that first row gave an error, second a warning, and last two were ok.

@jimjam-slam
Copy link
Owner

jimjam-slam commented Nov 23, 2018

I'd actually been thinking along similar lines, since a common use case of collateral for me is running regression models where some problems might cause an error and some might cause warnings. If you're using safely and quietly directly then purrr::compose gets you all the way there, but if collateral were to provide helpers for side effect extraction (ref. issue #5), then there would be a good case for providing a mapper based on compose explicitly 😄

@MatthieuStigler
Copy link
Author

Yes, same case for me: some regressions throw errors, some warnings, catching both is a daunting task so far with existing tools :-(

And yes, the compose approach seems to provide a simple way to do this! Any idea however on how to return a flat(tened) list (see my questions 1 and 2) ? I think that's what a user would expect, but the NULL slots makes it challenging...

Thanks a lot!!

@jimjam-slam
Copy link
Owner

jimjam-slam commented Jan 2, 2019

Now that I've merged the PR changing the backend a little, this is on my immediate TODO list. Currently the has_*() functions check the class list to make sure you don't try to use them on the wrong kind of column; I think they'd be better implemented as S3 methods, as the peacefully_mapped class will require a different implementation for the same functions to safely_mapped and quietly_mapped classes.

I'll try to also add helpers for extracting components when I do this, as they'll work essentially the same way as the has_*() functions, and doing it oneself will be less obvious with peacefully_mapped columns for the reasons we've talked about.

Re. flattened output, the API I'm trying to establish is that the collateral map variants give you the original safely() or quietly() output (just enhanced visually), and then you use a separate step to extract the side effect you need. Like, currently:

df %>%
mutate(
  tricky_thing = map_quietly(some_column, tricky_function),
  tricky_results = map_dbl(tricky_thing, 'result'))

There're a few reasons for that decision:

  • I'm not sure how to provide just a single flattened side effect and the visual reporting on all side effects;
  • I think it would be misleading if the visual output didn't match the contents of the list column;
  • I'd have to provide variants for every output type (dbl, chr, etc.), and I think purrr already does that really well.

The way I'm thinking about going about this currently is to have functions that return strings to be given to regular map functions. Right now, if you give map() or map_chr() or whatever an actual function (but you don't call that function), it is passed to the mapper for iteration. Similarly, a formula is converted to a function that is passed.

But if you give it a character vector, it turns that into an extractor function, where each element steps down a level into each object in the column. That's how I currently check for the presence of side effects, and it's also how I currently recommend extracting results, as I show above. Passing .f = 'results' simply retrieves $results from each list element.

With a peacefully_mapped column, it would be trickier: you'd need .f = c('results', 'results') or similar. My thinking is that if I provide a function that returns the correct character vector, that could be used instead:

df %>% mutate(tricky_results = map_dbl(tricky_thing, peaceful_results()))

Note that peaceful_results() is being called here (it has parentheses after it), so it returns a character vector that gets turned into an extractor function, rather than being passed directly to the mapper.

The downside of this solution is that you'd have to give explicitly named variants for each class (it can't just be get_results()), because it doesn't see the list columns at all and hence can't check the class of the list column. Although if I'm doing that, I might as well just implement the extractors myself, and then I can add arguments to allow folks to customise the output (which might be useful for errors, warnings, etc... I'm not too sure).

Anyway, that's what I'm thinking! If you have other thoughts, I'd love to hear them 🙂

@jimjam-slam
Copy link
Owner

I'm sorry, I should have mentioned this earlier: I've implemented peacefully variants and am testing them on the dev branch! I've been a little slack getting the next CRAN release together (I need to write a few more unit tests, but I'm getting there) if you'd like to give it a go!

@MatthieuStigler
Copy link
Author

thanks, looking forward to look at this, although will take some more time!

@jimjam-slam
Copy link
Owner

jimjam-slam commented Jul 8, 2019

No worries, Matthieu! I actually did some work yesterday after discovering a bug in pmap_*() and, after fixing it and implementing tests, decided to merge all this into master. I'd still like to rework some of the docs before the next CRAN release, but I'm pretty happy with it so far.

Part of my implementation of peacefully is to shift the output of compose so that each mapped list is just one level deep (result, error, warnings, messages and output), rather than ending up with nested lists. Because of this, the other helper functions (has_*(), tally_*(), etc.) work fairly predictably regardless of which adverb you're using.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants