Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't use 'recover' with dplyr 1.0.0. How to interactively debug? #5308

Closed
MilesMcBain opened this issue Jun 8, 2020 · 1 comment · Fixed by #5322
Closed

Can't use 'recover' with dplyr 1.0.0. How to interactively debug? #5308

MilesMcBain opened this issue Jun 8, 2020 · 1 comment · Fixed by #5322
Assignees
Labels
bug an unexpected problem or unintended behavior wip work in progress
Milestone

Comments

@MilesMcBain
Copy link

MilesMcBain commented Jun 8, 2020

I've noticed that recover = TRUE is now seemingly unusable with dplyr.

For example this code

library(tidyverse)

mtc <- mtcars

set.seed(1111)
rand_na <- function(vec) {
  index_na  <- rdunif(15, a = 1, b = length(vec))
  vec[index_na] <- NA
  vec
}

mtc$cyl <- rand_na(mtcars$cyl)

mtc %>%
  group_by(carb) %>%
  nest() %>%
  mutate(model = map(data, ~lm(mpg ~ cyl + hp + wt,
                               data = .x)))

Produces this stack of frames:

Enter a frame number, or 0 to exit   

 1: mtc %>% group_by(carb) %>% nest() %>% mutate(model = map(data, ~lm(mpg ~ cy
 2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
 3: eval(quote(`_fseq`(`_lhs`)), env, env)
 4: eval(quote(`_fseq`(`_lhs`)), env, env)
 5: `_fseq`(`_lhs`)
 6: freduce(value, `_function_list`)
 7: withVisible(function_list[[k]](value))
 8: function_list[[k]](value)
 9: mutate(., model = map(data, ~lm(mpg ~ cyl + hp + wt, data = .x)))
10: mutate.data.frame(., model = map(data, ~lm(mpg ~ cyl + hp + wt, data = .x))
11: mutate_cols(.data, ...)
12: tryCatch({
    for (i in seq_along(dots)) {
        not_named <- (is.null(d
13: tryCatchList(expr, classes, parentenv, handlers)
14: tryCatchOne(expr, names, parentenv, handlers[[1]])
15: value[[3]](cond)
16: stop_dplyr(i, dots, fn = "mutate", problem = conditionMessage(e), parent = 
17: abort(bullets, class = "dplyr_error", error_name = error_name, error_expres
18: signal_abort(cnd)

And I haven't been able to actually find the call to lm burried in there, since I seem to hit some c code in mutate_cols that breaks the chain.

Just shy of a year ago I made a video of myself debugging this error with recover. So we can look at what the stack of frames was then:

image

And we can see, I had access to lm and lm.fit, where the error is occurring.

IMHO this is a real shame as recover was a really efficient technique for jumping to the source of an error and tinkering with offending code to understand why it was breaking. In the latest dplyr we do get a handy error message:

Error: Problem with `mutate()` input `model`.
x 0 (non-NA) cases
i Input `model` is `map(data, ~lm(mpg ~ cyl + hp + wt, data = .x))`.
i The error occured in group 6: carb = 8.
Run `rlang::last_error()` to see where the error occurred.

Which points me to the group that caused the problem. But I still have to write code to go inspect that group to figure out what is going on, e.g. mtc %>% filter(carb == 8). In this case it might be enough, but in more complex cases, like the second error I debug in the video, and the case I just had today it won't be.

I'm looking at writing a lot of code, overloading other people's functions, and messing around with environments to get to a situation where I can debug an error right at the point it fails.

I had a look at rlang::last_error and it seems like it almost has all the pieces I'd need to re-run the offending function with debugOnce. So that got me thinking that maybe rlang could bring this facility back with something like rlang::debug_last_error. That might be jumping the gun though. Maybe I am just stuck in the old way of doing things.

Is there another pathway to get to interactive debugging with dplyr, like what I was doing with recover?

@lionel-
Copy link
Member

lionel- commented Jun 8, 2020

We should rethrow errors from a calling handler rather than an exiting handler. This way the stack is still around when it reaches the global handlers like options(error = ).

@lionel- lionel- self-assigned this Jun 10, 2020
lionel- added a commit to lionel-/dplyr that referenced this issue Jun 10, 2020
@hadley hadley added bug an unexpected problem or unintended behavior wip work in progress labels Jun 22, 2020
@hadley hadley added this to the 1.0.1 milestone Jun 22, 2020
lionel- added a commit that referenced this issue Jul 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior wip work in progress
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants