Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reticulate conditions #1664

Open
hadley opened this issue Nov 1, 2023 · 8 comments
Open

Support reticulate conditions #1664

hadley opened this issue Nov 1, 2023 · 8 comments

Comments

@hadley
Copy link
Member

hadley commented Nov 1, 2023

e.g. tidyverse/purrr#1104

gergness added a commit to gergness/rlang that referenced this issue Feb 22, 2024
@lionel-
Copy link
Member

lionel- commented May 31, 2024

hmm... I'm very hesitant on this one. It might not be documented that conditions must be lists, but it's widely assumed.

@t-kalinowski does reticulate need reference semantics on the condition? Could you store an environment in the condition instead?

@t-kalinowski
Copy link
Member

t-kalinowski commented May 31, 2024

There are a couple of tricky questions to answer in reticulate if the R condition is derived from a Python Exception, but is not a bona-fide Python object.

Right now, you can catch a raised Python exception as an R condition, and resignal it.

cond <- tryCatch(
  py_run_string("raise ZeroDivisionError()"),
  python.builtin.ZeroDivisionError = function(cond) cond)
)

stop(cond)

If we convert the caught cond to an R list like this, tricky questions are <???>

structure(
  class = c(<???>, "error", "condition"),
  list(message = <...>, call = <...>,
       py_exception = structure(<environment>
                                class = c("python.builtin.ZeroDivisionError", 
                                          "python.builtin.ArithmeticError", 
                                          "python.builtin.Exception", 
                                          "python.builtin.BaseException", 
                                          "python.builtin.object", 
                                          <???>))
       )
  )

From the users perspective,

  • is the cond list a Python object? (safe to use in all places where we accept a python object?).
  • is the py_exception environment equivalent to the cond list? Does the py_exception environment also have S3 class `"error" "condition"?

I'd lean towards answering "yes" for both questions, keeping this an invisible implementation detail.

Another tricky question: what about an exception that was never raised?

exception <- py_eval("ZeroDivisionError()")

Is exception an environment or a list? Does it also inherit from error and condition? (Can it be raised with stop()?) Most likely we make this a list, for consistency with raised exceptions.

This all would require a non-negligible refactoring in reticulate; we would have to do normalization behind the scenes at all Python entry points. Note, we already do some normalization since we have already have two R types that can be Python objects (environments and closures), so adding a third isn't too bad, but still not trivial.

@lionel-
Copy link
Member

lionel- commented May 31, 2024

I can see how that would be tricky to change now. But the alternative of having to consider environment conditions in other parts of our code seems worrying as well, unless we decide conditions should only be manipulated at R level or by calling the S3 API. Which might be fine, but would require a review of the rlang C API and maybe the vctrs internals (which also uses the rlang API and has a bunch of error handling at C level).

Regarding the design questions you brought up:

  • It makes sense to have two classes, one public inheriting from condition and one private which inherits from environment and wrapped by the former. A non-raised exception would still be constructed with both the public and private objects.

  • A condition class doesn't need to have message and call fields, it just needs to implement conditionMessage() and conditionCall().

@t-kalinowski
Copy link
Member

A condition class doesn't need to have message and call fields, it just needs to implement conditionMessage() and conditionCall().

In the initial implementation in reticulate, I only implemented conditionMessage() and conditionCall() for exceptions, but quickly ran into errors when other code (I forget where exactly, maybe rlang?), was calling x$message instead of conditionMessage().

It makes sense to have two classes,...

Sorry, I don't fully understand. Are you suggesting that both the inner environment and the outer list share the same S3 class? I.e.,

class <-  c("python.builtin.ZeroDivisionError", 
            "python.builtin.ArithmeticError", 
            "python.builtin.Exception", 
            "python.builtin.BaseException", 
            "python.builtin.object", 
            "error", "condition")
structure(
  class = class,
  list(message = <...>, call = <...>,
       py_exception = structure(<environment>
                                class = class))
)

@t-kalinowski
Copy link
Member

What do you think about attaching the python exception ref environment as an attribute of the R condition, instead of as a list element.

@t-kalinowski
Copy link
Member

t-kalinowski commented May 31, 2024

Ok, I have a tentative fix in reticulate.

A recent refactor of py_to_r()/r_to_py() and the PyObjectRef class in reticulate made this change much easier than it would have been otherwise.

This should be a mostly invisible change from a users perspective, so long as they were not previously inspecting typeof(<exception>). We'll see if any bugs or changes in edge cases shake out after the change.

Note, we're attaching the R environment w/ the extptr as an attribute of the R condition list object. In initial testing it didn't seem that rlang or purrr has any issues with that.

@lionel-
Copy link
Member

lionel- commented Jun 3, 2024

Good news! Out of curiosity, why not make it a condition field?

@t-kalinowski
Copy link
Member

Convenience mostly. An attribute is how Python callables keep a reference to the environment with the external pointer; making conditions consistent with that keeps the reticulate internals simpler and lets the objects share some common code paths.

Also, the condition "namespace" can be crowded, since it includes everything from the Python Exception instance object, in addition to the R call and message. Keeping the Python refenv as an attribute will minimize the (admittedly low) risk for namespace collisions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants