-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for python objects #37
Conversation
R/iterator.R
Outdated
#' @exportS3Method | ||
as_iterator.python.builtin.iterator <- function(x) { | ||
force(x) | ||
function() reticulate::iter_next(x, completed = quote(.__exhausted__.)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be created using the exhaustion constructor. At the minimum it should be sym(".__exhausted__.")
. The .__exhausted__.
symbol should never appear in code, in case the objects of a namespace are iterated over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a good reason I hadn't considered. I generally try to avoid micro-optimizations like this, but introducing to as.symbol()
overhead into tight for
loops seemed like an easy win:
plot(print(bench::mark(
quote(.__exhausted__.),
as.symbol(".__exhausted__."))))
#> # A tibble: 2 × 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr
#> <bch:expr> <bch:t> <bch:> <dbl> <bch:byt> <dbl> <int>
#> 1 quote(.__exhausted__.) 0 1ns 1.29e9 0B 0 10000
#> 2 as.symbol(".__exhausted__.") 82ns 164ns 5.68e6 0B 0 10000
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>, result <list>,
#> # memory <list>, time <list>, gc <list>
#> Loading required namespace: tidyr
#> Warning: Transformation introduced infinite values in continuous y-axis
#> Warning: Removed 2678 rows containing missing values (geom_point).
Created on 2022-06-29 by the reprex package (v2.0.1)
Would you like me to change back to as.symbol(".__exhausted__.")
? Or maybe create the symbol once in the closure then use that?
as_iterator.python.builtin.iterator <- function(x) {
force(x)
exhausted <- as.symbol(".__exhausted__.")
function() reticulate::iter_next(x, completed = exhausted)
}
Unfortunately, reticulate::iter_next()
forces the completed argument on each call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe create the symbol once in the closure then use that?
This should also be avoided for the same reason. I'd probably just use coro::exhausted()
. In R, generators and iterators are not expected to be used for tight loops.
R/iterator.R
Outdated
})) | ||
|
||
|
||
#' @exportS3Method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not #' @export
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong reason, happy to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup I'd prefer @export
for coding style consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
@@ -136,12 +136,16 @@ as_iterator.default <- function(x) { | |||
} | |||
} | |||
|
|||
on_load(on_package_load("reticulate", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it reticulate doesn't plan to add these methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think {coro} is a better home for these methods. reticulate already provides an as_iterator()
function which is not a generic and does not adhere to the coro iterator protocol. Exporting a method for a same-named generic in reticulate invites confusion, and also the complexity of an awkward delayed registration dance that can be completely avoided if these methods live here.
I think we don't even need a separate method for |
Now this is just missing a NEWS bullet :) |
Done! |
Thanks! Do you need that on CRAN? If yes, I'll probably release right away, otherwise this might sit on github for months. Do you have any other planned changes for coro? |
This came out of working on updates to the Tensorflow website, which is still ongoing work. I don’t have any other coro enhancements planned presently, but other things might still shake out in the next two weeks while we’re updating the TF examples. @dfalbel what do you think? |
@lionel- If it's ok with you, let's wait ~ 2 weeks. I'll ping you when we need this on CRAN. |
That sounds good. |
Hi @lionel-, the dev version is working great for us and I don't think we'll have any additional PRs to coro in the near term. When you get a chance, can you please publish to CRAN? |
Great, I'll do that! |
This PR adds an
as_iterator()
method for a base python object. This enables usage like this with TensorFlow Datasets:Without this, users need to instead explicitly call
reticulate::as_iterator(ds)
, before passing tocoro::loop()
, which can be confusing because of the name clash betweencoro::as_iterator()
, and also not necessary for most other iterables that are automatically converted to an iterator.