-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Altrep Integers class. #274
Conversation
Thanks for your efforts! Two comments:
|
This should handle both the ALTREP and regular vectors. The methods we call like get_region and elt should also work When constructing vectors, I'm chosing to use regular vectors The regular conversion rules should accept anything with INTSXP |
I see, thanks! |
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this pull request is a really nice starting point for discussion.
extendr-api/src/wrapper/integers.rs
Outdated
/// assert_eq!(vec.iter().cloned().sum::<i32>(), 3); | ||
/// } | ||
/// ``` | ||
pub fn iter(&self) -> impl Iterator<Item = &i32> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for being nitpicky: Shouldn't the item type be Rint
? In the general case, we don't know that we can return an i32
for sure (could be NA
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this would return SliceIter<T>
. Or, is the idea that doing NA
-validation beforehand at once, which is similar to this comment above? If so, i32
might make sense.
do all its checks once for a number of elements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented Rint where possible.
I'm in two minds about get_region as we would like to support SIMD arithmetic but can do this too if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SliceIter should probably go. Its only useful feature is that the lifetime
is effectively static.
|
||
impl std::iter::Sum for Rint { | ||
/// Sum an integer iterator over Rint. | ||
/// Yields NA on overflow of NAs present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yields NA on overflow of NAs present.
If this sentence means to say no NA
-validation is needed because of this, I don't think it's true. (e.g. c(-1, NA)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/of/or/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. So, anyway we need NA
-validation here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I thought this would return i32::MIX - 1
, but I found this is handled correctly thanks to <Rint>::checked_sub()
. Sorry for my misunderstanding!
rextendr::rust_source(code = r"(
use extendr_api::scalar::Rint;
#[extendr(use_try_from = true)]
fn do_sum(x: Integers) -> Rint {
x.iter().sum()
}
)", patch.crates_io = list(`extendr-api` = list(git = "https://github.com/extendr/extendr"))
)
#> ℹ build directory: '/tmp/Rtmp7RG5fJ/file77e25279e1bf0'
#> ✓ Writing '/tmp/Rtmp7RG5fJ/file77e25279e1bf0/target/extendr_wrappers.R'.
do_sum(c(NA_integer_, -1L))
#> [1] NA
Created on 2021-09-12 by the reprex package (v2.0.1)
// is problematic when converting a negative value to unsigned | ||
// integer types (e.g. `-1i32 as u8` becomes 255). | ||
if let Some(v) = robj.as_integer() { | ||
if let Ok(v) = Self::try_from(v) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can skip this try_from
in this case. Rint
is 32-bit, so it should never overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
/// Get a single element from the vector. | ||
/// Note that this is very inefficient in a tight loop. | ||
pub fn elt(&self, index: usize) -> Rint { | ||
unsafe { INTEGER_ELT(self.get(), index as R_xlen_t).into() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of non-ALTREP, cpp11 accesses to the underlying memory directly without INTEGER_ELT
. Can we do the same thing? Doesn't this matter much on the performance?
unsafe { INTEGER_IS_SORTED(self.get()).into() } | ||
} | ||
|
||
/// Return TRUE if the vector has NAs, FALSE if not, or NA_BOOL if unknown. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Return TRUE if the vector has NAs, FALSE if not, or NA_BOOL if unknown. | |
/// Return TRUE if the vector has no NAs, FALSE if not, or NA_BOOL if unknown. |
Just so you know that we all very much respect your views.
…On Mon, Sep 6, 2021 at 3:31 PM Hiroaki Yutani ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In extendr-api/src/wrapper/integers.rs
<#274 (comment)>:
> + } else {
+ let mut robj = Robj::alloc_vector(INTSXP, values.len());
+ let dest: &mut [i32] = robj.as_typed_slice_mut().unwrap();
+
+ for (d, v) in dest.iter_mut().zip(values) {
+ *d = v.into();
+ }
+ robj
+ };
+ Self { robj }
+ })
+ }
+
+ /// Get a single element from the vector.
+ /// Note that this is very inefficient in a tight loop.
+ pub fn elt(&self, index: usize) -> Rint {
I don't have a strong opinion. I think I'll be fine with Rint.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#274 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAL36XAZG23WTWIIGBKSUW3UATGCXANCNFSM5DCG5Q2Q>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Thanks, I just feel I don't have any idea worth arguing about there for now :) Let's move forward. |
This is a sample for an Integers class that uses Altrep
methods to handle very large vectors.
There are no iterators at present as we would need to implement
Rint first.
The USP of this is the ability to make vectors from iterators without
allocating memory.
We support elt(), get_region(), is_sorted() and no_na() correctly.
We do not have access to altrep min() and max() since these were removed
from Rinternals.h
Suggestions on a postcard...