Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implements Rfloat and Doubles using macros #284

Merged
merged 35 commits into from
Sep 20, 2021

Conversation

Ilia-Kosenkov
Copy link
Member

@Ilia-Kosenkov Ilia-Kosenkov commented Sep 11, 2021

Fixes #280.

This turned out to be a much larger contribution than I expected.
After implementing Doubles, I decided to introduce paste as a dependency (see #283) and then further improved code generation.

Major topics:

  1. Rfloat as a thin NA-aware wrapper around f64. I generalized macros so that both Rint and Rfloat are generated using the same macros, and the only thin written by hand is TryFrom that also allows coercion (i.e., floats that can be represented as ints can be converted to Rint).
  2. Doubles. Doubles is an almost exact copy of Integers, so, with the help of paste, both Doubles and Integers are generated using a single macros. I expect this approach will work as well for other primitive types. If not, at least in these cases we do not write the same code twice.
  3. Moved tests from doc strings to test module. When members are generated using macros, it is even harder to generate meaningful tests. In this case, I took out test cases, and put them in a separate module (under crate::wrappers::{integers, doubles}::tests). These tests are guarded with #[cfg(test)] and are now executed in all build scenarios (see S4 doctests potentially fail on Windows #282).

Takeways:

  • Prefer 'horizontal' development to 'vertical' (i.e., deal with one abstraction layer fully instead of implementing support of one type), which encourages macros usage instead of heavy copy-pasting
  • Prefer meaningful tests in a separate module to reduce comment pollution around impl members
  • Carefully address edge cases (marked with TODO)

TODO:

  • Meaningful tests in extendr-api/tests
  • Meaningful tests in tests/extendrtests
  • Find a way to keep docs when generating members using macros

@Ilia-Kosenkov Ilia-Kosenkov marked this pull request as draft September 11, 2021 15:00
@clauswilke
Copy link
Member

This is great!

One comment about doc strings: I understand the reasoning that they shouldn't be used to test the code, and I agree. However, I think 1-2 examples per function are super helpful and make the code more accessible for a new user. So I wouldn't just delete all these examples. Rather assess whether the example is actually helpful or not.

@Ilia-Kosenkov
Copy link
Member Author

@clauswilke, I agree about examples. The challenge is -- when members are generated using macros, we have two options -- generate examples the same way we generate all these methods right now, which can be tricky, or add additional parameters to the macros, which are translated into docs.
This is challenging, and I will look into it (and encourage others with the knowledge of macros to do the same). I certainly do prefer macros with weaker documentation but no code repetition compared to a lot of code repetition but slightly better docs.

@clauswilke
Copy link
Member

clauswilke commented Sep 11, 2021

The challenge is -- when members are generated using macros, we have two options -- generate examples the same way we generate all these methods right now, which can be tricky, or add additional parameters to the macros, which are translated into docs.

I see. Maybe it's better to write documentation by hand and to just put all the relevant info and examples into a single documentation block for the entire type. In other words, put the documentation and examples for Doubles here:

#[derive(Debug, PartialEq, Clone)]
pub struct Doubles {
pub(crate) robj: Robj,

@Ilia-Kosenkov
Copy link
Member Author

@clauswilke,
Yeah, it is probably better to write the whole documentation block in one place.
However, we can provide some limited support for doctest (and examples) by generating docstrings within the macros.
The last commit 27c0858 demonstrates such an example. The generated test case also participates in the testing.

@andy-thomason
Copy link
Contributor

andy-thomason commented Sep 14, 2021

Looking good!

I am usually cautious about using extra dependencies, but the problems with concat_ident! require this for now.
There is a PR to fix this in the language but don't forget to exhale.

It is worth thinking about Logicals/Rbool (aka Bool) but not in this PR.

This shares much with Integers.

We may need to implement more operators in the future (such as &, |, ^) for integer and bool.

@Ilia-Kosenkov
Copy link
Member Author

I have identified several issues with Rfloat that I need to address.
The main problem is that NA checks and PartialEq do not work in the current (macros) implementation, because NA is represented with f64::NaN, and f64::NaN != f64::NaN (there are multiple NaNs, and NA uses one value AFAIK).

@Ilia-Kosenkov
Copy link
Member Author

Ilia-Kosenkov commented Sep 14, 2021

Rint was implementing PartialEq and Eq by default. This implementation is inconsistent with how R treats NAs. For R, NA != NA (it actually returns NA_logical_), so Eq should not be implemented. ParitalEq cannot be derived implicitly, because then it will generate equality such that Rint::na() == Rint::na() (ints can be compared directly), but Rfloat::na() != Rfloat::na() (NA_real_ is a NaN, and f64::NaN != f64::NaN).
Solution - manually implement PartialEq for (Rtype, Rtype) and return false if either of objects is NA. This can be done in a generic way, using macros.

Downside - can no longer compare NAs directly and in assertions, IsNA::is_na() should be used instead (as it is done in R and in f64 when working with NaNs).

I also generated some doctests using macros, to test simple behavior (like Rtype::default() == Rtype::default()).

Next stop: bounds check when using elt, doctesting this case, then some integration tests using R package. I will probably finalize it in a day or two and it will be ready for review.

@clauswilke
Copy link
Member

Downside - can no longer compare NAs directly and in assertions,

I don't see this as a downside. As you say, it's exactly how R handles the same case.

@andy-thomason
Copy link
Contributor

I agree that Rint::na() should not (partial) compare with another Rint::na()

This behavious makes it easy to compare floats.

Comment on lines +23 to +32
fn double_scalar(x: f64) -> f64 {
x
}

// Convert an int scalar to itself
// x a number
#[extendr]
fn int_scalar(x: i32) -> i32 { x }
fn int_scalar(x: i32) -> i32 {
x
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These formatting changes were unintentional, I hooked cargo fmt to OnSave of my CLion, so once I edited this file, it automatically applied formatting and I did not notice it.

@Ilia-Kosenkov Ilia-Kosenkov marked this pull request as ready for review September 15, 2021 19:29
Copy link
Contributor

@yutannihilation yutannihilation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

extendr-api/src/wrapper/macros.rs Outdated Show resolved Hide resolved
extendr-api/src/wrapper/macros.rs Show resolved Hide resolved
Co-authored-by: Hiroaki Yutani <yutani.ini@gmail.com>
Copy link
Member

@clauswilke clauswilke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this exceeds my level of technical understanding so please take my approval as nothing more than saying I'm very much in favor of this work and am glad you're taking the lead.

One minor comment: I always find it hard to read macro code and I would encourage you to be extra careful with documenting. It may seem all obvious to you today as you're writing this but a few more comments probably wouldn't hurt. At a minimum, I think it would be good for every chunk of macro code (i.e., every impl section) to provide in a comment a concrete example of what it implements. For example: "Implement binary mathematical operators such as + or -: a + b"

@andy-thomason andy-thomason merged commit 298a31d into extendr:master Sep 20, 2021
andy-thomason added a commit that referenced this pull request Sep 20, 2021
* rebase on master

* I hope this explains the drop test.

* More comments for Claus.

* Fix typos

* fmt

* More comments.

* Rashly edit the merge conflict in github.

* fmt

* [WIP] Implements `Rfloat` and `Doubles` using macros (#284)

* Generalizing macros

* Generating 'From'

* Fixed typo

* More macros

* Preliminary implementation of Rfloat

* First Rfloat tests

* Refactored macros

* Split 'impl' macros into two

* Instance methods for Rfloat

* Mixed bianry operators

* Testing Rfloat

* Formatting

* Gnenerating altrep implementations

* Adding support for Doubles

* Comments

* Using 'paste!' to generate vectors

* Implementing 'Doubles' using macros

* Ported 'Integers' to new macros

* Generating default value

* Formatting

* Adding comments to the macros

* Simplified unary operator macros

* Simplified binary operator macros

* [POC] generating doc tests on the fly

* Implementing 'Default' for scalars

* Some docs

* Changed how 'NA's are handled

* Updated scalars tests

* Defensive check on 'vector::elt' + doctests for 'elt'

* Testing 'Doubles'

* Testing out of range

* Testing Integers/Doubles from R

* Updated tests

* Fixed docs

Co-authored-by: Hiroaki Yutani <yutani.ini@gmail.com>

* Note on 'elt' boundary checks [skip ci]

Co-authored-by: Hiroaki Yutani <yutani.ini@gmail.com>

* rebase on master

* I hope this explains the drop test.

* More comments for Claus.

* Fix typos

* fmt

* More comments.

* Rashly edit the merge conflict in github.

* fmt

* Fix merge conflict.

* Merge

Co-authored-by: Ilia <ilia.kosenkov@outlook.com>
Co-authored-by: Hiroaki Yutani <yutani.ini@gmail.com>
@Ilia-Kosenkov Ilia-Kosenkov deleted the doubles-w-paste branch October 8, 2023 10:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement "Doubles" wrapper
4 participants