Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive PartialEq+Eq. #975

Closed
wants to merge 10 commits into from
Closed

Derive PartialEq+Eq. #975

wants to merge 10 commits into from

Conversation

rickvanprim
Copy link

Fixes #964.

@@ -61,11 +61,21 @@ impl<T> fmt::Debug for Array64<T> {
write!(f, "Array64 {{}}")
}
}
impl<T> ::core::cmp::PartialEq for Array64<T>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some places used full paths for the ops traits, so I wasn't sure if that's preferred here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care a lot. Since Edition 2018 you don't need the leading ::.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have no preference, I'll just leave as is. If you'd prefer I remove the leading :: I can do that too.

@@ -28,6 +28,19 @@ pub struct ChaCha {
pub(crate) d: vec128_storage,
}

// Custom PartialEq implementation as `vec128_storage` doesn't implement it.
impl ::core::cmp::PartialEq for ChaCha {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty non-ideal, but I'm hoping to not have to also try to get this change into vec128_storage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm vec128_storage doesn't provide this Into implementation on all platforms. I'm going to do a transmute to [u32; 4] to see if that gets the build passing, but I'd like some input here on what an acceptable approach is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing wrong with a custom implementation, but that unsafe code — we've been trying to minimise usage, so I'm not keen on this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I go try to get PartialEq+Eq derived for vec128_storage upstream? I don't think there's a cross target way currently available to compare these without unsafe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could ask @kazcw about that, but looks to me like you can just do let b: [u32; 4] = self.b.into();.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous version used Into<[u32; 4]> but it's not implemented on all platforms. Though that doesn't make a ton of sense to me based on the code you linked. Relevant build log: https://travis-ci.org/github/rust-random/rand/jobs/682708945

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this code was added since ppv-lite 0.2.6. @kazcw could you release a new version please?

I don't wish to add this unsafe code, which makes us blocked on ppv-lite.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhardy Sure, I can do that this weekend.

@@ -4,6 +4,10 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
- Derive PartialEq+Eq for BlockRng and BlockRng64
- Add PartialEq+Eq constraint to BlockRngCore::Results
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this reasonable? Also should BlockRngCore itself have a where Self: PartialEq + Eq constraint at this point?

Copy link
Member

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Unfortunately, there are a few issues..

CHANGELOG.md Outdated
@@ -8,6 +8,9 @@ A [separate changelog is kept for rand_core](rand_core/CHANGELOG.md).

You may also find the [Upgrade Guide](https://rust-random.github.io/book/update.html) useful.

## [Unreleased]
- Derive PartialEq+Eq for StdRng, SmallRng, ThreadRng, ReseedingRng, ReseedingCore, and StepRng (#975)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make any sense for ThreadRng because that's just a handle. Handles cannot cross thread boundaries so any two handles which can be compared should be equal. I'm also not sure it's useful on the reseeding constructs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. I was thinking that comparing the internal pointers would be reasonable (assuming that's what the derive does for NonNull). You bring up a good point about it not being something you'd store and pass across threads. I'll remove the derive on that one.

Cargo.toml Outdated
@@ -73,7 +73,7 @@ libc = { version = "0.2.22", optional = true, default-features = false }
# Emscripten does not support 128-bit integers, which are used by ChaCha code.
# We work around this by using a different RNG.
[target.'cfg(not(target_os = "emscripten"))'.dependencies]
rand_chacha = { path = "rand_chacha", version = "0.2.1", default-features = false, optional = true }
rand_chacha = { path = "rand_chacha", version = "0.2.2", default-features = false, optional = true }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your change will be in 0.2.3, if that's why you updated the number. (If not, well, 0.2.2 was not a mandatory upgrade.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, good catch, will fix.

@@ -61,11 +61,21 @@ impl<T> fmt::Debug for Array64<T> {
write!(f, "Array64 {{}}")
}
}
impl<T> ::core::cmp::PartialEq for Array64<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care a lot. Since Edition 2018 you don't need the leading ::.

@@ -28,6 +28,19 @@ pub struct ChaCha {
pub(crate) d: vec128_storage,
}

// Custom PartialEq implementation as `vec128_storage` doesn't implement it.
impl ::core::cmp::PartialEq for ChaCha {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing wrong with a custom implementation, but that unsafe code — we've been trying to minimise usage, so I'm not keen on this.

@@ -67,7 +67,7 @@ pub trait BlockRngCore {

/// Results type. This is the 'block' an RNG implementing `BlockRngCore`
/// generates, which will usually be an array like `[u32; 16]`.
type Results: AsRef<[Self::Item]> + AsMut<[Self::Item]> + Default;
type Results: AsRef<[Self::Item]> + AsMut<[Self::Item]> + Default + PartialEq + Eq;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be okay, but need to think about it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively I can conditionally do PartialEq+Eq on BlockRng where R::Results: PartialEq.

@@ -147,7 +147,7 @@ where
{
}

#[derive(Debug)]
#[derive(Debug, PartialEq, Eq)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense. We have a reseed-on-fork feature which in a sense violates Eq (though in a new process). We also don't guarantee we won't introduce non-deterministic reseeding in the future.

We also have no use for this, since ThreadRng shouldn't be comparable (as already mentioned)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll get rid of this then. From the description, this sounded like a wrapper class that would be comparable if the RNG it's wrapping is comparable, and the example uses a ChaCha20Core, though I'm sure there is some nuance I'm missing 🙂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should get rid of this on both ReseedingRng and ReseedingCore right? Or just the latter?

@dhardy
Copy link
Member

dhardy commented May 4, 2020

This revocation is incomplete too — it uselessly derives on ReseedingRng and still mentions both Reseeding* in the changelog.

@rickvanprim
Copy link
Author

Well my GitHub newbishness is showing. Evidently I had a bunch of comments "in review" that weren't actually posted. I apologize about seeming unresponsive.

@kazcw
Copy link
Collaborator

kazcw commented May 15, 2020

A bitwise comparison like this is not reliable for a seekable BlockRngCore like ChaCha, because the same logical state with different amounts of data buffered are different physical states. I.e. if you clone a stream and then seek one by a certain amount, the two streams may never compare equal per this implementation even if you re-match their logical stream positions, because their buffers will be aligned differently. To avoid false-negative return values, BlockRng::partial_eq needs to account for buffer offset.

This could be done without breaking existing impls of BlockRngCore with something like impl PartialEq for BlockRng<R> where R: StreamCmp { ... }, where StreamCmp provides stream_eq(&self, rhs: &Self, offset: isize) -> bool, where offset is logically equivalent to rhs.byte_pos() - self.byte_pos() if the streams are equal. (ChaCha could offer a more powerful interface based around querying whether the streams are the same, and querying each stream's current position; but passing the buffer offset difference like this is a least-common-denominator API implementable by any RNG, even if it can't quantify its position or doesn't have a same-stream notion).

@dhardy, what do you think?

@dhardy
Copy link
Member

dhardy commented May 16, 2020

You are right @kazcw, on the other hand I don't think false negatives for RNG comparisons are a big deal: if RNG1 != RNG2 we have no idea whether they have different seeds or one is merely a word ahead of the other, and we certainly don't know whether the next n outputs will be equal. It may however cause spurious test failures, so certainly not ideal.

stream_eq(&self, rhs: &Self, offset: isize) -> bool

Correctly implementing this in general is not trivial. Even with bounds on isize, seeking may be required (since BlockRng's buffer can be larger than one block).

The alternative is not to implement PartialEq on block RNGs. @rickvanprim is there any reason you need this? Is there sufficient utility in general to justify adding a stream_eq method?

@dhardy
Copy link
Member

dhardy commented May 16, 2020

Thinking about this some more: if x, y are RNGs, then x == y implies that for any pure function f on RNGs, f(x) == f(y). Also, for any pure function g which permutes (and returns) RNGs we should have g(x.clone()) == g(x). The derive(PartialEq) impls do satisfy this.

In general, though, x != y tells us nothing at all (not even that x and y will eventually diverge — this could be hard to test on large chaotic PRNGs). We should document this on the RNGs in question, but probably not worry about it?

@kazcw
Copy link
Collaborator

kazcw commented May 16, 2020

I think an impl that satisfies the definition of an equivalence relation but yields results that depend on otherwise-unobservable implementation details is likely to surprise users. I think in practice people will expect that x == y implies that after cargo update, x == y. I also think that documenting caveats for a transitively-autoderivable trait will not help end users of the trait except in postmortem.

A least-surprise eq can be done more simply than my original suggestion: don't impl the traits for BlockRng, just on the struct that wraps it. For a seekable generator like ChaCha, the wrapper can test whether it's the same stream (implemented upstream) and it has the same get_word_pos. For a non-seekable generator, the buffer is always at a self-aligned position in the stream, so the wrapper can compare generator states and buffer positions. Either case is straightforward; the cost comes from trying to abstract over generators for BlockRng.

Shipped Eq impls in ppv-lite86 0.2.7.

@dhardy
Copy link
Member

dhardy commented May 17, 2020

Good points @kazcw (though the above still allows fixing false negatives in updates).

So then we don't derive for BlockRng. This implies however that we need some API to access the unused items in the buffer, e.g.:

    /// Read remaining entries in the buffer
    #[inline]
    pub fn remaining_buf(&self) -> &[R::Item] {
        &self.results.as_ref()[self.index..]
    }

(Also for BlockRng64.) I don't exactly like exposing the internals like this, but it's probably the best option.

@rickvanprim
Copy link
Author

@dhardy I'm not sure I'm fully following this discussion. If no external state is relied upon, then I don't see how two instances that are bitwise identical can diverge when performing the same operations.

My use case just needs one type of PRNG that only uses its internal state to calculate the next value and is therefore deterministic, and then I'd like that PRNG to implement PartialEq/Eq. However for all PRNGs where those properties apply, I think it would make sense to implement PartialEq/Eq. For ones that rely on eternal state, I could see arguments in either direction for whether or not it makes sense for them to implement PartialEq/Eq, but I don't have an opinion there.

@dhardy
Copy link
Member

dhardy commented May 18, 2020

two instances that are bitwise identical

No, these will test equal and not diverge. The first point is that by using seek operations, some buffered PRNGs like ChaCha may not be bitwise identical (due to buffer state) but still behave identically; ideally these should test equal but the argument above is that it doesn't matter much. The second point is that it is possible (though extremely unlikely) that two unequal PRNGs will produce the same output for the next n operations, then diverge.

RNGs relying on external state (OsRng, ReseedingRng, ...) should either not implement PartialEq or should always test not equal — but without good reason, it is better not to implement the == operator.

@rickvanprim
Copy link
Author

Thanks for the explanation. For my purposes, bitwise identical is all that matters, but handling the buffered case differently could make sense.

@kazcw
Copy link
Collaborator

kazcw commented May 18, 2020

The second point is that it is possible (though extremely unlikely) that two unequal PRNGs will produce the same output for the next n operations, then diverge.

@dhardy I'm not sure what this is in reference to. Neither the impl in the PR nor my suggestion rely on comparing buffer contents or other PRNG output (the derived eq in the PR compares buffers, but doesn't rely on the result since it also compares BlockRngCores).


The issue that I'm concerned about, and elaborated on in my second comment, is that a bitwise comparison of BlockRngs returns implementation-defined results under conditions where I wouldn't fault a user for expecting consistent results.

Lets define some terms.

identical is defined inductively:

  • two RNGs x and y created from the same parameters are identical on creation
  • the output of clone is identical to the input
  • if x is identical to y, f(x) is identical to f(y) for any sequence of operations f

equivalent is broader:

  • two RNGs x and y created from the same parameters are equivalent on creation
  • the output of clone is equivalent to the input
  • if x is equivalent to y, f(x) is equivalent to g(y) for any sequences of operations f and g such that f and g advance the stream by the same number of bytes

I think an implementation of eq that returned true if and only if the inputs were identical might be acceptable because at least it is well-defined for all inputs. But that's not what the derive-based implementation does; the implementation in this PR has three cases:

  • identical states: return true
  • non-identical, non-equivalent states: return false
  • non-identical, but equivalent states: return unspecified results, that may vary between patch releases of the rand crates

The problem is the indeterminacy in the 3rd case; a user who expects eq to test equivalence may get results consistent with that expectation for the sequences of operations they test... and may get different results with a different rand version.

I see no benefit to the indeterminate behavior; it would not be appreciably more complex in implementation to test equivalence (using stream_eq):

impl PartialEq for $ChaChaXRng {
    fn eq(&self, rhs: &Self) -> bool {
        self.rng.core.state.stream64_eq(&rhs.rng.core.state)
            && self.get_word_pos() == rhs.get_word_pos()
    }
}

@dhardy
Copy link
Member

dhardy commented May 19, 2020

I'm not sure what this is in reference to

It is possible for two completely different PRNGs to produce the same next word of output, and even several words (though obviously you would never except to observe an event this improbable). It has nothing to do with buffers.

non-identical, but equivalent states: return unspecified results, that may vary between patch releases of the rand crates

We haven't yet specified exactly what happens. And I agree, we should be more precise about this (and probably not use derive on BlockRng).

Also, if equivalent RNGs are not guaranteed to test equal, then we should probably only implement PartialEq and not Eq, but...

It would not be appreciably more complex in implementation to test equivalence (using stream_eq)

Great. So I'm not sure why we're arguing?

But your code misses one thing: it must also test results already in the buffer. Right? Or can we guarantee that this is unnecessary (since the buffer is always cleared when seeking)?


So, it's probably easiest if @rickvanprim removes the code affecting BlockRng and the PRNGs which use that from this PR, and then either @kazcw or myself opens a new PR for these PRNGs.

kazcw added a commit to kazcw/rand that referenced this pull request May 19, 2020
As in rust-random#975, but defining equality such that the user is not exposed to
the fact that one logical state may have different representations in an
implementation-specific way.
kazcw added a commit to kazcw/rand that referenced this pull request May 19, 2020
Implements rust-random#974. As in rust-random#975, but defining equality such that the user is
not exposed to the fact that one logical state may have different
representations in an implementation-specific way.
@rickvanprim
Copy link
Author

@dhardy / @kazcw shall I wait for #979 to land and then rebase this PR?

kazcw added a commit to kazcw/rand that referenced this pull request May 28, 2020
Implements rust-random#974. As in rust-random#975, but defining equality such that the user is
not exposed to the fact that one logical state may have different
representations in an implementation-specific way.
dhardy pushed a commit that referenced this pull request May 29, 2020
Implements #974. As in #975, but defining equality such that the user is
not exposed to the fact that one logical state may have different
representations in an implementation-specific way.
@dhardy
Copy link
Member

dhardy commented May 29, 2020

Everything we want from here was already incorporated in #979.

@dhardy dhardy closed this May 29, 2020
@rickvanprim rickvanprim deleted the rickvanprim/partialeq+eq branch June 2, 2020 19:36
@rickvanprim rickvanprim restored the rickvanprim/partialeq+eq branch June 2, 2020 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement PartialEq/Eq on PRNG Generators where possible
3 participants