Alphanumeric samples bytes instead of chars #1012
Sampling a random alphanumeric string by collecting `char`s (that are known to be ASCII) into a `String` involves re-allocation, because `String` encodes to UTF-8, as in this example:

```rust
let chars: String = iter::repeat(())
    .map(|()| rng.sample(Alphanumeric))
    .take(7)
    .collect();
```

I wanted to get rid of the clearly unnecessary re-allocations in my applications, so I needed access to the ASCII characters as plain bytes. It seems that this was already what was going on inside `Alphanumeric`; it was just internal. This PR changes the `Distribution<char>` impl to provide `u8`s (which it generates internally) instead, and implements the previous `Distribution<char>` using it. One could then, for example, do this:

```rust
let mut rng = thread_rng();
let bytes = (0..7).map(|_| rng.sample(ByteAlphanumeric)).collect();
let chars = unsafe { String::from_utf8_unchecked(bytes) };
```
This corresponds more closely to the internally used types and can be easily converted to a `char` via `From` and `Into`, while being more flexible to use. This is a breaking change.
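For readers who would rather avoid `unsafe`, here is a minimal std-only sketch of the same idea: collect ASCII bytes into a `Vec<u8>` and turn it into a `String` with the checked `String::from_utf8`, which takes ownership of the vector instead of copying it. The names `CHARSET`, `sample_alnum_byte`, and `random_alnum_string` are hypothetical, and the toy xorshift step merely stands in for `rng.sample(...)`; none of this is part of `rand`.

```rust
/// The 62 ASCII characters an alphanumeric sampler draws from.
const CHARSET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

/// Hypothetical stand-in for `rng.sample(...)`: one xorshift step mapped
/// onto CHARSET. This is a toy generator (with modulo bias), not `rand`.
fn sample_alnum_byte(state: &mut u64) -> u8 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    CHARSET[(*state % CHARSET.len() as u64) as usize]
}

/// Builds a string of `len` alphanumeric characters from raw bytes.
fn random_alnum_string(seed: u64, len: usize) -> String {
    let mut state = seed.max(1); // xorshift must not start at zero
    let bytes: Vec<u8> = (0..len).map(|_| sample_alnum_byte(&mut state)).collect();
    // Checked conversion: validates the bytes and takes over the Vec's buffer.
    String::from_utf8(bytes).expect("alphanumeric bytes are valid ASCII")
}
```

The checked conversion costs one validation pass over the bytes; `from_utf8_unchecked` skips that pass at the price of an `unsafe` block.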
@@ -249,7 +250,7 @@ mod tests {
             '\u{ed692}',
             '\u{35888}',
         ]);
-        test_samples(&Alphanumeric, 'a', &['h', 'm', 'e', '3', 'M']);
+        test_samples(&Alphanumeric, 0, &[104, 109, 101, 51, 77]);
Now we need `.map(..)` for distributions?

Possible I think, and in a way it makes sense. What do you think?
> Now we need `.map(..)` for distributions?

How do you mean that? You need that if you want `char`, but I think that makes sense, because the conversion from `u8` to `char` is trivial, while the other direction is not. I think it makes more sense to see `Alphanumeric` as a distribution of bytes, because this type is narrower, and it's unfortunate to throw away that compile-time knowledge by forcing a conversion to `char`.
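The asymmetry between the two conversions can be sketched with the standard library alone. The helper names `byte_to_char` and `char_to_byte` are hypothetical, and the sketch assumes Rust 1.59 or later, where `u8: TryFrom<char>` is available.

```rust
use std::convert::TryFrom;

/// `u8` to `char` is infallible: every byte maps to a code point in U+0000..=U+00FF.
fn byte_to_char(b: u8) -> char {
    char::from(b)
}

/// `char` to `u8` can fail: code points above U+00FF do not fit in one byte.
fn char_to_byte(c: char) -> Option<u8> {
    u8::try_from(c).ok()
}
```

For example, `byte_to_char(104)` is `'h'`, matching the byte values in the updated test, while `char_to_byte('€')` is `None`.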
If you prefer, we can also use this for the test:
-        test_samples(&Alphanumeric, 0, &[104, 109, 101, 51, 77]);
+        test_samples(&Alphanumeric, b'a', &[b'h', b'm', b'e', b'3', b'M']);
(Edit: start again.)
The point is that here you could replace `&Alphanumeric` with `&Alphanumeric.map(char::from)` and keep the other args to `test_samples` as chars. Of course that doesn't matter for this test, but it may be mildly useful elsewhere, though maybe not often, since we can already do `Alphanumeric.sample_iter(rng).map(char::from)`.
At this point, we might as well implement `Iterator` for distributions, no?
We do: the `.sample_iter(rng)` method. The RNG has to be attached somehow.
I think it is preferable to convert the distribution into an iterator for such cases. This also supports all the other `Iterator` methods, without adding more API.
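As a rough illustration of why an iterator is enough, the std-only sketch below uses a hypothetical `alnum_bytes` function as a stand-in for `Alphanumeric.sample_iter(rng)` (again a toy xorshift, not `rand`) and chains ordinary `Iterator` adaptors onto it. `first_digits` is likewise a made-up helper.

```rust
use std::iter;

/// Hypothetical stand-in for `Alphanumeric.sample_iter(rng)`: an endless
/// iterator of ASCII alphanumeric bytes driven by a toy xorshift state.
fn alnum_bytes(seed: u64) -> impl Iterator<Item = u8> {
    const CHARSET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    let mut state = seed.max(1); // xorshift must not start at zero
    iter::from_fn(move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        Some(CHARSET[(state % CHARSET.len() as u64) as usize])
    })
}

/// Keeps only digit bytes, converts them to `char`, and collects `n` of them.
fn first_digits(seed: u64, n: usize) -> String {
    alnum_bytes(seed)
        .filter(u8::is_ascii_digit) // byte-level filtering, no char needed
        .map(char::from)            // convert only at the very end
        .take(n)
        .collect()
}
```

Once the source is an iterator, `filter`, `map`, `take`, and `collect` come for free, so the distribution itself needs no extra adaptor methods.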
Agreed, this appears to be the best option.
Includes and thus closes #935.