New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
migrate rand() to new random number library #18333
Conversation
ca82c42
to
a9a1d24
Compare
retest this please |
See this:
|
src/client/Client.cc
Outdated
@@ -8218,7 +8220,7 @@ int Client::lookup_hash(inodeno_t ino, inodeno_t dirino, const char *name, | |||
req->set_filepath2(path2); | |||
|
|||
int r = make_request(req, perms, NULL, NULL, | |||
rand() % mdsmap->get_num_in_mds()); | |||
ceph::util::generate_random_number(mdsmap->get_num_in_mds())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: tab left
ab2db8a
to
7fde78b
Compare
Jenkins retest this please. |
3fe5c3d
to
769565f
Compare
src/include/random.h
Outdated
|
||
template <typename NumberT> | ||
using default_distribution = typename | ||
default_distribution_t<NumberT, std::is_integral<NumberT>::value>::type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i really like this template and how it avoids duplicating interfaces for int/real 👍
i just think the naming is backwards - i'd expect the struct to be named default_distribution
, with a helper typedef named default_distribution_t
. that's the pattern you see from the standard library, ex. std::enable_if
src/include/random.h
Outdated
return detail::generate_random_number<IntegerT, DistributionT, EngineT> | ||
(limits::min(), limits::max()); | ||
return detail::generate_random_number<NumberT, DistributionT, EngineT> | ||
(0, std::numeric_limits<NumberT>::max()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this change from limits::min()
back to 0
here intentional? we discussed this case previously here
i understand your point that min=0
makes the interface closer to the rand()
function that it aims to replace, but i think that a more explicit interface would be less confusing and prone to bugs in the long run
so first, i'd argue for removing the generate_random_number(max)
overloads entirely, and require the caller to pass (0, max)
instead:
- it only costs 2 extra characters of typing, and there's no confusion at the call site about the range of output values
- it eliminates confusion about whether this single-argument overload is the same as
generate_random_number(min, max)
but with a default argument formax=limits::max()
(which was my first guess) - it avoids the case of invalid input, like what happens if you call
generate_random_number(-5)
?
i see this generate_random_number()
overload as a useful way to avoid typing out all of the std::numeric_limits<>
crap, but setting min=0
seems arbitrary to me - if anything, it should use the full range of values available to the given type. that said, there may be alternative ways to save typing, like providing aliases for some common distributions, and making first-class overloads for generate_random_number(dist)
that take them. then you could do something like this: generate_random_number(positive_ints);
regardless, i think we should add overloads that take a distribution, and implement the (min, max) versions in terms of that. i still don't like the way we're using thread_local
for distributions that could just as easily be on the stack and stored in registers. (i'm not sure exactly what the standard says about thread_local storage, but there's bound to be some overhead there)
15b84ac
to
3cfb95c
Compare
Hi @cbodley, There's a lot here to unpack, but here we go!
nullary (niladic) function: range [0... max] (This excludes engine parameters, etc., merely the range parameters.) The primary purpose of the library is found in the first two forms. The third form is actually the gravy. The first form is meant as an improved replacement for rand(). The second is for handling the overwhelmingly most common case (in Ceph, and probably most other programs) of eliminating code in the form: rand() % bound ...for the well-known reasons that this can badly skew the output's distribution, and that there is an assumption on the engine's span.(1) That makes this form particularly important, and not something I think should be removed. I agree with you that it's arbitrary that the unary parameter is max rather than min, but I offer that in addition to capturing the most common use of rand() in our code, there's considerable precendent for having the unary form represent a maximum bound. Although not universal, it's a popular convention-- for example: Finally, note that the default parameters for (as example) std::uniform_int_distribution also range from [0, max] and that bad input is undefined behavior, which we are consistent with: We could avoid this by checking the value of the parameters at runtime and raising an exception, but I don't think we'll see this often enough in practice to warrant the almost certain performance impact. But, it could definitely be discussed as another PR. While definitely concerned that you got bitten by this, I'm not convinced that it will be a common source of bugs, or a universal cause for surprise. If you are still feeling very strongly about this, we can move the discussion to the list, but I hope that I've assuaged your worries here. To strengthen this, in addition to the canonical example in the unit tests stating the expected behavior, I've added explicit tests. I greatly appreciate your pointing out that thread_local storage isn't needed for distributions-- which improved the performance of the library! I'd like to implement your suggestion of allowing distributions as parameters. I actually spent all day on it and unfortunately the SFINAE is going to be a headache in C++11-- I've juuust about got some of it working, but it needs to go into a separate PR for my own sanity. (Plus, it's now holding up this changeset.) Thank you again for your thoughtful input, discussion, and time!
|
2b0a1ff
to
f10a27a
Compare
f10a27a
to
d91eb1d
Compare
b5be236
to
9d9b796
Compare
9d9b796
to
fbaf2c1
Compare
Jenkins retest this please. |
d09ceb7
to
8a763b2
Compare
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
b549477
to
66eed3d
Compare
Signed-off-by: Jesse Williamson <jwilliamson@suse.de>
So, how are people feeling? Shall I try to break this into some smaller changesets, or is it better all together? For instance, I could separate the library changes, adjustments to Ceph's code, and modifications to the unit tests. |
I'm going to find a way to break this into smaller bites! |
Refer-to: |
Migrates call sites to the new random number library. Should affect any rand() calls, as well as calls to Boost random and other random number generation. Also includes enhancements to Ceph random number library to allow parametric parameters to the default calls, along with simplifications to the implementation (could make a separate PR if desired).
Please check to see if I have migrated things appropriately, there were lots of moving parts in some of these!
Thank you!