migrate rand() to new random number library #18333

chardan · 2017-10-16T20:54:37Z

Migrates call sites to the new random number library. This should cover all rand() calls, as well as calls to Boost random and other random number generation. Also includes enhancements to the Ceph random number library that allow the default calls to be parameterized, along with simplifications to the implementation (these could go into a separate PR if desired).

Please check whether I have migrated things appropriately; there were lots of moving parts in some of these!

Thank you!

joscollin · 2017-10-18T02:04:23Z

retest this please

joscollin · 2017-10-18T02:15:32Z

@chardan

See this:

/home/jenkins-build/build/workspace/ceph-pull-requests/src/common/WeightedPriorityQueue.h:319: undefined reference to `ceph::__ceph_assert_fail(char const*, char const*, int, char const*)'
/home/jenkins-build/build/workspace/ceph-pull-requests/src/common/WeightedPriorityQueue.h:319: undefined reference to `ceph::__ceph_assert_fail(char const*, char const*, int, char const*)'

amitkumar50 · 2017-10-18T16:06:52Z

src/client/Client.cc

@@ -8218,7 +8220,7 @@ int Client::lookup_hash(inodeno_t ino, inodeno_t dirino, const char *name,
  req->set_filepath2(path2);

  int r = make_request(req, perms, NULL, NULL,
-		       rand() % mdsmap->get_num_in_mds());
+		               ceph::util::generate_random_number(mdsmap->get_num_in_mds()));


NIT: tab left

chardan · 2017-10-19T16:08:42Z

Jenkins retest this please.

cbodley · 2017-10-26T14:28:15Z

src/include/random.h

+
+template <typename NumberT>
+using default_distribution = typename
+	default_distribution_t<NumberT, std::is_integral<NumberT>::value>::type;


i really like this template and how it avoids duplicating interfaces for int/real 👍

i just think the naming is backwards - i'd expect the struct to be named default_distribution, with a helper typedef named default_distribution_t. that's the pattern you see from the standard library, ex. std::enable_if
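
For illustration, the naming pattern suggested above could look roughly like the following. This is a minimal sketch, not the code in src/include/random.h: the namespace `naming_sketch` and the choice of uniform distributions for both cases are assumptions made for the example.

```cpp
#include <random>
#include <type_traits>

namespace naming_sketch {  // hypothetical namespace, for illustration only

// Trait struct: the primary template covers floating-point types.
template <typename NumberT, bool IsIntegral = std::is_integral<NumberT>::value>
struct default_distribution {
  using type = std::uniform_real_distribution<NumberT>;
};

// Partial specialization selected for integral types.
template <typename NumberT>
struct default_distribution<NumberT, true> {
  using type = std::uniform_int_distribution<NumberT>;
};

// Helper alias with the _t suffix, following the std::enable_if /
// std::enable_if_t convention from the standard library.
template <typename NumberT>
using default_distribution_t = typename default_distribution<NumberT>::type;

static_assert(std::is_same<default_distribution_t<int>,
                           std::uniform_int_distribution<int>>::value,
              "integral types map to uniform_int_distribution");
static_assert(std::is_same<default_distribution_t<double>,
                           std::uniform_real_distribution<double>>::value,
              "floating-point types map to uniform_real_distribution");

}  // namespace naming_sketch
```

The struct carries the plain name and the alias carries the `_t` suffix, so call sites only ever spell `default_distribution_t<NumberT>`.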

cbodley · 2017-10-26T15:32:33Z

src/include/random.h

-  return detail::generate_random_number<IntegerT, DistributionT, EngineT>
-          (limits::min(), limits::max());
+  return detail::generate_random_number<NumberT, DistributionT, EngineT>
+          (0, std::numeric_limits<NumberT>::max());


was this change from limits::min() back to 0 here intentional? we discussed this case previously here

i understand your point that min=0 makes the interface closer to the rand() function that it aims to replace, but i think that a more explicit interface would be less confusing and prone to bugs in the long run

so first, i'd argue for removing the generate_random_number(max) overloads entirely, and require the caller to pass (0, max) instead:

it only costs 2 extra characters of typing, and there's no confusion at the call site about the range of output values

it eliminates confusion about whether this single-argument overload is the same as generate_random_number(min, max) but with a default argument for max=limits::max() (which was my first guess)

it avoids the case of invalid input, like what happens if you call generate_random_number(-5)?

i see this generate_random_number() overload as a useful way to avoid typing out all of the std::numeric_limits<> crap, but setting min=0 seems arbitrary to me - if anything, it should use the full range of values available to the given type. that said, there may be alternative ways to save typing, like providing aliases for some common distributions, and making first-class overloads for generate_random_number(dist) that take them. then you could do something like this: generate_random_number(positive_ints);

regardless, i think we should add overloads that take a distribution, and implement the (min, max) versions in terms of that. i still don't like the way we're using thread_local for distributions that could just as easily be on the stack and stored in registers. (i'm not sure exactly what the standard says about thread_local storage, but there's bound to be some overhead there)
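
A rough sketch of what such a distribution-taking overload might look like, with the (min, max) form built on top of it. Everything here is an assumption for the sake of the example: the namespace `dist_sketch`, the `engine()` helper, and the choice of std::mt19937_64 are not taken from the PR.

```cpp
#include <cassert>
#include <random>

namespace dist_sketch {  // hypothetical namespace, for illustration only

// One engine per thread, seeded once from std::random_device (assumed design).
inline std::mt19937_64& engine() {
  thread_local std::mt19937_64 e(std::random_device{}());
  return e;
}

// Overload taking a caller-supplied distribution.
template <typename DistributionT>
typename DistributionT::result_type generate_random_number(DistributionT& dist) {
  return dist(engine());
}

// (min, max) form implemented in terms of the overload above. The
// distribution is an ordinary local object rather than thread_local.
template <typename IntT>
IntT generate_random_number(IntT min, IntT max) {
  assert(min <= max);  // debug-only check; min > max is undefined in <random>
  std::uniform_int_distribution<IntT> dist(min, max);
  return generate_random_number(dist);
}

}  // namespace dist_sketch
```

With this shape, a caller could define something like `std::uniform_int_distribution<int> positive_ints(1, std::numeric_limits<int>::max());` once and pass it as `generate_random_number(positive_ints)`.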

chardan · 2017-11-02T21:21:21Z

Hi @cbodley,

There's a lot here to unpack, but here we go!

Thanks for pointing out that thread_local distributions aren't required-- a mini-benchmark suggests that making them static improves the performance of the generator!
I'm planning to add support for passing distributions to functions; notes below.
Thank you for pointing out the change in range "back" to zero. This was actually a bug fix, and I've added a test for it. The correct behavior of this library is as follows:

nullary (niladic) function: range [0... max]
unary function: range [0... n]
binary function: range [a... b], a <= b

(This covers only the range parameters; engine parameters, etc. are left out.)
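
The three forms listed above can be sketched as follows. This is only an illustration of the stated ranges using <random> directly; the namespace `forms_sketch`, the `engine()` helper, and the std::mt19937_64 engine are assumptions, and the real library's template parameters are omitted.

```cpp
#include <cassert>
#include <limits>
#include <random>

namespace forms_sketch {  // hypothetical namespace, for illustration only

// One engine per thread, seeded once from std::random_device (assumed design).
inline std::mt19937_64& engine() {
  thread_local std::mt19937_64 e(std::random_device{}());
  return e;
}

// Binary form: uniform over the closed range [min, max], with min <= max.
template <typename IntT>
IntT generate_random_number(IntT min, IntT max) {
  assert(min <= max);  // debug-only check; compiled out under NDEBUG
  std::uniform_int_distribution<IntT> dist(min, max);
  return dist(engine());
}

// Unary form: uniform over [0, max].
template <typename IntT>
IntT generate_random_number(IntT max) {
  return generate_random_number<IntT>(IntT{0}, max);
}

// Nullary form: uniform over [0, std::numeric_limits<IntT>::max()].
template <typename IntT = int>
IntT generate_random_number() {
  return generate_random_number<IntT>(IntT{0},
                                      std::numeric_limits<IntT>::max());
}

}  // namespace forms_sketch
```

Note that in this sketch the unary form includes its argument, so `generate_random_number(10)` can return 10.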

The primary purpose of the library is found in the first two forms. The third form is actually the gravy. The first form is meant as an improved replacement for rand(). The second is for handling the overwhelmingly most common case (in Ceph, and probably most other programs) of eliminating code in the form:

rand() % bound

...for the well-known reasons that this can badly skew the output's distribution, and that there is an assumption on the engine's span.(1) That makes this form particularly important, and not something I think should be removed.
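
To make the skew concrete, here is a small deterministic sketch that counts how an engine's outputs fall onto each residue under `% bound`. The helper name `modulo_histogram` is made up for this example, and it enumerates the engine's span instead of sampling it, so it is only practical for small spans.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts how many of the engine outputs 0..engine_max land on each residue
// when reduced with `% bound`. A uniform mapping would give equal counts.
// Intended for small engine_max only, since it walks the whole span.
inline std::vector<std::uint64_t> modulo_histogram(std::uint64_t engine_max,
                                                   std::uint64_t bound) {
  assert(bound > 0);
  std::vector<std::uint64_t> counts(bound, 0);
  for (std::uint64_t v = 0; v <= engine_max; ++v) {
    ++counts[v % bound];
  }
  return counts;
}
```

For an engine spanning [0, 7] and a bound of 3, the counts are 3, 3, 2, so the last residue is less likely than the others. For a span of [0, 32767] and a bound of 10000, residues 0 through 2767 each get 4 engine values while the rest get 3.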

I agree with you that it's arbitrary that the unary parameter is max rather than min, but I offer that in addition to capturing the most common use of rand() in our code, there's considerable precedent for having the unary form represent a maximum bound. Although not universal, it's a popular convention-- for example:
Perl: https://perldoc.perl.org/functions/rand.html
Python: https://docs.python.org/3/library/random.html#random.randrange
Ruby: https://ruby-doc.org/core-2.2.0/Random.html
Java: (see nextInt()): https://docs.oracle.com/javase/8/docs/api/java/util/Random.html
Erlang: (>= 1): http://erlang.org/doc/man/random.html

Finally, note that the default parameters for std::uniform_int_distribution (as one example) also cover [0, max], and that bad input is undefined behavior there, which we are consistent with:
http://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution/uniform_int_distribution
Passing bad input results in undefined behavior (just as it does if you pass such input to <random>). Note that using the binary form in no way prevents this, i.e. these are undefined behavior (naturally, these may be variables rather than constants):
generate_random_number(10, 1);
generate_random_number(0, -1);

We could avoid this by checking the value of the parameters at runtime and raising an exception, but I don't think we'll see this often enough in practice to warrant the almost certain performance impact. But, it could definitely be discussed as another PR.
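
For reference, a checked variant along those lines might look like the sketch below. It is not part of this PR: the name `generate_random_number_checked`, the namespace, the engine helper, and the choice of std::invalid_argument are all assumptions for the example.

```cpp
#include <cassert>
#include <random>
#include <stdexcept>

namespace checked_sketch {  // hypothetical namespace, for illustration only

// One engine per thread, seeded once from std::random_device (assumed design).
inline std::mt19937_64& engine() {
  thread_local std::mt19937_64 e(std::random_device{}());
  return e;
}

// Validates the range on every call instead of leaving min > max undefined.
// The extra branch is the runtime cost discussed above.
template <typename IntT>
IntT generate_random_number_checked(IntT min, IntT max) {
  if (min > max) {
    throw std::invalid_argument("generate_random_number_checked: min > max");
  }
  std::uniform_int_distribution<IntT> dist(min, max);
  const IntT value = dist(engine());
  assert(min <= value && value <= max);  // postcondition of the distribution
  return value;
}

}  // namespace checked_sketch
```

With this variant, a call such as `generate_random_number_checked(10, 1)` throws instead of invoking undefined behavior.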

While I'm definitely concerned that you got bitten by this, I'm not convinced that it will be a common source of bugs, or a universal cause for surprise. If you still feel very strongly about this, we can move the discussion to the list, but I hope that I've assuaged your worries here. To strengthen this, in addition to the canonical example in the unit tests stating the expected behavior, I've added explicit tests.

I greatly appreciate your pointing out that thread_local storage isn't needed for distributions-- which improved the performance of the library!

I'd like to implement your suggestion of allowing distributions as parameters. I actually spent all day on it and unfortunately the SFINAE is going to be a headache in C++11-- I've juuust about got some of it working, but it needs to go into a separate PR for my own sanity. (Plus, it's now holding up this changeset.)

Thank you again for your thoughtful input, discussion, and time!

(1) Brown, Walter. N3551, "Random Number Generation in C++". https://isocpp.org/files/papers/n3551.pdf

chardan · 2017-12-05T23:21:36Z

Jenkins retest this please.

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

chardan · 2018-02-21T15:44:28Z

So, how are people feeling? Shall I try to break this into some smaller changesets, or is it better all together? For instance, I could separate the library changes, adjustments to Ceph's code, and modifications to the unit tests.

chardan · 2018-03-01T14:13:21Z

I'm going to find a way to break this into smaller bites!

chardan · 2018-03-01T14:13:50Z

Refer-to:
#20670

chardan requested a review from cbodley October 16, 2017 20:54

chardan force-pushed the jfw-wip-puppies_against_rand branch from ca82c42 to a9a1d24 Compare October 17, 2017 23:21

amitkumar50 reviewed Oct 18, 2017


chardan force-pushed the jfw-wip-puppies_against_rand branch 2 times, most recently from ab2db8a to 7fde78b Compare October 19, 2017 02:47

chardan force-pushed the jfw-wip-puppies_against_rand branch 3 times, most recently from 3fe5c3d to 769565f Compare October 25, 2017 02:14

cbodley reviewed Oct 26, 2017


cbodley added the common label Oct 26, 2017

chardan force-pushed the jfw-wip-puppies_against_rand branch 2 times, most recently from 15b84ac to 3cfb95c Compare November 2, 2017 21:07

chardan force-pushed the jfw-wip-puppies_against_rand branch 3 times, most recently from 2b0a1ff to f10a27a Compare November 8, 2017 23:00

chardan force-pushed the jfw-wip-puppies_against_rand branch from f10a27a to d91eb1d Compare November 10, 2017 23:42

chardan force-pushed the jfw-wip-puppies_against_rand branch 5 times, most recently from b5be236 to 9d9b796 Compare December 1, 2017 20:52

chardan force-pushed the jfw-wip-puppies_against_rand branch from 9d9b796 to fbaf2c1 Compare December 5, 2017 20:24

chardan force-pushed the jfw-wip-puppies_against_rand branch 2 times, most recently from d09ceb7 to 8a763b2 Compare December 7, 2017 22:54

joscollin mentioned this pull request Dec 8, 2017

osd: Using /dev/urandom instead of rand() #19379

Closed

Jesse Williamson added 19 commits February 21, 2018 02:24

msg: migrate rand() to <random>

a05ad1a

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

key_value_store: migreate rand() to <random>

6b5e5d5

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

os: migrate rand() to <random> (BlueStore)

5610c94

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

os: migrate rand() to <random> (filestore)

ad37870

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

tools: migrate rand() to <random>

43d8745

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

pgbackend: fixup warning

bd6f4b9

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

client: migrate Mutex code

6512656

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

common: migrate Mutex code

e99138a

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

include: migrate Mutex code

ff41630

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

key_value_store: migrate Mutex code

97f7f83

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

mds: migrate Mutex code

4489390

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

mon: migrate Mutex code

a51a816

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

msg: migrate Mutex code

a4ff7ff

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

os: migrate Mutex code

5793967

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

osd: migrate Mutex code

592e34e

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

osdc: migrate Mutex code

0a4f183

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

rgw: migrate Mutex code

190094e

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

test: migrate Mutex code

8a32827

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

tools: migrate Mutex code

1e3d5be

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

chardan force-pushed the jfw-wip-puppies_against_rand branch from b549477 to 66eed3d Compare February 21, 2018 10:29

Resolve test regressions.

66eed3d

Signed-off-by: Jesse Williamson <jwilliamson@suse.de>

chardan self-assigned this Feb 23, 2018

chardan added the needs-review label Feb 23, 2018

chardan closed this Mar 1, 2018

chardan mentioned this pull request Mar 1, 2018

Extends random.h: numeric types relaxed to compatible types (with #20670

Merged

chardan reopened this Mar 1, 2018

chardan closed this Mar 1, 2018
