Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the random number generator correctly #171

Merged
merged 3 commits into from Apr 8, 2020
Merged

Conversation

svwilliams
Copy link
Contributor

Use the random number generator correctly.

@svwilliams svwilliams requested a review from ayrton04 April 7, 2020 00:20
@@ -87,16 +87,16 @@ namespace uuid
UUID generate()
{
static std::random_device rd;
static std::mt19937 generator(rd());
static std::mt19937_64 generator(rd());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://www.boost.org/doc/libs/1_48_0/doc/html/boost_random/reference.html#boost_random.reference.generators the 64-bit Mersenne Twister is significantly slower than the 32-bit one.

The Boost implementation https://www.boost.org/doc/libs/1_48_0/doc/html/boost_random/reference.html#boost.random.mt19937_64 seems to match the STL one https://en.cppreference.com/w/cpp/numeric/random/mersenne_twister_engine, so the values in that table should also be representative of the STL implementation. But from your comment above, that might not be true, so it's probably worth running a benchmark to understand the impact.

Maybe, if efficiency is relevant here (I'm not sure), it's better to generate 4 random variables with the 32-bit version instead of 2 with the 64-bit one. 🤔

BTW I wonder how are these static variables are handled when running things from plugins loaded dynamically (with pluginlib). I'm just not sure if there's a single RNG for the process, or otherwise, one per plugin. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be easy enough to just print out the memory address of the generator and find out how that's being handled. I was under the impression that scope for these things was limited to the library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick benchmarks, generating 1 million UUIDs.
32-bit: ~60ms
64-bit: ~33ms
Boost new generator: ~230000ms
Boost reuse generator: ~55ms

Originally I was creating a new generator every time a random uuid was requested:

UUID generate()
{
  return boost::uuids::random_generator()();
}

That appears to be thread-safe, but it is extremely slow. I can, instead, create a static Boost generator and a mutex guard, and then reuse the generator for each call.

UUID generate()
{
  static boost::uuids::random_generator generator;
  static std::mutex generator_mutex;

  UUID u;
  {
    std::lock_guard<std::mutex> lock(generator_mutex);
    u = generator();
  }
  return u;
}

Then the performance is similar to the stdlib version.

As far as performance being relevant, UUID generation was a major percentage of total runtime using the original "new Boost generator" method. That is what prompted the replacement. But the difference between 30ms and 60ms is likely trivial. It probably makes sense to revert back to the Boost generator.

But that doesn't answer the static variable - plugin question. Since fuse_core is a shared library and the static member is in a cpp file, I assume the shared library object owns the variable and it is reused by the plugins when they are loaded.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, did you figure out the answer to the static variable - plugin question?

I guess it's being reused by the plugins from your comment and the fact you've merged the PR 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be double-sure, I printed out the memory address of the Boost generator object every time a UUID was requested. The memory address is always the same. This seems to confirm the "shared single instance" theory, even when runtime-loaded libraries are used.

static std::mt19937 generator(rd());
static std::uniform_int_distribution<uint64_t> distibution;
static std::mutex distribution_mutex;
static boost::uuids::random_generator generator;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svwilliams svwilliams merged commit 5f28532 into devel Apr 8, 2020
@svwilliams svwilliams deleted the fix-uuids-again branch December 17, 2020 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants