Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] C random number generator #729

Merged
merged 17 commits into from Jul 31, 2019

Conversation

@BarclayII
Copy link
Collaborator

commented Jul 26, 2019

Description

This PR replaces rand_r with C++ random number generators, dedicated for OpenMP multithreading. This should provide reproducibility in sampling as well.

Each OpenMP thread would get its own random number generator associated with the thread ID.

New C++ interfaces:

  • Random::SetSeed(seed) RandomEngine::ThreadLocal()->SetSeed(seed): set the seed of C random number generator.
  • Random::RandInt(a, b) RandomEngine::ThreadLocal()->RandInt(a, b): generate a random integer in [a, b) uniformly.
  • Random::Uniform(a, b) RandomEngine::ThreadLocal()->Uniform(a, b): generate a random float in [a, b) uniformly.

New Python interface:

  • dgl.contrib.sampling.seed(val) dgl.random.seed(val): set the seed of C random number generator from Python.

EDIT: moved to the thread-local style of TVM, hence changing the C++ interface.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the my best knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change

BarclayII added some commits Jul 25, 2019

@BarclayII BarclayII changed the title [Refactor] C random number generator [WIP][Refactor] C random number generator Jul 26, 2019

BarclayII added some commits Jul 26, 2019

@jermainewang

This comment has been minimized.

Copy link
Member

commented Jul 26, 2019

dgl.contrib.sampling.seed(val): set the seed of C random number generator from Python.

How about dgl.random.seed(val)? The random seed might be used by other components as well.

@jermainewang

This comment has been minimized.

Copy link
Member

commented Jul 26, 2019

I think concurrency and rng are two things. If a user want to get a deterministic result, s/he should set the number of workers to be one and fix the seed. Otherwise, the behavior may or may not be non-deterministic (undefined), making implementation easier. I don't like the idea of coupling RNG with omp, they are two things. My suggestion:

  • Use a global singleton seed that could be set/get by the end user. We could assume that only one thread is setting the seed so we don't need to protect it.
  • Use thread-local rng for each thread. It will use the global seed if available, otherwise, use local time seed.
  • Do nothing about concurrency.

BarclayII added some commits Jul 28, 2019

@BarclayII BarclayII changed the title [WIP][Refactor] C random number generator [Refactor] C random number generator Jul 28, 2019

@BarclayII

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 28, 2019

This PR has adopted the solution from TVM, and should be ready for review.

@BarclayII BarclayII requested a review from jermainewang Jul 28, 2019

@jermainewang
Copy link
Member

left a comment

LGTM. Minor comments.

Show resolved Hide resolved python/dgl/random.py
Show resolved Hide resolved src/random.cc Outdated

@BarclayII BarclayII merged commit e9e587b into dmlc:master Jul 31, 2019

1 check passed

continuous-integration/jenkins/pr-merge This commit looks good
Details

@BarclayII BarclayII deleted the BarclayII:crandom branch Jul 31, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.