
Option to manually set random seed globally #76

Closed
Megavoxel01 opened this issue Apr 13, 2020 · 8 comments

@Megavoxel01

Hi!

Thanks for this awesome package! I'm wondering if there is any option available to fix the random seed so I can reproduce the same results across different training runs. Currently I try to manually set the random seeds for PyTorch and numpy in train_pytorch.py and dataloader/sampler.py, but the final output embeddings of multiple training attempts are still different. Is there any workaround for this?

Thanks for any help in advance.

@Megavoxel01 Megavoxel01 changed the title Option to set manual seed globally Option to manually set random seed globally Apr 13, 2020
@zheng-da
Contributor

The randomness probably comes from the DGL sampler. You can try this: https://docs.dgl.ai/en/0.4.x/generated/dgl.random.seed.html
I think it should fix the randomness of the DGL sampler. If not, we'll fix it. Thanks.
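
For reference, a minimal sketch of what that seeding could look like (the seed value and placement are illustrative, not the exact code used by the package):

```python
import numpy as np
import torch
import dgl

seed = 42  # any fixed value

np.random.seed(seed)     # numpy RNG used by the dataloader/sampler
torch.manual_seed(seed)  # PyTorch CPU RNG
dgl.random.seed(seed)    # DGL's internal RNG, which its samplers draw from
```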

@Megavoxel01
Author

Thanks for the reply!
After some digging, it seems that OMP_NUM_THREADS also needs to be set to 1 to get the same outputs from the sampler, since the default edge sampler uses multi-threading when creating the negative entity list. However, the final output embeddings are still different. I'm wondering whether it is even possible to get completely identical embeddings from different training runs, especially when using multi-process/multi-thread and multi-GPU?
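
A minimal sketch of how that environment variable could be forced from Python, assuming it must be in place before the OpenMP runtime initializes (i.e. before numpy/torch/dgl are imported); alternatively, `export OMP_NUM_THREADS=1` in the shell before launching training has the same effect:

```python
import os

# Must happen before importing numpy / torch / dgl, otherwise the OpenMP
# runtime may already have been initialized with the default thread count.
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np  # noqa: E402
import torch        # noqa: E402
import dgl          # noqa: E402
```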

@zheng-da
Contributor

With multithreading or multiprocessing, I think it's impossible to make the results reproducible. I'm also not sure whether some of the GPU parallel computation can make it non-deterministic as well.

@Megavoxel01
Author

Megavoxel01 commented Apr 22, 2020

Thanks for your help!
I managed to produce deterministic results on both CPU and GPU. Here's what I've done.

  1. Fix the random seeds of numpy, PyTorch and DGL. For PyTorch on GPU, the CUDA random seed and some other CuDNN-related options need to be set as well (https://pytorch.org/docs/stable/notes/randomness.html).
  2. Set both --num_thread and --num_proc to 1, and turn off all other multi-thread related options (like --async_update).
  3. Set OMP_NUM_THREADS=1. This step is crucial: even with both the thread count and the process count set to 1, DGL still uses OpenMP to parallelize other jobs in the background, which introduces randomness. (A code sketch covering steps 1 and 3 follows below.)
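
A rough sketch of steps 1 and 3 in code; the seed value, the helper name, and the CuDNN flags follow the PyTorch randomness notes linked above, so treat it as an illustration rather than the exact code I changed:

```python
import os

# Step 3: force a single OpenMP thread before numpy / torch / dgl are imported.
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
import torch
import dgl


def seed_everything(seed=42):
    # Step 1: seed every RNG the training path touches.
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    dgl.random.seed(seed)
    # CuDNN options from https://pytorch.org/docs/stable/notes/randomness.html
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(42)

# Step 2 is done via the command-line flags mentioned above:
# pass --num_proc 1 and --num_thread 1, and drop --async_update.
```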

@zheng-da
Contributor

Good to know. Thanks for showing us how to make the training deterministic. It'll be useful for future users.

@xinyi-zhao

> 3. Set OMP_NUM_THREADS=1.

Does the second operation matter? And how do I set num_thread and num_proc to 1?

Thank you.

@Dinxin

Dinxin commented Jun 15, 2020

@Megavoxel01, how do you set num_thread and num_proc to 1, and set OMP_NUM_THREADS=1?

@fburdet

fburdet commented Feb 23, 2022

@Megavoxel01 can you please give more detailed information (actual files to modify and code) to get a deterministic model?

Or is there any chance that it has been integrated into dgl-ke since then?

Thanks in advance
