[BUG] for training part of policy gradient #13

Closed
sherlock1987 opened this issue May 27, 2020 · 5 comments
Labels: bug (Something isn't working)

Comments

@sherlock1987

Describe the bug
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [40,0,0], thread: [0,0,0] Assertion val >= zero failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [40,0,0], thread: [1,0,0] Assertion val >= zero failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [9,0,0], thread: [0,0,0] Assertion val >= zero failed.

This happens when running train.py in convlab2/policy/pg. It appears after about 15 epochs, even though I have already loaded the MLE model.
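A note on reading the error: the device-side assert above is raised asynchronously, so the Python stack trace may point at an unrelated op. Setting CUDA_LAUNCH_BLOCKING=1 (or running the policy on CPU) surfaces the failure at the call that actually triggered it; a minimal sketch using the standard PyTorch environment variable:

```python
import os

# Must be set before the first CUDA call, so that kernel-side asserts are
# reported synchronously at the op that actually triggered them.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the flag
```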

@sherlock1987
Author

And it is caused by this:
DEBUG:root:<> epoch 19, iteration 0, policy, loss -52.303607639513515
DEBUG:root:<> epoch 19, iteration 1, policy, loss nan
DEBUG:root:<> epoch 19, iteration 2, policy, loss nan
DEBUG:root:<> epoch 19, iteration 3, policy, loss nan
DEBUG:root:<> epoch 19, iteration 4, policy, loss nan
INFO:root:<> epoch 19: saved network to mdl
Process SpawnProcess-81:
Traceback (most recent call last):
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/raliegh/图片/ConvLab-2/convlab2/policy/pg/train.py", line 61, in sampler
a = policy.predict(s)
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/pg/pg.py", line 55, in predict
a = self.policy.select_action(s_vec.to(device=DEVICE), self.is_train).cpu()
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/rlmodule.py", line 92, in select_action
a = a_probs.multinomial(1).squeeze(1) if sample else a_probs.argmax(1)
RuntimeError: invalid multinomial distribution (encountering probability entry < 0)
Process SpawnProcess-82:
Traceback (most recent call last):
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/raliegh/图片/ConvLab-2/convlab2/policy/pg/train.py", line 61, in sampler
a = policy.predict(s)
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/pg/pg.py", line 55, in predict
a = self.policy.select_action(s_vec.to(device=DEVICE), self.is_train).cpu()
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/rlmodule.py", line 92, in select_action
a = a_probs.multinomial(1).squeeze(1) if sample else a_probs.argmax(1)
RuntimeError: invalid multinomial distribution (encountering probability entry < 0)
Process SpawnProcess-83:
Traceback (most recent call last):
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/raliegh/图片/ConvLab-2/convlab2/policy/pg/train.py", line 61, in sampler
a = policy.predict(s)
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/pg/pg.py", line 55, in predict
a = self.policy.select_action(s_vec.to(device=DEVICE), self.is_train).cpu()
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/rlmodule.py", line 92, in select_action
a = a_probs.multinomial(1).squeeze(1) if sample else a_probs.argmax(1)
RuntimeError: invalid multinomial distribution (encountering probability entry < 0)
Process SpawnProcess-84:
Traceback (most recent call last):
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/raliegh/anaconda3/envs/convlab/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/raliegh/图片/ConvLab-2/convlab2/policy/pg/train.py", line 61, in sampler
a = policy.predict(s)
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/pg/pg.py", line 55, in predict
a = self.policy.select_action(s_vec.to(device=DEVICE), self.is_train).cpu()
File "/home/raliegh/视频/ConvLab-2/convlab2/policy/rlmodule.py", line 92, in select_action
a = a_probs.multinomial(1).squeeze(1) if sample else a_probs.argmax(1)
RuntimeError: invalid multinomial distribution (encountering probability entry < 0)
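
For reference, the RuntimeError in these tracebacks is reproducible in isolation: torch.multinomial rejects any distribution that contains negative or non-finite entries, which is what the policy's softmax output becomes once the loss (and hence the weights) turn NaN. A minimal sketch, not code from the repo:

```python
import torch

# Once NaNs reach the policy network, its output a_probs is no longer a valid
# probability vector, and the same sampling call as in rlmodule.py raises.
for bad_probs in (torch.tensor([[-0.1, 0.4, 0.7]]),           # negative entry
                  torch.tensor([[float("nan"), 0.3, 0.7]])):  # NaN entry
    try:
        a = bad_probs.multinomial(1).squeeze(1)
    except RuntimeError as e:
        print(e)
```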

@zqwerty
Member

zqwerty commented May 27, 2020

Sorry, we will check this as soon as possible.

@sherlock1987
Author

That's fine, thank you so much. I think this is related to the loss: once the loss goes to NaN, the multiprocess sampling fails.
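
One possible stop-gap, until the NaN loss itself is fixed, is to validate the probabilities before sampling. This is only a sketch: select_action and a_probs mirror the names in rlmodule.py, but the guard is an assumption, not the fix the maintainers later shipped. Gradient clipping or a lower learning rate in the PG update would be the more direct way to keep the loss from diverging.

```python
import torch

def select_action_safe(a_probs: torch.Tensor, sample: bool = True) -> torch.Tensor:
    # Fall back to a uniform distribution when the policy output contains
    # NaN/Inf or negative entries, so the sampler processes do not crash.
    if not torch.isfinite(a_probs).all() or (a_probs < 0).any():
        a_probs = torch.full_like(a_probs, 1.0 / a_probs.size(-1))
    return a_probs.multinomial(1).squeeze(1) if sample else a_probs.argmax(1)
```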

@zqwerty
Member

zqwerty commented Jun 10, 2020

We have updated the policy to address this issue. Please give it a try!

@zqwerty
Member

zqwerty commented Jul 16, 2020

Moved to #54.
