Can't work on pytorch 0.4.0 #52

Closed
jiakai0419 opened this issue May 8, 2018 · 8 comments

Comments

@jiakai0419

jiakai0419 commented May 8, 2018

macOS 10.13.4
Python 3.6.4
pytorch 0.4.0

I encountered an error

~/py-garage/pytorch-a3c(master*) » python3 main.py --env-name "PongDeterministic-v4" --num-processes 1                                                        anya@turing-machine
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: <class 'envs.AtariRescale42x42'> doesn't implement 'observation' method. Maybe it implements deprecated '_observation' method.
WARN: <class 'envs.AtariRescale42x42'> doesn't implement 'observation' method. Maybe it implements deprecated '_observation' method.
/Users/anya/py-garage/pytorch-a3c/test.py:37: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  cx = Variable(torch.zeros(1, 256), volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:38: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  hx = Variable(torch.zeros(1, 256), volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:44: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  state.unsqueeze(0), volatile=True), (hx, cx)))
/Users/anya/py-garage/pytorch-a3c/train.py:55: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  prob = F.softmax(logit)
/Users/anya/py-garage/pytorch-a3c/test.py:45: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  prob = F.softmax(logit)
/Users/anya/py-garage/pytorch-a3c/train.py:56: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  log_prob = F.log_softmax(logit)
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/anya/py-garage/pytorch-a3c/train.py", line 60, in train
    action = prob.multinomial().data
TypeError: multinomial() missing 1 required positional arguments: "num_samples"
/Users/anya/py-garage/pytorch-a3c/test.py:40: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  cx = Variable(cx.data, volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:41: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  hx = Variable(hx.data, volatile=True)
Time 00h 00m 01s, num steps 0, FPS 0, episode reward -21.0, episode length 812

I tried to fix this error:
TypeError: multinomial() missing 1 required positional arguments: "num_samples"

diff --git a/train.py b/train.py
index 1b9c139..e3f0143 100644
--- a/train.py
+++ b/train.py
@@ -57,7 +57,7 @@ def train(rank, args, shared_model, counter, lock, optimizer=None):
             entropy = -(log_prob * prob).sum(1, keepdim=True)
             entropies.append(entropy)

-            action = prob.multinomial().data
+            action = prob.multinomial(num_samples=1).data
             log_prob = log_prob.gather(1, Variable(action))

             state, reward, done, _ = env.step(action.numpy())

I encountered a new error

~/py-garage/pytorch-a3c(master*) » python3 main.py --env-name "PongDeterministic-v4" --num-processes 1                                                        anya@turing-machine
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: gym.spaces.Box autodetected dtype as <class 'numpy.float32'>. Please provide explicit dtype.
WARN: <class 'envs.AtariRescale42x42'> doesn't implement 'observation' method. Maybe it implements deprecated '_observation' method.
WARN: <class 'envs.AtariRescale42x42'> doesn't implement 'observation' method. Maybe it implements deprecated '_observation' method.
/Users/anya/py-garage/pytorch-a3c/test.py:37: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  cx = Variable(torch.zeros(1, 256), volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:38: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  hx = Variable(torch.zeros(1, 256), volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:44: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  state.unsqueeze(0), volatile=True), (hx, cx)))
/Users/anya/py-garage/pytorch-a3c/train.py:55: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  prob = F.softmax(logit)
/Users/anya/py-garage/pytorch-a3c/train.py:56: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  log_prob = F.log_softmax(logit)
/Users/anya/py-garage/pytorch-a3c/test.py:45: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  prob = F.softmax(logit)
/Users/anya/py-garage/pytorch-a3c/test.py:40: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  cx = Variable(cx.data, volatile=True)
/Users/anya/py-garage/pytorch-a3c/test.py:41: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  hx = Variable(hx.data, volatile=True)
/Users/anya/py-garage/pytorch-a3c/train.py:108: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  torch.nn.utils.clip_grad_norm(model.parameters(), args.max_grad_norm)
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.6.4_4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/anya/py-garage/pytorch-a3c/train.py", line 111, in train
    optimizer.step()
  File "/Users/anya/py-garage/pytorch-a3c/my_optim.py", line 70, in step
    p.data.addcdiv_(-step_size, exp_avg, denom)
TypeError: addcdiv_() takes 2 positional arguments but 3 were given
Time 00h 00m 01s, num steps 20, FPS 13, episode reward -21.0, episode length 812

I am very confused and cannot fix it.
TypeError: addcdiv_() takes 2 positional arguments but 3 were given
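
For readers following the second traceback: the failing call is the last line of an Adam-style update, where `addcdiv_` adds `-step_size * exp_avg / denom` to each parameter element-wise. The sketch below spells out that arithmetic on plain Python lists, assuming the standard Adam formulation. The function name `adam_step` and its default hyperparameters are hypothetical; this is not the code in `my_optim.py` and it is not a fix for the TypeError.

```python
import math


def adam_step(param, grad, exp_avg, exp_avg_sq, step,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on plain lists (hypothetical helper, for illustration only).

    Returns the updated (param, exp_avg, exp_avg_sq).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    exp_avg = [beta1 * m + (1 - beta1) * g for m, g in zip(exp_avg, grad)]
    exp_avg_sq = [beta2 * v + (1 - beta2) * g * g
                  for v, g in zip(exp_avg_sq, grad)]

    # Bias corrections for the zero-initialised averages.
    bias_correction1 = 1 - beta1 ** step
    bias_correction2 = 1 - beta2 ** step
    step_size = lr * math.sqrt(bias_correction2) / bias_correction1

    denom = [math.sqrt(v) + eps for v in exp_avg_sq]

    # Element-wise equivalent of addcdiv_ with a scalar of -step_size:
    #   param += (-step_size) * exp_avg / denom
    param = [p - step_size * m / d for p, m, d in zip(param, exp_avg, denom)]
    return param, exp_avg, exp_avg_sq


new_param, new_exp_avg, new_exp_avg_sq = adam_step(
    param=[1.0], grad=[0.5], exp_avg=[0.0], exp_avg_sq=[0.0], step=1
)
```

On the very first step the parameter moves by roughly `lr` in the direction opposite to the gradient, which is a quick sanity check for any reimplementation of this update.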

@ikostrikov
Owner

ikostrikov commented May 8, 2018

Fixed in e898f75

Also, there are some warnings that do not affect the performance of the algorithm.

I will fix them closer to the 0.5.0 release.
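
Each warning in the log names its own replacement, so a sketch of the updated calls may help anyone who wants to silence them in the meantime. It assumes a recent PyTorch install; the layer sizes, the random input and `max_norm=50.0` are placeholder values, not the repository's settings.

```python
import torch
import torch.nn.functional as F

layer = torch.nn.Linear(4, 6)   # hypothetical stand-in for the policy head
state = torch.randn(1, 4)       # hypothetical observation

# Evaluation: torch.no_grad() replaces Variable(..., volatile=True).
with torch.no_grad():
    eval_logit = layer(state)

# An explicit dim replaces the deprecated implicit dimension choice.
prob = F.softmax(eval_logit, dim=-1)

# Training: same explicit dim for log_softmax, then clip gradients with
# clip_grad_norm_ (trailing underscore) instead of clip_grad_norm.
log_prob = F.log_softmax(layer(state), dim=-1)
log_prob.sum().backward()
total_norm = torch.nn.utils.clip_grad_norm_(layer.parameters(), max_norm=50.0)
```

Tensors produced inside the `torch.no_grad()` block do not track gradients, which is the behavior `volatile=True` used to request.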

@jiakai0419
Author

Your efficient response touched me.

@jiakai0419
Author

@ikostrikov I think the performance is affected in version 0.4.0

python3 main.py --env-name "PongDeterministic-v4" --num-processes 16
Time 00h 00m 09s, num steps 5031, FPS 519, episode reward -21.0, episode length 812
Time 00h 01m 10s, num steps 35482, FPS 501, episode reward -2.0, episode length 100
Time 00h 02m 11s, num steps 66664, FPS 505, episode reward -2.0, episode length 100
Time 00h 03m 13s, num steps 97058, FPS 503, episode reward -2.0, episode length 100
Time 00h 04m 14s, num steps 128517, FPS 504, episode reward -2.0, episode length 108
Time 00h 05m 24s, num steps 163141, FPS 502, episode reward -21.0, episode length 764
Time 00h 06m 34s, num steps 200426, FPS 508, episode reward -21.0, episode length 764
Time 00h 07m 57s, num steps 245725, FPS 514, episode reward -21.0, episode length 1942
Time 00h 09m 16s, num steps 284730, FPS 511, episode reward -21.0, episode length 1324
Time 00h 10m 41s, num steps 325153, FPS 507, episode reward -21.0, episode length 1324
Time 00h 12m 01s, num steps 361563, FPS 501, episode reward -21.0, episode length 1324
Time 00h 13m 28s, num steps 406910, FPS 503, episode reward -21.0, episode length 1964
Time 00h 14m 53s, num steps 450836, FPS 505, episode reward -21.0, episode length 1964
Time 00h 16m 22s, num steps 493876, FPS 503, episode reward -21.0, episode length 1964
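
To compare runs like this one without reading the log by eye, the status lines can be parsed into numbers. The sketch below is a hypothetical helper built only from the line format shown above; the names `LOG_LINE` and `parse_log_line` are not part of the repository.

```python
import re

# Pattern derived from the status lines printed in this thread.
LOG_LINE = re.compile(
    r"Time (\d+)h (\d+)m (\d+)s, num steps (\d+), FPS (\d+), "
    r"episode reward (-?\d+(?:\.\d+)?), episode length (\d+)"
)


def parse_log_line(line):
    """Return a dict of the fields in one status line, or None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    hours, minutes, seconds, steps, fps, reward, length = match.groups()
    return {
        "seconds": int(hours) * 3600 + int(minutes) * 60 + int(seconds),
        "num_steps": int(steps),
        "fps": int(fps),
        "reward": float(reward),
        "length": int(length),
    }


sample = "Time 00h 16m 22s, num steps 493876, FPS 503, episode reward -21.0, episode length 1964"
record = parse_log_line(sample)
```

Collecting `reward` and `length` over time makes it easier to see whether episodes are getting longer before the reward starts to move.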

@ikostrikov
Owner

How many cores do you have on your machine?

It seems to start learning something, since the length of the episodes goes up.

@jiakai0419
Author

  Number of Processors:	1
  Total Number of Cores:	2

I will test that on a 64-core machine.

@ikostrikov
Owner

On a 2-core machine it will just take a lot of time. I would expect a decent reward after 1h of training on Pong.
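
One way to avoid asking for far more workers than the machine has cores is to check the CPU count before choosing `--num-processes`. This is only a sketch: `suggested_num_processes` and its `reserve` parameter are hypothetical, and reserving one CPU for the evaluation process is an assumption rather than something the repository prescribes. Note that `os.cpu_count()` reports logical CPUs, which can exceed the number of physical cores.

```python
import os


def suggested_num_processes(requested, reserve=1):
    """Cap the requested worker count at the logical CPU count minus `reserve`.

    Hypothetical helper: always returns at least 1.
    """
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(requested, cpus - reserve))


workers = suggested_num_processes(16)
print(f"python3 main.py --env-name PongDeterministic-v4 --num-processes {workers}")
```

Running more workers than cores still works, as the log above shows; it just means the workers share CPU time and training takes longer in wall-clock terms.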

@ph-dev-2016

ph-dev-2016 commented Oct 7, 2018

I'm using pytorch-cpu 0.4.1 and Python 3.7 on Windows 7.

I still see this error: TypeError: multinomial() missing 1 required positional arguments: "num_samples"

@ikostrikov
Owner

@ph-dev-2016 Try with the most recent version of this repository.
