
Hi, how do I get a numpy array from a LazyFrame to simply play the trained nets? #19

JulesVerny opened this issue Dec 27, 2018 · 4 comments


@JulesVerny

I am into Chapter 7 of your book. It's really impressive, but many details are buried within the ptan package.
I believe I have trained a number of nets against Atari games in Chapter 7, but replaying them is causing me some frustration. I tried to modify the play-game code from Chapter 6, but now env.reset() / env.step() returns a ptan.common.wrappers.LazyFrames object, and it is not obvious how to convert this back into a plain numpy array to select a single best action when playing a trained game.
state_v = torch.tensor(np.array([state], copy=False))
returns a TypeError, as numpy does not understand the LazyFrames object type. It is not obvious how to convert a single observation (a LazyFrames) into a numpy array, and hence into a torch tensor to feed into the DQN network.
Hoping for some help so I can continue.

@Shmuma
Owner

Shmuma commented Dec 27, 2018

Hi!

LazyFrames is not my invention; it was copied from the standard OpenAI wrappers. The class is meant to avoid keeping multiple copies of the same frame, for instance when frames are stacked.

The class exposes the __array__ method, which is numpy's interface for converting anything array-like into an ndarray. So, to convert LazyFrames into an ndarray, you just need to call np.array(lazy_frames_instance). Your example, I guess, returns a TypeError because you're wrapping state in a list.

Hope this helps.
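For illustration, here is a minimal stand-in for LazyFrames (hypothetical, not the real implementation) showing how the __array__ protocol lets np.array perform the conversion:

```python
import numpy as np

# Minimal stand-in for LazyFrames (made up for illustration):
# it keeps the individual frames in a list and only stacks them into
# one ndarray when numpy asks for it via the __array__ protocol.
class LazyFrames:
    def __init__(self, frames):
        self._frames = frames

    def __array__(self, dtype=None):
        out = np.stack(self._frames, axis=0)
        return out.astype(dtype) if dtype is not None else out

state = LazyFrames([np.zeros((84, 84), dtype=np.uint8)] * 4)
arr = np.array(state)   # __array__ is invoked here
print(arr.shape)        # (4, 84, 84)
```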

@JulesVerny
Author

JulesVerny commented Dec 27, 2018

Hello, thanks for the quick reply. Yep, as you stated, the problem was passing it in as a list [state].
I have now tried the following, which gets me a little further:
state_v = torch.tensor(np.array(state, copy=False))  # does return a numpy array
q_vals = net(state_v).data.numpy()[0]

I am now stuck on a RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 4, 8, 8], but got 3-dimensional input of size [4, 84, 84] instead. I thought the Chapter 7 networks took single images into the convolutional input, but it appears the net object actually expects a batch dimension. A little frustrating, as all I want to do is watch the games play.
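For reference, the usual fix for that error is to add a batch axis in front of a single observation; a short sketch, with shapes assumed from the error message:

```python
import numpy as np

# A single observation of 4 stacked 84x84 frames has shape (4, 84, 84),
# but conv layers expect (batch, channels, height, width).
state = np.zeros((4, 84, 84), dtype=np.float32)
batch = np.expand_dims(state, axis=0)   # shape becomes (1, 4, 84, 84)
# In torch the equivalent is state_v.unsqueeze(0).
print(batch.shape)
```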

@Shmuma
Owner

Shmuma commented Dec 27, 2018

Looks like you're doing something wrong in terms of the wrappers applied. You shouldn't get LazyFrames as the state; other wrappers should be handling that. But it's hard to tell without the full code.

I suggest copying Chapter06/{03_dqn_play.py + lib/wrappers.py} into Chapter07 and changing the model construction (as you should use the model from Chapter 7). Then it should work as expected.

Alternatively, you could check the wrapper stack in your environment by printing it; that will show all the wrappers.
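A possible sketch of that idea, assuming gym's convention that each wrapper holds its inner environment in .env (the demo classes below are made up):

```python
def wrapper_stack(env):
    """Collect class names from the outermost wrapper down to the base env."""
    names = []
    while env is not None:
        names.append(type(env).__name__)
        env = getattr(env, "env", None)   # gym wrappers expose .env
    return names

# Made-up stand-ins for a wrapped environment.
class BaseEnv:
    pass

class ImageToPyTorch:
    def __init__(self, env):
        self.env = env

class FrameStack:
    def __init__(self, env):
        self.env = env

env = FrameStack(ImageToPyTorch(BaseEnv()))
print(wrapper_stack(env))   # ['FrameStack', 'ImageToPyTorch', 'BaseEnv']
```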

@JTatts

JTatts commented Jan 27, 2019

Hi Jules,

Maybe you've solved this by now, but I was stuck for a little while before I worked out the solution.
I think Shmuma's solution is the easiest, but there are two changes you need to make to wrappers.py.

  1. The network itself now does input normalisation, so you should remove ScaledFloatFrame from make_env.

  2. In ImageToPyTorch, moveaxis has been replaced by swapaxes (for efficiency?), so the shape of the input array to the network has changed.

After these two changes, everything works fine for me.
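The second point matters because moveaxis and swapaxes order the remaining axes differently on a 3-D image array; a quick check:

```python
import numpy as np

# An (H, W, C) frame: height 2, width 3, channels 4.
a = np.arange(2 * 3 * 4).reshape(2, 3, 4)

chw = np.moveaxis(a, 2, 0)   # other axes keep order: (C, H, W) = (4, 2, 3)
cwh = np.swapaxes(a, 0, 2)   # axes 0 and 2 trade places: (C, W, H) = (4, 3, 2)
print(chw.shape, cwh.shape)
```

For square 84x84 frames both give the same shape, but the height and width axes end up transposed, which is why the network input changes.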
