Fully support from/to numpy/pytorch for Batch #62

Closed
duburcqa opened this issue May 29, 2020 · 4 comments · Fixed by #63, #68 or #73
Labels
enhancement Feature that is not a new algorithm or an algorithm enhancement

Comments

@duburcqa
Collaborator

The current implementation of PPO and the other policy algorithms does not support action dicts because of this line.

It could be solved by adding a new method to the Batch class that converts the relevant fields back to torch.Tensor.

I'm opening a PR to fix that.
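
As a rough illustration of the conversion proposed above, here is a minimal sketch that recursively turns the numpy fields of a nested action dict into torch tensors. The helper name `to_torch` and its parameters are hypothetical and chosen for this example; this is not necessarily the method the PR adds to Batch.

```python
import numpy as np
import torch


def to_torch(data, dtype=None, device="cpu"):
    """Recursively convert numpy arrays in a (possibly nested) dict to tensors.

    Hypothetical helper for illustration only, not the Batch API.
    """
    if isinstance(data, dict):
        return {k: to_torch(v, dtype, device) for k, v in data.items()}
    if isinstance(data, np.ndarray):
        # from_numpy avoids a copy on CPU; .to() moves or casts only if needed
        tensor = torch.from_numpy(data).to(device)
        return tensor if dtype is None else tensor.to(dtype)
    if isinstance(data, torch.Tensor):
        tensor = data.to(device)
        return tensor if dtype is None else tensor.to(dtype)
    return data  # leave non-array fields untouched


# An action made of several named parts, as an action dict would be
action = {
    "move": np.zeros((4, 2), dtype=np.float32),
    "grip": np.ones(4, dtype=np.int64),
}
action_t = to_torch(action)
```

Walking the dict recursively keeps the structure of the action intact, so a policy can still index the converted result by key.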

@duburcqa
Collaborator Author

Also, torch.tensor should not be used to convert a numpy array to a torch tensor, since it is less efficient and breaks memory sharing on CPU.

@Trinkle23897
Collaborator

> Also, torch.tensor should not be used to convert a numpy array to a torch tensor, since it is less efficient and breaks memory sharing on CPU.

In most scenarios, the agent's action contains only a few elements, so the conversion cost is negligible.
Currently, the replay buffer is stored as np.ndarray. If you want to prohibit the conversion, the underlying data structure would have to change to torch.Tensor. But I don't think that is a good approach, since GPU memory is far smaller than RAM, and some basic operations (e.g. computing the returns) are more efficient on the CPU side.
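
To make the CPU-side point concrete, here is a minimal sketch of computing discounted returns directly on np.ndarray data, as it might sit in a replay buffer. The function name `discounted_returns` and the `rew`/`done` layout are assumptions made for this example; it is not the library's implementation.

```python
import numpy as np


def discounted_returns(rew, done, gamma=0.99):
    """Compute discounted returns backwards over a flat buffer of steps.

    Hypothetical helper: `rew` holds per-step rewards and `done` is 1.0
    on the last step of each episode, 0.0 otherwise.
    """
    returns = np.zeros(len(rew), dtype=np.float64)
    running = 0.0
    for i in reversed(range(len(rew))):
        # Drop the carried-over return when step i ends an episode
        running = rew[i] + gamma * running * (1.0 - done[i])
        returns[i] = running
    return returns


rew = np.array([1.0, 1.0, 1.0])
done = np.array([0.0, 0.0, 1.0])
returns = discounted_returns(rew, done, gamma=0.5)
print(returns)  # [1.75 1.5  1.  ]
```

The loop is sequential and touches one element per step, which is the kind of work that gains little from a GPU.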

@duburcqa
Collaborator Author

duburcqa commented May 29, 2020

I'm not saying we should use only torch.tensor, but that the arrays should be converted using torch.from_numpy. Sorry for the lack of clarity :/
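
The difference between the two conversions can be seen in a short sketch: `torch.from_numpy` returns a tensor that shares memory with the source array on CPU, while `torch.tensor` always copies the data. The variable names are illustrative only.

```python
import numpy as np
import torch

arr = np.zeros(3, dtype=np.float32)

shared = torch.from_numpy(arr)  # shares the array's memory on CPU, no copy
copied = torch.tensor(arr)      # always copies the data

# Modify the numpy array in place after both conversions
arr[0] = 1.0

print(shared[0].item())  # 1.0, the change is visible through the shared tensor
print(copied[0].item())  # 0.0, the copy is unaffected
```

Because the shared tensor aliases the array, in-place writes on either side are visible on the other, which is worth keeping in mind when the array belongs to a replay buffer.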

@duburcqa
Collaborator Author

duburcqa commented May 29, 2020

@Trinkle23897 The PR should be ready now. I tried to make only the minimal modifications needed to fully support Batch from/to numpy/pytorch.

@duburcqa duburcqa changed the title from "Add support of action dict" to "Fully support from/to numpy/pytorch for Batch" May 29, 2020
@Trinkle23897 Trinkle23897 added the enhancement Feature that is not a new algorithm or an algorithm enhancement label Jun 1, 2020
@Trinkle23897 Trinkle23897 added this to TODO in Issue/PR Categories via automation Jun 1, 2020
@Trinkle23897 Trinkle23897 moved this from TODO to Other feature requests in Issue/PR Categories Jun 1, 2020