Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RL4J - Made a few fixes #8303

Merged
merged 4 commits into from Oct 31, 2019
Merged

Conversation

@aboulang2002
Copy link
Contributor

aboulang2002 commented Oct 19, 2019

What changes were proposed in this pull request?

As I am figuring out how to change the way observations are handled in RL4J, I found and fixed a few issues:

Changes to ObservationSpace and ActionSpace:

  • DiscreteSpace did not use org.nd4j.linalg.api.rng.Random
  • Moved ObservationSpace, ActionSpace, DiscreteSpace and ArrayObservationSpace to rl4j-core
  • Removed setSeed from ActionSpace (since we are now using an already seeded Random via Nd4J.getRandom())
  • Added setRandom to ActionSpace to help unit tests

Changes to initMDP:

  • initMDP did not check if !mdp.isDone()
  • In initMDP(), the history was being filled with only the initial observation. i.e. the very first observation was continuously fed to HistoryProcessor.add() instead of the current one (nextO)

Other:

  • The last observation of an epoch was not being recorded (HistoryProcessor.record())
    • Changed QLearningDiscrete, Policy.play()
    • Did not change AsyncThreadDiscrete. To be done in another PR; I don't want to change anything in this class before it's unit tested.
    • Added refac unit tests to Policy.play(). Note: It seems weird that it's the policy that has the play() method. There should be an Agent class. To be done later.

How was this patch tested?

Unit tests and manual tests

Quick checklist

The following checklist helps ensure your PR is complete:

  • Eclipse Contributor Agreement signed, and signed commits - see IP Requirements page for details
  • Reviewed the Contributing Guidelines and followed the steps within.
  • Created tests for any significant new code additions.
  • Relevant tests for your changes are passing.
Signed-off-by: unknown <aboulang2002@yahoo.com>
@saudet

This comment has been minimized.

Copy link
Member

saudet commented Oct 20, 2019

We can't move ObservationSpace, etc from gym-java-client because that creates circular dependencies. The plan is to get rid of gym-java-client, but for the moment please just leave it like it is now.

Signed-off-by: unknown <aboulang2002@yahoo.com>
@aboulang2002

This comment has been minimized.

Copy link
Contributor Author

aboulang2002 commented Oct 20, 2019

I reverted the move of ObservationSpace, ActionSpace and others from gym-java-client. Sorry for missing the circular dependency.

Signed-off-by: unknown <aboulang2002@yahoo.com>
@aboulang2002

This comment has been minimized.

Copy link
Contributor Author

aboulang2002 commented Oct 23, 2019

Added a unit test to AsyncThreadDiscrete and a few others.
AsyncThreadDiscrete seems to record all observations after all.

@saudet

This comment has been minimized.

Copy link
Member

saudet commented Oct 24, 2019

We can still make the modifications to the ones in gym-java-client though. Those will get moved later, so might as well make the changes there now.

Signed-off-by: Alexandre Boulanger <aboulang2002@yahoo.com>
@aboulang2002

This comment has been minimized.

Copy link
Contributor Author

aboulang2002 commented Oct 25, 2019

Good idea. I made the changes.

@saudet saudet self-requested a review Oct 31, 2019
@saudet
saudet approved these changes Oct 31, 2019
Copy link
Member

saudet left a comment

Confirmed working with the A3CALE and Cartpole examples.

@saudet saudet merged commit a2b973d into eclipse:master Oct 31, 2019
1 check passed
1 check passed
eclipsefdn/eca The author(s) of the pull request is covered by necessary legal agreements in order to proceed!
Details
@aboulang2002 aboulang2002 deleted the aboulang2002:ab2002_rl4j_minor_fixes branch Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.