Add example solving OpenAI Gym MountainCarContinuous env #149
This is pretty cool! However, a principal problem that I see is that this has an additional dependency, namely OpenAI Gym. I think there are two possible ways:
For the example itself:
|
Hey, thanks for tagging me here. It's the first time I've looked into CGP, it looks very cool and the installation was super easy, it just works. 🥇 Regarding this PR, I agree with @mschmidt87. A bit more information and visualization would be neat. I ran the script, it counted from 0 to 500 and in the end, something reached a fitness of 13.5. Is this a good or bad value? I would also like to see the found solution and whether this solution is sufficient to solve the task according to the criteria set by the Gym. |
Thanks for your comments! ✨ I don't like the idea of creating a separate I've added visualization of the solutions for every increase of the champion's fitness. Please check whether this works for you. 😬 I'm not sure what you mean by "logging the found solution" do you mean just print to the screen? ;) |
Actually I'm not sure what the criteria are for this environment. Do you know where to check this @weidel-p? |
Yes, it works for me. Looks very good :) 👍 You can find the threshold for solving the task in the leaderboard here: For this task, the threshold is at 90. |
Looks good, in principle, but I think there are too many animations at the end. Let's just animate the winning, final solution, and perhaps keep a switch to visualize all steps with fitness gain. As for logging the solution: yes, just print it to the screen. |
Regarding the additional dependency: What about adding something like:
try:
    import gym
except ImportError:
    raise ImportError("Could not import OpenAI Gym package. Please install it via `pip install gym`.") |
thanks for the reviews @weidel-p @mschmidt87 :) i've implemented your suggestions and am happy to report that the found solution passes the solving criteria for this environment. in order to achieve this i needed to adapt the fitness function a bit, putting more emphasis on the reward gathered rather than the number of episodes solved. please have another look! |
Very nice, @jakobj , just a minor suggestion: Why not setting the |
yes, i agree this would be the optimal case. however, the requirement is defined as ">90 fitness averaged over 100 consecutive trials". unfortunately we cannot use this as an objective during evolution, since only high-performance solutions are able to complete this large number of trials in a reasonable amount of time. hence i opted for this specific combination of objective + min fitness. |
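To make the criterion quoted above concrete, here is a minimal sketch of how one might check it on a list of per-episode returns. The function name `is_solved` and the constant names are hypothetical and not taken from this PR; only the threshold (90) and the window length (100 consecutive trials) come from the discussion.

```python
from collections import deque
from statistics import mean

SOLVED_THRESHOLD = 90.0  # threshold quoted in this thread
N_CONSECUTIVE = 100      # number of consecutive trials quoted in this thread


def is_solved(episode_returns, threshold=SOLVED_THRESHOLD, n=N_CONSECUTIVE):
    """Return True if some window of n consecutive returns averages above threshold.

    Hypothetical helper for illustration; not part of the PR or of Gym.
    """
    window = deque(maxlen=n)  # keeps only the n most recent returns
    for episode_return in episode_returns:
        window.append(episode_return)
        if len(window) == n and mean(window) > threshold:
            return True
    return False
```

A check like this needs at least 100 full episodes per candidate, which illustrates why it is expensive to use directly as the objective during evolution.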
Okay, understood. As a minor change, I would suggest to use |
fixed, pls have another look :) |
LGTM
This example demonstrates how to solve an OpenAI Gym environment with CGP. The fitness function is based on empirical observations to maximize solving speed. The found solution is almost always of the form $c \dot{x}$, sometimes $c \dot{x} - x$ for that last bit of efficiency (try it!). The solution intuitively makes sense, doesn't it? ;) #interpretableRL
tagging our mountaincar expert @weidel-p
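As a rough illustration of why the reported form makes sense, the sketch below writes it as a policy: pushing in the direction of the current velocity pumps energy into the car until it can climb the hill. It assumes the MountainCarContinuous observation is `(position, velocity)` and that actions are limited to `[-1, 1]`. The function name `velocity_policy` and the default `c=10.0` are hypothetical; the constant actually found by evolution is not stated in this thread.

```python
def velocity_policy(observation, c=10.0, use_position=False):
    """Map an observation (position, velocity) to a force in [-1, 1].

    Illustrative only: c is an arbitrary placeholder, not the evolved constant.
    """
    position, velocity = observation
    action = c * velocity      # the "c * x_dot" form
    if use_position:
        action -= position     # the "c * x_dot - x" variant
    # Clip to the assumed action bounds of the environment.
    return max(-1.0, min(1.0, action))
```

For example, `velocity_policy((-0.5, 0.07), c=100.0)` saturates at the upper action bound, while a car at rest gets no push unless `use_position=True`.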