How to properly reset my Agent when he does something fatal? #19
Comments
As of right now I am resetting the position of my car to its starting location on the track whenever it runs into a wall. Until the simulation reports that the car is at its designated location and ready to go, the engine won't send any observations.
The code for resetting the cart and pole is in the cylinder_bp class. There
is a branch node there that checks whether the pole has tilted below a certain
angle and resets everything if so. You will want to
pause sending observations while you reset the agent; otherwise actions will
continue to be generated and will result in bogus training data. You
do not have to reset the entire training counter. In fact, you want these
negative rewards to be passed to the neural network, and resetting the
training counter would defeat the purpose.
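For anyone who prefers to see that logic as code rather than Blueprint nodes, here is a minimal Python sketch of the same idea. It is not taken from the cartpole example: `ResetGate`, `tick`, `crashed`, `at_start` and the penalty value of -1.0 are all hypothetical names and numbers chosen for illustration. The penalty is sent once on the crash, nothing is sent while the car is being moved back, and the step counter keeps counting across resets.

```python
from enum import Enum, auto


class Phase(Enum):
    RUNNING = auto()
    RESETTING = auto()


class ResetGate:
    """Hypothetical helper that decides which ticks reach the learner."""

    def __init__(self, crash_penalty=-1.0):
        self.crash_penalty = crash_penalty  # assumed value, tune per task
        self.phase = Phase.RUNNING
        self.steps_sent = 0  # training counter, never reset on a crash

    def tick(self, observation, reward, crashed, at_start):
        """Return (observation, reward) to send, or None to stay silent."""
        if self.phase is Phase.RESETTING:
            if not at_start:
                return None  # car is still being moved back: send nothing
            self.phase = Phase.RUNNING  # car is in place, resume normally
        elif crashed:
            # Send the penalty once so the network sees it, then pause.
            self.phase = Phase.RESETTING
            reward = self.crash_penalty
        self.steps_sent += 1
        return observation, reward


if __name__ == "__main__":
    gate = ResetGate()
    print(gate.tick([0.5], 0.1, crashed=False, at_start=False))  # normal step
    print(gate.tick([0.9], 0.1, crashed=True, at_start=False))   # crash step
    print(gate.tick([0.9], 0.0, crashed=False, at_start=False))  # mid-reset
    print(gate.tick([0.0], 0.0, crashed=False, at_start=True))   # back at start
    print(gate.steps_sent)
```

In a real project the `crashed` and `at_start` flags would come from whatever collision and placement checks the simulation already has; the sketch only shows the gating around them.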
…On Thu, May 12, 2022 at 10:13 AM Florian Dittrich ***@***.***> wrote:
Hello,
in your cartpole example, whenever the Cylinder falls below a certain
pitch, the cylinder and cart are reset to their original position and the
training commences. How did you implement this, because I couldn't find the
code for it in the example.
In my case, if my car drives into a wall, I'd like to give it a negative
reward to punish it for that behavior and reset its position back to the
start of the race track. When that happens, do I pause sending observations
until the car is back at its location and ready to go or do I have to reset
the training episode counter in some way?
It would be extremely helpful if you could shed some light on this for me
:)
best regards,
FLOROID
Okay, that's very reassuring, thank you <3 Do you have any idea why Mindmaker always sends the same action during the evaluation phase, though? I've set up everything else and my agent is ready to train, but I need to be able to see what he is learning in the evaluation phase so I can adjust parameters if needed. Thank you in advance.
I still desperately need help with this issue :)
I still need help with this.