How to properly reset my Agent when he does something fatal? #19

Closed
FLOROID opened this issue May 12, 2022 · 5 comments
Comments


FLOROID commented May 12, 2022

Hello,

In your cartpole example, whenever the cylinder falls below a certain pitch, the cylinder and cart are reset to their original positions and training resumes. How did you implement this? I couldn't find the code for it in the example.

In my case, if my car drives into a wall, I'd like to give it a negative reward to punish that behavior and reset its position to the start of the race track. When that happens, do I pause sending observations until the car is back at its starting location and ready to go, or do I have to reset the training episode counter in some way?

It would be extremely helpful if you could shed some light on this for me :)

best regards,

FLOROID
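
For reference, the conventional reinforcement-learning pattern for a fatal event is to return a negative reward together with a terminal flag on that step, then reset the environment before the next observation is produced. The sketch below illustrates this with a toy one-dimensional track in plain Python. `ToyTrackEnv`, `CRASH_PENALTY`, and the reward values are hypothetical names and numbers chosen for illustration; it does not reproduce MindMaker's cartpole example or its Blueprint nodes.

```python
import random

CRASH_PENALTY = -10.0   # hypothetical penalty for hitting the wall
STEP_REWARD = 1.0       # hypothetical reward for each safe step
WALL_POSITION = 5       # the "wall" sits at this position


class ToyTrackEnv:
    """Toy 1-D track: the car starts at 0 and crashes at WALL_POSITION."""

    def reset(self):
        # Put the car back at the start and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # action: 0 = stay, 1 = move forward one cell
        self.position += action
        if self.position >= WALL_POSITION:
            # Fatal event: penalize and mark the episode as finished.
            return self.position, CRASH_PENALTY, True
        return self.position, STEP_REWARD, False


def run_episodes(env, num_episodes, max_steps=20, seed=0):
    rng = random.Random(seed)
    episode_returns = []
    for _ in range(num_episodes):
        obs = env.reset()                 # one reset at the start of each episode
        total = 0.0
        for _ in range(max_steps):
            action = rng.choice([0, 1])   # stand-in for the agent's policy
            obs, reward, done = env.step(action)
            total += reward
            if done:                      # crash ends the episode early
                break
        episode_returns.append(total)
    return episode_returns


if __name__ == "__main__":
    print(run_episodes(ToyTrackEnv(), num_episodes=3))
```

In this pattern the crash does not touch the episode counter directly: the terminal flag ends the current episode, and the next episode begins with a fresh reset.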

Author

FLOROID commented May 16, 2022

As of right now, I am resetting the position of my car to its starting location on the track whenever it runs into a wall. Until the simulation reports that the car is at its designated location and ready to go, the engine won't send any observations.
This seems to work well so far.
However, the one problem I've run into is that as soon as all training episodes are completed, the Receive action function returns the same action on every single call, which means, for example, that my car sometimes only drives in circles.
Do you have any good guesses as to why this might be happening?
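
The reset handling described in this comment (hold back observations until the car is back at its start) can be sketched as a small gate. This is an illustrative Python model of the idea, not the actual Blueprint logic; `ObservationGate` and its method names are hypothetical.

```python
class ObservationGate:
    """Drops observations while the simulation is resetting the car."""

    def __init__(self):
        self.resetting = False
        self.sent = []  # observations that were actually forwarded

    def on_crash(self):
        # Car hit a wall: start the reset and stop forwarding observations.
        self.resetting = True

    def on_reset_complete(self):
        # Simulation reports the car is back at its start and ready to go.
        self.resetting = False

    def submit(self, observation):
        # Forward the observation only when no reset is in progress.
        if self.resetting:
            return False
        self.sent.append(observation)
        return True


if __name__ == "__main__":
    gate = ObservationGate()
    gate.submit("lap position 1")
    gate.on_crash()
    gate.submit("mid-reset frame")   # dropped while resetting
    gate.on_reset_complete()
    gate.submit("start line")
    print(gate.sent)
```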

Owner

krumiaa commented May 16, 2022 via email

Author

FLOROID commented May 17, 2022

Okay, that's very reassuring, thank you <3

Do you have any idea why MindMaker always sends the same action during the evaluation phase, though?
Here is an example of this:
https://user-images.githubusercontent.com/58942125/168793879-330dd1e6-412e-4a7f-95dc-39531d1f9d3f.mp4
(You can see that as soon as the Training Phase finishes, the Predicted Action always displays 2 -> drive backwards)

I've set up everything else and my Agent is ready to train, but I'd need to be able to see what he is learning using the evaluation phase so I can adjust parameters if needed.
Inside the "Set members in MindMakerCustStructStart" node I also found a pin called "Done" (boolean). Does this need to be set to true when all training episodes are done and passed to MindMaker with the SendObservation function, or what is that pin used for?

Thank you in advance.
Once the training works properly I will also continue working on my documentation and I'll send you the full version when it is done so other people can get their hands dirty with this :D
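
One generic way to narrow down a constant-action problem like this is to log each observation together with the action received during evaluation, then check whether the observations vary at all. If the observations change while the action stays fixed, the trained policy is the place to look; if the observations are also constant, the input side deserves attention first. The helper below is a sketch in Python; `summarize_log` and the log format are assumptions for illustration and are not part of MindMaker.

```python
from collections import Counter


def summarize_log(log):
    """log: list of (observation, action) pairs recorded during evaluation."""
    action_counts = Counter(action for _, action in log)
    # Observations may be lists, so convert them to tuples to make them hashable.
    distinct_observations = {tuple(obs) for obs, _ in log}
    return {
        "action_counts": dict(action_counts),
        "distinct_observations": len(distinct_observations),
        "constant_action": len(action_counts) == 1,
    }


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [([0.1, 0.5], 2), ([0.2, 0.4], 2), ([0.3, 0.3], 2)]
    print(summarize_log(sample))
```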

Author

FLOROID commented May 22, 2022

I still desperately need help with this issue :)

Author

FLOROID commented Jun 10, 2022

I still need help with this.

FLOROID closed this as completed Jun 15, 2022