Mindmaker is returning the same action for every receiveAction call during Evaluation phase. #21

Open
FLOROID opened this issue Jun 13, 2022 · 22 comments

Comments

@FLOROID

FLOROID commented Jun 13, 2022

I've set everything up in my project and the agent is training without any errors. However, when the Evaluation phase starts, he keeps executing the same action he executed in the first evaluation episode for every evaluation episode until the evaluation is done.
What could be causing this and how could I go about fixing this issue? Until I can actually properly evaluate what the agent is learning I can't really make use of the training, so this would be very crucial to fix for me.

@krumiaa
Owner

krumiaa commented Jun 13, 2022

It sounds like your agent learned an incorrect strategy and got caught in a local equilibrium rather than discovering whatever global strategy you wanted it to learn. I suggest changing the algorithm or experimenting with the hyperparameters. This is not at all uncommon; finding good hyperparameters can be tricky.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252754

@FLOROID
Author

FLOROID commented Jun 15, 2022

Oh okay, I'll look into it and let you know if I need more help with this. Thank you very much for the paper you sent me, especially since it also focusses on autonomous driving :)

By the way, could this have any influence on the results?
#17 (comment)

@FLOROID
Author

FLOROID commented Jun 15, 2022

What would be a good way to test whether it actually works properly? I've been playing around a little with my check reward function and the hyperparameters, but in the evaluation phase I can't get it to produce any action other than the single one it repeats for the rest of that phase.
I also found that in the MindMaker executable, during the Evaluation phase it is repeatedly printing out the line "Im rrestarting now how fun"
image
On top of that, the observations in there frequently print out a bunch of 0. values with no commas and nothing after the points. Is this normal?

I also tried setting the number of training episodes to 0 and just have it evaluate the randomly generated model, which always returns action 0 in the evaluation phase.

@FLOROID
Author

FLOROID commented Jun 16, 2022

UPDATE: @krumiaa
Okay so I've been changing a few things. Instead of a discrete action space I am now using a continuous action space of 2 float values between -1 and 1 that will go directly into the throttle and steering inputs of my car.
I figured this would simplify my work a bit more and also let me have some more insight into what the algorithm is actually producing. As it turns out this is running into the same issue as before but gave me a lot more insight.

The algorithm correctly produces a general strategy for avoiding a wall in this case. If, for example, I place the car facing a wall, it will produce a forward throttle input and a steering input that will steer him away from the wall in the direction of the road.
However, the float values of the actions that are produced do not change at all in the evaluation phase, which to my understanding should be impossible since the input values are changing simply by the car moving. It seems to me like the algorithm generalizes the entire dataset into one singular action which will simply be put out for every frame regardless of the inputs. This obviously isn't what we're aiming for, so I'm a little bit confused on how to fix this error since in the cartpole example it perfectly adapts to new inputs constantly.

Any idea what I could try?

many thanks in advance - Floroid <3
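
One possible explanation for outputs that stay identical while the inputs change, offered here as an assumption and not something confirmed in this thread: if the observations are large and unnormalised, a tanh-squashed (or clipped) action output can saturate, so every input maps to the same bound. The sketch below uses a hypothetical one-layer policy (`toy_policy`, with made-up weights and sensor values) purely to show the effect; it is not MindMaker's network.

```python
import math

# Hypothetical weights for a toy linear policy with two outputs
# (throttle, steering). These numbers are made up for illustration.
WEIGHTS = [[0.8, -0.5, 0.3],
           [-0.4, 0.9, 0.2]]


def toy_policy(observation):
    """Linear layer followed by tanh, rounded like a console printout."""
    actions = []
    for row in WEIGHTS:
        pre_activation = sum(w * x for w, x in zip(row, observation))
        actions.append(round(math.tanh(pre_activation), 4))
    return actions


def normalise(observation, scale):
    """Divide raw readings by an assumed maximum value."""
    return [x / scale for x in observation]


# Made-up raw sensor readings; distances measured in centimetres
# can easily be in the hundreds or thousands.
raw_a = [1200.0, 300.0, 50.0]
raw_b = [900.0, 350.0, 80.0]

# Saturated: both observations give the same pair of actions.
print(toy_policy(raw_a), toy_policy(raw_b))

# After scaling into a small range the outputs differ again.
print(toy_policy(normalise(raw_a, 2000.0)),
      toy_policy(normalise(raw_b, 2000.0)))
```

If something like this were the cause, scaling the observations on the Unreal side before sending them would be one thing to try.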

@krumiaa
Owner

krumiaa commented Jun 16, 2022 via email

@FLOROID
Author

FLOROID commented Jun 17, 2022

What doesn't quite click with me yet is why, while the inputs change, the outputs during the evaluation phase never change by even 0.0001. To my understanding that shouldn't be mathematically possible, so I'm a little fearful that the issue lies outside of the hyperparameters, but I'm not sure either ^^° It clearly works in the cartpole example, because the output changes depending on which side the pole is falling towards, but my output doesn't change at all @krumiaa
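
A minimal way to test the "outputs never change" observation in isolation is to feed a few hand-written observations directly to the prediction call and compare the results. The helper below is a sketch: `outputs_vary`, `stuck_policy` and `responsive_policy` are hypothetical names, and the two policies are stand-ins for whatever the trained model does. With Stable Baselines, the prediction function could presumably be something like `lambda obs: model.predict(obs)[0]`, since `predict` returns an `(action, state)` pair.

```python
def outputs_vary(predict_fn, observations, tolerance=1e-4):
    """Return True if any action component differs by more than tolerance."""
    actions = [list(predict_fn(obs)) for obs in observations]
    first = actions[0]
    for action in actions[1:]:
        if any(abs(a - b) > tolerance for a, b in zip(action, first)):
            return True
    return False


# Hypothetical stand-ins for a trained model's prediction call.
def stuck_policy(obs):
    return [1.0, -0.35]          # ignores its input entirely


def responsive_policy(obs):
    return [0.5, -0.1 * obs[0]]  # steering depends on the first reading


probe = [[0.0, 0.0], [1.0, 0.0], [-1.0, 0.5]]
print(outputs_vary(stuck_policy, probe))       # False
print(outputs_vary(responsive_policy, probe))  # True
```

If the model responds to hand-written observations but not inside the game, the observations arriving from the blueprint would be the next thing to inspect.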

@FLOROID
Author

FLOROID commented Jun 17, 2022

To further showcase the exact issue, here is a video example of a session with 25000 training episodes. At around 12 seconds the training is completed and he switches to the evaluation, where he suddenly starts doing only right turns with the same static outputs. During the training phase he seems to slowly learn to drive without any anomalies, as you can see in the first 12 seconds.

From what I understand, when I load up the model it will run however many evaluation episodes I told it to run again but not continue the training where I left off, even though it will still print out the Training episodes. It's just that in the MindMaker executable it doesn't say that training is in progress anymore.
image
^ 1 is when not loading a model and simply letting it train. 2 is when I load the model and want it to continue training.
Is there any way to load the model and continue training on it? @krumiaa

@FLOROID
Author

FLOROID commented Jun 17, 2022

So I ran into another weird anomaly :) @krumiaa
If I run MindMaker without loading any model, it does the correct amount of training episodes and the correct amount of evaluation episodes (regardless of whether the evaluation episodes actually represent the trained model in my case)
but if I do load the model, it will run the given number of training episodes and the number of eval episodes MINUS the number of given training episodes as evaluation episodes. Is this intentional? And if yes, why? ^^
On top of that, if I set the number of training episodes to a higher value than the number of eval episodes, it will produce as many training episodes as the input for the eval episodes but no evaluation episodes.
Either way the training episodes do not actually train the agent unless I start from fresh.
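
The episode counts described above can be written down as a small function. This is only a model of the observed behaviour (it would be consistent with training and evaluation sharing a single counter when a model is loaded); it is not taken from the MindMaker source, and `observed_episode_counts` is a hypothetical name.

```python
def observed_episode_counts(train_input, eval_input, model_loaded):
    """Return (training episodes run, evaluation episodes run) as observed."""
    if not model_loaded:
        # Fresh start: both phases run for the requested number of episodes.
        return train_input, eval_input
    # Loaded model: training is capped at the eval input, and evaluation
    # only gets whatever is left after subtracting the training episodes.
    train_run = min(train_input, eval_input)
    eval_run = max(eval_input - train_input, 0)
    return train_run, eval_run


print(observed_episode_counts(100, 300, model_loaded=False))  # (100, 300)
print(observed_episode_counts(100, 300, model_loaded=True))   # (100, 200)
print(observed_episode_counts(500, 300, model_loaded=True))   # (300, 0)
```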

I sincerely apologize for the amount of questions. I hope it's not too much.

@FLOROID
Author

FLOROID commented Jun 29, 2022

sorry to bother, but I still need help with these issues @krumiaa

@krumiaa
Owner

krumiaa commented Jun 30, 2022

When loading a model, try setting the number of training episodes to zero, or vice versa. I think it's currently configured for continuous training with the loaded model rather than just exploitation; that's probably what's causing the odd behavior. If you want to load a model only for demonstration, you could go to the Python source code included with the examples, dig through it, and modify it as necessary, following the code here:

https://stable-baselines.readthedocs.io/en/master/guide/examples.html
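
As a rough sketch of the two modes mentioned here (continue training versus demonstration only), the control flow could look like the function below. The method names `learn(total_timesteps=...)` and `predict(obs)` returning an `(action, state)` pair follow the Stable Baselines examples linked above; everything else (`run_session`, `StubModel`, `get_observation`) is hypothetical and exists only so the sketch runs without any RL library. It is not the code in MindMaker's source file.

```python
def run_session(model, train_timesteps, eval_steps, get_observation):
    """Optionally continue training, then collect actions without learning."""
    if train_timesteps > 0:
        # Continue training the (possibly loaded) model.
        model.learn(total_timesteps=train_timesteps)
    actions = []
    for _ in range(eval_steps):
        obs = get_observation()
        action, _state = model.predict(obs)  # no learning happens here
        actions.append(action)
    return actions


class StubModel:
    """Hypothetical stand-in for a trained model object."""

    def __init__(self):
        self.trained_timesteps = 0

    def learn(self, total_timesteps):
        self.trained_timesteps += total_timesteps
        return self

    def predict(self, observation):
        # Toy rule: negate each observation value.
        return [-x for x in observation], None


demo = StubModel()
demo_actions = run_session(demo, 0, 3, lambda: [0.2, -0.4])
print(demo.trained_timesteps, demo_actions)
```

Passing `0` for the training timesteps gives the demonstration-only behaviour; a positive value continues training before the evaluation loop starts.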

Currently working on some other projects related to mindmaker, if this is a mission critical for your project we could discuss the options for having me consult with you on this and modify the source as per your needs.

@FLOROID
Author

FLOROID commented Jun 30, 2022

Thank you very much @krumiaa I will look into it and let you know if consultation is needed :)
Does that mean I can continue training a model when I load it?

Also - do you have any clues as to why the evaluation phase is always spitting out the same action for me?
The problem doesn't occur in the cartpole example so it must either be an error in my blueprints or a bug.

@krumiaa
Owner

krumiaa commented Jul 1, 2022 via email

@FLOROID
Author

FLOROID commented Jul 3, 2022

Which file do I need to open to find this code, and where do I need to look for it? I assume it will be local to the project, but there are a lot of files and I'm not sure which one I'm looking for. @krumiaa

@FLOROID
Author

FLOROID commented Jul 7, 2022

Still needing help with this ^^ I'm very lost in the files haha @krumiaa

@FLOROID
Author

FLOROID commented Jul 11, 2022

Another quick reminder @krumiaa

@krumiaa
Owner

krumiaa commented Jul 11, 2022 via email

@FLOROID
Author

FLOROID commented Jul 13, 2022

So I'm in my project directory under the MindMaker folder searching for only *.py files but that brings up over 250 files.
Sadly I have no idea what kind of file name I'm looking for ^^° @krumiaa

I also ran this PowerShell command through the directory to search for files containing various search strings such as model.learn or model.predict(obs), but to no avail.
Get-ChildItem -Recurse | Select-String "SearchString" -List | Select Path
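
For comparison, the same search can be sketched in Python with a literal (non-regex) match. This matters for a string such as `model.predict(obs)`: `Select-String` treats its pattern as a regular expression unless `-SimpleMatch` is given, so the parentheses act as a group and the literal text is not matched. The function name `find_files_containing` and the demo file names are hypothetical.

```python
import os
import tempfile


def find_files_containing(root, needle, suffix=".py"):
    """Return sorted paths under root whose text contains needle literally."""
    matches = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(folder, name)
            with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                if needle in handle.read():
                    matches.append(path)
    return sorted(matches)


# Self-contained demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "agent.py"), "w", encoding="utf-8") as f:
        f.write("action, _ = model.predict(obs)\n")
    with open(os.path.join(demo_root, "other.py"), "w", encoding="utf-8") as f:
        f.write("print('hello')\n")
    demo_hits = [os.path.basename(p)
                 for p in find_files_containing(demo_root, "model.predict(obs)")]
    print(demo_hits)
```

In a real project, `root` would be the MindMaker folder. An empty result there would suggest that the Python source is simply not among the installed files.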

@krumiaa
Owner

krumiaa commented Jul 14, 2022

If you download the latest version of MindMaker DRL from the marketplace and look under Content\MindMaker\Source, there is a file called mindmakerUE5.py. This is the file you want.

@FLOROID
Author

FLOROID commented Jul 14, 2022

> If you download the latest version of MindMaker DRL from the marketplace and look under Content\MindMaker\Source, there is a file called mindmakerUE5.py. This is the file you want.

I'm currently using UE 4.27 for this project. Will this still work under that version?

I'm a little bit afraid of breaking something, because UE 4.27 is no longer listed under the supported engine versions on the marketplace page.
Porting the project to Unreal Engine 5 would most likely break a lot of my code and generally seems to be pretty hard.

@krumiaa
Owner

krumiaa commented Jul 15, 2022

I updated the marketplace listing of MindMaker so it has the 4.27 version. If you download it and go to Content\MindMaker\MindMakerSource, you will see a file called mindmakerUE4.py. This is the Python source file you will want to modify.

@FLOROID
Author

FLOROID commented Jul 18, 2022

> I updated the marketplace listing of MindMaker so it has the 4.27 version. If you download it and go to Content\MindMaker\MindMakerSource, you will see a file called mindmakerUE4.py. This is the Python source file you will want to modify.

That's wonderful news! I'll try it out now :)

@FLOROID
Author

FLOROID commented Jul 18, 2022

Migration to a new project worked fantastically. All the code still works and I can now see the mindmakerUE4.py. This is helping me a lot with understanding what's actually going on behind the scenes :)
The thing that I'm curious about is what the UEdone variable is actually used for
image
I remember seeing it in the struct for sending observations and launching MindMaker. From what I can tell so far, it makes MindMaker print out this:
image
Can you elaborate a bit on what this does exactly?
