More information about the LongerExplorationPolicy #63
Comments
Hi dynamik, Basically the idea is that with purely random exploration, every possible ordered sequence of actions is equally likely. E.g., with a set of two possible actions {1,2} and two time steps, the sequences {11,12,21,22} all have the same probability 0.25 of being tried out. The length parameter should be chosen depending on your environment. Usually you'll have to try a few values empirically and see what works. Best,
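To make the uniform-probability point concrete, here is a minimal sketch in plain Python. It does not use the library's LongerExplorationPolicy; the helper names (`enumerate_sequences`, `sequence_probability`, `empirical_frequencies`) are hypothetical and only illustrate the two-action, two-step example above.

```python
import itertools
import random
from collections import Counter


def enumerate_sequences(actions, n_steps):
    """Return every ordered action sequence of length n_steps."""
    return list(itertools.product(actions, repeat=n_steps))


def sequence_probability(actions, n_steps):
    """Probability of any single sequence when each action is drawn
    independently and uniformly at every time step."""
    return (1.0 / len(actions)) ** n_steps


def empirical_frequencies(actions, n_steps, n_trials, seed=0):
    """Sample sequences with pure random exploration and return the
    observed frequency of each sequence."""
    rng = random.Random(seed)
    counts = Counter(
        tuple(rng.choice(actions) for _ in range(n_steps))
        for _ in range(n_trials)
    )
    return {seq: count / n_trials for seq, count in counts.items()}


actions = [1, 2]
sequences = enumerate_sequences(actions, 2)   # (1,1), (1,2), (2,1), (2,2)
prob = sequence_probability(actions, 2)       # 0.25 for each sequence
freqs = empirical_frequencies(actions, 2, 20000)
```

With more actions or more time steps the number of sequences grows as `len(actions) ** n_steps`, so the probability of any particular long sequence being tried shrinks quickly under pure random exploration.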
Hi VinF, thanks for your quick response! How do you evaluate the Ornstein-Uhlenbeck process in comparison? Best,
Hi Roman,
Hey VinF,
do you have more information about the LongerExplorationPolicy?
I'm wondering whether this policy is suitable for my environment. How should the length parameter be chosen?
Thanks!
Best wishes