Compares RL algorithms (bandits, Q-learning, etc.) with similar algorithms that use inference. Psi-Auto is an algorithm that automatically tunes the inverse temperature.
Updated May 13, 2023 - Python
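For readers unfamiliar with the term, the sketch below illustrates what an inverse temperature does in a plain softmax (Boltzmann) bandit. It is not Psi-Auto and is not taken from the repository: the names `softmax_policy` and `run_bandit`, the Gaussian reward model, and the fixed, hand-set `beta` are all assumptions made for illustration. An auto-tuning method would adjust `beta` during learning instead of fixing it.

```python
import math
import random


def softmax_policy(q_values, beta):
    """Action probabilities proportional to exp(beta * Q).

    beta is the inverse temperature: beta = 0 gives a uniform policy,
    and a large beta approaches greedy selection.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def run_bandit(true_means, beta, steps=1000, seed=0):
    """Hypothetical Gaussian bandit with a fixed inverse temperature."""
    rng = random.Random(seed)
    n = len(true_means)
    q = [0.0] * n      # running estimate of each arm's mean reward
    counts = [0] * n   # number of pulls per arm
    total_reward = 0.0
    for _ in range(steps):
        probs = softmax_policy(q, beta)
        arm = rng.choices(range(n), weights=probs)[0]
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return q, counts, total_reward


if __name__ == "__main__":
    for beta in (0.0, 1.0, 5.0):
        _, counts, total = run_bandit([0.0, 0.5, 1.0], beta)
        print(f"beta={beta}: pulls per arm={counts}, total reward={total:.1f}")
```

Comparing the pull counts across `beta` values shows the exploration-exploitation trade-off that a hand-set inverse temperature controls, which is the parameter the description says Psi-Auto tunes automatically.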