Reinforcement learning with tabular methods: TD-learning (Q-learning and SARSA) and a MENACE-like approach, applied to a Rubik's cube with a move set restricted to 180-degree turns.
reinforcement-learning
q-learning
epsilon-greedy
sarsa
simulated-annealing
td-learning
softmax
menace-matchboxes
Updated Aug 1, 2021 - C
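
For readers unfamiliar with the two TD-learning variants named in the description, here is a minimal sketch of how their tabular updates differ. It is not taken from the repository: the table sizes `N_STATES` and `N_ACTIONS`, the function names, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are all hypothetical choices for illustration, and a real cube solver would need a far larger state encoding.

```c
#include <stddef.h>

/* Hypothetical toy sizes for illustration, not the cube's real state count. */
#define N_STATES  4
#define N_ACTIONS 2

/* Index of the highest-valued action in one row of the Q-table. */
static size_t greedy_action(const double q_row[N_ACTIONS]) {
    size_t best = 0;
    for (size_t a = 1; a < N_ACTIONS; a++) {
        if (q_row[a] > q_row[best]) {
            best = a;
        }
    }
    return best;
}

/* Q-learning (off-policy): bootstrap from the best action in s_next,
 * regardless of which action the agent takes there. */
static void q_learning_update(double q[N_STATES][N_ACTIONS],
                              size_t s, size_t a, double r, size_t s_next,
                              double alpha, double gamma) {
    double target = r + gamma * q[s_next][greedy_action(q[s_next])];
    q[s][a] += alpha * (target - q[s][a]);
}

/* SARSA (on-policy): bootstrap from a_next, the action actually chosen
 * in s_next by the behaviour policy (e.g. epsilon-greedy or softmax). */
static void sarsa_update(double q[N_STATES][N_ACTIONS],
                         size_t s, size_t a, double r,
                         size_t s_next, size_t a_next,
                         double alpha, double gamma) {
    double target = r + gamma * q[s_next][a_next];
    q[s][a] += alpha * (target - q[s][a]);
}
```

The only difference between the two is the bootstrap term: Q-learning uses the maximum over next actions, while SARSA uses the value of the action the policy actually selected, so exploration (epsilon-greedy, softmax) affects what SARSA learns.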