# Q-Learning example

The following cells train a **QController** to play the game. The **QController** holds a Q-table that stores the Q-value for each (state, action) pair. Because the game's real states and actions are continuous, both are discretized before they are used as table indices.
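To make the Q-table idea concrete, here is a minimal sketch of a discretized table and the standard tabular Q-learning update. It is not the actual **QController** code: the state dimensions (`angle`, `distance`), their ranges, the bin count, and the discount factor are all assumptions chosen for illustration.

```python
import math

# Hypothetical sizes, not taken from pod.ai.q_controller
N_BINS = 4      # bins per continuous state dimension (assumed)
N_ACTIONS = 3   # number of discrete actions (assumed)

def discretize(value, low, high, n_bins=N_BINS):
    """Map a continuous value in [low, high] to a bin index in [0, n_bins - 1]."""
    clipped = min(max(value, low), high)
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # the upper bound falls into the last bin

def state_index(angle, distance):
    """Combine the per-dimension bins into a single Q-table row index."""
    a = discretize(angle, -math.pi, math.pi)   # assumed angle range
    d = discretize(distance, 0.0, 1000.0)      # assumed distance range
    return a * N_BINS + d

# One row per discrete state, one column per discrete action
q_table = [[0.0] * N_ACTIONS for _ in range(N_BINS * N_BINS)]

def q_update(q_table, s, a, reward, s_next, learning_rate, discount=0.95):
    """Q(s, a) += lr * (reward + discount * max_a' Q(s', a') - Q(s, a))."""
    target = reward + discount * max(q_table[s_next])
    q_table[s][a] += learning_rate * (target - q_table[s][a])

# Usage: one update for a hypothetical transition
s = state_index(0.0, 0.0)
s_next = state_index(0.5, 900.0)
q_update(q_table, s, a=1, reward=1.0, s_next=s_next, learning_rate=0.5)
print(q_table[s])
```

A coarser grid keeps the table small and fast to fill, at the cost of treating distinct real states as identical.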

In [None]:
from pod.board import PodBoard
from pod.ai.q_controller import QController

board = PodBoard()
controller = QController(board)

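Here, we train the controller in a nested sweep. The learning rate steps down from 1.0 to 0.1, and for each learning rate the probability of taking a random (exploratory) action also steps down from 1.0 to 0.1. Each combination runs for 700 episodes.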
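The random-action probability is typically used for epsilon-greedy selection: with that probability the agent explores, otherwise it takes the action with the highest Q-value. The sketch below shows the idea under that assumption. `choose_action` is a hypothetical helper, not part of **QController**, whose internals are not shown here.

```python
import random

def choose_action(q_values, prob_rand_action, rng=random):
    """Epsilon-greedy selection over the Q-values of one state.

    q_values: list of Q-values, one per discrete action.
    prob_rand_action: probability of picking a uniformly random action.
    """
    if rng.random() < prob_rand_action:
        # Explore: pick any action uniformly at random
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest Q-value
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Usage: no exploration always returns the greedy action
greedy = choose_action([0.1, 0.9, 0.3], prob_rand_action=0.0)
print(greedy)
```

Starting with a high probability fills in the table broadly. Lowering it later lets the controller spend more episodes refining the actions it already believes are good.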

In [None]:
import matplotlib.pyplot as plt

rewards = []
for rate in range(10):
    lr = (10 - rate) / 10
    for p in range(10):
        prob = (10 - p) / 10
        print("Running with learning rate {} random probability {}".format(lr, prob))
        results = controller.train(
            num_episodes=700,
            prob_rand_action=prob,
            learning_rate=lr
        )
        avg = sum(results) / len(results)
        print("   ---> Average cumulative reward: {}".format(avg))
        rewards.append(avg)

plt.plot(rewards)
plt.legend(["Average cumulative reward per training run"])
plt.show()

In [None]:
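# Second pass: hold the learning rate at 0.5 and reduce the
# random-action probability from 1.0 down to 0.02.
rewards = []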
for rand in range(50):
    prob_rand = (50 - rand) / 50
    results = controller.train(
        num_episodes=500,
        prob_rand_action=prob_rand,
        learning_rate=0.5
    )
    avg = sum(results) / len(results)
    print("prob_rand {} ---> Average cumulative reward: {}".format(prob_rand, avg))
    rewards.append(avg)

plt.plot(rewards)
plt.legend(["Average cumulative reward per training run"])
plt.show()

Now that the controller has been trained, let's see the result! We run it alongside a **SimpleController** for comparison.

In [None]:
from IPython.display import Image
from pod.controller import SimpleController
from pod.drawer import Drawer
from pod.game import Player

TURNS = 200  # number of turns passed to the drawer

q_player = Player(controller)
simple_player = Player(SimpleController())

drawer = Drawer(board, [q_player, simple_player])

q_player.reset(board)
simple_player.reset(board)

drawer.animate(TURNS)

In [None]:
q_player.reset(board)
simple_player.reset(board)

drawer.chart_rewards(TURNS)