# Linear TD($\lambda$) Agent
As a first pass at building an agent to play Connect 4 in the Kaggle [ConnectX tournament](https://www.kaggle.com/c/connectx), I'll build an agent that acts greedily with respect to a linear value function. The value function is approximated using coarse-coded features and learned with the TD($\lambda$) algorithm.
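To make the learning rule concrete, here is a minimal sketch of one semi-gradient TD($\lambda$) step with accumulating eligibility traces for a linear value function $v(s) = w^\top x(s)$. It is not the code in `linear_TD_agent`; the function name `td_lambda_update`, the step size `alpha`, the discount `gamma`, and the trace decay `lam` are assumptions chosen for illustration, and `x` stands in for whatever coarse-coded feature vector the agent actually uses.

```python
import numpy as np

def td_lambda_update(w, z, x, x_next, reward,
                     alpha=0.1, gamma=1.0, lam=0.8, terminal=False):
    """One semi-gradient TD(lambda) step for a linear value function.

    w      -- weight vector
    z      -- eligibility trace (same shape as w)
    x      -- feature vector of the current state
    x_next -- feature vector of the next state (ignored if terminal)
    All hyperparameter defaults are illustrative assumptions.
    """
    v = w @ x
    # The value of a terminal state is defined to be zero.
    v_next = 0.0 if terminal else w @ x_next
    # TD error: how much better or worse the outcome was than predicted.
    delta = reward + gamma * v_next - v
    # Accumulating trace: for a linear v, the gradient w.r.t. w is just x.
    z = gamma * lam * z + x
    # Move every recently active feature's weight in proportion to its trace.
    w = w + alpha * delta * z
    return w, z

# Usage: a single terminal transition with reward 1.
w = np.zeros(3)
z = np.zeros(3)
x = np.array([1.0, 0.0, 0.0])
x_next = np.array([0.0, 1.0, 0.0])
w, z = td_lambda_update(w, z, x, x_next, reward=1.0, terminal=True)
```

With binary coarse-coded features, `x` is a 0/1 vector, so the trace simply records which features were active recently and how long ago. The trace should be reset to zeros at the start of each game.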

In [9]:
from linear_TD_agent import TDAgent
from RL_utils import train, watch_play, evaluate
from eval_agents import BaseAgent, StepPlay
import numpy as np
import h5py

First, we'll import weights from a prior training session.

In [10]:
with h5py.File('linear_weights.h5', 'r') as hf:
    w1 = hf["w1"][:]
    w2 = hf["w2"][:]
    w3 = hf["w3"][:]

Now we'll initialize agents with these weights and train them for 1000 games.

In [11]:
agent1 = TDAgent()
agent1.agent_init({"w": w1})

agent2 = TDAgent()
agent2.agent_init({"w": w2})

agent3 = TDAgent()
agent3.agent_init({"w": w3})

players = [agent1, agent2, agent3]

record = train(1000, players)

As a sanity check, since these three agents are meant to be evenly matched, we should see that player 1 wins roughly half the time.

In [12]:
print(len(record))
# Fraction of games won by player 1
print(np.mean(np.asarray(record) == 1))

1000
0.513
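
To judge whether a win rate like this is consistent with evenly matched players, one rough check is a normal approximation to the binomial distribution. The sketch below assumes each game is an independent trial with win probability `p0` and ignores draws; the helper name `win_rate_z_score` is hypothetical and not part of `RL_utils`.

```python
import math

def win_rate_z_score(win_rate, n_games, p0=0.5):
    """Z-score of an observed win rate under the null hypothesis p = p0.

    Assumes independent games and a normal approximation to the binomial.
    """
    standard_error = math.sqrt(p0 * (1.0 - p0) / n_games)
    return (win_rate - p0) / standard_error

# Usage with the figures printed above: 0.513 over 1000 games.
z = win_rate_z_score(0.513, 1000)
print(round(z, 2))
```

A z-score of roughly 0.82 is well inside two standard errors, so the observed rate does not contradict the assumption that the agents are evenly matched.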


We can also evaluate a particular agent against a random agent.

In [13]:
player1 = BaseAgent()
player1.agent_init()

players = [player1, agent3]

record = evaluate(1000, players)
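
For reference, a random baseline can be as simple as choosing uniformly among the columns that are not yet full. The sketch below is an assumption about what such an agent might do, not the actual `BaseAgent` implementation; it assumes a 6x7 NumPy board where `0` marks an empty cell and row `0` is the top row, matching the board printouts in this notebook. The function name `random_legal_move` is hypothetical.

```python
import numpy as np

def random_legal_move(board, rng):
    """Pick a uniformly random column whose top cell is still empty.

    board -- 2D integer array, 0 = empty, row 0 = top of the board (assumed)
    rng   -- a numpy.random.Generator
    """
    legal_columns = np.flatnonzero(board[0] == 0)
    if legal_columns.size == 0:
        raise ValueError("board is full; no legal moves")
    return int(rng.choice(legal_columns))

# Usage: an empty board with column 3 artificially filled to the top.
rng = np.random.default_rng(0)
board = np.zeros((6, 7), dtype=int)
board[:, 3] = 1
move = random_legal_move(board, rng)
```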

Alternatively, we can watch a single game between two agents, printed move by move.

In [14]:
watch_play([player1, agent3])

+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 2 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+

+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 1 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 2 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+

+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
| 0 | 0 