# Baseline Evaluation

We provide three simple agents as baselines: a random agent, a greedy agent, and a greedy defending agent.

The random agent chooses randomly from the available positions. The greedy agent looks for the move that completes its longest sequence that can still be extended to five in a row. The greedy defending agent additionally looks for the opponent's greedy move and blocks it when appropriate.

In this notebook we evaluate these agents by running them against each other.

In [1]:
import numpy as np
import pandas as pd
from tqdm import trange
from time import sleep

# If this import fails, the repository root may need to be added to sys.path first.
from gomoku.manager import GameManager

In [2]:
agent_names = ["random", "greedy", "greedy_defender"]

def evaluate_game(game: GameManager, runs: int = 100):
    results_df = pd.DataFrame(columns=['black', 'white', 'wins (black)', 'wins (white)', 'ties'])
    for agent1 in agent_names:
        for agent2 in agent_names:
            print(f"Running {agent1} against {agent2}")
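            sleep(0.5)  # short pause, presumably so the message prints before the progress bar

            # Tally indexed by winner.value: slot 0 is unused, 1 = agent1 (black),
            # 2 = agent2 (white); a tie (-1) lands in the last slot via results[-1].
            results = [0] * 4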
            for _ in trange(runs):
                winner = game.run_game(agent1, agent2)
                results[winner.value] += 1

            results_df.loc[len(results_df)] = [agent1, agent2] + results[1:]
            print(f"Results: black({agent1}): {results[1]}, white({agent2}): {results[2]}, ties: {results[3]}")
    return results_df
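
The tally in `evaluate_game` relies on Python's negative indexing: a tie has value -1, so `results[-1]` is the last slot of the four-element list, while slot 0 is never used. The sketch below reproduces that bookkeeping with a stand-in enum. `Outcome` and its member names are hypothetical; only the values 1, 2 and -1 come from the comment in the code above.

```python
from enum import Enum


class Outcome(Enum):  # hypothetical stand-in for the value returned by run_game
    BLACK = 1
    WHITE = 2
    TIE = -1


def tally(outcomes):
    """Count outcomes the same way evaluate_game does."""
    results = [0] * 4  # [unused, black wins, white wins, ties]
    for outcome in outcomes:
        results[outcome.value] += 1  # -1 indexes the last slot
    return results[1:]  # drop the unused slot 0


counts = tally([Outcome.BLACK, Outcome.TIE, Outcome.WHITE, Outcome.TIE, Outcome.BLACK])
print(counts)  # black wins, white wins, ties
```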

In [3]:
# Pivot the results into a grid with black as rows and white as columns.
# Each cell reads "black wins/white wins/ties".
def pivot_table(df_input: pd.DataFrame):
    df = df_input.copy()
    idx = pd.IndexSlice
    df = df.pivot(index='black', columns='white', values=['wins (black)', 'wins (white)', 'ties']).astype(str)
    df = df.loc[:,idx['wins (black)',:]] + '/' \
        + df.loc[:,idx['wins (white)',:]].values + '/' \
        + df.loc[:,idx['ties',:]].values
    df = df.droplevel(axis=1, level=0)
    return df

## 5 x 5 board

In [4]:
df_5 = evaluate_game(GameManager(size=5, quiet=True))

Running random against random
100%|██████████| 100/100 [00:01<00:00, 89.06it/s]
Results: black(random): 19, white(random): 17, ties: 64
Running random against greedy
100%|██████████| 100/100 [00:03<00:00, 28.47it/s]
Results: black(random): 0, white(greedy): 100, ties: 0
Running random against greedy_defender
100%|██████████| 100/100 [00:06<00:00, 16.29it/s]
Results: black(random): 0, white(greedy_defender): 97, ties: 3
Running greedy against random
100%|██████████| 100/100 [00:03<00:00, 30.64it/s]
Results: black(greedy): 100, white(random): 0, ties: 0
Running greedy against greedy
100%|██████████| 100/100 [00:05<00:00, 18.32it/s]
Results: black(greedy): 72, white(greedy): 28, ties: 0
Running greedy against greedy_defender
100%|██████████| 100/100 [00:10<00:00,  9.29it/s]
Results: black(greedy): 0, white(greedy_defender): 89, ties: 11
Running greedy_defender against random
100%|██████████| 100/100 [00:06<00:00, 16.59it/s]
Results: black(greedy_defender): 96, white(random): 0, ties: 4
[output truncated]

In [5]:
df_5

| | black | white | wins (black) | wins (white) | ties |
|---|---|---|---|---|---|
| 0 | random | random | 19 | 17 | 64 |
| 1 | random | greedy | 0 | 100 | 0 |
| 2 | random | greedy_defender | 0 | 97 | 3 |
| 3 | greedy | random | 100 | 0 | 0 |
| 4 | greedy | greedy | 72 | 28 | 0 |
| 5 | greedy | greedy_defender | 0 | 89 | 11 |
| 6 | greedy_defender | random | 96 | 0 | 4 |
| 7 | greedy_defender | greedy | 100 | 0 | 0 |
| 8 | greedy_defender | greedy_defender | 0 | 0 | 100 |


In [6]:
pivot_table(df_5)

| black \ white | greedy | greedy_defender | random |
|---|---|---|---|
| greedy | 72/28/0 | 0/89/11 | 100/0/0 |
| greedy_defender | 100/0/0 | 0/0/100 | 96/0/4 |
| random | 0/100/0 | 0/97/3 | 19/17/64 |


The greedy agent clearly beats the random agent, and the greedy defender clearly beats the greedy agent, regardless of which side moves first. Games between two greedy defenders all ended in ties, which is unsurprising on a small board where the opponent's moves are easy to block.
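
One way to see why small boards favour ties is to count the distinct five-in-a-row lines available on an n x n board. The sketch below enumerates them by brute force and checks the count against the closed form `2*n*(n-4) + 2*(n-4)**2` (rows and columns, then the two diagonal directions). The function name is illustrative and the code is not part of the `gomoku` package.

```python
def count_five_lines(n: int) -> int:
    """Count all placements of five consecutive cells on an n x n board."""
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    total = 0
    for r in range(n):
        for c in range(n):
            for dr, dc in directions:
                # A line starting at (r, c) fits if its fifth cell is on the board.
                end_r, end_c = r + 4 * dr, c + 4 * dc
                if 0 <= end_r < n and 0 <= end_c < n:
                    total += 1
    return total


for size in (5, 7, 10, 14, 19):
    closed_form = 2 * size * (size - 4) + 2 * (size - 4) ** 2
    assert count_five_lines(size) == closed_form
    print(size, count_five_lines(size))
```

A 5 x 5 board has only 12 such lines, against 1020 on a 19 x 19 board, so a defender on the small board has far fewer potential threats to cover.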

## 7 x 7 board

In [7]:
df_7 = evaluate_game(GameManager(size=7, quiet=True))

Running random against random
100%|██████████| 100/100 [00:01<00:00, 65.12it/s]
Results: black(random): 51, white(random): 43, ties: 6
Running random against greedy
100%|██████████| 100/100 [00:06<00:00, 14.54it/s]
Results: black(random): 0, white(greedy): 100, ties: 0
Running random against greedy_defender
100%|██████████| 100/100 [00:14<00:00,  6.79it/s]
Results: black(random): 0, white(greedy_defender): 100, ties: 0
Running greedy against random
100%|██████████| 100/100 [00:06<00:00, 14.48it/s]
Results: black(greedy): 100, white(random): 0, ties: 0
Running greedy against greedy
100%|██████████| 100/100 [00:11<00:00,  8.61it/s]
Results: black(greedy): 92, white(greedy): 8, ties: 0
Running greedy against greedy_defender
100%|██████████| 100/100 [00:20<00:00,  4.85it/s]
Results: black(greedy): 0, white(greedy_defender): 100, ties: 0
Running greedy_defender against random
100%|██████████| 100/100 [00:12<00:00,  7.76it/s]
Results: black(greedy_defender): 100, white(random): 0, ties: 0
[output truncated]

In [8]:
df_7

| | black | white | wins (black) | wins (white) | ties |
|---|---|---|---|---|---|
| 0 | random | random | 51 | 43 | 6 |
| 1 | random | greedy | 0 | 100 | 0 |
| 2 | random | greedy_defender | 0 | 100 | 0 |
| 3 | greedy | random | 100 | 0 | 0 |
| 4 | greedy | greedy | 92 | 8 | 0 |
| 5 | greedy | greedy_defender | 0 | 100 | 0 |
| 6 | greedy_defender | random | 100 | 0 | 0 |
| 7 | greedy_defender | greedy | 100 | 0 | 0 |
| 8 | greedy_defender | greedy_defender | 1 | 0 | 99 |


In [9]:
pivot_table(df_7)

| black \ white | greedy | greedy_defender | random |
|---|---|---|---|
| greedy | 92/8/0 | 0/100/0 | 100/0/0 |
| greedy_defender | 100/0/0 | 1/0/99 | 100/0/0 |
| random | 0/100/0 | 0/100/0 | 51/43/6 |


## 10 x 10 board

In [10]:
df_10 = evaluate_game(GameManager(size=10, quiet=True))

Running random against random
100%|██████████| 100/100 [00:02<00:00, 36.94it/s]
Results: black(random): 48, white(random): 52, ties: 0
Running random against greedy
100%|██████████| 100/100 [00:15<00:00,  6.45it/s]
Results: black(random): 0, white(greedy): 100, ties: 0
Running random against greedy_defender
100%|██████████| 100/100 [00:31<00:00,  3.16it/s]
Results: black(random): 0, white(greedy_defender): 100, ties: 0
Running greedy against random
100%|██████████| 100/100 [00:15<00:00,  6.47it/s]
Results: black(greedy): 100, white(random): 0, ties: 0
Running greedy against greedy
100%|██████████| 100/100 [00:26<00:00,  3.74it/s]
Results: black(greedy): 100, white(greedy): 0, ties: 0
Running greedy against greedy_defender
100%|██████████| 100/100 [01:08<00:00,  1.46it/s]
Results: black(greedy): 7, white(greedy_defender): 93, ties: 0
Running greedy_defender against random
100%|██████████| 100/100 [00:31<00:00,  3.15it/s]
Results: black(greedy_defender): 100, white(random): 0, ties: 0
[output truncated]

In [11]:
df_10

| | black | white | wins (black) | wins (white) | ties |
|---|---|---|---|---|---|
| 0 | random | random | 48 | 52 | 0 |
| 1 | random | greedy | 0 | 100 | 0 |
| 2 | random | greedy_defender | 0 | 100 | 0 |
| 3 | greedy | random | 100 | 0 | 0 |
| 4 | greedy | greedy | 100 | 0 | 0 |
| 5 | greedy | greedy_defender | 7 | 93 | 0 |
| 6 | greedy_defender | random | 100 | 0 | 0 |
| 7 | greedy_defender | greedy | 100 | 0 | 0 |
| 8 | greedy_defender | greedy_defender | 30 | 45 | 25 |


In [12]:
pivot_table(df_10)

| black \ white | greedy | greedy_defender | random |
|---|---|---|---|
| greedy | 100/0/0 | 7/93/0 | 100/0/0 |
| greedy_defender | 100/0/0 | 30/45/25 | 100/0/0 |
| random | 0/100/0 | 0/100/0 | 48/52/0 |


Here we start seeing decisive games between two greedy defending agents (30 black wins, 45 white wins, and 25 ties), which suggests that 10 x 10 is a good starting point for trying to improve this agent.
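
The shift can be made concrete by computing the share of decisive games in the greedy defender mirror match for the three board sizes covered so far. The counts are copied from the result tables above; the helper name is illustrative.

```python
# (black wins, white wins, ties) for greedy_defender vs greedy_defender,
# copied from the 5 x 5, 7 x 7 and 10 x 10 result tables.
mirror_results = {5: (0, 0, 100), 7: (1, 0, 99), 10: (30, 45, 25)}


def decisive_share(black: int, white: int, ties: int) -> float:
    """Fraction of games that ended with a winner."""
    return (black + white) / (black + white + ties)


for size, counts in mirror_results.items():
    print(f"{size} x {size}: {decisive_share(*counts):.0%} decisive")
```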

## 14 x 14 board

In [13]:
df_14 = evaluate_game(GameManager(size=14, quiet=True))

Running random against random
100%|██████████| 100/100 [00:04<00:00, 21.51it/s]
Results: black(random): 52, white(random): 48, ties: 0
Running random against greedy
100%|██████████| 100/100 [00:33<00:00,  2.95it/s]
Results: black(random): 0, white(greedy): 100, ties: 0
Running random against greedy_defender
100%|██████████| 100/100 [01:01<00:00,  1.63it/s]
Results: black(random): 0, white(greedy_defender): 100, ties: 0
Running greedy against random
100%|██████████| 100/100 [00:31<00:00,  3.16it/s]
Results: black(greedy): 100, white(random): 0, ties: 0
Running greedy against greedy
100%|██████████| 100/100 [00:55<00:00,  1.80it/s]
Results: black(greedy): 100, white(greedy): 0, ties: 0
Running greedy against greedy_defender
100%|██████████| 100/100 [10:09<00:00,  6.09s/it]
Results: black(greedy): 4, white(greedy_defender): 96, ties: 0
Running greedy_defender against random
100%|██████████| 100/100 [01:02<00:00,  1.60it/s]
Results: black(greedy_defender): 100, white(random): 0, ties: 0
[output truncated]

In [14]:
df_14

| | black | white | wins (black) | wins (white) | ties |
|---|---|---|---|---|---|
| 0 | random | random | 52 | 48 | 0 |
| 1 | random | greedy | 0 | 100 | 0 |
| 2 | random | greedy_defender | 0 | 100 | 0 |
| 3 | greedy | random | 100 | 0 | 0 |
| 4 | greedy | greedy | 100 | 0 | 0 |
| 5 | greedy | greedy_defender | 4 | 96 | 0 |
| 6 | greedy_defender | random | 100 | 0 | 0 |
| 7 | greedy_defender | greedy | 100 | 0 | 0 |
| 8 | greedy_defender | greedy_defender | 51 | 48 | 1 |


In [15]:
pivot_table(df_14)

| black \ white | greedy | greedy_defender | random |
|---|---|---|---|
| greedy | 100/0/0 | 4/96/0 | 100/0/0 |
| greedy_defender | 100/0/0 | 51/48/1 | 100/0/0 |
| random | 0/100/0 | 0/100/0 | 52/48/0 |


## 19 x 19 board

In [16]:
df_19 = evaluate_game(GameManager(size=19, quiet=True))

Running random against random
100%|██████████| 100/100 [00:07<00:00, 13.61it/s]
Results: black(random): 57, white(random): 43, ties: 0
Running random against greedy
100%|██████████| 100/100 [01:03<00:00,  1.57it/s]
Results: black(random): 0, white(greedy): 100, ties: 0
Running random against greedy_defender
100%|██████████| 100/100 [02:14<00:00,  1.34s/it]
Results: black(random): 0, white(greedy_defender): 100, ties: 0
Running greedy against random
100%|██████████| 100/100 [01:02<00:00,  1.60it/s]
Results: black(greedy): 100, white(random): 0, ties: 0
Running greedy against greedy
100%|██████████| 100/100 [01:51<00:00,  1.11s/it]
Results: black(greedy): 100, white(greedy): 0, ties: 0
Running greedy against greedy_defender
100%|██████████| 100/100 [04:40<00:00,  2.81s/it]
Results: black(greedy): 6, white(greedy_defender): 94, ties: 0
Running greedy_defender against random
100%|██████████| 100/100 [02:00<00:00,  1.21s/it]
Results: black(greedy_defender): 100, white(random): 0, ties: 0
[output truncated]

In [17]:
df_19

| | black | white | wins (black) | wins (white) | ties |
|---|---|---|---|---|---|
| 0 | random | random | 57 | 43 | 0 |
| 1 | random | greedy | 0 | 100 | 0 |
| 2 | random | greedy_defender | 0 | 100 | 0 |
| 3 | greedy | random | 100 | 0 | 0 |
| 4 | greedy | greedy | 100 | 0 | 0 |
| 5 | greedy | greedy_defender | 6 | 94 | 0 |
| 6 | greedy_defender | random | 100 | 0 | 0 |
| 7 | greedy_defender | greedy | 100 | 0 | 0 |
| 8 | greedy_defender | greedy_defender | 48 | 52 | 0 |


In [18]:
pivot_table(df_19)

| black \ white | greedy | greedy_defender | random |
|---|---|---|---|
| greedy | 100/0/0 | 6/94/0 | 100/0/0 |
| greedy_defender | 100/0/0 | 48/52/0 | 100/0/0 |
| random | 0/100/0 | 0/100/0 | 57/43/0 |


19 x 19 is the size of a Go board, on which Gomoku is traditionally played.

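Note that when two greedy defending agents play each other, black and white win about equally often (48 to 52). Black, who moves first, is known to have a winning strategy when the game is played without additional rules, so the even split suggests that the agent does not exploit the first-move advantage.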
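
With only 100 games per pairing, an even split carries real sampling uncertainty. As a rough check, the sketch below computes a 95% Wilson score interval for black's share of wins in the 19 x 19 mirror match (48 of 100 games, no ties). The function name is illustrative, and the choice of the Wilson interval is an assumption made here, not something used in the notebook.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Black wins in greedy_defender vs greedy_defender on the 19 x 19 board.
low, high = wilson_interval(48, 100)
print(f"black win rate, 95% interval: [{low:.2f}, {high:.2f}]")
```

The interval contains 0.5, so these 100 games are consistent with no first-move advantage for this agent, although they cannot rule out a modest one.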