# Line Extending Game 0
Here, we:
1. Introduce the "two-pixel line extending game"
2. Use reinforcement learning to learn to play the game
3. Use reinforcement learning to learn *rules* to play the game

Throughout the notebook, helper functions whose implementations are not important are factored out into a library file.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import random
import warnings
import time
from collections import Counter

import numpy as np
import matplotlib.pyplot as plt

from nsai_experiments import line_extending_game_tools as lgt

## 1: The two-pixel line extending game

Imagine an NxN grid of pixels, each of which is either on (`x`) or off (`-`), represented by a NumPy Boolean array:

In [3]:
sample_grid = lgt.create_grid("""
    - - - - -
    x - - - -
    x - - x -
    - - x - -
    - - - - -
    """)
print(sample_grid)
lgt.display_grid(sample_grid)

[[False False False False False]
 [ True False False False False]
 [ True False False  True False]
 [False False  True False False]
 [False False False False False]]
- - - - -
x - - - -
x - - x -
- - x - -
- - - - -
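
The text-to-array mapping above can be sketched as follows. This is a minimal illustration, not the library's code: `parse_grid` and `format_grid` are hypothetical stand-ins for `lgt.create_grid` and `lgt.display_grid`, whose real implementations live in the library file and may differ.

```python
import numpy as np

def parse_grid(text):
    """Hypothetical stand-in for lgt.create_grid: 'x' becomes True, '-' becomes False."""
    # Drop the surrounding blank lines, then split each row on whitespace
    rows = [line.split() for line in text.strip().splitlines()]
    return np.array([[cell == "x" for cell in row] for row in rows], dtype=bool)

def format_grid(grid):
    """Hypothetical stand-in for lgt.display_grid; returns the text instead of printing it."""
    return "\n".join(" ".join("x" if cell else "-" for cell in row) for row in grid)

grid = parse_grid("""
    - - -
    x - -
    x - x
    """)
print(grid.shape, grid.dtype)  # (3, 3) bool
print(format_grid(grid))
```

Parsing and formatting are inverses here, so a grid can be round-tripped through text for quick inspection.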


The game is: for every "line segment" consisting of at least two contiguous `x`s, extend the segment all the way across the grid. For instance:

In [4]:
sample_start1 = lgt.create_grid("""
    - - - - - - - - - - 
    - - x - - x - - - - 
    - x - - x - - - - - 
    - - - - - - - - - - 
    - - - - - - - - - - 
    - - - - - - - - - - 
    - - - - - - - - - - 
    - - - - - - - - - - 
    - - - - - - - x x - 
    - - - - - - - - - - 
    """)

sample_final1 = lgt.create_grid("""
    - - - x - - x - - - 
    - - x - - x - - - - 
    - x - - x - - - - - 
    x - - x - - - - - - 
    - - x - - - - - - - 
    - x - - - - - - - - 
    x - - - - - - - - - 
    - - - - - - - - - - 
    x x x x x x x x x x 
    - - - - - - - - - -  
    """)

lgt.display_grid(sample_start1)
print()
lgt.display_grid(sample_final1)

- - - - - - - - - -
- - x - - x - - - -
- x - - x - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - x x -
- - - - - - - - - -

- - - x - - x - - -
- - x - - x - - - -
- x - - x - - - - -
x - - x - - - - - -
- - x - - - - - - -
- x - - - - - - - -
x - - - - - - - - -
- - - - - - - - - -
x x x x x x x x x x
- - - - - - - - - -
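
To make "extend the segment all the way across the grid" concrete, here is a small sketch that draws the full line through two adjacent pixels. The function name `extend_across` and its signature are assumptions made for illustration; they are not part of `line_extending_game_tools`.

```python
import numpy as np

def extend_across(grid, p, q):
    """Return a copy of grid with the full line through adjacent pixels p and q turned on."""
    rows, cols = grid.shape
    dr, dc = q[0] - p[0], q[1] - p[1]  # step from p to q; each component is -1, 0, or 1
    result = grid.copy()
    for sign in (1, -1):               # walk from p towards q, then from p away from q
        r, c = p
        while 0 <= r < rows and 0 <= c < cols:
            result[r, c] = True
            r, c = r + sign * dr, c + sign * dc
    return result

grid = np.zeros((5, 5), dtype=bool)
grid[1, 2] = grid[2, 1] = True
extended = extend_across(grid, (1, 2), (2, 1))
print(np.argwhere(extended).tolist())  # [[0, 3], [1, 2], [2, 1], [3, 0]]
```

This fills a whole line in one call, whereas the game itself adds one pixel per move; the sketch only describes what the final state looks like for a single segment.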


The game is played in moves, where each move changes one `-` to an `x`. For simplicity in this very basic version of the game, we disallow starting states whose final states contain line segments that are not part of a full line. For instance, this is disallowed:

In [5]:
bad_start = lgt.create_grid("""
    - - - - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - - - - - - - - -
    - - - - - - - - - -
    - - - - - - - - - -
    - - - - - - - - - -
    - - - - - - - - - -
    - - - - - - x x - -
    - - - - - - - - - -
    """)

bad_final = lgt.create_grid("""
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    - - x - - - - - - -
    x x x x x x x x x x
    - - x - - - - - - -
    """)

lgt.display_grid(bad_start)
print()
lgt.display_grid(bad_final)

- - - - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - x x - -
- - - - - - - - - -

- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
- - x - - - - - - -
x x x x x x x x x x
- - x - - - - - - -


because its final state contains several diagonal line segments that are not part of lines: where the vertical and horizontal lines cross, neighboring `x`s form diagonal pairs. For the two-pixel version, this means that, with a few exceptions, lines cannot touch each other. This rule gives our game the useful property that if grid A can be transformed into grid B with one move, a solution for grid B plus that one move is a solution for grid A.
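
The "one move" relation between grid A and grid B in that property can be checked mechanically. The sketch below is only an illustration; `is_one_move` is a hypothetical helper, not a function from the library.

```python
import numpy as np

def is_one_move(grid_a, grid_b):
    """True if grid_b equals grid_a with exactly one off pixel switched on."""
    if grid_a.shape != grid_b.shape:
        return False
    changed = grid_a != grid_b
    # Exactly one cell differs, and that cell is off in A (so it must be on in B)
    return bool(changed.sum() == 1 and not grid_a[changed].any())

a = np.zeros((3, 3), dtype=bool)
a[1, 0] = a[1, 1] = True
b = a.copy()
b[1, 2] = True                 # extend the horizontal segment by one pixel

print(is_one_move(a, b))       # True
print(is_one_move(b, a))       # False: pixels are never switched off
```

Note that this only checks the shape of the change. It does not check that the new pixel actually extends a segment, which is what makes a move useful in the game.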

### A Human Solution

With this convenient modification, an intuitive solution to the game is: whenever you see a line segment, extend it on one of the ends if possible; repeat until no more moves are possible; then the game is solved. Here's an implementation of that:

In [6]:
"""
Takes a segment and direction and outputs the coordinates of the points on each end of the
segment that we'd want to extend
"""
def where_to_extend(segment, direction):
    (a, b), (c, d) = segment
    match direction:
        case "HORIZONTAL": return (a, b-1), (c, d+1)
        case "VERTICAL":   return (a-1, b), (c+1, d)
        case "SLOPE_DOWN": return (a-1, b-1), (c+1, d+1)
        case "SLOPE_UP":   return (a+1, b-1), (c-1, d+1)

def solve_human(unsolved_problem, timeout = 100, random_seed = 47, print_status = False):
    # Use a private generator so that solving does not reseed the global `random` module,
    # which callers (such as a test loop) may also be drawing from
    rng = random.Random(random_seed)
    rows, cols = np.shape(unsolved_problem)
    answer = np.copy(unsolved_problem)
    # Each iteration is a move in the game; note that we only need to refer to the current state
    # (not the starting state) to find the next move
    for i in range(timeout):
        # Find all possible line segments and the points we'd want to fill in to extend those segments
        segments, directions = lgt.find_all_segments(answer)
        possible_moves = [point
                              for (segment, direction) in zip(segments, directions)
                                  for point in where_to_extend(segment, direction)]

        # Exclude points that are off the board and points that have already been filled in
        if print_status: print(f"{len(possible_moves)} initial possible moves... ", end = "")
        possible_moves = list(filter(lambda point: 0 <= point[0] < rows and 0 <= point[1] < cols, possible_moves))
        if print_status: print(f"down to {len(possible_moves)} due to out of bounds... ", end = "")
        possible_moves = list(filter(lambda point: not answer[point[0], point[1]], possible_moves))
        if print_status: print(f"down to {len(possible_moves)} due to already filled in")

        # End or choose a random move from the possible points
        if len(possible_moves) == 0:
            if print_status: print(f"Success after {i+1} iterations!")
            return answer
        my_move = rng.choice(possible_moves)
        answer[my_move] = True
    if print_status: print(f"Timed out after {timeout} iterations")
    return answer

lgt.display_grid(sample_start1)
human_answer = solve_human(sample_start1, print_status = True)
lgt.display_grid(human_answer)

assert np.array_equal(human_answer, sample_final1)

- - - - - - - - - -
- - x - - x - - - -
- x - - x - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - x x -
- - - - - - - - - -
6 initial possible moves... down to 6 due to out of bounds... down to 6 due to already filled in
8 initial possible moves... down to 7 due to out of bounds... down to 5 due to already filled in
10 initial possible moves... down to 9 due to out of bounds... down to 5 due to already filled in
12 initial possible moves... down to 11 due to out of bounds... down to 5 due to already filled in
14 initial possible moves... down to 12 due to out of bounds... down to 4 due to already filled in
16 initial possible moves... down to 14 due to out of bounds... down to 4 due to already filled in
18 initial possible moves... down to 15 due to out of bounds... down to 3 due to already filled in
20 initial possible moves... down to 16 due to out of bounds... down to 2 due to already filled in
22 initial po
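
The solver depends on `lgt.find_all_segments`, whose implementation is factored out into the library. As a rough sketch of what such a helper might do, the function below lists every pair of adjacent `x`s together with a direction name. It is an assumption-laden illustration: `find_adjacent_pairs` is a hypothetical name, and the endpoint ordering is inferred from how `where_to_extend` uses each direction rather than taken from the library.

```python
import numpy as np

# (row step, col step) from the first to the second pixel of a pair, chosen to match
# the endpoint ordering that where_to_extend appears to assume for each direction
STEPS = {
    "HORIZONTAL": (0, 1),   # left pixel, then right pixel
    "VERTICAL":   (1, 0),   # top pixel, then bottom pixel
    "SLOPE_DOWN": (1, 1),   # upper-left pixel, then lower-right pixel
    "SLOPE_UP":   (-1, 1),  # lower-left pixel, then upper-right pixel
}

def find_adjacent_pairs(grid):
    """Hypothetical sketch: return (segments, directions) for all adjacent pairs of on pixels."""
    rows, cols = grid.shape
    segments, directions = [], []
    for r, c in np.argwhere(grid):
        for name, (dr, dc) in STEPS.items():
            r2, c2 = int(r) + dr, int(c) + dc
            if 0 <= r2 < rows and 0 <= c2 < cols and grid[r2, c2]:
                segments.append(((int(r), int(c)), (r2, c2)))
                directions.append(name)
    return segments, directions

# Same layout as sample_start1: two rising diagonal pairs and one horizontal pair
start = np.zeros((10, 10), dtype=bool)
for point in [(1, 2), (2, 1), (1, 5), (2, 4), (8, 7), (8, 8)]:
    start[point] = True

segments, directions = find_adjacent_pairs(start)
print(len(segments), directions)  # 3 ['SLOPE_UP', 'SLOPE_UP', 'HORIZONTAL']
```

Three pairs with two endpoints each give six candidate moves, which agrees with the first line of the log above, although that agreement does not show the library works this way.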

This solver is clearly not the most efficient: it rescans the whole grid from scratch every iteration and, for most of the game, most of the candidate moves are already filled in. It is, however, easy to understand. Let's generate some (problem, solution) pairs and test that the algorithm works:

In [7]:
lgt.display_grid(lgt.generate_problem(10, 10, 2, 2, 2)[0])

- - - - - - - - - -
- - x - - - - x - -
- x - x - - - - - x
x - - - - - - - x -
- - - - - - - x - -
- - - - - - x - - -
- x - - - - - x - -
- - - - - - - - - -
- - - - - - - - - -
- - - - - - - - - -


In [8]:
random.seed(47)
for i in range(100):
    # Generate a problem with the given dimensions and number of three-length segments,
    # two-length features, and one-length features; and the corresponding solution
    problem, solution = lgt.generate_problem(10, 10, random.randrange(2), random.randrange(3), random.randrange(4))
    assert np.array_equal(solve_human(problem), solution)