# Can I build a neural network to solve Mastermind, and how will it compare with Knuth's algorithm?

<div align="center">
    <img src="https://m.media-amazon.com/images/I/612QbEzSafL._AC_SL1500_.jpg"
         alt="Box top from Mastermind game."
         width="400px"
        />
</div>

## 2025-05-19: Intro

I started using Gemini about a month ago, so although I've never tried to build a neural network, I've been thinking about them more recently. While staring at the <a href="https://en.wikipedia.org/wiki/Mastermind_(board_game)">Mastermind</a> board my kids left out, I started wondering whether I could build a neural network to solve it. This would be overkill, of course. While researching the game, I quickly stumbled across Donald Knuth's algorithm, which I immediately stopped reading about so I could think it through and implement it myself.

The plan here, I think, is:
1. Build a game environment.
2. Implement Knuth's algorithm.
3. Build a neural network.

The game environment should be able to return either the latest result or all results; I gather I shouldn't ask my neural network to manage the game's history. I may try to build the neural network myself using NumPy. I just discovered <a href="https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3">this playlist</a>, and I've been meaning to revisit the first few videos of <a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi">3Blue1Brown's</a> for the ?th time.

Initial thoughts for the game environment:

```python
import random

class Mastermind:
    def __init__(self, return_history=False, seed=None):
        self.secret = None
        self.return_history = return_history
        self.history = []
        self.reset(seed)

    def reset(self, seed=None):
        if seed is not None:
            random.seed(seed)
        self.secret = random.choices(range(1, 7), k=4)
        self.history.clear()

    def step(self, guess):
        # score guess; determine how many black and white pegs
        # append (guess, score) to history
        return self.history[:] if self.return_history else self.history[-1]
```

Now that I think about it, I'm not sure what data structure I'll need to feed into the neural network. But I can work that out when I get there.

## 2025-05-20: Game Environment

```python
from collections import Counter
import random
```

```python
class Mastermind:
    def __init__(self, return_history=False, seed=None):
        self.secret = None
        self.secret_counts = Counter()
        self.return_history = return_history
        self.history = []
        self.reset(seed)

    def reset(self, seed=None):
        self.secret_counts.clear()
        self.history.clear()
        if seed is not None:
            random.seed(seed)
        self.secret = random.choices(range(1, 7), k=4)
        self.secret_counts.update(self.secret)

    def step(self, guess):
        black = sum(self.secret[i] == guess[i] for i in range(4))
        white = sum((self.secret_counts & Counter(guess)).values()) - black
        score = (black, white)
        self.history.append((guess, score))
        return self.history[:] if self.return_history else score
```

One thing this class doesn't address so far is how to help carry out Knuth's algorithm. For each remaining possible guess $A$, I think I'll need to take each other remaining possible guess $B$ and say, if $B$ were the secret, what score would $A$ receive? And then group the $B$s by that score, and aggregate them to measure $A$'s suitability as the next guess.
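The grouping idea above can be sketched without any game object at all. This is only a rough sketch under my own assumptions: `score`, `partition`, and `worst_case` are hypothetical helper names, the scoring mirrors `Mastermind.step`, and using the size of the largest group as the aggregate is one possible choice, not necessarily the one I'll end up with.

```python
from collections import Counter, defaultdict
from itertools import product


def score(secret, guess):
    """Return (black, white) for guess against secret, mirroring Mastermind.step."""
    black = sum(s == g for s, g in zip(secret, guess))
    common = sum((Counter(secret) & Counter(guess)).values())
    return black, common - black


def partition(guess, candidates):
    """Group each candidate secret B by the score that guess A would receive."""
    groups = defaultdict(list)
    for secret in candidates:
        groups[score(secret, guess)].append(secret)
    return groups


def worst_case(guess, candidates):
    """Assumed aggregate: size of the largest group (smaller is better)."""
    return max(len(group) for group in partition(guess, candidates).values())


# All 6**4 = 1296 possible codes, as tuples of colors 1..6.
all_codes = list(product(range(1, 7), repeat=4))
groups = partition((1, 1, 2, 2), all_codes)
```

Here `groups` maps each possible score to the secrets that would produce it, so after a real guess the remaining candidates are simply the group matching the score the game returned.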

I'm starting to think that I don't need an actual game environment to carry out Knuth's algorithm.

In any case, here's a test run that I solved manually:

```python
M = Mastermind(seed='first test')
```

```python
M.step((1, 2, 3, 4))
```

```
(0, 2)
```

```python
M.step((2, 4, 5, 6))
```

```
(0, 3)
```

```python
M.step((4, 5, 2, 2))
```

```
(1, 1)
```

```python
M.step((6, 6, 4, 2))
```

```
(1, 1)
```

```python
M.step((6, 1, 2, 5))
```

```
(0, 3)
```

```python
M.step((5, 3, 6, 2))
```

```
(4, 0)
```

```python
len(M.history)
```

```
6
```

While studying neural networks, I wondered how many output neurons I should have for Mastermind. I think I should use one-hot encoding, with each possible color of each peg having its own neuron, so I'll have $4 \times 6 = 24$ output neurons.
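To make the 24-neuron layout concrete, here is a minimal sketch in plain Python. The names `one_hot_guess` and `decode_output` are hypothetical, and the layout (six consecutive slots per peg position) is an assumption; any fixed ordering would do. A NumPy version would look much the same.

```python
PEGS = 4
COLORS = 6


def one_hot_guess(guess):
    """Encode a guess such as (5, 3, 6, 2) as a flat list of 24 zeros and ones."""
    vector = [0] * (PEGS * COLORS)
    for position, color in enumerate(guess):
        # Colors are numbered 1..6, so color - 1 is the slot within the peg's block.
        vector[position * COLORS + (color - 1)] = 1
    return vector


def decode_output(activations):
    """Pick the strongest color for each peg from 24 output activations."""
    guess = []
    for position in range(PEGS):
        block = activations[position * COLORS:(position + 1) * COLORS]
        guess.append(block.index(max(block)) + 1)
    return tuple(guess)
```

Decoding takes the largest activation within each block of six, so the network's raw outputs wouldn't need to be exact zeros and ones to produce a valid guess.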

I'm starting to see a lot of possible issues with the input neurons, though. I could represent each previous guess and score with $4 \times 6 + 2 \times 5 = 34$ neurons, and there are a maximum of 9 previous guesses so that's $9 \times 34 = 306$ input neurons. Maybe that's a lot? But here are some definite concerns:
1. Much of the history will be empty/missing. I gather these inputs could be set to zero. Maybe the first layer biases as well?
2. The order of previous guesses doesn't matter. I feel that brute-forcing through this would make training exponentially more difficult/expensive. I asked Gemini for some topics to look into: permutation invariance, set processing, aggregation functions and pooling, embedding functions, and deep sets, among others. I may have picked a project that doesn't lend itself to a simple neural network.
3. The different colors are interchangeable. This one, it seems, a neural network may actually be well suited to learn.
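The first two concerns are easier to see with the encoding written out. This is a sketch under assumptions of mine: `encode_entry`, `encode_history`, and `pool_history` are hypothetical names, each score component is one-hot over 0 to 4, and unused history slots are padded with zeros. The summed version is only a toy stand-in for the pooling idea; as I understand it, Deep Sets would first pass each entry through a learned embedding before summing.

```python
PEGS, COLORS, MAX_HISTORY = 4, 6, 9
ENTRY_SIZE = PEGS * COLORS + 2 * (PEGS + 1)  # 24 + 10 = 34


def encode_entry(guess, score):
    """One-hot encode a single (guess, score) pair into 34 values."""
    vector = [0] * ENTRY_SIZE
    for position, color in enumerate(guess):
        vector[position * COLORS + (color - 1)] = 1
    black, white = score
    offset = PEGS * COLORS
    vector[offset + black] = 1               # black count, 0..4
    vector[offset + (PEGS + 1) + white] = 1  # white count, 0..4
    return vector


def encode_history(history):
    """Concatenate up to 9 entries in order, zero-padding the empty slots."""
    vector = []
    for guess, score in history[:MAX_HISTORY]:
        vector.extend(encode_entry(guess, score))
    vector.extend([0] * (MAX_HISTORY * ENTRY_SIZE - len(vector)))
    return vector


def pool_history(history):
    """Order-independent alternative: sum the entry vectors element-wise."""
    pooled = [0] * ENTRY_SIZE
    for guess, score in history:
        for i, value in enumerate(encode_entry(guess, score)):
            pooled[i] += value
    return pooled


history = [((1, 2, 3, 4), (0, 2)), ((2, 4, 5, 6), (0, 3))]
```

With the concatenated encoding, reversing `history` yields a different 306-value input for the same information, which is exactly concern 2. The summed version is identical either way, but it throws away which score belonged to which guess, so by itself it probably isn't enough.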

Also, yes, I know, the `step` method above returns multiple types.