In [1]:
import sys
sys.path.insert(-1, '..')

import puzzle.tester as tester
import puzzle.sudoku as su
from puzzle.jupyter_helpers import *
display(HTML(SUDOKU_CSS))

In [2]:
import copy
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams["figure.figsize"] = [12, 6]
pd.set_option('precision', 3)

## Diversion #2: Cheating

Occurred to me that the way I've designed the puzzle and solver classes opens the way to having a "cheating" solver. Basically a solver that over-writes the puzzle with a pre-programmed sequence of numbers that obey the rules but do not match the original puzzle clues.

So just for fun let's see how easy that is to do and how I could improve the puzzle class to detect and block attempts to "cheat".

### First attempt: Lie

First attempt was based on this code in the original implementation of `is_solved`:

```python
def is_solved(self):
    return self.is_puzzle_valid() and self._num_empty_cells == 0
```

So, how about a solver that just plain lies by over-writing the number of empty cells left?


In [3]:
class CheatingSolver:
    def solve(self, puzzle):
        """Easiest way to cheat would be to trick the is_solved() method on the puzzle to always returning True"""
        puzzle._num_empty_cells = 0
        return True

puzzle = su.SudokuPuzzle(starting_grid=su.from_string(su.SAMPLE_PUZZLES[0]['puzzle']))
solver = CheatingSolver()
solver.solve(puzzle)

True

So the solver will always return `True` but the puzzle itself should know that it's not really solved. I changed the `is_solved` method to actually check that every cell has a value.

```python
def is_solved(self):
    if self.is_puzzle_valid():
        for i in range(self.max_value()):
            for j in range(self.max_value()):
                if self.is_empty(i, j):
                    return False
        return True
    else:
        return False
```

In [4]:
puzzle.is_solved()

False

Now if we use this in the `PuzzleTester` then we want to make sure that it's detecting that the puzzle isn't really solved.

In [5]:
include_levels = ['Kids', 'Easy', 'Moderate', 'Hard']  # , 'Diabolical', 'Pathalogical']
test_cases = [x for x in su.SAMPLE_PUZZLES if x['level'] in include_levels]
pt = tester.PuzzleTester(puzzle_class=su.SudokuPuzzle)
pt.add_test_cases(test_cases)

8

In [6]:
solver = CheatingSolver()
pt.run_tests(solver)
df = pd.DataFrame(pt.get_test_results())
df.style.highlight_null()

Unnamed: 0,label,level,starting_clues,CheatingSolver
0,SMH 1,Kids,31,
1,SMH 2,Easy,24,
2,KTH 1,Easy,30,
3,Rico Alan Heart,Easy,22,
4,SMH 3,Moderate,26,
5,SMH 4,Hard,22,
6,SMH 5,Hard,25,
7,Greg [2017],Hard,21,


So I had to change `PuzzleTester` class to check the return value of the puzzle's `is_solved` method, rather than trust the solver's return value from `solve`. If the puzzle asserts that it is NOT solved then no result is recorded for the solver.

### Second attempt: Over-write with a canned solution

So since our really simple cheater no longer works I guess we'll need a more sophisticated version. We could just fill in the blank cells with "1" (or any other value) but then the `is_puzzle_valid` check would fail, at which point we may as well solve it properly. 


So maybe what our cheat needs to do is overwrite *all* cells in a rule-abiding way. We won't be actually solving the original puzzle. Basically, we're just writing a "pre-solved" puzzle over the top.


In [7]:
class CheatingSolver:
    def solve(self, puzzle):
        """Write a pre-solved puzzle in over the top of the provided one"""
        starting_values = [0, 3, 6, 1, 4, 7, 2, 5, 8]
        max_value = puzzle.max_value
        assert max_value == 9, "I can't handle puzzles other than 9x9"
        puzzle.clear_all()
        for i in range(max_value):
            for j in range(max_value):
                #print(i, j, (starting_values[i] + j) % max_value + 1)
                puzzle.set(i, j, (starting_values[i] + j) % max_value + 1)
        return True

In [8]:
puzzle = su.SudokuPuzzle(starting_grid=su.from_string(su.SAMPLE_PUZZLES[0]['puzzle']))
solver = CheatingSolver()
solver.solve(puzzle)
puzzle.is_solved()

True

So the cheat works. 

Now, the whole point of cheating here is to be faster than a real solver, so let's test performance.


In [9]:
for m in su.SOLVERS:
    solver = su.SudokuSolver(method=m)
    pt.run_tests(solver, m)

In [10]:
all_methods = list(pt.get_solver_labels())
all_methods.append(method)

solver = CheatingSolver()
pt.run_tests(solver)
# show_results(pt, axis=1)
df = pd.DataFrame(pt.get_test_results())
df

NameError: name 'method' is not defined

## Diversion #3: Catching Cheats

To prevent the new cheat we basically need to compare the puzzle with a copy of the original. That way we can detect that the starting clues have been over written.

We can't do this in the `SudokuPuzzle` itself. Python doesn't really have `private` attributes, [more a naming convention](https://docs.python.org/3/tutorial/classes.html#tut-private) that signals "hey, you're not supposed to muck around with this", but we've already seen that we can pretty much ignore that and modify the class's internals. And since we're trying to guard against cheating we can assume an attacker will happily ignore convention.

If we assume that the caller (test harness) can be trusted then we can let the caller verify that the original puzzle is OK. We'll just need a function that confirms if the starting clues in one puzzle also exist in the second.


In [11]:
def has_same_clues(a, b):
    """Returns true if the non empty cells in a have the same value in b"""
    if a.max_value != b.max_value:
        return False
    
    for i in range(a.max_value):
        for j in range(a.max_value):
            if not a.is_empty(i, j) and a.get(i, j) != b.get(i, j):
                return False
    return True

In [12]:
puzzle = su.SudokuPuzzle(starting_grid=su.from_string(su.SAMPLE_PUZZLES[-1]['puzzle']))
original = copy.deepcopy(puzzle)
has_same_clues(original, puzzle)

True

In [13]:
solver.solve(puzzle)
puzzle.is_solved()

True

In [14]:
has_same_clues(original, puzzle)

True

Putting it all together, let's ~~monkey patch~~ switch on anti-cheat checking and make sure it throws away test results if the solver has cheated.

In [15]:
pt.run_tests(solver)
df = pd.DataFrame(pt.get_test_results())
df.style.highlight_null()

Unnamed: 0,label,level,starting_clues,CheatingSolver,backtracking,constraintpropogation,deductive,sat,SudokuSolver
0,SMH 1,Kids,31,,0.00383,0.00167,0.00166,0.0173,0.0259
1,SMH 2,Easy,24,,0.2,0.00196,0.00311,0.017,0.0217
2,KTH 1,Easy,30,,0.0112,0.00142,0.00152,0.0168,0.0188
3,Rico Alan Heart,Easy,22,,0.0686,0.0206,0.00644,0.0169,0.0208
4,SMH 3,Moderate,26,,0.0744,0.0193,0.027,0.0173,0.0177
5,SMH 4,Hard,22,,1.34,0.0242,0.0166,0.0171,0.0174
6,SMH 5,Hard,25,,0.562,0.0237,0.0125,0.019,0.0182
7,Greg [2017],Hard,21,,0.565,0.037,0.0406,0.0194,0.0244


OK! Our cheating solver has had no results recorded for it, because the answer it gives does not match the starting clues!

There are probably ways to defeat these checks, particularly in a language like Python where "monkey patching" is a thing and everything is dynamic. That might be a fun way to learn more about the internals of Python, but for now I'm declaring this "done" and moving on to the next puzzle...