# Using Python to solve Regexp CrossWord Puzzles

Have a look at the amazing <https://regexcrossword.com/> website.

I played for about two hours and could solve almost all the problems by hand, most of them quite easily.
But then I got stuck on [this one](https://regexcrossword.com/challenges/volapuk/puzzles/5).

Soooooo. I want to use [Python3](https://docs.python.org/3/) [regular expressions](https://docs.python.org/3/library/re.html) and try to solve any such cross-word puzzles.

**Warning:** This notebook will *not* explain the concept and syntax of regular expressions; read about them on Wikipedia or in a good book. The Python documentation gives a nice introduction [here](https://docs.python.org/3/howto/regex.html#regex-howto).

- Author: [Lilian Besson](https://besson.link) ([@Naereen](https://GitHub.com/Naereen)) ;
- License: [MIT License](https://lbesson.mit-license.org/) ;
- Date: 28-02-2021.

## Representation of a problem

Here is a screenshot from the game webpage.

![](Using_Python_to_solve_Regexp_CrossWord_Puzzles_1.png)

As you can see, an instance of this game has a rectangular size, denoted $(m, n)$: here there are $m=5$ lines and $n=5$ columns.

I'll also use this [easy problem](https://regexcrossword.com/challenges/beginner/puzzles/1):

![](Using_Python_to_solve_Regexp_CrossWord_Puzzles_2.png)

Let's define both, each as a small dictionary containing two to four lists of regexps.

### Easy problem of size $(2,2)$ with four constraints

In [1]:
problem1 = {
    "left_lines": [
        r"HE|LL|O+",   # HE|LL|O+   line 1
        r"[PLEASE]+",  # [PLEASE]+  line 2
    ],
    "right_lines": None,
    "top_columns": [
        r"[^SPEAK]+",  # [^SPEAK]+  column 1
        r"EP|IP|EF",   # EP|IP|EF   column 2
    ],
    "bottom_columns": None,
}

The keys `"right_lines"` and `"bottom_columns"` can be `None`, as the easier problems have no constraints on the right and bottom.
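Since some sides can be `None`, it can help to have one place that walks over every constraint that is actually present, for instance to check that each regexp compiles before going further. Here is a minimal sketch; `iter_constraints` and `validate_problem` are hypothetical helper names that are not part of the rest of this notebook, and `example_problem` simply repeats the data of `problem1`.

```python
import re

def iter_constraints(problem):
    """Yield (key, index, regexp) for every constraint, skipping sides set to None."""
    for key in ("left_lines", "right_lines", "top_columns", "bottom_columns"):
        regexps = problem.get(key)
        if regexps is None:
            continue
        for index, regexp in enumerate(regexps):
            yield key, index, regexp

def validate_problem(problem):
    """Return the constraints that fail to compile (empty list if all are valid)."""
    errors = []
    for key, index, regexp in iter_constraints(problem):
        try:
            re.compile(regexp)
        except re.error as exc:
            errors.append((key, index, regexp, str(exc)))
    return errors

# Same data as problem1 above (assumption: copied here to keep the sketch self-contained)
example_problem = {
    "left_lines": [r"HE|LL|O+", r"[PLEASE]+"],
    "right_lines": None,
    "top_columns": [r"[^SPEAK]+", r"EP|IP|EF"],
    "bottom_columns": None,
}

print(len(list(iter_constraints(example_problem))))  # 4
print(validate_problem(example_problem))             # []
```

A typo made while copy-pasting a regexp from the website (an unclosed parenthesis, for example) would then show up in the returned list instead of failing later during matching.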

Each line and column (but not each square) contains a regular expression, on a common alphabet of letters and symbols.
Let's denote this alphabet by $\Sigma$, which in the most general case is $\Sigma=\{$ `A`, `B`, ..., `Z`, `0`, ..., `9`, `:`, `?`, `.`, `$`, `-`$\}$.

For the first beginner problem, the alphabet can be shortened:

In [2]:
alphabet1 = {
    'H', 'E', 'L', 'O',
    'P', 'L', 'E', 'A', 'S', 'E',
    'S', 'P', 'E', 'A', 'K',
    'E', 'P', 'I', 'P', 'I', 'F',
}

print(f"alphabet1 = \n{sorted(alphabet1)}")

alphabet1 = 
['A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S']
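For such a simple problem, the shortened alphabet can also be derived from the regexps themselves instead of being typed by hand. This is only a naive sketch: `letters_in_regexps` is a hypothetical helper that keeps every letter or digit written literally in the regexps. It would give wrong results for regexps using escapes such as `\s` or `\d` (the `s` and `d` would be kept as letters) or ranges such as `5-8`, so it is not suitable for the harder problem.

```python
def letters_in_regexps(regexps):
    """Collect every letter or digit that appears literally in the given regexps."""
    return {char for regexp in regexps for char in regexp if char.isalnum()}

# The four constraints of the first beginner problem
regexps1 = [r"HE|LL|O+", r"[PLEASE]+", r"[^SPEAK]+", r"EP|IP|EF"]
derived_alphabet1 = letters_in_regexps(regexps1)

print(sorted(derived_alphabet1))
# ['A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S']
```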


### Harder problem of size $(5,5)$ with 20 constraints

Defining the [second problem](https://regexcrossword.com/challenges/volapuk/puzzles/5) is just a question of more copy-pasting:

In [89]:
problem2 = {
    "left_lines": [
        r"(N3|TRA|N7)+",  # left line 1
        r"[1LOVE2?4]+.",  # left line 2
        r"(A|D)M[5-8$L]+",  # left line 3
        r"[^\s0ILAD]+",  # left line 4
        r"[B-E]+(.)\1.",  # left line 5
    ],
    "right_lines": [
        r"[^OLD\s]+",  # right line 1
        r"(\d+)[LA\s$?]+",  # right line 2
        r"(\-P|5\$|AM|Z|L)+",  # right line 3
        r"(\-D|\-WE)+[^L4-9N$?]+",  # right line 4
        r"[FED$?]+",  # right line 5
    ],
    "top_columns": [
        r"[2TAIL\-D]+",  # top column 1
        r"(WE|R4|RY|M)+",  # top column 2
        r"[FEAL3-5S]+",  # top column 3
        r"[^FA\sT1-2]+F",  # top column 4
        r"[LO\s\?5-8]+",  # top column 5
    ],
    "bottom_columns": [
        r"[^ILYO]+",  # bottom column 1
        r".+[MURDEW]+",  # bottom column 2
        r"[1ALF5$E\s]+",  # bottom column 3
        r"[\dFAN$?]+",  # bottom column 4
        r".+\s.+\?",  # bottom column 5
    ],
}

And its alphabet:

In [4]:
import string

In [5]:
alphabet2 = set(string.digits) \
    | set(string.ascii_uppercase) \
    | { ':', '?', '.', '$', '-' }

print(f"alphabet2 = \n{sorted(alphabet2)}")

alphabet2 = 
['$', '-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']


### A few useful functions

Let's first extract the dimension of a problem:

In [6]:
def dimension_problem(problem):
    m = len(problem['left_lines'])
    if problem['right_lines'] is not None:
        assert m == len(problem['right_lines'])
    n = len(problem['top_columns'])
    if problem['bottom_columns'] is not None:
        assert n == len(problem['bottom_columns'])
    return (m, n)

In [7]:
problem1

{'left_lines': ['HE|LL|O+', '[PLEASE]+'],
 'right_lines': None,
 'top_columns': ['[^SPEAK]+', 'EP|IP|EF'],
 'bottom_columns': None}

In [8]:
dimension_problem(problem1)

(2, 2)

Now let's write a representation of a grid, a solution (or partial solution) of a problem:

In [115]:
___ = "_"  # represents an empty answer, as _ is not in the alphabet
grid1_partial = [
    [ 'H', ___ ],
    [ ___, 'P' ],
]

In [114]:
grid1_solution = [
    [ 'H', 'E' ],
    [ 'L', 'P' ],
]

As well as a few complete grids which are NOT solutions:

In [119]:
grid1_wrong1 = [
    [ 'H', 'E' ],
    [ 'L', 'F' ],
]

In [122]:
grid1_wrong2 = [
    [ 'H', 'E' ],
    [ 'E', 'P' ],
]

In [123]:
grid1_wrong3 = [
    [ 'H', 'E' ],
    [ 'O', 'F' ],
]

In [124]:
grid1_wrong4 = [
    [ 'O', 'E' ],
    [ 'O', 'F' ],
]

We also write these short functions to extract the $i$-th line or $j$-th column:

In [125]:
def nth_line(grid, line):
    return "".join(grid[line])

def nth_column(grid, column):
    return "".join(grid[line][column] for line in range(len(grid)))

In [52]:
[ nth_line(grid1_solution, line) for line in range(len(grid1_solution)) ]

['HE', 'LP']

In [53]:
[ nth_column(grid1_solution, column) for column in range(len(grid1_solution[0])) ]

['HL', 'EP']
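As a side note, all the lines and all the columns can also be extracted in one go, because `zip(*grid)` transposes a list of rows. A small sketch, where `lines_of` and `columns_of` are hypothetical names and `example_grid` repeats the solved grid above:

```python
def lines_of(grid):
    """Return every line of the grid as a string."""
    return ["".join(row) for row in grid]

def columns_of(grid):
    """Return every column of the grid as a string (zip(*grid) transposes the rows)."""
    return ["".join(column) for column in zip(*grid)]

example_grid = [
    ['H', 'E'],
    ['L', 'P'],
]

print(lines_of(example_grid))    # ['HE', 'LP']
print(columns_of(example_grid))  # ['HL', 'EP']
```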

And a partial solution for the harder problem:

In [116]:
___ = "_"  # represents an empty answer, as _ is not in the alphabet
grid2_partial = [
    [ 'T', 'R', 'A', 'N', '7' ],
    [ '2', '4', ___, ___, ' ' ],
    [ 'A', ___, ___, ___, ___ ],
    [ '-', ___, ___, ___, ___ ],
    [ 'D', ___, ___, ___, '?' ],
]

Let's extract the dimension of a grid, just to check it:

In [12]:
def dimension_grid(grid):
    m = len(grid)
    n = len(grid[0])
    assert all(n == len(grid[i]) for i in range(1, m))
    return (m, n)

In [13]:
print(f"Grid grid1_partial has dimension: {dimension_grid(grid1_partial)}")
print(f"Grid grid1_solution has dimension: {dimension_grid(grid1_solution)}")

Grid grid1_partial has dimension: (2, 2)
Grid grid1_solution has dimension: (2, 2)


In [14]:
print(f"Grid grid2_partial has dimension: {dimension_grid(grid2_partial)}")

Grid grid2_partial has dimension: (5, 5)


In [23]:
def check_dimensions(problem, grid):
    return dimension_problem(problem) == dimension_grid(grid)

In [25]:
assert check_dimensions(problem1, grid1_partial)
assert check_dimensions(problem1, grid1_solution)

In [26]:
assert not check_dimensions(problem2, grid1_partial)

In [27]:
assert check_dimensions(problem2, grid2_partial)

In [28]:
assert not check_dimensions(problem1, grid2_partial)

### Two more checks

We also have to check if a word is in an alphabet:

In [67]:
def check_alphabet(alphabet, word, debug=True):
    result = True
    for i, letter in enumerate(word):
        new_result = letter in alphabet
        if debug and result and not new_result:
            print(f"The word {repr(word)} is not in alphabet {repr(alphabet)}, as its #{i}th letter {letter} is not present.")
        result = result and new_result
    return result

In [30]:
assert check_alphabet(alphabet1, 'H' 'E')  # concatenate the strings

In [31]:
assert check_alphabet(alphabet1, 'H' 'E')
assert check_alphabet(alphabet1, 'L' 'P')
assert check_alphabet(alphabet1, 'H' 'L')
assert check_alphabet(alphabet1, 'E' 'P')

In [32]:
assert check_alphabet(alphabet2, "TRAN7")
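When the debug message is not needed, the same test can be written as a set inclusion. This is only a sketch of an alternative; `in_alphabet` and `letters_outside_alphabet` are hypothetical names, not functions used elsewhere in this notebook.

```python
def in_alphabet(alphabet, word):
    """True if every letter of the word belongs to the alphabet."""
    return set(word) <= set(alphabet)

def letters_outside_alphabet(alphabet, word):
    """List the (position, letter) pairs of the word that are not in the alphabet."""
    return [(i, letter) for i, letter in enumerate(word) if letter not in alphabet]

small_alphabet = {'A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S'}

print(in_alphabet(small_alphabet, "HE"))                # True
print(in_alphabet(small_alphabet, "HEY"))               # False
print(letters_outside_alphabet(small_alphabet, "HEY"))  # [(2, 'Y')]
```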

And also check that a word matches a regexp:

In [33]:
import re

In [72]:
def match(regexp, word, debug=True):
    # re.fullmatch succeeds only if the whole word is matched. Comparing
    # re.match(...).group(0) to the word can wrongly fail on alternations,
    # e.g. r"A|AB" on "AB", where re.match stops after the first branch "A".
    entire_match = re.fullmatch(regexp, word) is not None
    if debug:
        if entire_match:
            print(f"The word {repr(word)} is matched by {repr(regexp)}")
        else:
            print(f"The word {repr(word)} is NOT matched by {repr(regexp)}")
    return entire_match

In [73]:
match(r"(N3|TRA|N7)+", "TRAN7")

The word 'TRAN7' is matched by '(N3|TRA|N7)+'


True

In [74]:
match(r"(N3|TRA|N7)+", "TRAN8")

The word 'TRAN8' is NOT matched by '(N3|TRA|N7)+'


False

In [75]:
match(r"(N3|TRA|N7)+", "")

The word '' is NOT matched by '(N3|TRA|N7)+'


False

In [76]:
match(r"(N3|TRA|N7)+", "TRA")

The word 'TRA' is matched by '(N3|TRA|N7)+'


True
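One subtlety is worth illustrating: `re.match` only anchors the regexp at the start of the word, so a whole-word check has to make sure that the match reaches the end. The standard library function `re.fullmatch` expresses this directly. The sketch below (with `full_match` as a hypothetical helper name) also shows a case where looking at `group(0)` of a `re.match` result is misleading, because `re.match` stops at the first alternative that succeeds.

```python
import re

def full_match(regexp, word):
    """True if the regexp matches the entire word, from first to last letter."""
    return re.fullmatch(regexp, word) is not None

# re.match is satisfied with the first branch of the alternation...
prefix = re.match(r"A|AB", "AB")
print(prefix.group(0))            # A
# ...while re.fullmatch backtracks until the whole word is covered
print(full_match(r"A|AB", "AB"))  # True

print(full_match(r"(N3|TRA|N7)+", "TRAN7"))  # True
print(full_match(r"(N3|TRA|N7)+", "TRAN8"))  # False
print(full_match(r"(N3|TRA|N7)+", ""))       # False
```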

That should be enough to start the first "easy" task.

## First easy task: check that a line/column word validates its constraints

Given a problem $P$ of dimension $(m, n)$, its alphabet $\Sigma$, a position $i \in [| 0, m-1 |]$ of a line or $j \in [| 0, n-1 |]$ of a column, and a word $w \in \Sigma^k$ (with $k=n$ for a line or $k=m$ for a column), I want to write a function that checks the validity of the (left/right) line constraints, or of the (top/bottom) column constraints.

To ease debugging, and with the goal of using this Python program to improve my skills in solving such puzzles, I don't want this function to just reply `True` or `False`, but also to print, for each constraint, whether it is satisfied or not.

**Bonus:** for each regexp constraint, highlight the parts which correspond to each letter of the word?

### For lines

We are ready to check the one or two constraints of a line.
The same function will be written for columns, just below.

In [94]:
def check_line(problem, alphabet, word, position, debug=True, early=False):
    if not check_alphabet(alphabet, word, debug=debug):
        return False
    m, n = dimension_problem(problem)
    if len(word) != n:
        if debug:
            print(f"Word {repr(word)} does not have correct size n = {n} for lines")
        return False
    assert 0 <= position < m
    constraints = []
    if "left_lines" in problem and problem["left_lines"] is not None:
        constraints += [ problem["left_lines"][position] ]
    if "right_lines" in problem and problem["right_lines"] is not None:
        constraints += [ problem["right_lines"][position] ]
    # okay we have one or two constraint for this line,
    assert len(constraints) in {1, 2}
    # let's check them!
    result = True
    for cnb, constraint in enumerate(constraints):
        if debug:
            print(f"For line constraint #{cnb} {repr(constraint)}:")
        new_result = match(constraint, word, debug=debug)
        if early and not new_result: return False
        result = result and new_result
    return result

Let's try it!

In [82]:
problem1, alphabet1, grid1_solution

({'left_lines': ['HE|LL|O+', '[PLEASE]+'],
  'right_lines': None,
  'top_columns': ['[^SPEAK]+', 'EP|IP|EF'],
  'bottom_columns': None},
 {'A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S'},
 [['H', 'E'], ['L', 'P']])

In [84]:
m, n = dimension_problem(problem1)  # m lines, n columns

for line in range(m):
    word = nth_line(grid1_solution, line)
    print(f"- For line number {line}, checking word {repr(word)}:")
    result = check_line(problem1, alphabet1, word, line)

- For line number 0, checking word 'HE':
For line constraint #0 'HE|LL|O+':
The word 'HE' is matched by 'HE|LL|O+'
- For line number 1, checking word 'LP':
For line constraint #0 '[PLEASE]+':
The word 'LP' is matched by '[PLEASE]+'


In [87]:
m, n = dimension_problem(problem1)  # m lines, n columns
fake_words = ["OK", "HEY", "NOT", "HELL", "N", "", "HU", "OO", "EA"]

for word in fake_words:
    print(f"# For word {repr(word)}:")
    for line in range(m):
        result = check_line(problem1, alphabet1, word, line)
        print(f"  => {result}")

# For word 'OK':
For line constraint #0 'HE|LL|O+':
The word 'OK' is NOT matched by 'HE|LL|O+'
  => False
For line constraint #0 '[PLEASE]+':
The word 'OK' is NOT matched by '[PLEASE]+'
  => False
# For word 'HEY':
The word 'HEY' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #2th letter Y is not present.
  => False
The word 'HEY' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #2th letter Y is not present.
  => False
# For word 'NOT':
The word 'NOT' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter N is not present.
  => False
The word 'NOT' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter N is not present.
  => False
# For word 'HELL':
Word 'HELL' does not have correct size n = 2 for lines
  => False
Word 'HELL' does not have correct size n = 2 for lines
  => False
# For word 'N':
The word 'N' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', '

That was long, but it works fine!

In [91]:
m, n = dimension_problem(problem2)  # m lines, n columns

for line in [0]:
    word = nth_line(grid2_partial, line)
    print(f"- For line number {line}, checking word {repr(word)}:")
    result = check_line(problem2, alphabet2, word, line)
    print(f"  => {result}")

- For line number 0, checking word 'TRAN7':
For line constraint #0 '(N3|TRA|N7)+':
The word 'TRAN7' is matched by '(N3|TRA|N7)+'
For line constraint #1 '[^OLD\\s]+':
The word 'TRAN7' is matched by '[^OLD\\s]+'
  => True


In [93]:
m, n = dimension_problem(problem2)  # m lines, n columns
fake_words = [
    "TRAN8", "N2TRA",  # violate first constraint
    "N3N3N7", "N3N3", "TRA9",  # smaller or bigger dimension
    "O L D", "TRA  ",  # violate second constraint
]

for word in fake_words:
    for line in [0]:
        print(f"- For line number {line}, checking word {repr(word)}:")
        result = check_line(problem2, alphabet2, word, line)
        print(f"  => {result}")

- For line number 0, checking word 'TRAN8':
For line constraint #0 '(N3|TRA|N7)+':
The word 'TRAN8' is NOT matched by '(N3|TRA|N7)+'
For line constraint #1 '[^OLD\\s]+':
The word 'TRAN8' is matched by '[^OLD\\s]+'
  => False
- For line number 0, checking word 'N2TRA':
For line constraint #0 '(N3|TRA|N7)+':
The word 'N2TRA' is NOT matched by '(N3|TRA|N7)+'
For line constraint #1 '[^OLD\\s]+':
The word 'N2TRA' is matched by '[^OLD\\s]+'
  => False
- For line number 0, checking word 'N3N3N7':
Word 'N3N3N7' does not have correct size n = 5 for lines
  => False
- For line number 0, checking word 'N3N3':
Word 'N3N3' does not have correct size n = 5 for lines
  => False
- For line number 0, checking word 'TRA9':
Word 'TRA9' does not have correct size n = 5 for lines
  => False
- For line number 0, checking word 'O L D':
The word 'O L D' is not in alphabet {'4', 'N', 'W', 'A', ':', 'B', 'P', '5', 'H', '6', 'R', '$', '.', 'Y', 'O', '-', 'G', 'I', 'C', 'M', 'S', 'Z', '1', 'E', 'V', 'F', '?', '0'

### For columns

Similarly, we can check the one or two constraints of a column.
The function mirrors `check_line`, swapping the roles of $m$ and $n$.

In [99]:
def check_column(problem, alphabet, word, position, debug=True, early=False):
    if not check_alphabet(alphabet, word, debug=debug):
        return False
    m, n = dimension_problem(problem)
    if len(word) != m:
        if debug:
            # a column has m letters (one per line), so report m here
            print(f"Word {repr(word)} does not have correct size m = {m} for columns")
        return False
    assert 0 <= position < n
    constraints = []
    if "top_columns" in problem and problem["top_columns"] is not None:
        constraints += [ problem["top_columns"][position] ]
    if "bottom_columns" in problem and problem["bottom_columns"] is not None:
        constraints += [ problem["bottom_columns"][position] ]
    # okay we have one or two constraint for this column,
    assert len(constraints) in {1, 2}
    # let's check them!
    result = True
    for cnb, constraint in enumerate(constraints):
        if debug:
            print(f"For column constraint #{cnb} {repr(constraint)}:")
        new_result = match(constraint, word, debug=debug)
        if early and not new_result: return False
        result = result and new_result
    return result

Let's try it!

In [100]:
problem1, alphabet1, grid1_solution

({'left_lines': ['HE|LL|O+', '[PLEASE]+'],
  'right_lines': None,
  'top_columns': ['[^SPEAK]+', 'EP|IP|EF'],
  'bottom_columns': None},
 {'A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S'},
 [['H', 'E'], ['L', 'P']])

In [101]:
m, n = dimension_problem(problem1)  # m lines, n columns

for column in range(n):
    word = nth_column(grid1_solution, column)
    print(f"- For column number {column}, checking word {repr(word)}:")
    result = check_column(problem1, alphabet1, word, column)

- For column number 0, checking word 'HL':
For column constraint #0 '[^SPEAK]+':
The word 'HL' is matched by '[^SPEAK]+'
- For column number 1, checking word 'EP':
For column constraint #0 'EP|IP|EF':
The word 'EP' is matched by 'EP|IP|EF'


In [102]:
m, n = dimension_problem(problem1)  # m lines, n columns
fake_words = ["OK", "HEY", "NOT", "HELL", "N", "", "HU", "OO", "EA"]

for word in fake_words:
    print(f"# For word {repr(word)}:")
    for column in range(n):
        result = check_column(problem1, alphabet1, word, column)
        print(f"  => {result}")

# For word 'OK':
For column constraint #0 '[^SPEAK]+':
The word 'OK' is NOT matched by '[^SPEAK]+'
  => False
For column constraint #0 'EP|IP|EF':
The word 'OK' is NOT matched by 'EP|IP|EF'
  => False
# For word 'HEY':
The word 'HEY' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #2th letter Y is not present.
  => False
The word 'HEY' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #2th letter Y is not present.
  => False
# For word 'NOT':
The word 'NOT' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter N is not present.
  => False
The word 'NOT' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter N is not present.
  => False
# For word 'HELL':
Word 'HELL' does not have correct size n = 2 for columns
  => False
Word 'HELL' does not have correct size n = 2 for columns
  => False
# For word 'N':
The word 'N' is not in alphabet {'O', 'P', 'H', 'L', 'I'

That was long, but it works fine!

In [108]:
m, n = dimension_problem(problem2)  # m lines, n columns

for column in [0]:
    word = nth_column(grid2_partial, column)
    print(f"- For column number {column}, checking word {repr(word)}:")
    result = check_column(problem2, alphabet2, word, column)
    print(f"  => {result}")

- For column number 0, checking word 'T2A-D':
For column constraint #0 '[2TAIL\\-D]+':
The word 'T2A-D' is matched by '[2TAIL\\-D]+'
For column constraint #1 '[^ILYO]+':
The word 'T2A-D' is matched by '[^ILYO]+'
  => True


In [109]:
m, n = dimension_problem(problem2)  # m lines, n columns
fake_words = [
    "TRAN8", "N2TRA",  # violate first constraint
    "N3N3N7", "N3N3", "TRA9",  # smaller or bigger dimension
    "O L D", "TRA  ",  # violate second constraint
]

for word in fake_words:
    for line in [0]:
        print(f"- For line number {line}, checking word {repr(word)}:")
        result = check_column(problem2, alphabet2, word, line)
        print(f"  => {result}")

- For line number 0, checking word 'TRAN8':
For column constraint #0 '[2TAIL\\-D]+':
The word 'TRAN8' is NOT matched by '[2TAIL\\-D]+'
For column constraint #1 '[^ILYO]+':
The word 'TRAN8' is matched by '[^ILYO]+'
  => False
- For line number 0, checking word 'N2TRA':
For column constraint #0 '[2TAIL\\-D]+':
The word 'N2TRA' is NOT matched by '[2TAIL\\-D]+'
For column constraint #1 '[^ILYO]+':
The word 'N2TRA' is matched by '[^ILYO]+'
  => False
- For line number 0, checking word 'N3N3N7':
Word 'N3N3N7' does not have correct size n = 5 for columns
  => False
- For line number 0, checking word 'N3N3':
Word 'N3N3' does not have correct size n = 5 for columns
  => False
- For line number 0, checking word 'TRA9':
Word 'TRA9' does not have correct size n = 5 for columns
  => False
- For line number 0, checking word 'O L D':
The word 'O L D' is not in alphabet {'4', 'N', 'W', 'A', ':', 'B', 'P', '5', 'H', '6', 'R', '$', '.', 'Y', 'O', '-', 'G', 'I', 'C', 'M', 'S', 'Z', '1', 'E', 'V', 'F', '?

## Second easy task: check that a proposed grid is a valid solution

This one is easy: we just have to call `check_line` $m$ times and `check_column` $n$ times.

In [111]:
def check_grid(problem, alphabet, grid, debug=True, early=False):
    m, n = dimension_problem(problem)
    
    ok_lines = [False] * m
    for line in range(m):
        word = nth_line(grid, line)
        ok_lines[line] = check_line(problem, alphabet, word, line, debug=debug, early=early)
    
    ok_columns = [False] * n
    for column in range(n):
        word = nth_column(grid, column)
        ok_columns[column] = check_column(problem, alphabet, word, column, debug=debug, early=early)
    
    return all(ok_lines) and all(ok_columns)

Let's try it!

### For the easy problem

For a partial grid, of course it's going to be invalid just because `'_'` is *not* in the alphabet $\Sigma$.

In [117]:
check_grid(problem1, alphabet1, grid1_partial)

The word 'H_' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #1th letter _ is not present.
The word '_P' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter _ is not present.
The word 'H_' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #1th letter _ is not present.
The word '_P' is not in alphabet {'O', 'P', 'H', 'L', 'I', 'S', 'E', 'A', 'F', 'K'}, as its #0th letter _ is not present.


False

For a complete grid, let's check that our solution is valid:

In [126]:
check_grid(problem1, alphabet1, grid1_solution)

For line constraint #0 'HE|LL|O+':
The word 'HE' is matched by 'HE|LL|O+'
For line constraint #0 '[PLEASE]+':
The word 'LP' is matched by '[PLEASE]+'
For column constraint #0 '[^SPEAK]+':
The word 'HL' is matched by '[^SPEAK]+'
For column constraint #0 'EP|IP|EF':
The word 'EP' is matched by 'EP|IP|EF'


True

And let's also check that the few wrong solutions are indeed not valid:

In [127]:
check_grid(problem1, alphabet1, grid1_wrong1)

For line constraint #0 'HE|LL|O+':
The word 'HE' is matched by 'HE|LL|O+'
For line constraint #0 '[PLEASE]+':
The word 'LF' is NOT matched by '[PLEASE]+'
For column constraint #0 '[^SPEAK]+':
The word 'HL' is matched by '[^SPEAK]+'
For column constraint #0 'EP|IP|EF':
The word 'EF' is matched by 'EP|IP|EF'


False

In [128]:
check_grid(problem1, alphabet1, grid1_wrong2)

For line constraint #0 'HE|LL|O+':
The word 'HE' is matched by 'HE|LL|O+'
For line constraint #0 '[PLEASE]+':
The word 'EP' is matched by '[PLEASE]+'
For column constraint #0 '[^SPEAK]+':
The word 'HE' is NOT matched by '[^SPEAK]+'
For column constraint #0 'EP|IP|EF':
The word 'EP' is matched by 'EP|IP|EF'


False

In [129]:
check_grid(problem1, alphabet1, grid1_wrong3)

For line constraint #0 'HE|LL|O+':
The word 'HE' is matched by 'HE|LL|O+'
For line constraint #0 '[PLEASE]+':
The word 'OF' is NOT matched by '[PLEASE]+'
For column constraint #0 '[^SPEAK]+':
The word 'HO' is matched by '[^SPEAK]+'
For column constraint #0 'EP|IP|EF':
The word 'EF' is matched by 'EP|IP|EF'


False

In [130]:
check_grid(problem1, alphabet1, grid1_wrong4)

For line constraint #0 'HE|LL|O+':
The word 'OE' is NOT matched by 'HE|LL|O+'
For line constraint #0 '[PLEASE]+':
The word 'OF' is NOT matched by '[PLEASE]+'
For column constraint #0 '[^SPEAK]+':
The word 'OO' is matched by '[^SPEAK]+'
For column constraint #0 'EP|IP|EF':
The word 'EF' is matched by 'EP|IP|EF'


False

We can see that for each wrong grid, at least one of the constraints is violated!

That's pretty good!
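For comparison, once the debug messages are no longer needed, the whole grid check can be condensed into a few lines. The sketch below is an assumption-laden variant, not a replacement: `is_solution` is a hypothetical name, it relies on `re.fullmatch` and `zip(*grid)`, and it deliberately skips the alphabet check (a grid with a `'_'` placeholder is only rejected if some regexp rejects it).

```python
import re

def is_solution(problem, grid):
    """Silently check every available constraint of the problem against the grid."""
    lines = ["".join(row) for row in grid]
    columns = ["".join(column) for column in zip(*grid)]
    checks = [
        ("left_lines", lines), ("right_lines", lines),
        ("top_columns", columns), ("bottom_columns", columns),
    ]
    for key, words in checks:
        regexps = problem.get(key)
        if regexps is None:
            continue  # no constraints on this side
        if len(regexps) != len(words):
            return False  # the grid does not have the dimension of the problem
        if not all(re.fullmatch(regexp, word) for regexp, word in zip(regexps, words)):
            return False
    return True

# Same data as problem1 (assumption: copied here to keep the sketch self-contained)
easy_problem = {
    "left_lines": [r"HE|LL|O+", r"[PLEASE]+"],
    "right_lines": None,
    "top_columns": [r"[^SPEAK]+", r"EP|IP|EF"],
    "bottom_columns": None,
}

print(is_solution(easy_problem, [['H', 'E'], ['L', 'P']]))  # True
print(is_solution(easy_problem, [['H', 'E'], ['L', 'F']]))  # False
```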

### For the hard problem

Well I don't have a solution yet, so I cannot check it!

## Third easy task: generate all words of a given size in the alphabet

Using [`itertools.product`](https://docs.python.org/3/library/itertools.html#itertools.product) and the alphabet defined above, it's going to be easy.

Note that I'll first try with a smaller alphabet, to check the result (for problem 1).

In [40]:
import itertools

In [41]:
def all_words_of_alphabet(alphabet, size):
    yield from itertools.product(alphabet, repeat=size)

Just a quick check:

In [42]:
list(all_words_of_alphabet(['0', '1'], 3))

[('0', '0', '0'),
 ('0', '0', '1'),
 ('0', '1', '0'),
 ('0', '1', '1'),
 ('1', '0', '0'),
 ('1', '0', '1'),
 ('1', '1', '0'),
 ('1', '1', '1')]

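Enumerating all these words takes time $\mathcal{O}(|\Sigma|^k)$ for words of size $k\in\mathbb{N}^*$, and the same amount of memory as soon as they are stored in a list (the generator itself only yields one tuple of letters at a time).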
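To keep the memory usage low, the words can be consumed lazily instead of being stored in a list. Here is a small sketch, under the assumption that we want strings rather than tuples of letters; `all_words_as_strings` is a hypothetical variant of the generator above, and the alphabet is the one of the first problem. It counts the words of size 2 that satisfy one constraint without ever building the full list.

```python
import itertools
import re

def all_words_as_strings(alphabet, size):
    """Lazily yield every word of the given size, as a string."""
    for letters in itertools.product(sorted(alphabet), repeat=size):
        yield "".join(letters)

small_alphabet = {'A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S'}
pattern = re.compile(r"[PLEASE]+")

# Only one word lives in memory at any time
count = sum(
    1 for word in all_words_as_strings(small_alphabet, 2)
    if pattern.fullmatch(word)
)

print(len(small_alphabet) ** 2)  # 100 words in total
print(count)                     # 25 of them use only the letters P, L, E, A, S
```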

In [43]:
alphabet0 = ['0', '1']
len_alphabet = len(alphabet0)
for k in [2, 3, 4, 5]:
    print(f"Generating {len_alphabet**k} words of size = {k} takes about")
    %timeit list(all_words_of_alphabet(alphabet0, k))

Generating 4 words of size = 2 takes about
751 ns ± 18.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Generating 8 words of size = 3 takes about
1.2 µs ± 339 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Generating 16 words of size = 4 takes about
1.34 µs ± 72.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Generating 32 words of size = 5 takes about
2.18 µs ± 55.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [44]:
%timeit list(all_words_of_alphabet(['0', '1', '2', '3'], 10))

120 ms ± 9.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


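We can run the same experiment with the two alphabets defined above. For `alphabet1` (10 symbols) it stays quick for small words of length $\leq 5$, while for the larger `alphabet2` (41 symbols) generating all the words of length 5 already takes several seconds: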

In [45]:
len_alphabet = len(alphabet1)
for k in [2, 3, 4, 5]:
    print(f"Generating {len_alphabet**k} words of size = {k} takes about")
    %timeit list(all_words_of_alphabet(alphabet1, k))

Generating 100 words of size = 2 takes about
7.82 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Generating 1000 words of size = 3 takes about
48.4 µs ± 5.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Generating 10000 words of size = 4 takes about
549 µs ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Generating 100000 words of size = 5 takes about
8.59 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [46]:
len_alphabet = len(alphabet2)
for k in [2, 3, 4, 5]:
    print(f"Generating {len_alphabet**k} words of size = {k} takes about")
    %timeit list(all_words_of_alphabet(alphabet2, k))

Generating 1681 words of size = 2 takes about
76.4 µs ± 3.46 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Generating 68921 words of size = 3 takes about
6.77 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
Generating 2825761 words of size = 4 takes about
316 ms ± 75.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Generating 115856201 words of size = 5 takes about
11.6 s ± 650 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
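Generating words and checking constraints can be combined into a naive brute-force search for small puzzles. The following is only a sketch under stated assumptions, not an efficient solver: `candidate_words` and `brute_force_solve` are hypothetical names, the matching uses `re.fullmatch`, and the cost is exponential in the size of the grid. For the $(5,5)$ problem with 41 symbols, each line alone would require enumerating the 115856201 words counted above, so some pruning would be needed there.

```python
import itertools
import re

def candidate_words(alphabet, size, regexps):
    """Yield the words of the given size that satisfy all the given regexps."""
    compiled = [re.compile(regexp) for regexp in regexps]
    for letters in itertools.product(sorted(alphabet), repeat=size):
        word = "".join(letters)
        if all(pattern.fullmatch(word) for pattern in compiled):
            yield word

def brute_force_solve(problem, alphabet):
    """Return every grid (as a list of line words) satisfying all constraints."""
    m = len(problem["left_lines"])
    n = len(problem["top_columns"])

    def constraints(keys, position):
        # collect the constraints at this position, skipping sides set to None
        return [problem[key][position] for key in keys if problem.get(key) is not None]

    # Step 1: for each line, keep only the words accepted by its line constraints
    line_candidates = [
        list(candidate_words(alphabet, n, constraints(("left_lines", "right_lines"), i)))
        for i in range(m)
    ]
    # Step 2: try every combination of line candidates, and check the columns
    solutions = []
    for lines in itertools.product(*line_candidates):
        columns = ["".join(line[j] for line in lines) for j in range(n)]
        if all(
            re.fullmatch(regexp, columns[j])
            for j in range(n)
            for regexp in constraints(("top_columns", "bottom_columns"), j)
        ):
            solutions.append(list(lines))
    return solutions

# Same data as problem1 and alphabet1 (assumption: copied to keep the sketch self-contained)
easy_problem = {
    "left_lines": [r"HE|LL|O+", r"[PLEASE]+"],
    "right_lines": None,
    "top_columns": [r"[^SPEAK]+", r"EP|IP|EF"],
    "bottom_columns": None,
}
easy_alphabet = {'A', 'E', 'F', 'H', 'I', 'K', 'L', 'O', 'P', 'S'}

print(brute_force_solve(easy_problem, easy_alphabet))  # [['HE', 'LP']]
```

On the first beginner problem, only 3 words survive for the first line and 25 for the second, so just 75 grids are checked against the column constraints.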


## My feeling about these problems and my solution

I could have tried to be more efficient, but it's not really important.

## Conclusion

That was nice!

Have a look at [my other notebooks](https://GitHub.com/Naereen/notebooks/).