This notebook demonstrates how chaining `TextModifier`s with `gridsearch.iter_strings` can help solve grid-based puzzles.

Here's a hypothetical puzzle (inspired by a puzzle in an old issue of Panda Magazine, which I solved by hand but then used as a test case when writing the grid searching functions):

------
# By Any Other Name
*You're always forgetting the terms for things.*
```
G S P D N K V N F I I H X I S
B E C F F I M V R R G D V A E
K J L L G A Z Z E K J Q K M G
M M M A O W Z I N A B B E M S
W I V U E W N S C U O K U X E
S H W U Q R N G C F L N D W L
J Q G E S E B D I C S R A U H
L M N K N Y F H D P K N T J E
R F Q A V C M W E A D M Q O X
T V M X P L D I N A G L Z U S
A A B H V H B R T H N G H H P
Y K P B V L K C A T T A E N S
J X V L E Y D K S P O S K R E
Q K U A L K X F J K O N W Y K
Z R B M P Q Y T L J I C Z F T
```
* Batavia (7)
* Black Panther Homeland (7)
* "Good luck!" (5 1 3)
* Imre Lipschitz (7)
* Indescribable (11)
* Nobel Prize-winner Shinya (8)
* Owl Parrots (7)
* Random Mishap (5 8)
* Slow Drip (7)
* Sudden Strike (5 6)
* Ty and Tandy (5 3 6)
------

So you poke at this for a bit, and you see answers for a few of these clues - you happen to know that Imre Lipschitz changed his last name to Lakatos, that the Indonesian capital of Batavia is now called Jakarta, and that Ty Johnson and Tandy Bowen are the alter-egos of Marvel superhero duo Cloak And Dagger.

You don't see any of these in the grid, but you do see "clownddagger" running diagonally down and to the right, starting from the C in the second row. So you guess (correctly) that every answer contains the trigram "aka", and that the trigram was replaced by a single other letter before the answer was placed in the grid.
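A quick way to sanity-check that guess is to undo the substitution by hand. The sketch below uses a hypothetical helper, `replace_with_aka`, which is not part of puztool. It is applied to the grid string `clownddagger`, spelled as it appears in the search output later in this notebook.

```python
# Hypothetical helper for illustration only; not part of puztool.
def replace_with_aka(s, i):
    """Return s with the single character at index i replaced by 'aka'."""
    return s[:i] + "aka" + s[i + 1:]

# Index 3 of 'clownddagger' is the 'w' that stands in for 'aka'.
print(replace_with_aka("clownddagger", 3))  # cloakanddagger
```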

You could now find the answers for the other clues, hunt through the grid for them, and work from there. But maybe you're eager to move on to other puzzles, and you'd rather have Python do the grunt work of searching the grid for you.

So first, you parse the grid into a numpy array, using `puztool.parse_grid`, or the shorter alias `puztool.pg`:

In [1]:
from puztool import pg, P, lists
grid = pg('''
G S P D N K V N F I I H X I S
B E C F F I M V R R G D V A E
K J L L G A Z Z E K J Q K M G
M M M A O W Z I N A B B E M S
W I V U E W N S C U O K U X E
S H W U Q R N G C F L N D W L
J Q G E S E B D I C S R A U H
L M N K N Y F H D P K N T J E
R F Q A V C M W E A D M Q O X
T V M X P L D I N A G L Z U S
A A B H V H B R T H N G H H P
Y K P B V L K C A T T A E N S
J X V L E Y D K S P O S K R E
Q K U A L K X F J K O N W Y K
Z R B M P Q Y T L J I C Z F T
'''.lower())

Separating by '\\W+'
Array is 15 rows x 15 cols of type object


You know from the enumerations that the shortest answers are 7 letters long, and each answer loses two characters when "aka" collapses to a single letter, so the shortest strings you care about in this grid are 5 characters long. So you can use `iter_strings` to find all strings of 5 or more characters in the grid. The returned values are `Result` objects where `.val` is the found string and `.provenance` is a tuple holding a `FromGrid(start, end)` entry that gives the coordinates of the first and last letters of that string in the grid. There are several thousand such strings here, so let's just see a few of them:

In [2]:
from puztool.grids import iter_strings
for i,s in enumerate(iter_strings(grid, len=(5,None))):
    if not i%1000: # only show every 1000th because there are a LOT
        print(s.val, s.provenance)

gspdn (FromGrid(start=(0, 0), end=(0, 4)),)
zagll (FromGrid(start=(2, 6), end=(2, 2)),)
shwuqrngcflnd (FromGrid(start=(5, 0), end=(5, 12)),)
hmlvb (FromGrid(start=(7, 7), end=(11, 3)),)
aabhvhbrthngh (FromGrid(start=(10, 0), end=(10, 12)),)
pthaapcfuakri (FromGrid(start=(12, 9), end=(0, 9)),)
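To make it concrete what kind of enumeration produces output like this, here is a minimal pure-Python sketch. It is not puztool's implementation: `straight_line_strings` and its parameters are hypothetical names, and the order in which it yields strings may well differ from `iter_strings`. Judging from the output above, the search covers rows, columns, and diagonals in both directions, so the sketch walks all eight directions from every cell.

```python
from typing import Iterator, List, Optional, Tuple

Coord = Tuple[int, int]

# The eight straight-line directions as (row step, column step).
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def straight_line_strings(
    rows: List[str], min_len: int = 1, max_len: Optional[int] = None
) -> Iterator[Tuple[str, Coord, Coord]]:
    """Yield (string, start, end) for every straight run of cells in the grid."""
    height, width = len(rows), len(rows[0])
    for r in range(height):
        for c in range(width):
            for dr, dc in DIRECTIONS:
                letters = []
                rr, cc = r, c
                # Walk outward from (r, c) until we leave the grid.
                while 0 <= rr < height and 0 <= cc < width:
                    letters.append(rows[rr][cc])
                    if max_len is not None and len(letters) > max_len:
                        break
                    if len(letters) >= min_len:
                        yield "".join(letters), (r, c), (rr, cc)
                    rr, cc = rr + dr, cc + dc

# A tiny 3x3 grid instead of the full puzzle, to keep the output small.
tiny = ["cat", "oak", "wed"]
found = list(straight_line_strings(tiny, min_len=3))
print(len(found))  # 16
```

On a 3x3 grid, only full rows, full columns, and the two long diagonals reach length 3, each readable in two directions, which gives the 16 strings.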


Now we want to look at all strings that can be produced by taking one of these strings and replacing a single character with `'aka'`. We can write this as a `puztool.TextModifier` - a function that takes a `Result` object and returns an iterable of `Result` objects, and that automatically knows how to do things like chain with other modifiers. Here's that modifier:

In [3]:
def add_aka(result):
    s = result.val
    # try replacing each single character of s with 'aka' in turn
    for i in range(len(s)):
        yield result.extend(s[:i]+'aka'+s[i+1:], (s, i))

`Result.extend(val, provenance)` returns a new `Result` with the new value and with the new provenance *appended to* the old provenance. In this case, our `add_aka` modifier adds both the original string and the index of the replaced letter to the provenance chain so we can refer to them later. Here are the first few results:

In [4]:
iter_strings(grid, len=(5,None)) | add_aka | P.limit(6).all()

[Result(val='akaspdn', provenance=(FromGrid(start=(0, 0), end=(0, 4)), ('gspdn', 0))),
 Result(val='gakapdn', provenance=(FromGrid(start=(0, 0), end=(0, 4)), ('gspdn', 1))),
 Result(val='gsakadn', provenance=(FromGrid(start=(0, 0), end=(0, 4)), ('gspdn', 2))),
 Result(val='gspakan', provenance=(FromGrid(start=(0, 0), end=(0, 4)), ('gspdn', 3))),
 Result(val='gspdaka', provenance=(FromGrid(start=(0, 0), end=(0, 4)), ('gspdn', 4))),
 Result(val='akaspdnk', provenance=(FromGrid(start=(0, 0), end=(0, 5)), ('gspdnk', 0)))]
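The provenance-appending behaviour described above can be mimicked with a small stand-in class. This is only a sketch under assumptions: `SketchResult` is a hypothetical name, it is not puztool's `Result`, and a plain `(start, end)` tuple stands in for the `FromGrid` entry.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for puztool's Result (illustration only)."""
    val: Any
    provenance: Tuple[Any, ...] = ()

    def extend(self, val, provenance):
        # New value; old provenance plus one more entry at the end.
        return SketchResult(val, self.provenance + (provenance,))

def add_aka(result):
    s = result.val
    for i in range(len(s)):
        yield result.extend(s[:i] + "aka" + s[i + 1:], (s, i))

# A plain (start, end) tuple stands in for the FromGrid entry here.
start = SketchResult("wanda", (((5, 13), (9, 9)),))
results = list(add_aka(start))
print(results[1].val)         # wakanda
print(results[1].provenance)  # (((5, 13), (9, 9)), ('wanda', 1))
```

Because `extend` builds a new object instead of mutating the old one, `start` keeps its original one-entry provenance, so the same grid string can safely feed several modified results.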

Finally, we can restrict the output to known words or phrases by using a wordlist as a filter. `puztool.lists.<name>` fetches a `WordList` object derived from `data/wordlists/<name>.txt`; this library doesn't ship with any word lists except OSPD because they're all enormous, but I use a list stored as `npl.txt` that is just the NPL's `allwords.txt` with punctuation, spaces, etc. stripped out. So we can filter the results with:

In [5]:
iter_strings(grid, len=(5,None)) | add_aka | lists.npl | P.all()

[Result(val='freakaccident', provenance=(FromGrid(start=(0, 8), end=(10, 8)), ('frenccident', 3))),
 Result(val='cloakanddagger', provenance=(FromGrid(start=(1, 2), end=(12, 13)), ('clownddagger', 3))),
 Result(val='unspeakable', provenance=(FromGrid(start=(4, 12), end=(12, 4)), ('unspeible', 5))),
 Result(val='lakatos', provenance=(FromGrid(start=(5, 10), end=(9, 14)), ('lrtos', 1))),
 Result(val='wakanda', provenance=(FromGrid(start=(5, 13), end=(9, 9)), ('wanda', 1))),
 Result(val='leakage', provenance=(FromGrid(start=(5, 14), end=(1, 14)), ('lesge', 2))),
 Result(val='leakages', provenance=(FromGrid(start=(5, 14), end=(0, 14)), ('lesges', 2))),
 Result(val='jakarta', provenance=(FromGrid(start=(6, 0), end=(10, 0)), ('jlrta', 1))),
 Result(val='breakaleg', provenance=(FromGrid(start=(6, 6), end=(0, 0)), ('brealeg', 3))),
 Result(val='speakable', provenance=(FromGrid(start=(6, 10), end=(12, 4)), ('speible', 3))),
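 Result(val='yamanaka', provenance=(FromGrid(start=(11, 0), end=(6, 5)), ('yamane', 5))),
 Result(val='sneakattack', provenance=(FromGrid(start=(11, 14), end=(11, 6)), ('sneattack', 3)))]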
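Conceptually, the wordlist step is a membership filter over normalized strings. The sketch below is an assumption about the idea, not about how puztool's `WordList` works internally: `normalize` and `keep_known_words` are hypothetical names, and a three-entry set stands in for a real word list file.

```python
import re

def normalize(phrase):
    """Lower-case a phrase and strip everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", phrase.lower())

def keep_known_words(candidates, wordlist):
    """Yield only those candidate strings that appear in the word list."""
    for candidate in candidates:
        if candidate in wordlist:
            yield candidate

# A tiny stand-in word list; a real one would be loaded from a file.
tiny_wordlist = {normalize(p) for p in ["Jakarta", "Wakanda", "Break a leg!"]}

candidates = ["akalrta", "jakarta", "wakanda", "wakaada", "breakaleg"]
print(list(keep_known_words(candidates, tiny_wordlist)))
# ['jakarta', 'wakanda', 'breakaleg']
```

Stripping spaces and punctuation up front is what lets a multi-word answer such as "Break a leg!" match a contiguous run of grid letters.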

That is every match from the full list of strings in the grid, but the raw list is hard to read. Since the return values are `Result` objects with same-length provenances, it's helpful to unpack them into a pandas DataFrame so that they render nicely in this notebook.

In [6]:
iter_strings(grid, len=(5,None)) | add_aka | lists.npl | P.df(columns=['word', 'grid_range', 'aka'])

Unnamed: 0,word,grid_range,aka
0,freakaccident,"(0, 8)->(10, 8)","(frenccident, 3)"
1,cloakanddagger,"(1, 2)->(12, 13)","(clownddagger, 3)"
2,unspeakable,"(4, 12)->(12, 4)","(unspeible, 5)"
3,lakatos,"(5, 10)->(9, 14)","(lrtos, 1)"
4,wakanda,"(5, 13)->(9, 9)","(wanda, 1)"
5,leakage,"(5, 14)->(1, 14)","(lesge, 2)"
6,leakages,"(5, 14)->(0, 14)","(lesges, 2)"
7,jakarta,"(6, 0)->(10, 0)","(jlrta, 1)"
8,breakaleg,"(6, 6)->(0, 0)","(brealeg, 3)"
9,speakable,"(6, 10)->(12, 4)","(speible, 3)"
10,yamanaka,"(11, 0)->(6, 5)","(yamane, 5)"
11,sneakattack,"(11, 14)->(11, 6)","(sneattack, 3)"


The table above shows each word or phrase found in the grid, followed by the coordinates where it was located, then the actual string that was in the grid and the index of the letter that needs to be replaced by `'aka'` to yield the answer to the clue. Obviously there are a few extras: "speakable" and "leakages" aren't answers to the clues, they're just coincidental ("speakable" is a substring of "unspeakable", and "leakages" appears only because there happened to be an "s" after "leakage"). We can manually remove these with `P.exclude`.

We can index each `FromGrid` provenance entry by the index of the changed letter to get the grid coordinates of that specific letter. We store that coordinate in a new column `pos`, store the letter itself in a new column `a`, and then sort by `pos` (i.e., top-to-bottom and then left-to-right, as we'd expect to read them off the grid):


In [7]:
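df = iter_strings(grid, len=(5,None)) | add_aka | lists.npl | P.exclude("speakable", "leakages") | P.df(columns=['word', 'grid_range', 'aka'])
# coordinates of the replaced letter: index the grid range by the letter's offset
df['pos'] = [r.grid_range[r.aka[1]] for r in df.itertuples()]
# the replaced letter itself: each aka entry is (grid string, offset)
df['a'] = [x[y] for x,y in df['aka']]
df.sort_values('pos')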

Unnamed: 0,word,grid_range,aka,pos,a
7,breakaleg,"(6, 6)->(0, 0)","(brealeg, 3)","(3, 3)",a
0,freakaccident,"(0, 8)->(10, 8)","(frenccident, 3)","(3, 8)",n
5,leakage,"(5, 14)->(1, 14)","(lesge, 2)","(3, 14)",s
1,cloakanddagger,"(1, 2)->(12, 13)","(clownddagger, 3)","(4, 5)",w
8,yamanaka,"(11, 0)->(6, 5)","(yamane, 5)","(6, 5)",e
3,lakatos,"(5, 10)->(9, 14)","(lrtos, 1)","(6, 11)",r
4,wakanda,"(5, 13)->(9, 9)","(wanda, 1)","(6, 12)",a
6,jakarta,"(6, 0)->(10, 0)","(jlrta, 1)","(7, 0)",l
2,unspeakable,"(4, 12)->(12, 4)","(unspeible, 5)","(9, 7)",i
9,sneakattack,"(11, 14)->(11, 6)","(sneattack, 3)","(11, 11)",a
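The `pos` values above come from indexing a grid range by a letter offset. The arithmetic behind that can be sketched in plain Python. `cell_at` is a hypothetical helper, not puztool's code, and it assumes the range is a straight horizontal, vertical, or diagonal line.

```python
def cell_at(start, end, index):
    """Coordinate of the index-th cell on the straight line from start to end."""
    (r0, c0), (r1, c1) = start, end
    # Each step moves -1, 0, or +1 along each axis (the sign of the difference).
    dr = (r1 > r0) - (r1 < r0)
    dc = (c1 > c0) - (c1 < c0)
    return (r0 + dr * index, c0 + dc * index)

# 'brealeg' runs from (6, 6) to (0, 0); its replaced letter is at offset 3.
print(cell_at((6, 6), (0, 0), 3))    # (3, 3)
# 'unspeible' runs from (4, 12) to (12, 4); its replaced letter is at offset 5.
print(cell_at((4, 12), (12, 4), 5))  # (9, 7)
```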


Read from top to bottom, the `a` column spells out a message. In case that's unclear, we can join the letters into a string pretty easily:

In [8]:
''.join(df.sort_values('pos')['a'])

'answeralias'

The answer to the puzzle, therefore, is **alias**.