### Python module/package imports for this chapter

In [1]:
import sys, math, collections, itertools, multiprocessing, gzip

In [2]:
import numpy as np

import matplotlib
import matplotlib.pyplot as pp

%matplotlib inline

In [3]:
%load_ext line_profiler
%load_ext memory_profiler

## Solution: parallel word search

**March 2020 update**: with recent versions of Jupyter and `multiprocessing`, the solution as given in the video does not work, because the parallel processes do not have access to the function `worker` or to the variables `words`, `target`, and `chunksize`. The simplest workaround is to collect the code in a separate script, `wordsearch.py` (which we write out in the cell below), and to run it from the shell. When the `multiprocessing` `Pool` is created, every new process automatically imports `wordsearch.py`, so all of these names are defined there. Note that the code that initializes and runs the `Pool` is guarded by the clause `if __name__ == '__main__'`: the guard keeps that code from running again when the worker processes import the script, which would otherwise recreate the `Pool` recursively.
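Whether the notebook version fails seems to depend on how the platform starts new processes: with the `fork` start method the workers inherit whatever the notebook has defined, while with `spawn` or `forkserver` they start from a fresh interpreter and must import `worker` from a module. The sketch below reports which method is in use; `describe_start_method` is a hypothetical helper written for illustration and is not part of the original solution.

```python
import multiprocessing

def describe_start_method():
    # Hypothetical helper: report how this platform creates worker processes.
    method = multiprocessing.get_start_method()
    # 'fork' copies the parent's memory; 'spawn' and 'forkserver' start from
    # a fresh interpreter, so the worker function must be importable.
    needs_importable_worker = method != 'fork'
    return method, needs_importable_worker

if __name__ == '__main__':
    print(describe_start_method())
```

If the second value is `True`, the script-based workaround described above is the safer route.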

In [1]:
%%file wordsearch.py
import gzip
import multiprocessing

# Read the word list at import time; the with-block closes the file afterwards.
with gzip.open('words.gz', 'rt') as f:
    words = [line.strip() for line in f]

target = 'zygomaticum'

chunksize = 16384

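def worker(i):
    # Search the chunk starting at index i; return the target's index in
    # the full list, or None if this chunk does not contain it.
    try:
        return i + words[i:i+chunksize].index(target)
    except ValueError:
        return None
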
if __name__ == '__main__':
    # The with-block shuts the pool down once map() has returned all results.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(worker, range(0, len(words), chunksize))

    found = [r for r in results if r is not None]
    print(found)
    for r in found:
        print(words[r])

Overwriting wordsearch.py


In [2]:
!python wordsearch.py

[235786]
zygomaticum
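
The index arithmetic in `worker` can be checked serially on a toy list. The sketch below uses made-up words and a small `chunksize` (both assumptions for illustration), and replaces `pool.map` with a plain list comprehension so that it runs without `words.gz` or a process pool.

```python
# Toy stand-ins for the word list and parameters (hypothetical values).
words = ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot', 'golf']
target = 'foxtrot'
chunksize = 3

def worker(i):
    # Search words[i:i+chunksize]; return the index in the full list or None.
    try:
        return i + words[i:i + chunksize].index(target)
    except ValueError:
        return None

# Each worker call receives the starting index of one chunk.
starts = list(range(0, len(words), chunksize))

# Serial stand-in for pool.map(worker, starts).
results = [worker(i) for i in starts]
hits = [r for r in results if r is not None]

print(starts)    # chunk start indices
print(results)   # one entry per chunk, None where the target is absent
print(hits)      # indices in the full list
```

Adding the chunk start `i` to the position found within the slice is what turns a chunk-local index into an index in the full list.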
