# Regex Crossword Helper

This notebook is a supplement to [this post](https://lawyerist.com/?p=136050) (**coming soon**) on regular expressions. Please read that first, as this notebook will make very little sense without that context. If you aren't familiar with notebooks, check out [this piece](https://lawyerist.com/124089/hello-world-attorneys-learn-code/), in which I describe how to download Project Jupyter. After that, what follows should make a little more sense.

## Import Dependencies 

In [100]:
import re
import urllib.request
import sys  

## Download a Corpus

Here we're going to download a [corpus of English words](https://github.com/dwyl/english-words) and save it to disk. We should only have to do this once, so if you come back to this notebook in the future, you can skip the code block below.

In [103]:
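corpus_url = "https://github.com/dwyl/english-words/blob/master/words.txt?raw=true"
# Fetch the raw word list (one word per line) as bytes.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor)
corpus_text = opener.open(corpus_url).read()
# Write the bytes to disk; the with block closes the file for us.
with open("corpus.txt", "wb") as corpus_file:
    corpus_file.write(corpus_text)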
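
Since the download only needs to happen once, one option is to guard it with a file-existence check so the cell is safe to re-run. The sketch below is just an illustration of that idea: `ensure_corpus` and its `fetch` parameter are hypothetical names, not part of the original notebook, and it uses plain `urllib.request.urlopen` rather than the cookie-aware opener above.

```python
import os
import urllib.request

CORPUS_URL = "https://github.com/dwyl/english-words/blob/master/words.txt?raw=true"


def ensure_corpus(path="corpus.txt", url=CORPUS_URL, fetch=None):
    """Download url to path unless path already exists.

    Returns True if a download happened and False if the saved copy was reused.
    fetch is a hypothetical hook (a callable taking a URL and returning bytes)
    so the network call can be swapped out, for example when testing.
    """
    if os.path.exists(path):
        return False  # Already on disk: skip the download.
    if fetch is None:
        # Default: a plain HTTP GET that returns the response body as bytes.
        def fetch(u):
            with urllib.request.urlopen(u) as response:
                return response.read()
    data = fetch(url)
    with open(path, "wb") as corpus_file:
        corpus_file.write(data)
    return True
```

Calling `ensure_corpus()` the first time would fetch and save the file; later calls would return `False` without touching the network.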

## Load Saved Corpus

As you can see from the five lines read out of the corpus below, each line holds a single word (a line break is represented by `\n`).

In [102]:
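# Read the whole corpus into one string for the regex searches below.
with open("corpus.txt", "r") as corpus_file:
    corpus = corpus_file.read()

# Peek at five lines from the middle of the file.
# h/t http://stackoverflow.com/a/20441641
with open("corpus.txt") as myfile:
    sample_lines = myfile.readlines()[1000:1005]
print(sample_lines)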

['absonous\n', 'absorb\n', 'absorbability\n', 'absorbable\n', 'absorbance\n']
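
Because the corpus is one word per line, you can also peek at a slice of it without reading every line into a list first. Here is a minimal sketch using `itertools.islice`; `peek_lines` is a hypothetical helper name, and the UTF-8 encoding is an assumption about the saved file.

```python
from itertools import islice


def peek_lines(path, start, stop):
    """Return lines start..stop-1 (0-indexed) of a text file, newlines stripped."""
    # Assumption: the corpus file is UTF-8 (or plain ASCII) text.
    with open(path, encoding="utf-8") as text_file:
        # islice skips ahead lazily, so lines past stop are never read.
        return [line.rstrip("\n") for line in islice(text_file, start, stop)]
```

With the corpus saved as above, `peek_lines("corpus.txt", 1000, 1005)` should return the same five words as the cell above, minus the trailing `\n`.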


## Create a Function to Take in Regex and Spit Out Words

In [39]:
def regex_word(regex):
    """Print every match of regex found in the corpus, numbered from 1."""
    matches = re.finditer(regex, corpus)
    for match_num, match in enumerate(matches, start=1):
        print("Match {match_num}: {match}".format(match_num=match_num, match=match.group()))

## Find Your Word(s)

Now run `regex_word()` as needed and as described in the [*Build & Solve Crossword Puzzles* section](https://lawyerist.com/?p=136050#crossword) of the accompanying blog post. As you step through word by word, you can create your own crossword one word at a time. See below.

In [None]:
regex_word(r"\n\w{2}e\w{2}e\w{2}e\w{2}\n")
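
One thing to watch with the `\n...\n` style of pattern: each match consumes its trailing `\n`, so a matching word on the very next line has no leading `\n` left to match and gets skipped. The sketch below shows this on a tiny made-up sample (not the real corpus) and compares it with line anchors under `re.MULTILINE`. `find_words` is a hypothetical helper, and whether any words are actually skipped in the real corpus depends on whether matching words happen to sit on adjacent lines.

```python
import re

# A tiny stand-in for the corpus: one word per line.
sample = "\ncat\ncot\ncut\ndog\n"


def find_words(pattern, text, flags=0):
    """Return all non-overlapping matches with surrounding whitespace stripped."""
    return [m.group().strip() for m in re.finditer(pattern, text, flags)]


# Newline-bounded: the match for "cat" eats the "\n" that "cot" needs.
newline_bounded = find_words(r"\nc\wt\n", sample)

# Line-anchored: ^ and $ match at line boundaries without consuming them.
line_anchored = find_words(r"^c\wt$", sample, re.MULTILINE)

print(newline_bounded)  # ['cat', 'cut']
print(line_anchored)    # ['cat', 'cot', 'cut']
```

To try the anchored form with `regex_word()` as written, the inline flag should work without changing the function, e.g. `regex_word(r"(?m)^\w{2}e\w{2}e\w{2}e\w{2}$")`.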

## A Simple Crossword

Starting with an 11x11 grid, I constructed the crossword below by repeatedly running `regex_word()` and filling lines in word by word. As I did this, I realized that there was a lot more I could do to automate the process. For example, I could pick matching words at random and have a program create the puzzle from scratch. However, it would probably be a good idea to give preference to more common words and to words that are more likely to combine well with others given their location on the grid. So adding some awareness of usage and letter frequency could be useful. Anywho, this was all just a MacGuffin to show off regex. Given that, I'm rather happy with the result.
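
For the curious, here is a rough sketch of what the "pick at random, but prefer letter-friendly words" idea might look like. It is not how the puzzle here was built. All three function names are hypothetical, and scoring a word by the average frequency of its letters is just one simple assumption about what "combines well" could mean; it ignores word usage and grid position entirely.

```python
import random
from collections import Counter


def letter_frequencies(words):
    """Map each letter to its share of all letters in the given words."""
    counts = Counter(ch for word in words for ch in word.lower() if ch.isalpha())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def score(word, freqs):
    """Average letter frequency of a word; higher means more common letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(freqs.get(ch, 0.0) for ch in letters) / len(letters)


def pick_word(candidates, freqs, rng=random):
    """Pick one candidate at random, weighted toward higher-scoring words."""
    weights = [score(word, freqs) for word in candidates]
    if sum(weights) == 0:
        # No usable weights (e.g. no letters at all): fall back to a uniform pick.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Here `candidates` would be the words matched by a pattern, and `freqs` could be computed once from the whole corpus with `letter_frequencies(corpus.split())`.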

![crossword](crossword.png)

|Across|Down|
|---------|---|
|1. Multiple barred confinements.|1. An item sought for its membership in a set of similar objects. |
|6. Future birth state of James T. Kirk.|2. Those responsible for entry.|
|7. Repeated forceful blows.|3. A neutral state.|
|8. A stain of ink.|4. Those who carve wood.|
|9. A memorial article.|5. An adjective describing those with strong beliefs.|
|10. An adhesive liquid.|
|11. An indentation.|
|12. Those prophesied to inherit the Earth.|
|13. An alternate name for a particle collider.|
|14. Payment to a landlord.|
|15. A speech impediment and a computer language.|
|16. Young lamb.|
|17. Possessing the power to do something.|
|18. Slang term for food.|
|19. A coin of foreign origin.|
|20. A structure constructed from the skeletons of underwater animal colonies.|
|21. Pollinating insects.|
|22. To be treated with a medicine.|

If you're stuck, here are the [answers](Answers.txt).