<a href="https://colab.research.google.com/github/psb-david-petty/google-colaboratory/blob/master/wordle.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# `wordle.py`

[WORDLE](https://powerlanguage.co.uk/wordle/) is '...a daily word game.'

> I originally downloaded word-lists I found online (starting with OWL3) for use in this project, but [open-sourcing](https://bigtechquestion.com/2019/03/07/software/windows/what-does-open-sourcing-mean/) this tool would have meant publishing the word-lists as well. To avoid that, I added `wordset.py` to read the word-list files from URIs (either as raw `.txt` files or as members of a `.tar.gz` archive read with [`tarfile`](https://docs.python.org/3/library/tarfile.html)).

## The word-lists

| Source | Link | Description |
| --- | --- | --- |
| [dolph](https://github.com/dolph/dictionary) | [https://raw.githubusercontent.com/dolph/dictionary/master/enable1.txt](https://raw.githubusercontent.com/dolph/dictionary/master/enable1.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [dolph](https://github.com/dolph/dictionary) | [https://raw.githubusercontent.com/dolph/dictionary/master/ospd.txt](https://raw.githubusercontent.com/dolph/dictionary/master/ospd.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [dolph](https://github.com/dolph/dictionary) | [https://raw.githubusercontent.com/dolph/dictionary/master/popular.txt](https://raw.githubusercontent.com/dolph/dictionary/master/popular.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [dolph](https://github.com/dolph/dictionary) | [https://raw.githubusercontent.com/dolph/dictionary/master/unix-words](https://raw.githubusercontent.com/dolph/dictionary/master/unix-words) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [WordGameDictionary](https://www.wordgamedictionary.com/word-lists/) | [https://www.wordgamedictionary.com/english-word-list/download/english.txt](https://www.wordgamedictionary.com/english-word-list/download/english.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [WordGameDictionary](https://www.wordgamedictionary.com/word-lists/) | [https://www.wordgamedictionary.com/sowpods/download/sowpods.txt](https://www.wordgamedictionary.com/sowpods/download/sowpods.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [WordGameDictionary](https://www.wordgamedictionary.com/word-lists/) | [https://www.wordgamedictionary.com/twl06/download/twl06.txt](https://www.wordgamedictionary.com/twl06/download/twl06.txt) | [TK](https://en.wikipedia.org/wiki/To_come_(publishing)) |
| [yawl](https://github.com/elasticdog/yawl) | [https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz](https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz) | `yawl-0.3.2.03/sigword.list` |
| [yawl](https://github.com/elasticdog/yawl) | [https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz](https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz) | `yawl-0.3.2.03/word.list` |
| [SDSawtelle](https://sdsawtelle.github.io/blog/output/scrabble-cheatsheet-with-python.html) | [https://sdsawtelle.github.io/blog/output/scrabble-cheatsheet-with-python.html](https://sdsawtelle.github.io/blog/output/scrabble-cheatsheet-with-python.html) | Python cannot directly extract `OWL3_Dictionary.7z` without additional libraries |
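
The two yawl lists in the table are members of a single `.tar.gz` archive, so they have to be pulled out of the downloaded bytes rather than read directly. The following is a minimal sketch of that step using only the standard library. It builds a tiny archive in memory instead of downloading one; the helper names `read_member` and `make_archive` and the member path `demo/word.list` are illustrative assumptions, not part of this project.

```python
import io
import tarfile


def read_member(archive_bytes, member):
    """Return the stripped text lines of one member of an in-memory .tar.gz."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode='r:gz') as tar:
        with tar.extractfile(member) as f:
            return [line.decode('utf-8').strip() for line in f.readlines()]


def make_archive(member, text):
    """Build a small .tar.gz archive in memory (a stand-in for a download)."""
    data = text.encode('utf-8')
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w:gz') as tar:
        info = tarfile.TarInfo(name=member)
        info.size = len(data)           # size must match the payload
        tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


archive = make_archive('demo/word.list', 'crane\nslate\n')
print(read_member(archive, 'demo/word.list'))
```

With a real download, the bytes returned by `urlopen(uri).read()` would take the place of `archive`.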


## Other enhancements

- I made this a command-line tool using [`optparse`](https://docs.python.org/3/library/optparse.html). (TODO: update to use [`argparse`](https://docs.python.org/3/library/argparse.html).) The command-line help is:

```
Usage: spellingbee.py {LETTERS | -i} [-l L] [-? -v]

Find spelling-bee words using LETTERS and including LETTERS[0].

Options:
  --version         show program's version number and exit
  -?, --help        show this help message and exit
  -i, --input       input LETTERS from keyboard? [False]
  -l L, --length=L  words of length >= L [5]
  -v, --verbose     log status information while processing [False]
```
- Added [`logging`](https://docs.python.org/3/howto/logging.html).
- This [`Colab Notebook`](https://github.com/psb-david-petty/google-colaboratory/blob/master/spellingbee.ipynb) was originally developed from a multi-file module, which required adapting some code and changing some [`import`](https://docs.python.org/3/reference/import.html) statements.
- The biggest addition was reading the word-lists online from URIs, rather than publishing the word-lists myself.
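
The first bullet carries a TODO to move from `optparse` to `argparse`. One possible shape for that port is sketched below, mirroring the options in the help text above. It is an assumption about how the port could look, not the author's implementation: the function name `build_parser`, the program name `wordle.py`, and the placeholder version string are all hypothetical.

```python
import argparse


def build_parser():
    """Return an argparse parser mirroring the optparse options shown above."""
    parser = argparse.ArgumentParser(
        prog='wordle.py',           # hypothetical program name
        add_help=False,             # drop the default -h so -? can replace it
        description='Hypothetical argparse port of the command-line options.')
    parser.add_argument('--version', action='version',
                        version='%(prog)s 0.0.0')   # placeholder version
    parser.add_argument('-?', '--help', action='help',
                        help='show this help message and exit')
    parser.add_argument('letters', nargs='?', metavar='LETTERS',
                        help='letters to use (omit with -i)')
    parser.add_argument('-i', '--input', action='store_true',
                        help='input LETTERS from keyboard? [%(default)s]')
    parser.add_argument('-l', '--length', type=int, default=5, metavar='L',
                        help='words of length >= L [%(default)s]')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='log status information while processing '
                             '[%(default)s]')
    return parser


opts = build_parser().parse_args(['-v', '-l', '6', 'NOTHING'])
print(opts.letters, opts.length, opts.verbose, opts.input)
```

Unlike `optparse`, `argparse` handles the positional `LETTERS` argument itself, so the manual check on the number of arguments would no longer be needed.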

In [10]:
#!/usr/bin/env python3
#
# log.py
#
import logging, tempfile

1234567890123456789012345678901234567890123456789012345678901234567890
"""
Logging module that logs to the console and a temporary log file.
"""
__all__ = ["log", "log_path", ]
__author__ = "David C. Petty"
__copyright__ = "Copyright 2022, David C. Petty"
__license__ = "https://choosealicense.com/licenses/mit/"
__version__ = "0.0.1"
__maintainer__ = "David C. Petty"
__email__ = "david_petty@psbma.org"
__status__ = "Development"

path = globals().get('path')    # Initialize global path to temporary log.


def log(name, level=logging.INFO):
    """Return logger with name and level."""
    global path
    new_file = path is None

    # If name already has a logger, return it.
    if name in logging.root.manager.loggerDict:
        return logging.getLogger(name)

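    # Alternative format that also shows the thread name:
    # FORMAT = '{asctime:s} {name:^10s} ' \
    #          '[{threadName:^10s}] {levelname:<8s} {message:s}'
    FORMAT = '{asctime:s} {name:^10s} {levelname:<8s} {message:s}'
    # Give the root logger a do-nothing handler so it never echoes records
    # (portable replacement for logging to '/dev/null').
    logging.basicConfig(handlers=[logging.NullHandler()], level=logging.NOTSET)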
    logger = logging.getLogger(name)

    # Create file handler which logs messages at level.
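    if new_file:
        # Create the temporary file and close it right away, so that no file
        # descriptor is leaked; FileHandler reopens the file by path.
        with tempfile.NamedTemporaryFile(
                suffix='.log', prefix='wordle-', delete=False) as tmp:
            path = tmp.name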
    fh = logging.FileHandler(path, 'a')
    fh.setLevel(level)

    # Create console handler which logs messages at level.
    ch = logging.StreamHandler()
    ch.setLevel(level)

    # Create formatter and add it to handlers.
    formatter = logging.Formatter(
        FORMAT, style='{', datefmt='%Y/%m/%d-%H:%M:%S')
    ch.setFormatter(formatter)
    fh.setFormatter(formatter)

    # Add the handlers to logger.
    logger.addHandler(ch)
    logger.addHandler(fh)

    return logger


def log_path():
    """Return path for temporary log file."""
    global path
    return path


if __name__ == '__main__':
    logger = log(__name__)
    logger.debug('D: SPAM')
    logging.debug('D: SPAM')
    logger.info('I: SPAM, SPAM')
    logger.warning('W: SPAM, SPAM, SPAM')
    logger.error('E: SPAM, SPAM, SPAM, SPAM')
    logger.critical('C: SPAM, SPAM, SPAM, SPAM, SPAM')


2022/08/01-17:18:07  __main__  INFO     I: SPAM, SPAM
2022/08/01-17:18:07  __main__  ERROR    E: SPAM, SPAM, SPAM, SPAM
2022/08/01-17:18:07  __main__  CRITICAL C: SPAM, SPAM, SPAM, SPAM, SPAM
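
The `D: SPAM` messages are missing from the output above because both handlers are set to `logging.INFO`, so records below that level are dropped at the handler. The sketch below isolates that behavior by writing to an in-memory stream instead of the console; the logger name `demo-levels` and the shortened format string are assumptions made for the example.

```python
import io
import logging

stream = io.StringIO()                      # stand-in for the console
handler = logging.StreamHandler(stream)
handler.setLevel(logging.INFO)              # handler drops anything below INFO
handler.setFormatter(
    logging.Formatter('{levelname:<8s} {message:s}', style='{'))

demo = logging.getLogger('demo-levels')     # hypothetical logger name
demo.setLevel(logging.DEBUG)                # the logger itself passes everything
demo.propagate = False                      # keep the root logger out of it
demo.addHandler(handler)

demo.debug('D: SPAM')                       # filtered out by the handler
demo.info('I: SPAM, SPAM')
demo.error('E: SPAM, SPAM, SPAM, SPAM')

print(stream.getvalue(), end='')
```

Lowering the handler level to `logging.DEBUG` would let the first message through as well.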


In [11]:
#!/usr/bin/env python3
#
# word.py
#
import string

1234567890123456789012345678901234567890123456789012345678901234567890
"""
Letter utilities for solving NYTimes Spelling Bee puzzle.
"""
__all__ = ["hasonly", "hasnone", "musthave", "is_valid", ]
__author__ = "David C. Petty"
__copyright__ = "Copyright 2016-2021, David C. Petty"
__license__ = "https://choosealicense.com/licenses/mit/"
__version__ = "0.0.1"
__maintainer__ = "David C. Petty"
__email__ = "david_petty@psbma.org"
__status__ = "Development"


def hasonly(word, letters):
    """Return True if elements of word are only in letters, otherwise False."""
    letterset = set(letters)
    return letterset.union(set(word)) == letterset


def hasnone(word, letters):
    """Return True if no elements of word are in letters, otherwise False."""
    letterset = set(letters)
    return letterset.intersection(set(word)) == set()


def musthave(word, letters):
    """Return True if elements of letters are all in word, otherwise False."""
    letterset = set(letters)
    return letterset.intersection(set(word)) == letterset


# Return True if w is a (hyphenated) word that is all one case, False otherwise.
is_valid = lambda w: w and hasonly(w, string.ascii_letters + '-') \
    and (w == w.lower() or w == w.upper())
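
The three predicates above are set relations: `hasonly` is a subset test, `hasnone` is a disjointness test, and `musthave` is a superset test. As a sketch of an alternative formulation (not the code this notebook uses), the same checks can be written with Python's built-in set comparisons. The `_alt` names are hypothetical.

```python
def hasonly_alt(word, letters):
    """True if every element of word is in letters (subset test)."""
    return set(word) <= set(letters)


def hasnone_alt(word, letters):
    """True if word and letters share no elements (disjointness test)."""
    return set(word).isdisjoint(letters)


def musthave_alt(word, letters):
    """True if every element of letters is in word (superset test)."""
    return set(word) >= set(letters)


print(hasonly_alt('TEASE', 'AEST'))     # only A, E, S, T are used
print(hasnone_alt('TEASE', 'ORION'))    # no letters in common
print(musthave_alt('TEASE', 'AEST'))    # every required letter is present
```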


In [12]:
#!/usr/bin/env python3
#
# wordset.py
#
import os.path
# from log import log
# from word import is_valid

1234567890123456789012345678901234567890123456789012345678901234567890
"""
Functions to read wordlists from file or URI and parse them into sets.
"""
__all__ = ["wordssites", "wordsfiles", "words_as_sets", ]
__author__ = "David C. Petty"
__copyright__ = "Copyright 2016-2021, David C. Petty"
__license__ = "https://choosealicense.com/licenses/mit/"
__version__ = "0.0.1"
__maintainer__ = "David C. Petty"
__email__ = "david_petty@psbma.org"
__status__ = "Development"

logger = log(__name__)  # initialize logger


# https://stackoverflow.com/a/5711095
import io, gzip, tarfile, zipfile
from urllib.request import urlopen

# https://docs.python-requests.org/en/master/
# or: requests.get(url).content

# https://docs.python.org/3/library/zipfile.html
# zipfile = ZipFile(io.BytesIO(resp.read()))
# names = zipfile.namelist()
# for name in names:
#     for line in zipfile.open(name).readlines():
#         print(line.decode('utf-8'))

# https://stackoverflow.com/a/49174340
# NOTE: this disables HTTPS certificate verification for every urlopen call.
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

from urllib.parse import urlparse

format = lambda k, w: f"{k}({len(w)}): {sorted(list(w))[: 10]} ..."

def txtwordset(uri, verbose=False):
    """Return set of words parsed from raw URI. Echo results if verbose."""
    name = os.path.basename(urlparse(uri).path)
    with urlopen(uri) as resp:
        wordset = {w.lower() for w in
            [line.decode('utf-8').strip() for line in resp.readlines()]
                if is_valid(w)}
        logger.info(format(name, wordset))
        return wordset

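def zipwordsets(uri, names, wordssets, verbose=False):
    """Add to wordssets the set of words parsed from each member in names of
    the .tar.gz archive at URI. Log the archive's member names if verbose."""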
    with urlopen(uri) as resp:
        with tarfile.open(fileobj=io.BytesIO(resp.read()), mode='r:gz') as tar:
            zipname = os.path.basename(urlparse(uri).path)
            if verbose: logger.info(f"{zipname}: {tar.getnames()}")
            for path in names:
                name = os.path.basename(path)
                wordset = {w.lower() for w in
                           [line.decode('utf-8').strip() for line in
                        tar.extractfile(path).readlines()]
                           if is_valid(w)}
                logger.info(format(name, wordset))
                wordssets[name] = wordset

log_site = lambda s: logger.info(f"{'#' * 10} SITE: {s}")

def wordssites(verbose=False):
    """Return dict mapping word-list names to sets of words from:
    URI: https://raw.githubusercontent.com/dolph/dictionary/master/enable1.txt
    URI: https://raw.githubusercontent.com/dolph/dictionary/master/ospd.txt
    URI: https://raw.githubusercontent.com/dolph/dictionary/master/popular.txt
    URI: https://raw.githubusercontent.com/dolph/dictionary/master/unix-words
    URI: https://www.wordgamedictionary.com/english-word-list/download/english.txt
    URI: https://www.wordgamedictionary.com/sowpods/download/sowpods.txt
    URI: https://www.wordgamedictionary.com/twl06/download/twl06.txt
    URI: https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz yawl-0.3.2.03/sigword.list
    URI: https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz yawl-0.3.2.03/word.list
    URI: https://sdsawtelle.github.io/blog/output/scrabble-cheatsheet-with-python.html # cannot directly extract OWL3_Dictionary.7z
    """
    wordssets = dict()

    # Read word-lists from dolph URIs.
    log_site('dolph')
    for uri in [
        #'https://raw.githubusercontent.com/dolph/dictionary/master/enable1.txt',
        'https://raw.githubusercontent.com/dolph/dictionary/master/ospd.txt',
        #'https://raw.githubusercontent.com/dolph/dictionary/master/popular.txt',
        #'https://raw.githubusercontent.com/dolph/dictionary/master/unix-words',
    ]:
        key = os.path.basename(urlparse(uri).path)
        wordssets[key] = txtwordset(uri, verbose)
    """
    # Read word-lists from wordgamedictionary URIs.
    log_site('wordgamedictionary')
    for uri in [
        'https://www.wordgamedictionary.com/english-word-list/download/english.txt',
        'https://www.wordgamedictionary.com/sowpods/download/sowpods.txt',
        'https://www.wordgamedictionary.com/twl06/download/twl06.txt',
    ]:
        key = os.path.basename(urlparse(uri).path)
        wordssets[key] = txtwordset(uri, verbose)

    # Read word-lists from elasticdog URIs.
    log_site('elasticdog')
    uri = 'https://raw.githubusercontent.com/elasticdog/yawl/master/yawl-0.3.2.03.tar.gz'
    keys = ['yawl-0.3.2.03/word.list', 'yawl-0.3.2.03/sigword.list', ]
    zipwordsets(uri, keys, wordssets, verbose)
    """
    return wordssets


def wordsfiles(wordsdir=os.path.dirname(os.path.abspath(globals().get('__file__', ''))),
      wordsfiles=[
          'enable1.txt', 'ospd.txt', 'popular.txt', 'unix-words',
          'english.txt', 'sowpods.txt', 'twl06.txt',
          'sigword.list', 'word.list',
          'OWL3_Dictionary.txt',
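      ], verbose=False):
    """Return dict mapping each name in wordsfiles to the set of valid words
    read from that file in wordsdir."""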
    # Read word-list files from local directory into dictionary of word-sets.
    wordssets = dict()
    for wordsname in wordsfiles:
        logger.info(f"NAME: {wordsname}")
        with open(os.path.join(wordsdir, wordsname), 'r') as wordsfile:
            wordssets[wordsname] = {w.lower() for w in wordsfile.read().split('\n')
                if is_valid(w)}

    return wordssets

import functools
@functools.lru_cache(1)
def words_as_sets(length=5):
    """Return tuple of the words of length length, each as a tuple of
    upper-case letters. The result is cached for the most recent length."""
    wordsdict = wordssites()
    words = set.union(*wordsdict.values())
    return tuple(tuple(w.upper()) for w in words if len(w) == length )
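
The parsing step shared by `txtwordset` and `zipwordsets` (decode each line, strip it, keep valid words, lower-case them) can be exercised without a network connection. The sketch below does so with an in-memory byte stream. `parse_lines` and the restated `is_valid_word` are illustrative stand-ins written for this example, not functions from the notebook.

```python
import io
import string

ALLOWED = set(string.ascii_letters + '-')


def is_valid_word(w):
    """True for a non-empty, single-case word of ASCII letters and hyphens."""
    return bool(w) and set(w) <= ALLOWED and (w == w.lower() or w == w.upper())


def parse_lines(resp):
    """Return the set of valid, lower-cased words from a binary file-like."""
    lines = (line.decode('utf-8').strip() for line in resp)
    return {w.lower() for w in lines if is_valid_word(w)}


# Mixed case, an apostrophe, and a blank line are all rejected.
raw = io.BytesIO(b"CRANE\nslate\nMixed\nit's\n\nco-op\n")
print(sorted(parse_lines(raw)))
```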


In [13]:
#!/usr/bin/env python3
#
# wordle.py
#
import itertools, math, optparse, os, sys, time
# from log import log, log_path
# from word import hasnone, musthave
# from wordset import wordsfiles, words_as_sets

1234567890123456789012345678901234567890123456789012345678901234567890
"""
Solution to the Wordle daily word puzzle.
https://powerlanguage.co.uk/wordle/
"""
__all__ = ["wordle", ]
__author__ = "David C. Petty"
__copyright__ = "Copyright 2022, David C. Petty"
__license__ = "https://choosealicense.com/licenses/mit/"
__version__ = "0.1.1"
__maintainer__ = "David C. Petty"
__email__ = "david_petty@psbma.org"
__status__ = "Development"

logger = log(__name__)  # initialize logger

comb = lambda n, k: math.factorial(n) // math.factorial(k) // math.factorial(n - k)

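def wordle(guess='     ', positions=0, length=5, number=2, extra=0):
    """Return (word_map, guesses): word_map maps each group of length letters
    to the words spelled with exactly those letters; guesses holds the words
    for every combination of number groups that share no letters."""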
    # Word-list files linked from:
    # https://github.com/elasticdog/yawl

    if '__file__' in globals():                         # not a Colab notebook
        wordsdict = wordsfiles()                        # locally from files
    #wordsdict = wordssites()                            # on-line from sites

    # words is the union of all words-sets.
    #words = set.union(*wordsdict.values())
    words = words_as_sets(length)
    frequent = 'ETAOINSHRDLUCMFWYGPBVKQJXZ'.upper()
    word_map, guesses = dict(), tuple()

    # Create word_map of length-letter words mapped to their letters.
    total, group_scale, guess_scale = length * number + extra, 70, 510000
    logger.info(f"{comb(total, length)} combinations of {length} letters out of the {total} most common: '{frequent[: total]}'... (~{comb(total, length) // group_scale}s)")
    start_t = time.time()                               # start time

    possibilities = itertools.combinations(frequent[: total], length)
    for letters in possibilities:
        for word in words:
            if len(word) == length and musthave(word, letters):
                key = ''.join(sorted(letters))
                word_map[key] = word_map.get(key, tuple()) + (word, )
    logger.info(f"groups with words: ({','.join(sorted(word_map))})")
    groups_t = time.time()                              # group time
    logger.info(f"time: {round(groups_t - start_t, 1)}s")

    if len(word_map) < number: return word_map, guesses

    # TODO: make interactive and iterative
    # Print best number words to guess.
    logger.info(f"{comb(len(word_map), number)} combinations of {number} group(s) out of the {len(word_map)} with words... (~{comb(len(word_map), number) // guess_scale}s)")
    guesses = tuple( ( word_map[k] for k in key )
        for key in ( x for x in itertools.combinations(word_map, number ) 
            if all( hasnone(*t) for t in itertools.combinations(x, 2) ) ) )
    guess_t = time.time()
    logger.info(f"time: {round(guess_t - groups_t, 1)}s")

    sorted_word_map = { k: tuple(sorted(v)) for k, v in word_map.items() }
    sorted_guesses = tuple( tuple( tuple(sorted(t)) for t in g ) for g in guesses )

    return sorted_word_map, sorted_guesses

# TODO: fix spacing
# TODO: change to argparse
class WordleOptionParser( optparse.OptionParser ):
    def __init__( self, **kwargs ):
        optparse.OptionParser.__init__( self, **kwargs )
        self.remove_option( "-h" )
        self.add_option( "-?", "--help", action="help",
            help="show this help message and exit" )
    def error( self, msg ):
        name = self.get_prog_name( )
        sys.stderr.write( "{name}: error: {msg}\n\n".format( **locals( ) ) )
        self.print_help( )
        sys.exit( 2 )

def test( argv ):
    import logging
    # TODO: fix this
    # Parse command-line options.
    usage = "usage: %prog {LETTERS | -i} [-l L] [-? -v]"
    description = "Find spelling-bee words using LETTERS and including LETTERS[0]."
    parser = WordleOptionParser( usage=usage, description=description, version=__version__ )
    parser.add_option( "-i", "--input",
        action="store_true", dest="i", default=False,
        help="input LETTERS from keyboard? [%default]" )
    parser.add_option( "-l", "--length",
        action="store", type='int', dest="l", default=5,
        help="words of length >= L [%default]" )
    parser.add_option( "-v", "--verbose",
        action="store_true", dest="verbose", default=False,
        help="log status information while processing [%default]" )
    opts, args = parser.parse_args( args=argv[ 1: ] )
    # Process command-line options.
    len_args = 0 if opts.i else 1
    if len( args ) != len_args:
        error = f"too {'few' if len(args) < len_args else 'many'} arguments"
        parser.error( error )
    letters, = args if not opts.i else (input('Wordle letters: '),)
    if not opts.verbose: logging.disable(logging.INFO)
    logger.info(f"python3 {' '.join(argv)}")
    logger.info(f"LOG PATH: {log_path()}")

    # Solve Wordle.
    number, extra = 2, 0
    words, guesses = wordle(number=number, extra=extra)
    logger.info(f"{len(words)} groups with words; {len(guesses)} guesses of length {number};")
    # Log first results.
    fw, fg = min(len(words), 5000 - 20), min(len(guesses), 5000 - 20 - len(words))
    logger.info(f"words: {len(words)} (first {min(len(words), fw)})")
    st2st = lambda ws: tuple(sorted(''.join(s) for s in ws))    # already sorted?
    for key in sorted(words)[: fw]:
        logger.info(f"{key}: {st2st(words[key])}")
    logger.info(f"guesses: {len(guesses)} (first {min(len(guesses), fg)})")
    for guess in guesses[: fg]:
        logger.info(f"first {len(guess)} guess(es): {tuple( st2st(g) for g in guess )}")


if __name__ == '__main__':
    is_idle, is_pycharm, is_jupyter = (
        'idlelib' in sys.modules,
        int(os.getenv('PYCHARM', 0)),
        '__file__' not in globals()
        )
    if any((is_idle, is_pycharm, is_jupyter, )):
        # Colab Jupyter Notebook
        test([sys.argv[0], '-v', 'NOTHING', ])
    else:
        test(sys.argv)
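
The search in `wordle()` above has two stages: group the words by the exact set of letters they use, then look for combinations of groups whose letters do not overlap, so that a sequence of guesses covers as many distinct letters as possible. The following is a minimal sketch of that idea on a tiny hand-picked word list. Unlike `wordle()`, it does not restrict the groups to the most frequent letters; the function names and sample words are assumptions chosen for illustration.

```python
import itertools


def group_by_letters(words, length=5):
    """Map each sorted string of `length` distinct letters to its words."""
    groups = {}
    for word in words:
        letters = ''.join(sorted(set(word)))
        # Keep only words of the right length with no repeated letters.
        if len(word) == length and len(letters) == length:
            groups.setdefault(letters, []).append(word)
    return groups


def disjoint_guesses(groups, number=2):
    """Return combinations of `number` letter groups that share no letters."""
    return [combo for combo in itertools.combinations(sorted(groups), number)
            if all(set(a).isdisjoint(b)
                   for a, b in itertools.combinations(combo, 2))]


sample = ['STARE', 'TEARS', 'RATES', 'DOING', 'NOISE', 'TEASE']
groups = group_by_letters(sample)
for combo in disjoint_guesses(groups):
    print(combo, [groups[key] for key in combo])
```

Here `TEASE` is dropped because it repeats a letter, and `NOISE` pairs with nothing because it shares letters with both other groups.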


2022/08/01-17:18:08  __main__  INFO     python3 /usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py -v NOTHING
2022/08/01-17:18:08  __main__  INFO     LOG PATH: /tmp/wordle-soobzve3.log
2022/08/01-17:18:08  __main__  INFO     ########## SITE: dolph
2022/08/01-17:18:09  __main__  INFO     ospd.txt(79339): ['aa', 'aah', 'aahed', 'aahing', 'aahs', 'aal', 'aalii', 'aaliis', 'aals', 'aardvark'] ...
2022/08/01-17:18:09  __main__  INFO     252 combinations of 5 letters out of the 10 most common: 'ETAOINSHRD'... (~3s)
2022/08/01-17:18:11  __main__  INFO     groups with words: (ADEHR,ADEHS,ADEHT,ADEIR,ADEIS,ADENO,ADENR,ADENS,ADENT,ADEOR,ADERS,ADERT,ADEST,ADHIS,ADHNO,ADHNS,ADHOR,ADHRS,ADHST,ADINO,ADINR,ADIOR,ADIOS,ADIRS,ADIRT,ADIST,ADNOR,ADNOS,ADNRS,ADNST,ADORS,ADORT,ADOST,ADRST,AEHNS,AEHNT,AEHRS,AEHRT,AEHST,AEINS,AEINT,AEIRS,AEIRT,AENOS,AENOT,AENRS,AENRT,AENST,AEORS,AEORT,AEOST,AERST,AHIOS,AHIRS,AHIRT,AHIST,AHNRS,AHNST,AHORS,AHORT,AHOST,AHRST,AINOR,AINRS,AINRT,AINST,AIORT,AIOST,AIRST,A