# Using regular expressions to help with Wordle

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/rwcitek/MyBinder.demo/main?labpath=%2FRegular.Expressions%2Fwordle.ipynb)

### Background

Wordle is a word-guessing game in which the program picks a five-letter word from its list and you have six tries to guess it.
That sounds like a pretty big search space. I wonder just how big it is, and whether the hints Wordle gives about previous guesses can shrink it.

### Task 1: how big is the search space?

In [2]:
!ls -la /usr/share/dict/words
!wc -l /usr/share/dict/words

lrwxr-xr-x  1 root  wheel  4 Aug 24 03:59 /usr/share/dict/words -> web2
  235886 /usr/share/dict/words


About 235k words.
Of those, I wonder how many are five-letter words.

In [12]:
!grep -E '^[a-z]{5}$' /usr/share/dict/words > /tmp/words.5-letters.txt
!wc -l /tmp/words.5-letters.txt


About 8.5k.  That's a lot smaller, but still a pretty big search space.
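
For readers who prefer to stay in Python, the same filter can be sketched with the `re` module. This is only an illustration: the function name `five_letter_words` and the sample list are made up here, and the pattern mirrors the `grep -E '^[a-z]{5}$'` expression used above.

```python
import re

# Same idea as grep -E '^[a-z]{5}$': exactly five lowercase ASCII letters.
FIVE_LOWER = re.compile(r"[a-z]{5}")

def five_letter_words(words):
    """Keep only words made of exactly five lowercase letters."""
    return [w for w in words if FIVE_LOWER.fullmatch(w)]

# Hypothetical sample; a real run would read /usr/share/dict/words instead.
sample = ["arose", "Aaron", "vigor", "tutor's", "cat", "honors"]
print(five_letter_words(sample))
```

Using `fullmatch` plays the role of the `^` and `$` anchors, so capitalized names, possessives, and longer or shorter words are all rejected.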

### Task 2: good initial guess to reduce the search space

I wonder what letters appear most often in that list of five-letter words.
Let's create a frequency list.

In [14]:
!grep -o . /tmp/words.5-letters.txt | \
sort | \
uniq -c | \
sort -rn | \
head

4467 a
4255 e
3043 r
2801 o
2581 i
2383 s
2381 t
2368 l
2214 n
1881 u
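
The `grep -o . | sort | uniq -c | sort -rn` pipeline counts every character occurrence. A rough Python equivalent, assuming the words are already in a list, could use `collections.Counter`. The function name `letter_frequencies` and the four-word sample are illustrative only.

```python
from collections import Counter

def letter_frequencies(words):
    """Count every letter occurrence across all words (repeats included)."""
    counts = Counter()
    for word in words:
        counts.update(word)  # a string is an iterable of characters
    return counts

# Hypothetical sample; a real run would use the full five-letter list.
sample = ["arose", "vigor", "honor", "tutor"]
print(letter_frequencies(sample).most_common(3))
```

Like the shell pipeline, this counts a letter twice when it appears twice in one word, so `honor` contributes two to the tally for `o`.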


Let's see if there is a five-letter word built from the most frequent letters: a, e, r, and o (the top four) plus s (the sixth).

In [16]:
!awk '/a/ && /e/ && /r/ && /o/ && /s/' /tmp/words.5-letters.txt

arose


That's a bingo!
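
The `awk` test chains one pattern per letter with `&&`. In Python the same check can be sketched as a set comparison. The name `words_with_all` and the sample words are assumptions for illustration.

```python
def words_with_all(words, letters):
    """Return the words that contain every letter in `letters`."""
    needed = set(letters)
    return [w for w in words if needed <= set(w)]

# Hypothetical sample; only "arose" contains all of a, e, r, o, and s.
sample = ["arose", "adore", "raise", "vigor"]
print(words_with_all(sample, "aeros"))
```

Because the words are five letters long, a word containing five distinct required letters must be an anagram of them.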

In [20]:
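# Example puzzle: the answer is vigor.
# Guessing arose shows that r and o are in the word but in the wrong positions.
# Count the words where r is not 2nd and o is not 3rd.
!grep -E '.[^r][^o]..' /tmp/words.5-letters.txt | wc -l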

    7252


In [34]:
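# Hints from arose: a, s, and e are absent; r is not 2nd and o is not 3rd.
# Try r and o in positions 4 and 5, then 1 and 5, then 1 and 4.
!grep -E '[^ase][^rase][^oase][ro][ro]' /tmp/words.5-letters.txt >| puzzle.vigor
!grep -E '[ro][^rase][^oase][^ase][ro]' /tmp/words.5-letters.txt >> puzzle.vigor
!grep -E '[ro][^rase][^oase][ro][^ase]' /tmp/words.5-letters.txt >> puzzle.vigor
!wc -l puzzle.vigor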


      67 puzzle.vigor
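
The three patterns above place r and o only in positions 1, 4, and 5, so a word such as `world` (o second, r third) satisfies the hints but matches none of them. One way to avoid enumerating positions by hand is to compute the hints a guess would produce for each candidate and keep the candidates whose hints match what Wordle showed. The sketch below assumes the usual green/yellow/gray rules; the function names, the `g`/`y`/`-` encoding, and the sample list are all made up for illustration.

```python
from collections import Counter

def feedback(guess, answer):
    """Return hints per letter: 'g' green, 'y' yellow, '-' gray."""
    hints = ["-"] * len(guess)
    unmatched = Counter()  # answer letters not used up by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            hints[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if hints[i] == "-" and unmatched[g] > 0:
            hints[i] = "y"
            unmatched[g] -= 1
    return "".join(hints)

def consistent(words, guess, hints):
    """Keep the words that would have produced `hints` for `guess`."""
    return [w for w in words if feedback(guess, w) == hints]

# Hypothetical candidate list; a real run would use the five-letter file.
sample = ["vigor", "honor", "tutor", "world", "arose", "robin"]
hints = feedback("arose", "vigor")
print(hints)
print(consistent(sample, "arose", hints))
```

The second pass hands out yellows only while unmatched copies of a letter remain, which keeps repeated letters (as in `honor` or `tutor`) from being over-counted.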


In [35]:
!sort puzzle.vigor | uniq >| puzzle.vigor1

In [36]:
# Guess the candidate with the largest count in word_count.txt.
!grep -wf puzzle.vigor1 word_count.txt | sort -rn | head -1

  83 honor
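
The `sort -rn | head -1` step picks the candidate with the largest leading count. A minimal Python sketch of that selection follows, assuming the counts have already been loaded into a dictionary. The name `most_common_candidate` is hypothetical, and the three counts are the ones that appear in this notebook's outputs.

```python
def most_common_candidate(candidates, counts):
    """Return the candidate with the highest count (0 if it is missing)."""
    return max(candidates, key=lambda w: counts.get(w, 0))

# Counts taken from the outputs shown in this notebook.
counts = {"honor": 83, "tutor": 12, "vigor": 5}
print(most_common_candidate(["vigor", "tutor", "honor"], counts))
```

Candidates missing from the table are treated as having a count of zero here, which is a design choice of this sketch rather than something `grep -wf` does: the shell version simply omits them.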


In [39]:
# Hints from honor: h and n are absent; o (4th) and r (5th) are in place.
!grep -E '[^asehon][^rasehon][^oasehn][o][r]' puzzle.vigor1 > puzzle.vigor2
!wc -l puzzle.vigor2


      13 puzzle.vigor2


In [40]:
!grep -wf puzzle.vigor2 word_count.txt | sort -rn | head -1

  12 tutor


In [41]:
# Hints from tutor: t and u are absent; o and r stay in place.
!grep -E '[^tuasehon][^turasehon][^tuoasehn][o][r]' puzzle.vigor2 > puzzle.vigor3
!wc -l puzzle.vigor3

       3 puzzle.vigor3


In [42]:
!grep -wf puzzle.vigor3 word_count.txt | sort -rn | head -1

   5 vigor
