# About Generating Permutations and Combinations
## Divide Pair Conquer
### Due: Monday, 1 February 2021, 11:59pm
*Matthew Reed*


There are many occasions when you need to *generate* the permutations or
combinations of a set, not just count them.

Many algorithms exist for generating permutations and combinations; Python's
standard `itertools` module, for instance, implements both.
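
A minimal sketch of generating both with Python's standard `itertools` module. The three-letter input is an arbitrary example, not part of the assignment:

```python
from itertools import combinations, permutations

letters = "abc"  # arbitrary example input

# Ordered arrangements of 2 letters drawn from "abc"
perms = ["".join(p) for p in permutations(letters, 2)]

# Unordered selections of 2 letters drawn from "abc"
combs = ["".join(c) for c in combinations(letters, 2)]

print(perms)  # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']
print(combs)  # ['ab', 'ac', 'bc']
```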

For an application, from a biographical sketch about Donald Knuth by Kenneth
Rosen, we learn that


> "Knuth grew up in Milwaukee, where his father taught bookkeeping at a Lutheran
high school and owned a small printing business. He was an excellent student,
earning academic achievement awards. He applied his intelligence in
unconventional ways, winning a contest when he was in the eighth grade by
finding as many words as possible that could be formed from the letters in

---

> **Ziegler's Giant Bar**.

___

> This won a television set for his school and a candy bar for everyone in his class."


Knuth found over 4500 words. How many can **you** find?

## Code

In [2]:
from itertools import permutations
import nltk
from nltk.corpus import words
import re

### Memory Hog

In [5]:
# nltk.download('words')
# word_list = words.words()
# word_list = [x.lower() for x in word_list]
# word_list = [re.sub("[^a-z]", '', x) for x in word_list]

# max_word_size = max([len(x) for x in word_list])
# word_list_ordered = {}
# for i in range(1, max_word_size + 1):
#   word_list_ordered[i] = [x for x in word_list if len(x) == i]

In [6]:
# # perm = permutations("zieglersgiantbar", 2)
# alphabet = "zieglersgiantbar"
# # perm = set([''.join(x) for i in range(1,len(alphabet) + 1) for x in list(permutations(alphabet, i))])

# valid_eng_perm = []
# for i in range(1, len(alphabet) + 1):
#   perm = set([''.join(x) for x in permutations(alphabet, i)])
#   # eng_perm = [x for x in perm if x in word_list_ordered[i]]
#   eng_perm = [x for x in word_list_ordered[i] if x in perm]
#   print(f'The number of valid English words of length {i} within "{alphabet}" is {len(eng_perm)}')
#   valid_eng_perm.append(eng_perm)
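
A rough count suggests why this version earns its heading: the 16 letters of `zieglersgiantbar` yield `16!/(16-k)!` ordered tuples of length `k`, before `set` removes the duplicates caused by repeated letters. The sketch below only counts those tuples with `math.perm` (Python 3.8+); it builds nothing in memory.

```python
import math

alphabet = "zieglersgiantbar"
n = len(alphabet)  # 16 letters, repeats included

# Number of k-tuples that permutations(alphabet, k) would yield
counts = {k: math.perm(n, k) for k in range(1, n + 1)}

for k, c in counts.items():
    print(f"length {k:2d}: {c:,} tuples")

print(f"total: {sum(counts.values()):,}")
```

The full-length case alone is 16! = 20,922,789,888,000 tuples, far more than the number of entries in any English word list.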

### Optimized Using Gödel Hashes


In [3]:
nltk.download('words')
word_list = words.words()
word_list = [x.lower() for x in word_list]
word_list = [re.sub("[^a-z]", '', x) for x in word_list]

[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data]   Unzipping corpora/words.zip.


In [4]:
# First 26 primes, one per letter: index 0 -> 'a', ..., index 25 -> 'z'
prime_list = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

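def godel_subset(subset, superset):
  # subset's letters (counted with multiplicity) are all available in superset
  # exactly when subset's code divides superset's code
  return superset % subset == 0

def encode_set(nset):
  # Product of the primes indexed by each letter number in nset (1 if empty)
  return encode_set(nset[1:]) * prime_list[nset[0]] if nset else 1

def set_from_string(alpha_text):
  # Map each lowercase letter to its 0-based position in the alphabet
  return [ord(x) - ord('a') for x in alpha_text]

def encode_string(alpha_text):
  # Gödel code of a lowercase a-z string
  return encode_set(set_from_string(alpha_text))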
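
A small worked example may make the encoding concrete. Each letter maps to a prime, a word maps to the product of its letters' primes, and a word can be spelled from the available letters exactly when its product divides theirs. Repeated letters are handled because a prime appears once per occurrence. The sketch re-declares compact versions of the helpers so it runs on its own; `PRIMES`, `encode`, and `can_spell` are illustrative names, and the test words were picked by hand.

```python
import math

# First 26 primes, one per letter a-z
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def encode(text):
    """Product of the primes assigned to each lowercase letter in text."""
    return math.prod(PRIMES[ord(ch) - ord("a")] for ch in text)

def can_spell(word, letters_code):
    """True if word uses each letter no more often than it is available."""
    return letters_code % encode(word) == 0

letters_code = encode("zieglersgiantbar")

print(encode("bar"))                      # 3 * 2 * 61 = 366
print(can_spell("giant", letters_code))   # True
print(can_spell("tree", letters_code))    # True: two e's are available
print(can_spell("eerie", letters_code))   # False: needs three e's
print(can_spell("cab", letters_code))     # False: no 'c' available
```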

In [9]:
alphabet = "Ziegler's Giant Bar"
alphabet = alphabet.lower()
alphabet = re.sub("[^a-z]", '', alphabet)

alphabet_hash = encode_string(alphabet)

eng_perm = [x for x in word_list if godel_subset(encode_string(x), alphabet_hash)]

In [10]:
len(eng_perm)

3298
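
As a cross-check on the divisibility test, the same predicate can be written as a multiset comparison with `collections.Counter`, which avoids large integer products. This is an alternative sketch, not the method used in the cells above; `fits` and the short `sample_words` list are hypothetical stand-ins for the NLTK word list.

```python
from collections import Counter

def fits(word, available):
    """True if every letter of word occurs no more often than in available."""
    need = Counter(word)
    # A Counter returns 0 for letters it has never seen
    return all(available[ch] >= n for ch, n in need.items())

available = Counter("zieglersgiantbar")

sample_words = ["giant", "bar", "tree", "eerie", "cab", "glaziers"]
found = [w for w in sample_words if fits(w, available)]
print(found)  # ['giant', 'bar', 'tree', 'glaziers']
```

Running both predicates over the same word list and comparing the results is a cheap way to catch a mistake in either one.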

## Sources
- https://www.geeksforgeeks.org/permutation-and-combination-in-python/
- https://www.geeksforgeeks.org/python-program-to-convert-a-tuple-to-a-string/
- https://www.datasciencebytes.com/bytes/2014/11/03/get-a-list-of-all-english-words-in-python/
- https://stackoverflow.com/questions/3788870/how-to-check-if-a-word-is-an-english-word-with-python


## Thoughts

I found only 3,298 valid English words that can be formed from the letters in "Ziegler's Giant Bar." I can think of three possible explanations for finding fewer words than Knuth: my list of English words was incomplete, Knuth's words were not limited to English, or Knuth made an error.