## Scrabble

You can write code in this Jupyter Notebook to solve the following problems, but the solution must be submitted as two .py files, as described below. Please upload these two files (**scrabble.py** and **wordscore.py**) with your solutions to your GitHub repository and to Gradescope.

Assignment due date: 11:59PM PT the night before the Week 7 Live Session.

## Objectives:

- Understand PEP 8 standards
- Use all of your previously gained knowledge together on a single program
- Demonstrate how to import a user-made module and function into Python from another .py file
- Demonstrate how to refine an algorithm 

## 6. Cheating at Scrabble (100 points)

Write a Python program that takes a Scrabble rack as a function argument and returns all "valid Scrabble English" words that can be constructed from that rack, along with their Scrabble scores, sorted by score. The "valid Scrabble English" words are provided in the data source below. A Scrabble rack is made up of 2 to 7 characters.

Below are the requirements for the program:
- The program must run as a function call, as shown below (not through an `input` statement!)
- Name the python file: `scrabble.py` with the main function inside `scrabble.py` named `run_scrabble`
- Make a separate module named `wordscore.py` which contains, at a minimum, a function called `score_word`. This `score_word` function will take each word and return the score (scoring dictionary is described below). Import this function into your main `scrabble.py` program. 
- **NOTE** Only have the function defs and import statements in your `.py` files as the autograder will fail if there are other statements outside the functions.
- Allow anywhere from 2-7 character tiles (letters A-Z, upper or lower case) to be inputted. 
- Do not restrict the number of same tiles (e.g., a user is allowed to input ZZZZZQQ).
- Return two items from the `run_scrabble` function:
  - 1) The **total** list of valid Scrabble words that can be constructed from the rack as (score, word) tuples, sorted by score (highest first) and then alphabetically by word, as shown in the first example below. All output words need to be in upper case.
  - 2) The total number of valid words as an integer
  - See examples below for the required output. The autograder is looking for this output so please make sure your solution is in the same format shown.
- You need to handle input errors from the user and suggest what that error might be caused by and how to fix it (i.e., a helpful error message). **Return** this error message as a string from the run_scrabble function (do not raise an exception).
- Implement wildcards as either `*` or `?`. There can be a total of **only** two wild cards in any user input (that is, one of each character: one `*` and one `?`). Only use the `*` and `?` as wildcard characters. A wildcard character can take any value A-Z. Replace the wildcard symbol with the letter in your answer (see the second example below). 
  - Wildcard characters are scored as 0 points, just like in the real Scrabble game. A word that consists of just two wildcards can be made; it should be output and scored as 0 points.
  - In a wildcard case where the same word can be made with or without the wildcard, display the highest score. For example: given the input 'I?F', the word 'if' can be made with the wildcard '?F' as well as the letters 'IF'. Since using the letters 'IF' scores higher, display that score.
- For partial credit, your program should take less than one minute to run with 2 wildcards in the input. For full credit, the program needs to run with 2 wildcards in less than 30 seconds.
- Write docstrings for the functions and put comments in your code.
- You may only use the Python standard library in this assignment. However, any function in the standard library is allowed.
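
The input-handling requirements above (2 to 7 tiles, letters A-Z in either case, at most one `*` and one `?`, and an error returned as a string rather than raised) could be checked along the following lines. This is only a minimal sketch: the helper name `validate_rack`, the constant `ALLOWED_TILES`, and the exact wording of the messages are assumptions for illustration, not part of the assignment.

```python
import string

# Hypothetical constant: every tile the rack may contain after upper-casing.
ALLOWED_TILES = set(string.ascii_uppercase) | {"*", "?"}


def validate_rack(rack):
    """Return a helpful error message for an invalid rack, or None if it is valid."""
    if not isinstance(rack, str):
        return "The rack must be a string of tiles, for example 'ZAEFIEE'."
    rack = rack.upper()
    if not 2 <= len(rack) <= 7:
        return "The rack must contain between 2 and 7 tiles."
    bad_tiles = sorted(set(rack) - ALLOWED_TILES)
    if bad_tiles:
        return "Invalid tile(s): " + ", ".join(bad_tiles) + ". Use only A-Z, * and ?."
    if rack.count("*") > 1 or rack.count("?") > 1:
        return "Too many wildcards. Use at most one * and one ?."
    return None
```

A function like `run_scrabble` could call such a helper first and return its message whenever the result is not `None`.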

An example invocation and output:

```
run_scrabble("ZAEfiee") -> (
[(17, 'FEAZE'),
(17, 'FEEZE'),
(16, 'FAZE'),
(15, 'FEZ'),
(15, 'FIZ'),
(12, 'ZEA'),
(12, 'ZEE'),
(11, 'ZA'),
(6, 'FAE'),
(6, 'FEE'),
(6, 'FIE'),
(5, 'EF'),
(5, 'FA'),
(5, 'FE'),
(5, 'IF'),
(2, 'AE'),
(2, 'AI'),
(2, 'EA'),
(2, 'EE')], 
19
)
```
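
The ordering in the example above (highest score first, ties broken alphabetically) can be produced with a single sort key. The snippet below is a small illustration on a hand-picked subset of the tuples from the example; the variable names are assumptions.

```python
# A few (score, word) tuples from the example output, deliberately shuffled.
results = [(5, "IF"), (17, "FEEZE"), (2, "EA"), (17, "FEAZE"), (5, "EF")]

# Negating the score sorts it in descending order, while the word
# (second key) still sorts in ascending alphabetical order.
ordered = sorted(results, key=lambda pair: (-pair[0], pair[1]))

print(ordered)
# [(17, 'FEAZE'), (17, 'FEEZE'), (5, 'EF'), (5, 'IF'), (2, 'EA')]
```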

An example wildcard invocation and output:
```
run_scrabble("?F") -> (
[(4, 'EF'),
(4, 'FA'),
(4, 'FE'),
(4, 'FY'),
(4, 'IF'),
(4, 'OF')],
6
)
```
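
One possible way to apply the wildcard rules is to count tiles instead of generating permutations: a word can be made if the letters missing from the rack can be covered by the available wildcards, and using real tiles wherever possible gives the highest score. The sketch below assumes a hypothetical helper named `best_score` and an upper-case copy of the scoring dictionary; it is one approach under those assumptions, not the required design.

```python
from collections import Counter

# Upper-case copy of the assignment's letter values.
LETTER_SCORES = {"A": 1, "B": 3, "C": 3, "D": 2, "E": 1, "F": 4, "G": 2,
                 "H": 4, "I": 1, "J": 8, "K": 5, "L": 1, "M": 3, "N": 1,
                 "O": 1, "P": 3, "Q": 10, "R": 1, "S": 1, "T": 1, "U": 1,
                 "V": 4, "W": 4, "X": 8, "Y": 4, "Z": 10}


def best_score(word, rack):
    """Return the highest score for word from rack, or None if it cannot be made."""
    tiles = Counter(rack.upper())
    # Remove the wildcards from the tile counts and remember how many there are.
    wildcards = tiles.pop("*", 0) + tiles.pop("?", 0)
    score = 0
    for letter, needed in Counter(word.upper()).items():
        from_rack = min(needed, tiles[letter])  # real tiles score points
        wildcards -= needed - from_rack         # the shortfall costs wildcards
        if wildcards < 0:
            return None
        score += from_rack * LETTER_SCORES[letter]
    return score


print(best_score("IF", "I?F"))  # 5: both letters come from real tiles
print(best_score("IF", "?F"))   # 4: the wildcard stands in for I and scores 0
print(best_score("AA", "*?"))   # 0: a word made only of wildcards
```

With a helper like this, a solution could scan the word list once and keep every word whose result is not `None`, which avoids enumerating every permutation of the rack.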

#### The Data
The file: http://courses.cms.caltech.edu/cs11/material/advjava/lab1/sowpods.zip or https://drive.google.com/file/d/1ewUiZL_4HanCDsaYB5pcKEgqjMFVgGnh/view?usp=sharing contains all "valid Scrabble English" words in the official words list, one word per line. You should download the word file and keep it in your repository so that the program is standalone (instead of accessing it over the web from Python).

You can read data from a text file with the following code (you can expect sowpods.txt to be located in the same folder as your scrabble.py file):
```
with open("sowpods.txt","r") as infile:
    raw_input = infile.readlines()
    data = [datum.strip('\n') for datum in raw_input]
```

This will show the first 6 words:
```
print(data[0:6])
```
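
Because the program checks many candidate words against the word list, it may help to store the words in a `set`, where a membership test takes roughly constant time instead of scanning a list. The sketch below is an illustration only: `load_words` is a hypothetical helper, the upper-casing is an assumption made to match the required upper-case output, and a small temporary file stands in for sowpods.txt so that the example runs on its own.

```python
import os
import tempfile


def load_words(path):
    """Read one word per line and return the words as an upper-case set."""
    with open(path, "r") as infile:
        return {line.strip().upper() for line in infile if line.strip()}


# Demo only: write a tiny stand-in word file and load it.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo_words.txt")
    with open(demo_path, "w") as outfile:
        outfile.write("aa\nab\nfaze\n")
    words = load_words(demo_path)

print("FAZE" in words)  # True
print("ZZZ" in words)   # False
```

The demo lines at module level belong in the notebook only; the submitted `.py` files should contain just imports and function definitions.
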
Please use the dictionary below containing the letters and their Scrabble values:
```
scores = {"a": 1, "c": 3, "b": 3, "e": 1, "d": 2, "g": 2,
         "f": 4, "i": 1, "h": 4, "k": 5, "j": 8, "m": 3,
         "l": 1, "o": 1, "n": 1, "q": 10, "p": 3, "s": 1,
         "r": 1, "u": 1, "t": 1, "w": 4, "v": 4, "y": 4,
         "x": 8, "z": 10}
```
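
As a quick check of the values above, the score of a word without wildcards is the sum of its letter values. The helper name `score_letters` is hypothetical, and this sketch deliberately ignores wildcards.

```python
scores = {"a": 1, "c": 3, "b": 3, "e": 1, "d": 2, "g": 2,
          "f": 4, "i": 1, "h": 4, "k": 5, "j": 8, "m": 3,
          "l": 1, "o": 1, "n": 1, "q": 10, "p": 3, "s": 1,
          "r": 1, "u": 1, "t": 1, "w": 4, "v": 4, "y": 4,
          "x": 8, "z": 10}


def score_letters(word):
    """Sum the letter values of a word, ignoring case (no wildcard handling)."""
    return sum(scores[letter] for letter in word.lower())


print(score_letters("FEAZE"))  # 4 + 1 + 1 + 10 + 1 = 17
print(score_letters("za"))     # 10 + 1 = 11
```

Both values agree with the first example output above.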

#### Grading Breakdown

You will get 80 points from the autograder for this assignment and 20 points will be hidden. That is, passing all of the visible tests will give you 80 points. Make sure you are meeting the requirements of the problem to get the other 20 points!

- Uploading the correctly named files and functions (5 points)
- Docstrings and comments (5 points)
- User error checking (10 points)
- Code works with no wildcards (30 points)
- Code works with one wildcard (20 points)
- Code works with two wildcards (20 points)
- Algorithm efficiency with 2 wildcards (10 points)

#### Tips:
- If you don't know what Scrabble is or the basic background of the game, please look it up online!
- We recommend that you try to break down the problem into steps on your own before writing any code. Once you've scoped generally what you want to do, then start writing some code. If you get stuck, go back to thinking about the problem rather than trying to fix lots of errors at the code level.
- If you keep getting stuck, then check out: https://wiki.openhatch.org/wiki/Scrabble_challenge or https://drive.google.com/file/d/1g3yz5ljkzaAeQ-AgQR1Hofy8ZJ0jo25x/view?usp=sharing. This is where we got the idea for this assignment and it provides some helpful tips for guiding you along the way. However, we recommend that you try to implement the program yourself first before looking at the hints on the website.

Good luck!

In [1]:
def score_word(word, wild_tile1=None, wild_tile2=None):
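    """
    Return the Scrabble score of an upper-case word.

    wild_tile1 and wild_tile2 are optional indexes of letters in word that
    were played with a wildcard tile; those positions score 0 points.
    """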
    scores_dict = {"A": 1, "C": 3, "B": 3, "E": 1, "D": 2, "G": 2,
         "F": 4, "I": 1, "H": 4, "K": 5, "J": 8, "M": 3,
         "L": 1, "O": 1, "N": 1, "Q": 10, "P": 3, "S": 1,
         "R": 1, "U": 1, "T": 1, "W": 4, "V": 4, "Y": 4,
         "X": 8, "Z": 10}
    
    score = 0
    for i, letter in enumerate(word):
        # Wildcards are scored as 0 points, so a word that consists
        # only of wildcards is still valid and scores 0 points.
        if i == wild_tile1 or i == wild_tile2:
            continue
        score += scores_dict[letter]

    return score

In [8]:
import itertools
from string import ascii_uppercase as alphabet_letters

def get_valid_word_combinations(word, all_valid_eng_words):
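    """
    Return a list of (score, word) tuples for every valid word that can be
    built from the tiles in word (an upper-case rack).

    all_valid_eng_words is the collection of valid upper-case words. The
    wildcards '*' and '?' may stand for any letter A-Z and score 0 points.
    The same word can appear more than once with different scores.
    """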
    
    # Step 1: create all potential combinations
    combinations = []
    permutations = set([])
    for i in range(1, len(word)):
        combinations.extend(list(itertools.combinations(word, i+1)))
    # Step 2: create all permutations
    for combo in combinations:
        permutations.update(list(itertools.permutations(combo)))        
    
    # Step 3: check all words
    word_combinations = []
    checked_words = set([])
    for potential_word in permutations:
        potential_word = ''.join(potential_word)
        
        # skip any words we already checked
        if potential_word in checked_words:
            continue
            
        # First, handle candidates without wildcards so that the higher-scoring version of a word gets added first.
        # If there is no wildcard, check the validity of the word directly.
        if '*' not in potential_word and '?' not in potential_word: 
            if potential_word in all_valid_eng_words:
                word_combinations.append((score_word(potential_word), potential_word))
            
        # Otherwise, handle a '*' wildcard (possibly combined with a '?')
        elif '*' in potential_word:
            # replace wild card with each letter of the alphabet
            for letter in alphabet_letters:
                wild_word = potential_word.replace('*', letter)
                # check for a second wild card
                if '?' in wild_word:
                    for second_letter in alphabet_letters:
                        wild_wild_word = wild_word.replace('?', second_letter)
                        if wild_wild_word in all_valid_eng_words:
                            word_combinations.append((score_word(wild_wild_word, potential_word.find('*'), potential_word.find('?')), wild_wild_word))
                else:
                    if wild_word in all_valid_eng_words:
                        word_combinations.append((score_word(wild_word, potential_word.find('*')), wild_word))

                        
        # check for a '?' wildcard only
        elif '?' in potential_word:
            for letter in alphabet_letters:
                wild_word = potential_word.replace('?', letter)
                if wild_word in all_valid_eng_words:
                    word_combinations.append((score_word(wild_word, potential_word.find('?')), wild_word))

        # Step 4: record this candidate as checked (add the whole string, not its characters)
        checked_words.add(potential_word)
        
    return word_combinations


In [43]:
import time
from operator import itemgetter


# For partial credit, your program should take less than one minute to run with 2 wildcards in the input. 
# For full credit, the program needs to run with 2 wildcards in less than 30 seconds
def run_scrabble(word):
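    """
    Find every valid Scrabble word that can be made from the rack in word.

    Returns a tuple of (list of (score, word) tuples, number of words).
    If the rack is invalid, returns an error message string instead.
    """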
    start_time = time.time()
    
    word = word.upper()
    # Step 0: Error checking
    # Allow anywhere from 2-7 character tiles (letters A-Z, upper or lower case) to be inputted
    if len(word) > 7 or len(word) < 2:
        return "The rack is either too long or too short. Please enter between 2 and 7 character tiles."
    # Compare against 'A'-'Z' directly because isalpha() also accepts non-English letters
    if not all('A' <= character <= 'Z' or character == '?' or character == '*' for character in word):
        return "The rack contains a character that is not allowed. Please limit character tiles to letters A-Z, * and ?."
    if word.count('*') > 1 or word.count('?') > 1:
        return "Too many wild card tiles. A maximum of one * and one ? are allowed."
    
    # Setup: read in the valid English words.
    # A set gives fast membership checks; upper() matches the upper-cased rack.
    with open("sowpods.txt","r") as infile:
        raw_input = infile.readlines()
        all_valid_eng_words = {datum.strip('\n').upper() for datum in raw_input}
        
    # Step 1: Get all valid word combinations
    word_combinations = get_valid_word_combinations(word, all_valid_eng_words)
    
    # In a wildcard case where the same word can be made with or without the wildcard,
    # keep only the highest score for each word (built in a dict to avoid
    # removing items from the list while iterating over it)
    best_scores = {}
    for score, valid_word in word_combinations:
        if score > best_scores.get(valid_word, -1):
            best_scores[valid_word] = score
    word_combinations = [(score, valid_word) for valid_word, score in best_scores.items()]
            
    # Step 2: Sort words by scores
    # Highest score first, then alphabetically by word, as in the required output
    sorted_word_list = sorted(word_combinations, key=lambda element: (-element[0], element[1]))
    
    end_time = time.time()
    print("Time to run:", round(end_time - start_time, 2), "seconds")
    return sorted_word_list, len(sorted_word_list)


In [50]:
run_scrabble('?A')[0]

Time to run: 0.16 seconds


[(1, 'AA'),
 (1, 'AB'),
 (1, 'AD'),
 (1, 'AE'),
 (1, 'AG'),
 (1, 'AH'),
 (1, 'AI'),
 (1, 'AL'),
 (1, 'AM'),
 (1, 'AN'),
 (1, 'AR'),
 (1, 'AS'),
 (1, 'AT'),
 (1, 'AW'),
 (1, 'AX'),
 (1, 'AY'),
 (1, 'BA'),
 (1, 'DA'),
 (1, 'EA'),
 (1, 'FA'),
 (1, 'HA'),
 (1, 'JA'),
 (1, 'KA'),
 (1, 'LA'),
 (1, 'MA'),
 (1, 'NA'),
 (1, 'PA'),
 (1, 'TA'),
 (1, 'YA'),
 (1, 'ZA')]

In [46]:
word_combinations = [(1, 'AA'), (1, 'AB'), (1, 'AD'), (1, 'AE'), (1, 'AG'), (1, 'AH'), (1, 'AI'), (1, 'AL'), (1, 'AM'), (1, 'AN'), (1, 'AR'), (1, 'AS'), (1, 'AT'), (1, 'AW'), (1, 'AX'), (1, 'AY'), (1, 'BA'), (1, 'DA'), (1, 'EA'), (1, 'FA'), (1, 'HA'), (1, 'JA'), (1, 'KA'), (1, 'LA'), (1, 'MA'), (1, 'NA'), (1, 'PA'), (1, 'TA'), (1, 'YA'), (1, 'ZA')]



In [47]:
len(word_combinations)

30

In [None]:
for i in range(0,len(word_combinations)-2):
    first_tuple = word_combinations[i]


