## Mad Libs Game
#### Rachel Hamelburg

### For an explanation of how the game works, why I chose the data structure I did, and a walkthrough of the code, please see the Project Report below. No video is provided because all steps / output are captured in this notebook.

In [97]:
import nltk
from nltk.tokenize import word_tokenize

start = input("Welcome to Mad Libs Create Your Own Template! You need two players to play the game. Player 1 will create a story. Player 2 will be asked for words with different parts-of-speech, without being able to see the story that player 1 created. After player 2 completes all the prompts, player 1 will read the story aloud to player 2. When both players are ready, please enter Ready ")

# Keep asking until the players confirm they are ready
while start.lower() != 'ready':
    start = input("Let's try that again. Type 'Ready' :)")


player_1 = input(f'\nGreat! Player 1, enter your name. ')

player_2 = input(f'\nNice to meet you, {player_1}. Player 2, how about you? ')

print(f'\nHi, {player_2}. ')

story = input(f'\n{player_1}, your turn! Time to write a short story. Make sure {player_2} cannot see your screen! If you would like to see some title names to inspire you, that is another option! "Write" or "Generate":')
if story.lower() == 'generate':
    print('''
    
    ''')
    print('Not a problem! Writing is hard. Check out these titles as a start:')
    print(' ')
    title_1 = 'My Worst Nightmare'
    title_2 = 'The Best Place on Earth'
    title_3 = 'Guess What I Just Saw!'
    title_4 = "Where's my Wallet?"
    title_5 = "The Great Baking Disaster"
    print(title_5)
    print('''
    
    
    ''')
    print(f'\n{title_1}')
    print(f'\n{title_2}')
    print(f'\n{title_3}') 
    print(f'\n{title_4}')
    print(f'\n{title_5}')
    help_me = input("If you don't like any of them, type 'More'. Otherwise, 'Ready'.")
    print('')
    if help_me.lower() == 'more':
        print('')
        title_6 = 'The Time I Met my Favorite Celebrity'
        title_7 = 'How to Make My Favorite Meal'
        title_8 = 'If I Were President...'        
        title_9 = 'A Snowy Day in the Park'
        title_10 = 'My Funniest Memory'
        print("Here are a few more for you.")
        print(f'\n{title_6}')
        print(f'\n{title_7}')
        print(f'\n{title_8}') 
        print(f'\n{title_9}')
        print(f'\n{title_10}')

        madlib_title = input("If you're feeling reaaalllly stuck, check this out: https://blog.reedsy.com/plot-generator/ . When you are ready to begin, enter a title: ")
        write = input("Great! I can't wait to see what you come up with!")
    else:
        madlib_title = input("Great! Let's choose a title. Either type in one from above or come up with your own, if you've been inspired: ")
        write = input("Great! I can't wait to see what you come up with. Make sure only you can see the screen.")
else:
    madlib_title = input("Great! Let's choose a title: ")
    write = input("Creative! Time to start writing. Make sure Player 2 can't see your screen.")

words = word_tokenize(write)
tagged = nltk.pos_tag(words)

pos_list = ['NN', 'JJ', 'VBG', 'VBD', 'RB']

# Large space to prevent player 2 from seeing the story player 1 has typed

print('\n' * 20)

print(f'{player_2}, your turn!')
for i, (word, pos) in enumerate(tagged):
    if pos in pos_list:
        label = (
            "Past-Tense Verb" if pos == 'VBD' else
            "Noun" if pos == 'NN' else
            "Adjective" if pos == 'JJ' else
            "Verb-ING" if pos == 'VBG' else
            "Adverb" if pos == 'RB' else
            pos  # fall back to the raw tag for anything unlabeled
        )
        
        user_input = input(f'{label}: ')
        tagged[i] = (user_input, pos)

reconstructed_story = ' '.join([word for word, pos in tagged])
see_madlib = input(f'Your masterpiece is completed! {player_1}, Ready to read? Type ready!')
if see_madlib.lower() == 'ready':
    print(f'Title: {madlib_title}, by {player_1} and {player_2}')
    print('')
    print(reconstructed_story)


Welcome to Mad Libs Create Your Own Template! You need two players to play the game. Player 1 will create a story. Player 2 will be asked for words with different parts-of-speech, without being able to see the story that player 1 created. After player 2 completes all the prompts, player 1 will read the story aloud to player 2. When both players are ready, please enter Ready Ready
Great! Player 1: Please enter your name. Rachel

Nice to meet you, Rachel. Player 2, how about you? Julia

Hi, Julia. 

Rachel, your turn! Time to write a short story. Make sure Julia cannot see your screen! If you would like to see some title names to inspire you, that is another option! "Write" or "Generate":Generate

    
    
Not a problem! Writing is hard. Check out these titles as a start:
 
My Worst Nightmare

The Best Place on Earth

Guess What I Just Saw!

Where's my Wallet?

The Great Baking Disaster

    
    
    
If you don't like any of them, type 'More'. Otherwise, 'Ready'.More


Here are a few 

## Project Report

For this project, I decided to implement a naive, virtual version of "Mad Libs". "Mad Libs" is a popular 2-player word game. Typically, Player 1 chooses a story from a collection of stories. Each story is mostly complete, but there are blanks where certain parts of speech (pos) would go. Player 1 prompts Player 2 for those pos's without reading them the story. At the end, Player 1 reads the story out loud to Player 2. One funny example: https://www.youtube.com/watch?v=6iClgRjmTvc&list=PLykzf464sU98HizZQOzYfvUOjjRSm9XLA&index=7 .

In my application, Player 1 does not choose a story, but comes up with one. Then, the application prompts Player 2 for words, reconstructs the story with those words, and Player 1 reads the story out loud. As we walk through the code, I will explain why and how the most appropriate structure to support my application is an array: more specifically, a dynamic array, implemented with Python's list type.

The first part of my code just deals with getting user input. It explains the directions, asks for the names of the players, and offers titles for inspiration if Player 1 needs them. You can see in the above code that I entered my name and my friend's name, asked to see two rounds of potential title options, entered one of those titles, and then created my story.

The core of the application comes after Player 1's story is entered and stored in 'write', and this is where I will explain the approach step by step. In my explanation, I will replace the user's story with some sample text for readability and easier comprehension.

To process Player 1's input, I imported a word tokenizer from NLTK, a library with useful packages for NLP tasks in Python. As the name suggests, word_tokenize tokenizes our text into words (more specifically, it splits a string into a list of substrings). Passing that list to NLTK's pos tagger produces a list of tuples, 'tagged' (in this explanation, I've labeled it 'sample_tagged'). Each tuple is composed of a tokenized word and its associated part of speech. Here is an example of how this works:

In [103]:
import nltk
from nltk.tokenize import word_tokenize

sample_text = "The dog sleepily walked away."
sample_words = word_tokenize(sample_text)
sample_tagged = nltk.pos_tag(sample_words)
print('Tokenized text',sample_words)
print('Tagged text',sample_tagged)
print('List type: ', type(sample_tagged))
print('List item type: ', type(sample_tagged[1]))


Tokenized text ['The', 'dog', 'sleepily', 'walked', 'away', '.']
Tagged text [('The', 'DT'), ('dog', 'NN'), ('sleepily', 'RB'), ('walked', 'VBD'), ('away', 'RB'), ('.', '.')]
List type:  <class 'list'>
List item type:  <class 'tuple'>


At first glance, one might ask why I did not convert this list into a dictionary. After all, after using the NLTK pos-tagger, many go on to change the data structure for further processing, like graph representations (for understanding semantic relationships), objects and classes (for N.E.R.), and more. I originally thought that implementing a dictionary with key (pos) and value (words) pairs would be the most efficient option for this project, and was hesitant at the thought of just replacing tuple items. However, the key to successful word games like Mad Libs is maintaining word position in a sentence. This is why I decided to stick with a list. In a dictionary, when several values are associated with one key, they have to be stored in an iterable, like a list. You would have to iterate over all the values in that list before moving to the next key (pos), which disturbs the order of the sentence. Have a look at the output in the cell below, which compares the original text with the modified text when we use a dictionary and the sentence contains 2 adverbs.

In [1]:
import nltk
from nltk.tokenize import word_tokenize
from collections import OrderedDict

sample_text = "The dog sleepily walked away."
sample_words = word_tokenize(sample_text)
sample_tagged = nltk.pos_tag(sample_words)

sample_dict = OrderedDict()
for word, pos in sample_tagged:
    if pos not in sample_dict:
        sample_dict[pos] = [word]
    else:
        sample_dict[pos].append(word)

print("Original Dictionary:")
print(sample_dict)
print('')

for pos_tag, words in sample_dict.items():
    for i in range(len(words)):
        new_word = input(f"{pos_tag}: {words[i]}: ")
        words[i] = new_word

print("\nModified Dictionary:")
print('')
print(sample_dict)

modified_sentence = ' '.join(word for words in sample_dict.values() for word in words)
print('')
print('Original Sentence:', sample_text)
print('')
print('Modified Sentence', modified_sentence)

Original Dictionary:
OrderedDict([('DT', ['The']), ('NN', ['dog']), ('RB', ['sleepily', 'away']), ('VBD', ['walked']), ('.', ['.'])])

DT: The: The
NN: dog: dog
RB: sleepily: sleepily
RB: away: away
VBD: walked: walked
.: .: .

Modified Dictionary:

OrderedDict([('DT', ['The']), ('NN', ['dog']), ('RB', ['sleepily', 'away']), ('VBD', ['walked']), ('.', ['.'])])

Original Sentence: The dog sleepily walked away.

Modified Sentence The dog sleepily away walked .


On the other hand, with a list of tuples, the order of the words in every sentence is maintained. We can replace tuple items at particular indices with ease.

After instantiating 'tagged', I made a list called pos_list containing the relevant pos's (those which I wanted to extract for my project implementation). You'll notice below that I only chose 5 pos's: nouns, adjectives, verb-ings, past-tense verbs, and adverbs, respectively. I did so because having Player 1 create the story already allows for great variability in sentence structure. In short, it already has the potential to get pretty wacky, so I didn't want to implement too many possible changes at the expense of interpretability.

In [3]:
pos_list_sample = ['NN', 'JJ', 'VBG', 'VBD', 'RB']

The next step is to iterate through 'tagged' with enumerate, which yields each entry as (index, (word, pos)). The application looks for pos's in the 'tagged' list that match those in pos_list. For those matches, we create user-friendly labels, so that the user is prompted with "Noun", "Adjective", etc., instead of Penn Treebank tags such as "NN" and "JJ". The last line replaces the tuple at the appropriate index with a new tuple containing the player's input and the original pos. This is another benefit of the list in Python, and another reason why I chose it: it is mutable and dynamic. So, while we can't change the items within a tuple without replacing it entirely, we can replace, add, and remove tuples in the list as we please.

In [6]:
for i, (word_sample, pos_sample) in enumerate(sample_tagged):
    if pos_sample in pos_list_sample:
        label_sample = (
            "Noun" if pos_sample == 'NN' else
            "Adjective" if pos_sample == 'JJ' else
            "Adverb" if pos_sample == 'RB' else
            "Past-Tense Verb" if pos_sample == 'VBD' else
            "Verb-ING" if pos_sample == 'VBG' else
            "Unknown"   
        )
        user_input_sample = input(f'{label_sample}: ')
        sample_tagged[i] = (user_input_sample, pos_sample)
    print(f'\n{sample_tagged[i]}')


('The', 'DT')
Noun: cat

('cat', 'NN')
Adverb: happily

('happily', 'RB')
Past-Tense Verb: pranced

('pranced', 'VBD')
Adverb: along

('along', 'RB')

('.', '.')
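
The chained conditional above could also be written as a dictionary lookup. The following is a minimal sketch, not the code used in the game: the names `POS_LABELS`, `label_for`, and `fill_blanks` are hypothetical, the tagged list is hard-coded rather than produced by NLTK, and the replacement words are passed in as a list instead of being read with `input()` so that the example runs without interaction.

```python
# Hypothetical mapping from Penn Treebank tags to player-friendly labels.
POS_LABELS = {
    'NN': 'Noun',
    'JJ': 'Adjective',
    'VBG': 'Verb-ING',
    'VBD': 'Past-Tense Verb',
    'RB': 'Adverb',
}

def label_for(pos):
    """Return the friendly label for a tag, or None if the tag is not prompted for."""
    return POS_LABELS.get(pos)

def fill_blanks(tagged, replacements):
    """Replace each promptable (word, pos) tuple, in order, with the next replacement word."""
    replacements = iter(replacements)
    result = list(tagged)  # copy so the original list is left untouched
    for i, (word, pos) in enumerate(result):
        if label_for(pos) is not None:
            result[i] = (next(replacements), pos)
    return result

# Hard-coded sample matching the tagger output shown earlier.
sample = [('The', 'DT'), ('dog', 'NN'), ('sleepily', 'RB'),
          ('walked', 'VBD'), ('away', 'RB'), ('.', '.')]
filled = fill_blanks(sample, ['cat', 'happily', 'pranced', 'along'])
print(filled)
```

With this layout, supporting another part of speech would only mean adding one entry to the dictionary, and the list of tuples still keeps every word at its original position.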


Then, all that is left to do is rebuild the text! This is yet another reason why the array is integral. We can use "join" to join a list's elements, separated by a space, which allows us to read the output as a sequence of sentences.

In [94]:
reconstructed_sentence_sample = ' '.join([word_sample for word_sample, pos_sample in sample_tagged])
print("Original Sentence:", sample_text)
print("Modified Sentence:", reconstructed_sentence_sample)

Original Sentence: The dog sleepily walked away.
Modified Sentence: The cat happily pranced along .
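
The space before the final period comes from joining every token, punctuation included, with a space. One possible way to tidy this up, shown here as a sketch rather than as part of the game, is to attach punctuation tokens to the preceding word. The function name `join_tokens` and the set of punctuation marks are assumptions chosen for illustration, and the tagged list is hard-coded.

```python
# Punctuation that should attach to the preceding word (an illustrative, incomplete set).
CLOSING_PUNCTUATION = {'.', ',', '!', '?', ';', ':'}

def join_tokens(tokens):
    """Join word tokens with spaces, without inserting a space before punctuation."""
    pieces = []
    for token in tokens:
        if token in CLOSING_PUNCTUATION and pieces:
            pieces[-1] += token  # glue punctuation onto the previous word
        else:
            pieces.append(token)
    return ' '.join(pieces)

tagged_sample = [('The', 'DT'), ('cat', 'NN'), ('happily', 'RB'),
                 ('pranced', 'VBD'), ('along', 'RB'), ('.', '.')]
print(join_tokens(word for word, pos in tagged_sample))
```

This sketch does not handle contractions or quotation marks, which the tokenizer also splits into separate tokens. NLTK ships a `TreebankWordDetokenizer` in `nltk.tokenize.treebank` that may cover more of those cases.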


As you can probably tell, the NLTK pos tagger is not perfect, and neither is the way my application uses it. For example, the prompts do not capture plurality: only singular nouns ('NN') are replaced, while plural nouns, which the Penn Treebank tagset labels 'NNS', are left as written. I think the integration of an LLM would be great for this project. You could have it generate stories based on an idea you have, and its understanding of pos's undoubtedly has the potential to be more robust.