# Choose your own adventure story

- You are about to embark on a choose-your-own-adventure story with none other than Donald Trump! You have the option to inhabit the world of Alice in Wonderland, Grimms' Fairy Tales, or Ulysses. As you move through the story, think about your next decision carefully... danger awaits around every turn!

![Choose your own adventure](choose_your_own_adventure.png "Choose your own adventure")

> Stories to choose from
- Grimms' Fairy Tales
- Alice in Wonderland
- Ulysses

#### Resources
> Trump speeches https://github.com/ryanmcdermott/trump-speeches

> TFIDF http://stackoverflow.com/questions/12118720/python-tf-idf-cosine-to-find-document-similarity

> Mad Libs inspiration: https://www.pinterest.com/explore/mad-libs/

#### Architecture of system

![Architecture of the system](architecture.png "Architecture of the system")


In [105]:
from textblob import TextBlob
from collections import Counter
from collections import defaultdict
import nltk
from functools import reduce
import operator
import numpy as np
import re
from spacy.en import English
## The sentence generator module contains a trigram model for sentence generation
from sentence_generator import SentenceGenerator
from sklearn.metrics.pairwise import linear_kernel
from sklearn.feature_extraction.text import TfidfVectorizer
from story_chunks import place_of_story
from sklearn.metrics import accuracy_score
from Evaluation_Metrics import accuracy_score_test
import seaborn as sns

In [20]:
%load_ext autoreload

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [33]:
%autoreload 2

In [74]:
# First, we need to decide which world the user would like to inhabit
story_names = ['Grimms fairy tales','Alice in Wonderland','Ulysses']
while True:
    story = input("Which story would you like to read? You can choose Grimms fairy tales, Alice in Wonderland, or Ulysses.")
    if story in story_names:
        break
    print('Please spell the name correctly.')

Which story would you like to read? You can choose Grimms fairy tales, Alice in Wonderland, or Ulysses.Grimms fairy tales


In [24]:
## Load the Trump speeches that the sentence generator is trained on
with open("speeches.txt") as f:
    text_trump = f.read()
# Clean the Trump text: replace dashes, newlines, and special symbols with spaces
text= re.sub(r'[-!@#$%^&*()\n_]+',' ',text_trump)
text = text.lower() ## lowercase the text to increase the number of matches
# Tokenize the Trump text, count the frequency of each token, and build unigram, bigram, and trigram dictionaries
trump_story = SentenceGenerator(text)
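
The `sentence_generator` module is not shown in this notebook, so the block below is only a minimal sketch of how a trigram text generator can work, not the module's actual implementation. The names `build_trigram_model` and `generate_words` and the toy text are assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_trigram_model(text):
    """Map each pair of consecutive words to the words observed after it."""
    words = text.split()
    model = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        model[(first, second)].append(third)
    return model


def generate_words(model, num_words, seed=None):
    """Walk the trigram model, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    state = rng.choice(sorted(model))  # random starting bigram
    output = list(state)
    while len(output) < num_words:
        followers = model.get(state)
        if not followers:  # dead end: restart from another bigram
            state = rng.choice(sorted(model))
            output.extend(state)
            continue
        next_word = rng.choice(followers)
        output.append(next_word)
        state = (state[1], next_word)
    return " ".join(output[:num_words])


# Toy corpus (hypothetical); the notebook uses the cleaned speeches instead.
sample = "we are going to win . we are going to build . we are going to make america great"
toy_model = build_trigram_model(sample)
print(generate_words(toy_model, 8, seed=0))
```

Because frequent followers appear more often in each list, sampling uniformly from the list picks words in proportion to their trigram counts.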

In [75]:
## Open the appropriate story based upon the user input
if story =='Alice in Wonderland':
    with open("alice_in_wonderland_copy.txt") as f:
        text_s = f.read()
elif story =='Grimms fairy tales':
    with open("grimms_fairy_tales.txt") as f:
        text_s = f.read()
elif story =='Ulysses':
    with open("ulysses_copy.txt") as f:
        text_s = f.read()
# Clean the text: replace dashes, underscores, newlines, and the symbols !@#$%^&*() with spaces. Periods, question marks, and commas are kept.
text_t = re.sub(r'[-!@#$%^&*()\n_]+',' ',text_s)

blob_text_story = TextBlob(text_t)


In [76]:
# POS tags for the story:
# JJ - adjective, NN - noun (singular), NNS - noun (plural), NNP - proper noun (singular), VB - verb (base form)
pos_entities = defaultdict(list)
for word,pos in blob_text_story.tags:
    if pos != "":
        pos_entities[pos].append(str(word).lower())

In [77]:
#NER for the story characters and places - Entities {'GPE', 'CARDINAL', 'PRODUCT',
#'DATE', 'ORG', 'WORK_OF_ART', 'NORP', 'QUANTITY', 'TIME', 'PERSON', 'ORDINAL', 'MONEY', 'LOC', 'LANGUAGE', 'FAC'}
# takes ~ 2 minutes
nlp = English()
ner_story_tokens = nlp(text_t)
possible_entities = defaultdict(list)
for token in ner_story_tokens:
    # Keep every occurrence (the Counters below rely on repeats).
    # Skip tokens shorter than 3 characters, which are rarely real entities.
    if token.ent_type_ != "" and len(token) >= 3:
        possible_entities[token.ent_type_].append(str(token).lower())

In [78]:
## find the most common entities used in the story
people_counter = Counter(possible_entities['PERSON'])
place_counter = Counter(possible_entities['GPE'])
org_counter = Counter(possible_entities['ORG'])
language_counter = Counter(possible_entities['LANGUAGE'])
product_counter = Counter(possible_entities['PRODUCT'])
### find the most common POS for different words
single_noun_com_counter= Counter(pos_entities['NN'])
plural_noun_com_counter = Counter(pos_entities['NNS'])
verb_counter= Counter(pos_entities['VB'])

## We have a text generator for Trump and have completed named entity recognition and part-of-speech tagging for the story. Now, on to the actual adventure!

In [79]:
# Pick the most common entities and words from the story; these fill the slots in the story templates.
# story - variable holding the story name
# The Counters are defined in the previous cell
protagonist=people_counter.most_common(1)[0][0].lower()
antagonist = people_counter.most_common(2)[1][0].lower()
place_one= place_counter.most_common(1)[0][0].lower()
place_two = place_counter.most_common(2)[1][0].lower()
language_one = language_counter.most_common(1)[0][0].lower()
single_noun_one =single_noun_com_counter.most_common(1)[0][0].lower()
single_noun_two = single_noun_com_counter.most_common(2)[1][0].lower()
plural_noun_one = plural_noun_com_counter.most_common(1)[0][0].lower()
plural_noun_two = plural_noun_com_counter.most_common(2)[1][0].lower()
verb_one = verb_counter.most_common(1)[0][0].lower()
verb_two = verb_counter.most_common(2)[1][0].lower()
verb_three = verb_counter.most_common(3)[2][0].lower()
# Items the user can pick from (stored later as utility_item_chosen, funny_item_chosen, trans_item_chosen)
utility_items='pocketknife flashlight'.split(' ')
funny_items='half-eaten-sandwich broken-lightbulb'.split(' ')
transportation_items = 'skateboard bike'.split(' ')
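
Indexing `most_common(n)` directly raises an `IndexError` when a story yields fewer than `n` entities of a given type (for example, no `LANGUAGE` entities). A minimal sketch of a guarded lookup follows; the helper name `nth_most_common` and the fallback values are assumptions for illustration, not part of the original notebook.

```python
from collections import Counter


def nth_most_common(counter, n, default="something"):
    """Return the n-th most common key (1-indexed), lowercased, or `default`."""
    ranked = counter.most_common(n)
    if len(ranked) < n:
        return default  # not enough distinct entries in this Counter
    return ranked[n - 1][0].lower()


# Toy Counters (hypothetical) standing in for people_counter and language_counter.
people = Counter(["hans", "hans", "gretel"])
languages = Counter()  # e.g. no LANGUAGE entities were found

print(nth_most_common(people, 1))
print(nth_most_common(people, 2))
print(nth_most_common(languages, 1, default="english"))
```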

In [82]:
print('Welcome to a choose your own adventure story!\n Below you will pick a story, and you will explore that world with a fellow companion. \
Throughout the story, you will have different options for next steps to take. \
Your goal is to choose the options that lead you safely on your adventure.  \
However, be careful! Danger awaits around every turn.')

Welcome to a choose your own adventure story!
 Below you will pick a story, and you will explore that world with a fellow companion. Throughout the story, you will have different options for next steps to take. Your goal is to choose the options that lead you safely on your adventure.  However, be careful! Danger awaits around every turn.


In [83]:
###### The first section of the story   
print('You awake and look around "Hey, I think I recognize this place", you think to yourself. You realize \
that you are sitting down on something, you can not quite tell what, but you feel strangely awake.\
You turn to your right and see {} from {}. On your left, you see \
Donald Trump. You wonder where you are and how this all happened, but before you can think too much \
Trump turns to you and says '.format(protagonist,story),\
      trump_story.sentence_generate(3),\
        'Well that was weird you think to yourself. Shaking it off, you turn to {} who says, \
        "Welcome to {}! We are excited to have you and your friend explore our \
        world. Before we start what do you want to bring with you on your \
        journey?'.format(protagonist,story),\
        "'Please pick one from each category,' instructs {}.".format(protagonist))
print()## keep track of the options chosen
while True:
    print()
    print(utility_items)
    utility_item_chosen = input("Which item would you like?")
    if utility_item_chosen in utility_items:
        break
    else:
        print('Please spell the word correctly without the quotes')
while True:
    print()
    print(funny_items)
    funny_item_chosen = input("Which of these would you like?")
    if funny_item_chosen in funny_items:
        break
    else:
        print('Please spell the word correctly without the quotes')
while True:
    print()
    print(transportation_items)
    trans_item_chosen = input('Which of these would you like?')
    if trans_item_chosen in transportation_items:
        break
    else:
        print('Please spell the word correctly without the quotes')




You awake and look around "Hey, I think I recognize this place", you think to yourself. You realize that you are sitting down on something, you can not quite tell what, but you feel strangely awake.You turn to your right and see hans from Grimms fairy tales. On your left, you see Donald Trump. You wonder where you are and how this all happened, but before you can think too much Trump turns to you and says  She violated section so and so many things we’re going to die.’ ‘oh shut up , makes a lot of people think i’m going to get in . I could even save more . But it has been a lot of things , without any effective plan for victory with the devaluations of so badly , and we are and with all of these areas have still never recovered .  Well that was weird you think to yourself. Shaking it off, you turn to hans who says,         "Welcome to Grimms fairy tales! We are excited to have you and your friend explore our         world. Before we start what do you want to bring with you on your     
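
The three item prompts in the cell above share the same ask-until-valid loop. As a sketch only, the loop could be factored into one helper; `ask_choice`, `input_fn`, and `print_fn` are hypothetical names, and the case-insensitive matching is an added assumption rather than the notebook's behavior.

```python
def ask_choice(options, prompt, input_fn=input, print_fn=print):
    """Keep asking until the reply matches one of `options` (case-insensitive)."""
    lookup = {option.lower(): option for option in options}
    while True:
        print_fn(options)
        reply = input_fn(prompt).strip().lower()
        if reply in lookup:
            return lookup[reply]
        print_fn("Please spell the word correctly without the quotes")


# Scripted replies stand in for keyboard input so the sketch runs unattended.
scripted = iter(["Flashlite", " flashlight "])
chosen = ask_choice(["pocketknife", "flashlight"], "Which item would you like? ",
                    input_fn=lambda _prompt: next(scripted))
print(chosen)
```

In the notebook, the default `input_fn=input` would be used, for example `utility_item_chosen = ask_choice(utility_items, "Which item would you like? ")`.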

In [84]:
#First choice by the user
choice1=[]
while True:
    choice = input('Please write a sentence for what you would like to do next\
    incorporating an item you have chosen. If you want to see the items you have, type items.')  
    if choice =='items':
        print(utility_item_chosen,trans_item_chosen,funny_item_chosen)
    else:
        choice1.append(choice)
        break

Please write a sentence for what you would like to do next    incorporating an item you have chosen. If you want to see the items you have, type items.items
flashlight bike half-eaten-sandwich
Please write a sentence for what you would like to do next    incorporating an item you have chosen. If you want to see the items you have, type items.i want to ride my bike after eating my sandwich.


### After each block of story, the sentence entered by the user is passed to an IR system (tf-idf with cosine similarity) that picks the next part of the story.
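
To make the retrieval step concrete, here is a minimal pure-Python sketch of tf-idf weighting followed by cosine similarity. It follows what I understand to be the `TfidfVectorizer` defaults (smoothed idf, L2 normalisation, tokens of two or more characters), but it is an illustration, not the scikit-learn implementation. The function names and the toy chunks are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep tokens of two or more word characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def fit_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """L2-normalised tf-idf weights; terms unseen at fit time are dropped."""
    counts = Counter(term for term in tokenize(text) if term in idf)
    weights = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()} if norm else {}


def best_chunk(chunks, query):
    """Index of the chunk whose tf-idf vector is most similar to the query."""
    idf = fit_idf(chunks)
    query_vec = tfidf_vector(query, idf)
    scores = []
    for chunk in chunks:
        chunk_vec = tfidf_vector(chunk, idf)
        # Dot product of unit-length vectors equals their cosine similarity.
        scores.append(sum(w * chunk_vec.get(term, 0.0) for term, w in query_vec.items()))
    return max(range(len(chunks)), key=scores.__getitem__)


# Toy story chunks (hypothetical); the notebook gets them from place_of_story.
toy_chunks = [
    "You jump on your bike and start going.",
    "You switch on your flashlight and look around.",
    "You hand the stranger your half-eaten-sandwich.",
]
print(best_chunk(toy_chunks, "i want to ride my bike"))
```

Because every vector is scaled to unit length, a plain dot product already gives the cosine similarity, which is why `linear_kernel` can be used on tf-idf output in the cells below.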

In [85]:
# the first section of possible texts for the story
storychunk_one = place_of_story(1,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(2),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three) #protagonist



In [86]:
tfidf_one = TfidfVectorizer() 
tf_idf_storyone= tfidf_one.fit_transform(storychunk_one) ##tf-idf on the possible story chunks in storychunk_one
choice1_tfidf = tfidf_one.transform(choice1) ## turn the user sentence into a vector
cosine_sim_one = linear_kernel(tf_idf_storyone,choice1_tfidf).flatten() ## cosine similarity between the story chunks and user sentence
re.sub(r'(\t)','',storychunk_one[cosine_sim_one.argsort()[::-1][0]]) ##return the story chunk

' "It is time to explore!" you announce to hans . With that, you jump on your bike, put Trump on the handlebars, and start going. "Wait!" yells hans "You are going to need this," she says handing you a king. What do you do next? '

In [87]:
choice2=[]
while True:
    choice = input('Please write a sentence for what you would like to do next \
    incorporating an item you have chosen. If you want to see the items you have, type items.')  
    if choice =='items':
        print(utility_item_chosen,trans_item_chosen,funny_item_chosen)
    else:
        choice2.append(choice)
        break


Please write a sentence for what you would like to do next     incorporating an item you have chosen. If you want to see the items you have, type items.ride my bike away


In [88]:
# the second section of possible texts for the story
storychunk_two = place_of_story(2,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(3),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three) 
tfidf_two = TfidfVectorizer()  #create a model
tf_idf_storytwo= tfidf_two.fit_transform(storychunk_two) ##tf-idf on the possible story chunks in storychunk_two
choice2_tfidf = tfidf_two.transform(choice2) ## turn the user sentence into a vector
cosine_sim_two = linear_kernel(tf_idf_storytwo,choice2_tfidf).flatten() ## cosine similarity between the story chunks and user sentence
re.sub(r'(\t)','',storychunk_two[cosine_sim_two.argsort()[::-1][0]]) ##return the story chunk



" Feeling good about your self you decide that some fresh air would do you well.   You take your bike and start biking along the man with hans watching while Trump runs alongside. 'Be careful!' hans yells before noticing gretel also watching you. 'Not you again!' exclaims hans. ' O yes, I am here to make sure everyone is a little less comfortable' gretel says smiling. What should you do?"

In [92]:
choice3=[]
while True:
    choice = input('Please write a sentence for what you would like to do next \
    incorporating an item you have chosen. If you want to see the items you have, type items.')  
    if choice =='items':
        print(utility_item_chosen,trans_item_chosen,funny_item_chosen)
    else:
        choice3.append(choice)
        break


Please write a sentence for what you would like to do next     incorporating an item you have chosen. If you want to see the items you have, type items.eat my half-eaten-sandwich


In [93]:
# the third section of possible texts for the story
storychunk_three = place_of_story(3,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(4),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three) 

tfidf_three = TfidfVectorizer()  #create a model
tf_idf_storythree= tfidf_three.fit_transform(storychunk_three) ##tf-idf on the possible story chunks in storychunk_three
choice3_tfidf = tfidf_three.transform(choice3) ## turn the user sentence into a vector
cosine_sim_three = linear_kernel(tf_idf_storythree,choice3_tfidf).flatten() ## cosine similarity between the story chunks and user sentence
third_section = re.sub(r'(\t)','',storychunk_three[cosine_sim_three.argsort()[::-1][0]] )
third_section ##return the story chunk



'You quickly take out what is left of your half-eaten-sandwich and show it to hans.  "Good," sayd hans, "We can use this to pay off gretel and get you to the man".   You wonder where that is, or who gretel is, but you follow hans. What should you do now?'

In [94]:
if 'dead' in third_section or 'died' in third_section:
    print('You died with Trump by your side. Better luck next time!')
else:
    choice4=[]
    while True:
        choice = input('Please write a sentence for what you would like to do next \
        incorporating an item you have chosen. If you want to see the items you have, type items.')  
        if choice =='items':
            print(utility_item_chosen,trans_item_chosen,funny_item_chosen)
        else:
            choice4.append(choice)
            break


Please write a sentence for what you would like to do next         incorporating an item you have chosen. If you want to see the items you have, type items.half-eaten-sandwich


In [97]:
# The fourth chunk - only reached if the user survived the third section
storychunk_four = place_of_story(4,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(2),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three,story) 

tfidf_four = TfidfVectorizer()  #create a model
tf_idf_storyfour= tfidf_four.fit_transform(storychunk_four) ##tf-idf on the possible story chunks in storychunk_four
choice4_tfidf = tfidf_four.transform(choice4) ## turn the user sentence into a vector
cosine_sim_four = linear_kernel(tf_idf_storyfour,choice4_tfidf).flatten() ## cosine similarity between the story chunks and user sentence
fourth_section = re.sub(r'(\t)','',storychunk_four[cosine_sim_four.argsort()[::-1][0]] )
fourth_section ##return the story chunk



' "Congratulations!" says hans "You defeated gretel and survived Grimms fairy tales. Best of luck in your future adventures" '

- The secret to surviving is using the half-eaten sandwich in the third story section :). Now, let us test the accuracy of the IR system.

## Evaluation Metrics
> Accuracy

- To evaluate our system, we take a sentence entered by a user that contains an item they chose (e.g., bike) and check whether the IR system picks the response that contains that item. Since there is only one correct answer for each query, we report the average accuracy for each item over a number of queries.
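
The `Evaluation_Metrics` module is not shown here, so the following is only a sketch of the accuracy computation described above, not the actual `accuracy_score_test`. The names `retrieval_accuracy` and `overlap_retrieve` are hypothetical, the queries are plain strings rather than one-element lists, and a simple word-overlap retriever stands in for the tf-idf system.

```python
def retrieval_accuracy(queries, chunks, item, retrieve):
    """Share of queries whose top-ranked chunk mentions `item`, plus the misses."""
    failed = []
    for query in queries:
        top_chunk = chunks[retrieve(chunks, query)]
        if item not in top_chunk:
            failed.append(query)
    accuracy = 1 - len(failed) / len(queries)
    return accuracy, failed


def overlap_retrieve(chunks, query):
    """Stand-in retriever: index of the chunk sharing the most words with the query."""
    query_words = set(query.lower().split())
    overlaps = [len(query_words & set(chunk.lower().split())) for chunk in chunks]
    return overlaps.index(max(overlaps))


# Toy data (hypothetical): the last query mentions the bike but shares more
# words with the pocketknife chunk, so the retriever picks the wrong chunk.
toy_chunks = [
    "you jump on your bike and ride away",
    "you open your pocketknife and carve a stick",
]
toy_queries = ["ride my bike", "bike", "carve a stick while sitting on my bike"]
accuracy, failed = retrieval_accuracy(toy_queries, toy_chunks, "bike", overlap_retrieve)
print(accuracy, failed)
```

Returning the failed queries alongside the score makes it easy to see which wording pulled the retriever toward a different chunk.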

In [65]:
test_chunk_one = place_of_story(1,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(3),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three,story)



In [69]:
test_chunk_two = place_of_story(2,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(3),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three,story)



In [71]:
test_chunk_three = place_of_story(3,protagonist,antagonist,place_one,language_one,\
                        trump_story.sentence_generate(3),place_two,\
                          single_noun_one,single_noun_two,plural_noun_one,plural_noun_two,verb_one,verb_two,verb_three,story)



In [102]:
list_of_test_one_queries = [['pocketknife'],\
                            ['I like to run with my pocketknife'],\
                            ['This is a flashy fun way to show how cool you are with friends pocketknife'],\
                           ['Do you know who I am that I love to watch Trump with my pocketknife?'],\
                           ['This is incredible!\
                           I love to carve wood and eat food with my own pocketknife even though you know Trump\
                           has a way with words!']]

list_of_test_two_queries =[['half-eaten-sandwich'],
                          ['I like to run around the town with my half-eaten-sandwich'],
                          ["Wow, he really has a way with words with your half-eaten-sandwich"],
                          ['asdf asdf sandwich asdfa'],
                          ['This is lightsaber battle run fight half-eaten-sandwich']]

list_of_test_three_queries = [['bike'],
                             ['i like to ride my bicycle i like to ribe my bike'],
                             ['run away to the furtherst point that you can with your bike'],
                             ['looing at the moon and the stars and trump overhead you jump on the bike'],
                             ['Running, jumping, biking, with your bike along side eating, swimming, walking']]

In [183]:
# First chunk
print(accuracy_score_test(list_of_test_one_queries,test_chunk_one,'pocketknife'))
txt,num_one,quertxt,query_one = accuracy_score_test(list_of_test_one_queries,test_chunk_one,'pocketknife')
q_one = TextBlob(query_one[0][0])
len_query_one = sum(q_one.word_counts.values())
# Each query is a one-element list, so i[0] is the query string
total_avg_len_query_one = np.mean([sum(TextBlob(i[0]).word_counts.values()) for i in list_of_test_one_queries])
# Second chunk
print(accuracy_score_test(list_of_test_one_queries,test_chunk_two,'pocketknife'))
txt_two,num_two,quertxt,query_two = accuracy_score_test(list_of_test_one_queries,test_chunk_one,'pocketknife')
q_two = TextBlob(query_two[0][0])
len_query_two = sum(q_two.word_counts.values())
# Third chunk
print(accuracy_score_test(list_of_test_one_queries,test_chunk_three,'pocketknife'))
txt_three,num_three,quertxt,query_three = accuracy_score_test(list_of_test_one_queries,test_chunk_one,'pocketknife')
q_three = TextBlob(query_three[0][0])
len_query_three = sum(q_three.word_counts.values())

('The acuracy is', 0.8, 'The query that failed was :', [['This is incredible!                           I love to carve wood and eat food with my own pocketknife even though you know Trump has a way with words!']])
('The acuracy is', 1.0)
('The acuracy is', 1.0)


In [181]:
# First chunk
print(accuracy_score_test(list_of_test_two_queries,test_chunk_one,'half-eaten-sandwich'))

# Second chunk
print(accuracy_score_test(list_of_test_two_queries,test_chunk_two,'half-eaten-sandwich'))

# Third chunk
print(accuracy_score_test(list_of_test_two_queries,test_chunk_three,'half-eaten-sandwich'))

('The acuracy is', 0.8, 'The query that failed was :', [['Wow, he really has a way with words with your half-eaten-sandwich']])
('The acuracy is', 1.0)
('The acuracy is', 1.0)


In [182]:
# First chunk
print(accuracy_score_test(list_of_test_three_queries,test_chunk_one,'bike'))

# Second chunk
print(accuracy_score_test(list_of_test_three_queries,test_chunk_two,'bike'))

# Third chunk
print(accuracy_score_test(list_of_test_three_queries,test_chunk_three,'bike'))

('The acuracy is', 0.8, 'The query that failed was :', [['run away to the furtherst point that you can with your bike']])
('The acuracy is', 0.4, 'The query that failed was :', [['i like to ride my bicycle i like to ribe my bike'], ['run away to the furtherst point that you can with your bike'], ['looing at the moon and the stars and trump overhead you jump on the bike']])
('The acuracy is', 0.8, 'The query that failed was :', [['Running, jumping, biking, with your bike along side eating, swimming, walking']])


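> In these tests, the IR system reached at least 80% accuracy on eight of the nine item/chunk combinations; the exception was `bike` on the second chunk (40%). Failures occur when the query shares more vocabulary with a different story chunk than with the intended one.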