# morning drinks for existential questions

by nicolás escarpentier

## Structure of this notebook
I want to be able to swap the source dialog while the recipe structure stays the same. The target question and answer words are inputs that I provide.

The flow of this notebook will be the following:
- load recipe structures
- create tracery grammar rules
- create dialog loader
  - this builds the question/answer pairs; the verb and ingredient token lists are derived from them
- construct functions that let me use all from above with the input of q&a targets

## import packages

In [1]:
# python packages
import random as rng
from collections import Counter
import numpy as np
from numpy import dot
from numpy.linalg import norm

In [2]:
# spacy
import spacy
nlp = spacy.load('en_core_web_md')

In [3]:
# tracery
import tracery
from tracery.modifiers import base_english

In [4]:
# recipe scraper package
from recipe_scrapers import scrape_me

### functions
I also define some base functions that were used in class and are not specific to this project.

In [5]:
# list functions
def remove_all(bye, words):
    while(bye in words):
        words.remove(bye)

In [6]:
# vector addition
def addv(coord1, coord2):
    return [c1 + c2 for c1, c2 in zip(coord1, coord2)]

# vector subtraction
def subtractv(coord1, coord2):
    return [c1 - c2 for c1, c2 in zip(coord1, coord2)]

# vector average
def meanv(coords):
    # assumes every item in coords has same length as item 0
    sumv = [0] * len(coords[0])
    for item in coords:
        for i in range(len(item)):
            sumv[i] += item[i]
    mean = [0] * len(sumv)
    for i in range(len(sumv)):
        mean[i] = float(sumv[i]) / len(coords)
    return mean
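
A quick sanity check of the vector helpers on toy data, as a minimal sketch. The helpers are restated in compact form so the snippet runs on its own (this `meanv` uses `zip` instead of index loops), and the vectors `a` and `b` are made-up values, not spaCy output.

```python
# compact restatements of the helpers above, so this runs standalone
def addv(coord1, coord2):
    return [c1 + c2 for c1, c2 in zip(coord1, coord2)]

def subtractv(coord1, coord2):
    return [c1 - c2 for c1, c2 in zip(coord1, coord2)]

def meanv(coords):
    # element-wise mean; assumes every vector has the same length
    return [sum(column) / len(coords) for column in zip(*coords)]

# toy vectors (hypothetical values)
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]

total = addv(a, b)       # [4.0, 4.0, 4.0]
diff = subtractv(a, b)   # [-2.0, 0.0, 2.0]
mean = meanv([a, b])     # [2.0, 2.0, 2.0]
print(total, diff, mean)
```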

In [7]:
# get spacy vector
def vec(s):
    return nlp.vocab[s].vector

# get spacy sentence vector
def sentvec(s):
    sent = nlp(s)
    return meanv([w.vector for w in sent])

# cosine similarity
def cosine(v1, v2):
    if norm(v1) > 0 and norm(v2) > 0:
        return dot(v1, v2) / (norm(v1) * norm(v2))
    else:
        return 0.0

# closest word to target vector from token list
def spacy_closest(token_list, vec_to_check, n=10):
    return sorted(token_list, key=lambda x: cosine(vec_to_check, vec(x)), reverse=True)[:n]

# closest sentence to target vector from token list
def spacy_closest_sent(token_list, vec_to_check, n=10):
    return sorted(token_list, key=lambda x: cosine(vec_to_check, sentvec(x)), reverse=True)[:n]
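
To see how the ranking behaves without loading a spaCy model, here is a small sketch that uses a hypothetical dictionary of 2-d vectors (`toy_vectors`) in place of `vec`. The `closest` helper is an illustrative stand-in for `spacy_closest`, not the function used in the rest of the notebook.

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

def cosine(v1, v2):
    # same guard as above: a zero vector has no direction, so return 0.0
    if norm(v1) > 0 and norm(v2) > 0:
        return dot(v1, v2) / (norm(v1) * norm(v2))
    return 0.0

# hypothetical 2-d "embeddings" standing in for spaCy vectors
toy_vectors = {
    "lime":  np.array([1.0, 0.1]),
    "lemon": np.array([0.9, 0.2]),
    "sad":   np.array([0.0, 1.0]),
    "blank": np.array([0.0, 0.0]),  # stands in for a word with no vector
}

def closest(words, target, n=2):
    # rank words by cosine similarity to the target vector, best first
    return sorted(words, key=lambda w: cosine(target, toy_vectors[w]), reverse=True)[:n]

ranked = closest(list(toy_vectors), toy_vectors["lime"], n=2)
print(ranked)  # ['lime', 'lemon']
```

The zero-vector guard matters here: without it, a word with an all-zero vector would cause a division by zero instead of simply ranking last among the non-negative matches.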

## recipe structures

### units and extra ingredients

First, the extra ingredients and the list of ingredient units are written by hand.

In [8]:
ingr_units = ["fluid oz.",
             "tablespoons",
             "teaspoons",
             "oz."]

ingr_extra = ["fresh mint leaves",
             "1 lime, cut into wedges",
             "ice cubes",
             "rimming salt",
             "1 orange, sliced",
             "twist lime zest",
             "maraschino cherries",
             "pineapple wedges"]

### instructions

Next, scrape the recipes and turn their instructions into templates.

In [9]:
def verb_tracer(s):
    # look for a base-form verb (tag "VB") in the sentence; the last one found wins
    verb = None
    for word in s:
        if word.tag_ == "VB":
            verb = word
    # if there's no verb, return ""
    if verb is None:
        return ""
    # if the verb has children, go through them looking for "prep" and "dobj"
    elif len(list(verb.children)) > 0:
        # get the prep
        prep_children = [ch for ch in list(verb.children) if ch.dep_ == "prep"]
        # join them for tracery
        prep_text = " ".join([ch.text+" #np#" for ch in prep_children])
        # get the dobj
        dobj_children = [ch for ch in list(verb.children) if ch.dep_ == "dobj"]
        # join them as a string
        dobj_text = ""
        for ch in dobj_children:
            dobj_text += " ".join([word.text for word in ch.subtree])
        # get the noun_chunks from the sentence and replace them with
        # a tracery placeholder in the dobj_text
        chunks = s.noun_chunks
        for ch in chunks:
            dobj_text = dobj_text.replace(ch.text, "#np#")
        # return the beautiful phrase
        return verb.text + " " + dobj_text + " " + prep_text
    # else, just return the verb + a tracery placeholder
    else:
        return verb.text+" #np#"

In [10]:
# scrape the full recipes (one URL per line, blank lines skipped)
with open('./recipe_sources.txt') as f:
    drinks_sources = [line.strip() for line in f if line.strip()]
drinks_scraped = [scrape_me(item) for item in drinks_sources]

In [11]:
# extract the instructions from the recipes
drinks_instructions = [drink.instructions() for drink in drinks_scraped]

In [12]:
# use nlp to separate the sentences and have all the data we need
nlp_instructions = [list(nlp(inst).sents) for inst in drinks_instructions]

In [13]:
# get the finishing touches: the last sentence of each recipe
instr_finish = [instr[-1].text.strip() for instr in nlp_instructions]

In [14]:
# separate the instruction bodies (every sentence except the last of each recipe)
instr_nlp_body = [sent for instr in nlp_instructions for sent in instr[:-1]]

In [15]:
# and create the instructions ready for tracery
instr_body = [verb_tracer(instr).strip() for instr in instr_nlp_body]
remove_all('', instr_body)

## dialog loader

In [16]:
# turn everything into a function!
def dialog_loader(file_name):
    # load the dialog and use nlp to get the sentences
    with open(file_name) as f:
        dialog = [line.strip() for line in f]
    dialog_sents = [sent.text.strip() for sent in nlp(' '.join(dialog)).sents]
    # get the questions and answers
    q_a_lines = [[q_line, a_line] for q_line,a_line in zip(dialog_sents, dialog_sents[1:]) if '?' in q_line]
    # return these pairs!
    return q_a_lines
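
The pairing step can be tried without spaCy. This is a minimal sketch: `pair_questions` is a hypothetical name for the comprehension used in `dialog_loader`, and `sample` is a made-up list standing in for the sentence split.

```python
def pair_questions(sentences):
    # pair every sentence containing '?' with the sentence right after it
    return [[q, a] for q, a in zip(sentences, sentences[1:]) if '?' in q]

# made-up lines standing in for spaCy's sentence split
sample = [
    "Are you awake?",
    "Barely.",
    "Drink this.",
    "What is it?",
    "Why ask?",
    "Just drink.",
]

pairs = pair_questions(sample)
for question, answer in pairs:
    print(question, "->", answer)
```

Two consequences of this approach are visible in the toy data: when two questions follow each other, the second question is stored as the "answer" to the first, and a question on the very last line is dropped because `zip` has nothing to pair it with.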

## final functions

These functions put everything together; the cells below are the only places where you need to enter inputs.

## INPUT !!

In [35]:
# filename input
filename = "./sw7_dialog.txt"
# load dialog and create the lists for q&a + answer noun_chunks > the ingredients (global)
dialog = dialog_loader(filename)
dialog_q = [pair[0] for pair in dialog]
dialog_a = [pair[1] for pair in dialog]
verbs_q  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_q) ) if vb.tag_ == "VB"] ))
verbs_a  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_a) ) if vb.tag_ == "VB"] ))
ingr_tokens = list(set( [ch.text.lower() for ch in nlp( ' '.join(dialog_a) ).noun_chunks] ))

## EXECUTE !!

In [45]:
q_develop = ["Do you feel like #qvb#ing yourself?",
             "Obsessed with #qvb#ing.",
             "Oppressed by #qvb#ing.",
             "Does it get better?",
             "Will it get better?",
             "Get up and #qvb#ing.",
             "Sleep or stay awake.",
             "Waves of me #qvb#ing feelings",
             "Sometimes the times are pressing, the constraints on top of your head",
             "The days and weeks are neverending."]

a_develop = [" for 10 seconds.",
             " to taste.",
             ". Consider your actions during the day.",
             ". What you did yesterday was fine.",
             " until your confidence resurfaces.",
             ". Strive for your dreams.",
             " and hear your feelings.",
             " and sit down for 5 second, your body needs rest.",
             ". You are a great person.",
             " and fuck the opinion of critics."]

In [46]:
def develop_questions(q):
    first_bit = q +" "
    # q and the templates are full sentences, so compare sentence vectors
    verb_closests = spacy_closest_sent(q_develop, sentvec(q), 10)
    for i in range(rng.randrange(5,9)):
        first_bit += rng.choice(verb_closests)+" "
        first_bit = first_bit.replace("#qvb#", rng.choice(verbs_q))
    print(first_bit)

In [51]:
def develop_answers(ingred):
    aa = ''
    for ingr in ingred:
        amnt = rng.randrange(1, 4)
        unit = rng.choice(ingr_units)
        aa += str(amnt) + ' ' + unit + ' of ' + ingr
        aa += rng.choice(a_develop) + '\n'
    print(aa)

In [59]:
def develop_instructions(ingred):
    # select the instructions
    instructions = ''
    for i in range(rng.randrange(3, 5)):
        # select a random instruction
        instructions += rng.choice(instr_body) + rng.choice(a_develop) + '\n'
    # replace the placeholders with ingredients in the order of the list, 
    # overflowing if the index goes out of bounds
    ingr_index = 0
    while "#np#" in instructions:
        instructions = instructions.replace("#np#", ingred[ingr_index], 1)
        ingr_index = (ingr_index+1)%len(ingred)
    # add the finishing instruction
    instructions += rng.choice( instr_finish )
    print(instructions)
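
The placeholder loop in `develop_instructions` can be isolated as a small function. This is a sketch under the assumption that the fillers themselves never contain the placeholder; `fill_placeholders` and the sample template are hypothetical names and data.

```python
def fill_placeholders(text, fillers, placeholder="#np#"):
    # replace one placeholder at a time, cycling through the fillers in order
    index = 0
    while placeholder in text:
        text = text.replace(placeholder, fillers[index], 1)
        index = (index + 1) % len(fillers)
    return text

template = "Pour #np# into #np#.\nShake #np# with #np#."
filled = fill_placeholders(template, ["memories", "glee", "death"])
print(filled)
# Pour memories into glee.
# Shake death with memories.
```

Because the index wraps around with the modulo, a template with more placeholders than ingredients reuses the first ingredients. An empty `fillers` list would raise an `IndexError`, and a filler containing `#np#` would make the loop run forever.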

In [62]:
def get_drink_new(q_target, a_target):
    # == DEFINE FROM DIALOG SOURCE
    # define a question according to the target
    all_questions = spacy_closest_sent( dialog_q, vec(q_target), 5 )
    question = rng.choice( all_questions )
    develop_questions(question)
    # get the target noun_chunk to "solve" and create the solution vector (question to target) 
    question_chunks = [ch.text for ch in nlp(question).noun_chunks]
    selected_chunk = rng.choice(question_chunks)
    solution_vector = subtractv(vec(q_target), sentvec(selected_chunk))
    # get the answer noun_chunks > the ingredients
    a_vector = vec(a_target)
    all_ingredients = spacy_closest(ingr_tokens, addv(solution_vector, a_vector), 15)
    # == INGREDIENTS
    # get closest 2 ingredients + 2 random ones from the rest of the shortlist
    ingredients = all_ingredients[0:2]
    ingredients.extend(rng.sample(all_ingredients[2:], 2))
    # and write the list with the amounts
    print()
    for ingr in ingredients:
        amnt = rng.randrange(1, 4)
        unit = rng.choice(ingr_units)
        print( str(amnt) + ' ' + unit + ' of ' + ingr )
    # add random extra ingredient
    print( rng.choice(ingr_extra) )
    print('')
    # == INSTRUCTIONS
    # select the instructions
    develop_instructions(ingredients)
    print("\n==========\n")
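
The vector arithmetic behind the ingredient search (question target minus a chunk of the question, plus the answer target) can be illustrated with toy numbers. This is only a sketch: every vector below is a made-up 2-d value rather than a spaCy embedding, and NumPy arrays replace `addv` and `subtractv` for brevity.

```python
import numpy as np
from numpy.linalg import norm

def cosine(v1, v2):
    # cosine similarity with the same zero-vector guard as the notebook
    if norm(v1) > 0 and norm(v2) > 0:
        return float(np.dot(v1, v2) / (norm(v1) * norm(v2)))
    return 0.0

# hypothetical vectors for the inputs
q_target = np.array([1.0, 0.0])   # e.g. "anxiety"
chunk = np.array([0.5, 0.5])      # a noun chunk picked from the question
a_target = np.array([0.0, 1.0])   # e.g. "vacations"

# direction from the chunk to the question target, then shifted by the answer target
solution_vector = q_target - chunk
query = solution_vector + a_target

# hypothetical candidate ingredients taken from the dialog answers
candidates = {
    "transportation": np.array([0.6, 0.5]),
    "rumors": np.array([1.0, -1.0]),
    "us": np.array([-1.0, 0.2]),
}

ranked = sorted(candidates, key=lambda w: cosine(query, candidates[w]), reverse=True)
print(ranked)  # ['transportation', 'rumors', 'us']
```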

In [65]:
question_target = "anxiety"
answer_target = "vacations"
get_drink_new(question_target, answer_target)

Have you felt it? Oppressed by leaveing. The days and weeks are neverending. Get up and trying. Sleep or stay awake. Oppressed by hearing. 

3 teaspoons of transportation
3 teaspoons of sanitation
2 oz. of rumors
2 fluid oz. of us
pineapple wedges

shake transportation until your confidence resurfaces.
Fill sanitation with rumors and fuck the opinion of critics.
pour  into us and fuck the opinion of critics.
release transportation. Strive for your dreams.
Fill the glasses with club soda, stir, and garnish with additional lime wedges.




In [76]:
# filename input
filename = "./sw3_dialog.txt"
# load dialog and create the lists for q&a + answer noun_chunks > the ingredients (global)
dialog = dialog_loader(filename)
dialog_q = [pair[0] for pair in dialog]
dialog_a = [pair[1] for pair in dialog]
verbs_q  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_q) ) if vb.tag_ == "VB"] ))
verbs_a  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_a) ) if vb.tag_ == "VB"] ))
ingr_tokens = list(set( [ch.text.lower() for ch in nlp( ' '.join(dialog_a) ).noun_chunks] ))

question_target = "melancholy"
answer_target = "summer"
get_drink_new(question_target, answer_target)

Did you ever hear the tragedy of Darth Plagueis "the wise"? Obsessed with accomplishing. The days and weeks are neverending. The days and weeks are neverending. Do you feel like destroying yourself? Waves of me takeing feelings 

1 fluid oz. of she
1 fluid oz. of me
2 fluid oz. of nothing
2 teaspoons of everything
1 orange, sliced

serve  over she. You are a great person.
Pour  in me and sit down for 5 second, your body needs rest.
Pour nothing , everything and she into me and fuck the opinion of critics.
Garnish with a lime twist.




In [80]:
# filename input
filename = "./hp5_dialog.txt"
# load dialog and create the lists for q&a + answer noun_chunks > the ingredients (global)
dialog = dialog_loader(filename)
dialog_q = [pair[0] for pair in dialog]
dialog_a = [pair[1] for pair in dialog]
verbs_q  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_q) ) if vb.tag_ == "VB"] ))
verbs_a  =    list(set( [vb.text.lower() for vb in nlp( ' '.join(dialog_a) ) if vb.tag_ == "VB"] ))
ingr_tokens = list(set( [ch.text.lower() for ch in nlp( ' '.join(dialog_a) ).noun_chunks] ))

question_target = "death"
answer_target = "memories"
get_drink_new(question_target, answer_target)

"What's worse than death?" Sometimes the times are pressing, the constraints on top of your head The days and weeks are neverending. Oppressed by disarming. Waves of me understanding feelings Get up and hateing. Sleep or stay awake. 

2 teaspoons of memories
3 tablespoons of death
1 tablespoons of feelings
3 fluid oz. of glee
fresh mint leaves

Rub memories around death. You are a great person.
strain  into feelings. Strive for your dreams.
shake glee. What you did yesterday was fine.
Fill memories for 10 seconds.
Pour into glasses and serve.


