# Generate Verbal Metaphors for a Given Sentence
An attempt to improve upon this paper:
https://arxiv.org/pdf/2002.12854.pdf

## Set-up

In [None]:
%%capture
!pip install transformers
!pip install nltk
!pip install spacy
!python -m spacy download en_core_web_sm

In [None]:
import nltk
import spacy
from transformers import pipeline

In [None]:
%%capture
nltk.download('wordnet')
nlp = spacy.load("en_core_web_sm")

In [None]:
from nltk.corpus import wordnet as wn

In [None]:
%%capture
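# distilroberta-base is the default fill-mask checkpoint reported in the warning below;
# recent transformers releases spell the argument top_k (older ones used topk)
classifier = pipeline('fill-mask', model='distilroberta-base', top_k=300)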

Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at distilroberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
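
The fill-mask pipeline returns a list of dicts, one per candidate, each with a `token_str` and a `score`. RoBERTa-style byte-level BPE tokenizers mark a token that starts a new word with a leading `Ġ`, which is what the `model_new_word_token` argument in the function below relies on. Here is a minimal sketch of that filtering step. The helper name `whole_word_candidates` and the hand-written results (including their scores) are made up for illustration so the snippet runs without the model. Depending on the transformers version, `token_str` may instead come back decoded with a leading space, in which case the marker would be `' '`.

```python
def whole_word_candidates(results, marker='Ġ'):
    """Keep only predictions that start a new word, with the marker stripped."""
    return [r['token_str'][len(marker):]
            for r in results
            if r['token_str'].startswith(marker)]

# Hypothetical results shaped like fill-mask output (scores are invented).
fake_results = [
    {'token_str': 'Ġsprint', 'score': 0.21},
    {'token_str': 'ning', 'score': 0.05},   # word-internal piece, dropped
    {'token_str': 'Ġfly', 'score': 0.02},
]
print(whole_word_candidates(fake_results))
```

Word-internal pieces are dropped because they cannot stand in for a whole verb.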


## A Function to Find Verbal Metaphors for a Given Sentence
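The function masks the root verb of the sentence, asks the masked language model for candidate replacements, and keeps the candidates that WordNet lists as hyponyms (up to two levels down) of the original verb.

It's pretty messy right now. I'll clean it up when I get a chance outside of Colab. No time today.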

In [None]:
def find_verbal_metaphors(sentence, model_new_word_token='Ġ', n=5):
    
    # find root verb if it exists
    root_verbs = [token.text for token in nlp(sentence) if token.dep_ == 'ROOT' and token.pos_ == 'VERB']
    if not root_verbs:
        print('No matching root verbs found.')
        return None
    root_verb = root_verbs[0]

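    # have the masked language model fill in the root verb (TODO: what if multiple root verbs?)
    # mask only the first occurrence, matching the final replace() at the end of the function
    results = classifier(sentence.replace(root_verb, '<mask>', 1))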
    tokens = [result['token_str'][len(model_new_word_token):] for result in results if result['token_str'].startswith(model_new_word_token)]

    # find all possible synsets for each results token that might fill the mask
    possibilities = []
    synset_to_form = {}
    for token in tokens:
        for token_ss in wn.synsets(token):
            if token_ss.pos() == 'v':
                possibilities.append(token_ss)
                synset_to_form[token_ss] = token  # duplicates possible here! TODO: fix!
    possible_ss_indices = {v: k for k, v in enumerate(possibilities)}

    # find most likely metaphors and return sentence with sorted choices
    metaphor_indices = {}
    root_verb_synsets = [ss for ss in wn.synsets(root_verb) if ss.pos() == 'v']
    c = 0
    for ss in root_verb_synsets:
        for hn_ss in ss.hyponyms():
            if hn_ss in possible_ss_indices:
                metaphor_indices[synset_to_form[hn_ss]] = c
                c += 1
            # descend one more level: hyponyms of this hyponym
            for subhn_ss in hn_ss.hyponyms():
                if subhn_ss in possible_ss_indices:
                    metaphor_indices[synset_to_form[subhn_ss]] = c
                    c += 1
    metaphors = [m[0] for m in sorted(metaphor_indices.items(), key=lambda x: x[1])[:n]]
    return sentence.replace(root_verb, '[' + '/'.join(metaphors) + ']', 1)
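
One way to address the duplicates TODO above is to keep every surface form per synset instead of overwriting the previous one. The sketch below is only an illustration, not the notebook's method: `build_synset_to_forms` and the toy lookup are hypothetical, and the synsets are plain strings so the snippet runs without WordNet. In the function above, the lookup could plausibly be `lambda t: wn.synsets(t, pos=wn.VERB)`.

```python
from collections import defaultdict

def build_synset_to_forms(tokens, verb_synsets_of):
    """Map each synset to every surface form that evokes it, in token order."""
    synset_to_forms = defaultdict(list)
    for token in tokens:
        for synset in verb_synsets_of(token):
            if token not in synset_to_forms[synset]:
                synset_to_forms[synset].append(token)
    return dict(synset_to_forms)

# Toy lookup standing in for WordNet (hypothetical data).
toy_lookup = {
    'run': ['run.v.01'],
    'ran': ['run.v.01'],       # inflected form resolving to the same synset
    'sprint': ['sprint.v.01'],
}
forms = build_synset_to_forms(['run', 'ran', 'sprint'],
                              lambda t: toy_lookup.get(t, []))
print(forms)
```

Because the tokens arrive in the model's ranking order, the first form in each list is the one the model preferred, so picking `forms[synset][0]` would keep the current behaviour deterministic.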

## Test the Function: Generate Some Verbal Metaphors

In [None]:
sentences = ['Deer run through the forest with wreckless abandon.',
             'Human beings create meaning with language.',
             'The economy fails when pandemics hit.']
for sentence in sentences:
    print('=' * 80)
    metaphors = find_verbal_metaphors(sentence)
    print(sentence, '\n  WITH METAPHORS =>', metaphors)

Deer run through the forest with wreckless abandon. 
  WITH METAPHORS => Deer [rushed/sprint/streaks/fly/breaks] through the forest with wreckless abandon.
Human beings create meaning with language. 
  WITH METAPHORS => Human beings [compound/confused/regulate/evoke/trace] meaning with language.
The economy fails when pandemics hit. 
  WITH METAPHORS => The economy [choked/falls/blew/crashing] when pandemics hit.
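
In these test sentences the root verb appears only once, so `str.replace` hits the right word. If the same string also occurs earlier in the sentence (as a noun, say), replacing by text would mask the wrong word. A minimal sketch of replacing by character offset instead: `replace_span` is a hypothetical helper, and the offset is computed with `rindex` here only to keep the snippet self-contained. In the notebook it would presumably come from spaCy's `token.idx`.

```python
def replace_span(sentence, start, length, replacement):
    """Replace `length` characters of `sentence` beginning at offset `start`."""
    return sentence[:start] + replacement + sentence[start + length:]

sentence = 'After a long run, deer run home.'
verb = 'run'
# Assumption: the parser reports the root verb as the second "run".
start = sentence.rindex(verb)

print(sentence.replace(verb, '<mask>', 1))               # masks the noun
print(replace_span(sentence, start, len(verb), '<mask>'))  # masks the verb
```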
