# Detecting Rhetorical Devices Using NLTK
Now that we have covered some rhetorical devices, let's review each one and write code to detect it with NLTK.

In [20]:
import nltk
import string
from nltk.corpus import stopwords

In [21]:
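#utilities
#build the stopword set once; set membership is faster than rebuilding a list per call
STOPWORDS = set(stopwords.words('english'))

def is_stopword(token):
    return token in STOPWORDS

def is_punctuation(token):
    #substring test: single punctuation characters match, tokens such as '--' or '...' do not
    return token in string.punctuation

def is_vowel(char):
    #'y' is treated as a vowel here
    return char in ('a','e','i','o','u','y')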

## Rhetorical Question

### What is it?
* A question asked for some purpose other than to acquire information
* “Why oh why me?”
* “Who do you think you are?!”

### How can we find it?
* Our texts are monologues
* In a monologue, no question is expected to be answered
* Look for sentences ending in “?”


In [5]:
#http://www.americanrhetoric.com/speeches/hueyplongking.htm
huey = '''Now, what did they mean by that? Did they mean, my friends, to say that all men were created equal and that that meant that any one man was born to inherit $10,000,000,000 and that another child was to be born to inherit nothing?

Did that mean, my friends, that someone would come into this world without having had an opportunity, of course, to have hit one lick of work, should be born with more than it and all of its children and children's children could ever dispose of, but that another one would have to be born into a life of starvation?

That was not the meaning of the Declaration of Independence when it said that all men are created equal of "That we hold that all men are created equal."'''

huey_sents = nltk.sent_tokenize(huey.lower())
print(huey_sents)

['now, what did they mean by that?', 'did they mean, my friends, to say that all men were created equal and that that meant that any one man was born to inherit $10,000,000,000 and that another child was to be born to inherit nothing?', "did that mean, my friends, that someone would come into this world without having had an opportunity, of course, to have hit one lick of work, should be born with more than it and all of its children and children's children could ever dispose of, but that another one would have to be born into a life of starvation?", 'that was not the meaning of the declaration of independence when it said that all men are created equal of "that we hold that all men are created equal."']


In [4]:
def is_rhetorical(sent):
    #in a monologue, treat every question as rhetorical
    return sent.endswith('?')

for sent in huey_sents:
    print(is_rhetorical(sent))

True
True
True
False


## Epistrophe
### What is it?
* Ending successive phrases in a similar manner
* “What lies behind us and what lies before us are tiny compared to what lies within us.” -- Emerson

### How can we find it?
* A single repeated word is not enough to count as epistrophe
* We need to look at the sentence structure
* If the repeated word is surrounded by the same parts of speech, consider it epistrophe
* We should look both within a sentence and between successive sentences


In [34]:
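#Identify words,ngrams that occur multiple times
#Get context of word
#POS tag context
from collections import Counter

def has_epitrophe(sent):
    epitrophe_instances = []

    tokens = nltk.word_tokenize(sent.lower())
    word_dist = Counter(tokens)
    repeated_words = [word for word, count in word_dist.items() if count > 1]

    #nltk.ngrams returns a one-shot generator, so store the trigrams in a list
    #to be able to scan them again for every repeated word
    trigrams = list(nltk.ngrams(tokens, 3))
    for word in repeated_words:
        #anchor holds the POS-tagged first trigram centered on this word
        anchor = None
        for trigram in trigrams:
            if trigram[1] == word:
                tags = nltk.pos_tag(trigram)
                if anchor:
                    #same part of speech before the repeated word: record the pair
                    if tags[0][1] == anchor[0][1]:
                        epitrophe_instances.append((tuple(w for w, pos in anchor), trigram))
                else:
                    anchor = tags
    #return only after every repeated word has been checked
    return epitrophe_instances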
        
emerson = '''What lies behind us and what lies before us are tiny compared to what lies within us.'''
emerson_sents = nltk.sent_tokenize(emerson.lower())

for sent in emerson_sents:
    print(has_epitrophe(sent))

[(('what', 'lies', 'behind'), ('what', 'lies', 'before')), (('what', 'lies', 'behind'), ('what', 'lies', 'within'))]
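
The trigram approach compares the context around a repeated word. A complementary check, sketched here under simplifying assumptions, looks directly at how successive clauses end, which also covers the "between successive sentences" case. The helper name `repeated_clause_endings`, the regex-based clause splitting, and the choice to split on "and"/"but" are all illustrative assumptions rather than anything NLTK provides.

```python
import re
from collections import Counter

def repeated_clause_endings(text, min_repeats=2):
    """Return {ending_word: count} for words that end at least min_repeats clauses."""
    # Crude clause split (an assumption): punctuation, or the words "and"/"but"
    clauses = re.split(r"[.,;:!?]+|\band\b|\bbut\b", text.lower())
    endings = []
    for clause in clauses:
        words = re.findall(r"[a-z']+", clause)
        if words:
            endings.append(words[-1])  # last word of this clause
    counts = Counter(endings)
    return {word: n for word, n in counts.items() if n >= min_repeats}

emerson = "What lies behind us and what lies before us are tiny compared to what lies within us."
print(repeated_clause_endings(emerson))
```

Because the function takes plain text, it can be given a whole paragraph as easily as a single sentence. Swapping the regex for `nltk.sent_tokenize` and `nltk.word_tokenize` would be a natural refinement.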


## Alliteration
### What is it?
* The occurrence of the same letter or sound at the beginning of adjacent or closely connected words
* "They are part of the finest fighting force that the world has ever known. They have served tour after tour of duty in distant, different, and difficult places..."  —Obama

### How can we find it?
* Look for successive words with similar sounds
    * Same two first letters
    * Same first vowel
    * Same first consonant followed by a vowel
* Remove stop words
* Look for successive words and words one apart


In [7]:
obama = '''"This generation of soldiers, sailors, airmen, Marines, and Coast Guardsmen have volunteered in the time of certain danger. They are part of the finest fighting force that the world has ever known. They have served tour after tour of duty in distant, different, and difficult places...They are men and women -- white, black, and brown -- of all faiths and all stations -- all Americans, serving together to protect our people, while giving others half a world away the chance to lead a better life....In today’s wars, there's not always a simple ceremony that signals our troops’ success -- no surrender papers to be signed, or capital to be claimed...."'''

obama_sents = nltk.sent_tokenize(obama.lower())
print(obama_sents)

['"this generation of soldiers, sailors, airmen, marines, and coast guardsmen have volunteered in the time of certain danger.', 'they are part of the finest fighting force that the world has ever known.', 'they have served tour after tour of duty in distant, different, and difficult places...they are men and women -- white, black, and brown -- of all faiths and all stations -- all americans, serving together to protect our people, while giving others half a world away the chance to lead a better life....in today’s wars, there\'s not always a simple ceremony that signals our troops’ success -- no surrender papers to be signed, or capital to be claimed...."']


In [24]:


def has_alliteration(word1, word2):
    gram00 = word1[0]
    gram10 = word2[0]

    if gram00 == gram10:
        #same first vowel
        if is_vowel(gram00):
            return True
        if len(word1) > 1 and len(word2) > 1:
            gram01 = word1[1]
            gram11 = word2[1]
            #same first consonant followed by a vowel
            if is_vowel(gram01) and is_vowel(gram11):
                return True
            #same two first letters
            if gram01 == gram11:
                return True
    return False

def count_alliteration(sent):
    allit_instances = []
    tokens = nltk.word_tokenize(sent.lower())
    #ignore punctuation and stopwords
    tokens = [token for token in tokens if not(is_punctuation(token) or is_stopword(token))]
    
    bigrams = nltk.ngrams(tokens,2)
    for one,two in bigrams:
        if has_alliteration(one,two):
            allit_instances.append((one,two))
    trigrams = nltk.ngrams(tokens,3)
    for one,two,three in trigrams:
        #skip triples whose first two words were already counted as a bigram
        if has_alliteration(one,three) and not has_alliteration(one,two):
            allit_instances.append((one,two,three))
    print(allit_instances)
    return len(allit_instances)

for sent in obama_sents:
    print(sent)
    print(count_alliteration(sent))

"this generation of soldiers, sailors, airmen, marines, and coast guardsmen have volunteered in the time of certain danger.
[('soldiers', 'sailors')]
1
they are part of the finest fighting force that the world has ever known.
[('finest', 'fighting'), ('fighting', 'force')]
2
they have served tour after tour of duty in distant, different, and difficult places...they are men and women -- white, black, and brown -- of all faiths and all stations -- all americans, serving together to protect our people, while giving others half a world away the chance to lead a better life....in today’s wars, there's not always a simple ceremony that signals our troops’ success -- no surrender papers to be signed, or capital to be claimed...."
[('tour', 'tour'), ('duty', 'distant'), ('distant', 'different'), ('different', 'difficult'), ('lead', 'better', 'life'), ('simple', 'ceremony', 'signals'), ('signals', 'troops’', 'success'), ('success', '--', 'surrender'), ('surrender', 'papers', 'signed')]
9
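
Note the `('success', '--', 'surrender')` entry in the last output: `is_punctuation` tests `token in string.punctuation`, which is a substring check, so the two-character token `--` is not filtered out. A stricter variant, sketched here under the hypothetical name `is_all_punctuation`, treats a token as punctuation when every one of its characters is a punctuation mark. Adding the curly quotes and ellipsis to the character set is an assumption made for this particular text.

```python
import string

# Assumption: also treat typographic quotes and the ellipsis character as punctuation
PUNCT_CHARS = set(string.punctuation) | set("“”‘’…")

def is_all_punctuation(token):
    """True if token is non-empty and consists only of punctuation characters."""
    return len(token) > 0 and all(ch in PUNCT_CHARS for ch in token)

for token in ["--", "...", ",", "troops’", "well-known"]:
    print(repr(token), is_all_punctuation(token))
```

Swapping this check into `count_alliteration` would make `success` and `surrender` adjacent tokens, so the recorded output above would differ.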


## Epanalepsis
### What is it?
* Repeating the same word or phrase at the beginning and end of a clause or sentence
* “Believe not all you can hear, tell not all you believe.” -- Native American proverb
### How can we find it?
*You Try*


In [None]:
native = '''Believe not all you can hear, tell not all you believe.'''

def has_epanalepsis(sent):
    #your code here
    pass

print(has_epanalepsis(native.lower()))
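
If you get stuck, one possible starting point is to compare the first and last word of the sentence. This is only a sketch: the helper name `first_last_match` is hypothetical, tokenization is a plain regex rather than `nltk.word_tokenize`, and it handles single words only, not multi-word phrases.

```python
import re

def first_last_match(sent):
    """Return the repeated word if the sentence starts and ends with it, else None."""
    words = re.findall(r"[a-z']+", sent.lower())
    if len(words) > 1 and words[0] == words[-1]:
        return words[0]
    return None

print(first_last_match("Believe not all you can hear, tell not all you believe."))
print(first_last_match("Tell not all you believe."))
```

Extending the idea to phrases would mean comparing the first and last n words for several values of n, and possibly running the check per clause instead of per sentence.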

## Polyptoton
### What is it?
* Repeating a word in a sentence, but in a different form derived from the same root (for example, strong/strength)
    * “The Greeks are strong, and skillful to their strength, fierce to their skill, and to their fierceness valiant” -- Shakespeare

    
### How can we find it?
*You Try*


In [None]:
shake = '''The Greeks are strong, and skillful to their strength, fierce to their skill, and to their fierceness valiant.'''

def has_polyptoton(sent):
    #your code here
    pass

print(has_polyptoton(shake.lower()))
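
As a hint, one crude way to spot related word forms is to look for distinct words that share a long common prefix. The sketch below assumes a hypothetical helper `polyptoton_candidates` and a prefix threshold of four letters. A stemmer or lemmatizer would be a more principled choice.

```python
import re
from itertools import combinations

def shared_prefix_len(a, b):
    """Length of the common leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def polyptoton_candidates(sent, min_prefix=4):
    """Return pairs of distinct words sharing at least min_prefix leading letters."""
    words = re.findall(r"[a-z]+", sent.lower())
    unique = list(dict.fromkeys(words))  # de-duplicate, keep first-seen order
    return [(a, b) for a, b in combinations(unique, 2)
            if shared_prefix_len(a, b) >= min_prefix]

shake = "The Greeks are strong, and skillful to their strength, fierce to their skill, and to their fierceness valiant."
print(polyptoton_candidates(shake))
# [('skillful', 'skill'), ('fierce', 'fierceness')]
```

The threshold is a trade-off: `min_prefix=3` would also pair `strong` with `strength`, but it brings in false positives such as `the`/`their`.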

## Chiasmus
### What is it?
* Repeating words or grammatical structures in reverse order in a sentence/phrase (A B, then B A)
    * "Ask not what your country can do for you; ask what you can do for your country." -- John F. Kennedy

### How can we find it?
*You Try*
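
As a starting point, one rough approach is to split the sentence into clauses and look for word pairs whose order flips between the first clause and the second. Everything in this sketch is an assumption made for illustration: the helper names, the regex clause split, and the small hand-picked `FUNCTION_WORDS` set. NLTK's English stopword list is not used because it contains "you" and "your", which carry the reversal in this example.

```python
import re
from itertools import combinations

# Assumption: a tiny hand-picked function-word list, not NLTK's stopwords
FUNCTION_WORDS = {"not", "what", "can", "do", "for", "the", "a", "an", "to", "of"}

def content_words(clause):
    """Lowercased words of a clause with the function words removed."""
    words = re.findall(r"[a-z']+", clause.lower())
    return [w for w in words if w not in FUNCTION_WORDS]

def reversed_pairs(sent):
    """Return word pairs whose order in the first clause is flipped in the second."""
    clauses = [c for c in re.split(r"[;,.!?]+", sent) if c.strip()]
    if len(clauses) < 2:
        return []
    first, second = content_words(clauses[0]), content_words(clauses[1])
    # Words present in both clauses, in order of first appearance in the first clause
    shared = [w for w in dict.fromkeys(first) if w in second]
    pairs = []
    for a, b in combinations(shared, 2):
        # a precedes b in the first clause; keep the pair if b precedes a in the second
        if second.index(b) < second.index(a):
            pairs.append((a, b))
    return pairs

jfk = "Ask not what your country can do for you; ask what you can do for your country."
print(reversed_pairs(jfk))
# [('your', 'you'), ('country', 'you')]
```

Comparing part-of-speech tags from `nltk.pos_tag` instead of the words themselves would extend this to reversed grammatical structures that do not reuse the same vocabulary.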
