## Décimas in English

Now that we have learned how to generate rhymes electronically, I wanted to try writing "Décimas".

> The decima in all Latin America and in Spain is a style of poetry that is octosyllabic and has 10 lines to the stanza. The rhyming scheme is ABBAACCDDC. It is spoken, sung and written throughout Latin America with variations in different countries. It is often improvised.
    [(source)](https://en.wikipedia.org/wiki/D%C3%A9cima)

So I decided to write code that produces 10-line stanzas of 8 syllables each and follows the rhyme pattern.

In [2]:
import pronouncing as pr

In [3]:
import random

In [4]:
from collections import Counter

In [5]:
import spacy
import en_core_web_md

In [230]:
nlp = spacy.load('en_core_web_md')

### Input text
Since this poetry style is more common among Latin American writers, I chose the introduction of one of my favourite books, "Open Veins of Latin America" by Uruguayan writer Eduardo Galeano.
>In the book, Galeano analyzes the history of the Americas as a whole, from the time period of the European settlement of the New World to contemporary Latin America, describing the effects of European and later United States economic exploitation and political dominance over the region.
    [(source)](https://en.wikipedia.org/wiki/Open_Veins_of_Latin_America)

To start, I gathered the introduction of the book and ran spaCy to extract verbs, nouns, pronouns and adjectives.

In [7]:
veins = open("veins.txt").read()

In [8]:
doc = nlp(veins)

In [9]:
nouns = [item.text for item in doc if item.tag_ == 'NN']

In [10]:
nouns

['division',
 'labor',
 'part',
 'world',
 'today',
 'ocean',
 'role',
 'era',
 'fact',
 'fable',
 'imagination',
 'conquest',
 'gold',
 'silver',
 'region',
 'menial',
 'service',
 'source',
 'reserve',
 'oil',
 'iron',
 'copper',
 'meat',
 'fruit',
 'coffee',
 'coordinator',
 'concept',
 'era',
 'trade',
 'freedom',
 'business',
 'business',
 'inquisitor',
 'hangman',
 'profit',
 'condition',
 'way',
 'right',
 'history',
 'century',
 'pilgrims',
 'coast',
 'world',
 'today',
 'region',
 'sub',
 'class',
 'identity',
 'region',
 'Everything',
 'discovery',
 'capital',
 'power',
 'Everything',
 'soil',
 'mineral',
 'capacity',
 'Production',
 'class',
 'structure',
 'area',
 'gearbox',
 'capitalism',
 'area',
 'function',
 'benefit',
 'metropolis',
 'moment',
 'chain',
 'dependency',
 'chain',
 'oppression',
 'country',
 'exploitation',
 'food',
 'labor',
 'today',
 'history',
 'competition',
 'backwardness',
 'poverty',
 'result',
 'failure',
 'losing',
 'history',
 'underdevelopment

In [11]:
verbs = [item.text for item in doc if item.tag_ == 'VBD']

In [12]:
verbs

['was',
 'ventured',
 'buried',
 'passed',
 'perfected',
 'surpassed',
 'was',
 'said',
 'observed',
 'was',
 'said',
 'was',
 'appeared',
 'existed',
 'lost',
 'won',
 'was',
 'became',
 'had',
 'was',
 'enjoyed',
 'told',
 'did',
 'did',
 'confirmed',
 'was',
 'had',
 'wrote',
 'had',
 'had',
 'was',
 'prophesied',
 'continued',
 'considered',
 'said',
 'buried',
 'lay',
 'forbidden',
 'triumphed',
 'had',
 'was',
 'was',
 'investigated',
 'found',
 'called',
 'meant',
 'opened']

In [13]:
prnouns = [item.text for item in doc if item.tag_ == 'PRP']

In [14]:
prnouns

['it',
 'We',
 'It',
 'them',
 'them',
 'we',
 'they',
 'You',
 'You',
 'They',
 'He',
 'he',
 'he',
 'we',
 'ourselves',
 'we',
 'it',
 'it',
 'We',
 'itself',
 'they',
 'they',
 'I',
 'it',
 'them',
 'it',
 'it',
 'it',
 'They',
 'it',
 'it',
 'it',
 'it',
 'it',
 'they',
 'he',
 'us',
 'it',
 'us',
 'they',
 'it',
 'they',
 'us',
 'it',
 'It',
 'us',
 'it',
 'it',
 'it',
 'it',
 'it',
 'we',
 'it',
 'us',
 'itself',
 'it',
 'it',
 'I',
 'themselves',
 'it',
 'he']

In [15]:
adjs = [item.text for item in doc if item.tag_ == 'JJ']

In [16]:
adjs

['precocious',
 'remote',
 'Indian',
 'raw',
 'rich',
 'fair',
 'medieval',
 'free',
 'dominating',
 'external',
 'foreign',
 'dominated',
 'internal',
 'foreign',
 'foreign',
 'confident',
 'foreign',
 'apt',
 'domestic',
 'new',
 'second',
 'nebulous',
 'open',
 'European',
 'such',
 'distant',
 'rich',
 'natural',
 'human',
 'universal',
 'foreign',
 'endless',
 'many',
 'small',
 'big',
 'internal',
 'Latin',
 'American',
 'integral',
 'implicit',
 'native',
 'desolate',
 'deep',
 'empty',
 'precious',
 'rich',
 'aware',
 'imperialist',
 'vast',
 'same',
 'dominating',
 'last',
 'rich',
 'poor',
 'twentieth',
 'imperialist',
 'whole',
 'necessary',
 'dramatic',
 'absolute',
 'relative',
 'own',
 'poor',
 'capitalist',
 'average',
 'Latin',
 'deceptive',
 'many',
 'poor',
 'rich',
 'few',
 'Latin',
 'social',
 'same',
 'other',
 'private',
 'sterile',
 'unproductive',
 'total',
 'imperialist',
 'profitable',
 'only',
 'international',
 'mortgaged',
 'other',
 'cynical',
 'social',
 

In [231]:
import tracery

### Making lists by syllable count
Although this first version does not yet produce 8-syllable lines, I made a first attempt at organizing the word lists by syllable count, so that I could select the number of syllables needed for each line.
For the output text, I choose a random word from the lists and assign it to one of the rhyme types (A, B, C and D). Then I search for all the words that rhyme with that first word. Finally, I select randomly from that rhyming list for the end of each output line.

In [18]:
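# prnouns_s[i] holds the pronouns with exactly i syllables;
# words missing from the pronouncing dictionary are skipped.
prnouns_s = []
for i in range(3):
    prnouns_s.append([word for word in prnouns
                      if pr.phones_for_word(word)
                      and pr.syllable_count(pr.phones_for_word(word)[0]) == i])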

In [19]:
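# verbs_s[i] holds the verbs with exactly i syllables;
# words missing from the pronouncing dictionary are skipped.
verbs_s = []
for i in range(6):
    verbs_s.append([word for word in verbs
                    if pr.phones_for_word(word)
                    and pr.syllable_count(pr.phones_for_word(word)[0]) == i])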

In [20]:
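# nouns_s[i] holds the nouns with exactly i syllables;
# words missing from the pronouncing dictionary are skipped.
nouns_s = []
for i in range(6):
    nouns_s.append([word for word in nouns
                    if pr.phones_for_word(word)
                    and pr.syllable_count(pr.phones_for_word(word)[0]) == i])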

In [21]:
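# adjs_s[i] holds the adjectives with exactly i syllables;
# words missing from the pronouncing dictionary are skipped.
adjs_s = []
for i in range(6):
    adjs_s.append([word for word in adjs
                   if pr.phones_for_word(word)
                   and pr.syllable_count(pr.phones_for_word(word)[0]) == i])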
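
The four cells above share one shape, so the grouping could be factored into a single helper. The following is only a sketch of that idea, not the notebook's code: `group_by_syllables`, `toy_syllable_count` and the `TOY_SYLLABLES` table are hypothetical names, and the hand-written table stands in for the CMU lookup so the example runs without `pronouncing`. In the notebook, the `count` argument could wrap `pr.syllable_count(pr.phones_for_word(word)[0])` instead.

```python
from collections import defaultdict

# Hypothetical stand-in for the CMU lookup: a tiny hand-counted table.
TOY_SYLLABLES = {"it": 1, "we": 1, "gold": 1, "labor": 2, "silver": 2,
                 "history": 3, "capital": 3, "imagination": 5}

def toy_syllable_count(word):
    """Return the syllable count, or None if the word is unknown."""
    return TOY_SYLLABLES.get(word.lower())

def group_by_syllables(words, count=toy_syllable_count):
    """Map syllable count -> list of words with that count (duplicates kept)."""
    groups = defaultdict(list)
    for word in words:
        n = count(word)
        if n is None:      # skip words the lookup does not know
            continue
        groups[n].append(word)
    return dict(groups)

groups = group_by_syllables(["gold", "labor", "history", "gold", "gearbox"])
print(groups)
# → {1: ['gold', 'gold'], 2: ['labor'], 3: ['history']}
```

A dictionary keyed by syllable count avoids having to guess the upper bound passed to `range(...)`, and unknown words are skipped rather than raising an error.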

In [22]:
import random

In [23]:
rh_a_first = random.choice(nouns)
print(rh_a_first)

generation


In [232]:
rh_a = []
for i in pr.rhymes(rh_a_first):
    for rh_a_second in nouns:
        if(rh_a_second == i):
            rh_a.append(i)
print(rh_a)

['constellation', 'delegation', 'exploitation', 'humiliation', 'imagination', 'industrialization', 'liberation', 'nation', 'ostentation', 'perpetuation', 'perpetuation', 'population', 'population', 'population', 'population', 'population', 'population', 'population', 'population']
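
The repeated entries in the list above appear because the nested loop appends a rhyme once for every time it occurs in `nouns`. A possible way to list each rhyme only once is sketched below. `rhymes_in_corpus` is a hypothetical helper, and `toy_rhymes` is a hand-written stand-in for `pr.rhymes` so that the sketch runs without the pronouncing dictionary.

```python
def rhymes_in_corpus(seed, corpus_words, rhymes_of):
    """Rhymes of `seed` that also occur in the corpus, each listed once."""
    corpus = set(corpus_words)          # set membership removes repeats
    return [w for w in rhymes_of(seed) if w in corpus and w != seed]

# Hypothetical stand-in for pr.rhymes (an assumption, not real CMU output).
def toy_rhymes(word):
    table = {"generation": ["constellation", "nation", "population", "station"]}
    return table.get(word, [])

nouns_sample = ["population", "nation", "population", "gold", "population"]
print(rhymes_in_corpus("generation", nouns_sample, toy_rhymes))
# → ['nation', 'population']
```

Keeping the duplicates is also a valid choice, since it makes frequent words in the source text more likely to be picked; removing them simply makes every rhyme equally likely.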


### Not enough rhyming words
For now, this last part of the code doesn't work very well: many words have no rhymes that are both the same kind of word and present in the input text (rather than only in the pronouncing dictionary).
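
One way around this, sketched here under assumptions, is to pick the first word only among words that already have enough rhyming partners in the text. In the ABBAACCDDC scheme, A and C each end three lines (a first word plus two partners), while B and D each end two. `seeds_with_rhymes` is a hypothetical helper and `TOY_RHYMES` is a hand-written stand-in for `pr.rhymes`.

```python
import random

def seeds_with_rhymes(words, rhymes_of, minimum=2):
    """Map each distinct word to its in-corpus rhymes,
    keeping only words with at least `minimum` of them."""
    corpus = set(words)
    usable = {}
    for word in sorted(corpus):
        partners = [r for r in rhymes_of(word) if r in corpus]
        if len(partners) >= minimum:
            usable[word] = partners
    return usable

# Hypothetical rhyme table (an assumption, not real CMU output).
TOY_RHYMES = {
    "condition": ["commission", "competition", "malnutrition"],
    "competition": ["commission", "condition", "malnutrition"],
    "commission": ["competition", "condition", "malnutrition"],
    "oppression": ["question"],
}

def toy_rhymes(word):
    return TOY_RHYMES.get(word, [])

nouns_sample = ["condition", "competition", "commission", "oppression", "gearbox"]
usable = seeds_with_rhymes(nouns_sample, toy_rhymes)
seed = random.choice(sorted(usable))   # a first word that is guaranteed partners
print(seed, usable[seed])
```

With this sample, "oppression" and "gearbox" are filtered out because none of their rhymes occur in the text.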

In [203]:
rh_b_first = random.choice(nouns)
print(rh_b_first)

imperialism


In [204]:
rh_b = []
for i in pr.rhymes(rh_b_first):
    for rh_b_second in nouns:
        if(rh_b_second == i):
            rh_b.append(i)
print(rh_b)

['capitalism', 'capitalism', 'patriotism']


In [227]:
rh_c_first = random.choice(nouns)
print(rh_c_first)

oppression


In [228]:
rh_c = []
for i in pr.rhymes(rh_c_first):
    for rh_c_second in nouns:
        if(rh_c_second == i):
            rh_c.append(i)
print(rh_c)

['question']


In [108]:
rh_d_first = random.choice(nouns)
print(rh_d_first)

commission


In [233]:
rh_d = []
for i in pr.rhymes(rh_d_first):
    for rh_d_second in nouns:
        if(rh_d_second == i):
            rh_d.append(i)
print(rh_d)

['competition', 'condition', 'malnutrition']


## Result
So far I have been able to write one stanza that makes a little bit of sense while choosing the words as randomly as possible. What I want to fix now:
- Avoid some repetition by also choosing random words for the first part of the line, not only for the final rhyming word.
- Add grammatical rules for the generation of each line by adding connectors and prepositions (tracery modifiers are not working yet).
- Make each line actually 8 syllables! For this I would need some arithmetic: count the syllables of the rhyming word (x) and subtract them, so that the beginning of the line has 8 - x syllables.
- Try other syllable counts as Wikipedia suggests: *Given the flexible method of counting syllables in Spanish verse, where an "octosyllabic" line could easily have seven or nine syllables (as normally counted), in writing a décima in English it would seem not unreasonable to write in iambic pentameter (theoretically ten syllables), which comes more naturally to English verse.*
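
The "8 - x" idea from the list above could be sketched as follows. This is a minimal illustration under assumptions, not the notebook's implementation: `fill_line` is a hypothetical helper, the `pool` is a toy dictionary with hand-counted syllables, and the sketch ignores grammar entirely, so it only guarantees the syllable total.

```python
import random

def fill_line(end_word, end_syllables, pool, target=8):
    """Prepend random words from `pool` until the line has `target` syllables.

    `pool` maps a syllable count to the list of words with that count.
    """
    remaining = target - end_syllables   # the "8 - x" from the list above
    if remaining < 0:
        raise ValueError("the rhyming word alone exceeds the target")
    chosen = []
    while remaining > 0:
        # Only syllable counts that still fit and actually have words.
        options = [n for n in pool if 0 < n <= remaining and pool[n]]
        if not options:
            raise ValueError("the pool cannot fill the remaining syllables")
        n = random.choice(options)
        chosen.append(random.choice(pool[n]))
        remaining -= n
    return " ".join(chosen + [end_word])

# Toy pool with hand-counted syllables (an assumption, not CMU output).
pool = {1: ["it", "we", "was", "rich"],
        2: ["became", "foreign"],
        3: ["natural", "confident"]}
line = fill_line("oppression", 3, pool)
print(line)
```

As long as the pool contains at least one one-syllable word, the loop can always finish; otherwise it raises an error instead of producing a line of the wrong length.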

In [229]:
rules = {
    "line_a": ["#prp_1# #vb_1# #adjs_2# #noun_a#", "#prp_2# #vb_2# #adjs_3# #noun_a_f#"],
    
    "line_b": ["#prp_1# #vb_2# #adjs_3# #noun_b#", "#prp_1# #vb_1# #adjs_4# #noun_b_f#"],
    
    "line_c": ["#prp_1# #vb_3# #adjs_5# #noun_c#", "#prp_1# #vb_2# #adjs_2# #noun_c_f#"],
    
    "line_d": ["#prp_1# #vb_1# #adjs_4# #noun_d#", "#prp_1# #vb_1# #adjs_3# #noun_d_f#"],
    
    
    "prp_1": prnouns_s[1],
    "prp_2": prnouns_s[2],
    
    "vb_1": verbs_s[1],
    "vb_2": verbs_s[2],
    "vb_3": verbs_s[3],
    
    "adjs_1": adjs_s[1],
    "adjs_2": adjs_s[2],
    "adjs_3": adjs_s[3],
    "adjs_4": adjs_s[4],
    "adjs_5": adjs_s[5],
    
    "noun_a_f": rh_a_first,
    "noun_a": rh_a,
    "noun_b_f": rh_b_first,
    "noun_b": rh_b,
    "noun_c_f": rh_c_first,
    "noun_c": rh_c,
    "noun_d_f": rh_d_first,
    "noun_d": rh_d,

}
grammar = tracery.Grammar(rules)
print(grammar.flatten("#line_a#"))
print(grammar.flatten("#line_b#"))
print(grammar.flatten("#line_b#"))
print(grammar.flatten("#line_a#"))
print(grammar.flatten("#line_a#"))
print(grammar.flatten("#line_c#"))
print(grammar.flatten("#line_c#"))
print(grammar.flatten("#line_d#"))
print(grammar.flatten("#line_d#"))
print(grammar.flatten("#line_c#"))

it called Latin population
You became deceptive patriotism
it was dominated imperialism
itself surpassed plentiful generation
themselves surpassed general generation
it triumphed unjust oppression
us forbidden intrauterine question
we found confident commission
they did populated competition
it buried open oppression
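
The ten `print` calls above follow the ABBAACCDDC scheme by hand. A sketch of driving the same order from the scheme string is shown below. `build_stanza`, `line_makers` and the `endings` table are hypothetical; in the notebook, each generator could instead be something like `lambda: grammar.flatten("#line_a#")`.

```python
import random

RHYME_SCHEME = "ABBAACCDDC"   # the décima scheme quoted at the top

def build_stanza(line_makers, scheme=RHYME_SCHEME):
    """Call the line generator registered for each letter of the scheme, in order."""
    return [line_makers[letter]() for letter in scheme]

# Toy line endings per rhyme letter (illustrative only).
endings = {"A": ["nation", "population"],
           "B": ["capitalism", "patriotism"],
           "C": ["question"],
           "D": ["condition", "competition"]}

# One generator per letter; `words=words` binds each list at definition time.
line_makers = {letter: (lambda words=words: "it was " + random.choice(words))
               for letter, words in endings.items()}

stanza = build_stanza(line_makers)
print("\n".join(stanza))
```

Keeping the scheme as data makes it easy to try other stanza forms by changing a single string.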


In [118]:
from tracery.modifiers import base_english