# A simple haiku generator 

### Traditional Haiku Structure
A traditional haiku always has the same structure:

- It has three lines totaling 17 syllables.
- The first line has 5 syllables.
- The second line has 7 syllables.
- The third line has 5 syllables, like the first.
- Punctuation and capitalization are up to the poet and need not follow the usual rules of sentence structure.
- A haiku does not have to rhyme; in fact, it usually does not.
- It can include repetition of words or sounds.

Note: we are counting syllables and not words. 
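
To make the 5-7-5 rule concrete, here is a minimal sketch of a structure check. The names `DEMO_SYLLABLES`, `demo_count`, `line_syllables`, and `is_575` are hypothetical, and the syllable table is hand-made so the snippet runs without any installs; in this notebook you could pass `syllapy.count` as the `count` argument instead.

```python
import re

# Hypothetical hand-made syllable table, just large enough for this demo.
DEMO_SYLLABLES = {
    "the": 1, "old": 1, "pond": 1, "is": 1, "still": 1,
    "a": 1, "small": 1, "frog": 1, "jumps": 1, "in": 1,
    "water": 2, "sounds": 1, "again": 2,
}

def demo_count(word):
    """Look up a word's syllable count in the toy table."""
    return DEMO_SYLLABLES[word.lower()]

def line_syllables(line, count=demo_count):
    """Sum syllables over the words of a line, ignoring punctuation."""
    words = re.findall(r"[A-Za-z']+", line)
    return sum(count(word) for word in words)

def is_575(lines, count=demo_count):
    """True if the poem has exactly three lines of 5, 7 and 5 syllables."""
    return [line_syllables(line, count) for line in lines] == [5, 7, 5]

# A made-up poem for illustration.
poem = [
    "The old pond is still",
    "A small frog jumps in the pond",
    "Water sounds again",
]
print([line_syllables(line) for line in poem])  # [5, 7, 5]
print(is_575(poem))  # True
```

The last line has only three words but five syllables, which is why the count is done per syllable rather than per word.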

The 5-7-5 rhythm is often lost in translation, because a Japanese word rarely has the same number of syllables, or sounds, as its English counterpart. For example, "haiku" has two syllables in English, but in Japanese the word has three sounds.
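
The "three sounds" can be checked with a rough sketch that counts sounds in a word written in kana. It assumes the input contains only hiragana or katakana, and that every character is one sound except the small combining kana listed in `SMALL_KANA`. The function name `count_morae` is hypothetical and is not part of the generator.

```python
# Small kana that merge with the preceding character and add no extra sound.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")

def count_morae(kana):
    """Count sounds (morae) in a word written in hiragana or katakana."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)

print(count_morae("はいく"))  # 3: ha-i-ku
```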

The work here is inspired by Sean Zhai
https://reddotblues.medium.com/



#### setup (if not already installed)

In [None]:
#!pip3 install -U spacy
#!pip3 install syllapy

In [None]:
#!python -m spacy download en_core_web_sm

#### import relevant libs

In [27]:
import spacy
from spacy.matcher import Matcher
import syllapy
import random

#### create patterns

This notebook uses spaCy's rule-based `Matcher`: we define token patterns and run them over the text.

Patterns like these can be built and tested interactively with the online matcher demo:
https://demos.explosion.ai/matcher?text=A%20match%20is%20a%20tool%20for%20starting%20a%20fire.%20Typically%2C%20modern%20matches%20are%20made%20of%20small%20wooden%20sticks%20or%20stiff%20paper.%20One%20end%20is%20coated%20with%20a%20material%20that%20can%20be%20ignited%20by%20frictional%20heat%20generated%20by%20striking%20the%20match%20against%20a%20suitable%20surface.%20Wooden%20matches%20are%20packaged%20in%20matchboxes%2C%20and%20paper%20matches%20are%20partially%20cut%20into%20rows%20and%20stapled%20into%20matchbooks.&model=en_core_web_sm&pattern=%5B%7B%22id%22%3A0%2C%22attrs%22%3A%5B%7B%22name%22%3A%22POS%22%2C%22value%22%3A%22ADJ%22%7D%2C%7B%22name%22%3A%22OP%22%2C%22value%22%3A%22%3F%22%7D%5D%7D%2C%7B%22id%22%3A1%2C%22attrs%22%3A%5B%7B%22name%22%3A%22LEMMA%22%2C%22value%22%3A%22match%22%7D%2C%7B%22name%22%3A%22POS%22%2C%22value%22%3A%22NOUN%22%7D%5D%7D%2C%7B%22id%22%3A2%2C%22attrs%22%3A%5B%7B%22name%22%3A%22LEMMA%22%2C%22value%22%3A%22be%22%7D%5D%7D%5D
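
To illustrate what a token pattern does, here is a simplified pure-Python stand-in. It is not spaCy's implementation: it only slides a window over hand-tagged `(text, POS)` pairs and checks each slot against a set of allowed tags, with `None` meaning "any token". The name `find_matches` and the toy sentence are assumptions made for this sketch.

```python
def find_matches(tagged, pattern):
    """Return (start, end) index pairs where consecutive tokens fit the pattern.

    tagged  : list of (text, pos) tuples
    pattern : list with one entry per token slot; each entry is a set of
              allowed POS tags, or None to accept any token
    """
    n = len(pattern)
    matches = []
    for start in range(len(tagged) - n + 1):
        window = tagged[start:start + n]
        if all(allowed is None or pos in allowed
               for (_, pos), allowed in zip(window, pattern)):
            matches.append((start, start + n))
    return matches

# Toy sentence with tags written by hand (not produced by spaCy).
tagged = [("the", "DET"), ("old", "ADJ"), ("pond", "NOUN"),
          ("sleeps", "VERB"), ("quietly", "ADV")]

# Same tag sets as the two-token pattern used in this notebook.
two_words = [{"NOUN", "ADP", "ADJ", "ADV"}, {"NOUN", "VERB"}]

for start, end in find_matches(tagged, two_words):
    print(" ".join(text for text, _ in tagged[start:end]))
# old pond
# pond sleeps
```

The real `Matcher` returns `(match_id, start, end)` triples and supports many more token attributes, but the idea of matching a fixed sequence of token slots is the same.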

In [28]:
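nlp = spacy.load("en_core_web_sm")
# Raise the length limit (spaCy's default is 1,000,000 characters) so a whole novel fits in one call
nlp.max_length = 1000000000000
# One matcher per phrase length: two, three, and four tokens
matcher2 = Matcher(nlp.vocab)
matcher3 = Matcher(nlp.vocab)
matcher4 = Matcher(nlp.vocab)

# Two tokens: a noun, adposition, adjective or adverb followed by a noun or verb
pattern = [{'POS':  {"IN": ["NOUN", "ADP", "ADJ", "ADV"]} },
           {'POS':  {"IN": ["NOUN", "VERB"]} }]
matcher2.add("TwoWords", [pattern])
# Three tokens: the middle slot accepts any ASCII token that is not punctuation or whitespace
pattern = [{'POS':  {"IN": ["NOUN", "ADP", "ADJ", "ADV"]} },
           {'IS_ASCII': True, 'IS_PUNCT': False, 'IS_SPACE': False},
           {'POS':  {"IN": ["NOUN", "VERB", "ADJ", "ADV"]} }]
matcher3.add("ThreeWords", [pattern])
# Four tokens: same idea, with two free slots in the middle
pattern = [{'POS':  {"IN": ["NOUN", "ADP", "ADJ", "ADV"]} },
           {'IS_ASCII': True, 'IS_PUNCT': False, 'IS_SPACE': False},
           {'IS_ASCII': True, 'IS_PUNCT': False, 'IS_SPACE': False},
           {'POS':  {"IN": ["NOUN", "VERB", "ADJ", "ADV"]} }]
matcher4.add("FourWords", [pattern])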

#### load data
Here I use the text of a famous novel as the reference material for creating haiku.
I downloaded it from https://www.gutenberg.org/ in plain-text UTF-8 format.

In [29]:
file = r"HaikuGenerator\data\GreatExpectations.txt"
# Read the whole novel, close the file, then run the spaCy pipeline over the text
with open(file, encoding='utf-8') as f:
    doc = nlp(f.read())
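
Project Gutenberg text files typically wrap the book in license text, delimited by marker lines that begin with `*** START OF` and `*** END OF`. The optional sketch below trims everything outside those markers so that license phrases do not end up in a haiku. The function name `strip_gutenberg_boilerplate` is hypothetical, and the marker wording is an assumption that may not hold for every file, so the function returns the text unchanged when the markers are missing.

```python
import re

def strip_gutenberg_boilerplate(text):
    """Return the text between the START and END marker lines, if both exist."""
    # Assumed marker format; check it against the file you downloaded.
    start = re.search(r"^\*\*\* ?START OF .*$", text, flags=re.MULTILINE)
    end = re.search(r"^\*\*\* ?END OF .*$", text, flags=re.MULTILINE)
    if start and end and start.end() < end.start():
        return text[start.end():end.start()].strip()
    return text  # markers not found: leave the text unchanged

# Made-up miniature file for illustration.
sample = (
    "License header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "The body of the book.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "License footer\n"
)
print(strip_gutenberg_boilerplate(sample))  # The body of the book.
```

In this notebook the function would be applied to the string read from the file, before it is passed to `nlp`.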

#### create matches for syllables

In [32]:
matches2 = matcher2(doc)
matches3 = matcher3(doc)
matches4 = matcher4(doc)

g_5 = []
g_7 = []

#### create sets of 5 and 7 syllables

In [51]:
for match_id, start, end in matches2 + matches3 + matches4:
    span = doc[start:end]  # The matched span

    # Total syllables across all tokens in the span
    syl_count = sum(syllapy.count(token.text) for token in span)
    # Keep each unique phrase once, in the list for its syllable count
    if syl_count == 5 and span.text not in g_5:
        g_5.append(span.text)
    elif syl_count == 7 and span.text not in g_7:
        g_7.append(span.text)
print("Enter for a new haiku. ^C to quit\n")

Enter for a new haiku. ^C to quit
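
The grouping step can also be sketched on its own. This is an alternative under stated assumptions, not the notebook's code: `bucket_by_syllables` is a hypothetical helper, and the `TOY` table stands in for `syllapy.count` so the snippet runs without extra packages. It stores phrases as dictionary keys, which keeps first-seen order and avoids scanning a list for every duplicate check.

```python
from collections import defaultdict

def bucket_by_syllables(phrases, count, wanted=(5, 7)):
    """Group unique phrases by total syllable count, keeping first-seen order."""
    buckets = defaultdict(dict)  # dict keys act as an ordered set
    for phrase in phrases:
        total = sum(count(word) for word in phrase.split())
        if total in wanted:
            buckets[total][phrase] = None
    return {n: list(buckets[n]) for n in wanted}

# Hypothetical syllable table standing in for syllapy.count in this demo.
TOY = {"seldom": 2, "if": 1, "ever": 2, "baby": 2, "on": 1, "her": 1,
       "lap": 1, "set": 1, "of": 1, "ivory": 3, "tablets": 2}

phrases = ["seldom if ever", "baby on her lap", "seldom if ever",
           "set of ivory tablets", "her lap"]
groups = bucket_by_syllables(phrases, TOY.__getitem__)
print(groups[5])  # ['seldom if ever', 'baby on her lap']
print(groups[7])  # ['set of ivory tablets']
```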



#### Keep generating Haikus :)

In [None]:
while True:
    print(random.choice(g_5), random.choice(g_7), random.choice(g_5), sep="\n")
    input("\n")

classes were holden
attentively engaged
previously been quite


attempts at pieces
out in his accustomed
seldom if ever


in which country boys
set of ivory tablets
slowly unclenched


laws regulating
exemplary transactions
knife many a time


baby on her lap
O equal to anythink
to the first figure
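
For a non-interactive variant, the selection step can be wrapped in a function. The sketch below is an assumption-laden alternative to the loop above: `make_haiku` is a hypothetical name, the short lists stand in for `g_5` and `g_7`, and it uses `sample` so that the first and last lines always differ (the loop above may pick the same line twice).

```python
import random

def make_haiku(five, seven, rng=random):
    """Pick two different 5-syllable lines and one 7-syllable line."""
    first, last = rng.sample(five, 2)  # sampling without replacement
    return "\n".join([first, rng.choice(seven), last])

# Small stand-ins for g_5 and g_7, taken from the sample output above.
five = ["seldom if ever", "baby on her lap", "to the first figure"]
seven = ["set of ivory tablets", "exemplary transactions"]

rng = random.Random(42)  # fixed seed so repeated runs give the same haiku
print(make_haiku(five, seven, rng))
```

Passing a seeded `random.Random` instance makes the output reproducible, which helps when comparing runs; omit the `rng` argument to get a different haiku each time.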
