Your task is to create a random haiku generator program. A haiku is a poem like:

Whitecaps on the bay: <br>
A broken signboard banging <br>
In the April wind. <br>
— Richard Wright, collected in Haiku: This Other World, 1998, copied from <a href = "https://en.wikipedia.org/wiki/Haiku">Wikipedia</a>

A haiku is defined not by a rhyme pattern, but by the number of syllables in each line. Traditionally, a haiku has three lines: <br>
First: Five syllables. <br>
Seven in the second line, <br>
and Five in the third. <br>

Your random haiku generator will, almost surely, generate haikus worthy of literary praise. Of course, it will generate many, many more bad haikus, like:

dorr comfortably <br>
mahabharata brandwein <br>
boltz gumbert fromm glib <br>

Before you begin, you might need to use the NLTK downloader to get the corpora <tt>cmudict</tt> and <tt>words</tt>. If they are already installed, the following should succeed.

In [1]:
from nltk.corpus import cmudict
from nltk.corpus import words

If not, you need to get them.

In [None]:
import nltk
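# If the imports above fail, uncomment these lines to fetch the two corpora:
# nltk.download('cmudict')
# nltk.download('words')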


I'm going to give you some help. <tt>cmudict.dict()</tt> returns a Python dictionary in which each key is a word and the corresponding value is a <i>list</i> containing ways of pronouncing the word. When there is more than one pronunciation, the list has more than one element. I suggest you explore some entries in <tt>cmudict.dict()</tt> to get a better sense of what's going on. Try looking up the pronunciations of "hello" and "goodbye". <i>Don't worry at all if you don't understand how to interpret the pronunciations. I don't. It's irrelevant for this problem. The only point is that each key is a word and the corresponding value is a <i>list</i> containing some representation of ways of pronouncing the word</i>.

In [3]:
# Explore this object.
# Suggestion: don't print it all.
# That would take a while...

d = cmudict.dict()  # build the dictionary once and reuse it
print type(d)
print d.keys()[0]
d['fawn']  # the pronunciation(s) of 'fawn'

<type 'dict'>
fawn


[[u'F', u'AO1', u'N']]

Based on this, we can write a function that determines the number of syllables in a given word.

In [4]:
def nsyl(word):
    # Return the number of syllables in each pronunciation of the given word.
    return [len([y for y in x if y[-1].isdigit()]) for x in d[word.lower()]]


You don't need to understand <i>exactly</i> how the function works, because that would require understanding how the dictionary represents pronunciations. But in short, <tt>nsyl</tt> does some processing on the pronunciations to determine the number of syllables in each pronunciation. Before proceeding, I suggest you try it out on "hello", "goodbye", and maybe some other common words to get a sense of how it works. 
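If you are curious anyway, here is a minimal sketch of the idea on a hand-typed toy dictionary. It assumes, as in the <tt>'fawn'</tt> entry shown above, that vowel sounds are the phonemes ending in a digit, so counting those phonemes counts syllables. The names <tt>toy</tt>, <tt>count_syllables</tt>, and <tt>nsyl_toy</tt> are made up for this illustration, and the "hello" entries are typed in by hand rather than loaded from NLTK. The sketch is written so that it runs under both Python 2 and Python 3.

```python
# Toy stand-in for cmudict.dict(). These entries are typed by hand for
# illustration (an assumption of this sketch), not loaded from NLTK.
toy = {
    'fawn': [['F', 'AO1', 'N']],
    'hello': [['HH', 'AH0', 'L', 'OW1'], ['HH', 'EH0', 'L', 'OW1']],
}


def count_syllables(pron):
    """Count the phonemes that end in a digit (assumed to be the vowels)."""
    return sum(1 for phoneme in pron if phoneme[-1].isdigit())


def nsyl_toy(word, pron_dict=toy):
    """Return one syllable count per listed pronunciation of the word."""
    return [count_syllables(pron) for pron in pron_dict[word.lower()]]


print(nsyl_toy('fawn'))   # [1]
print(nsyl_toy('Hello'))  # [2, 2]
```

The lookup lowercases the word first, which is why <tt>'Hello'</tt> works even though the toy keys are all lowercase.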

In [7]:
# try it
print nsyl('hello') # if a word has more than one pronunciation, the list holds one syllable count per pronunciation
print nsyl('goodbye')


[2, 2]
[2]


1) Create a dictionary <tt>d2</tt> in which each key is an integer and the corresponding value is a list of all words with that many syllables. <i>For words with multiple pronunciations, consider only the first pronunciation. </i>

In [8]:
d2 = {} #or d2 = dict()
for word in d.keys():
    n = nsyl(word)[0] # nsyl returns a list; use the first pronunciation only
    try:
        d2[n].append(word)
    except KeyError: # first word seen with n syllables
        d2[n] = [word]
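The same grouping can also be sketched with <tt>collections.defaultdict</tt>, which creates the empty list on first access so no <tt>try</tt> block is needed. This is only one possible alternative, shown on a hypothetical hand-made table <tt>toy_counts</tt> that stands in for <tt>nsyl(word)[0]</tt> on the real data; the name <tt>by_syllables</tt> is likewise invented for the example.

```python
from collections import defaultdict

# Hypothetical syllable counts standing in for nsyl(word)[0] on the real data.
toy_counts = {'fawn': 1, 'hello': 2, 'goodbye': 2, 'amiability': 6}

# Missing keys start out as an empty list, so append always works.
by_syllables = defaultdict(list)
for word, n in toy_counts.items():
    by_syllables[n].append(word)

print(sorted(by_syllables[2]))  # ['goodbye', 'hello']
print(by_syllables[6])          # ['amiability']
```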
    

In [10]:
print d2[2]
print nsyl('sonji')

[2]


2) One word in the dictionary contains more syllables than any other. Print this word.

In [13]:
n_max = max(d2.keys()) # the largest syllable count is 14
print d2[n_max][0]

supercalifragilisticexpealidoshus


3) Print the number of words for each syllable count, like:
<pre>
0: 4
1: 16240
2: 56982
</pre>
etc... <br>
Note that there are some words with zero syllables. That's fine. Not all "words" in the dictionary are real English words. We'll revisit this in the very last step.

In [14]:
for n in sorted(d2.keys()): # sort so the counts print in increasing order
    print str(n)+': '+str(len(d2[n])) # number of words with n syllables

0: 4
1: 16240
2: 56982
3: 33850
4: 12132
5: 3398
6: 722
7: 108
8: 15
9: 2
12: 1
14: 1


4) Write a function <tt>sylPattern(n)</tt> that returns a list of random integers that sum to <tt>n</tt>. (Later, you will choose for each element of the list a random word with the given number of syllables.) Your function should work for any n > 1. Test it for n = 15. For example:
<pre>
x = sylPattern(15)
print x
print sum(x) == 15
</pre>
should print something like:
<pre>
[1, 5, 1, 1, 5, 2]
True
</pre>
<i>Hint: You don't need to know any special functions to do this; all you need is basic random number generation capability from the </i><tt>random</tt><i> module. The rest of the algorithm is up to you.</i>

In [27]:
# Your code here
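import random

def sylPattern(n):
    l = []
    while n > 0:
        x = random.randint(1, 5) # each word gets between 1 and 5 syllables
        if x <= n: # discard draws that would overshoot the remaining total
            l.append(x)
            n = n - x
    return l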

In [28]:
x = sylPattern(15) # get a syllable pattern that sums to 15
print x
print sum(x) == 15

[2, 4, 2, 5, 2]
True
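A single run only checks one pattern. A slightly more thorough test, sketched below, repeats the check for many values of <tt>n</tt>. The function <tt>syl_pattern</tt> and its <tt>max_part</tt> parameter are hypothetical names for this illustration, not part of the assignment. Instead of discarding draws that overshoot, it draws directly from the range that still fits, which I would expect to give the same distribution of patterns.

```python
import random


def syl_pattern(n, max_part=5):
    """Return positive integers, each at most max_part, that sum to n."""
    parts = []
    remaining = n
    while remaining > 0:
        # Draw only from values that still fit in the remaining total.
        part = random.randint(1, min(max_part, remaining))
        parts.append(part)
        remaining -= part
    return parts


# Repeat the check many times, since any single pattern is random.
for n in range(1, 31):
    for _ in range(200):
        pattern = syl_pattern(n)
        assert sum(pattern) == n
        assert all(1 <= p <= 5 for p in pattern)

print(syl_pattern(15))
```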


5) Write and test a function <tt>randWord(n)</tt> that returns a random word with <tt>n</tt> syllables. For instance:
<pre>
print randWord(6)
</pre>
shows something like:
<pre>
amiability
</pre>

In [31]:
# Your code here
def randWord(n):
    # random.choice picks uniformly among all words with n syllables
    return random.choice(d2[n])

In [33]:
randWord(5)
print nsyl('continually')

[5, 4]
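One detail worth knowing when drawing a random element from a list: <tt>random.randint(a, b)</tt> includes both endpoints, so an index drawn with <tt>randint(0, len(seq))</tt> can land one past the end of the list, whereas <tt>random.choice(seq)</tt> always returns an element. The short sketch below illustrates this on a hypothetical three-word list <tt>words_with_n</tt> that stands in for <tt>d2[n]</tt>.

```python
import random

words_with_n = ['alpha', 'beta', 'gamma']  # hypothetical stand-in for d2[n]

# randint includes both endpoints, so len(words_with_n) itself can be drawn,
# and that value is not a valid index into the list.
draws = set(random.randint(0, len(words_with_n)) for _ in range(1000))
print(sorted(draws))

# random.choice never goes out of range.
picks = set(random.choice(words_with_n) for _ in range(1000))
print(sorted(picks))
```

If you do want an explicit index, <tt>random.randint(0, len(seq) - 1)</tt> or <tt>random.randrange(len(seq))</tt> stays within bounds.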


6) Write and test a function <tt>randLine(n)</tt> that returns a line of random words, separated by spaces, with <tt>n</tt> syllables in total. For instance:
<pre>
randLine(10)
</pre>
shows something like:
<pre>
porcupine melodrama gable scot
</pre>

In [40]:
# Your code here
def randLine(n):
    pattern = sylPattern(n)
    result_string= ''
    for i in pattern:
        word = randWord(i)
        result_string += word + " "
    result_string += '\n'
    return result_string


In [41]:
#randLine(5)
print nsyl('knickerbockered')
print nsyl('rep\'s')

[4]
[1]


7) Finally, write and test haiku(). For instance:
<pre>
print haiku()
</pre>
should show a haiku formatted like:
<pre>
psalm degenerate
lapsed land mend holl franchiser
chia ill pint draft
</pre>

In [39]:
# Your code here
def haiku():
    print randLine(5)
    print randLine(7)
    print randLine(5)

In [42]:
haiku()

galati lu kohls 

patuxet indemnities 

kilburg dofasco 
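The task statement calls <tt>print haiku()</tt>, which suggests a function that returns the poem as a string rather than printing it. Below is a minimal end-to-end sketch of that variant. It is only an illustration under stated assumptions: <tt>toy_d2</tt> is a tiny hand-made stand-in for <tt>d2</tt> with syllable counts assigned by hand, and <tt>rand_line</tt> and <tt>haiku_text</tt> are hypothetical names. Joining with <tt>' '</tt> and <tt>'\n'</tt> avoids trailing spaces and blank lines between the three lines.

```python
import random

# Hypothetical miniature of d2 (syllable count -> words), assigned by hand.
toy_d2 = {
    1: ['bay', 'wind', 'glib', 'fawn'],
    2: ['hello', 'goodbye', 'broken'],
    3: ['porcupine', 'syllable'],
    4: ['melodrama', 'degenerate'],
    5: ['continually', 'mahabharata'],
}


def rand_line(n, max_part=5):
    """Return random words, separated by spaces, totalling n syllables."""
    words = []
    remaining = n
    while remaining > 0:
        k = random.randint(1, min(max_part, remaining))
        words.append(random.choice(toy_d2[k]))
        remaining -= k
    return ' '.join(words)


def haiku_text():
    """Return a 5-7-5 haiku as one string with lines separated by newlines."""
    return '\n'.join(rand_line(n) for n in (5, 7, 5))


print(haiku_text())
```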

