# Hidden Markov Model

_Experiments with HMMs on texts from the NLTK Gutenberg corpus_

---

### Imports

In [20]:
import nltk
import random as rand

In [2]:
from nltk import corpus
# If the corpus data is not installed, run nltk.download('gutenberg') once first.
gb = corpus.gutenberg

### Load Texts

In [223]:
print(gb.fileids())
whitman = gb.words('whitman-leaves.txt')
alice = gb.words('carroll-alice.txt')
emma = gb.words('austen-emma.txt')

['austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt', 'bryant-stories.txt', 'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt', 'chesterton-brown.txt', 'chesterton-thursday.txt', 'edgeworth-parents.txt', 'melville-moby_dick.txt', 'milton-paradise.txt', 'shakespeare-caesar.txt', 'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt', 'whitman-leaves.txt']

### Class Definition

In [220]:
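class HMM:
    """Word-level text generator.

    Despite the name, this is a first-order Markov chain over observed
    words (a bigram model): there are no hidden states or emissions.
    """
    def __init__(self, input_text):
        self.text = input_text      # sequence of tokens, e.g. gb.words(...)
        self.transitions = {}       # word -> {next_word: probability}
        self.sentence = " "
        self.buildHMM()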
    
    def gen(self, length):
        self.buildSentence(length)
        print(self.sentence)
    
    def buildHMM(self):
        # count how often each lowercased word is followed by each other word
        for i in range(len(self.text) - 1):
            word = self.text[i].lower()
            next_word = self.text[i+1].lower()
            followers = self.transitions.setdefault(word, {})
            followers[next_word] = followers.get(next_word, 0) + 1

        # normalize each row of counts into probabilities that sum to 1
        for followers in self.transitions.values():
            total = sum(followers.values())
            for next_word in followers:
                followers[next_word] /= total
    
    def buildSentence(self, length):
        for i in range(length):
            if i == 0:
                # start from a random alphabetic word
                current = self.randWordText()
                self.sentence = current.capitalize()
                continue

            current = self.weightedWord(current.lower())
            if current in (".", "?", "!"):
                # sentence-ending punctuation is set off by a space
                self.sentence += " " + current
                continue

            if current.isalpha() or current.isnumeric():
                if i % 10 == 0:
                    # break the line when the 10th, 20th, ... token is a word
                    self.sentence += "\n" + current.capitalize()
                else:
                    self.sentence += " " + current
            else:
                # other punctuation attaches directly to the previous token
                self.sentence += current
    
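    def weightedWord(self, current):
        # inverse-CDF sampling: walk the cumulative probabilities until
        # they pass a uniform draw from [0, 1]
        rand_val = rand.uniform(0, 1)
        total = 0
        for k, v in self.transitions[current].items():
            total += v
            if rand_val <= total:
                return k
        # floating-point rounding can leave the total just under 1
        return rand.choice(list(self.transitions[current].keys()))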
    
    def randWordText(self):
        current = rand.choice(self.text)
        # make sure the first word is not a symbol
        while not current.isalpha():
            current = rand.choice(self.text)
        return current
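
To make the transition table concrete, here is a minimal, self-contained sketch on a hand-made token list. The function name `build_transitions` and the toy tokens are illustrative assumptions, not part of the notebook; the logic is meant to mirror `buildHMM` above (count adjacent pairs, then normalize each row).

```python
from collections import Counter, defaultdict

def build_transitions(tokens):
    """Return {word: {next_word: probability}} for adjacent token pairs."""
    # hypothetical helper, written only for this illustration
    counts = defaultdict(Counter)
    for word, next_word in zip(tokens, tokens[1:]):
        counts[word.lower()][next_word.lower()] += 1

    table = {}
    for word, followers in counts.items():
        total = sum(followers.values())
        table[word] = {nxt: n / total for nxt, n in followers.items()}
    return table

# toy input (an assumption, not taken from the corpus)
toy_tokens = ["the", "cat", "sat", "on", "the", "mat", "."]
toy_table = build_transitions(toy_tokens)
print(toy_table["the"])  # {'cat': 0.5, 'mat': 0.5}
```

Note that `.` never becomes a key in the toy table because it is the last token. In the class above, a word that occurs only as the final token of the text would likewise have no entry, so `weightedWord` would raise a `KeyError` if the chain reached it.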

In [221]:
whitmanHMM = HMM(whitman)
whitmanHMM.gen(100)

Yet o soul in the emigrant and spheric there within
Me as for once to the object, it avails
Not for maize, turbulent musical shuttle, back at
The inducements shall solve the sea,) o boundless blue ! thou in his voice and how quick from its
Embower' d by every one advancing ! lo !
O trumpeter free march, revoltress ! thou with me, ascending mount and utter joyous of an ox-
Deck the beauty, of men leaning my spirit,
Far behind . but include them, martyrs, the


In [222]:
aliceHMM = HMM(alice)
aliceHMM.gen(100)

Ground-- but she came rather doubtfully: and how
She made entirely disappeared;' why do no use
In the queen to think ! good way you might
As she succeeded in the lizard, and i'
T squeeze so suddenly upon a bat ! you have
Lessons to sea!" and half those are,'
And, low, it all stopped and an invitation
For apples, i breathe when it was very little
Glass .' s asleep again!'' t know
Whether you are very soon as much frightened by it


In [225]:
emmaHMM = HMM(emma)
emmaHMM.gen(100)

And all at once, and suffering, too,
Though poor mr . if this, however, but
Just what do better . if you and, was
Convinced . nobody else . woodhouse' s brain during
The other any man like him . my own .
A pity that was no time . emma engaging .
Don' s." emma . elton on purpose."" indifferent as propriety.-- my heart at ford'
S situation, and a cordial as was engaged.--
Very-- no, and emma; i ought,
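
Each run of the cells above produces different text because `rand` is never seeded. The following is a minimal sketch of reproducible sampling, under stated assumptions: the helper `sample_chain` and the table `demo_table` are hypothetical and do not appear in the notebook. It swaps the cumulative-sum loop of `weightedWord` for the standard library's `random.choices`, and uses a private `random.Random(seed)` generator so that a fixed seed gives a fixed walk.

```python
import random

def sample_chain(table, start, length, seed=None):
    """Walk a {word: {next_word: probability}} table for up to `length` words."""
    rng = random.Random(seed)  # private generator; same seed, same walk
    words = [start]
    while len(words) < length:
        followers = table.get(words[-1])
        if not followers:  # dead end: the word has no recorded successor
            break
        candidates = list(followers)
        weights = list(followers.values())
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return words

# hand-made table (an assumption); "mat" deliberately has no successors
demo_table = {
    "the": {"cat": 0.5, "mat": 0.5},
    "cat": {"sat": 1.0},
    "sat": {"on": 1.0},
    "on": {"the": 1.0},
}
first = sample_chain(demo_table, "the", 8, seed=0)
second = sample_chain(demo_table, "the", 8, seed=0)
print(first == second)  # True: identical seeds give identical walks
```

Stopping at a dead end is one possible design choice; restarting from a random word, as `randWordText` does for the first token, would be another.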
