## Markov Chain
- Probabilistic model for text/natural language generation
- Simple and effective way of generating new text
    - Text
    - Lyrics
    - Story/Novel
    - Code

In [2]:
text = "the man was ....they...then.... the ... the  "

# X is a sequence of K = 3 characters and Y is the predicted (K+1)-th character

# X      Y     Freq
# the    " "    4
# the    "n"    2
# the    "y"    1
# the    "i"    1
# man    "_"    1

In [3]:
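def generateTable(data,k=4):
    """Count how often each character Y follows each k-character context X."""
    T = {}
    for i in range(len(data)-k):
        X = data[i:i+k]   # context: k consecutive characters
        Y = data[i+k]     # character that follows the context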
        
        if T.get(X) is None:
            T[X] = {}
            T[X][Y] = 1
        else:
            if T[X].get(Y) is None:
                T[X][Y] = 1
            else:
                T[X][Y] += 1
    
    return T
        
    

In [4]:
T = generateTable("hello hello helli")
print(T)

{'hell': {'o': 2, 'i': 1}, 'ello': {' ': 2}, 'llo ': {'h': 2}, 'lo h': {'e': 2}, 'o he': {'l': 2}, ' hel': {'l': 2}}
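
The same counting step can be written more compactly with the standard library. The following is a minimal sketch, not a replacement for `generateTable`; the name `generate_table_counter` is hypothetical and is introduced only for illustration.

```python
from collections import Counter, defaultdict

def generate_table_counter(data, k=4):
    """Map each k-character context to a Counter of the characters that follow it."""
    table = defaultdict(Counter)  # missing contexts start with an empty Counter
    for i in range(len(data) - k):
        context = data[i:i + k]   # k consecutive characters
        next_char = data[i + k]   # the character right after the context
        table[context][next_char] += 1
    return table

table = generate_table_counter("hello hello helli")
print(dict(table["hell"]))  # {'o': 2, 'i': 1}
```

A `defaultdict(Counter)` removes the nested `is None` checks, at the cost of silently creating an entry whenever a missing context is looked up with `table[...]`.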


In [5]:
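def convertFreqIntoProb(T):
    """Normalize each context's counts into probabilities (modifies T in place)."""
    for kx in T.keys():
        s = float(sum(T[kx].values()))  # total count for this context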
        for k in T[kx].keys():
            T[kx][k] = T[kx][k]/s
                
    return T

In [6]:
T = convertFreqIntoProb(T)
print(T)

{'hell': {'o': 0.6666666666666666, 'i': 0.3333333333333333}, 'ello': {' ': 1.0}, 'llo ': {'h': 1.0}, 'lo h': {'e': 1.0}, 'o he': {'l': 1.0}, ' hel': {'l': 1.0}}
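
Note that `convertFreqIntoProb` overwrites the counts in the table it is given. If the raw counts are still needed afterwards, a non-mutating variant is one option. This is a small sketch under that assumption; `to_probabilities` is a hypothetical helper name, not part of the notebook.

```python
def to_probabilities(counts):
    """Return a new table of probabilities; `counts` itself is left unchanged."""
    probs = {}
    for context, followers in counts.items():
        total = sum(followers.values())  # total count for this context
        probs[context] = {ch: n / total for ch, n in followers.items()}
    return probs

counts = {"hell": {"o": 2, "i": 1}, "ello": {" ": 2}}
probs = to_probabilities(counts)
print(probs["hell"])   # probabilities for each follower of "hell"
print(counts["hell"])  # the original counts are still available
```

For every context the resulting probabilities sum to 1, which is what `np.random.choice` expects for its `p` argument.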


In [7]:
text_path = "harrypotter1.txt"
def load_text(filename):
    with open(filename,encoding='utf8') as f:
        return f.read().lower()
    
text = load_text(text_path)
#text = load_text("sample_code.txt")

In [8]:
print(text[:1000])

/ 




the boy who lived 

mr. and mrs. dursley, of number four, privet drive, 
were proud to say that they were perfectly normal, 
thank you very much. they were the last people you’d 
expect to be involved in anything strange or 
mysterious, because they just didn’t hold with such 
nonsense. 

mr. dursley was the director of a firm called 
grunnings, which made drills. he was a big, beefy 
man with hardly any neck, although he did have a 
very large mustache. mrs. dursley was thin and 
blonde and had nearly twice the usual amount of 
neck, which came in very useful as she spent so 
much of her time craning over garden fences, spying 
on the neighbors. the dursley s had a small son 
called dudley and in their opinion there was no finer 
boy anywhere. 

the dursleys had everything they wanted, but they 
also had a secret, and their greatest fear was that 
somebody would discover it. they didn’t think they 
could bear it if anyone found out about the potters. 
mrs. potter was mrs. dursl

## Train our Markov Chain

In [9]:
def trainMarkovChain(text,k=4):
    
    T = generateTable(text,k)
    T = convertFreqIntoProb(T)
    
    return T
    

In [10]:
model = trainMarkovChain(text)

In [11]:
print(model)

{'/ \n\n': {'\n': 1.0}, ' \n\n\n': {'\n': 1.0}, '\n\n\n\n': {'\n': 0.37344398340248963, 't': 0.052904564315352696, 'm': 0.012448132780082987, 'c': 0.006224066390041493, 'p': 0.18568464730290457, 'h': 0.052904564315352696, 'a': 0.02074688796680498, 'e': 0.002074688796680498, 'd': 0.00933609958506224, '“': 0.17323651452282157, 'f': 0.008298755186721992, 'w': 0.017634854771784232, 'o': 0.008298755186721992, 'b': 0.00933609958506224, '3': 0.001037344398340249, 'k': 0.002074688796680498, 'n': 0.008298755186721992, 's': 0.011410788381742738, 'v': 0.0031120331950207467, '•': 0.002074688796680498, 'i': 0.007261410788381743, 'u': 0.0031120331950207467, 'q': 0.004149377593360996, '1': 0.002074688796680498, 'l': 0.002074688796680498, 'r': 0.008298755186721992, '7': 0.001037344398340249, 'y': 0.006224066390041493, '9': 0.001037344398340249, 'g': 0.002074688796680498, 'j': 0.002074688796680498}, '\n\n\nt': {'h': 0.8823529411764706, 'r': 0.0196078431372549, 'u': 0.0392156862745098, 'w': 0.0196078431

## Generate Text at Test Time!


In [12]:
import numpy as np

In [13]:
# sampling!
fruits = ["apple","banana","mango"]
prob = [0.8,0.1,0.1]   # probabilities must be numbers that sum to 1
for i in range(10):
    # sample according to a probability distribution
    print(np.random.choice(fruits,p=prob))


apple
mango
apple
apple
apple
apple
apple
apple
mango
banana
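
Ten draws are too few to see the distribution clearly. The sketch below draws many samples and compares the empirical frequencies with the requested probabilities. It uses NumPy's `default_rng` generator with an arbitrarily chosen seed so that the run is repeatable; both the seed and the sample size are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
fruits = ["apple", "banana", "mango"]
prob = [0.8, 0.1, 0.1]

# Draw 10,000 samples at once instead of looping
draws = rng.choice(fruits, size=10_000, p=prob)

# Count how often each fruit was drawn and convert to relative frequencies
values, counts = np.unique(draws, return_counts=True)
freqs = {str(v): int(c) / len(draws) for v, c in zip(values, counts)}
print(freqs)
```

The printed frequencies should land close to 0.8, 0.1 and 0.1, which is the behaviour the text generator relies on when it samples the next character.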


In [14]:
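def sample_next(ctx,T,k):
    """Sample the next character given the last k characters of ctx."""
    ctx = ctx[-k:]
    # unseen context: fall back to a space so generation can continue
    if T.get(ctx) is None:
        return " "
    possible_Chars = list(T[ctx].keys())
    possible_values = list(T[ctx].values())
    
    # draw one character according to the learned probabilities
    return np.random.choice(possible_Chars,p=possible_values)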

In [15]:
sample_next("comm",model,4)

'o'

In [16]:
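def generateText(starting_sent,k=4,maxLen=1000):
    """Extend starting_sent by maxLen sampled characters, using the global `model`."""
    sentence = starting_sent
    ctx = starting_sent[-k:]   # last k characters form the current context
    
    for ix in range(maxLen):
        next_prediction = sample_next(ctx,model,k)
        sentence += next_prediction
        ctx = sentence[-k:]    # slide the context window forward by one character
    return sentence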

In [19]:

# use a separate name so the training corpus in `text` is not overwritten
generated_text = generateText("Ron,Harry and herm",k=4,maxLen=2000)
print(generated_text)

Ron,Harry and hermione - j.k. rowling await outta get on the dursleys sparks any more cat onto asked, but that was fat really 
thrown was creat on, an’ ron was feeling ...” 

page | 122 harry look should the hat, he hourse noise believe it 
would 
back.” 

hair tea what,” said. “i wonderin left in 
through. he hadn’t stree.” 





“we’ve dog, i’ve has glided quirrell on they sat until the pelts class harry couldn’t was gone — unlessor first quidditching hot a funny, which 
cloak on hair, and got him. not fan urged a horse — you. our sister advention, 
parcel told — can’t, the philosophers stone ignore have.” 

the other, will, i can to 
dormitory didn’t stone train? howls through 
the fully to they’d training they leave begin on much the pulled soft him then.” 



page | 262 harved its hermione, very coulderstand. harry. mr. 
mars are to do, hagrid drived slythere, potter attacked hermione, and his to they’re you ther lap, he airplant forcerer’s he started 
for gryffingers, i was me sw
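
The whole pipeline (count, normalize, sample) can also be tried on a tiny corpus without the book file. The following is a self-contained sketch, not the notebook's own code: `build_model`, `generate`, `model_small` and the toy corpus are hypothetical names and data chosen for illustration, the model is passed in explicitly instead of being read from a global, and a seeded generator makes the output repeatable.

```python
import numpy as np

def build_model(text, k):
    """Count k-character contexts and normalize the counts into probabilities."""
    counts = {}
    for i in range(len(text) - k):
        ctx, nxt = text[i:i + k], text[i + k]
        counts.setdefault(ctx, {})
        counts[ctx][nxt] = counts[ctx].get(nxt, 0) + 1
    model = {}
    for ctx, followers in counts.items():
        total = sum(followers.values())
        model[ctx] = {ch: n / total for ch, n in followers.items()}
    return model

def generate(seed_text, model, k, max_len, rng):
    """Append max_len sampled characters to seed_text."""
    out = seed_text
    for _ in range(max_len):
        followers = model.get(out[-k:])  # look up the last k characters
        if followers is None:
            out += " "                   # unseen context: same fallback as sample_next
            continue
        chars = list(followers.keys())
        out += str(rng.choice(chars, p=list(followers.values())))
    return out

corpus = "hello hello helli hello "      # toy corpus, an assumption for this demo
model_small = build_model(corpus, k=4)
rng = np.random.default_rng(42)          # arbitrary seed for repeatable output
sample = generate("hell", model_small, k=4, max_len=20, rng=rng)
print(sample)
```

Passing the model and the random generator as arguments keeps the function independent of notebook state, so re-running cells in a different order does not change what it samples from.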

## Congrats, you have learnt how to build your own text generator!
## How about a rap/song lyrics generator or a WhatsApp autocomplete as an assignment?

![](modi.gif)