## Markov Chain
- Probabilistic model for text / natural language generation
- Simple and effective way of generating new text
    - Text
    - Lyrics
    - Story/Novel
    - Code

In [None]:
text = "the man was ....they...then.... the ... the  "

# X is a window of K = 3 consecutive characters; Y is the character that follows it (the (K+1)-th character)

X      Y     Freq
the    " "    4
the    "n"    2
the    "y"    1
the    "i"    1
man    "_"    1
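
A table like the one above can be built for a concrete string. The following is a minimal sketch rather than the notebook's own implementation: `count_transitions` and the sample string are hypothetical, chosen only for illustration. It counts how often each character follows each window of K = 3 characters.

```python
from collections import Counter

def count_transitions(text, k=3):
    """Count (context, next_char) pairs, where context is k characters long."""
    counts = Counter()
    # The last k characters have no successor, so the loop stops before them.
    for i in range(len(text) - k):
        context = text[i:i + k]
        next_char = text[i + k]
        counts[(context, next_char)] += 1
    return counts

# Hypothetical sample text, not the string used in the cell above.
sample = "the then they the "
counts = count_transitions(sample, k=3)

# Print the rows whose context is "the" in X / Y / Freq form.
for (context, next_char), freq in sorted(counts.items()):
    if context == "the":
        print(f"{context}    {next_char!r}    {freq}")
```

Each `(X, Y)` pair becomes one row of the table, and the count is the `Freq` column.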

In [1]:
import numpy as np

In [2]:
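def generateTable(text,k = 4):
    """Count how often each character follows each k-character context."""
    text = text.lower()
    d = {}
    # The last k characters have no successor, so they are never used as a context
    for i in range(len(text)-k):
        t = text[i:i+k]   # context: k consecutive characters
        p = text[i+k]     # the character that follows the context
        if t in d:
            d[t][p] = d[t].get(p,0)+1
        else:
            d[t] = {p:1}
    return d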

In [7]:
T = generateTable("hello helli hello hello")
print(T)

{'hell': {'o': 3, 'i': 1}, 'ello': {' ': 2}, 'llo ': {'h': 2}, 'lo h': {'e': 2}, 'o he': {'l': 2}, ' hel': {'l': 3}, 'elli': {' ': 1}, 'lli ': {'h': 1}, 'li h': {'e': 1}, 'i he': {'l': 1}}


In [8]:
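def generateProb(text,k=4):
    """Turn the frequency table into conditional probabilities P(next char | context)."""
    prob = generateTable(text,k)
    for w in prob:
        tc = sum(prob[w].values())   # total count for this context
        for c in prob[w]:
            prob[w][c] /= tc         # each context's probabilities sum to 1
    return prob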

In [10]:
# T = generateProb("hello hello helli")
T = generateProb('hello helli hello hello')
print(T)

{'hell': {'o': 0.75, 'i': 0.25}, 'ello': {' ': 1.0}, 'llo ': {'h': 1.0}, 'lo h': {'e': 1.0}, 'o he': {'l': 1.0}, ' hel': {'l': 1.0}, 'elli': {' ': 1.0}, 'lli ': {'h': 1.0}, 'li h': {'e': 1.0}, 'i he': {'l': 1.0}}


In [11]:
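def nextSequenceChar(seq,prob):
    """Sample the next character for the context seq from the probability table."""
    if seq not in prob:
        # Context never seen in the training text: fall back to a space
        return ' '
    else:
        return np.random.choice(list(prob[seq].keys()),p = list(prob[seq].values()))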

In [15]:
for i in range(10):
    print(nextSequenceChar("hell",T))

o
o
o
i
o
i
i
o
o
o
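
Ten draws are too few to see the underlying probabilities clearly. The sketch below repeats the sampling many times and compares the observed frequencies with the 0.75 / 0.25 split printed for the context `'hell'`. It is an illustration under assumptions: it uses the standard library's `random.choices` instead of `np.random.choice`, and `sample_next`, `hell_probs`, and the seed value are hypothetical names and values.

```python
import random
from collections import Counter

# Probabilities copied from the printed table for the context "hell".
hell_probs = {"o": 0.75, "i": 0.25}

def sample_next(dist, rng):
    """Draw one character from a {char: probability} mapping."""
    chars = list(dist)
    weights = list(dist.values())
    return rng.choices(chars, weights=weights, k=1)[0]

rng = random.Random(0)   # arbitrary fixed seed so repeated runs match
n_draws = 10_000
tally = Counter(sample_next(hell_probs, rng) for _ in range(n_draws))

# Observed relative frequency of each character.
freqs = {ch: tally[ch] / n_draws for ch in hell_probs}
print(freqs)
```

With this many draws the observed frequencies should land close to the table values, which is a quick sanity check that the sampler respects the probabilities.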


In [16]:
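def generateText(initial_text,prob,max_len = 10,k=4):
    """Extend initial_text by max_len characters, one sampled character at a time."""
    initial_text = initial_text.lower()
    seq = initial_text[-k:]          # current context: the last k characters
    g_text = initial_text
    for _ in range(max_len):
        ch = nextSequenceChar(seq,prob)
        g_text = g_text+ch
        seq = g_text[-k:]            # slide the context forward by one character
    return g_text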

In [18]:
generateText("hell",T,max_len=50)

'hello helli hello hello helli hello helli helli hello '
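
The output above depends on the seed's last `k` characters appearing as a context in the table. The sketch below illustrates what the space fallback does when they do not. It is a self-contained approximation of the functions above, not a replacement for them: `build_probs` and `generate` are hypothetical names, and sampling uses the standard library's `random.choices` instead of NumPy.

```python
import random

def build_probs(text, k=4):
    """Estimate P(next char | previous k chars) from raw counts."""
    counts = {}
    for i in range(len(text) - k):
        ctx, nxt = text[i:i + k], text[i + k]
        counts.setdefault(ctx, {})
        counts[ctx][nxt] = counts[ctx].get(nxt, 0) + 1
    return {ctx: {c: n / sum(d.values()) for c, n in d.items()}
            for ctx, d in counts.items()}

def generate(seed, probs, max_len=10, k=4):
    """Append max_len characters to seed, using a space for unseen contexts."""
    out = seed.lower()
    for _ in range(max_len):
        dist = probs.get(out[-k:])
        if dist is None:
            out += " "   # same fallback as nextSequenceChar
        else:
            out += random.choices(list(dist), weights=list(dist.values()))[0]
    return out

probs = build_probs("hello helli hello hello")

# "abcd" never occurs in the training text, so every context is unseen.
stuck = generate("abcd", probs, max_len=10)
print(repr(stuck))

# A seed that ends in a known context continues normally.
print(repr(generate("hell", probs, max_len=6)))
```

Because a run of spaces is itself an unseen context in this tiny training text, the first call only appends spaces. Choosing a seed whose last `k` characters occur in the training text avoids this.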

In [19]:
def markovProcess(text,seed_text,k = 4,max_len = 100):
    """Build the probability table from text, then generate max_len characters after seed_text."""
    prob = generateProb(text,k)
    g_text = generateText(seed_text,prob,max_len,k)
    return g_text

In [21]:
with open('english_speech_2.txt','r') as file:
    print(markovProcess(file.read(),'dear',k=4,max_len = 1000))

dear country, along had the glory of the service, in 12 year, on their great service for all the country independence this freedom of sacrifice andhra pradesh - our daughters of their misery.

my devoted to be a hundred years of the tricolor of their sacres of our heroes, i bow down to proud of the are celebrating today, i am very happiness. i heart to be a hundred in the countrymen the are celebrating their greet the celebrating the country's in our loved on the are countrymen hanged on everest.


my dear country is like the new excitement, new excitement, among the holy festival of the country, along with hard world, to backward, by give gay

i bow my heart to social justice. today is protect all are country's in the commissions under these status to overcrowding forceful with hard world, today is freedom of luck.

today and gives, have jubilee prisons of the glory of the tricoloring the country, our daught a new excitement to the session of freedom to be a hundred force, for a confi