## Markov Chain
- Probabilistic model for text / natural language generation
- Simple and effective way of generating new content, such as:
    - Text
    - Lyrics
    - Story/Novel
    - Code

In [None]:
text = "the man was ....they...then.... the ... the  "

# X is a context of K = 3 consecutive characters, Y is the character that follows it (the (K+1)-th character),
# and Freq is the number of times Y follows X in the text

X      Y     Freq
the    " "    4
the    "n"    2
the    "y"    1
the    "i"    1
man    "_"    1

In [7]:
import numpy as np

In [47]:
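def generateTable(text,k = 4):
    # Map each k-character context to the counts of the characters that follow it
    text = text.lower()
    d = {}
    for i in range(len(text)-k):
        t = text[i:i+k]   # current context of k characters
        p = text[i+k]     # character that follows the context
        if t in d:
            d[t][p] = d[t].get(p,0)+1
        else:
            d[t] = {p:1}
    return d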

In [48]:
T = generateTable("hello hello helli")
print(T)

{'hell': {'o': 2, 'i': 1}, 'ello': {' ': 2}, 'llo ': {'h': 2}, 'lo h': {'e': 2}, 'o he': {'l': 2}, ' hel': {'l': 2}}
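The same frequency table can also be built with the standard library's `collections.Counter` and `defaultdict`, which removes the explicit `if t in d` branch. This is only a sketch of an alternative, not a replacement for `generateTable`; the name `build_table` is made up for illustration.

```python
from collections import Counter, defaultdict

def build_table(text, k=4):
    """Count, for every k-character context, which character follows it."""
    table = defaultdict(Counter)  # missing contexts start as empty counters
    text = text.lower()
    for i in range(len(text) - k):
        context = text[i:i + k]
        next_char = text[i + k]
        table[context][next_char] += 1
    return table

table = build_table("hello hello helli")
print(dict(table["hell"]))  # {'o': 2, 'i': 1}
```

A `Counter` compares equal to a plain dictionary with the same counts, so the result can be checked directly against the table printed above.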


In [49]:
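def generateProb(text,k=4):
    # Turn the raw follow-up counts into probabilities that sum to 1 per context
    prob = generateTable(text,k)
    for w in prob:
        tc = sum(prob[w].values())   # total occurrences of context w
        for c in prob[w]:
            prob[w][c] /= tc
    return prob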

In [50]:
T = generateProb("hello hello helli")
print(T)

{'hell': {'o': 0.6666666666666666, 'i': 0.3333333333333333}, 'ello': {' ': 1.0}, 'llo ': {'h': 1.0}, 'lo h': {'e': 1.0}, 'o he': {'l': 1.0}, ' hel': {'l': 1.0}}
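The normalisation performed by `generateProb` can be written as a relative-frequency estimate of the transition probability. The symbols here are notation introduced for this note only: $s$ is a context of $k$ characters, $c$ is a candidate next character, and $N(s, c)$ is the count stored in the frequency table.

```latex
P(c \mid s) = \frac{N(s, c)}{\sum_{c'} N(s, c')}
\qquad\text{e.g.}\qquad
P(\texttt{o} \mid \texttt{hell}) = \frac{2}{2 + 1} = \frac{2}{3} \approx 0.667
```

This matches the value printed for `'hell'` above.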


In [51]:
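def nextSequenceChar(seq,prob):
    # Fall back to a space when the context was never seen in the training text
    if seq not in prob:
        return ' '
    else:
        # Sample the next character according to its estimated probability
        return np.random.choice(list(prob[seq].keys()),p = list(prob[seq].values()))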

In [52]:
for i in range(10):
    print(nextSequenceChar("hell",T))

o
o
o
o
o
o
i
o
i
o
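Ten draws are too few to judge whether the sampler follows the 2/3 vs 1/3 split for the context `hell`. A rough check is to draw many samples and compare the observed share of `o` with the table value. The sketch below uses the standard library's `random.choices` in place of `np.random.choice` so that it runs without NumPy; the seed and the sample size are arbitrary choices.

```python
import random
from collections import Counter

# Distribution for the context "hell", copied from the probability table above.
dist = {"o": 2 / 3, "i": 1 / 3}

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = rng.choices(list(dist.keys()), weights=list(dist.values()), k=10_000)

freq = Counter(draws)
share_o = freq["o"] / len(draws)  # should be close to 2/3
print(round(share_o, 2))
```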


In [55]:
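def generateText(initial_text,prob,max_len = 10,k=4):
    # Start from the seed and append one sampled character per step
    initial_text = initial_text.lower()
    seq = initial_text[-k:]      # the last k characters form the current context
    g_text = initial_text
    for i in range(max_len):
        ch = nextSequenceChar(seq,prob)
        g_text = g_text+ch
        seq = g_text[-k:]        # slide the context window forward by one character
    return g_text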

In [56]:
generateText("hell",T,max_len=20)

'hello helli             '
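The trailing spaces in this output come from the fallback in `nextSequenceChar`: once `helli` has been generated, the context `elli` is not in the table, so a space is returned, and every later context (`lli `, `li  `, and so on) is unseen as well. One possible alternative, shown here only as a sketch, is to stop generating at the first unseen context. The function name `generate_until_unseen` and its `rng` parameter are hypothetical, and `random.choices` stands in for `np.random.choice`.

```python
import random

def generate_until_unseen(seed, prob, max_len=20, k=4, rng=None):
    """Extend `seed` one character at a time; stop at an unseen context."""
    rng = rng or random.Random()
    out = seed.lower()
    for _ in range(max_len):
        context = out[-k:]
        if context not in prob:
            break  # no statistics for this context, so stop instead of padding
        chars = list(prob[context].keys())
        weights = list(prob[context].values())
        out += rng.choices(chars, weights=weights, k=1)[0]
    return out

# Transition probabilities for "hello hello helli" with k = 4, as printed earlier.
prob = {
    "hell": {"o": 2 / 3, "i": 1 / 3},
    "ello": {" ": 1.0},
    "llo ": {"h": 1.0},
    "lo h": {"e": 1.0},
    "o he": {"l": 1.0},
    " hel": {"l": 1.0},
}

result = generate_until_unseen("hell", prob, max_len=20, rng=random.Random(0))
print(repr(result))
```

With this toy table the result either ends in `helli` or reaches the `max_len` limit, and it is never padded with runs of spaces.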

In [59]:
def markovProcess(text,seed_text,k = 4,max_len = 100):
    prob = generateProb(text,k)
    # print(prob)
    g_text = generateText(seed_text,prob,max_len,k)
    return g_text

In [65]:
with open('english_speech_2.txt','r') as file:
    print(markovProcess(file.read(),'dear',k=4,max_len = 1000))

dear country's sixth largest economy. in that time parliament, among with their rights.

today.

my dear country, many good rajya bagh. how order to the festival of the glory of tried to protect the evidence and floods, when our soldiers of sacrifice forceful with hard work. today's independence. i heartily great men hanging the festival of independence, in the hanging us to the celebrating the are celebrating the countrymen, the country and how long. this freedom of you wishes of the leadership of the hanged ones due to lives in the countrymen, to living this time parliament, among the seven seas and flower lives in the soldiers of independence in their live and die, the house ranks of massacrifice of the evidence.

today is not india has brought new constitutional of the world's sixth largest tricolor of dreams with sensitivity and happiness. i heartily resolve of the respect the commissions of independence at the has tricolor flag to be oppressed seven states crossing in the so many