<a href="https://colab.research.google.com/github/shivangisahay14/MLH-LOCAL-HACKDAY/blob/main/Markov_Chain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Language Modelling** A language model attempts to learn the structure of natural language. A key feature of language modelling is that it is generative: the model predicts the next word given the preceding sequence of words. It can do this because language models are typically trained on very large datasets in an unsupervised manner, so the model can “learn” the syntactic features of language.

To predict the next word of a sentence, the model needs to know quite a lot about the language and quite a lot about the world. Here is an example:


> “I’d like to eat a hot ___”: Obviously, “dog”, right?




> “It was a hot ___”: Probably “day”
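
One way to make "predict the next word given the preceding sequence" concrete is the order-$k$ Markov approximation, which is what the counting code later in this notebook appears to implement at the character level. The symbols $w_t$ (the token at position $t$) and $\operatorname{count}$ are notation assumed here for illustration, not taken from the notebook:

$$
P(w_t \mid w_1, \dots, w_{t-1}) \;\approx\; P(w_t \mid w_{t-k}, \dots, w_{t-1}) \;\approx\; \frac{\operatorname{count}(w_{t-k} \dots w_{t-1}\, w_t)}{\operatorname{count}(w_{t-k} \dots w_{t-1})}
$$

With a short context such as "a hot", both "dog" and "day" get some probability; a longer context is what lets the model separate the two sentences above.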



**Mini-Project : Fake Text Generator**


- Fake News Story
- Fake Speech
- Fake Bollywood Song 

In [None]:
text = """Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Well, the party was nice, the party was pumpin'
Yippie yi yo
And everybody havin' a ball
Yippie yi yo
I tell the fellas start the name callin'
Yippie yi yo
And the girls respond to the call
I heard a woman shout out
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
I see de dance people had a ball
'Coz she really want to skip town
Get back, Gruffy, back, Scruffy
Get back you flea infested mongrel
Gonna tell myself, "Hey, man, no get angry"
Yippie yi yo
To any girls callin' them canine
Yippie yi yo
But they tell me, "Hey, man, it's part of the party?
Yippie yi yo
You put a woman in front and her man behind
I heard woman shout out
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Say, a doggy is nuttin' if he don' have a bone
All doggy, hold ya' bone, all doggy, hold it
A doggy is nuttin' if he don' have a bone
All doggy, hold ya' bone, all doggy, hold it
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
I see de dance people had a ball
'Coz she really want to skip town
Get back, Gruffy, back, Scruffy
Get back you flea infested mongrel
Well, if I am a dog, the party is on
I gotta get my groove 'cause my mind done gone
Do you see the rays comin' from my eye
Walkin' through the place that Digi-man is breakin' it down?
Me and my white short shorts
And I can't see color, any color will do
I'll stick on you, that's why they call me 'Pit bull'
'Cause I'm the man of the land
When they see me they say, ? Who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?
Who let the dogs out?
Who, who, who, who, who?"""

text = text.lower()

In [None]:
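# Training pairs for a context length of k = 4.
# X is the k-character context, y is the character that follows it.
# "_" stands for a space.
#
# X          y
# =============
# hell       o
# ello       _
# llo_       h
# .
# .
# .
# hell       o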
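
The X/y table above can be produced with a sliding window over the text. The following is a minimal sketch; the helper name `context_pairs` and the sample string `"hello hello"` are assumptions made for illustration and are not part of the notebook.

```python
def context_pairs(data, k=4):
    """Return (context, next_char) pairs from a sliding window of length k."""
    # The last valid window starts at len(data) - k - 1, so that data[i + k] exists.
    return [(data[i:i + k], data[i + k]) for i in range(len(data) - k)]


pairs = context_pairs("hello hello", k=4)
for ctx, nxt in pairs:
    # repr() makes the space character visible in the printout
    print(repr(ctx), "->", repr(nxt))
```

Each pair is one training example: the model only ever sees the previous `k` characters when it predicts the next one.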

In [None]:
def learnProb(data, k=4):
    """Map each k-character context to the probabilities of the next character."""
    T = {}

    # Step 1 - Count how often each character Y follows each context X
    for i in range(len(data) - k):
        X = data[i:i+k]
        Y = data[i+k]

        if X not in T:
            T[X] = {}
        T[X][Y] = T[X].get(Y, 0) + 1

    # Step 2 - Convert counts into probabilities
    for kx in T:
        s = float(sum(T[kx].values()))
        # Use `ch` here so the loop variable does not shadow the parameter `k`
        for ch in T[kx]:
            T[kx][ch] = T[kx][ch] / s

    return T
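
The same two steps (count, then normalise) can also be written with `collections.Counter`. This is a sketch of an equivalent formulation rather than the notebook's own code; `learn_prob_counter` and the sample string are hypothetical names chosen here. It also checks that every row of the table sums to 1.

```python
from collections import Counter, defaultdict
import math


def learn_prob_counter(data, k=4):
    """Count next characters per k-character context, then normalise each row."""
    counts = defaultdict(Counter)
    for i in range(len(data) - k):
        counts[data[i:i + k]][data[i + k]] += 1

    table = {}
    for ctx, counter in counts.items():
        total = sum(counter.values())
        table[ctx] = {ch: n / total for ch, n in counter.items()}
    return table


table = learn_prob_counter("hello hells", k=4)
print(table["hell"])  # "hell" is followed once by "o" and once by "s"

# Sanity check: each context's probabilities should sum to 1
rows_ok = all(math.isclose(sum(row.values()), 1.0) for row in table.values())
print(rows_ok)
```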

In [None]:
T = learnProb(text)
print(T)

{'indi': {'a': 1.0}, 'ndia': {"'": 1.0}, "dia'": {'s': 1.0}, "ia's": {' ': 1.0}, "a's ": {'c': 1.0}, "'s c": {'o': 1.0}, 's co': {'v': 1.0}, ' cov': {'i': 1.0}, 'covi': {'d': 1.0}, 'ovid': {' ': 1.0}, 'vid ': {'c': 0.5, 'd': 0.25, 'p': 0.25}, 'id c': {'a': 1.0}, 'd ca': {'s': 1.0}, ' cas': {'e': 1.0}, 'case': {'l': 0.16666666666666666, 's': 0.8333333333333334}, 'asel': {'o': 1.0}, 'selo': {'a': 1.0}, 'eloa': {'d': 1.0}, 'load': {' ': 1.0}, 'oad ': {'s': 1.0}, 'ad s': {'u': 1.0}, 'd su': {'r': 1.0}, ' sur': {'g': 1.0}, 'surg': {'e': 1.0}, 'urge': {'d': 0.5, ' ': 0.5}, 'rged': {' ': 1.0}, 'ged ': {'t': 0.25, '6': 0.25, 'i': 0.25, '3': 0.25}, 'ed t': {'o': 0.5, 'h': 0.5}, 'd to': {' ': 1.0}, ' to ': {'1': 0.2, 'f': 0.2, 'g': 0.2, 'h': 0.2, '9': 0.2}, 'to 1': {'.': 1.0}, 'o 1.': {'4': 1.0}, ' 1.4': {'5': 1.0}, '1.45': {' ': 1.0}, '.45 ': {'c': 1.0}, '45 c': {'r': 1.0}, '5 cr': {'o': 1.0}, ' cro': {'r': 1.0}, 'cror': {'e': 1.0}, 'rore': {' ': 1.0}, 'ore ': {'c': 1.0}, 're c': {'a': 0.5, 'o'

In [None]:
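## Step 3 - Predict the next letter
import numpy as np

def next_letter(ctx, T, k):
    # Only the last k characters form the context
    ctx = ctx[-k:]

    # Unseen context: fall back to a space
    if T.get(ctx) is None:
        return " "

    possible_letters = list(T[ctx].keys())
    probs = list(T[ctx].values())

    # Sample the next character according to the learned probabilities
    return np.random.choice(possible_letters, p=probs)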
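
To see what sampling from one row of the table does, the sketch below draws many characters from a single distribution and compares the observed frequencies with the probabilities. It uses the standard library's `random.choices` in place of `np.random.choice`, purely so the example has no dependencies; `sample_next` is a hypothetical helper, and the distribution is the `'vid '` row from the printed table.

```python
import random
from collections import Counter


def sample_next(dist, rng):
    """Draw one character from a {char: probability} mapping."""
    chars = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(chars, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
dist = {"c": 0.5, "d": 0.25, "p": 0.25}

n_draws = 10_000
draws = Counter(sample_next(dist, rng) for _ in range(n_draws))
freqs = {ch: n / n_draws for ch, n in draws.items()}
print(freqs)  # should be close to dist, not exactly equal
```

Because the next character is sampled rather than always taken as the most likely one, the generator can produce different text on each run.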

In [None]:
next_letter("who ",T,4)

['l'] [1.0]


In [None]:
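### Step 4 - Generate a complete text

# The seed should be k characters that occur in the training text;
# "covi" comes from the news article assigned to `text` in the last cell.
output = "covi"
for i in range(2000):
    pred = next_letter(output, T, 4)
    output += pred

print(output)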

covid deaths, which is its biggest ever daily spike - as the novel coronavirus caseload surged 63,729 infections.

the last 24 hours as the city's active a boost the reported in the sharpest-ever daily spike - as the reported 141 deaths in the country saw deaths were discharged 63,729 infectional capital also reported 141 deaths, which is its biggest data.

while, maharashtra and delhi on friday recover daily spike - as the novel coronavirus cases for their biggest one-day of pandemic this year.
prime minister narendra modi this morning the recovernment data.

while, maharashtra, the national capital also reported in the third stressingle-day of pandemic.

meanwhile, maharashtra logged 63,729 infectional capital also reported in the country saw deadliest one-day of pandemic this year.
prime minister narendra modi this morning to 90.94 period.

maharashtra, the sharpest-ever 61,000 coronavirus pandemic this year.
prime minister narendra modi this morning the city rate has dropped that i
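
The generation loop can be wrapped in a function with an optional seed, which makes a run reproducible. This is a self-contained sketch under stated assumptions: `build_table` and `generate` are hypothetical names, the training string is a short excerpt repeated three times, and `random.choices` stands in for `np.random.choice`.

```python
import random


def build_table(data, k):
    """Count next characters per k-character context and normalise to probabilities."""
    table = {}
    for i in range(len(data) - k):
        ctx, nxt = data[i:i + k], data[i + k]
        row = table.setdefault(ctx, {})
        row[nxt] = row.get(nxt, 0) + 1
    for row in table.values():
        total = sum(row.values())
        for ch in row:
            row[ch] /= total
    return table


def generate(table, start, length, k, seed=None):
    """Append `length` sampled characters to `start`, one character at a time."""
    rng = random.Random(seed)
    out = start
    for _ in range(length):
        row = table.get(out[-k:])
        if row is None:
            out += " "  # unseen context: same fallback as next_letter
            continue
        out += rng.choices(list(row), weights=list(row.values()), k=1)[0]
    return out


sample = "who let the dogs out? who, who, who, who, who? " * 3
table = build_table(sample, k=4)
a = generate(table, "who ", 60, k=4, seed=42)
b = generate(table, "who ", 60, k=4, seed=42)
print(a)
print(a == b)  # the same seed gives the same text
```

Passing `seed=None` restores the non-deterministic behaviour of the original loop.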

In [None]:
text = """India's Covid caseload surged to 1.45 crore cases with 2,34,692 fresh infections - the sharpest-ever daily spike - as the country recorded over 2 lakh cases for the third straight day. 1,341 deaths were reported in the last 24 hours as the country saw deadliest day of pandemic this year.
Prime Minister Narendra Modi this morning appealed that annual Kumbh Mela "should now only be symbolic" amid the novel coronavirus pandemic, stressing that it will give a boost to fight against the pandemic.

Meanwhile, Maharashtra and Delhi on Friday reported their biggest ever single-day surge in coronavirus cases, according to government data.

While Delhi reported 19,486 Covid cases in the last 24 hours, Maharashtra logged 63,729 infections.

The national capital also reported 141 deaths, which is its biggest one-day Covid death count. The city's active cases have risen to highest-ever 61,000. The recovery rate has dropped to 90.94 per cent.

Delhi's positivity rate was 19.69 per cent. 12,649 Covid patients were discharged in the last 24 hours. Nearly 99,000 coronavirus tests were conducted in the city during the period.

Maharashtra, the worst-hit state, logged 398 deaths in the last 24 hours. 45,335 patients were discharged""".lower()