# Challenge - Apna Time Aayega!
## Apna Time Aayega - Lyrics Generation!
### Use Markov Chains to create a Predictive Model for Text
In this fun challenge, you will use machine learning to generate song lyrics for 'Apna Time Aayega' from the movie Gully Boy (a 2019 Indian Hindi-language musical drama film). You are given a training set containing the lyrics of the song, performed by Ranveer Singh. Your task is to train a Markov chain model that generates lyrics resembling the actual ones.

<img src="1651780.svg" width = 10% style="text-align:left;">

## Dataset
The dataset is a text file containing the actual lyrics of the song. Since the dataset was scraped from the internet, you need to remove the starting and ending tags and clean the text before feeding it to the model.
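
The exact tags in the scraped file are not shown here, so the following is only a minimal cleaning sketch. It assumes the stray markup consists of HTML-like angle-bracket tags; `clean_lyrics` and the sample string are hypothetical and not part of the challenge material.

```python
import re

def clean_lyrics(raw):
    """Strip HTML-like tags and surrounding whitespace (hypothetical helper)."""
    # Assumption: the unwanted tags look like <tag> ... </tag>
    no_tags = re.sub(r"<[^>]+>", "", raw)
    # Normalise Windows line endings so '\r' never becomes a model character
    no_tags = no_tags.replace("\r\n", "\n")
    return no_tags.strip()

sample = "<pre>Apna time aayega\r\nUth ja apni raakh se</pre>"
cleaned = clean_lyrics(sample)
print(repr(cleaned))
```

If the real file uses a different kind of wrapper, only the regular expression should need to change.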
### Train
https://www.dropbox.com/s/b194dcosl4ri6eh/Apna%20Time%20Aayega.txt?dl=1
### Test


## Submission Format
Submit a '.txt' file containing generated lyrics of up to 2000 characters. Your model should be able to generate newline characters as well.

Your lyrics must start with the word 'apna', and you must use a numpy random seed of 11 so that the result is reproducible.

Your result will be evaluated word by word against the expected output given by the Markov chain.

You can assume that the prediction of the current character depends only on the last 4 characters (use K=4 in the Markov chain model).
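
To make the K=4 assumption concrete, here is a small sketch that counts which character follows each 4-character context in a toy string. The toy text and the name `count_transitions` are illustrative assumptions, not the challenge data or its reference solution.

```python
from collections import Counter, defaultdict

def count_transitions(text, k=4):
    """Count, for every k-character context, the characters that follow it."""
    table = defaultdict(Counter)
    for i in range(len(text) - k):
        context = text[i:i + k]      # the last k characters seen
        next_char = text[i + k]      # the character to be predicted
        table[context][next_char] += 1
    return table

toy = "apna time aayega apna time aaya"
table = count_transitions(toy)
print(dict(table["apna"]))   # {' ': 2}
print(dict(table[" aay"]))   # {'e': 1, 'a': 1}
```

The context `" aay"` is followed once by `e` and once by `a`, so a model built from this toy text would pick either continuation with equal probability.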

## Scoring

Your score will depend on the number of words that match the expected output. Take care of whitespace, including newline characters.
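
The actual scoring script is not given, so the following is only one plausible reading of "word by word" matching. It assumes words are compared position by position and that every whitespace character occupies its own position; `word_match_score` is a hypothetical name.

```python
import re

def word_match_score(predicted, expected):
    """Count words that match at the same token position (assumed scoring rule)."""
    def tokenize(s):
        # A token is either a run of non-whitespace or one whitespace character
        return re.findall(r"\S+|\s", s)

    pairs = zip(tokenize(predicted), tokenize(expected))
    return sum(1 for p, e in pairs if p == e and not p.isspace())

expected = "apna time aayega\nuth ja"
print(word_match_score("apna time aayega\nuth ja", expected))   # 5
print(word_match_score("apna time  aayega\nuth ja", expected))  # 2
```

Under this reading, a single extra space shifts every later word out of position, which is why whitespace handling matters.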

## Importing Libraries

In [49]:
import numpy as np

In [50]:
def loadDataset(path):
    # Read the whole file and lowercase it so the model sees a single case
    with open(path, encoding='utf8') as f:
        return f.read().lower()

x_train = loadDataset('./Apna Time Aayega.txt')
print(x_train[:50])

apna time aayega
uth ja apni raakh se
tu udd ja ab


## Creating Markov Chain Model

In [51]:
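def generateTable(text, k=4):
    # Map each k-character context to the counts of the characters that follow it
    T = {}
    for i in range(len(text) - k):
        x = text[i:i+k]   # context: k consecutive characters
        y = text[i+k]     # the character that follows the context
        if x not in T:
            T[x] = {}
        T[x][y] = T[x].get(y, 0) + 1
    return T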

In [52]:
def convertFreqToProbability(T):
    # Normalise each context's counts in place so that they sum to 1
    for context in T.keys():
        total_freq = sum(T[context].values())
        for y in T[context].keys():
            T[context][y] /= total_freq
    return T
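
As a worked example of the normalisation step, the sketch below turns a hand-written count table into probabilities. The counts are made up for illustration, and `normalise` is a hypothetical helper that returns a new dictionary instead of modifying the table in place.

```python
# Made-up counts: after the context "aaye", 'g' was seen 3 times and '\n' once
counts = {"aaye": {"g": 3, "\n": 1}}

def normalise(row):
    """Divide every count in a row by the row total."""
    total = sum(row.values())
    return {ch: n / total for ch, n in row.items()}

probs = {context: normalise(row) for context, row in counts.items()}
print(probs)  # {'aaye': {'g': 0.75, '\n': 0.25}}
```

Each row of the resulting table is a probability distribution over the next character, which is the form `np.random.choice` expects for its `p` argument.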

In [53]:
def trainMarkovChain(text, k=4):
    # Pass k through so the table uses the requested context length
    T = generateTable(text, k)
    T = convertFreqToProbability(T)
    return T

In [54]:
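# Fix the seed so every run samples the same sequence of characters
np.random.seed(11)
def samplingText(start_seq, T, k):
    # Use only the last k characters as the context
    start_seq = start_seq[-k:]
    if T.get(start_seq) is None:
        # Unseen context: fall back to a space
        return " "
    possible_char = list(T[start_seq].keys())
    possible_val = list(T[start_seq].values())

    # Draw the next character according to its learned probability
    return np.random.choice(possible_char, p=possible_val)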
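
The role of the seed can be checked in isolation. The sketch below draws from a made-up two-character distribution twice, reseeding before each run; `draw`, `chars`, and `probs` are illustrative names and values only.

```python
import numpy as np

chars = ["g", "\n"]      # made-up candidate characters
probs = [0.75, 0.25]     # made-up probabilities, summing to 1

def draw(n, seed=11):
    """Reseed the global generator, then sample n characters."""
    np.random.seed(seed)
    return [str(np.random.choice(chars, p=probs)) for _ in range(n)]

first = draw(10)
second = draw(10)
print(first == second)  # True
```

Because the seed is global state, any extra call to `np.random` between seeding and generation changes the sequence. The order of `chars` also matters, so the order in which keys are inserted into the transition table affects the generated text.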

In [55]:
def generatingText(start_seq, model, k=4, max_len=1000):
    # max_len counts generated characters and does not include start_seq
    sentence = start_seq
    start_seq = start_seq[-k:]
    for _ in range(max_len):
        next_prediction = samplingText(start_seq, model, k)
        sentence = sentence + next_prediction
        start_seq = sentence[-k:]

    return sentence

In [56]:
model = trainMarkovChain(x_train)
y_pred = generatingText('apna', model, max_len=2000)
print(y_pred)

apna time aaya hath nahi
utna hi to aaya hai seene se
matlab bana lala
mere bhai tu
utna time aaya hai
phir bhi satayega
zinda mera khud ki hai, amaana lala
tujhe na mila paseene se jeenenge
sab kuchh mil payega
jitni rehmat mein
har raakh se
tu nanga hi to khauf nahin hai jaisa koyi hai seene mein nahi hai
kyon ki ab laalach nai hai
kya ghanta lekar jaayega
kya tu dafnayega
apna time aayega
kya tu ghanta lekar..
 
kyon
kyon ka hai
kya ghanta lekar jaayega
apna time aayega
zinda mera khwaab hai chheene mein
 
kyon ki hai jaisa shaan pighlayega
jitni taakat ne hi barkat ki mehnat ki, amaana ab talaash mein
parwane ki hai
jo darr ko bhi ladka sehmat mein nahi
utna hi to khaayega
 
yeh shabdon ka hai
zaroorat ki
himaakat di hairat ki, ibaadat ki hai
kya ghanta lekar jaayega
apna time aayega
 
tu nanga hi barkat ki
adalat yahaan par
yahaan marzi ki
jeetne ki
jeetne ki mehnat se main
jitna hi to aayega
 
ab hai
kya tu dafnayega
mere jaisa mera khud ki hai
kya ghanta lekar jaayega
kya tu gha

In [58]:
# Save the generated lyrics to a text file for submission
def saveResult(text):
    with open('./pred.txt', 'w', encoding='utf8') as f:
        f.write(text)
saveResult(y_pred)
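
Before submitting, it may help to check the text against the stated rules. The sketch below assumes the 2000-character limit is strict and counts the starting word; `check_submission` is a hypothetical helper and not part of the challenge.

```python
def check_submission(text, start="apna", limit=2000):
    """Return a list of rule violations (an empty list means none found)."""
    problems = []
    if not text.startswith(start):
        problems.append(f"text does not start with {start!r}")
    if len(text) > limit:
        problems.append(f"text has {len(text)} characters, limit is {limit}")
    return problems

demo = "apna time aayega\n" * 3
print(check_submission(demo))             # []
print(len(check_submission("x" * 2001)))  # 2
```

Note that generating with `max_len=2000` from the seed word 'apna' produces 2004 characters in total. If the limit is read strictly, slicing with `text[:2000]` before saving is one way to stay within it.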