## Markov Chain: Indian Cricket Commentary

The sentences in the commentary dataset are short, so the Markov chain built from them generates fairly convincing sentences.

Interesting Sentences Generated:
   - Drags it from outside off and manages to get four runs.
   - We saw him adjust very well to Dhawan when he was given the role of anchoring the innings.
   - Shakib to Kohli, out Caught by Shreyas Iyer!! Needed something special to get out of the way.
   - Was making room and the ball jammed off his thigh pad before nestling into Dhoni's mitts.

### Import Dataset

In [10]:
import markovify
import nltk
import pandas as pd
import re

In [3]:
pd.set_option('display.max_colwidth', 5000)
df = pd.read_csv('IPL_Match_Highlights_Commentary.csv')
df = df.loc[:, ['Commentary']]
df['Commentary'][0]

'Nehra to Mandeep, FOUR, first boundary for Mandeep and RCB. Full and on the pads, needed to be put away and Mandeep did just that, picked it up and dispatched it over mid-wicket, couple of bounces and into the fence'

### Pre-Processing the Text

In [4]:
def preprocess(text):
    # Append a period to the last word so that every comment ends a sentence
    words = text.split(' ')
    words[-1] = str(words[-1]) + '.'
    return ' '.join(words)

df['Commentary'] = df['Commentary'].apply(preprocess)
df[10:15]

| | Commentary |
|---|---|
| 10 | Cutting to Binny, SIX, slower ball again but this time Binny hacks it over deep mid-wicket. It was slightly closer to off-stump and was given a fair tonk. |
| 11 | Cutting to Watson, FOUR, plants his foot down and absolutely drills the ball wide of a diving extra-cover. Overpitched from Cutting, Watson blasts him away. |
| 12 | Bhuvneshwar to Binny, 1 run, dropped! Naman Ojha, with the gloves on, puts one down. The ball went high up and was swirling awkwardly all the way through. But at this level, it should be taken. Binny was completely duped by the slower leg-cutter. Such good bowling from Bhuvi to keep it wide of off and out of reach. |
| 13 | Bhuvneshwar to Binny, out Caught by Yuvraj!! Gone this time. Yuvraj makes no mistake at backward point. What a top over this has been from Bhuvneshwar. It may now be a mountain too high for the RCB to climb. Binny backs away to slash a back of a length delivery over point. But this is a leg-cutter which bounces spongily and he slices it up to the fielder. Binny c Yuvraj b Bhuvneshwar 11(10) [6s-1]. |
| 14 | Nehra to Watson, out Caught by Henriques!! That's the end of Watson, and of RCB. The little bit of reverse-swing which Nehra uses to curl the ball away from Watson does his undoing. Despite it being another full toss, Watson slices it straight up off the outside half of his bat. Simple catch at extra-cover. Watson c Henriques b Nehra 22(17) [4s-1 6s-1]. |
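
The `preprocess` step above appends a period to every comment so that the end of one row reads as a sentence boundary. As a minimal sketch (not the notebook's code), the variant below adds the period only when the comment does not already end in terminal punctuation. The name `ensure_terminal_period` is hypothetical and the example uses only the standard library.

```python
def ensure_terminal_period(text: str) -> str:
    """Return text without trailing whitespace, ending in '.', '!' or '?'."""
    stripped = text.rstrip()
    if not stripped:
        # Nothing to terminate in an empty comment
        return stripped
    if stripped[-1] in ".!?":
        # Already ends a sentence, so avoid doubling the punctuation
        return stripped
    return stripped + "."


print(ensure_terminal_period("couple of bounces and into the fence"))
print(ensure_terminal_period("Watson blasts him away."))
```

It could be applied the same way as the original function, for example `df['Commentary'].apply(ensure_terminal_period)`.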


In [5]:
corpus = df['Commentary'].to_string(index=False)

### Markov Chain Model

In [6]:
# Build the model
text_model = markovify.Text(corpus, state_size=4)
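
With `state_size=4`, each next word is chosen based on the four words that precede it. To make that idea concrete, here is a toy word-level Markov chain written with the standard library only. It is an illustrative sketch, not markovify's implementation: the helper names `build_chain` and `generate`, the `<s>` and `</s>` markers, and the toy sentences are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_chain(sentences, state_size=2):
    """Map each tuple of `state_size` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        # Pad with start markers so the first real word also has a full state
        words = ["<s>"] * state_size + sentence.split() + ["</s>"]
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            chain[state].append(words[i + state_size])
    return chain


def generate(chain, state_size=2, rng=random):
    """Walk the chain from the start state until the end marker is drawn."""
    state = ("<s>",) * state_size
    output = []
    while True:
        next_word = rng.choice(chain[state])
        if next_word == "</s>":
            return " ".join(output)
        output.append(next_word)
        # Slide the window: drop the oldest word, append the new one
        state = state[1:] + (next_word,)


toy_sentences = [
    "Full and on the pads.",
    "Full and wide of off.",
    "Short and wide of off.",
]
chain = build_chain(toy_sentences, state_size=2)
print(generate(chain, state_size=2, rng=random.Random(0)))
```

On this tiny corpus a state of two words can only reproduce one of the input sentences, while `state_size=1` can also splice them into new ones such as "Short and on the pads.". The same trade-off applies to the real model: a larger state gives more fluent but less novel output.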

### Generating Sentences

In [7]:
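# Print randomly generated sentences of at most 100 characters
for i in range(10):
    sentence = text_model.make_short_sentence(100)
    if sentence is not None:  # None is returned when no valid sentence is found
        print(sentence + "\n")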

Length ball on the pads, Bumrah misses the flick and is rapped on the back thigh.

Shakib to Kohli, out Caught by Shreyas Iyer!! Needed something special to get out of the way.

Was making room and the ball jammed off his thigh pad before nestling into Dhoni's mitts.

We saw him adjust very well to Dhawan when he was given the role of anchoring the innings.

Shivam Mavi to de Kock, out Caught by Miller!! Stunned silence at the Chinnaswamy.

Karthik went back in his crease and punch it away as he beats the fielder in the ring.

After conceding 13 runs in his first over, width on offer and Pant is all over it like a rash.

Perfect ball but Hooda managed to get enough wood to place it over mid-on.

Takes on the seam-up length ball and muscles it away over the wide long-on boundary.

Drags it from outside off and manages to get four runs.



## Markov Chain: Elon Musk Twitter Deal

The internet, especially Twitter, exploded when Elon Musk agreed to buy Twitter and then tried to back out of the deal. The dataset comes from Kaggle and contains tweets posted during this period.

Interesting Sentences Generated:
   - My understanding is he's a scientist a businessman amp his deals will sway on prudence not emotions.
   - Twitter never wanted the deal.
   - Twitter has tried to pull out of Twitter deal because he needs the money to buy Twitter.
   - RT Elon Musks deal to buy Twitter could be trusted to be honest when making a deal.

### Import Dataset

In [9]:
import markovify
import nltk
import pandas as pd
import re

In [10]:
pd.set_option('display.max_colwidth', 5000)
df = pd.read_csv('twitter_deal.csv', low_memory=False)

In [21]:
df = df.loc[:, ['tweet']]
df.shape

(72131, 1)

### Pre-Processing the Text

In [12]:
def remove_hashtags(text):
    words = text.split(" ")
    cleanedwords = [word for word in words if ("@" not in word) and ("#" not in word)]
    return " ".join(cleanedwords)

In [13]:
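def preprocess(text):
    # Remove URLs (http://, https://, or bare www. links)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    #text = remove_hashtags(text)
    # Keep only letters, digits, '@', '#', '.', apostrophes and spaces
    text = re.sub('[^a-zA-Z0-9@#.\' ]', '', text)
    # The filter above already drops tabs and newlines; kept as a safeguard
    text = re.sub(r"[\t\n]+", "", text)
    return text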

In [14]:
df['tweet'] = df['tweet'].apply(lambda x:preprocess(x))
df[10:15]

| | tweet |
|---|---|
| 10 | @JustinHaworth My criticism of him is not an endorsement of Twitter. If this is some kind of strategy for a better deal and he still plans on acquiring it that would be great.. but as of now it just looks like he cut and ran on everyone |
| 11 | #affiliate #twitter #business #presents #crypto #socialmedia #marketing #blogger #gift #gifts #giftideas #shop #shopping #affiliatemarketing #ad #discounts #deal #discountcode #code #codes #drizly #drinks #Beer #wine #liquor |
| 12 | #affiliate #twitter #business #presents #crypto #bitcoin #marketing #blog #blogger #gift #gifts #giftideas #shop #shopping #affiliatemarketing #ad #discounts #deal #discountcode #code #codes #drizly #drinks #Beer #wine #liquor |
| 13 | @GRDecter Thought it was Elon getting away from the Twitter deal. |
| 14 | @leafguy403 The rollercoaster that is leafs Twitter Kurtis the fact people are criticizing this deal before hes even played a game is bonkerscant wait to see him play for us |
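
The character filter above deletes disallowed characters outright, so a removed line break joins the words on either side of it. A hedged sketch of a variant that turns tabs and line breaks into spaces before filtering is shown below. The name `clean_tweet` and the sample tweet are hypothetical, and the allowed character set simply mirrors the one used in `preprocess`.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
DISALLOWED = re.compile(r"[^a-zA-Z0-9@#.' ]")


def clean_tweet(text: str) -> str:
    """Strip URLs and unusual characters while keeping words separated."""
    text = URL_PATTERN.sub("", text)
    # Convert tabs and line breaks to spaces so neighbouring words stay apart
    text = re.sub(r"[\t\r\n]+", " ", text)
    # Drop everything outside the allowed character set
    text = DISALLOWED.sub("", text)
    # Collapse the runs of spaces left behind by the removals
    return re.sub(r" {2,}", " ", text).strip()


sample = "Deal is off!\nSee http://example.com/x and https://t.co/abc #ElonMusk"
print(clean_tweet(sample))  # Deal is off See and #ElonMusk
```

Whether the extra whitespace handling is worthwhile depends on how many tweets in the dataset contain line breaks, which this sketch does not measure.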


In [15]:
#df['tweet'] = df['tweet'].astype(str)
corpus = df['tweet'].to_string(index=False)

### Markov Chain Model

In [17]:
# Build the model
text_model = markovify.Text(corpus, state_size=3)

### Generating Sentences

In [19]:
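# Print randomly generated sentences of at most 120 characters
for i in range(10):
    sentence = text_model.make_short_sentence(120)
    if sentence is not None:  # None is returned when no valid sentence is found
        print(sentence + "\n")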

Since Elon backed out of @Twitter deal.

Hul wil nou saak maak teen hom om hy uit die Twitter deal niet doorgaat is weer probleem minder.

My understanding is he's a scientist a businessman amp his deals will sway on prudence not emotions.

Bad News For Elon Musk amp Twitter.

Mars #TwitterDeal #ElonMusk #ola Elon Musk seeks to end Twitter deal is dead democracy dodged a huge bullet.

@Twitter says @elonmusk request to terminate the deal they're going to hold him to that and then negotiate a new deal.

Twitter never wanted the deal.

Twitter has tried to pull out of Twitter deal because he needs the money to buy Twitter.

RT Elon Musks deal to buy Twitter could be trusted to be honest when making a deal.

They were forced to take the L Delusional.

