# **Import the libraries**

In [1]:
import pandas as pd
import numpy as np
import scipy as sp

# **Define my states**

In [2]:
states = {
    0 : "Rainy",
    1 : "Cloudy",
    2 : "Sunny"
}

# **Define the transition Matrix**

In [3]:
A = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7]
])
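
Each row of `A` holds the probabilities of the next state given the current one, so every row should be non-negative and sum to 1; `np.random.choice` raises an error for a probability vector that does not. A quick sanity check might look like the sketch below. The helper name `is_valid_transition_matrix` is an illustration and not part of the notebook.

```python
import numpy as np

A = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7]
])

def is_valid_transition_matrix(matrix):
    # Hypothetical helper: square, non-negative, and each row sums to 1
    is_square = matrix.ndim == 2 and matrix.shape[0] == matrix.shape[1]
    non_negative = bool(np.all(matrix >= 0))
    rows_sum_to_one = bool(np.allclose(matrix.sum(axis=1), 1.0))
    return is_square and non_negative and rows_sum_to_one

print(is_valid_transition_matrix(A))
```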

# **Perform the random walk**

In [4]:
def random_walk(steps, start):
  # Number of states still to be printed, including the starting state
  n_steps = steps
  # Index of the starting state
  start_stage = start
  # Print the starting state
  print(states[start_stage], end = " ")
  # The starting state is the previous state for the first transition
  prev_state = start_stage

  # Keep looping while more than one step remains
  # (the starting state already counts as one step)
  while(n_steps > 1):
    # Sample the next state using the row of the transition matrix
    # that belongs to the previous state
    curr_state = np.random.choice([0, 1, 2], p = A[prev_state])
    print("-> ", states[curr_state], end = " ")
    prev_state = curr_state
    n_steps -= 1

  print(f"\nRandom Walk done for {steps} steps")

In [5]:
random_walk(8, 2)

Sunny ->  Cloudy ->  Cloudy ->  Rainy ->  Rainy ->  Cloudy ->  Sunny ->  Rainy 
Random Walk done for 8 steps


* **Calculating the stationary (long-run) state probabilities**
  * **Monte Carlo Simulation**
  * **Repeated Matrix Multiplication**
  * **Left Eigenvector Calculations**
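
The first item in the list has no cell of its own in this notebook. A minimal sketch, assuming it means estimating the long-run probabilities by simulating one long walk and counting how often each state is visited, could look like this. The function name `monte_carlo_stationary` and the parameters `n_steps`, `start` and `seed` are assumptions made for the example.

```python
import numpy as np

A = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7]
])

def monte_carlo_stationary(A, n_steps=20_000, start=0, seed=0):
    # Hypothetical helper: simulate one long walk and count state visits
    rng = np.random.default_rng(seed)
    n_states = A.shape[0]
    counts = np.zeros(n_states, dtype=int)
    state = start
    for _ in range(n_steps):
        # Sample the next state from the row of the current state
        state = rng.choice(n_states, p=A[state])
        counts[state] += 1
    # The fraction of time spent in each state approximates its long-run probability
    return counts / n_steps

pi_estimate = monte_carlo_stationary(A)
print(f"PI (Monte Carlo estimate): {pi_estimate}")
```

The estimate is noisy and only approaches the exact values as `n_steps` grows, which is why the matrix-based methods are usually preferred for a small chain like this one.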

In [6]:
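def repeated_matrix_multiplication(steps, A):
  A_new = A
  # Counter for the number of multiplications performed
  # (named n_iter to avoid shadowing the built-in iter)
  n_iter = 0
  while(n_iter < steps):
    A_new = np.matmul(A_new, A)
    n_iter += 1

  # After enough multiplications every row converges to the same distribution
  print(f"PI: {A_new[0]}")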

In [7]:
repeated_matrix_multiplication(10, A) # Initial matrix

PI: [0.23406206 0.31916273 0.44677521]
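
The left-eigenvector approach named in the list above is not implemented in the notebook. The stationary distribution `pi` satisfies `pi @ A = pi`, so it is a left eigenvector of `A` with eigenvalue 1, which is the same as a right eigenvector of `A.T`. A minimal sketch using `np.linalg.eig` follows; the variable names are chosen for this example.

```python
import numpy as np

A = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.5, 0.2],
    [0.1, 0.2, 0.7]
])

# Right eigenvectors of A.T are left eigenvectors of A
eigenvalues, eigenvectors = np.linalg.eig(A.T)

# Pick the eigenvector whose eigenvalue is closest to 1
idx = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, idx])

# Rescale so the entries sum to 1 and form a probability distribution
pi = pi / pi.sum()
print(f"PI: {pi}")
```

For this matrix the exact solution is (11/47, 15/47, 21/47), roughly (0.2340, 0.3191, 0.4468), which agrees with the repeated-multiplication result to about four decimal places.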


<hr>

# **Markovify: a library that helps us generate output sequences from a given text**

**Data Loading**

In [8]:
data = pd.read_csv("/content/Restaurant reviews.csv")

**Data Inspection**

In [9]:
data.head()

Unnamed: 0,Restaurant,Reviewer,Review,Rating,Metadata,Time,Pictures,7514
0,Beyond Flavours,Rusha Chakraborty,"The ambience was good, food was quite good . h...",5,"1 Review , 2 Followers",5/25/2019 15:54,0,2447.0
1,Beyond Flavours,Anusha Tirumalaneedi,Ambience is too good for a pleasant evening. S...,5,"3 Reviews , 2 Followers",5/25/2019 14:20,0,
2,Beyond Flavours,Ashok Shekhawat,A must try.. great food great ambience. Thnx f...,5,"2 Reviews , 3 Followers",5/24/2019 22:54,0,
3,Beyond Flavours,Swapnil Sarkar,Soumen das and Arun was a great guy. Only beca...,5,"1 Review , 1 Follower",5/24/2019 22:11,0,
4,Beyond Flavours,Dileep,Food is good.we ordered Kodi drumsticks and ba...,5,"3 Reviews , 2 Followers",5/24/2019 21:37,0,


**We will extract the review from the given dataset**

In [10]:
reviews = data[["Review"]]

In [11]:
reviews.head()

Unnamed: 0,Review
0,"The ambience was good, food was quite good . h..."
1,Ambience is too good for a pleasant evening. S...
2,A must try.. great food great ambience. Thnx f...
3,Soumen das and Arun was a great guy. Only beca...
4,Food is good.we ordered Kodi drumsticks and ba...


**Clean the data, by dropping the null values**

In [12]:
cleaned = reviews.dropna()

**Merge all the reviews**

In [16]:
cleaned["Review"][0] + " " + cleaned["Review"][1] + " " + cleaned["Review"][2]

'The ambience was good, food was quite good . had Saturday lunch , which was cost effective .\nGood place for a sate brunch. One can also chill with friends and or parents.\nWaiter Soumen Das was really courteous and helpful. Ambience is too good for a pleasant evening. Service is very prompt. Food is good. Over all a good experience. Soumen Das - kudos to the service A must try.. great food great ambience. Thnx for the service by Pradeep and Subroto. My personal recommendation is Penne Alfredo Pasta:) ....... Also the music in the background is amazing.'

In [17]:
full_text = " ".join(cleaned["Review"])  # join with a space so adjacent reviews do not run together

In [18]:
full_text



**Creating the Markov Model**

In [19]:
!pip install markovify

Collecting markovify
  Downloading markovify-0.9.4.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Collecting unidecode (from markovify)
  Downloading Unidecode-1.3.8-py3-none-any.whl.metadata (13 kB)
Downloading Unidecode-1.3.8-py3-none-any.whl (235 kB)
   235.5/235.5 kB 5.3 MB/s eta 0:00:00
Building wheels for collected packages: markovify
  Building wheel for markovify (setup.py) ... done
  Created wheel for markovify: filename=markovify-0.9.4-py3-none-any.whl size=18605 sha256=be38b229e67c117091fcea568b0193ddf2df52df657f69e286b634d59fa279f2
  Stored in directory: /root/.cache/pip/wheels/ca/8c/c5/41413e24c484f883a100c63ca7b3b0362b7c6f6eb6d7c9cc7f
Successfully built markovify
Installing collected packages: unidecode, markovify
Successfully installed markovify-0.9.4 unidecode-1.3.8


In [20]:
import markovify

**Model Creation**

In [22]:
model = markovify.Text(full_text)

**Generate sentences using the model**

In [25]:
for i in range(5):
  a = model.make_short_sentence(100)
  print(f"Sentence {i+1}: {a}")

Sentence 1: The Dal Makhni is to be honest it didn't had much shawarmas before.
Sentence 2: Phuket Fish: One of best service.
Sentence 3: The maggi was good so I am enjoying the appetizers.
Sentence 4: Staff is humble and hospitable.
Sentence 5: The kitchen should probably revisit the place crowded no? were we dint have to spend a rainy day.


<hr>

# **Using the Markov chain state size for text generation**

**Data Loading**

In [26]:
news_data = pd.read_json("/content/drive/MyDrive/Datasets/arxivData.json")

**Data Inspection**

In [27]:
news_data.head()

Unnamed: 0,author,day,id,link,month,summary,tag,title,year
0,"[{'name': 'Ahmed Osman'}, {'name': 'Wojciech S...",1,1802.00209v1,"[{'rel': 'alternate', 'href': 'http://arxiv.or...",2,We propose an architecture for VQA which utili...,"[{'term': 'cs.AI', 'scheme': 'http://arxiv.org...",Dual Recurrent Attention Units for Visual Ques...,2018
1,"[{'name': 'Ji Young Lee'}, {'name': 'Franck De...",12,1603.03827v1,"[{'rel': 'alternate', 'href': 'http://arxiv.or...",3,Recent approaches based on artificial neural n...,"[{'term': 'cs.CL', 'scheme': 'http://arxiv.org...",Sequential Short-Text Classification with Recu...,2016
2,"[{'name': 'Iulian Vlad Serban'}, {'name': 'Tim...",2,1606.00776v2,"[{'rel': 'alternate', 'href': 'http://arxiv.or...",6,We introduce the multiresolution recurrent neu...,"[{'term': 'cs.CL', 'scheme': 'http://arxiv.org...",Multiresolution Recurrent Neural Networks: An ...,2016
3,"[{'name': 'Sebastian Ruder'}, {'name': 'Joachi...",23,1705.08142v2,"[{'rel': 'alternate', 'href': 'http://arxiv.or...",5,Multi-task learning is motivated by the observ...,"[{'term': 'stat.ML', 'scheme': 'http://arxiv.o...",Learning what to share between loosely related...,2017
4,"[{'name': 'Iulian V. Serban'}, {'name': 'Chinn...",7,1709.02349v2,"[{'rel': 'alternate', 'href': 'http://arxiv.or...",9,We present MILABOT: a deep reinforcement learn...,"[{'term': 'cs.CL', 'scheme': 'http://arxiv.org...",A Deep Reinforcement Learning Chatbot,2017


<hr>

**Combine the `title` and `summary` columns into a single text column**

In [28]:
news_data["ProcessedText"] = news_data["title"] + ". " + news_data["summary"]

**Processed data**

In [33]:
news_data["ProcessedText"].head()[1]

'Sequential Short-Text Classification with Recurrent and Convolutional\n  Neural Networks. Recent approaches based on artificial neural networks (ANNs) have shown\npromising results for short-text classification. However, many short texts\noccur in sequences (e.g., sentences in a document or utterances in a dialog),\nand most existing ANN-based systems do not leverage the preceding short texts\nwhen classifying a subsequent one. In this work, we present a model based on\nrecurrent neural networks and convolutional neural networks that incorporates\nthe preceding short texts. Our model achieves state-of-the-art results on three\ndifferent datasets for dialog act prediction.'

**Replacing `\n` with spaces**

In [34]:
news_data["ProcessedText"] = news_data["ProcessedText"].map(lambda x : x.replace("\n", " "))

In [35]:
news_data["ProcessedText"][0]

'Dual Recurrent Attention Units for Visual Question Answering. We propose an architecture for VQA which utilizes recurrent layers to generate visual and textual attention. The memory characteristic of the proposed recurrent attention units offers a rich joint embedding of visual and textual features and enables the model to reason relations between several parts of the image and question. Our single model outperforms the first place winner on the VQA 1.0 dataset, performs within margin to the current state-of-the-art ensemble model. We also experiment with replacing attention mechanisms in other state-of-the-art models with our implementation and show increased accuracy. In both cases, our recurrent attention mechanism improves performance in tasks requiring sequential or relational reasoning on the VQA dataset.'
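
The raw text at index 1 above contains `\n` followed by two spaces, so replacing only `\n` leaves a run of three spaces in that entry. One possible alternative, shown here on a shortened copy of that string, is to split on any whitespace and re-join with single spaces. This is a sketch of an optional extra cleaning step, not something the notebook does.

```python
# Shortened copy of the title/summary text shown earlier (index 1)
raw = ("Sequential Short-Text Classification with Recurrent and Convolutional\n"
       "  Neural Networks. Recent approaches based on artificial neural networks")

# The notebook's approach: only the newline is replaced
naive = raw.replace("\n", " ")

# Alternative: collapse every run of whitespace into a single space
normalized = " ".join(raw.split())

print(repr(naive))
print(repr(normalized))
```

In the notebook this could be applied with `news_data["ProcessedText"].map(lambda x: " ".join(x.split()))`.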

<hr>

# **Building the Markov model for the data**

**`state_size` is the number of previous words (values) the model looks at to determine the probability of the next word; here it is set to 1**

In [38]:
text_model = markovify.NewlineText(news_data["ProcessedText"], state_size = 1)
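
To make `state_size` concrete, the sketch below builds a word-level transition table by hand on a toy sentence. It only illustrates the idea and is not markovify's actual implementation; `build_chain` and `toy_text` are made-up names.

```python
from collections import defaultdict

def build_chain(text, state_size):
    # Hypothetical helper: map each tuple of `state_size` consecutive words
    # to the list of words that follow it in the text
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i:i + state_size])
        chain[state].append(words[i + state_size])
    return dict(chain)

toy_text = "the food was good and the service was good and the food was cheap"

chain1 = build_chain(toy_text, 1)
chain2 = build_chain(toy_text, 2)

# With one word of context, "was" can be followed by "good" or "cheap"
print(chain1[("was",)])
# With two words of context, the options depend on which noun came before "was"
print(chain2[("food", "was")])
print(chain2[("service", "was")])
```

A larger state gives each context fewer possible continuations, so the generated text is more coherent but also closer to the source.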

**Generate the text**

In [39]:
for i in range(10):
  print(text_model.make_sentence())
  print("-----------------------------------------------------------------------------------------------")

Sparse Activity Recognition. Speech Emotion Detection, which we show how words to resolve processing is Twitter. We explore how to iteractively chase the resolution 3D animation problems, given trajectory using both CNN model in modelling tool chain, such an arbitrary geometry of Network on neural network to type error rates are welcome, and a novel bounds on their ability to make connections between present and are complemented with the algorithm that it into object detectors in future studies into four systems. It is a sparsely encoded programmatically. We evaluate this work, we find that the assigned sensors is a process are insufficient for a new concept, it performs measurement-to-target association rules. Being supported phylogenetic tree rotations in Historical and generalizes the Gaussian mixture, while keeping local stationarity of Boosting with a shared parameters. In real-world events with their best explanation purpose. Earlier techniques from a consequence of data for the 

**Markov models with different state sizes**

In [40]:
text_model2 = markovify.NewlineText(news_data["ProcessedText"], state_size = 2)

text_model3 = markovify.NewlineText(news_data["ProcessedText"], state_size = 3)

text_mode4 = markovify.NewlineText(news_data["ProcessedText"], state_size = 4)

**Generation using the model with a state size of 2**

In [41]:
for i in range(10):
  print(text_model2.make_sentence())
  print("-----------------------------------------------------------------------------------------------")

Deep Learning techniques and finally a semi-supervised approach. We study the randomized distributed coordinate descent algorithm. We demonstrate that multi-task deep models where, surprisingly, inference remains tractable even when very little attention so far. Our results on MSC-12 Kinect Gesture dataset and prior to be evaluated separately and combine it with using the proposed method can run at real time vehicles detection algorithm is more feature rich than its binary counterpart. The two main contributions. First, we propose a novel architecture for galaxies classification is presented along with a human's real life. Techniques for Deep Neural Networks, and show how they handle unbalanced datasets and compared their results are also quantitatively and qualitatively. Finally, our analysis also shows that the combination of evidences in all symmetric values. Our proof considers row and column cluster structure. For this challenging problem. Existing approaches are either better tha

**Generation using the model with a state size of 3**

In [42]:
for i in range(10):
  print(text_model3.make_sentence())
  print("-----------------------------------------------------------------------------------------------")

An effective Procedure for Speeding up Logic Inference. It is time-consuming and expensive, especially for large amounts of labelled data with minimal supervision, in a domain adaptation module to online adapt the pre-learned features according to this methodology are presented and compared in this study.
-----------------------------------------------------------------------------------------------
Location of Single Neuron Memories and the Network's Proximity Matrix. This paper presents an algorithm, Voted Kernel Regularization , that provides the investor a suitable balance between simulation speed and potential skill-depth. Results show that rich linguistic features combined with texture classification on superpixel regions. Our method improves the precision from 5\% to 20\% while the query time is comparable to state of the world.
-----------------------------------------------------------------------------------------------
Variational Neural Machine Translation. We explore multi

**Generation using the model with a state size of 4**

In [43]:
for i in range(10):
  print(text_mode4.make_sentence())
  print("-----------------------------------------------------------------------------------------------")

Axioms in Model-based Planners. Axioms can be used to define differences between coreference, subevent and topical relations.
-----------------------------------------------------------------------------------------------
None
-----------------------------------------------------------------------------------------------
Deep Learning for Answer Sentence Selection. Answer sentence selection is the task of predicting the presence or absence of population diversity and the introduction of diversity mechanisms.
-----------------------------------------------------------------------------------------------
The Deductive Database System LDL++. This paper describes the analysis of quantitative characteristics of frequent sets and association rules which characterize the semantic relations between words remains under developed. We propose an unsupervised approach based on the association rules learning. This approach has the advantage of using task specific or category labels in combination w
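
The `None` in the output above is what `make_sentence` returns when it cannot produce a sentence that is sufficiently different from the original text within its allowed number of attempts (the `tries` argument, 10 by default). A minimal sketch of skipping those failures is shown below. `generate_sentences` is a hypothetical helper, and `FlakyModel` is a stand-in used only to demonstrate the behaviour without a trained model.

```python
def generate_sentences(model, n, tries=100):
    # Hypothetical helper: ask the model for n sentences and keep only the successes
    sentences = []
    for _ in range(n):
        sentence = model.make_sentence(tries=tries)
        if sentence is not None:
            sentences.append(sentence)
    return sentences

class FlakyModel:
    # Stand-in for a markovify model: every second call fails and returns None
    def __init__(self):
        self.calls = 0

    def make_sentence(self, tries=10):
        self.calls += 1
        return None if self.calls % 2 == 0 else f"sentence {self.calls}"

print(generate_sentences(FlakyModel(), 4))
```

With the notebook's models the call would be, for example, `generate_sentences(text_mode4, 10)`. Raising `tries` makes a `None` less likely but does not rule it out.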