### `About Markov Chain`
- Source Link `https://www.edureka.co/blog/introduction-to-markov-chains/`
- What Is A Markov Chain?
  - A Markov chain is a stochastic process: a sequence of random variables that moves from one state to another according to certain probabilistic rules.
  - The transitions from one state to the next obey an important mathematical property called the Markov property.
- What Is The Markov Property?
  - The discrete-time Markov property states that the probability of the next state depends only on the current state.
  - In other words, in a Markov chain the next state depends on the current state but not on the states that came before it.
- Applications
  - Text generation
  - Auto-completion applications.
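
In symbols, the Markov property described above can be written as follows, where $X_n$ denotes the state at step $n$. This notation is introduced here for illustration and is not taken from the source article:

```latex
P(X_{n+1} = x \mid X_n = x_n, X_{n-1} = x_{n-1}, \dots, X_0 = x_0)
  = P(X_{n+1} = x \mid X_n = x_n)
```

For text generation, each state $X_n$ is a word, so the next word is drawn using only the current word.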

### `Import Libraries`

In [4]:
import pandas as pd 
import numpy as np
import json
import warnings

warnings.filterwarnings("ignore")

### `Sample Data`

- `Test Case`

In [5]:
##### Complete sentence we want to test against: "good weather to play cricket in ground"
test_data = "good weather"
print(test_data)

good weather


- `Reference Data for Next Text Prediction`

In [6]:
ref_data = '''weather is good. play cricket. weather is good to play. people playing in ground. to play in ground. to play cricket in ground. nice weather to play in ground.
              best time to play in ground. nice weather to play in ground. good weather to play cricket in ground. shall we play cricket in ground?'''
print(ref_data)

weather is good. play cricket. weather is good to play. people playing in ground. to play in ground. to play cricket in ground. nice weather to play in ground.
              best time to play in ground. nice weather to play in ground. good weather to play cricket in ground. shall we play cricket in ground?


- `Data Split`

In [7]:
text_corpus = ref_data.split()
print(text_corpus)

['weather', 'is', 'good.', 'play', 'cricket.', 'weather', 'is', 'good', 'to', 'play.', 'people', 'playing', 'in', 'ground.', 'to', 'play', 'in', 'ground.', 'to', 'play', 'cricket', 'in', 'ground.', 'nice', 'weather', 'to', 'play', 'in', 'ground.', 'best', 'time', 'to', 'play', 'in', 'ground.', 'nice', 'weather', 'to', 'play', 'in', 'ground.', 'good', 'weather', 'to', 'play', 'cricket', 'in', 'ground.', 'shall', 'we', 'play', 'cricket', 'in', 'ground?']
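
Note that `split()` keeps punctuation attached to words, so `'good.'` and `'good'` end up as different states in the chain. Below is a minimal sketch of one possible normalization step, assuming we simply want to strip surrounding punctuation and lowercase each token. The helper name `normalize_tokens` is hypothetical and not part of the original notebook:

```python
import string

def normalize_tokens(text):
    """Split on whitespace, then strip surrounding punctuation and lowercase each token."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    # Drop tokens that were pure punctuation and are now empty
    return [token for token in tokens if token]

print(normalize_tokens("weather is good. shall we play cricket in ground?"))
# ['weather', 'is', 'good', 'shall', 'we', 'play', 'cricket', 'in', 'ground']
```

The trade-off is that sentence boundaries are lost, so the chain can no longer tell where a sentence ended. The rest of this notebook keeps the original `split()` tokens, so its outputs are unaffected.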


- `Key Pair Formation Based on Current & Next Word`

In [8]:
def get_word_pairs(text_corpus):
    ''' 
    Reads the text corpus and yields word pairs.
    Word pair: (present word, next word)
    '''
    for i in range(len(text_corpus) - 1):
        yield (text_corpus[i], text_corpus[i+1])
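
As a quick check of what the generator yields, here is a small sketch on a three-word list. The toy list is made up for illustration, and the function is repeated so the snippet runs on its own:

```python
def get_word_pairs(text_corpus):
    """Yield (present word, next word) for every adjacent pair in the corpus."""
    for i in range(len(text_corpus) - 1):
        yield (text_corpus[i], text_corpus[i + 1])

toy_corpus = ["good", "weather", "to"]  # toy input, not the notebook's corpus
print(list(get_word_pairs(toy_corpus)))
# [('good', 'weather'), ('weather', 'to')]
```

An idiomatic equivalent is `zip(text_corpus, text_corpus[1:])`, which produces the same pairs.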

In [9]:
##### Creating Word Pairs #####
word_pairs = get_word_pairs(text_corpus)

##### Creating Word Pairs Dictionary #####
# Maps each word to the word(s) that follow it in the corpus:
# a single successor is stored as a string, several successors as a list.
key_pairs_dictr = {}
for word1, word2 in word_pairs:
    if word1 not in key_pairs_dictr:
        # First time this word is seen: store its successor directly
        key_pairs_dictr[word1] = word2
    elif isinstance(key_pairs_dictr[word1], str):
        # Second successor: promote the single string to a list
        key_pairs_dictr[word1] = [key_pairs_dictr[word1], word2]
    else:
        # Already a list: append the new successor
        key_pairs_dictr[word1].append(word2)

print("Key-Pairs-Dictr")
print("--"*10)
print(key_pairs_dictr)

Key-Pairs-Dictr
--------------------
{'weather': ['is', 'is', 'to', 'to', 'to'], 'is': ['good.', 'good'], 'good.': 'play', 'play': ['cricket.', 'in', 'cricket', 'in', 'in', 'in', 'cricket', 'cricket'], 'cricket.': 'weather', 'good': ['to', 'weather'], 'to': ['play.', 'play', 'play', 'play', 'play', 'play', 'play'], 'play.': 'people', 'people': 'playing', 'playing': 'in', 'in': ['ground.', 'ground.', 'ground.', 'ground.', 'ground.', 'ground.', 'ground.', 'ground?'], 'ground.': ['to', 'to', 'nice', 'best', 'nice', 'good', 'shall'], 'cricket': ['in', 'in', 'in'], 'nice': ['weather', 'weather'], 'best': 'time', 'time': 'to', 'shall': 'we', 'we': 'play'}
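
The repeated entries in each list are what make frequent successors more likely to be drawn. Below is a minimal sketch of turning such lists into explicit transition probabilities with `collections.Counter`. The helper name `transition_probabilities` is hypothetical, and the small `successors` dictionary copies three entries from the output above for illustration:

```python
from collections import Counter

def transition_probabilities(successors):
    """Map each word to {next_word: relative frequency} from its successor list."""
    probabilities = {}
    for word, next_words in successors.items():
        # A single successor may be stored as a plain string, as in the dictionary above
        if isinstance(next_words, str):
            next_words = [next_words]
        counts = Counter(next_words)
        total = sum(counts.values())
        probabilities[word] = {nxt: count / total for nxt, count in counts.items()}
    return probabilities

successors = {
    "weather": ["is", "is", "to", "to", "to"],
    "good": ["to", "weather"],
    "best": "time",
}
print(transition_probabilities(successors))
# {'weather': {'is': 0.4, 'to': 0.6}, 'good': {'to': 0.5, 'weather': 0.5}, 'best': {'time': 1.0}}
```

Sampling uniformly from a list with duplicates is equivalent to sampling from these probabilities.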


- `Json View`

In [71]:
print(json.dumps(key_pairs_dictr, indent=6))

Json View
----------
{
      "weather": [
            "is",
            "is",
            "to",
            "to",
            "to"
      ],
      "is": [
            "good.",
            "good"
      ],
      "good.": "play",
      "play": [
            "cricket.",
            "in",
            "cricket",
            "in",
            "in",
            "in",
            "cricket",
            "cricket"
      ],
      "cricket.": "weather",
      "good": [
            "to",
            "weather"
      ],
      "to": [
            "play.",
            "play",
            "play",
            "play",
            "play",
            "play",
            "play"
      ],
      "play.": "people",
      "people": "playing",
      "playing": "in",
      "in": [
            "ground.",
            "ground.",
            "ground.",
            "ground.",
            "ground.",
            "ground.",
            "ground.",
            "ground?"
      ],
      "ground.": [
            "to",
          

In [91]:
test_words_list = test_data.split()
print("test_words_list --> ", test_words_list)

test_words_list -->  ['good', 'weather']


In [92]:
predict_words_n = 5
chain_list = list(test_words_list)  # copy, so later appends do not also modify test_words_list
print("Chain List --> ", chain_list)

Chain List -->  ['good', 'weather']


In [93]:
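##### Predict the next word from the current word: look up the current word in the key-pair dictionary and pick one of its recorded successors at random #####
for i in range(predict_words_n):
    next_words = key_pairs_dictr.get(chain_list[-1])
    if next_words is None:
        break  # the current word has no recorded successor (e.g. the last word of the corpus)
    # np.atleast_1d lets a single-string value be sampled like a one-item list
    chain_list.append(str(np.random.choice(np.atleast_1d(next_words))))

print("Next Predicted Chain List --> ", chain_list)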

Next Predicted Chain List -->  ['good', 'weather', 'to', 'play', 'in', 'ground.', 'nice']
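
Because each word is drawn at random, rerunning the cell above will generally give a different chain. Below is a sketch of wrapping the loop in a reusable function with an optional seed, assuming NumPy's `default_rng` generator. The name `generate_chain` and its parameters are hypothetical, and the small dictionary is made up in the same shape as the key-pair dictionary:

```python
import numpy as np

def generate_chain(successors, start_words, n_words, seed=None):
    """Extend start_words by up to n_words, sampling each next word from the current word's successors."""
    rng = np.random.default_rng(seed)
    chain = list(start_words)  # copy so the caller's list is not modified
    for _ in range(n_words):
        next_words = successors.get(chain[-1])
        if next_words is None:
            break  # dead end: the current word has no recorded successor
        # np.atleast_1d handles both a single string and a list of strings
        chain.append(str(rng.choice(np.atleast_1d(next_words))))
    return chain

# Toy dictionary for illustration only
successors = {
    "good": ["to", "weather"],
    "weather": ["is", "is", "to", "to", "to"],
    "to": ["play"],
    "play": "in",
    "in": ["ground."],
}
print(generate_chain(successors, ["good", "weather"], n_words=5, seed=0))
```

Passing the same `seed` reproduces the same chain, which makes it easier to compare runs. The chain may come out shorter than requested if it reaches a word with no successor.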


### `Observation`
 - Our starting words are "good weather". The words predicted above come close to the actual test sentence, "good weather to play cricket in ground".
 - The prediction is not perfect, but the predicted next words make sense given the Markov chain logic used in this example.
 - A Markov chain works well for data where the next state is largely determined by the current state.
 - If the next state varies independently of the current state, this model will not be able to give reliable predictions.

### `Reference Articles`
- https://www.edureka.co/blog/introduction-to-markov-chains/
- https://ericmjl.github.io/essays-on-data-science/machine-learning/markov-models/
- https://medium.com/analytics-vidhya/how-to-build-a-market-simulator-using-markov-chains-and-python-b925a106b1c4
- https://www.kdnuggets.com/2019/11/markov-chains-train-text-generation.html
- https://www.upgrad.com/blog/markov-chain-in-python-tutorial/
- https://medium.com/@balamurali_m/markov-chain-simple-example-with-python-985d33b14d19
- https://www.geeksforgeeks.org/markov-chain/