# Markov Chain as a Class/Object

We have learned how to work with a Markov chain; it becomes easier to use once we wrap it in a class.
Here is an example of coding a Markov chain as an object.

In [1]:
class Markov():
    '''Generate text with a simple one-word Markov chain.'''
    def __init__(self, txt):
        self.txt = txt # Holds the text corpus.

In [2]:
txt = '''The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.'''

m = Markov(txt)

In [3]:
# Check that the text is available:
print(m.txt)

The quick brown fox jumps over the lazy dog. The lazy programmer jumps over the fire fox.


## Define a Markov Class

In [4]:
class Markov():
    '''Generate text with a simple one-word Markov chain.'''
    def __init__(self, txt):
        self.txt = txt # Holds the text corpus.
        self.dictionary = {} # Holds the dictionary for probabilities.

Next we need a method that builds this dictionary from the text corpus. The class also gets an `order` parameter: the number of consecutive words that form each key.

## Adding the create_dictionary Method

In [5]:
class Markov():
    '''Generate text with a word-based Markov chain of a given order.'''
    def __init__(self, txt, order):
        self.txt = txt # Holds the text corpus.
        self.dictionary = {} # Holds the dictionary for probabilities.
        self.order = order # Number of words that form each key.

    def create_dictionary(self):
        # Split the text into a list of lowercase tokens.
        # A local variable keeps self.txt intact, so the method can be called again.
        tokens = self.txt.lower().split()
        self.dictionary = {}

        # Collect, for every key of `order` words, the list of words that follow it.
        for i in range(len(tokens) - self.order):
            key = tuple(tokens[i:i+self.order])
            value = tokens[i+self.order]
            # Check if the key exists.
            if key in self.dictionary:
                # If yes, append the value.
                self.dictionary[key].append(value)
            # Else insert a new key + value.
            else:
                self.dictionary[key] = [value]

        # Calculate the probabilities: turn each list of followers
        # into a {word: relative frequency} dictionary.
        for key, followers in self.dictionary.items():
            length = len(followers)
            temporary_dic = {}
            for word in followers:
                if word not in temporary_dic:
                    temporary_dic[word] = 1
                else:
                    temporary_dic[word] += 1

            for word, amount in temporary_dic.items():
                temporary_dic[word] = amount / length
            self.dictionary[key] = temporary_dic


In [6]:
m = Markov(txt,2)
m.create_dictionary()
m.dictionary

{('the', 'quick'): {'brown': 1.0},
 ('quick', 'brown'): {'fox': 1.0},
 ('brown', 'fox'): {'jumps': 1.0},
 ('fox', 'jumps'): {'over': 1.0},
 ('jumps', 'over'): {'the': 1.0},
 ('over', 'the'): {'lazy': 0.5, 'fire': 0.5},
 ('the', 'lazy'): {'dog.': 0.5, 'programmer': 0.5},
 ('lazy', 'dog.'): {'the': 1.0},
 ('dog.', 'the'): {'lazy': 1.0},
 ('lazy', 'programmer'): {'jumps': 1.0},
 ('programmer', 'jumps'): {'over': 1.0},
 ('the', 'fire'): {'fox.': 1.0}}
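
The same table can be built more compactly with the standard library. The following is a minimal standalone sketch, not part of the class above; the function name `build_transitions` and the variable names are assumptions made for illustration. It counts followers with `collections.Counter` and then normalizes the counts into probabilities.

```python
from collections import Counter, defaultdict

def build_transitions(text, order):
    """Map each tuple of `order` words to {next_word: probability}."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    # Count how often each word follows each key.
    for i in range(len(tokens) - order):
        key = tuple(tokens[i:i + order])
        counts[key][tokens[i + order]] += 1
    # Normalize the counts of every key so they sum to 1.
    return {
        key: {word: n / sum(counter.values()) for word, n in counter.items()}
        for key, counter in counts.items()
    }

sample = ("The quick brown fox jumps over the lazy dog. "
          "The lazy programmer jumps over the fire fox.")
table = build_transitions(sample, 2)
print(table[("over", "the")])  # {'lazy': 0.5, 'fire': 0.5}
```

Note that `('fire', 'fox.')` never becomes a key, because nothing follows it in the corpus. Any generator built on this table has to handle such dead ends.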

## Adding Generation

The last part is a method that extends a text by one generated token; calling it repeatedly produces a sentence:

In [7]:
class Markov():
    '''Generate text with a word-based Markov chain of a given order.'''

    def __init__(self, txt, order):
        self.txt = txt # Holds the text corpus.
        self.dictionary = {} # Holds the dictionary for probabilities.
        self.order = order # Number of words that form each key.

    def create_dictionary(self):
        # Split the text into a list of lowercase tokens.
        # A local variable keeps self.txt intact, so the method can be called again.
        tokens = self.txt.lower().split()
        self.dictionary = {}

        # Collect, for every key of `order` words, the list of words that follow it.
        for i in range(len(tokens) - self.order):
            key = tuple(tokens[i:i+self.order])
            value = tokens[i+self.order]
            # Check if the key exists.
            if key in self.dictionary:
                # If yes, append the value.
                self.dictionary[key].append(value)
            # Else insert a new key + value.
            else:
                self.dictionary[key] = [value]

        # Calculate the probabilities: turn each list of followers
        # into a {word: relative frequency} dictionary.
        for key, followers in self.dictionary.items():
            length = len(followers)
            temporary_dic = {}
            for word in followers:
                if word not in temporary_dic:
                    temporary_dic[word] = 1
                else:
                    temporary_dic[word] += 1

            for word, amount in temporary_dic.items():
                temporary_dic[word] = amount / length
            self.dictionary[key] = temporary_dic
                
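    def generate_token(self, initial_key):
        import random
        # Build the lookup key from the last `order` words of the text.
        # Keys are tuples of lowercase words, so a plain string slice would never match.
        key = tuple(initial_key.lower().split()[-self.order:])

        # Check if key is included in the vocabulary.
        if key not in self.dictionary:
            # If not (the text is shorter than `order` words, or ends in an
            # unseen combination), pick a random key from the vocabulary.
            key = random.choice(list(self.dictionary.keys()))

        # Otherwise we'll use the key built from the text.

        # Return the text extended by the next token for the key.
        # The [0] in the end is because the random choice based on probability returns a list.
        candidates = list(self.dictionary[key].keys())
        weights = list(self.dictionary[key].values())
        initial_key += ' '
        initial_key += random.choices(candidates, weights=weights)[0]

        return initial_key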
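
The weighted draw at the end of `generate_token` relies on `random.choices`, which returns a list of `k` items sampled according to `weights`. The following is a small standalone sketch of that call, using the probabilities stored for the key `('over', 'the')`; the seeded `random.Random(0)` instance and the variable names are assumptions added only to make the sketch repeatable.

```python
import random
from collections import Counter

# Probabilities stored for the key ('over', 'the') in the dictionary.
successors = {"lazy": 0.5, "fire": 0.5}

rng = random.Random(0)  # seeded only so the sketch is repeatable
draws = rng.choices(list(successors), weights=list(successors.values()), k=1000)

print(draws[0])        # a single token, which is what generate_token appends
print(Counter(draws))  # both words should appear roughly equally often
```

Because `random.choices` always returns a list, even for `k=1`, the method indexes the result with `[0]`.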


## Usage 

In [8]:
m = Markov(txt,2)
m.create_dictionary()
new_text = m.generate_token('The')
print(new_text)

The jumps


In [9]:
initial_text = 'The'
for i in range(10):
    initial_text = m.generate_token(initial_text)
    print(initial_text)

The jumps
The jumps jumps
The jumps jumps fox.
The jumps jumps fox. lazy
The jumps jumps fox. lazy fire
The jumps jumps fox. lazy fire dog.
The jumps jumps fox. lazy fire dog. dog.
The jumps jumps fox. lazy fire dog. dog. brown
The jumps jumps fox. lazy fire dog. dog. brown lazy
The jumps jumps fox. lazy fire dog. dog. brown lazy lazy
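
In the run above no step found a matching key, so every token came from a randomly chosen key and the text does not follow the corpus. The following is a minimal standalone sketch of what generation looks like when the seed contains at least `order` words and the lookup key is built from the last `order` words of the text. The helper names `build_table` and `extend` are hypothetical and not part of the class above; the table stores plain successor lists, so duplicates stand in for the probabilities.

```python
import random

def build_table(text, order):
    """Map each tuple of `order` words to the list of words that follow it."""
    tokens = text.lower().split()
    table = {}
    for i in range(len(tokens) - order):
        table.setdefault(tuple(tokens[i:i + order]), []).append(tokens[i + order])
    return table

def extend(table, text, order, rng):
    """Append one token chosen from the followers of the last `order` words."""
    key = tuple(text.lower().split()[-order:])
    if key not in table:
        # Dead end or unseen combination: fall back to a random key.
        key = rng.choice(list(table))
    return text + " " + rng.choice(table[key])

corpus = ("The quick brown fox jumps over the lazy dog. "
          "The lazy programmer jumps over the fire fox.")
table = build_table(corpus, 2)
rng = random.Random(42)  # seeded only so the sketch is repeatable

text = "The quick"
for _ in range(5):
    text = extend(table, text, 2, rng)
print(text)  # The quick brown fox jumps over the
```

The first five steps are deterministic because each of these keys has only one possible follower in the corpus. After `('over', 'the')` the chain branches to either `lazy` or `fire`.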
