<h1 align="center">Volume 2: Markov Chains</h1>

    Ethan Crawford Taylor
    Math 321
    11/3/2

<hr>

### Lab Objective: 

A Markov chain is a collection of states with specified probabilities for transitioning from one state to another. Markov chains are characterized by the fact that the future behavior of the
system depends only on its current state. In this lab we learn to construct, analyze, and interact with
Markov chains, then use a Markov-based approach to simulate natural language.
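
Stated a little more formally, and as a sketch only: write $X_k$ for the state at step $k$ (notation assumed here, not used elsewhere in this lab). The property above, together with the entries of a column-stochastic transition matrix $A$, can then be written as

$$
P(X_{k+1} = i \mid X_k = j, X_{k-1}, \ldots, X_0) = P(X_{k+1} = i \mid X_k = j) = A_{ij}, \qquad \sum_{i} A_{ij} = 1.
$$

Column $j$ of $A$ therefore holds the outgoing probabilities of state $j$.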

In [5]:
import numpy as np
from scipy import linalg as la

In [6]:
class MarkovChain:
    """A Markov chain with finitely many states."""
    # Problem 1
    def __init__(self, A, states=None):
        """Check that A is column stochastic and construct a dictionary
        mapping a state's label to its index (the row / column of A that the
        state corresponds to). Save the transition matrix, the list of state
        labels, and the label-to-index dictionary as attributes.

        Parameters:
        A ((n,n) ndarray): the column-stochastic transition matrix for a
            Markov chain with n states.
        states (list(str)): a list of n labels corresponding to the n states.
            If not provided, the labels are the indices 0, 1, ..., n-1.

        Raises:
            ValueError: if A is not square or is not column stochastic.

        Example:
            >>> MarkovChain(np.array([[.5, .8], [.5, .2]]), states=["A", "B"])
        corresponds to the Markov Chain with transition matrix
                                   from A  from B
                            to A [   .5      .8   ]
                            to B [   .5      .2   ]
        and the label-to-index dictionary is {"A":0, "B":1}.
        """
        # Verify that A is square and column stochastic
        A = np.asarray(A)
        if A.ndim != 2 or A.shape[0] != A.shape[1]:
            raise ValueError("A is not square")
        n = A.shape[0]
        if not np.allclose(A.sum(axis=0), np.ones(n)):
            raise ValueError("A is not column stochastic")

        # Save the transition matrix as an attribute
        self.matrix = A

        # Default to the labels 0, 1, ..., n-1 if none were provided
        if states is None:
            states = list(range(n))
        self.states = states

        # Build the dictionary mapping each state label to its index
        self.dict = {label: i for i, label in enumerate(states)}

    # Problem 2
    def transition(self, state):
        """Transition to a new state by making a random draw from the outgoing
        probabilities of the state with the specified label.

        Parameters:
            state (str): the label for the current state.

        Returns:
            (str): the label of the state transitioned to.
        """
        # Get the column of A matching the current state
        col_index = self.dict[state]
        # Draw once from the categorical distribution given by that column;
        # the result is a one-hot vector
        draw = np.random.multinomial(1, self.matrix[:, col_index])
        # The position of the 1 is the index of the new state
        index = np.argmax(draw)
        # Return the label of the new state
        return self.states[index]

    # Problem 3
    def walk(self, start, N):
        """Starting at the specified state, use the transition() method to
        transition from state to state N-1 times, recording the state label at
        each step.

        Parameters:
            start (str): The starting state label.
            N (int): The number of state labels to record.

        Returns:
            (list(str)): A list of N state labels, including start.
        """
        # An empty walk has no labels
        if N <= 0:
            return []

        # Record the starting state, then transition N-1 times
        path = [str(start)]
        state = start
        for _ in range(N - 1):
            state = self.transition(state)
            path.append(str(state))

        return path

    # Problem 3
    def path(self, start, stop):
        """Beginning at the start state, transition from state to state until
        arriving at the stop state, recording the state label at each step.

        Parameters:
            start (str): The starting state label.
            stop (str): The stopping state label.

        Returns:
            (list(str)): A list of state labels from start to stop.
        """
        # Record the starting state and make the first transition
        path = [str(start)]
        state = self.transition(start)
        path.append(str(state))

        # Keep transitioning until the chain arrives at the stop state
        while state != stop:
            state = self.transition(state)
            path.append(str(state))

        return path

    # Problem 4
    def steady_state(self, tol=1e-12, maxiter=40):
        """Compute the steady state of the transition matrix A.

        Parameters:
            tol (float): The convergence tolerance.
            maxiter (int): The maximum number of iterations to compute.

        Returns:
            ((n,) ndarray): The steady state distribution vector of A.

        Raises:
            ValueError: if there is no convergence within maxiter iterations.
        """
        # Generate a random state distribution vector (nonnegative, sums to 1)
        x = np.random.random(self.matrix.shape[0])
        x = x / x.sum()

        # Iterate x_{k+1} = A x_k until consecutive iterates agree within tol
        for _ in range(maxiter):
            x1 = self.matrix @ x
            if la.norm(x1 - x, ord=1) < tol:
                return x1
            x = x1

        # No convergence within maxiter iterations
        raise ValueError("A^k does not converge")

### Problem 1: 

Define a MarkovChain class whose constructor accepts an $n \times n$ transition matrix
$A$ and, optionally, a list of state labels.
Construct a dictionary mapping the state labels to the row/column index that they correspond
to in $A$ (given by order of the labels in the list), and save $A$, the list of labels, and this dictionary
as attributes.
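
As a rough illustration of the two pieces the constructor needs, the sketch below checks column stochasticity and builds the label-to-index dictionary outside of any class. The helper names `is_column_stochastic` and `label_index` are hypothetical, and the nonnegativity check is an extra assumption that the problem statement does not require.

```python
import numpy as np

def is_column_stochastic(A):
    """Return True if A is square, nonnegative, and each column sums to 1."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False
    # Nonnegativity is an added assumption; the lab only asks about column sums.
    return bool(np.all(A >= 0) and np.allclose(A.sum(axis=0), 1.0))

def label_index(states):
    """Map each state label to its row/column index (order of the list)."""
    return {label: i for i, label in enumerate(states)}

A = np.array([[.5, .8], [.5, .2]])
print(is_column_stochastic(A))      # columns sum to 1
print(is_column_stochastic(A.T))    # the transpose has column sums 1.3 and 0.7
print(label_index(["A", "B"]))
```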

### Problem 2: 
Write a method for the MarkovChain class that accepts a single state label. Use
the label-to-index dictionary to determine the column of $A$ that corresponds to the provided
state label, then draw from the corresponding categorical distribution to choose a state to
transition to.
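
One way to sanity-check a categorical draw is to repeat it many times and compare the empirical frequencies with the column of $A$. The sketch below does this for the two-state example matrix; the function `draw_next`, the module-level `A`, `states`, and `index`, and the fixed seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only so the sketch is repeatable
A = np.array([[.5, .8], [.5, .2]])
states = ["A", "B"]
index = {label: i for i, label in enumerate(states)}

def draw_next(label):
    """Draw the next state label from the column of A for `label`."""
    probs = A[:, index[label]]
    # multinomial(1, p) returns a one-hot vector; argmax recovers its index
    one_hot = rng.multinomial(1, probs)
    return states[int(np.argmax(one_hot))]

draws = [draw_next("B") for _ in range(10_000)]
frac_A = draws.count("A") / len(draws)
print(frac_A)    # should be close to 0.8, the B -> A probability
```

Calling `rng.choice(len(states), p=probs)` would return the index directly and is an equally reasonable choice.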

### Problem 3:

Add the following methods to the MarkovChain class.
- walk(): Accept a state label and an integer N. Starting at the specified state, use your
method from Problem 2 to transition from state to state N − 1 times, recording the state
label at each step. Return the list of N state labels, including the initial state.
- path(): Accept labels for an initial state and an end state. Beginning at the initial state,
transition from state to state until arriving at the specified end state, recording the state
label at each step. Return the list of state labels, including the initial and final states.
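
The two methods differ only in their stopping rule: a fixed number of labels versus arrival at a given state. The standalone sketch below makes that explicit. The helper `step` and the module-level chain are assumptions, and unlike the class method above, this `path` returns immediately when `start` equals `stop`.

```python
import numpy as np

rng = np.random.default_rng(1)      # seeded only so the sketch is repeatable
A = np.array([[.5, .8], [.5, .2]])
states = ["A", "B"]
index = {s: i for i, s in enumerate(states)}

def step(label):
    """One random transition out of the state `label`."""
    return states[rng.choice(len(states), p=A[:, index[label]])]

def walk(start, N):
    """Return N labels: start followed by N-1 transitions."""
    labels = [start]
    for _ in range(N - 1):
        labels.append(step(labels[-1]))
    return labels[:N]               # the slice also handles N == 0

def path(start, stop):
    """Transition from start until the chain first reaches stop."""
    labels = [start]
    while labels[-1] != stop:
        labels.append(step(labels[-1]))
    return labels

w = walk("A", 5)
p = path("A", "B")
print(w)
print(p)
```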

### Problem 4:

Write a method for the MarkovChain class that accepts a convergence tolerance
tol and a maximum number of iterations maxiter. Generate a random state distribution vector
$x_0$ and calculate $x_{k+1} = Ax_k$ until $\lVert x_{k-1} - x_k \rVert_1 < \text{tol}$, where $A$ is the transition matrix saved
in the constructor. If $k$ exceeds maxiter, raise a ValueError to indicate that $A^k$ does not
converge. Return the approximate steady state distribution $x$ of $A$.
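
For the two-state example matrix the steady state can be found by hand: solving $x = Ax$ with entries summing to 1 gives $x = (8/13, 5/13)$. The sketch below runs the iteration as a standalone function (the `seed` parameter is an assumption added for repeatability) and also shows a matrix for which the iterates oscillate and never converge.

```python
import numpy as np

def steady_state(A, tol=1e-12, maxiter=40, seed=0):
    """Iterate x_{k+1} = A x_k from a random distribution vector."""
    rng = np.random.default_rng(seed)
    x = rng.random(A.shape[0])
    x /= x.sum()                    # make x a distribution vector
    for _ in range(maxiter):
        x_next = A @ x
        if np.linalg.norm(x_next - x, ord=1) < tol:
            return x_next
        x = x_next
    raise ValueError("no convergence within maxiter iterations")

A = np.array([[.5, .8], [.5, .2]])
x = steady_state(A)
print(x)                            # approximately [8/13, 5/13]

# A chain that swaps its two states every step: the iterates alternate
# between two vectors, so the iteration does not converge.
swap = np.array([[0., 1.], [1., 0.]])
try:
    steady_state(swap)
except ValueError as err:
    print("ValueError:", err)
```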

### Using Markov Chains to Simulate English
One of the original applications of Markov chains was to study natural languages, meaning spoken
or written languages like English or Russian [VHL06]. In the early 20th century, Markov used his
chains to model how Russian switched from vowels to consonants. By mid-century, they had been
used as an attempt to model English. It turns out that plain Markov chains are, by themselves,
insufficient to model or produce very good English. However, they can approach a fairly good model
of bad English, with sometimes amusing results.

By nature, a Markov chain is only concerned with its current state, not with previous states.
A Markov chain simulating transitions between English words is therefore completely unaware of
context or even of previous words in a sentence. For example, if a chain’s current state is the word
“continuous,” the chain may say that the next word in a sentence is more likely to be “function”
than “raccoon.” However, the phrase “continuous function” may be gibberish in the context of
the rest of the sentence.

### Generating Random Sentences

Consider the problem of generating English sentences that are similar to the text contained in a
specific file, called the training set. The goal is to construct a Markov chain whose states and
transition probabilities represent the vocabulary and — hopefully — the style of the source material.
There are several ways to approach this problem, but one simple strategy is to assign each unique
word in the training set to a state, then construct the transition probabilities between the states
based on the ordering of the words in the training set. Indicating the beginning and end of a
sentence requires two extra states: a start state, start, marking the beginning of a sentence; and a
stop state, stop, marking the end. The start state should only transition to words that appear at
the beginning of a sentence in the training set, and only words that appear at the end of a sentence in
the training set should transition to the stop state.
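
To make the construction concrete, the sketch below builds the transition matrix for a tiny in-memory training set instead of a file. The three sentences are made up for illustration, and the states are sorted only so that the output is reproducible.

```python
import numpy as np

# Hypothetical training set: one sentence per entry
sentences = ["I am Sam", "Sam I am", "I like Sam"]

# States: the unique words plus the start and stop markers
words = sorted({w for s in sentences for w in s.split()})
states = ["$tart"] + words + ["$top"]
index = {s: i for i, s in enumerate(states)}

A = np.zeros((len(states), len(states)))
for s in sentences:
    tokens = ["$tart"] + s.split() + ["$top"]
    # Count each consecutive pair (x, y): column = from x, row = to y
    for x, y in zip(tokens[:-1], tokens[1:]):
        A[index[y], index[x]] += 1

A[index["$top"], index["$top"]] = 1     # the stop state transitions to itself
A /= A.sum(axis=0)                      # normalize each column to sum to 1

print(states)
print(A[:, index["$tart"]])             # probabilities of each first word
```

Two of the three sentences begin with "I", so the start state moves to "I" with probability 2/3 and to "Sam" with probability 1/3. No state ever transitions back to the start state.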

In [7]:
class SentenceGenerator(MarkovChain):
    """A Markov-based simulator for natural language.

    Attributes:
        matrix ((n,n) ndarray): The column-stochastic transition matrix.
        states (list(str)): The state labels: unique words, "$tart", "$top".
        dict (dict): The label-to-index dictionary.
    """
    # Problem 5
    def __init__(self, filename):
        """Read the specified file and build a transition matrix from its
        contents. You may assume that the file has one complete sentence
        written on each line.
        """
        # Read the training set, keeping only nonempty lines (sentences)
        with open(filename, 'r', encoding='utf8') as infile:
            sentences = [line.split() for line in infile if line.strip()]

        # The state labels are the unique words plus "$tart" and "$top"
        state_labels = {word for sentence in sentences for word in sentence}
        state_labels.add("$tart")
        state_labels.add("$top")
        self.states = list(state_labels)

        # Build the dictionary mapping each state label to its index
        self.dict = {state: i for i, state in enumerate(self.states)}

        # Initialize a square array of zeros to be the transition matrix
        n = len(self.states)
        self.matrix = np.zeros((n, n))

        # For each sentence in the training set
        for words in sentences:
            # Prepend "$tart" and append "$top" to the list of words
            words = ["$tart"] + words + ["$top"]

            # For each consecutive pair (x, y) of words, add 1 to the entry
            # that corresponds to transitioning from state x to state y
            for x, y in zip(words[:-1], words[1:]):
                self.matrix[self.dict[y], self.dict[x]] += 1

        # Make sure the stop state transitions to itself
        stop = self.dict["$top"]
        self.matrix[stop, stop] = 1
        # Normalize each column by dividing by the column sums
        self.matrix /= self.matrix.sum(axis=0, keepdims=True)

    # Problem 6
    def babble(self):
        """Create a random sentence using MarkovChain.path().

        Returns:
            (str): A sentence generated with the transition matrix, not
                including the labels for the $tart and $top states.
        """
        # Generate a path from the start state to the stop state
        sentence = self.path("$tart", "$top")
        # Drop the "$tart" and "$top" labels at the two ends
        return " ".join(sentence[1:-1])

### Problem 5:

Write a class called SentenceGenerator that inherits from the MarkovChain
class. The constructor should accept a filename (the training set). Read the file and build
a transition matrix from its contents as described in Algorithm 5.1. Save the same attributes
as the constructor of MarkovChain does so that inherited methods work correctly.

### Problem 6:

Add a method to the SentenceGenerator class called babble(). Use the path()
method from Problem 3 to generate a random sentence based on the training document. That
is, generate a path from the start state to the stop state, remove the "$tart" and "$top" labels
from the path, and join the resulting list together into a single, space-separated string.
For example, your SentenceGenerator class should be able to create random sentences
that sound somewhat like Yoda speaking.


In [14]:
mchain = SentenceGenerator("files/yoda.txt")
for _ in range(5):
    print(mchain.babble())

Stopped they are.
You will keep on who is your solitude on where he finished what it is this is.
Yes, even between this boy has become.
Yes, even between this is.
The shadow of this, we will be.
