## Introduction to Markov Chains!

This notebook is based on a YouTube video by HarvardX (https://www.youtube.com/watch?v=JHwyHIz6a8A).

In this notebook, we will first build a Markov chain model for the five cities shown in the video and study how Markov chains behave. Let's get started!

The only thing you need to know to run this notebook is that you press Shift+Enter on any given block (shown in gray) to run it.

#### Step 1 - Import all relevant packages

In Python, _packages_ are pieces of software that perform certain tasks. For this activity, we will import three _packages_: (i) _pandas_ to read Excel files, (ii) _MarkovChain_ to build our Markov chain, and (iii) _matplotlib_ to visualize the results.

In [1]:
import pandas as pd
from markov_chain import MarkovChain
import matplotlib.pyplot as plt

#### Step 2 - Fill in the right probabilities

1. Open the spreadsheet titled "mc_probabilities.xlsx"
2. Fill in the probabilities of going from one city to another
3. Note down your observations - what can you say about the probabilities of leaving any one city? How will the probabilities change if the number of flights from one city to the next is different?
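To explore the last question above, here is a minimal sketch of how flight counts could be turned into probabilities. The `flights` table is made up for illustration (it is not the one from the video), and the sketch assumes that every departing flight is equally likely to be chosen.

```python
import numpy as np

# Hypothetical flight counts for three cities (illustrative only).
# Rows are departure cities, columns are destination cities.
flights = np.array([
    [0, 2, 1],
    [1, 0, 1],
    [3, 1, 0],
], dtype=float)

# Divide each row by its total so that every row becomes a set of
# probabilities (assumes each departing flight is equally likely).
transition = flights / flights.sum(axis=1, keepdims=True)

print(transition)
print(transition.sum(axis=1))  # total probability of leaving each city
```

Try changing one of the counts and re-running the block to see which probabilities move.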

Once you have filled in the probabilities, we will read the spreadsheet into the notebook and define all the cities.

In [2]:
df = pd.read_excel('mc_probabilities.xlsx', header=0, index_col=0)
probs = df.to_numpy()
cities = ['Averagemont', 'Bayesville', 'Continuopolis', 'Discretown', 'East Vandermonde']
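Before running a simulation, it can help to check that the table you filled in is a valid set of probabilities. The helper below is a sketch and is not part of the `markov_chain` module; the name `is_valid_transition_matrix` is hypothetical, and it assumes that each row holds the probabilities of leaving one city.

```python
import numpy as np

def is_valid_transition_matrix(matrix, tol=1e-8):
    """Return True if matrix is square, non-negative, and every row sums to 1."""
    matrix = np.asarray(matrix, dtype=float)
    # The table needs one row and one column per city.
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        return False
    # Probabilities cannot be negative.
    if (matrix < 0).any():
        return False
    # Assumption: rows are departure cities, so each row must sum to 1.
    return bool(np.allclose(matrix.sum(axis=1), 1.0, atol=tol))

# Illustrative three-city example (not the notebook's data).
example = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.25, 0.75, 0.0],
]
print(is_valid_transition_matrix(example))
```

In this notebook, calling `is_valid_transition_matrix(probs)` after the cell above would apply the same check to your spreadsheet.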

#### Step 3 - Define starting city and number of trips

We will now run a "_simulation_" of Ana Markov's journey. A simulation is a virtual experiment which will help us study the behavior of Markov chains without having to go on all the journeys with Ana!

1. Define the starting city and number of trips that Ana will take
2. _Initialize_ a Markov chain simulation based on the cities and probabilities 
3. Run the _simulation_ - Ana will take the number of trips you decide starting from a city of your choice

In [3]:
######################################################
## This section is for you to change :) ##

starting_city = 'Discretown'
number_of_trips = 200

######################################################

# Initialize a simulation
mc = MarkovChain(cities, probs)

# Run simulation
mc.run_simulation(starting_city, number_of_trips)

Simulation has been completed! Ana has traveled to 200 cities, starting from Discretown
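If you are curious about what a simulation like this does, here is a rough sketch written with NumPy only. It is not the code inside `MarkovChain`, whose internals are not shown here; the function `simulate_chain`, the city names, and the probabilities are all hypothetical. At each step, the next city is drawn at random using the row of probabilities for the current city.

```python
import numpy as np
from collections import Counter

def simulate_chain(states, transition, start, n_steps, seed=None):
    """Return the list of visited states, beginning with the start state."""
    rng = np.random.default_rng(seed)
    current = states.index(start)
    path = [start]
    for _ in range(n_steps):
        # Pick the next state using the probabilities in the current row.
        current = rng.choice(len(states), p=transition[current])
        path.append(states[current])
    return path

# Illustrative three-city example (not the notebook's data).
states = ['A', 'B', 'C']
transition = np.array([
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.25, 0.75, 0.0],
])

path = simulate_chain(states, transition, start='A', n_steps=200, seed=0)

# Fraction of the journey spent in each city.
visits = Counter(path)
fractions = {city: visits[city] / len(path) for city in states}
print(fractions)
```

Fixing `seed` makes the journey repeatable; leaving it as `None` gives a different journey on every run.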


#### Step 4 - Visualize results

Now comes the fun part! You will see how much time Ana spends in each city as she takes more trips.

*And now comes the difficult part*... you will have to answer some questions!

1. Can you change the number of trips and see if the time spent by Ana in each city changes?
2. Does the starting city matter in Ana's journey?

In [4]:
%matplotlib nbagg

mc.plot_simulation()
plt.show()

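One way to look at the starting-city question without random noise is to track the probability of being in each city after every trip. The sketch below uses a made-up three-city table (not the notebook's data), and the function `distribution_after` is hypothetical. For this particular example, every city can eventually be reached from every other city; the result may look different for a table where that is not true.

```python
import numpy as np

# Illustrative three-city example (not the notebook's data).
# Rows are departure cities, columns are destination cities.
transition = np.array([
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.25, 0.75, 0.0],
])

def distribution_after(start_index, n_steps):
    """Probability of being in each city after n_steps trips."""
    dist = np.zeros(len(transition))
    dist[start_index] = 1.0  # Ana starts in one city with certainty
    for _ in range(n_steps):
        # One trip: redistribute the probability along the rows.
        dist = dist @ transition
    return dist

from_first = distribution_after(0, 200)
from_last = distribution_after(2, 200)

print(from_first)
print(from_last)
```

Compare the two printed rows, then try a small number of trips such as 2 or 3 to see how the picture changes.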