discrete.time.markov

Applies the discrete time Markov model (Markov chain) to analyze data (both text and non-text) by computing probabilities of transitions between different states.

Inspired by my Stochastic Models in Management module, as well as my Honors Dissertation, as part of my coursework in the National University of Singapore (NUS) Business School.

Installation

You can install the released version of discrete.time.markov from my GitHub with:

library(devtools)
install_github("kaiwei-tan/discrete.time.markov")

Functionality

This package contains the following functions:

get.transitions: creates dataframe of all state transition probabilities
get.transitions.text: text version of get.transitions
get.transition.matrix: creates transition probability matrix
get.transition.matrix.text: text version of get.transition.matrix
get.steady.state: calculates steady-state / long-run probabilities of all states

Examples

Here is a basic example, where we generate a random sequence of states and calculate their transition probabilities:

library(discrete.time.markov)

# Generate random sequence of states
set.seed(57)
states <- sample(c('up', 'down', 'same'), 10000, replace=TRUE)

# Create transition probability matrix
get.transition.matrix(states, option='prob', output_type='matrix')
#>           down      same        up
#> down 0.3348044 0.3309797 0.3342159
#> same 0.3324275 0.3324275 0.3351449
#> up   0.3524939 0.3302920 0.3172141

Here is another example involving text. Let each word in our example sentence be considered as a state, and we calculate their transition probabilities accordingly:

sentence <- 'the quick, brown fox jumps over the lazy dog.'

# Function ignores all punctuation
# Create transition probability matrix
get.transition.matrix.text(sentence, 1, option='prob', output_type='matrix', punct='none')
#>       brown dog fox jumps lazy over quick the
#> brown     0   0   1     0  0.0    0   0.0   0
#> dog       0   1   0     0  0.0    0   0.0   0
#> fox       0   0   0     1  0.0    0   0.0   0
#> jumps     0   0   0     0  0.0    1   0.0   0
#> lazy      0   1   0     0  0.0    0   0.0   0
#> over      0   0   0     0  0.0    0   0.0   1
#> quick     1   0   0     0  0.0    0   0.0   0
#> the       0   0   0     0  0.5    0   0.5   0

Now let’s see what happens when we consider two words in a state, and also involve the period at the end (ignoring mid-sentence punctuation, each symbol counts as a state regardless of the number of words considered as a state).

# Function ignores mid-sentence punctuation (the comma)
# Create transition probability matrix
get.transition.matrix.text(sentence, 2, option='prob', output_type='matrix', punct='end')
#>             . brown fox fox jumps jumps over lazy dog over the quick brown
#> .           1         0         0          0        0        0           0
#> brown fox   0         0         1          0        0        0           0
#> fox jumps   0         0         0          1        0        0           0
#> jumps over  0         0         0          0        0        1           0
#> lazy dog    1         0         0          0        0        0           0
#> over the    0         0         0          0        0        0           0
#> quick brown 0         1         0          0        0        0           0
#> the lazy    0         0         0          0        1        0           0
#> the quick   0         0         0          0        0        0           1
#>             the lazy the quick
#> .                  0         0
#> brown fox          0         0
#> fox jumps          0         0
#> jumps over         0         0
#> lazy dog           0         0
#> over the           1         0
#> quick brown        0         0
#> the lazy           0         0
#> the quick          0         0

We can also use the output of get.transitions.text (for text data) or get.transitions (for non-text), which return a dataframe of state transitions and their probabilities, with igraph::graph_from_data_frame to get a state transition diagram:

library(magrittr)
library(igraph)

lyrics <- c('never gonna give you up', 'never gonna let you down')

# Create dataframe of state transitions and probabilities
lyrics_transitions <- get.transitions.text(lyrics, 1, option='prob', punct='none')
lyrics_transitions
#> # A tibble: 7 x 3
#>   start_state end_state  prob
#>   <chr>       <chr>     <dbl>
#> 1 give        you         1  
#> 2 gonna       give        0.5
#> 3 gonna       let         0.5
#> 4 let         you         1  
#> 5 never       gonna       1  
#> 6 you         down        0.5
#> 7 you         up          0.5

# Plot graph
igraph::graph_from_data_frame(lyrics_transitions) %>%
  plot(edge.label=lyrics_transitions$prob, vertex.color='white')

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
R		R
man		man
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
discrete.time.markov.Rproj		discrete.time.markov.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

R

R

man

man

.Rbuildignore

.Rbuildignore

.gitignore

.gitignore

DESCRIPTION

DESCRIPTION

NAMESPACE

NAMESPACE

README.Rmd

README.Rmd

README.md

README.md