# Markov Chain

First, let's use *Ulysses* as the base corpus for our Markov model example.

Download it from Gutenberg.org and strip the Project Gutenberg header and footer:

In [8]:
%%bash
wget -q http://www.gutenberg.org/files/4300/4300-0.txt -O /tmp/tmp
tail -n +22 /tmp/tmp | head -n -363 > /tmp/ulysses.txt
rm /tmp/tmp

Next, let's use [Markovify](https://github.com/jsvine/markovify), a simple, extensible Markov chain generator in Python.

Installation:

In [13]:
import sys
!{sys.executable} -m pip install markovify
import markovify

# Get raw text as string.
with open("/tmp/ulysses.txt") as f:
    text = f.read()
    
# Build the model.
text_model = markovify.Text(text)

# Save the model as JSON.
model_json = text_model.to_json()
with open("markovmodel.json", "w") as f:
    f.write(model_json)

# Load the model from file.
with open("markovmodel.json") as f:
    model_json = f.read()
text_model = markovify.Text.from_json(model_json)

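# Print five randomly generated sentences.
# make_sentence() returns None when it cannot build a sufficiently original sentence,
# so format the result instead of concatenating it to a string.
for i in range(5):
    print(f"{i} : {text_model.make_sentence()}")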

print("-------------------")
    
# Print three randomly-generated sentences of no more than 280 characters
for i in range(3):
    print(text_model.make_short_sentence(280))

0 : Contemporaneously, a heated fashion offensively.
1 : M. Drumont, famous journalist, Drumont, know what it would be, he said.
2 : Onions of his coat a pocketbook bound by a triple change of rite and dogma like his own cheek.
3 : See the malt stored in many a true word spoken in jest.
4 : Do right to close and chain the door with a little moved but very handsomely told him no offence and all the occupants have been buried alive.
-------------------
Mother of the _corpora cavernosa_ to rapidly dilate in such cases an arrest of embryonic development at some stage antecedent to the little misadventure mentioned between the bodily organism and its phantoms, Stephen said.
—My wife too, he added, on the proceedings, after the last man who was _enceinte_ which she stated were Greek and Irish and a millionaire, _maestro di color che sanno_.
Working tooth and superfluous hair.
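
To make the idea behind the generated sentences concrete, here is a minimal sketch of a word-level Markov chain text generator written from scratch. It is not Markovify's implementation: the function names `build_model` and `generate`, the toy corpus, and the choice of two-word states are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def build_model(text, order=2):
    """Map each tuple of `order` consecutive words to the list of words seen after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model


def generate(model, length=15, seed=None):
    """Walk the chain: start from a random state and repeatedly sample a follower."""
    rng = random.Random(seed)
    state = rng.choice(list(model.keys()))
    output = list(state)
    for _ in range(length - len(state)):
        followers = model.get(state)
        if not followers:  # dead end: this state was only seen at the end of the text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Hypothetical toy corpus; in the notebook one could pass the Ulysses text instead.
toy_text = "the cat sat on the mat and the cat ran to the dog and the dog sat on the rug"
toy_model = build_model(toy_text)
print(generate(toy_model, length=12, seed=0))
```

Repeated followers are kept in the list on purpose, so that sampling uniformly from the list reproduces the observed transition frequencies.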


Next, we are going to use [Pykov](https://github.com/riccardoscalco/Pykov), a tiny Python module for finite regular Markov chains.

In [None]:
pip install git+https://github.com/riccardoscalco/Pykov@master
pip install --upgrade git+https://github.com/riccardoscalco/Pykov@master

In [13]:
import sys
!{sys.executable} -m pip install pykov
import pykov



As an example, we will use this simple Markov chain:
![markov chain](imgs/markovchain01.png)

Its transition matrix is given below:

$$
P = 
\left(\begin{array}{ccc} 
0.5 & 0.25 & 0.25\\
0.25& 0.5  & 0.25\\
0.25& 0.25 & 0.5
\end{array}\right)
$$ 
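
As a quick sanity check on this matrix, the sketch below verifies that every row sums to 1 and approximates the stationary distribution by repeatedly applying the matrix to a starting distribution (power iteration). It uses plain Python lists rather than Pykov; the helper name `step`, the starting distribution, and the 50 iterations are arbitrary choices made for illustration.

```python
# Transition matrix from the text: P[i][j] is the probability of moving from state i to state j.
P = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

# A valid transition matrix is row-stochastic: each row sums to 1.
row_sums = [sum(row) for row in P]


def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Start in state 0 with certainty and iterate; the distribution should flatten out.
dist = [1.0, 0.0, 0.0]
for _ in range(50):
    dist = step(dist, P)

print(row_sums)
print([round(x, 6) for x in dist])
```

Because the columns of this matrix also sum to 1, the iteration should settle close to the uniform distribution (1/3, 1/3, 1/3).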

In [None]:
import pykov
import math
from numpy.linalg import matrix_rank
import networkx as nx


P = pykov.Matrix({('0','0'): .5, ('0','1'): .25, ('0','2'): .25,   \
                  ('1','0'): .25, ('1','1'): .5, ('1','2'): .25,   \
                  ('2','0'): .25, ('2','1'): .25, ('2','2'): .5})

P.states()

C = pykov.Chain(P)

C.walk(10)
# ['2', '2', '0', '0', '1', '2', '1', '1', '0', '0', '0']

# Multiply by log2(e) to convert the entropy rate from nats to bits.
entropyrate = math.log(math.e, 2) * C.entropy()
print("entropy rate of the Markov chain = %.2f bits" % entropyrate)
# 1.50 bits

# Compute the Shannon entropy of a probability distribution.
# p: probabilities (zero entries are skipped)
# b: logarithm base
def entropy(p, b):
    return sum(-pp * math.log(pp, b) for pp in p if pp != 0)

entropy([0.5, 0.25, 0.25], 2)
# 1.5
G = nx.DiGraph(list(P.keys()))
print("is the Markov chain irreducible (strongly connected)? " + str(nx.is_strongly_connected(G)))  # True
print("is the Markov chain aperiodic? " + str(nx.is_aperiodic(G)))  # True
# Stationary distributions span the null space of (P - I), whose dimension is n - rank(P - I).
states = sorted(P.states())
P_minus_I = [[P[(i, j)] - (i == j) for j in states] for i in states]
print("number of independent stationary distributions: " + str(len(states) - matrix_rank(P_minus_I)))  # 1
mu = C.steady()  # 1/3, 1/3, 1/3
entropysteady = math.log(math.e, 2) * mu.entropy()  # log_2 3 = 1.5849625007211563
print("stationary state:")
print(mu)
print("entropy of the stationary state = %.2f bits" % entropysteady)
print("number of states = " + str(len(mu)))
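
The two entropy figures above can be cross-checked without Pykov. The sketch below assumes the standard definition of the entropy rate of a stationary Markov chain, $H = -\sum_i \mu_i \sum_j P_{ij} \log_2 P_{ij}$, and takes the uniform stationary distribution reported by `C.steady()` as given. The function names `shannon_entropy` and `entropy_rate` are hypothetical helpers, not part of any library.

```python
import math

# Transition matrix and stationary distribution from the example above.
P = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
mu = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]


def shannon_entropy(p):
    """Shannon entropy in bits of a probability distribution, skipping zero entries."""
    return sum(-x * math.log2(x) for x in p if x != 0)


def entropy_rate(P, mu):
    """Entropy rate in bits: the row entropies of P averaged under the stationary distribution."""
    return sum(m * shannon_entropy(row) for m, row in zip(mu, P))


rate = entropy_rate(P, mu)
h_mu = shannon_entropy(mu)
print("entropy rate = %.2f bits" % rate)                  # 1.50 bits
print("entropy of the stationary state = %.2f bits" % h_mu)  # 1.58 bits
```

The entropy rate is lower than the entropy of the stationary distribution because knowing the current state makes the next one more predictable: staying put has probability 0.5 rather than 1/3.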