Support for reducible Markov chains #20

Closed
wants to merge 5 commits into from

Conversation

ZacCranko
Contributor

This commit makes a number of changes, including adding support for
calculating stationary distributions on reducible Markov chains, in line
with QuantEcon/QuantEcon.py#79.

  • Use Graphs.jl to determine the irreducible subsets of a reducible
    Markov chain
  • Changed DMarkov to MarkovChain in line with mc_tools.py
  • Move to Distributions.jl for sampling
  • Removed support for Matrix types: this is not really necessary,
    but I think we should either pick matrices or MarkovChain types rather
    than maintaining support for both. If we wanted to stick with matrices, we could check for the validity of the stochastic matrix using a macro.
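A hypothetical sketch of the idea behind the first bullet: the irreducible (communicating) classes of a chain with transition matrix P are the strongly connected components of the directed graph that has an edge i -> j whenever P[i][j] > 0. The PR itself does this with Graphs.jl in Julia; the Python below, and all of its function and variable names, are illustrative only.

```python
def communicating_classes(P):
    """Return the communicating classes of P via Kosaraju's SCC algorithm."""
    n = len(P)
    adj = [[j for j in range(n) if P[i][j] > 0] for i in range(n)]   # forward edges
    radj = [[i for i in range(n) if P[i][j] > 0] for j in range(n)]  # reversed edges

    seen = [False] * n
    order = []

    def dfs(v, graph, out):
        # Iterative depth-first search recording vertices in postorder.
        stack = [(v, iter(graph[v]))]
        seen[v] = True
        while stack:
            u, it = stack[-1]
            for w in it:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(graph[w])))
                    break
            else:
                stack.pop()
                out.append(u)

    for v in range(n):           # first pass: postorder on the forward graph
        if not seen[v]:
            dfs(v, adj, order)

    seen = [False] * n
    classes = []
    for v in reversed(order):    # second pass: reverse graph, reverse postorder
        if not seen[v]:
            comp = []
            dfs(v, radj, comp)
            classes.append(sorted(comp))
    return classes

# States 0 and 1 communicate; state 2 forms its own (transient) class.
P = [[0.5, 0.5, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 0.3, 0.7]]
classes = communicating_classes(P)
```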

@jstac
Contributor

jstac commented Nov 19, 2014

@ZacCranko Well done for pulling all this together. It's really useful for analysis of Markov chains, and quite unique. The code looks very nice.

@spencerlyon2 Zac has written this to bring the jl side up to date with the new functionality provided by @oyamad 's Markov chain / network algorithms. Now might not be a great time for you to look at code. We can wait on you if you like. There's no great rush to get this merged.

@oyamad
Member

oyamad commented Nov 19, 2014

Hi @ZacCranko

I wanted to try your code, but failed to import this version of QuantEcon.
I am totally new to Julia; I have installed the Julia environment just now, so I may be making some obvious mistakes.

Let me report what I did and got:

  • I installed julia v0.3.2 on OS X 10.9.5.

  • I installed QuantEcon from @ZacCranko 's fork by:

    julia> Pkg.clone("https://github.com/ZacCranko/QuantEcon.jl.git")

In an IPython notebook I ran using QuantEcon, and got:

@compat not defined
while loading /Users/oyama/.julia/v0.3/QuantEcon/src/estspec.jl, in expression starting on line 57
while loading /Users/oyama/.julia/v0.3/QuantEcon/src/QuantEcon.jl, in expression starting on line 107
while loading In[1], in expression starting on line 1

 in include at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in include_from_node1 at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in include at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in include_from_node1 at /Applications/Julia-0.3.2.app/Contents/Resources/julia/lib/julia/sys.dylib
 in reload_path at loading.jl:152
 in _require at loading.jl:67
 in require at loading.jl:51

The error @compat not defined in the first line also shows up in the current failure message from Travis CI.

  • The version I installed with

    julia> Pkg.add("QuantEcon")

    worked with no error.

@sglyon
Member

sglyon commented Nov 19, 2014

@oyamad, this is a problem on the current master branch of QuantEcon.

I have fixed it in 0f58183.

@ZacCranko you will need to rebase off of this master branch to have the changes propagate into your branch.

samples = [init]
for t=2:sample_size
    last = samples[end]
    push!(samples, rand(dist[last]))
end
Member

Any reason you use push! instead of preallocating the array up front and filling entries in?

If there isn't a good reason we should change 132-135 to

samples = Array(Float64, sample_size)
samples[1] = init
for t=2:sample_size
    samples[t] = rand(dist[samples[t - 1]])
end

Contributor Author

From what I understand push!() is as fast as (and sometimes even faster than) preallocating, and much more elegant in my opinion.

Member

Will you do a quick/simple benchmark to compare the two?

If it isn't a performance hit, I'm happy to leave it as you wrote it.

Member

Here is some code to do a quick/simple benchmark:

function testpush(n::Int)
    ret = Array(Float64, 0)
    for i=1:n
        push!(ret, 2.)
    end
    return ret
end

function testprealloc(n::Int)
    ret = Array(Float64, n)
    for i=1:n
        ret[i] = 2.
    end
    return ret
end

testpush(10000)
testprealloc(10000)

@elapsed testpush(10000000)
@elapsed testprealloc(10000000)

On my computer preallocation was a hair faster, but the differences are relatively minor. This is just quick and dirty code, but I think it captures the difference you guys are talking about.

Contributor Author

Thanks for that @cc7768 fairly similar performance here too.

Member

@cc7768 thanks for the benchmark.

Here are my results using a timeit macro that mimics IPython's timeit:

julia> N = 10000000
10000000

julia> @timeit testpush(N)
1 loops, best of 3: 105.78 ms per loop

julia> @timeit testprealloc(N)
10 loops, best of 3: 43.01 ms per loop

I'm seeing a factor of just over 2x slowdown for push! compared to preallocation.

At the end of the day we are talking a few milliseconds here, so it isn't a big deal.

Contributor Author

It's up to you @spencerlyon2, I don't mind either way.

@sglyon
Member

sglyon commented Nov 19, 2014

@ZacCranko, I've had a quick read through your code and wanted to say it looks great.

I see a few things we would want to decide on/add before merging. Some of these are already mentioned above, but I thought listing them here in one place might be helpful (feel free to edit this message and add/check off things as they come up):

  • Move using statements to src/QuantEcon.jl
  • Make sure REQUIRE is up to date (e.g. add Graphs)
  • Finish the discussion on what the fields of MarkovChain should be. I'm very happy to defer to @oyamad, @jstac or others on this.
  • Make sure all new functions are tested (eigen_solve, lu_solve, reducible_subsets, etc.)
  • Decide on push! vs preallocation. I think I prefer preallocation -- especially in the case noted above, for two reasons: 1. performance seems to be a bit better; 2. we can explicitly tell that the time t value is based on the time t-1 value (we know this is clearly happening when we use last = samples[end], but to me samples[t] = rand(dist[samples[t - 1]]) makes it really clear that the time t sample depends on the time t-1 one). That being said, I don't feel very strongly about preallocation over push!.

If I have missed anything else please add it to the list.


One more comment that doesn't require a TODO item:

Julia base is finally getting a system for documenting library code. It is only available on the dev version of julia for now, but we will eventually want to implement our within-code documentation according to their conventions. The relevant issue is here. I have not included any docstrings in QuantEcon as of yet because I have been waiting for this to get resolved. Given that you are providing documentation-esque comments, I just wanted to make you aware of this because we will have to change these at some point down the road.

@jstac
Contributor

jstac commented Nov 19, 2014

@spencerlyon2 @oyamad Regarding the definition of a Markov chain / process, in general a dynamical system has two elements, a space and a map that acts on the space. The space is an essential part of the definition. A Markov process on a general space is similar. By changing the underlying state space we can change the stability properties of the process, so we need to specify it carefully.

However, in the case of a finite Markov chain with n states, the state space is always isomorphic to 1, ..., n (or 0, ..., n - 1 if you prefer). For this reason my preference is that we keep the state space out of the definition of the object.

In terms of practicality this works well. For example yesterday I was simulating Markov chains for a paper using the Python qe code. The state values were stored in an array z. After simulating a time series draws that takes values in 0, ..., n - 1 I could immediately map to the state sequence via z[draws]. This felt more straightforward than recording z as an attribute.

Of course that's all just personal preference, but anyway that's my opinion.

The initial condition can also be left out in my opinion.

@albop

albop commented Nov 19, 2014

In many applications you will be carrying these different fields together,
so a structure containing them is likely to be useful. It could be
derived from the Markov class, but in that case it would make more sense
to keep the term StochasticMatrix for Markov matrices, since that term is
certainly not overloaded anywhere else, and to use the term discrete
Markov chain for the objects that also contain the state space.

Since, as you say, the state space is always isomorphic to [1,n], why not
make it default to [1,n] unless otherwise specified? Julia has union
types that can be used for that purpose.


@jstac
Contributor

jstac commented Nov 20, 2014

@albop That seems like a sensible suggestion. It would be nice to keep the Python and Julia sides in sync, so why don't we try to get this code merged without the state space being an attribute, and add an issue to the tracker about adding states to both the Python and Julia MC code down the road (defaulting to [0, n-1] and [1, n] respectively).

Regarding names I prefer that we just stick with what we have and add the state to the existing class / type.

@oyamad
Member

oyamad commented Nov 20, 2014

I didn't pay attention to the state space issue, taking it for granted that the state space is always naturally identified with {0, ..., n-1} (or {1, ..., n}), so I agree with @jstac's previous comment.

But it also makes sense to carry the state space as @albop says. One possible case is when one wants to obtain the sub-Markov chain corresponding to a recurrent class of a reducible Markov chain. In the current implementation, the state space of the sub-Markov chain will be (implicitly) re-indexed as {0, ..., m-1} (where m is the number of elements of the recurrent class in consideration). (Of course one can keep carrying the original indices of the recurrent class along with the mapping z though.)
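A hypothetical Python sketch of the re-indexing just described: extracting the sub-Markov chain for a recurrent class of a reducible chain. The names P, rec_class, and sub_P are illustrative, not part of the QuantEcon API.

```python
P = [[0.4, 0.3, 0.3],
     [0.0, 0.5, 0.5],
     [0.0, 0.2, 0.8]]
rec_class = [1, 2]   # one recurrent class; state 0 is transient
# Slicing rows and columns implicitly relabels the class as {0, ..., m-1}:
sub_P = [[P[i][j] for j in rec_class] for i in rec_class]
print(sub_P)  # rows of the sub-chain still sum to 1
```

The original labels {1, 2} are lost in sub_P unless carried separately, which is exactly the argument for storing the state space on the object.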

Note that the same argument applies to the DiGraph class, where a directed graph is identified with its adjacency matrix. If we add the set of states to MarkovChain as an attribute, I think we should also add the set of nodes to DiGraph as well.

In NetworkX, it seems that a node can be given any name.

@jstac
Contributor

jstac commented Nov 21, 2014

@oyamad That's a good point (regarding sub-chains).

I think we're in agreement that the state space should be added, defaulting to the natural index set if state values are not specified. @oyamad Would you mind either making those changes on the Python side or noting them as to-do enhancements in the issue tracker?

@ZacCranko Perhaps it's easier to merge this as is, without addressing inclusion of the state space; that can be done later. Would you please update me on where the code stands vis-à-vis @spencerlyon2 's comments?

@ZacCranko
Contributor Author

@jstac @spencerlyon2:

  • sorted out REQUIRE and moved over importing to QuantEcon.jl
  • @spencerlyon2 I agree with you re. preallocation and mc_sample_path now preallocates.
  • I have tested all the algorithms as much as I am able, all of the unit tests and @oyamad's tests pass.
  • I've also updated our unit tests to have the reducible MC from test_mc_tools.py.
  • @spencerlyon2 no problem regarding docs, I'm happy to go with whatever you think is best, just let me know.

Two new things:

  • renamed reducible_subsets to irreducible_subsets (this was a typo, my bad)
  • improved the irreducible_subsets algorithm slightly; it's still a bit cumbersome, so if anyone is good with graphs I'd welcome the input.

Could someone please let me know the best way for me to hand over these changes? Submit a new pull request? After this one is in, I'll make a new pull request modifying the MarkovChain type to include the number of states; my proposal is to fill this out automatically in the constructor, after inferring the number of states from the input matrix.

@sglyon
Member

sglyon commented Nov 21, 2014

@ZacCranko great work on the changes.

However, I'm not seeing them here. Can you push them to this branch? I'd prefer to wait on the merge until we have those things working.

@ZacCranko
Contributor Author

Forgot I needed to commit them to my fork, sorry about that. Should be all sorted now.

@jstac
Contributor

jstac commented Nov 21, 2014

@spencerlyon2 I'll leave you to merge when you're happy.

@sglyon
Member

sglyon commented Nov 22, 2014

Sorry @ZacCranko, I still don't see the updates to tests, REQUIRE, or using.

Am I missing something?

@ZacCranko
Contributor Author

Ah no, it's just me being a bit incompetent. Have a look now; it should all be there.

@sglyon
Member

sglyon commented Nov 22, 2014

Haha no worries, thanks for pushing them.

I'm seeing that tests are failing: https://travis-ci.org/QuantEcon/QuantEcon.jl/jobs/41776822#L303-L321

Do you know why this might be happening?

@jstac
Contributor

jstac commented Nov 22, 2014

Maybe just a matter of pulling the latest commits down to this fork?

@sglyon
Member

sglyon commented Nov 22, 2014

@ZacCranko I have a suggestion.

Let's merge this into a branch other than master in QuantEcon/QuantEcon.jl and then I can help you get these last few things sorted out.

Does that work for you?

Thanks for being patient with this.

@ZacCranko
Contributor Author

No problem @spencerlyon2, I'm not that hot at this forking/merging stuff, so any help would be appreciated. Just let me know what you need me to do.

@sglyon
Member

sglyon commented Nov 22, 2014

I have done the following (just for reference so we know how this happened):

# pull latest from this repo
git pull origin master

# Created a new branch (based off most recent master) 
git checkout -b markov

# Added your repository as a remote
git remote add zcranko git@github.com:ZacCranko/QuantEcon.jl.git

# Pulled from your master branch into my markov branch
git pull zcranko master

# Pushed my markov branch to QuantEcon/QuantEcon.jl
git push origin markov

I then opened a new pull request from that branch.

I have a hunch that, with the latest changes to the master branch here included alongside your changes, the tests will pass.

Let's wait a few minutes and see what Travis tells us.

@sglyon sglyon closed this Nov 22, 2014
@sglyon
Member

sglyon commented Nov 22, 2014

See #21 .

I think we are good to go. @ZacCranko I'll give you a chance to have one more look at it and then we can merge away!

Thanks for working through this

@ZacCranko
Contributor Author

Thanks for all your help @spencerlyon2. It looks good to me, if you're happy feel free to go ahead.

@sglyon
Member

sglyon commented Nov 22, 2014

No problem.

Good work. Next time it won't be so difficult.

For future reference: I think it would have worked from this original pull request if you had run git pull origin master, where origin is set to track QuantEcon/QuantEcon.jl (I believe this was @jstac's suggestion a few minutes ago).

@ZacCranko
Contributor Author

Good to know. Should I re-fork and add the state parameter to the type definition now?

@sglyon
Member

sglyon commented Nov 22, 2014

What do you get when you run git remote -v from within the QuantEcon repo on your computer?

@ZacCranko
Contributor Author

Zacs-Macbook-Pro:~ zaccranko$ git remote -v
QuantEcon.jl    https://github.com/ZacCranko/QuantEcon.jl (fetch)
QuantEcon.jl    https://github.com/ZacCranko/QuantEcon.jl (push)
origin  git://github.com/JuliaStats/DataFrames.jl.git (fetch)
origin  git@github.com:JuliaStats/DataFrames.jl.git (push)

@sglyon
Member

sglyon commented Nov 22, 2014

I'll respond in email so we don't keep spamming the group...
