Add states/nodes to MarkovChain/DiGraph #237

oyamad · 2016-03-15T22:55:30Z

This will close #101.

jstac · 2016-03-16T01:26:05Z

@oyamad Many thanks for working on this!

I get the following:

In [10]: d = qe.DiGraph([[0, 1], [1, 1]])

In [11]: d.cyclic_components
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-b348110263c2> in <module>()
----> 1 d.cyclic_components

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in cyclic_components(self)
    312     @property
    313     def cyclic_components(self):
--> 314         return self.get_cyclic_components()
    315 
    316     def subgraph(self, nodes):

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in get_cyclic_components(self, return_labels)
    307     def get_cyclic_components(self, return_labels=True):
    308         if return_labels:
--> 309             return self.label_nodes(self._get_cyclic_components())
    310         return self._get_cyclic_components()
    311 

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in label_nodes(self, list_of_components)
    115 
    116     def label_nodes(self, list_of_components):
--> 117         return [self.node_labels[c] for c in list_of_components]
    118 
    119     def _find_scc(self):

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in <listcomp>(.0)
    115 
    116     def label_nodes(self, list_of_components):
--> 117         return [self.node_labels[c] for c in list_of_components]
    118 
    119     def _find_scc(self):

TypeError: 'NoneType' object is not subscriptable

I think the problem is that below this line we need to test whether there are labels.

To tell the truth, I think there is quite a bit of complexity just because we want to have properties. Once we have them, there are two different ways to access information (attribute or method), which is in itself slightly "unpythonic". (There should be only one way to do a give task, I mean.)

But of course you should be the judge and I'm happy to go along with what you decide.

oyamad · 2016-03-16T01:52:24Z

@jstac Thanks for testing the code. My apologies, I had secretly made a forced push at some point which adds a checking here. Now it should work fine.

For the issue property versus method, I agree the current version is not pythonic, whereas at the same time I think I would like to maintain the idea that the connectedness and cyclicity structures are the properties of a digraph. We can discuss this again tomorrow.

oyamad · 2016-03-16T13:38:57Z

@jstac We have already discussed and I know you don't like this, but to share with other people:
Another option is to have a boolean attribute, say, return_indices, and check whether it is True or False and print the outputs of the properties accordingly (in this case we have attributes only).

Properties return indices as before

oyamad · 2016-03-17T16:09:18Z

I think I have made up my mind.

The properties are the properties of a digraph and when called display the components with indices as before.
The get_*_components methods do the task of getting these properties and annotating the nodes (if return_labels=True and node_labels is not None).

oyamad · 2016-03-17T18:00:22Z

TODO

Add tests for state values

oyamad · 2016-03-17T21:11:06Z

This is ready for review.

coveralls · 2016-03-18T15:10:05Z

Coverage increased (+0.3%) to 86.077% when pulling 84fdbaa on add_nodes_states into 4eff123 on master.

coveralls · 2016-03-19T11:59:05Z

Coverage increased (+0.3%) to 86.16% when pulling 8850b31 on add_nodes_states into 4eff123 on master.

jstac · 2016-03-20T10:36:43Z

@oyamad Thanks for all this hard work!

I'm looking at the code now. In the meantime, I know we discussed something similar before but I would like to raise the topic again: I think the DiGraph below should be aperiodic:

(Discussion continues after this code.)

In [18]: M = [[1, 1], [0, 1]]                                                                                                                                                

In [19]: g = qe.DiGraph(M)

In [20]: g.is_aperiodic
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-20-af1e951a4d55> in <module>()
----> 1 g.is_aperiodic

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in is_aperiodic(self)
    293     @property
    294     def is_aperiodic(self):
--> 295         return (self.period == 1)
    296 
    297     @property

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in period(self)
    288     def period(self):
    289         if self._period is None:
--> 290             self._compute_period()
    291         return self._period
    292 

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/graph_tools.py in _compute_period(self)
    249         if not self.is_strongly_connected:
    250             raise NotImplementedError(
--> 251                 'Not defined for a non strongly-connected digraph'
    252             )
    253 

NotImplementedError: Not defined for a non strongly-connected digraph

For the definitions I know related to Markov chains, like this one, both states are aperiodic, and hence the chain is aperiodic.

I'm less familiar with graph theory, but don't these two states each have a cycle of length 1, and therefore the graph is aperiodic? I'm using this reference for my definitions:

http://carmenere.ucsd.edu/jorge/teaching/mae247/s13/notes/lec03-graph-theory.pdf

jstac · 2016-03-20T10:43:37Z

In DiGraph, perhaps adj_matrix should be an attribute, so that we can check what it is on the fly?

In [22]: g.adj_matrix
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-75fcf4a25533> in <module>()
----> 1 g.adj_matrix

AttributeError: 'DiGraph' object has no attribute 'adj_matrix'

jstac · 2016-03-20T11:25:19Z

OK, so there are some things about the implementation that I don't agree with.

Consider

In [36]: P
Out[36]: [[0.5, 0.5], [0, 1]]

In [37]: mc = qe.MarkovChain(P, state_values=[1, 2])

In [38]: mc.stationary_distributions
Out[38]: array([[ 0.,  1.]])

I feel like this is already surprising for the user, since the output doesn't match the state values. This is a problem caused by the existence of attributes.

Now consider

In [41]: mc.communication_classes
Out[41]: [array([1]), array([0])]

So perhaps the user looks up the documentation to try to get output that matches his/her supplied state values. At the moment the get_xxx methods are not documented but you can find them by tab expansion. Then this kind of error is likely:

In [42]: mc.get_communication_classes()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-6492f6bbb8ed> in <module>()
----> 1 mc.get_communication_classes()

TypeError: get_communication_classes() missing 1 required positional argument: 'return_values'

It seems strange to have to supply return_values. I'm not sure if this is deliberate.

In summary, I think there are several possible sources of confusion for the user. All of them are related to the coexistence of properties and methods. I also think it's confusing for some users even just to have get methods along with similar attributes.

What I propose is this: For attributes that do not return states (e.g., is_aperiodic), it's fine for this to be an attribute and not a method. There is no source of confusion here for the user.

For attributes that give back indices/states, I propose the class supplies only methods (same for DiGraph and MarkovChain). These methods have an optional argument return_values=True. If you set it to false you get indices.

I realize you disagree on these points but I really think my suggestions will make it easier for the user.

oyamad · 2016-03-20T12:45:50Z

@jstac Thanks for testing the code.
Regarding MarkovChain.get_xxx I forgot to set the default value (=True) for return_values (by copy-pasting from DiGraph where the default value is specified in the decorator).

For stationary_distributions, could you elaborate on "the output doesn't match the state values"?

oyamad · 2016-03-20T12:53:22Z

The adjacent matrix of a DiGraph is called csgraph (it is internally converted to scipy.sparse.csr_matrix):

In [2]: M = [[1, 1], [0, 1]]

In [3]: g = qe.DiGraph(M)

In [4]: g.csgraph.toarray()
Out[4]: 
array([[ True,  True],
       [False,  True]], dtype=bool)

In [5]: g = qe.DiGraph(M, weighted=True)

In [6]: g.csgraph.toarray()
Out[6]: 
array([[1, 1],
       [0, 1]], dtype=int64)

We can give it an additional name adj_matrix (or change it to the latter).

oyamad · 2016-03-20T13:08:41Z

As to periodicity, I have to look into the references I used, to recall what my reasoning was.

(If you follow the definition in Wikipedia, the period of state 0 for [[0.5, 0.5], [0, 1]] is not defined. If you define a Markov chain to be aperiodic by the property that every state has a period 1, then this Markov chain is not aperiodic. If you define a Markov chain to be aperiodic by the property that there is no k > 1 that divides all the cycles, then this Markov chain is aperiodic.)

I also have to look up the implementation issues.

oyamad · 2016-03-20T14:34:31Z

Periodicity is a separate issue. Let's discuss it in Issue #245.

oyamad · 2016-03-28T03:21:23Z

Another issue:
In the current implementation, MarkovChain.simulate assumes init to be indices, whether return_values is True or False.

In [2]: mc = MarkovChain([[0, 0, 1], [0, 1, 0], [1, 0, 0]], state_values=[1, 2, 3])

In [3]: mc.simulate(10, init=1)
Out[3]: array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

Should we add an argument init_state_value? Or should we add a new method called simulate_values?

oyamad · 2016-03-28T06:30:54Z

There is an ambiguity if state_values = np.array([1, 2, [1, 2]], dtype=object) and init_value=[1, 2].

Should we not allow the dtype of state_values to be object?

coveralls · 2016-03-28T13:42:46Z

Coverage increased (+0.5%) to 86.317% when pulling 5bc78d5 on add_nodes_states into 4eff123 on master.

coveralls · 2016-03-28T15:24:11Z

Coverage increased (+0.6%) to 86.375% when pulling bfc91bb on add_nodes_states into 4eff123 on master.

and remove `return_values` from `simulate`

coveralls · 2016-03-29T02:57:01Z

Coverage increased (+0.6%) to 86.39% when pulling d825975 on add_nodes_states into 4eff123 on master.

oyamad · 2016-03-29T03:05:07Z

state_values must be homogeneous in type (state_values=[[1, 2], 1, 2] is not accepted).
get_index is added which returns the index given a value in state_value.
Annotating sample paths with state values is done by simulate_values (and return_values is removed from simulate). This matches with the Julia version: Refactor markov QuantEcon.jl#99 (comment).

As for 3, I am actually indifferent between this and adding an init_value option to simulate instead:

def simulate(self, ts_length, init=None, num_reps=None,
             return_values=True, init_value=None, random_state=None):
    ...

jstac · 2016-03-30T13:20:02Z

@oyamad Please give me a bit longer to respond. I'm tied up with some other stuff but this is very important code, and we want to design the best interface possible.

jstac · 2016-04-10T16:14:08Z

@oyamad I'm sorry I've taken so long to get to this and even now I don't have a lot of time to discuss. I do like it as is. It's nice to have properties, and efficient when useful state is preserved. But I think the interface could be a little clearer.

I like the Apple philosophy regarding interfaces: there's no need to read documentation unless you want to do something unusual. I also like the "principle of least surprises".

As is, I can imagine that someone will type mc.cyclic_classes and be surprised when they get indices back instead of state values.

So my preference would be to drop the get_xxx methods (except for get_index) and go back to properties. However, there would be two properties in each case:

mc.communication_classes
mc.communication_classes_as_indices
mc.recurrent_classes
mc.recurrent_classes_as_indices
mc.cyclic_classes
mc.cyclic_classes_as_indices

If I now type mc. and tab, everything is already completely clear.

Regarding simulate, I prefer the second option you give above, where there is just one method. It's then natural to look at the docstring if you don't immediately get what you want.

By the way, I currently get this error.

In [25]: P = np.asarray([ [0.5, 0.5], [0.4, 0.6] ])

In [26]: mc = qe.MarkovChain(P, state_values=(10, 20))

In [27]: mc.simulate_values(10)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/markov/core.py in _get_index(self, value)
    291             try:
--> 292                 idx = np.where(self.state_values == value)[0][0]
    293                 return idx

IndexError: index 0 is out of bounds for axis 0 with size 0

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-27-023d194e5620> in <module>()
----> 1 mc.simulate_values(10)

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/markov/core.py in simulate_values(self, ts_length, init_value, num_reps, random_state)
    558 
    559         """
--> 560         init_idx = self.get_index(init_value)
    561         X = self.simulate(ts_length, init=init_idx, num_reps=num_reps,
    562                           random_state=random_state)

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/markov/core.py in get_index(self, value)
    253 
    254         if values.ndim <= state_values_ndim - 1:
--> 255             return self._get_index(value)
    256         elif values.ndim == state_values_ndim:  # array of values
    257             k = values.shape[0]

/home/john/sync_dir/quantecon/QuantEcon.py/quantecon/markov/core.py in _get_index(self, value)
    293                 return idx
    294             except IndexError:
--> 295                 raise ValueError(error_msg)
    296         else:
    297             idx = 0

ValueError: value None not found

Test should fail for this commit

- Change *_components to return labels - Remove get_*_components

- Change *_classes to return values - Remove get_*_classes

- Add simulate_indices which return indices

coveralls · 2016-04-11T14:47:01Z

Coverage increased (+0.7%) to 86.505% when pulling 05e8ad8 on add_nodes_states into 4eff123 on master.

oyamad · 2016-04-11T14:57:50Z

@jstac Thank you for your comments.

I followed your opinion regarding the properties.

For each item there are now two properties *_components_indicies/*_classes_indices and *_components/*_classes (I added just _indices rather than _as_indices for the names to be a little shorter), where the former returns indices while the latter labels/values. The get_*_components/get_*_classes methods are removed.
For simulate, the former simulate and simulate_values are renamed to simulate_indices and simulate, respectively. I think this is consistent with the above, "append _indices if you want indices returned".
The bug you reported has been fixed, where the case when init is None is now handled properly, along with a test covering that case.

jstac · 2016-04-11T16:48:24Z

@oyamad Thanks!

The code is beautiful, as usual. And the interface is great.

Regarding point 2, it should be just as you say.

I think this is ready to merge.

oyamad · 2016-04-12T12:44:15Z

@jstac Thanks, I think we have reached a good solution.

jstac · 2016-04-12T12:48:46Z

@oyamad Me too. Thanks for all the hard work! Are you happy for me to merge? I think we can go ahead.

oyamad · 2016-04-12T12:53:28Z

Yes, please do merge.

sglyon · 2016-04-12T12:57:20Z

🎉

Good work!

This was a great example of open source development

Add node labels to DiGraph

b01b0ab

oyamad force-pushed the add_nodes_states branch from 152505c to b01b0ab Compare March 16, 2016 00:22

Edit and add docstrings

b43039f

Add tests for node labels

0176d3e

Add decorator annotate_nodes to get_*_components methods

3c103d6

Properties return indices as before

oyamad added 3 commits March 17, 2016 12:11

Delete unused method

01fb48b

Allow setting node_labels to None

3d243f1

Add state values to MarkovChain

43632eb

oyamad added 2 commits March 17, 2016 17:01

Fix bugs

b0ad892

Add tests for state values

b5cacb9

oyamad changed the title ~~[WIP] Add states/nodes to MarkovChain/DiGraph~~ Add states/nodes to MarkovChain/DiGraph Mar 17, 2016

Add a testcase

84fdbaa

oyamad added 2 commits March 19, 2016 20:53

MarkovChain: Remove util.numba_installed

457578b

MarkovChain: Fix bug

8850b31

oyamad mentioned this pull request Mar 20, 2016

Periodicity of DiGraph/MarkovChain #245

Open

oyamad added 2 commits March 28, 2016 22:14

Disallow non-homogeneous node_labels/state_values

ecfbc8f

MarkovChain: Add get_index

5bc78d5

MarkovChain.get_index: Accept array_like of values

bfc91bb

oyamad added 2 commits March 29, 2016 11:05

Fix typo

193f625

MarkovChain: Add simulate_values

d825975

and remove `return_values` from `simulate`

oyamad mentioned this pull request Apr 6, 2016

DiscreteDP: Allow beta=1 #244

Merged

2 tasks

oyamad added 5 commits April 11, 2016 11:23

MarkovChain: Add test for simulate_values with init_value=None

c0397df

Test should fail for this commit

MarkovChain: Fix bug in simulate_values

d761cfb

DiGraph: Add *_components_indices which return indices

78e7e67

- Change *_components to return labels - Remove get_*_components

MarkovChain: Add *_classes_indices which return indices

f0c681f

- Change *_classes to return values - Remove get_*_classes

MarkovChain: Change simulate_values to simulate

05e8ad8

- Add simulate_indices which return indices

jstac merged commit 4eade8d into master Apr 12, 2016

jstac deleted the add_nodes_states branch April 12, 2016 12:56

jstac mentioned this pull request Apr 13, 2016

TODO: Add state_values, action_values to DiscreteDP #248

Open

oyamad mentioned this pull request Mar 20, 2017

Some suggestions on quantecon.game_theory.normal_form_game #294

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add states/nodes to MarkovChain/DiGraph #237

Add states/nodes to MarkovChain/DiGraph #237

oyamad commented Mar 15, 2016

jstac commented Mar 16, 2016

oyamad commented Mar 16, 2016

oyamad commented Mar 16, 2016

oyamad commented Mar 17, 2016

oyamad commented Mar 17, 2016

oyamad commented Mar 17, 2016

coveralls commented Mar 18, 2016

coveralls commented Mar 19, 2016

jstac commented Mar 20, 2016

jstac commented Mar 20, 2016

jstac commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 28, 2016

oyamad commented Mar 28, 2016

coveralls commented Mar 28, 2016

coveralls commented Mar 28, 2016

coveralls commented Mar 29, 2016

oyamad commented Mar 29, 2016

jstac commented Mar 30, 2016

jstac commented Apr 10, 2016

coveralls commented Apr 11, 2016

oyamad commented Apr 11, 2016

jstac commented Apr 11, 2016

oyamad commented Apr 12, 2016

jstac commented Apr 12, 2016

oyamad commented Apr 12, 2016

sglyon commented Apr 12, 2016

Add states/nodes to MarkovChain/DiGraph #237

Add states/nodes to MarkovChain/DiGraph #237

Conversation

oyamad commented Mar 15, 2016

jstac commented Mar 16, 2016

oyamad commented Mar 16, 2016

oyamad commented Mar 16, 2016

oyamad commented Mar 17, 2016

oyamad commented Mar 17, 2016

oyamad commented Mar 17, 2016

coveralls commented Mar 18, 2016

coveralls commented Mar 19, 2016

jstac commented Mar 20, 2016

jstac commented Mar 20, 2016

jstac commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 20, 2016

oyamad commented Mar 28, 2016

oyamad commented Mar 28, 2016

coveralls commented Mar 28, 2016

coveralls commented Mar 28, 2016

coveralls commented Mar 29, 2016

oyamad commented Mar 29, 2016

jstac commented Mar 30, 2016

jstac commented Apr 10, 2016

coveralls commented Apr 11, 2016

oyamad commented Apr 11, 2016

jstac commented Apr 11, 2016

oyamad commented Apr 12, 2016

jstac commented Apr 12, 2016

oyamad commented Apr 12, 2016

sglyon commented Apr 12, 2016