ReactiveFlux preserve order of input sets #1005

marscher · 2016-12-06T18:41:21Z

To preserve the order we use an ordered set implementation.
Note that this also increases speed a bit.
The first and the last state of F are always A and B, so we use that information in
plotting.
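
As a rough illustration of why an ordered set matters here (this is not the bundled `orderedset` extension added in this PR; `ordered_unique` is a hypothetical helper), a plain `set` gives no ordering guarantee, while an insertion-ordered container keeps the states in the order they were first seen:

```python
def ordered_unique(states):
    """Return the distinct states in first-seen order.

    Relies on dict preserving insertion order (guaranteed since Python 3.7).
    """
    return list(dict.fromkeys(states))


intermediates = [5, 3, 8, 3, 0, 5]
print(ordered_unique(intermediates))  # [5, 3, 8, 0]
```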

franknoe · 2016-12-06T18:53:42Z

Note that the input and the output sets can differ: if your definition of A and B cuts through set boundaries, the sets must be split. Thus, I don't know what "preserve order of input sets" means. In general it is not possible to preserve the order because not even the sets themselves are preserved.

Quoted pull request notification:

Fixes #1004

Commit Summary
* [reactive_flux] preserve order of intermediate sets and label A, B in plots
* [circle] use two omp threads

File Changes
* M circle.yml (1)
* A pyemma/_ext/orderedset/.gitignore (1)
* A pyemma/_ext/orderedset/LICENSE (84)
* A pyemma/_ext/orderedset/__init__.py (5)
* A pyemma/_ext/orderedset/_orderedset.pyx (512)
* M pyemma/msm/models/reactive_flux.py (2)
* M pyemma/plots/networks.py (9)
* M setup.py (7)

Patch Links:
* https://github.com/markovmodel/PyEMMA/pull/1005.patch
* https://github.com/markovmodel/PyEMMA/pull/1005.diff

--

---------------------------------------------- Prof. Dr. Frank Noe Head of Computational Molecular Biology group Freie Universitaet Berlin Phone: (+49) (0)30 838 75354 Web: research.franknoe.de Mail: Arnimallee 6, 14195 Berlin, Germany ----------------------------------------------

codecov-io · 2016-12-06T19:40:23Z

Current coverage is 89.37% (diff: 100%)

Merging #1005 into devel will increase coverage by 0.01%

@@              devel      #1005   diff @@
==========================================
  Files           172        173     +1   
  Lines         16589      16598     +9   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          14824      14834    +10   
+ Misses         1765       1764     -1   
  Partials          0          0

Powered by Codecov. Last update 6bf304f...a18bcf0

nsplattner · 2016-12-06T19:48:33Z

Thanks for addressing #1004 !

I understand that it's probably not possible to find a general solution for the order of output sets, since there are many potential use cases. However, I think the most important use case is using the PCCA sets for coarse-graining, either with A and B being macrostates or with A and B being microstates belonging to different PCCA states. (This is also what we show in the BPTI notebook.) In this case I think it helps to have a defined order of the output states which can easily be connected to the input sets, as well as having A and B clearly marked, since in the end the main interest is connecting the observed transition states to macrostates and their specific properties.

franknoe · 2016-12-06T19:54:14Z

If you have coarse-grained the flux to PCCA states and then use microstates as A/B, you will have exactly the problem that I described. For example, if we have the coarse-graining

[0, 1, 2]
[3, 4, 5]
[6, 7, 8]

and now define A=7 and B=4, we will get the sets

[7] = A
[0, 1, 2]
[3, 5]
[6, 8]
[4] = B

So what's the right order here? And what does Martin's code do in this case?
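
The splitting in this example can be reproduced with a small sketch. This is only an illustration of the set arithmetic, not PyEMMA's implementation: `split_sets` is a hypothetical helper, and it simplifies by merging all A states into one leading set and all B states into one trailing set.

```python
def split_sets(user_sets, A, B):
    """Remove A and B states from the user sets; put A first and B last."""
    A, B = set(A), set(B)
    ends = A | B
    middle = []
    for s in user_sets:
        # keep only the states that belong to neither A nor B
        rest = [x for x in s if x not in ends]
        if rest:
            middle.append(rest)
    return [sorted(A)] + middle + [sorted(B)]


sets = split_sets([[0, 1, 2], [3, 4, 5], [6, 7, 8]], A=[7], B=[4])
print(sets)  # [[7], [0, 1, 2], [3, 5], [6, 8], [4]]
```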

nsplattner · 2016-12-06T20:03:50Z

As far as I understand the order as indicated above was not preserved before. Also, states A and B were not marked. So what I expect the code to do (I have not tested it yet) is to mark A and B in the flux plot and preserve the order of states otherwise. This should make it easier to connect the transition states to the PCCA sets of the model.
Please correct me if this is wrong and comment if you think there is a better way of solving this problem.

franknoe · 2016-12-06T20:18:36Z

This makes sense, but A and B could consist of multiple sets if they had been divided before. I think the easiest would be if the A and B sets are marked in an additional variable. The part "preserve the order of states otherwise" is ill-defined, I think; there is no general way of preserving order. But I think that is also less important. The most important issue is that you can recover A and B.

Note that if you have defined your sets a priori such that A and B are clear-cut (i.e. they are either sets or assemblies of sets that contain no other states than A/B states), the sets should be unchanged by TPT. Then I think it also makes sense to keep the order (but that's the nicest case). I'm not sure what the current code does. Probably it places A at the beginning and B at the end.

nsplattner · 2016-12-06T20:51:27Z

That's exactly what the current code does (placing A at the beginning and B at the end). This is fine if A and B are clearly marked, as in the code submitted here, but it's confusing in the current release when your input is a defined set of PCCA states and you get the same number of states as output, but with a different numbering compared to your input.

I agree that preserving the order of states is not meaningful in case states are divided. However, I think in these cases users should be aware of dealing with a new state definition/ordering anyways.

franknoe · 2016-12-06T20:57:59Z

OK, so how about the following:

- if A/B are chosen well, i.e. no states need to be split up, leave the order as is.
- if states are being split up, raise a Warning, notifying the user that the state definition has changed and care needs to be taken when indexing states.
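
The proposal above could look roughly like the following sketch. It is an assumption-laden illustration, not the code merged in this PR: `coarse_grain_sets` is a hypothetical function, and a set counts as "not split" when it lies entirely inside A, entirely inside B, or touches neither.

```python
import warnings


def coarse_grain_sets(user_sets, A, B):
    """Keep the input order; split a set in place only if A or B cuts through it."""
    A, B = set(A), set(B)
    ends = A | B
    out, changed = [], False
    for s in user_sets:
        s = list(s)
        members = set(s)
        if members <= A or members <= B or not (members & ends):
            out.append(s)  # clear-cut set: leave it unchanged
            continue
        changed = True
        parts = (
            [x for x in s if x in A],
            [x for x in s if x in B],
            [x for x in s if x not in ends],
        )
        out.extend(p for p in parts if p)
    if changed:
        warnings.warn(
            "Input sets were split by A/B; state indices refer to a new set definition.",
            UserWarning,
        )
    return out


pcca = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(coarse_grain_sets(pcca, A=[0, 1, 2], B=[6, 7, 8]))  # unchanged, no warning
```

With `A=[7]` and `B=[4]` the same call returns five sets and emits the warning, which is the case where indices can no longer be read off the input order.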

nsplattner · 2016-12-06T21:02:46Z

I think that would be a good solution!

franknoe · 2016-12-06T21:06:18Z

OK, @marscher can you do that?

marscher · 2016-12-07T13:56:35Z

The changes in #1005 reflect exactly what has been discussed here:

1. The order is preserved if A and B do not cut through the sets to coarse-grain to.
2. If the set definition changes, the user will get a warning that it had to be changed.
3. Plotting the flux will label A and B.

franknoe · 2016-12-07T14:47:00Z

OK, can you provide an additional class variable which indicates where A and B are (not just the visualization)?

marscher · 2016-12-07T14:50:56Z

On 12/07/2016 03:47 PM, Frank Noe wrote:
> ok, can you provide an additional class variable which indicates where A and B are (not just the visualization)?

Do you mean to provide these within the NetworkPlot class?

franknoe · 2016-12-07T14:53:32Z

ReactiveFlux. We are talking about the ReactiveFlux, which does TPT/coarse-graining, no?

marscher · 2016-12-07T14:55:51Z

On 12/07/2016 03:53 PM, Frank Noe wrote:
> ReactiveFlux. We are talking about the ReactiveFlux, which does TPT/coarse-graining, no?

I just asked because A, B and I (intermediates) are already properties of this class.

franknoe · 2016-12-07T14:57:26Z

The question is whether this helps to identify how states were split up, if they were split up. I don't remember right now, but I think A/B are the input A/B, i.e. microstate sets.

marscher · 2016-12-07T15:12:04Z

The coarse_grain method returns a new ReactiveFlux instance, which will have the new A and B sets as properties.

franknoe · 2016-12-07T15:14:08Z

Yeah, but does that show in a simple way how macrostates were split up? If that's the same functionality as before, I guess it doesn't solve Nuria's assignment problem. Nuria, please comment.

nsplattner · 2016-12-07T15:26:09Z

The main problem I have with the current release is that the numbering is confusing in case the macrostates are not split. This is solved by the changes above.

For the case where macrostates are split, I'm not sure what information should be provided. On the one hand, this could be treated as a new macrostate definition. In this case the user would just need the indices of the microstates in each new macrostate. However, if the new macrostates can clearly be related to the input macrostates, then it would be good to have this information. But I'm not sure if this can be done in a generally applicable way.
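
One possible way to relate new macrostates to the input macrostates, sketched under the assumption that both are available as lists of microstate indices (`parent_indices` is a hypothetical helper, not part of PyEMMA), is to record which input sets each output set overlaps:

```python
def parent_indices(new_sets, input_sets):
    """For each new set, list the indices of the input sets it overlaps."""
    result = []
    for new in new_sets:
        new = set(new)
        result.append([i for i, old in enumerate(input_sets) if new & set(old)])
    return result


input_sets = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
new_sets = [[7], [0, 1, 2], [3, 5], [6, 8], [4]]
print(parent_indices(new_sets, input_sets))  # [[2], [0], [1], [2], [1]]
```

When A or B was assembled from several input sets, the corresponding entry simply lists more than one parent, so the mapping stays well defined even when the order is not.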

franknoe · 2016-12-07T15:47:34Z

OK, then merge and see if this helps.

marscher added 2 commits December 6, 2016 19:12

[reactive_flux] preserve order of intermediate sets and label A, B in…

e811266

… plots * To preserve the order we use a ordered set implementation. Note that this also increases speed a bit. * The first and the last state of F are always A and B, so we use that info in plotting.

[circle] use two omp threads

a18bcf0

marscher merged commit 7e4ef42 into markovmodel:devel Dec 7, 2016

marscher deleted the flux_preserve_order branch December 7, 2016 15:50
