PyEmma Analysis #19

thempel · 2017-03-15T14:10:22Z

For actual adaptive simulations, a flexible way to analyze the data is very important. Apart from choosing input parameters such as the lag time and MSM states, it will be necessary to modify, e.g., the input features. Modifying msmanlyze.py, which is effectively part of the adaptivemd source code, is not very convenient. It might also be a bit risky to give the user the standard analysis for alanine dipeptide by default, because they must opt out in order to avoid meaningless yet working results. For more complex types of trajectories, even more options need to be considered. I would suggest either

a script-based solution that allows the user to write custom scripts, maybe by adding the script's path to a modeller object. This would also make it possible to keep track of which modeller has been used to generate which trajectories.

or

a function-based solution: I don't know if this even works, but it might be even better to define custom analysis functions that must take a given set of input parameters and produce output in a given shape. I am thinking of something similar to PyEMMA's featurizer.add_custom_function(). It would make it possible to see directly which keyword arguments can be chosen. Further, the function could be stored in the database, I suppose, making it easy to keep track of the strategy used. It might also be easier to add this to the "brain"...
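A minimal sketch of the function-based idea, assuming a hypothetical registry: `ANALYSES`, `register_analysis`, and `describe_analysis` are illustrative names and not part of adaptivemd or PyEMMA. Each analysis function takes the trajectories plus keyword options, and `inspect.signature` shows which keyword arguments can be chosen:

```python
import inspect

# Hypothetical registry; not part of adaptivemd or PyEMMA.
ANALYSES = {}


def register_analysis(func):
    """Register an analysis function under its own name."""
    params = list(inspect.signature(func).parameters.values())
    # Contract assumed for this sketch: the first parameter is the
    # trajectory data, every further parameter is an option with a default.
    if not params or params[0].name != "trajectories":
        raise TypeError("first parameter must be 'trajectories'")
    for p in params[1:]:
        if p.default is inspect.Parameter.empty:
            raise TypeError(f"option {p.name!r} needs a default value")
    ANALYSES[func.__name__] = func
    return func


def describe_analysis(name):
    """Return the keyword options of a registered analysis with their defaults."""
    params = inspect.signature(ANALYSES[name]).parameters
    return {k: p.default for k, p in params.items() if k != "trajectories"}


@register_analysis
def count_frames(trajectories, stride=1):
    """Toy analysis: number of frames per trajectory after striding."""
    return [len(traj[::stride]) for traj in trajectories]
```

The name of the function and the chosen keyword values could then be written to the database next to the generated trajectories, which is one possible way to keep track of the strategy used.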


franknoe · 2017-03-15T14:12:55Z

I agree. I thought it was possible to pass arbitrary code. The user should be able to access the full database information and just tell the framework which starting conditions should be selected next.

nsplattner · 2017-03-15T14:48:10Z

For the basic model-building functionality required for adaptive sampling, it would be sufficient to slightly extend the options of remote_analysis(). The minimal functionality includes the following options:

featurizer selection (e.g. 'add_all', 'add_backbone_torsion')
transformation (e.g. None or TICA)
TICA options (lag, kinetic variance or number of dimensions)
clustering method (k-means or regspace + metric, cutoff or number of clusters)
MSM lagtime

If these options can be passed, most cases will be covered. For anything more complicated, a custom function or an additional script could be used.
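The options listed above could be bundled roughly as follows. This is only a sketch: `AnalysisOptions`, its field names, and all default values are assumptions for illustration and are not the actual signature of remote_analysis().

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalysisOptions:
    """Hypothetical option bundle; names and defaults are placeholders."""
    features: str = "add_backbone_torsions"   # name of a featurizer method
    transformation: Optional[str] = "tica"    # None or "tica"
    tica_lag: int = 10                        # TICA lag time in frames
    tica_dim: Optional[int] = None            # fixed number of dimensions, or
    tica_var_cutoff: Optional[float] = 0.95   # fraction of kinetic variance
    clustering: str = "kmeans"                # "kmeans" or "regspace"
    n_clusters: Optional[int] = 100           # required for kmeans
    dmin: Optional[float] = None              # distance cutoff for regspace
    metric: str = "euclidean"
    msm_lag: int = 10                         # MSM lag time in frames

    def validate(self):
        """Reject inconsistent combinations before any analysis is run."""
        if self.transformation not in (None, "tica"):
            raise ValueError("transformation must be None or 'tica'")
        if self.transformation == "tica":
            # Exactly one of the two dimension criteria may be set.
            if (self.tica_dim is None) == (self.tica_var_cutoff is None):
                raise ValueError("set exactly one of tica_dim, tica_var_cutoff")
        if self.clustering == "kmeans":
            if self.n_clusters is None:
                raise ValueError("kmeans needs n_clusters")
        elif self.clustering == "regspace":
            if self.dmin is None:
                raise ValueError("regspace needs a cutoff (dmin)")
        else:
            raise ValueError("clustering must be 'kmeans' or 'regspace'")
        if self.tica_lag < 1 or self.msm_lag < 1:
            raise ValueError("lag times must be positive")
        return self
```

Validating up front means that a regspace run without a cutoff, for example, fails immediately instead of after the trajectories have been loaded.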

franknoe · 2017-03-15T14:56:09Z

I think it's important not just to be able to select from a few options, but to be able to develop new strategies. For that we effectively need to be able to access the data. The form of the decision making can be standardized, such as returning a set of selected starting points.
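To illustrate what a standardized form of decision making could look like, here is a sketch of a strategy function that receives the data and returns a set of selected starting points. The function name, the input format (discrete trajectories as lists of state indices), and the least-visited-states heuristic are all assumptions chosen for the example, not something adaptivemd prescribes.

```python
from collections import Counter


def select_starting_points(dtrajs, n_points):
    """Return (trajectory_index, frame_index) pairs for the rarest states.

    dtrajs   : list of discrete trajectories (lists of state indices)
    n_points : maximum number of starting points to return
    """
    # How often each state was visited across all trajectories.
    counts = Counter(state for traj in dtrajs for state in traj)

    # Remember the last frame in which each state was seen.
    last_seen = {}
    for traj_index, traj in enumerate(dtrajs):
        for frame_index, state in enumerate(traj):
            last_seen[state] = (traj_index, frame_index)

    # Least-visited states first; ties broken by state index.
    rare_states = sorted(counts, key=lambda s: (counts[s], s))[:n_points]
    return [last_seen[s] for s in rare_states]
```

Any other strategy would only need to keep the same return shape, so the framework could stay agnostic about how the points were chosen.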

jhprinz · 2017-03-22T19:30:30Z

> featurizer selection (e.g. 'add_all', 'add_backbone_torsion')

This is in there now.

I think we should cover the usual suspects. If you want something really fancy, you can always write your own analysis code. No problem. But most people will want to use PyEMMA in some standard ways, like we teach in the courses (as @nsplattner listed). Features are in there now. TICA is always on but has some options. Clustering should be selectable, but so far we only have n_states. MSM lagtime is in there.

nsplattner · 2017-03-23T15:11:33Z

I'm not sure how the function remote_analysis() is supposed to work. The choice of features seems to be hardcoded (line 44, feat.add_backbone_torsions()). Is this supposed to be an example, or is it customizable? How can arguments be passed to the featurizer? If it's an example, it should not be in the main code but rather in the tutorial directory.
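One possible way to pass arguments to the featurizer instead of hardcoding the call is a list of (method name, keyword arguments) pairs that is applied to the featurizer object. This is a sketch under assumptions: `apply_feature_spec` is a hypothetical helper, and `RecordingFeaturizer` is only a stand-in that mimics a featurizer with an `add_backbone_torsions` method so the example runs without PyEMMA.

```python
def apply_feature_spec(featurizer, spec):
    """Call each listed add_* method on the featurizer with its keyword arguments.

    spec : list of (method_name, kwargs) pairs, e.g.
           [("add_backbone_torsions", {"cossin": True})]
    """
    for method_name, kwargs in spec:
        # Only allow feature-adding methods, not arbitrary attributes.
        if not method_name.startswith("add_"):
            raise ValueError(f"not a feature method: {method_name!r}")
        method = getattr(featurizer, method_name, None)
        if not callable(method):
            raise AttributeError(f"featurizer has no method {method_name!r}")
        method(**kwargs)
    return featurizer


class RecordingFeaturizer:
    """Stand-in for a real featurizer; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def add_backbone_torsions(self, cossin=False):
        self.calls.append(("add_backbone_torsions", {"cossin": cossin}))


feat = apply_feature_spec(
    RecordingFeaturizer(),
    [("add_backbone_torsions", {"cossin": True})],
)
```

Because the spec is plain data, it could live in a user script or tutorial rather than in the package, and it would not be lost when the code is updated.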

jhprinz · 2017-03-23T15:34:27Z

This was an example where I hardcoded it. It should be obvious what to change. Unfortunately, PyEMMA does not provide a way to store a feature description, but the upcoming PR #28 will change that.

nsplattner · 2017-03-23T16:17:22Z

OK, thanks for the details!
It is obvious what to change. The problem is that a) it's not clear that this is an example, since it's placed in the package, and b) it's not convenient to have a custom function placed in the package, since it's lost when the code is updated.

jhprinz · 2017-03-23T23:02:53Z

Sorry for the confusion. It was not originally planned to turn it into a package. I did that to make it easier for you guys. All the additional work, including cleanups and documentation, is kind of hard to do in two weeks' time.

PR #28 and #35 will solve that problem and allow much more customization, though.

thempel added the enhancement label Mar 15, 2017
