PyEmma Analysis #19

Open
thempel opened this issue Mar 15, 2017 · 8 comments

Comments

@thempel
Member

thempel commented Mar 15, 2017

For actual adaptive simulations, a flexible way to analyze the data is very important. Apart from choosing input parameters such as the lag time and MSM states, it will be necessary to modify, for example, the input features. Modifying msmanlyze.py, which is effectively part of the adaptivemd source code, is not very convenient. It might also be a bit risky to give users the standard alanine dipeptide analysis by default, because they must opt out to avoid results that run without error but are meaningless. For more complex types of trajectories, even more options need to be considered. I would suggest either

  • a script-based solution that allows the user to write custom scripts, perhaps by adding the script's path to a modeller object. This would also make it possible to keep track of which modeller has been used to generate which trajectories.

or

  • a function-based solution: I don't know if this even works, but it might be even better to define custom analysis functions that must take a given set of input parameters and produce output in a given shape. I am thinking of something similar to PyEmma's featurizer.add_custom_function(). This would make it possible to see directly which keyword arguments can be chosen. Further, the function could be stored in the database, I suppose, making it easy to keep track of the strategy used. It might also be easier to add this to the "brain"...
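
To make the function-based idea concrete, here is a minimal sketch of what such a contract could look like. Everything in it is an assumption made for illustration: the required parameter names (`trajectories`, `lag`, `n_states`), the required output keys, and the helper names are hypothetical and are not part of adaptivemd or PyEMMA. The sketch only shows how a fixed input signature and a fixed output shape could be checked before a user-supplied function is accepted.

```python
import inspect

# Hypothetical contract (not an adaptivemd API): an analysis function must
# accept these keyword arguments and return a dict containing these keys.
REQUIRED_PARAMS = ("trajectories", "lag", "n_states")
REQUIRED_OUTPUT_KEYS = ("transition_matrix", "state_counts")


def check_analysis_function(func):
    """Raise TypeError if func cannot accept the required parameters."""
    params = inspect.signature(func).parameters
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    missing = [name for name in REQUIRED_PARAMS if name not in params]
    if missing and not accepts_kwargs:
        raise TypeError("analysis function lacks parameters: %s" % missing)
    return func


def run_analysis(func, **kwargs):
    """Call a checked analysis function and validate the shape of its output."""
    result = check_analysis_function(func)(**kwargs)
    missing = [key for key in REQUIRED_OUTPUT_KEYS if key not in result]
    if missing:
        raise ValueError("analysis result lacks keys: %s" % missing)
    n = len(result["state_counts"])
    matrix = result["transition_matrix"]
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("transition_matrix must be %d x %d" % (n, n))
    return result


def uniform_analysis(trajectories, lag, n_states):
    """Toy stand-in for a real model builder: returns a uniform matrix."""
    row = [1.0 / n_states] * n_states
    return {
        "transition_matrix": [list(row) for _ in range(n_states)],
        "state_counts": [0] * n_states,
    }
```

A call such as `run_analysis(uniform_analysis, trajectories=[], lag=2, n_states=3)` passes both checks, while a function with an unrelated signature is rejected before it runs. Because the keyword arguments are visible in the signature, they could also be recorded alongside the function to keep track of the strategy used.
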
@franknoe
Collaborator

franknoe commented Mar 15, 2017 via email

@nsplattner
Collaborator

For basic model building functionality required for adaptive sampling it would be sufficient to slightly extend the options of remote_analysis(). The minimal functionality includes the following options:

  • featurizer selection (e.g. 'add_all', 'add_backbone_torsion')
  • transformation (e.g. None or TICA)
  • TICA options (lag, kinetic variance or number of dimensions)
  • clustering method (k-means or regspace + metric, cutoff or number of clusters)
  • MSM lagtime

If these options can be passed, most cases will be covered. For anything more complicated, a custom function or an additional script could be used.
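
The option list above could be captured in a single validated container, sketched below. The class name, field names, and default values are hypothetical and do not reflect the actual signature of remote_analysis(). The keyword names returned by the two helper methods (`lag`, `dim`, `var_cutoff`, `k`, `dmin`, `metric`) are assumed to match PyEMMA's `tica`, `cluster_kmeans`, and `cluster_regspace` functions and should be checked against the installed version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalysisOptions:
    """Hypothetical container for the minimal option set listed above."""

    featurizer: str = "add_backbone_torsions"  # name of a featurizer method
    use_tica: bool = True                      # transformation: None or TICA
    tica_lag: int = 2
    tica_dim: Optional[int] = None             # fixed number of dimensions ...
    tica_var_cutoff: Optional[float] = 0.95    # ... or kinetic variance cutoff
    clustering: str = "kmeans"                 # "kmeans" or "regspace"
    metric: str = "euclidean"
    n_clusters: Optional[int] = 100            # used by k-means
    dmin: Optional[float] = None               # cutoff used by regspace
    msm_lag: int = 2

    def __post_init__(self):
        if self.clustering not in ("kmeans", "regspace"):
            raise ValueError("clustering must be 'kmeans' or 'regspace'")
        if self.tica_lag < 1 or self.msm_lag < 1:
            raise ValueError("lag times must be positive")
        if self.use_tica and (self.tica_dim is None) == (self.tica_var_cutoff is None):
            raise ValueError("set exactly one of tica_dim and tica_var_cutoff")
        if self.clustering == "kmeans" and self.n_clusters is None:
            raise ValueError("k-means needs n_clusters")
        if self.clustering == "regspace" and self.dmin is None:
            raise ValueError("regspace needs dmin")

    def tica_kwargs(self):
        """Keyword arguments for the TICA step, or None if TICA is disabled."""
        if not self.use_tica:
            return None
        kwargs = {"lag": self.tica_lag}
        if self.tica_dim is not None:
            kwargs["dim"] = self.tica_dim
        else:
            kwargs["var_cutoff"] = self.tica_var_cutoff
        return kwargs

    def cluster_kwargs(self):
        """Keyword arguments for the selected clustering method."""
        if self.clustering == "kmeans":
            return {"k": self.n_clusters, "metric": self.metric}
        return {"dmin": self.dmin, "metric": self.metric}
```

Validating the combination up front means an inconsistent request, such as regspace clustering without a cutoff, fails before any trajectories are processed rather than partway through a remote job.
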

@franknoe
Collaborator

franknoe commented Mar 15, 2017 via email

@jhprinz
Contributor

jhprinz commented Mar 22, 2017

featurizer selection (e.g. 'add_all', 'add_backbone_torsion')

This is in there now.

I think we should cover the usual suspects. If you want something really fancy you can always write your own analysis code. NP. But most people will want to use PyEmma in some standard ways like we teach in the courses (as @nsplattner listed). Features are in there now. TICA is always on but has some options. Clustering should be selectable, but so far we only have n_states. MSM lagtime is in there.

@nsplattner
Collaborator

I'm not sure how the function remote_analysis() is supposed to work. The choice of features seems to be hardcoded (line 44, feat.add_backbone_torsions()). Is this supposed to be an example, or is it customizable? How can arguments be passed to the featurizer? If it's an example, it should not be in the main code but rather in the tutorial directory.

@jhprinz
Contributor

jhprinz commented Mar 23, 2017

This was an example where I hardcoded it. It should be obvious what to change. Unfortunately, PyEMMA does not provide a way to store a feature description, but the upcoming PR #28 will change that.

@nsplattner
Collaborator

O.k., thanks for the details!
It is obvious what to change; the problem is that a) it's not clear that this is an example, since it's placed in the package, and b) it's not convenient to have a custom function placed in the package, since it's lost when the code is updated.
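
One way to address point b) would be to keep the custom analysis in a file outside the package and load it by path, so that updating the package does not overwrite it. The sketch below uses only the Python standard library. The function name `load_analysis`, the module name `user_analysis`, and the convention that the script defines a callable named `analyze` are hypothetical and are not taken from adaptivemd.

```python
import importlib.util


def load_analysis(path, func_name="analyze"):
    """Import a user script from `path` and return its `func_name` callable."""
    spec = importlib.util.spec_from_file_location("user_analysis", path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load an analysis script from %r" % path)
    module = importlib.util.module_from_spec(spec)
    # Execute the script so that its top-level definitions become available.
    spec.loader.exec_module(module)
    func = getattr(module, func_name, None)
    if not callable(func):
        raise AttributeError("%r does not define a callable %r" % (path, func_name))
    return func
```

Storing the script path together with the generated trajectories would also record which analysis produced which data.
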

@jhprinz
Contributor

jhprinz commented Mar 23, 2017

Sorry for the confusion. It was not originally planned to turn it into a package. I did that to make it easier for you guys. All the additional work, including cleanups and documentation, is kind of hard to do in two weeks' time.

PR #28 and #35 will solve that problem and allow much more customization, though.
