Add VAMP, CKTests (MSM and VAMP) #25
Actually this should not be needed, as estimators are a plain model factory.
TODO: investigate why
https://stackoverflow.com/a/34225828/3086470 there are some patterns, as well as design considerations.
the test failure in kmeans is totally unrelated and surprising!
well it is a randomized test 🤷‍♂️ and it just checks whether the callback was invoked twice.. could've converged after one iteration?
Would it help to always yield a copy when calling fetch_model?
On 14.09.19 20:47, Moritz Hoffmann wrote:
> Would it help to always yield a copy when calling fetch_model?
It would cure the symptoms, but I thought we decided against it in the beginning, since copying is a heavy (memory) operation and unexpected at this point.
Symptoms implying there is a deeper underlying issue? To be fair I don't think it is very heavy in terms of memory since there is no data attached to models, just statistics. Also isn't it rather unexpected that a factory returns references?
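To make the side effect under discussion concrete, here is a minimal sketch. The `Model` and `Estimator` classes and the `copy_model` flag are hypothetical stand-ins, not the project's actual code; the sketch only assumes that the estimator encapsulates a single model instance and that `fetch_model` hands out a reference to it.

```python
import copy


class Model:
    """Hypothetical model holding only summary statistics, no data."""

    def __init__(self):
        self.mean = None


class Estimator:
    """Hypothetical estimator that reuses one encapsulated model instance."""

    def __init__(self):
        self._model = Model()

    def fit(self, data):
        # Updates the encapsulated model in place.
        self._model.mean = sum(data) / len(data)
        return self

    def fetch_model(self, copy_model=False):
        # A deep copy decouples the caller from later calls to fit().
        return copy.deepcopy(self._model) if copy_model else self._model


est = Estimator()
ref = est.fit([1.0, 2.0, 3.0]).fetch_model()
snap = est.fetch_model(copy_model=True)
est.fit([10.0, 20.0, 30.0])  # second fit mutates the shared instance

print(ref.mean)   # 20.0, the earlier reference changed silently
print(snap.mean)  # 2.0, the copy kept the first result
```

Since the model in this sketch carries only a statistic, the deep copy is cheap; whether that holds for real models depends on what they store.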
I'll add the default copy then, but it also involves forcing implementations of fetch_model to do the copy. I think it would be much clearer if fit() produced a new instance.
Creating the instance prior to each invocation of fit is rather easy to do and minimally invasive, because people should derive from Estimator in any case. There is a note in the doc string describing this behavior and we can also put it in the developer docs at some point.
Sounds good to me in principle, but is that compatible with partial fit? I just suggested fetch_model because of that.
The default Estimator constructor checks for this case and creates the model if partial_fit is implemented.
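One way this could look is sketched below, assuming a base class that wraps any `fit` a subclass defines. Everything here is hypothetical (`Model`, `CountingEstimator`, `OnlineEstimator`, and the use of `__init_subclass__`); only the names `Estimator`, `fit`, `partial_fit`, `fetch_model` and `_create_model` come from the discussion, and the actual implementation may differ.

```python
import functools


class Model:
    """Hypothetical model; holds statistics only."""

    def __init__(self):
        self.n_seen = 0


class Estimator:
    """Hypothetical base class; not the project's actual implementation."""

    def __init__(self):
        # Incremental estimators need a model before the first partial_fit call.
        self._model = self._create_model() if hasattr(self, "partial_fit") else None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        fit = cls.__dict__.get("fit")
        if fit is None:
            return  # this subclass does not define fit, nothing to wrap

        @functools.wraps(fit)
        def fit_with_fresh_model(self, *args, **kw):
            # Replace the model so earlier fetch_model() results stay untouched.
            self._model = self._create_model()
            return fit(self, *args, **kw)

        cls.fit = fit_with_fresh_model

    def _create_model(self):
        return Model()

    def fetch_model(self):
        return self._model


class CountingEstimator(Estimator):
    def fit(self, data):
        self._model.n_seen = len(data)
        return self


class OnlineEstimator(Estimator):
    def partial_fit(self, data):
        self._model.n_seen += len(data)
        return self


est = CountingEstimator()
first = est.fit([1, 2, 3]).fetch_model()
second = est.fit([4, 5]).fetch_model()
print(first is second, first.n_seen, second.n_seen)  # False 3 2

online = OnlineEstimator()
online.partial_fit([1, 2]).partial_fit([3])
print(online.fetch_model().n_seen)  # 3
```

The overridden `fit` stays unchanged from the subclass author's point of view, since the wrapping happens at class creation. A caveat of this particular sketch: a subclass whose `fit` calls `super().fit()` on another wrapped `fit` would create the model twice.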
very nitpicky probably 😇
@clonker: I'd be happy, if you could explain the deviation in msm cktest... I couldn't spot it :(
@clonker figured it out by myself. Thanks for the rigorous review, it was very helpful!
sweet! what was it in the end?
On 24.09.19 22:24, Moritz Hoffmann wrote:
> sweet! what was it in the end?
an indexing issue (mlag0 offset+1) and too high a precision for testing (just adapted to the pyemma rtol and atol).
yeah indexing issues are nasty.. nice!
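As an illustration of the tolerance part of that fix, the sketch below spells out the usual rtol/atol criterion, |actual - desired| <= atol + rtol * |desired|, which is the comparison documented for `numpy.testing.assert_allclose`. The `allclose` helper and all numbers are made up for illustration; the tolerances actually used in the tests are not shown in this thread.

```python
def allclose(actual, desired, rtol=1e-7, atol=0.0):
    """Elementwise check of |actual - desired| <= atol + rtol * |desired|."""
    return all(abs(a - d) <= atol + rtol * abs(d) for a, d in zip(actual, desired))


# Made-up numbers standing in for estimated and predicted values.
estimated = [0.912345, 0.351234, 1.2e-9]
predicted = [0.912349, 0.351232, 0.0]

print(allclose(estimated, predicted, rtol=1e-9))             # False
print(allclose(estimated, predicted, rtol=1e-5, atol=1e-8))  # True
```

Note that a purely relative tolerance can never pass for a desired value of exactly zero unless the actual value is also zero, which is why an absolute tolerance is usually set alongside it.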
This adds VAMP estimator/model and the infrastructure for lagged model validation (cktests).
While getting this to work, I noticed that calling fit on an estimator has unexpected side effects. That is why we need to take a copy of it in LaggedModelValidator. The factory pattern should make this copy unnecessary, but because we encapsulate the current model instance, we cannot work around it.
@clonker do you think it would be sane to call _create_model upon fit() to avoid this kind of hassle? How would we enforce this behavior without interfering with overridden fit methods?