WIP: Cross-validation #235

Closed
arokem wants to merge 13 commits into master

Conversation

arokem (Member) commented Aug 24, 2013

Another bit of WIP. The idea here is to use the nascent prediction API on the tensor model (and hopefully also on other models soon enough...) to perform k-fold cross-validation.

The scheme is that in each iteration, a certain number of randomly chosen diffusion-weighted directions is dropped. A model is fit to the remaining directions, and a prediction is then made for the left-out directions. This is repeated so that every direction is dropped in exactly one iteration, so we end up with a complete set of out-of-sample predictions covering all the measurements in the original sample.
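
To make the scheme concrete, here is a minimal sketch of the kind of loop this implies, assuming a model class whose __init__ takes a GradientTable and whose Fit object has a predict(gtab) method; the function and variable names are illustrative, not the code in this PR:

```python
import numpy as np
from dipy.core.gradients import gradient_table


def kfold_xval(model_class, data, gtab, folds=10):
    """Sketch of k-fold cross-validation over diffusion-weighted directions."""
    dwi_idx = np.where(~gtab.b0s_mask)[0]   # only DW directions are left out
    order = np.random.permutation(dwi_idx)
    prediction = np.zeros(data.shape)
    # b0 measurements are never left out; carry them through unchanged.
    prediction[..., gtab.b0s_mask] = data[..., gtab.b0s_mask]
    for left_out in np.array_split(order, folds):
        keep = np.setdiff1d(np.arange(data.shape[-1]), left_out)
        # Fit on the kept directions (b0s included) ...
        train_gtab = gradient_table(gtab.bvals[keep], gtab.bvecs[keep])
        fit = model_class(train_gtab).fit(data[..., keep])
        # ... and predict the signal for the left-out directions.
        test_gtab = gradient_table(gtab.bvals[left_out], gtab.bvecs[left_out])
        prediction[..., left_out] = fit.predict(test_gtab)
    return prediction
```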

Questions at this point: this function relies on a fairly uniform API. It requires that the Fit object derived from the model have a predict method that takes a GradientTable as input and returns a signal for each of the directions/b-values in that input. The function currently also assumes that the model itself has bval and bvec attributes. This is true for the TensorModel class, with which I have been testing this (and which is currently the only model for which prediction is implemented).

Maybe we should implement a base-class from which all the models inherit that enforces these assumptions? Are there any models that are not initialized with a GradientTable in this way?

Also - I created a separate module for this under stats, but does this belong in some other pre-existing module?

Comments and ideas are most welcome at this point. I am still trying to hammer this out in the best possible way.

@@ -0,0 +1,111 @@
"""
Cross-validation analysis of diffusion models

Garyfallidis (Member) commented Sep 6, 2013

I don't think that you need a new module for this file. This should go in the dipy/core module. There are other stats there too.

gtab = gt.gradient_table(model.bval, model.bvec)
elif hasattr(model, 'gtab'): # e.g. DKI
gtab = model.gtab

Garyfallidis (Member) commented Sep 6, 2013

Wait, if this function concerns only the reconstruction models, the file should go to dipy/reconst. If you can later make it more general, it can go to dipy/core.

arokem (Member) commented Sep 7, 2013

I am hoping that in some far-flung future it will be more general, for example for the cross-validation of tracks, but I will move this into core for now.


prediction[..., gtab.b0s_mask] = S0[..., None]
return prediction

Garyfallidis (Member) commented Sep 6, 2013

There is no information in the docstring about what this function actually returns.
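
For what it's worth, a Returns section in numpydoc style would cover this; the exact wording and signature below are only a suggestion, not the PR's actual docstring:

```python
def kfold_xval(model_class, data, gtab, folds=10):
    """Cross-validate a diffusion model by k-fold prediction of left-out directions.

    ...

    Returns
    -------
    prediction : ndarray
        An array with the same shape as `data`, in which every
        diffusion-weighted measurement has been replaced by the model's
        out-of-sample prediction for that direction, while the
        non-diffusion-weighted (b0) measurements are copied from the input.
    """
```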

MrBago (Contributor) commented Sep 6, 2013

@arokem, this is good stuff. Just a few comments regarding design. All models will at some point need information from some GradientTable, and I believe that currently all models do this in a way that is compatible with your implementation.

Adding an abstract model class wouldn't really help here, except to serve as an example, because each child class would need to override the abstract __init__. At that point a developer could choose to change the signature of __init__ in a non-compatible way. My suggestion would be to add these requirements to our Model API and document them well for future developers.

The other option would be to have a method like Model.add_grad_table, which could be implemented in a parent class and re-implemented by models as needed. I don't really like this option; I prefer to think of models as semi-immutable, never changed once they are created.
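
A rough sketch of what that documented convention (the first option above) might ask of a conforming model; the class names here are made up purely for illustration:

```python
class SomeDiffusionModel(object):
    """Skeleton of a model following the proposed (not yet official) Model API:

    * __init__ takes a GradientTable as its first argument and stores it
      as self.gtab.
    * fit(data) returns a Fit object.
    * The Fit object has a predict(gtab) method returning one signal value
      per direction/b-value in the given GradientTable.
    """

    def __init__(self, gtab):
        self.gtab = gtab  # documented attribute name, relied on by kfold_xval

    def fit(self, data):
        return SomeDiffusionFit(self, data)


class SomeDiffusionFit(object):
    def __init__(self, model, data):
        self.model = model
        self.data = data

    def predict(self, gtab):
        # Model-specific prediction goes here; a stub in this sketch.
        raise NotImplementedError
```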

arokem (Member) commented Sep 7, 2013

The idea with having a base-class is to enforce certain uniformity in the function signatures and calls.

You would of course write a new __init__ for every child class of the abstract base-class, but that __init__ could start by making a call to the base-class __init__ to populate the common attributes, such as gtab. I'd rather do that than have a function that adds a gradient table to the model. The problem I was dealing with here was just my own sloppiness in naming variables, where the attribute that points to the gradient table had a different name in different cases. But it's rather easy to enforce a naming convention through inheritance.
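
A minimal sketch of that inheritance pattern, under the assumption that the base class only pins down the common attribute names (both class names are illustrative, not existing dipy classes):

```python
class ReconstModel(object):
    """Hypothetical base class that fixes the names of the common attributes."""

    def __init__(self, gtab):
        # Every child model exposes its GradientTable as self.gtab.
        self.gtab = gtab


class ExampleTensorModel(ReconstModel):
    def __init__(self, gtab, fit_method='WLS'):
        # The child __init__ starts by calling the base-class __init__ to
        # populate the shared attributes, then handles its own arguments.
        super(ExampleTensorModel, self).__init__(gtab)
        self.fit_method = fit_method
```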


arokem (Member) commented Sep 7, 2013

More specifically, DTI has bval and bvec attributes and no gtab attribute, even though a GradientTable class instance is supplied as an input. It seems inconsistent not to refer instead to the bval/bvec attributes of the gtab. And there is the constant bval/bvals, bvec/bvecs confusion to deal with ...
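
A small helper in the spirit of the hasattr checks already in this PR could hide that naming inconsistency until a convention is settled; the helper name and the plural-spelling branch are hypothetical:

```python
import dipy.core.gradients as gt


def _get_gtab(model):
    """Return a GradientTable for `model`, whatever the attribute is called."""
    if hasattr(model, 'gtab'):      # models that store the table directly
        return model.gtab
    if hasattr(model, 'bval'):      # e.g. the current TensorModel
        return gt.gradient_table(model.bval, model.bvec)
    if hasattr(model, 'bvals'):     # the plural-spelling variant
        return gt.gradient_table(model.bvals, model.bvecs)
    raise ValueError("No gradient information found on %r" % model)
```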


arokem referenced this pull request Oct 21, 2013

Merged

Release 0.7 a few cleanups #259

arokem closed this Mar 11, 2014

arokem referenced this pull request Mar 11, 2014

Merged

Xval #335
