
Return S0 value for DTI fits #1162

Merged · 45 commits · Jan 24, 2017
Conversation

etpeterson
Contributor

The goal of this PR is to optionally return the S0 (b=0) value for DTI fits without negatively impacting performance or memory unless the S0 is requested. It captures the fitted S0 values generated in the DTI fitting routines and allows the user to access them.

It's similar in intent to PR 1036 but rather than calculating the S0 values from the fits, it captures the values directly in each fit method.

In my local tests the fit is only slightly slower, but keeping the S0 values does increase memory usage, since an additional image is retained. I'm timing the whole fit with `time` and measuring size by pickling the fit object, so if there's a better way to do either, let me know.

I think it's missing a few tests and probably some documentation but I think the major pieces are there.
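The pickle-based size check described above can be sketched as follows (the fit object and the array shapes here are illustrative stand-ins, not dipy's actual objects):

```python
import pickle

import numpy as np

def pickled_size_gb(obj):
    """Approximate an object's footprint by measuring its pickled size."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)) / 1e9

# Stand-in for a fit result: tensor params, plus an optional S0 volume.
params = np.zeros((64, 64, 32, 12))
without_s0 = {"params": params}
with_s0 = {"params": params, "S0": np.zeros((64, 64, 32))}

print(pickled_size_gb(without_s0))
print(pickled_size_gb(with_s0))
```

Pickling overstates the true in-memory size slightly, but it's a cheap way to compare the with/without-S0 variants of the same object.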

- Added tests for fast linear fitting in test_ivim.py
- Added a fitting comparison function that uses AIC
- Added bounds checking and clamping options
- Made the relative likelihood able to be positive or negative; I think it's working correctly
- Fixed errors in the bounds checking and clipping
- Added an AIC weight calculation function
- Added tests for the AIC functions
- Fixed a bug that caused an error when asking for the S0 values when they weren't calculated
- Removed the error for the above methods preventing S0 fitting in those cases
```python
    S0params[i:i + step] = fit_output[1]
else:
    dtiparams[i:i + step] = fit_tensor(design_matrix,
                        data[i:i + step], *args, **kwargs)
```
Contributor

PEP8: align data[i:i + step]

Contributor

Also, ensure that you don't have a line with more than 79 characters

@etpeterson
Contributor Author

That sounds reasonable to me. I made the changes so S0_hat is the model fit. I think it's working and doing everything it should.

I suspect I'm missing some test cases and documentation so I'll try to check that when I can but comments are welcome before then.

@coveralls

Coverage Status

Coverage decreased (-0.03%) to 88.365% when pulling 9913fa7 on etpeterson:dti_S0 into 7a89ef1 on nipy:master.

@coveralls

Coverage Status

Coverage increased (+0.0004%) to 88.399% when pulling d447f4d on etpeterson:dti_S0 into 7a89ef1 on nipy:master.

@coveralls

Coverage Status

Coverage increased (+0.02%) to 88.421% when pulling 154382d on etpeterson:dti_S0 into 7a89ef1 on nipy:master.

@coveralls

Coverage Status

Coverage increased (+0.02%) to 88.421% when pulling 7dd0b6a on etpeterson:dti_S0 into 7a89ef1 on nipy:master.

@etpeterson
Contributor Author

I updated the tests and documentation. Let me know what you think. I'm also interested if you have thoughts about speed and memory testing beyond the basic script I'm using locally.

@arokem arokem left a comment


Nice work -- thanks for implementing this! I wonder whether some of the logic could be simplified a bit (see comments inline). Regarding profiling: you can use this script: https://github.com/nipy/dipy/blob/master/scratch/profile_dti.py, together with the line-profiler: https://github.com/rkern/line_profiler to measure timing, and profile this line-by-line. At the very least using %timeit to compare performance with and without this would be good, just to see that we are not slowing this down for people who don't care about estimating S0.
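A minimal `timeit` comparison along those lines might look like this (the two fit functions are hypothetical stand-ins for the model fit with and without `return_S0_hat`, using a simple least-squares fit):

```python
import timeit

import numpy as np

def fit_without_s0(design, data):
    # Plain least-squares fit: keep only the coefficients.
    return np.linalg.lstsq(design, data, rcond=None)[0]

def fit_with_s0(design, data):
    # Same fit, but also keep the intercept row as log(S0).
    coefs = np.linalg.lstsq(design, data, rcond=None)[0]
    return coefs, np.exp(coefs[-1])

rng = np.random.default_rng(0)
design = rng.standard_normal((30, 7))
data = rng.standard_normal((30, 1000))

t_off = timeit.timeit(lambda: fit_without_s0(design, data), number=20)
t_on = timeit.timeit(lambda: fit_with_s0(design, data), number=20)
print(f"S0 off: {t_off:.4f} s, S0 on: {t_on:.4f} s")
```

The same comparison against master (which has no S0 logic at all) gives a baseline for any overhead introduced by the branching itself.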

One more request I have is to test this with different configurations of b=0 measurements: with one measurement at the beginning, and at the end of the acquisition, with several b=0 measurements, either bunched together, or interspersed throughout. I want to be sure that this is not somehow making assumptions about the b=0 measurements that just happen to be true for the test conditions.
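For illustration, the b=0 layouts described above could be generated like this (the b-values and counts are hypothetical, just to parameterize tests):

```python
import numpy as np

n = 10
layouts = {
    "b0_first": np.r_[0, np.full(n - 1, 1000)],
    "b0_last": np.r_[np.full(n - 1, 1000), 0],
    "b0_bunched": np.r_[np.zeros(3, dtype=int), np.full(n - 3, 1000)],
    "b0_interspersed": np.array([0, 1000, 1000, 0, 1000,
                                 1000, 0, 1000, 1000, 1000]),
}
for name, bvals in layouts.items():
    # Where the b=0 volumes sit in each acquisition layout.
    print(name, np.flatnonzero(bvals == 0))
```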

```python
params_in_mask = self.fit_method(self.design_matrix, data_in_mask,
                                 *self.args, **self.kwargs)

if self.return_S0_hat:
```
Contributor

I don't think you need this if... else construct. If it's false, it will just do the right thing (assuming it's implemented correctly in the fit_method).

Contributor Author

I agree that's fine on the input side, but the output of fit_method is trickier because it returns either one or two values. So I think the alternative below still requires an if statement, but is perhaps cleaner:

```python
params_in_mask = self.fit_method(
    self.design_matrix,
    data_in_mask,
    return_S0_hat=self.return_S0_hat,
    *self.args,
    **self.kwargs)
if self.return_S0_hat:
    params_in_mask, model_S0 = params_in_mask
```
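As a self-contained sketch of that pattern (all names here are illustrative, not dipy's API), the single call with a conditional unpack looks like:

```python
def fit_method(design, data, return_S0_hat=False):
    # Hypothetical fit: returns params, and optionally an S0 estimate too.
    params = [0.1, 0.2]
    return (params, 100.0) if return_S0_hat else params

def fit(design, data, return_S0_hat=False):
    # One call site; unpack the extra S0 value only when it was requested.
    out = fit_method(design, data, return_S0_hat=return_S0_hat)
    if return_S0_hat:
        params, model_S0 = out
    else:
        params, model_S0 = out, None
    return params, model_S0
```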

Contributor

Yes - that's cleaner. In general, I prefer it when there is only one call to the function. That way, if the API changes for reasons unrelated to S0_hat, we only have to change the call in one place, instead of two places.


```python
if mask is None:
    out_shape = data.shape[:-1] + (-1, )
    dti_params = params_in_mask.reshape(out_shape)
    if self.return_S0_hat:
```
Contributor

I think the logic here could be simplified a bit by setting model_S0 to None, changing it if required, and otherwise passing it on as None to TensorFit below.

Contributor Author

Could you explain how to simplify the if statements in the mask logic? I don't see how that would work: if model_S0 is None, it would either fail trying to reshape or return a large None array.

I do agree that the ifs around TensorFit can be removed, though, leaving return TensorFit(self, dti_params, model_S0=S0_params), as long as S0_params defaults to None.

Contributor

Yes - I would make S0_params an optional input to fit with a default of None. Does that make sense? But you're right that I don't see a way to simplify more.

```python
N = model_params.ndim
if type(index) is not tuple:
    index = (index,)
elif len(index) >= model_params.ndim:
    raise IndexError("IndexError: invalid index")
index = index + (slice(None),) * (N - len(index))
return type(self)(self.model, model_params[index])
if model_S0 is None:
```
Contributor

And the same here: if None was passed through, you can probably pass it on to the call to type(self)(self.model, ...

Contributor Author

But model_S0=model_S0[index_S0] would fail if model_S0 is None. Or are you suggesting just pass it along if it's None anyway on line 848?

Contributor

Something like:

```python
if model_S0 is not None:
    model_S0 = model_S0[index[:-1]]

return type(self)(self.model, model_params[index], model_S0=model_S0)
```
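Putting the whole suggestion together, a runnable sketch of that indexing logic (with illustrative names, outside any class) would be:

```python
import numpy as np

def slice_params(model_params, index, model_S0=None):
    # Pad the index with full slices so it covers every axis of the params.
    N = model_params.ndim
    if type(index) is not tuple:
        index = (index,)
    elif len(index) >= N:
        raise IndexError("invalid index")
    index = index + (slice(None),) * (N - len(index))
    if model_S0 is not None:
        # S0 has no trailing parameter axis, so drop the last index element.
        model_S0 = model_S0[index[:-1]]
    return model_params[index], model_S0
```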

```
@@ -1162,6 +1196,10 @@ def predict(self, gtab, S0=1., step=None):
        which a signal is to be predicted and $b$ is the b value provided in
        the GradientTable input for that direction
        """
        if S0 is None:
```
Contributor

This should be documented in the docstring. In particular, now that the default of 1 is no longer part of the function signature, it is worth mentioning that value specifically in the docstring for this function.
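A hedged sketch of the fallback behavior being discussed (a toy mono-exponential predict, not dipy's actual implementation):

```python
import numpy as np

def predict_signal(adc, bvals, S0=None, S0_hat=None):
    # If no S0 is given, fall back to the fitted S0_hat, else to 1.
    if S0 is None:
        S0 = S0_hat if S0_hat is not None else 1.0
    return S0 * np.exp(-bvals * adc)
```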

```
@@ -1242,22 +1283,41 @@ def wrapped_fit_tensor(design_matrix, data, step=step,
    size = np.prod(shape)
    step = int(step) or size
    if step >= size:
        return fit_tensor(design_matrix, data, *args, **kwargs)
    if return_S0_hat:
```
Contributor

Here as well, consider whether the logic could be simplified by passing along the default value in the case where return_S0_hat is set to False.

Contributor Author

You're right, this can be simplified. Though not anywhere else in this function as far as I can see.

```
@@ -631,6 +657,11 @@ def test_restore():
    tensor_model = dti.TensorModel(gtab, fit_method='restore', sigma=0.0001)
    tensor_model.fit(Y.copy())

    # Test return_S0_hat
    tensor_model = dti.TensorModel(gtab, fit_method='restore', sigma=0.0001, return_S0_hat=True)
```
Contributor

PEP8: Line's too long

@etpeterson
Contributor Author

I'll give profiling a shot. I'll try to do this PR with and without returning S0 and master. I'll also see if I can track memory in a meaningful way.

Regarding testing different structured inputs, is there a dataset that's already in dipy you recommend? Do you think it would be good to test on data without any b=0 at all?

@etpeterson
Contributor Author

Here are some times, timed with the version that's up right now. It's not taking much more time at all, but the memory size does increase because we need to hold on to the S0 data rather than throw it away.

| Version | timeit repetition time | pickled size |
| --- | --- | --- |
| master (no S0 logic) | 2.984 s | 1.5746 GB |
| this PR, S0 off | 3.033 s | 1.5746 GB |
| this PR, S0 on | 3.073 s | 1.6999 GB |

@coveralls

Coverage Status

Coverage decreased (-0.06%) to 88.339% when pulling de23cbb on etpeterson:dti_S0 into 7a89ef1 on nipy:master.

@etpeterson
Contributor Author

@arokem your comments sound good to me so I made and pushed the changes. Also, do you want more details about timing and profiling or is what I have above using timeit satisfactory?

@arokem
Contributor

arokem commented Jan 19, 2017

Code looks good to me. Regarding profiling: what data did you use to run the profiling? Assuming this is the difference on reasonably large whole-brain data, it seems to me that there wouldn't be harm in adding this code -- the difference seems minor. Any thoughts from anyone else?

@etpeterson
Contributor Author

Good to hear.

I used some 3T human brain data I have here that has multiple b-values so the S0 fitting is meaningful. It's 256x256x64 and 3 directions with 7 b-values. I did mask it but it's still fitting 905,768 voxels.

@etpeterson
Contributor Author

A somewhat related question: is Dipy participating in Google Summer of Code this year? I'm thinking about this PR and integrating it into IVIM and other IVIM improvements that could work as a project for GSoC. I think the first deadline is Feb 9th so it's already looming on the horizon.

@arokem
Contributor

arokem commented Jan 24, 2017

OK - I am happy with this. If no one else has any comments here, I will merge this in a couple of days.

Re: GSoC -- I will not be able to be involved this year. I am going to be on family leave during the summer 🍼 😄 !

@etpeterson
Contributor Author

Congrats on the 👶 ! That's a legit reason to skip it. We're expecting our first in February so I was figuring I'd at least be partly back in action by April in time for GSoC. But maybe less is more this year 😄

@RafaelNH
Contributor

I've just done a last look at the code - for me this is ready to go! Thanks @etpeterson for the really nice work!

@arokem arokem merged commit 94a4668 into dipy:master Jan 24, 2017
ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018