Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated how drift is calculated and how masking is done (Ready for review) #244

Merged
merged 53 commits into from Nov 10, 2017

Conversation

@CameronTEllis
Copy link
Contributor

@CameronTEllis CameronTEllis commented Jul 21, 2017

No description provided.

@mihaic mihaic requested a review from cbaldassano Jul 21, 2017
CameronTEllis added 13 commits Jul 22, 2017
…_sfnr and _calc_temporal_noise and then edited accordingly
…inuous template. Both are 3d (and made into 4d volumes where necessary)
@mihaic
Copy link
Contributor

@mihaic mihaic commented Jul 25, 2017

@CameronTEllis, if you plan to continue making changes to this PR, it would be useful to add a tag to the title, such as "WIP" (work in progress), to signal to the reviewers it is not ready for review. I will add this suggestion to CONTRIBUTING.rst.

@CameronTEllis CameronTEllis changed the title Updated how drift is calculated and how masking is done Updated how drift is calculated and how masking is done (Work in Progress) Jul 25, 2017
@CameronTEllis CameronTEllis changed the title Updated how drift is calculated and how masking is done (Work in Progress) Updated how drift is calculated and how masking is done (ready for review) Jul 28, 2017
@mihaic
Copy link
Contributor

@mihaic mihaic commented Aug 8, 2017

@CameronTEllis, feel free to contact @cbaldassano or ask others to review if you want this merged faster.

@CameronTEllis CameronTEllis changed the title Updated how drift is calculated and how masking is done (ready for review) Updated how drift is calculated and how masking is done (work in progress) Sep 18, 2017
…around SNR and SFNR. Made stimfunction and signal function more useable and extensible. Added more documentation
…th the new procedure for generating functions
… the tests done for calc_noise
@CameronTEllis CameronTEllis changed the title Updated how drift is calculated and how masking is done (work in progress) Updated how drift is calculated and how masking is done (Ready for review) Oct 22, 2017
@CameronTEllis
Copy link
Contributor Author

@CameronTEllis CameronTEllis commented Oct 22, 2017

@mihaic I am unsure what the error with Travis is. It says I am using the wrong version of Brew. Please advise the appropriate next steps.

Otherwise, this is ready for review e.g. @cbaldassano @lcnature

@mihaic
Copy link
Contributor

@mihaic mihaic commented Oct 23, 2017

@CameronTEllis, thanks for pointing out the Travis error. It not related to your PR. I am looking for a workaround (issue #274), but we can merge your PR irrespective of it, as soon as we get a review.

Copy link
Collaborator

@cbaldassano cbaldassano left a comment

I think my comments cover most of the main changes. I also looked at the notebook and that looks good to me. Congrats Cameron on finally getting this PR together!

""" Output an epoch file, necessary for some inputs into brainiak
This takes in the time course of stimulus events and outputs the epoch
file used in Brainiak.

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Could you say here what the definition of an epoch is in this context?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Firstly, Thank you so much for looking over this. Your comments are fantastic and much appreciated.

Secondly, I have added more description

How big is the response relative to the peak
# Cycle through the participants, different entries in the list
epoch_file = [0] * len(stimfunction)
for participant_counter in list(range(0, len(stimfunction))):

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

A bit confused by the syntax here and in other loops like one below, is this different than just:

for participant_counter in range(len(stimfunction)):

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Simplified syntax here and below

----------
multi dimensional array
The time course of the HRF convolved with the stimulus function.

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Is this right? It looks like this function is just the HRF itself.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good find, corrected

Parameters
----------
stimfunction : timepoint by timecourse array

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Does stimfunction have one number per TR, or is it at the temporal resolution specified in the function (e.g. 1000Hz)? If it is at a different temporal resolution than the HRF, I don't understand how you can convolve them.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Thanks for pointing this out, made simpler

float

What is the max activity measured here, a point of reference for masking
float


This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Weird character here, maybe a Windows carriage return

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

I wasn't able to find this issue- as you say, it might be Windows specific.


# Calculate the fwhm on a subset of volumes

if volume.shape[3] > 100:
# Take only 100 shuffled TRs
trs = np.arange(volume.shape[3])

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

You can replace these three lines with
trs = np.random.choice(volume.shape[3], size=100. replace=False)

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Great thank you

@@ -959,77 +1223,129 @@ def calc_noise(volume,
else:
trs = list(range(0, volume.shape[3]))

# Go through the trs and pull out the fwhm
# Go through the trs and pull out the fwhm and snr

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Confused by this comment - looks like snr is being computed after this loop, not just on the sampled trs

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Woops, didn't update from previous edits

# Since you are combining spatial and temporal noise, you need to
# subtract the variance of the two to get the spatial sd
if spatial_sd > temporal_sd:
spatial_sd = np.sqrt(spatial_sd ** 2 - temporal_sd ** 2)

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Is there a reference for this? It makes intuitive sense but it is not totally obvious to me, e.g. should there be a factor here for the fact that there are 3 spatial dims and only 1 temporal?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

No there is no reference here. This is part of the tool that I am least sure about. It was a very difficult problem to model noise that corresponded to the brain (things like autoregression and spatial noise) while independently manipulating system noise. The solution I had was to distinguish them as here but I am not confident this is the optimal way. I played with a number of alternative procedures and they all seemed sub optimal but I am open to suggestions.

@@ -1444,42 +1789,56 @@ def _generate_noise_temporal(stimfunction_tr,
# Finally, z score each voxel so things mix nicely
noise_temporal = stats.zscore(noise_temporal, 3)

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

zscoring when some variables have 0 variance may cause some warnings to be thrown - not necessarily a problem, but could be solved by using a masked array

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Yes good point. I will put this on the list for future things to fix

@@ -1524,48 +1882,77 @@ def mask_brain(volume,
zoom_factor = (brain_dim[0] / mask_dim[0],
brain_dim[1] / mask_dim[1],
brain_dim[2] / mask_dim[2],
1)
)

# Scale the mask according to the input brain
# You might get a warning but ignore it

This comment has been minimized.

@cbaldassano

cbaldassano Nov 6, 2017
Collaborator

Which part is throwing a warning? If it is not worth fixing, we should at least be more specific so that meaningful warnings don't get mistakenly ignored.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

The issue is when zoom factor values are non integers. I have checked that the functionality is still present even with the warnings so added a comment describing this and saying why they can be safely ignored

Copy link
Contributor

@lcnature lcnature left a comment

Hi @CameronTEllis Sorry for my delayed review.
All these are great works! Just a few comments for your consideration.

generate_stimfunction
Create a function to describe the stimulus onsets/durations
Create a timecourse of the signal activation. This can be specified using event

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

Maybe worth mentioning this function is before convolution with HRF, and has higher temporal resolution than TR.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Firstly, thank you so so much for all of these sophisticated and in depth comments. You really dived deep into the code and I appreciate the scrutiny greatly!

Secondly, good suggestion, added

double_gamma_hrf
Convolve the stimulus function with the HRF to model when events are expected.
export_epoch_file:

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

might be worth explaining what an epoch file is.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Added more detail

@@ -85,18 +107,17 @@ def _generate_feature(feature_type,
thickness=1):
"""Generate features corresponding to signal
Generate signal in specific regions of the brain with for a single
volume. This will then be convolved with the HRF across time
Generate a single feature, that can be inserted into the signal volume

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

It looks like the feature means uniform activation in a region of specified shape here. The word "feature" itself is a bit vague to me.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good idea, added detail

@@ -109,7 +130,7 @@ def _generate_feature(feature_type,
----------
3 dimensional array
The volume representing the signal to be outputed
The volume representing the signal
"""

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

Line 145 signal = np.ones(np.power(feature_size, 3)) to line 150 may be reduced to one command of np.ones. You can directly create a 3-d array with np.ones (by making the size a tuple or list). The same applies for other feature_types

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

Is it worth throwing some warning message to log files when loop reduces to a disk or when a cavity reduces to a sphere?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good suggestions, added both

idx_n = round(event_durations[onset_counter] * temporal_resolution)
stimfunction[onset_idx:offset_idx] = [weights[onset_counter]] * idx_n
idx = offset_idx - onset_idx
stimfunction[onset_idx:offset_idx, 0] = [weights[onset_counter]] * idx

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

I think the multiplication by idx might not be necessary.
stimfunction[onset_idx:offset_idx, 0] = weights[onset_counter] might just work

@@ -1189,6 +1510,7 @@ def _generate_noise_temporal_phys(timepoints,
----------
one dimensional array, float
Generates the physiological temporal noise timecourse
"""

noise_phys = [] # Preset

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

suggestion (not critical): random phases may be added to the respiratory radian and heart radian.
I am personally against z-scoring. Although it makes the sample variance equal across any time series generated, essentially every time you generate noise, a different coefficient is multiplied due to the normalization. So there is no guarantee that the sinusoidal component has the same magnitude across different samples drawn. But this is personal flavor.
I also wonder whether the addition of Gaussian noise here is necessary. Isn't Gaussian noise also added somewhere else?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Aha, good find, that was actually what I intended to do but did it wrong. In terms of z scoring, I do it in order to make the combination of noise types more tractable. When they are z scored I know how to mix their relative values in order to fit SFNR and SNR

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

For periodic signal, you might get away with analytically calculating the expected standard deviation with magnitude/sqrt(2) according to this wiki page:
https://en.wikipedia.org/wiki/Root_mean_square
That way you do not need to z-score. But of course there may be issue of aliasing. But I think as long as ratio between the frequency of breathing and the fMRI sampling frequency is not a regular number, this estimate should OK.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

I am not sure what the current issues are with zscoring: at the moment it doesn't really matter that I do it because already the mean is ~0 and std is ~1 but I could remove it if you think it is necessary

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

It is more for reproducibility concern - that you always have the same expected summary statistics for any given type of noise. (i.e., when a user say I simulated certain noise from certain generative model with certain parameters, the parameters that the user put in indeed governs how the samples drawn). I agree it is not a critical issue. So it is up to you.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

I think I will leave it as is but I appreciate the concern. Given that the critical output of this is the summed response of a respiratory and heart rate oscillation I think the code does what is necessary

Returns the FWHM of each TR in mm"""
Returns the FWHM of each TR in mm
"""

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

I am not familiar with the math of calculating fwhm. But I am a bit uncertain of the meaning of taking absolute value in Line 952 v_sum += np.abs(volume[x, y, z]). And also I do not understand the last three equations before return. Of course if you are confident this is correct, then you can ignore this comment.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

I am fairly confident that this is correct and I have checked it using external fwhm calculators

temporal_noise noise_dict entry. To change the relative mixing of the
noise components, change the sigma's specified below.
Generate the time course of the average brain voxel. To increase or
decrease the amount of total noise then change the SFNR noise_dict

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

it seems "then" can be replaced by "," instead. But actually are SFNR or noise_dict inputs to this function? Where should the user change them?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

You're right, this was confusing. THose values aren't inputs. I had that sentence there to say what you can/should do with the output of this function but I removed it for clarity

What is the full width half max of the gaussian fields being created
to model spatial noise.
motion_sigma : float

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

I suggest making the description of this type of noise a bit clearer. For example, making it clear that the noise is assumed to only occur during a task event and its magnitude is modulated by the weight of the task event time course. The users might not immediately understand why the variable name has a "motion" in it, and what "pre-processing" here means (aka., I am a bit confused by these :P).

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good suggestion, made the change

noise_system = _generate_noise_system(dimensions_tr=dimensions_tr)
# Calculate the sd that is necessary to be combined with itself in order
# to generate the temporal_sd
temporal_sd_element = np.sqrt(temporal_sd ** 2 / 2)

This comment has been minimized.

@lcnature

lcnature Nov 7, 2017
Contributor

There is no concrete suggestion in this comment. It is just suggestion for future consideration.
It seems intuitive here to divide half of the temporal noise variance to noise_temporal and half to the temporal part of noise_system if we do not have much information. But this division of half and half seems a bit arbitrary. Is it helpful to actually allow the user to specify the magnitudes of the noise_temporal and noise_system separately? Maybe very few users would have a good intuition of what number they should put in if they have a choice. For future development, I think it may be possible to actually estimate these magnitudes from data. I think if you specify the belief of the distribution of the two types of noise, there may be ways to estimate their magnitudes.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

This is an astute point to bring up. This is related to my last comment from Chris above. Basically I agree with you and think that what I am doing is definitely a massive simplification. Assuming equality between the system and the non-system noise is a big (fallacious assumption). This is something I would love to improve upon in the future

@mihaic
Copy link
Contributor

@mihaic mihaic commented Nov 8, 2017

@CameronTEllis, I assume you will need some time to address the helpful comments you received, so I am already preparing a new release in PR #295, without this code. Please let me know if I'm making a wrong assumption.

@CameronTEllis
Copy link
Contributor Author

@CameronTEllis CameronTEllis commented Nov 8, 2017

@mihaic mihaic mentioned this pull request Nov 8, 2017
@@ -180,6 +175,10 @@ def _generate_feature(feature_type,
# Subtract the two disks to get a loop
loop = outer != inner

# Check if the loop is a disk
if np.all(inner is False):
logger.warn('Loop feature is actually a disk')

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

Should this be logger.warning or logger.warn?
I would make the message something like "Loop feature reduces to a disk because the inner radius is too small"
Similarly for the others.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Great suggestions

file used in Brainiak. The epoch file is a way to structure the timing
information in fMRI that allows you to flexibly input different stimulus
sequences. This is a list with each entry a 3d matrix corresponding to a
participant. The dimensions condition by epoch by time

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

The last sentence may be changed to "The dimensions of the 3d matrix are ..."

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good suggestion

different condition. For this to be able to partition the start and
end of an epoch it is necessary that there is a change in value in
the function between the two epochs: if they are coded with the same
and weight and there is no time between blocks then this won't be

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

The "and" may be redundant. Please double-check the grammar.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Good suggestion, made this section more clear

# Down sample the stim function
stride = tr_duration * temporal_resolution
stimfunction_temp = stimfunction_ppt[:, condition_counter]
stimfunction_temp = stimfunction_temp[::int(stride)]

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

I actually don't think this is a problem. A 0.01 s (I assume you meant s instead of ms) stimulus would predict an HRF almost indistinguishable from that of a 1 s stimulus multiplied by 1/100 because of the slow HRF (you can simulate this with your own functions). Such small a magnitude wouldn't influence GLM result in general. But if a 0.5 s stimulus get undetected, then there may be a problem. This might be left for future due to time pressure. But I highly suggest fixing this.

Takes in a string describing the hrf that ought to be created.
Can instead take in a vector describing the HRF as it was
specified by any function
hrf_type : str or list

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

I remember there may be some requirement of brainiak code formatting that default values should be explicitly documented. But for this case, I suggest at least documenting the possible values that can be accepted. For example, which strings are acceptable?

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Added some details

@@ -1476,7 +1496,7 @@ def _generate_noise_temporal_autoregression(timepoints,
random = np.random.normal(0, 1)
temp.append(past_trs * past_reg + random)

noise_autoregression.append(np.mean(temp))
noise_autoregression.append(np.mean(temp))

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

I still think that there is some issue here by adding Gaussian noise multiple times and taking the mean of temp. I think if you generate AR(2) noise with the current equation and recover the rhos with functions in nitime package, you might find both the rhos and variance being half of the true values.
It should be done by summing the product of the past signals and their corresponding rho coefficients, and adding only the Gaussian noise only once, but no averaging. The way to generate first p (p being the order) sample is more complex for p>1.
I think this document is correct that you can refer to:
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0ahUKEwjZ7bu-yLDXAhUCziYKHUwxAaQQFghAMAQ&url=http%3A%2F%2Fwww.stat.washington.edu%2Fcourses%2Fstat520%2Fwinter99%2Far-recipe%2Far-recipe.ps&usg=AOvVaw2wGB5KOz-LlYyNxTjLitFy

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 9, 2017
Author Contributor

Aha good find, I now see the problem. I did some texting with the updated code and it seems improved

This comment has been minimized.

@lcnature

lcnature Nov 10, 2017
Contributor

@CameronTEllis One thing I noticed is the default rho is 1. You probably do not want this default value because you will end up generating Brownian motion with the variance ever growing. Some number smaller than 1 is perhaps more sensible. Please also keep in mind the issue with AR(p) and fix it in future.

This comment has been minimized.

@mihaic

mihaic Nov 10, 2017
Contributor

@CameronTEllis, this seems to be the last sticking point.

This comment has been minimized.

@CameronTEllis

CameronTEllis Nov 10, 2017
Author Contributor

Changed the value to 0.5. It is a good compromise to avoid drift while still retaining autoregressive properties.

@@ -1189,6 +1510,7 @@ def _generate_noise_temporal_phys(timepoints,
----------
one dimensional array, float
Generates the physiological temporal noise timecourse
"""

noise_phys = [] # Preset

This comment has been minimized.

@lcnature

lcnature Nov 9, 2017
Contributor

For periodic signal, you might get away with analytically calculating the expected standard deviation with magnitude/sqrt(2) according to this wiki page:
https://en.wikipedia.org/wiki/Root_mean_square
That way you do not need to z-score. But of course there may be issue of aliasing. But I think as long as ratio between the frequency of breathing and the fMRI sampling frequency is not a regular number, this estimate should OK.

@CameronTEllis
Copy link
Contributor Author

@CameronTEllis CameronTEllis commented Nov 10, 2017

@lcnature
Copy link
Contributor

@lcnature lcnature commented Nov 10, 2017

@mihaic Given the time pressure, I think it is fine to leave the code as is for the concerns I have raised and in the expectation that @CameronTEllis will improve it in future PR.

@mihaic
Copy link
Contributor

@mihaic mihaic commented Nov 10, 2017

@lcnature, thanks for your input. @cbaldassano, you also requested changes; what do you think about the code?

Copy link
Collaborator

@cbaldassano cbaldassano left a comment

All my major concerns have been addressed, I'm happy to approve the current version of the PR.

@mihaic
Copy link
Contributor

@mihaic mihaic commented Nov 10, 2017

@CameronTEllis, if you are done, push the "Update branch" button the PR website.

@mihaic mihaic merged commit 74dae88 into brainiak:master Nov 10, 2017
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
linux Build finished.
Details
macos Build finished.
Details
@mihaic
Copy link
Contributor

@mihaic mihaic commented Nov 10, 2017

Thanks again, everyone. Please create issues based on the review comments.

@CameronTEllis CameronTEllis deleted the CameronTEllis:fmrisim_mean_mask branch Nov 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.