
Robust Shared Response Model #302

Merged
merged 33 commits into brainiak:master from TuKo:rsrm Dec 12, 2017

Conversation

3 participants
@TuKo
Contributor

TuKo commented Nov 23, 2017

The code introduces the Robust SRM, tests and an example with synthetic data.

@TuKo TuKo added the enhancement label Nov 23, 2017

@TuKo

Contributor

TuKo commented Nov 23, 2017

@CameronTEllis and @hejiaz, would you mind reviewing this PR?

Thank you, and happy Thanksgiving!

@CameronTEllis

Overall it is looking good; my comments are relatively minor and mainly focused on fleshing out the details.

"""Robust Shared Response Model (RSRM)
The implementation is based on the following publications:

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

It would be good to provide a paragraph explaining what RSRM offers and how it compares to SRM, SSRM, etc. You could copy and paste one of the paragraphs from the manuscript. It should also mention the influence of RPCA.

@TuKo

TuKo Nov 27, 2017

Contributor

I added an extended description of the method. However, I don't think the relation to RPCA is relevant to the code/implementation. The idea is that people can understand the code and refer to the paper if something is unclear.

@CameronTEllis

CameronTEllis Dec 10, 2017

Contributor

Fair enough

gamma : float, default: 1.0
Regularization parameter for the sparseness of the individual
components.

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

Would be good to add direction to this description (Is high more or less sparse?)

@TuKo

TuKo Nov 27, 2017

Contributor

done

The shared response.
s_ : list of array, element i has shape=[voxels_i, samples]
The individual components with outlying data for each subject.

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

You use "outlier" here and elsewhere, but sometimes refer to this as the individual component. It is important to be consistent, so check that the same word is used everywhere.

@TuKo

TuKo Nov 27, 2017

Contributor

Done

----
The number of voxels may be different between subjects. However, the
number of samples for the alignment data must be the same across

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

'samples' is not a typical way of describing timepoints in fMRI data. Consider changing this to say that 'timepoints must be aligned across participants'

@TuKo

TuKo Nov 27, 2017

Contributor

Done (everywhere)

X : list of 2D arrays, element i has shape=[voxels_i, samples]
Each element in the list contains the fMRI data of one subject.
y : not used

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

What is this for then?

@TuKo

TuKo Nov 27, 2017

Contributor

Removed.

Parameters
----------
X : array, shape=[voxels, samples]

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

Maybe use a different letter, or lower case, since this is no longer a list like it is in the other functions. Same with S below.

@TuKo

TuKo Nov 27, 2017

Contributor

Done

"""
A = X.dot(R.T)
A -= S.dot(R.T)
U, _, V = np.linalg.svd(A, full_matrices=False)

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

This is probably obvious but I am unsure why you are doing the SVD. A comment would help

@TuKo

TuKo Nov 27, 2017

Contributor

Done
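For readers outside the thread: the SVD in the snippet above solves an orthogonal Procrustes problem. Given the shared response R and the individual term S, the column-orthogonal W that best maps R onto X - S is U V from the SVD of (X - S) Rᵀ. A minimal numpy sketch of this step, with made-up shapes rather than the PR's actual variables:

```python
import numpy as np

rng = np.random.RandomState(0)
voxels, features, samples = 50, 3, 40
X = rng.randn(voxels, samples)    # one subject's data (illustrative)
S = rng.randn(voxels, samples)    # individual term (illustrative)
R = rng.randn(features, samples)  # shared response (illustrative)

# W = U V minimizes ||X - S - W R||_F over column-orthogonal W
A = (X - S).dot(R.T)
U, _, V = np.linalg.svd(A, full_matrices=False)
W = U.dot(V)
```

Note that `np.linalg.svd` returns the transpose of the right singular vectors, so `U.dot(V)` is the classical U Vᵀ Procrustes solution, and W has orthonormal columns by construction.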

" \n",
"S = [None] * subjects\n",
"for s in range(subjects):\n",
" S[s] = (np.random.rand(data[s].shape[0], samples) < p) * ((np.random.rand(data[s].shape[0], samples) * amplitude - amplitude/2) )\n",

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

Why use uniform noise? I would have thought gaussian noise would be more reasonable?

@TuKo

TuKo Nov 27, 2017

Contributor

S is not the noise but the sparse individual component. If we used a Normal distribution, S would mix with the shared signal, because sums of Normal distributions are also Normal distributions. In that case you would have a difficult time identifying what comes from where (shared vs. individual). This is also a common assumption in RPCA.
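The construction being discussed can be seen in a small sketch: the individual term is a Bernoulli mask times uniform values, so most entries are exactly zero and the non-zero fraction hovers around p. The shapes and constants below mirror the example notebook but are otherwise illustrative:

```python
import numpy as np

rng = np.random.RandomState(0)
voxels, samples = 100, 200
p, amplitude = 0.05, 8.0  # as in the example notebook

# Bernoulli(p) mask times uniform values in (-amplitude/2, amplitude/2):
# most entries are exactly zero, and the non-zero values are spread out
# rather than clustered near zero as Gaussian draws would be.
mask = rng.rand(voxels, samples) < p
S = mask * (rng.rand(voxels, samples) * amplitude - amplitude / 2)

sparsity = np.count_nonzero(S) / S.size  # roughly p
```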

@CameronTEllis

CameronTEllis Dec 10, 2017

Contributor

Fine

"features = 3\n",
"snr = 20 # in dB\n",
"amplitude = 8.0 # approximate amplitude used for the values in Si\n",
"p = 0.05 # probability of being non-zero in Si"

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

How does this relate to the gamma value used later? Why are they not the same?

@TuKo

TuKo Nov 27, 2017

Contributor

p is the probability of an element being non-zero in the original Si (drawn from the model).
gamma is selected so that the estimated Si has approximately the same number of non-zeros.
However, there is no reason for them to have the same values: one is a probability and the other is a balancing weight between regularization terms.

@CameronTEllis

CameronTEllis Dec 10, 2017

Contributor

I might still be misunderstanding but shouldn't the non-zero elements of the original Si be thought of as the individual component that RSRM is aiming to extract? Why would there be a different number in the input vs the output?

@TuKo

TuKo Dec 11, 2017

Contributor

The math does not translate 1:1. Remember that p is a probability: when you draw a signal there will not be exactly 5% non-zeros, only something near 0.05, and the realized fraction will differ across subjects. So even if you could map p (range 0 to 1) to gamma (range 0 to +inf) 1:1, the signal itself would not necessarily behave that way.

Also, this is just one particular way of drawing the Si's, and you can swap in other distributions as well. In that case you would have parameters to generate the signals and would still need gamma to recover them. Keeping gamma and p separate is a good idea.
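A quick illustration of the realized vs. nominal sparsity point: the non-zero fraction of each draw only approximates p, and differs from draw to draw. The subject count and shapes here are made up for illustration:

```python
import numpy as np

rng = np.random.RandomState(1)
p = 0.05  # nominal non-zero probability

# Five hypothetical subjects: each realized non-zero fraction is only
# near p, and the subjects do not all get the same count.
fractions = [float((rng.rand(100, 200) < p).mean()) for _ in range(5)]
```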

@CameronTEllis

CameronTEllis Dec 11, 2017

Contributor

Okay fair enough I understand

" n = noise_level * np.random.randn(voxels, samples)\n",
" data[s] += n\n",
" \n",
"S = [None] * subjects\n",

@CameronTEllis

CameronTEllis Nov 24, 2017

Contributor

Might be good to make another example where there is no S variability and show that RSRM does as well as SRM

@TuKo

TuKo Nov 27, 2017

Contributor

The idea of the example is to show how to use the method. I chose the synthetic example because it is easy to replicate and visually shows the improvements. If you set gamma to "infinity", you will be solving exactly the (deterministic) SRM.
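A hedged sketch of why a huge gamma recovers plain SRM: an l1 penalty on Si is typically minimized by soft-thresholding the residual, and as the threshold (which grows with gamma) goes to infinity, every entry of Si is shrunk to zero, leaving only the shared term. The threshold values below are illustrative, not the constants used in the PR's implementation:

```python
import numpy as np

def shrink(v, threshold):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)

residual = np.array([-3.0, -0.2, 0.0, 0.4, 2.5])

moderate = shrink(residual, 0.5)  # keeps only the large entries
extreme = shrink(residual, 1e6)   # gamma -> "infinity": Si vanishes
```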

@CameronTEllis

CameronTEllis Dec 10, 2017

Contributor

Good

@TuKo

Contributor

TuKo commented Dec 6, 2017

@CameronTEllis do you have any suggestion or comment for the newest version?

@TuKo

Contributor

TuKo commented Dec 11, 2017

@CameronTEllis do you think any more changes are needed? If we are done, don't forget to approve the review; otherwise, let me know.
Thanks!

@mihaic mihaic merged commit f708dfb into brainiak:master Dec 12, 2017

3 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
linux Build finished.
macos Build finished.

@TuKo TuKo deleted the TuKo:rsrm branch Dec 12, 2017
