
Robust Shared Response Model #302

Merged
merged 33 commits into from Dec 12, 2017

Conversation

@TuKo (Contributor) commented Nov 23, 2017

The code introduces the Robust SRM, tests and an example with synthetic data.

@TuKo (Contributor, Author) commented Nov 23, 2017

@CameronTEllis and @hejiaz, would you mind reviewing this PR?

Thank you and happy thanksgiving!

@CameronTEllis (Contributor) left a comment

Overall it is looking good; my comments are relatively minor and mainly focused on fleshing out details.

"""Robust Shared Response Model (RSRM)
The implementation is based on the following publications:

@CameronTEllis (Contributor), Nov 24, 2017

It would be good to provide a paragraph explaining what RSRM offers and how it compares to SRM, SSRM, etc. You could copy and paste one of the paragraphs from the manuscript. You should also mention the influence of RPCA.

@TuKo (Author, Contributor), Nov 27, 2017

I added an extended description of the method. However, I don't think the relation to RPCA is relevant to the code/implementation. The idea is that people can understand the code and refer to the paper if something is unclear.

@CameronTEllis (Contributor), Dec 10, 2017

Fair enough

gamma : float, default: 1.0
Regularization parameter for the sparseness of the individual
components.

@CameronTEllis (Contributor), Nov 24, 2017

It would be good to add direction to this description (does higher mean more or less sparse?).

@TuKo (Author, Contributor), Nov 27, 2017

done

The shared response.
s_ : list of array, element i has shape=[voxels_i, samples]
The individual components with outlying data for each subject.

@CameronTEllis (Contributor), Nov 24, 2017

You use 'outlier' here and elsewhere but sometimes refer to this as the individual component. It is important to be consistent, so check that the same word is used everywhere.

@TuKo (Author, Contributor), Nov 27, 2017

Done

----
The number of voxels may be different between subjects. However, the
number of samples for the alignment data must be the same across

@CameronTEllis (Contributor), Nov 24, 2017

'samples' is not a typical way of describing timepoints in fMRI data. Consider changing this to say that 'timepoints must be aligned across participants'

@TuKo (Author, Contributor), Nov 27, 2017

Done (everywhere)

X : list of 2D arrays, element i has shape=[voxels_i, samples]
Each element in the list contains the fMRI data of one subject.
y : not used

@CameronTEllis (Contributor), Nov 24, 2017

What is this for then?

@TuKo (Author, Contributor), Nov 27, 2017

Removed.

Parameters
----------
X : array, shape=[voxels, samples]

@CameronTEllis (Contributor), Nov 24, 2017

Maybe use a different letter, or lower case, since this is no longer a list like it is in other functions. Same with S below.

@TuKo (Author, Contributor), Nov 27, 2017

Done

"""
A = X.dot(R.T)
A -= S.dot(R.T)
# The SVD solves the orthogonal Procrustes problem: the orthogonal
# transform closest to mapping the shared response R onto X - S
U, _, V = np.linalg.svd(A, full_matrices=False)

@CameronTEllis (Contributor), Nov 24, 2017

This is probably obvious but I am unsure why you are doing the SVD. A comment would help

@TuKo (Author, Contributor), Nov 27, 2017

Done
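As background for the thread above: the SVD in the excerpt is the standard solution to the orthogonal Procrustes problem. The sketch below uses illustrative variable names and sizes that mirror the excerpt; it is not the actual brainiak implementation.

```python
import numpy as np

# Synthetic setup: a known orthogonal mapping W_true from a shared
# response R to subject data X (sparse individual term S set to zero).
rng = np.random.RandomState(0)
voxels, features, timepoints = 50, 3, 40

R = rng.randn(features, timepoints)             # shared response
S = np.zeros((voxels, timepoints))              # sparse individual term
W_true, _ = np.linalg.qr(rng.randn(voxels, features))
X = W_true.dot(R) + S                           # synthetic subject data

# Orthogonal Procrustes: the W with orthonormal columns minimizing
# ||X - S - W R||_F is U.dot(V) from the SVD of (X - S).dot(R.T).
A = X.dot(R.T)
A -= S.dot(R.T)
U, _, V = np.linalg.svd(A, full_matrices=False)
W = U.dot(V)
```

With noiseless data the recovered W matches W_true; in the algorithm the same update is applied per subject at every iteration.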

" \n",
"S = [None] * subjects\n",
"for s in range(subjects):\n",
" S[s] = (np.random.rand(data[s].shape[0], samples) < p) * ((np.random.rand(data[s].shape[0], samples) * amplitude - amplitude/2) )\n",

@CameronTEllis (Contributor), Nov 24, 2017

Why use uniform noise? I would have thought Gaussian noise would be more reasonable.

@TuKo (Author, Contributor), Nov 27, 2017

S is not the noise but the sparse individual component. If we used a Normal distribution, S would mix with the shared signal, because a sum of Normal variables is also Normal. In that case you would have a difficult time identifying what comes from where (shared vs. individual). This is also a common assumption in RPCA.

@CameronTEllis (Contributor), Dec 10, 2017

Fine
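The generative choice being discussed can be sketched in a few lines. Parameter names (p, amplitude) follow the example notebook; the sizes and values here are illustrative.

```python
import numpy as np

# Sparse individual term: a Bernoulli mask times a centered uniform value.
# Most entries are exactly zero, so S is separable from the dense shared
# signal. A Gaussian S added to a Gaussian shared signal would sum to a
# Gaussian, making the two parts unidentifiable from the mixture.
rng = np.random.RandomState(42)
voxels, timepoints = 1000, 200
p, amplitude = 0.05, 8.0   # non-zero probability, approximate amplitude

mask = rng.rand(voxels, timepoints) < p
S = mask * (rng.rand(voxels, timepoints) * amplitude - amplitude / 2)

frac_nonzero = np.count_nonzero(S) / S.size   # close to p by construction
```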

"features = 3\n",
"snr = 20 # in dB\n",
"amplitude = 8.0 # approximate amplitude used for the values in Si\n",
"p = 0.05 # probability of being non-zero in Si"

@CameronTEllis (Contributor), Nov 24, 2017

How does this relate to the gamma value used later? Why are they not the same?

@TuKo (Author, Contributor), Nov 27, 2017

p is the probability of an element being non-zero in the original Si (drawn from the model).
gamma is selected so that the estimated Si has approximately the same number of non-zeros.
However, there is no reason for them to have the same value: one is a probability and the other is a balancing weight between regularization terms.

@CameronTEllis (Contributor), Dec 10, 2017

I might still be misunderstanding but shouldn't the non-zero elements of the original Si be thought of as the individual component that RSRM is aiming to extract? Why would there be a different number in the input vs the output?

@TuKo (Author, Contributor), Dec 11, 2017

The math does not translate 1:1. Remember that p is a probability: when you draw a signal, there are not exactly 5% non-zeros; the realized fraction lands somewhere near 0.05, and it differs across subjects. So every subject will have approximately that fraction, but not exactly. Therefore, even if you could obtain a 1:1 translation from p (range 0 to 1) to gamma (range 0 to +inf), the signal itself would not necessarily behave that way.

Also, this is one particular way of drawing the Si's; you could use other distributions as well. In that case you would have parameters to generate the signals and still use gamma to recover them. Keeping gamma and p separate is a good idea.

@CameronTEllis (Contributor), Dec 11, 2017

Okay fair enough I understand
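The point that p is only an expected fraction is easy to see numerically. This sketch (illustrative sizes) draws the non-zero mask for several subjects and collects the realized fractions:

```python
import numpy as np

# p sets only the expected fraction of non-zeros in each drawn Si;
# the realized fraction differs per draw and per subject, while gamma
# is a regularization weight on a different scale ([0, inf)).
rng = np.random.RandomState(7)
p, voxels, timepoints, subjects = 0.05, 500, 100, 5

fractions = [(rng.rand(voxels, timepoints) < p).mean()
             for _ in range(subjects)]
# Each entry is near 0.05 but generally not exactly 0.05.
```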

" n = noise_level * np.random.randn(voxels, samples)\n",
" data[s] += n\n",
" \n",
"S = [None] * subjects\n",

@CameronTEllis (Contributor), Nov 24, 2017

It might be good to make another example where there is no S variability and show that RSRM does as well as SRM.

@TuKo (Author, Contributor), Nov 27, 2017

The idea of the example is to show how to use it. I chose the synthetic example because it is easy to replicate and shows the improvement visually. If you set gamma to "infinity", you solve exactly the (deterministic) SRM.

@CameronTEllis (Contributor), Dec 10, 2017

Good
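The remark about setting gamma to "infinity" can be illustrated with the l1 shrinkage (soft-thresholding) update used for the individual terms; the threshold constant gamma/2 below is an assumption for illustration, not necessarily the exact constant in the implementation.

```python
import numpy as np

# Soft-thresholding (shrinkage) operator for an l1-regularized term.
# As gamma grows, the threshold swallows every entry of the residual,
# Si -> 0, and RSRM reduces to the deterministic SRM.
def shrink(A, thresh):
    return np.sign(A) * np.maximum(np.abs(A) - thresh, 0.0)

rng = np.random.RandomState(1)
residual = rng.randn(20, 30)   # stand-in for X - W.dot(R)

S_small_gamma = shrink(residual, 0.1 / 2)  # small gamma: keeps most entries
S_huge_gamma = shrink(residual, 1e6 / 2)   # "infinite" gamma: all zeros
```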

@TuKo TuKo force-pushed the TuKo:rsrm branch from 431c130 to 61b9d14 Nov 28, 2017
@TuKo (Contributor, Author) commented Dec 6, 2017

@CameronTEllis do you have any suggestion or comment for the newest version?

@TuKo (Contributor, Author) commented Dec 11, 2017

@CameronTEllis do you think there are more changes to be done? If we are done, please don't forget to approve the review; otherwise let me know.
Thanks!

@mihaic mihaic merged commit f708dfb into brainiak:master Dec 12, 2017
3 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
@buildbot-princeton linux Build finished.
@buildbot-princeton macos Build finished.
@TuKo TuKo deleted the TuKo:rsrm branch Dec 12, 2017