{{ message }}

# Adding ISC and ISFC #183

Merged
merged 13 commits into from Feb 24, 2017
Merged

merged 13 commits into from Feb 24, 2017

## Conversation

### cbaldassano commented Feb 21, 2017

 I've added some simple functions for computing ISC and ISFC, taking advantage of the optimized correlation engine from FCMA. I also wrote a utility function for loading a set of masked subject data for use with these functions. Per the contributing doc, I've put these functions in a single file rather than a submodule.
added 4 commits Feb 20, 2017
 Adding isfc functions 
 003cd8f 
 Full ISFC code prior to pr checks 
 d0bd0c4 
 Full isfc commit 
 10ec572 
 Adding script to download example data 
 298001e 

### mihaic commented Feb 21, 2017

 The code looks great! I will make some suggestions to improve readability.
requested changes
 group = np.mean(D[:, :, np.arange(n_subj) != loo_subj], axis=2) subj = D[:, :, loo_subj] for v in range(n_vox): ISC[v, loo_subj] = stats.pearsonr(group[v, :], subj[v, :])[0]

#### yidawang Feb 21, 2017 Member

Why don't you use fcma.util.compute_correlation here as well? If it is because of the data layouts, you can simply convert them accordingly. I believe even adding the layout conversion time up, the overall running time should outperform scipy.stats.pearsonr.

#### mihaic Feb 22, 2017 Contributor

@yidawang, is this what you had in mind? I though you meant it would be better to make single call to compute_correlation per subject.

#### cbaldassano Feb 22, 2017 Author Collaborator

I don't think that is possible (@yidawang, correct me if I'm wrong) - compute_correlation computes correlations between all pairs of vectors, while for ISC we only need correlations between corresponding pairs (i.e. the diagonal of what compute_correlation returns).

#### yidawang Feb 23, 2017 Member

If this is the case, calling scipy.stats.pearsonr is fine with me.

#### mihaic Feb 23, 2017 Contributor

Then it is probably best to revert to the original code, for clarity.

#### yidawang Feb 23, 2017 Member

Yes, reverting it would be better in terms of performance as well. Sorry for the confusion, I didn't fully get what you are doing in the beginning. @cbaldassano

#### cbaldassano Feb 24, 2017 Author Collaborator

I've just tested this, and yes it is much faster to use pearsonr in this case, so I've reverted the change.

requested changes
 Parameters ---------- data_files : list of filenames of subject nii files

#### mihaic Feb 21, 2017 Contributor

Please document the required data shape.

#### cbaldassano Feb 22, 2017 Author Collaborator

This is just a base python list, so it is 1D

#### mihaic Feb 22, 2017 Contributor

That is clear, but what is the expected shape of the data inside the nii files?

#### cbaldassano Feb 22, 2017 Author Collaborator

Ah sorry, I've clarified in the comments now

 ------- (D, coords) : tuple of voxel by time by subject ndarray, and x,y,x (as provided by nibabel) coordinates of voxel locations (as a tuple of ndarrays

#### mihaic Feb 21, 2017 Contributor

I think the coords type and meaning could be clearer. Please document D and coords separately, using the expected style, i.e., one line for the type and separate text for the meaning.

 n_vox = D.shape[0] n_subj = D.shape[2] ISFC = np.zeros((n_vox, n_vox, n_subj)) for loo_subj in range(D.shape[2]):

#### mihaic Feb 21, 2017 Contributor

Is loo an abbreviation for index? If so, could you please replace it with a more common abbreviation, e.g., subj_idx?

#### cbaldassano Feb 22, 2017 Author Collaborator

loo is short for leave-one-out, which I've now clarified in a comment

 subj = D[:, :, loo_subj] ISFC[:, :, loo_subj] = compute_correlation(group, subj) ISFC[:, :, loo_subj] = (ISFC[:, :, loo_subj] + ISFC[:, :, loo_subj].T) / 2

#### mihaic Feb 21, 2017 Contributor

 import numpy as np from matplotlib import pyplot as plt from scipy.cluster.hierarchy import fcluster, linkage import sys, os

#### mihaic Feb 21, 2017 Contributor

Imports should be on separate lines.

 ISC_vol = np.zeros(brain_nii.shape) ISC_vol[coords] = ISC ISC_nifti = nib.Nifti1Image(ISC_vol, brain_nii.get_affine(), brain_nii.header)

#### mihaic Feb 21, 2017 Contributor

coords_ISFC is not used.

 z = fcluster(Z, 2, criterion='maxclust') clust_inds = np.argsort(z) plt.imshow(ISFC[np.ix_(clust_inds,clust_inds)])

#### mihaic Feb 21, 2017 Contributor

Could you please add a comment explaining briefly what the image shows?

 Functions for computing intersubject correlation (ISC) and intersubject functional correlation (ISFC), and utility functions for loading subject data files and masks """

#### mihaic Feb 21, 2017 Contributor

 ISFC = brainiak.isfc.isfc(D) gt = \

#### mihaic Feb 21, 2017 Contributor

ground_truth?

added 2 commits Feb 22, 2017
 Minor changes in response to comments 
 931b0a0 
 Updated description of nii files 
 a04f2f7 

### cbaldassano commented Feb 22, 2017 • edited

 Hmm, the macos build is failing but it is working fine on my laptop. If I click Details, I get a permissions error "chrisb is missing the Overall/Read permission" so I'm not able to see what is going wrong.
added 2 commits Feb 22, 2017
 Fixed bug in example 
 3d62874 
 Fixed bug in calling compute_correlation 
 c811e7c 

### mihaic commented Feb 22, 2017

 As the commiter, you should have received an email from Jenkins with the output. (The Princeton Jenkins administrator does not have a solution for allowing the public to view the output yet.) The problem is your usage of compute_correlation. You pass 1D ndarrays and it expects 2D ones: brainiak/isfc.py:68: in isc ISC[v, loo_subj] = compute_correlation(group[v, :], subj[v, :]) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ matrix1 = array([-0.29038674, -0.5220086 , 0.23681971, 0.08854008, -0.41586313], dtype=float32) matrix2 = array([-0.36225432, -0.43482456, 0.26723158, 0.16461712, -0.37991464], dtype=float32) def compute_correlation(matrix1, matrix2): """compute correlation between two sets of variables Correlate the rows of matrix1 with the rows of matrix2. If matrix1 == matrix2, it is auto-correlation computation resulting in a symmetric correlation matrix. The number of columns MUST agree between set1 and set2. The correlation being computed here is the Pearson's correlation coefficient, which can be expressed as .. math:: corr(X, Y) = \\frac{cov(X, Y)}{\\sigma_X\\sigma_Y} where cov(X, Y) is the covariance of variable X and Y, and .. math:: \\sigma_X is the standard deviation of variable X Reducing the correlation computation to matrix multiplication and using BLAS GEMM API wrapped by Scipy can speedup the numpy built-in correlation computation (numpy.corrcoef) by one order of magnitude .. math:: corr(X, Y) &= \\frac{\\sum\\limits_{i=1}^n (x_i-\\bar{x})(y_i-\\bar{y})}{(n-1) \\sqrt{\\frac{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}{n-1}} \\sqrt{\\frac{\\sum\\limits_{j=1}^{n} y_j^2-n\\bar{y}}{n-1}}}\\\\ &= \\sum\\limits_{i=1}^n(\\frac{(x_i-\\bar{x})} {\\sqrt{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}} \\frac{(y_i-\\bar{y})}{\\sqrt{\\sum\\limits_{j=1}^n y_j^2-n\\bar{y}}}) Parameters ---------- matrix1: 2D array in shape [r1, c] MUST be continuous and row-major matrix2: 2D array in shape [r2, c] MUST be continuous and row-major Returns ------- corr_data: 2D array in shape [r1, r2] continuous and row-major in np.float32 """ matrix1 = matrix1.astype(np.float32) matrix2 = matrix2.astype(np.float32) > [r1, d1] = matrix1.shape E ValueError: not enough values to unpack (expected 2, got 1) 

### cbaldassano commented Feb 22, 2017

 I fixed the compute_correlation issue in the most recent commit c811e7c but the builds are still failing. I've never gotten email from Jenkins, is there a way to see what email it is sending to? My local git email might not be set correctly?

### mihaic commented Feb 22, 2017

 The email address is extracted from the commit, so you should make sure it is valid. The error now is: flake8 --config setup.cfg brainiak brainiak/isfc.py:70:58: E231 missing whitespace after ',' 
 Fixed whitespace 
 eec71b2 

### cbaldassano commented Feb 22, 2017

 Ok, fixed - forgot that run-tests didn't also run the static code checker. I have my local email set up correctly now, so hopefully Jenkins will send me emails now.
added 3 commits Feb 22, 2017
 Fixed whitespace 
 3d4608f 
 Merge branch 'master' of https://github.com/cbaldassano/brainiak 
 0119c77 
 Reverting correlation computation to use scipy 
 7ed3f65 

### cbaldassano commented Feb 24, 2017

 During the last commit one of the Travis builds ended in an error due to a timeout on the server - hopefully it goes through successfully this time, otherwise it might need to manually re-run in some way.
approved these changes
approved these changes

### mihaic commented Feb 24, 2017

 @cbaldassano, I am ready to merge this. However, we need to update it with the latest commits from master. Usually, I would just use the "Update branch" button on GitHub to update it myself, but I noticed you are using your master branch instead of a branch specifically created for this PR. Can you please do the update?
 Merge pull request #1 from IntelPNI/master 
 9a2501b 
Bringing fork up to date with master

### cbaldassano commented Feb 24, 2017

 OK I think my fork is now up to date and should be mergeable by @mihaic
merged commit 7940c62 into brainiak:master Feb 24, 2017
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
linux Build finished.
Details
macos Build finished.
Details
pushed a commit that referenced this pull request Nov 16, 2017
 Merge pull request #183 from amplab/search_paths 
 bcbebf4 
Search PATH in addition to current directory for executables
pushed a commit that referenced this pull request Nov 16, 2017
 Make numpy arrays immutable (#183) 
 cac473b 
* Make numpy arrays immutable in numbuf

* Move break statement outside of brackets

* Simplify test case

* Simplify test case