New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding ISC and ISFC #183

Merged
merged 13 commits into from Feb 24, 2017

Conversation

Projects
None yet
3 participants
@cbaldassano
Collaborator

cbaldassano commented Feb 21, 2017

I've added some simple functions for computing ISC and ISFC, taking advantage of the optimized correlation engine from FCMA. I also wrote a utility function for loading a set of masked subject data for use with these functions. Per the contributing doc, I've put these functions in a single file rather than a submodule.

Christopher Baldassano added some commits Feb 20, 2017

@mihaic

This comment has been minimized.

Show comment
Hide comment
@mihaic

mihaic Feb 21, 2017

Contributor

The code looks great! I will make some suggestions to improve readability.

Contributor

mihaic commented Feb 21, 2017

The code looks great! I will make some suggestions to improve readability.

group = np.mean(D[:, :, np.arange(n_subj) != loo_subj], axis=2)
subj = D[:, :, loo_subj]
for v in range(n_vox):
ISC[v, loo_subj] = stats.pearsonr(group[v, :], subj[v, :])[0]

This comment has been minimized.

@yidawang

yidawang Feb 21, 2017

Contributor

Why don't you use fcma.util.compute_correlation here as well? If it is because of the data layouts, you can simply convert them accordingly. I believe even adding the layout conversion time up, the overall running time should outperform scipy.stats.pearsonr.

@yidawang

yidawang Feb 21, 2017

Contributor

Why don't you use fcma.util.compute_correlation here as well? If it is because of the data layouts, you can simply convert them accordingly. I believe even adding the layout conversion time up, the overall running time should outperform scipy.stats.pearsonr.

This comment has been minimized.

@mihaic

mihaic Feb 22, 2017

Contributor

@yidawang, is this what you had in mind? I though you meant it would be better to make single call to compute_correlation per subject.

@mihaic

mihaic Feb 22, 2017

Contributor

@yidawang, is this what you had in mind? I though you meant it would be better to make single call to compute_correlation per subject.

This comment has been minimized.

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

I don't think that is possible (@yidawang, correct me if I'm wrong) - compute_correlation computes correlations between all pairs of vectors, while for ISC we only need correlations between corresponding pairs (i.e. the diagonal of what compute_correlation returns).

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

I don't think that is possible (@yidawang, correct me if I'm wrong) - compute_correlation computes correlations between all pairs of vectors, while for ISC we only need correlations between corresponding pairs (i.e. the diagonal of what compute_correlation returns).

This comment has been minimized.

@yidawang

yidawang Feb 23, 2017

Contributor

If this is the case, calling scipy.stats.pearsonr is fine with me.

@yidawang

yidawang Feb 23, 2017

Contributor

If this is the case, calling scipy.stats.pearsonr is fine with me.

This comment has been minimized.

@mihaic

mihaic Feb 23, 2017

Contributor

Then it is probably best to revert to the original code, for clarity.

@mihaic

mihaic Feb 23, 2017

Contributor

Then it is probably best to revert to the original code, for clarity.

This comment has been minimized.

@yidawang

yidawang Feb 23, 2017

Contributor

Yes, reverting it would be better in terms of performance as well. Sorry for the confusion, I didn't fully get what you are doing in the beginning. @cbaldassano

@yidawang

yidawang Feb 23, 2017

Contributor

Yes, reverting it would be better in terms of performance as well. Sorry for the confusion, I didn't fully get what you are doing in the beginning. @cbaldassano

This comment has been minimized.

@cbaldassano

cbaldassano Feb 24, 2017

Collaborator

I've just tested this, and yes it is much faster to use pearsonr in this case, so I've reverted the change.

@cbaldassano

cbaldassano Feb 24, 2017

Collaborator

I've just tested this, and yes it is much faster to use pearsonr in this case, so I've reverted the change.

Parameters
----------
data_files : list of filenames of subject nii files

This comment has been minimized.

@mihaic

mihaic Feb 21, 2017

Contributor

Please document the required data shape.

@mihaic

mihaic Feb 21, 2017

Contributor

Please document the required data shape.

This comment has been minimized.

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

This is just a base python list, so it is 1D

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

This is just a base python list, so it is 1D

This comment has been minimized.

@mihaic

mihaic Feb 22, 2017

Contributor

That is clear, but what is the expected shape of the data inside the nii files?

@mihaic

mihaic Feb 22, 2017

Contributor

That is clear, but what is the expected shape of the data inside the nii files?

This comment has been minimized.

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

Ah sorry, I've clarified in the comments now

@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

Ah sorry, I've clarified in the comments now

Show outdated Hide outdated brainiak/isfc.py Outdated
Show outdated Hide outdated brainiak/isfc.py Outdated
Show outdated Hide outdated brainiak/isfc.py Outdated
Show outdated Hide outdated examples/isfc/isfc.py Outdated
Show outdated Hide outdated examples/isfc/isfc.py Outdated
Show outdated Hide outdated examples/isfc/isfc.py Outdated
Show outdated Hide outdated examples/isfc/isfc.py Outdated
Show outdated Hide outdated brainiak/isfc.py Outdated
Show outdated Hide outdated tests/isfc/test_isfc.py Outdated

Christopher Baldassano added some commits Feb 22, 2017

@cbaldassano

This comment has been minimized.

Show comment
Hide comment
@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

Hmm, the macos build is failing but it is working fine on my laptop. If I click Details, I get a permissions error "chrisb is missing the Overall/Read permission" so I'm not able to see what is going wrong.

Collaborator

cbaldassano commented Feb 22, 2017

Hmm, the macos build is failing but it is working fine on my laptop. If I click Details, I get a permissions error "chrisb is missing the Overall/Read permission" so I'm not able to see what is going wrong.

Christopher Baldassano added some commits Feb 22, 2017

Christopher Baldassano Christopher Baldassano
@mihaic

This comment has been minimized.

Show comment
Hide comment
@mihaic

mihaic Feb 22, 2017

Contributor

As the commiter, you should have received an email from Jenkins with the output. (The Princeton Jenkins administrator does not have a solution for allowing the public to view the output yet.)

The problem is your usage of compute_correlation. You pass 1D ndarrays and it expects 2D ones:

brainiak/isfc.py:68: in isc
    ISC[v, loo_subj] = compute_correlation(group[v, :], subj[v, :])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

matrix1 = array([-0.29038674, -0.5220086 ,  0.23681971,  0.08854008, -0.41586313], dtype=float32)
matrix2 = array([-0.36225432, -0.43482456,  0.26723158,  0.16461712, -0.37991464], dtype=float32)

    def compute_correlation(matrix1, matrix2):
        """compute correlation between two sets of variables
    
        Correlate the rows of matrix1 with the rows of matrix2.
        If matrix1 == matrix2, it is auto-correlation computation
        resulting in a symmetric correlation matrix.
        The number of columns MUST agree between set1 and set2.
        The correlation being computed here is
        the Pearson's correlation coefficient, which can be expressed as
    
        .. math:: corr(X, Y) = \\frac{cov(X, Y)}{\\sigma_X\\sigma_Y}
    
        where cov(X, Y) is the covariance of variable X and Y, and
    
        .. math:: \\sigma_X
    
        is the standard deviation of variable X
    
        Reducing the correlation computation to matrix multiplication
        and using BLAS GEMM API wrapped by Scipy can speedup the numpy built-in
        correlation computation (numpy.corrcoef) by one order of magnitude
    
        .. math::
            corr(X, Y)
            &= \\frac{\\sum\\limits_{i=1}^n (x_i-\\bar{x})(y_i-\\bar{y})}{(n-1)
            \\sqrt{\\frac{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}{n-1}}
            \\sqrt{\\frac{\\sum\\limits_{j=1}^{n} y_j^2-n\\bar{y}}{n-1}}}\\\\
            &= \\sum\\limits_{i=1}^n(\\frac{(x_i-\\bar{x})}
            {\\sqrt{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}}
            \\frac{(y_i-\\bar{y})}{\\sqrt{\\sum\\limits_{j=1}^n y_j^2-n\\bar{y}}})
    
        Parameters
        ----------
        matrix1: 2D array in shape [r1, c]
            MUST be continuous and row-major
    
        matrix2: 2D array in shape [r2, c]
            MUST be continuous and row-major
    
        Returns
        -------
        corr_data: 2D array in shape [r1, r2]
            continuous and row-major in np.float32
        """
        matrix1 = matrix1.astype(np.float32)
        matrix2 = matrix2.astype(np.float32)
>       [r1, d1] = matrix1.shape
E       ValueError: not enough values to unpack (expected 2, got 1)
Contributor

mihaic commented Feb 22, 2017

As the commiter, you should have received an email from Jenkins with the output. (The Princeton Jenkins administrator does not have a solution for allowing the public to view the output yet.)

The problem is your usage of compute_correlation. You pass 1D ndarrays and it expects 2D ones:

brainiak/isfc.py:68: in isc
    ISC[v, loo_subj] = compute_correlation(group[v, :], subj[v, :])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

matrix1 = array([-0.29038674, -0.5220086 ,  0.23681971,  0.08854008, -0.41586313], dtype=float32)
matrix2 = array([-0.36225432, -0.43482456,  0.26723158,  0.16461712, -0.37991464], dtype=float32)

    def compute_correlation(matrix1, matrix2):
        """compute correlation between two sets of variables
    
        Correlate the rows of matrix1 with the rows of matrix2.
        If matrix1 == matrix2, it is auto-correlation computation
        resulting in a symmetric correlation matrix.
        The number of columns MUST agree between set1 and set2.
        The correlation being computed here is
        the Pearson's correlation coefficient, which can be expressed as
    
        .. math:: corr(X, Y) = \\frac{cov(X, Y)}{\\sigma_X\\sigma_Y}
    
        where cov(X, Y) is the covariance of variable X and Y, and
    
        .. math:: \\sigma_X
    
        is the standard deviation of variable X
    
        Reducing the correlation computation to matrix multiplication
        and using BLAS GEMM API wrapped by Scipy can speedup the numpy built-in
        correlation computation (numpy.corrcoef) by one order of magnitude
    
        .. math::
            corr(X, Y)
            &= \\frac{\\sum\\limits_{i=1}^n (x_i-\\bar{x})(y_i-\\bar{y})}{(n-1)
            \\sqrt{\\frac{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}{n-1}}
            \\sqrt{\\frac{\\sum\\limits_{j=1}^{n} y_j^2-n\\bar{y}}{n-1}}}\\\\
            &= \\sum\\limits_{i=1}^n(\\frac{(x_i-\\bar{x})}
            {\\sqrt{\\sum\\limits_{j=1}^n x_j^2-n\\bar{x}}}
            \\frac{(y_i-\\bar{y})}{\\sqrt{\\sum\\limits_{j=1}^n y_j^2-n\\bar{y}}})
    
        Parameters
        ----------
        matrix1: 2D array in shape [r1, c]
            MUST be continuous and row-major
    
        matrix2: 2D array in shape [r2, c]
            MUST be continuous and row-major
    
        Returns
        -------
        corr_data: 2D array in shape [r1, r2]
            continuous and row-major in np.float32
        """
        matrix1 = matrix1.astype(np.float32)
        matrix2 = matrix2.astype(np.float32)
>       [r1, d1] = matrix1.shape
E       ValueError: not enough values to unpack (expected 2, got 1)
@cbaldassano

This comment has been minimized.

Show comment
Hide comment
@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

I fixed the compute_correlation issue in the most recent commit c811e7c but the builds are still failing. I've never gotten email from Jenkins, is there a way to see what email it is sending to? My local git email might not be set correctly?

Collaborator

cbaldassano commented Feb 22, 2017

I fixed the compute_correlation issue in the most recent commit c811e7c but the builds are still failing. I've never gotten email from Jenkins, is there a way to see what email it is sending to? My local git email might not be set correctly?

@mihaic

This comment has been minimized.

Show comment
Hide comment
@mihaic

mihaic Feb 22, 2017

Contributor

The email address is extracted from the commit, so you should make sure it is valid.

The error now is:

flake8 --config setup.cfg brainiak
brainiak/isfc.py:70:58: E231 missing whitespace after ','
Contributor

mihaic commented Feb 22, 2017

The email address is extracted from the commit, so you should make sure it is valid.

The error now is:

flake8 --config setup.cfg brainiak
brainiak/isfc.py:70:58: E231 missing whitespace after ','
Christopher Baldassano Christopher Baldassano
@cbaldassano

This comment has been minimized.

Show comment
Hide comment
@cbaldassano

cbaldassano Feb 22, 2017

Collaborator

Ok, fixed - forgot that run-tests didn't also run the static code checker. I have my local email set up correctly now, so hopefully Jenkins will send me emails now.

Collaborator

cbaldassano commented Feb 22, 2017

Ok, fixed - forgot that run-tests didn't also run the static code checker. I have my local email set up correctly now, so hopefully Jenkins will send me emails now.

@cbaldassano

This comment has been minimized.

Show comment
Hide comment
@cbaldassano

cbaldassano Feb 24, 2017

Collaborator

During the last commit one of the Travis builds ended in an error due to a timeout on the server - hopefully it goes through successfully this time, otherwise it might need to manually re-run in some way.

Collaborator

cbaldassano commented Feb 24, 2017

During the last commit one of the Travis builds ended in an error due to a timeout on the server - hopefully it goes through successfully this time, otherwise it might need to manually re-run in some way.

@mihaic

mihaic approved these changes Feb 24, 2017

@mihaic

This comment has been minimized.

Show comment
Hide comment
@mihaic

mihaic Feb 24, 2017

Contributor

@cbaldassano, I am ready to merge this. However, we need to update it with the latest commits from master. Usually, I would just use the "Update branch" button on GitHub to update it myself, but I noticed you are using your master branch instead of a branch specifically created for this PR. Can you please do the update?

Contributor

mihaic commented Feb 24, 2017

@cbaldassano, I am ready to merge this. However, we need to update it with the latest commits from master. Usually, I would just use the "Update branch" button on GitHub to update it myself, but I noticed you are using your master branch instead of a branch specifically created for this PR. Can you please do the update?

Merge pull request #1 from IntelPNI/master
Bringing fork up to date with master
@cbaldassano

This comment has been minimized.

Show comment
Hide comment
@cbaldassano

cbaldassano Feb 24, 2017

Collaborator

OK I think my fork is now up to date and should be mergeable by @mihaic

Collaborator

cbaldassano commented Feb 24, 2017

OK I think my fork is now up to date and should be mergeable by @mihaic

@mihaic mihaic merged commit 7940c62 into brainiak:master Feb 24, 2017

2 of 3 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
linux Build finished.
Details
macos Build finished.
Details

danielsuo pushed a commit that referenced this pull request Nov 16, 2017

Merge pull request #183 from amplab/search_paths
Search PATH in addition to current directory for executables

danielsuo pushed a commit that referenced this pull request Nov 16, 2017

Make numpy arrays immutable (#183)
* Make numpy arrays immutable in numbuf

* Move break statement outside of brackets

* Simplify test case

* Simplify test case
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment