Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fcma partial sim matrix #168

Merged
merged 8 commits into from Jan 26, 2017
Merged

Conversation

@yidawang
Copy link
Member

@yidawang yidawang commented Jan 25, 2017

add partial similarity matrix algorithm in correlation-based classification
add the corresponding test code
rename some methods
improve the example code

@@ -245,6 +303,12 @@ def fit(self, X, y):
and prepared for correlation computation.
assuming all elements of X has the same num_voxels value
y: labels, len(X) equals len(Y)
num_training_samples: int, default None
the number of samples that used in the training,

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Typo: "that used".

This comment has been minimized.

@yidawang

yidawang Jan 25, 2017
Author Member

done

@@ -79,27 +91,43 @@ class Classifier(BaseEstimator):
num_samples_: int
The number of samples of the training set
num_digits_: int
The number of digit of the first value of the kernel matrix,

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Typo: "number of digit".

This comment has been minimized.

@yidawang

yidawang Jan 25, 2017
Author Member

done

Used for SVM with precomputed kernel,
every time only compute correlation between num_process_voxels and
the whole brain to aggregate the kernel matrices.
This is to better use the memory

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

You mean to save memory? Also, please write sentences in docstrings, including full stops and all other punctuation.

This comment has been minimized.

@yidawang

yidawang Jan 25, 2017
Author Member

yes, computing similarity matrices portion by portion can save the memory so as to handle correlations at a larger scale.

@@ -57,7 +63,10 @@ class Classifier(BaseEstimator):
default None

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Attributes are not set during initialization. They are not None.

This comment has been minimized.

@yidawang

yidawang Jan 25, 2017
Author Member

deleted this line

@@ -57,7 +63,10 @@ class Classifier(BaseEstimator):
default None
training_data\_ is None except clf is SVM.SVC with precomputed kernel,
in which case training data is needed to compute
the similarity vector for each sample to be classified
the similarity vector for each sample to be classified.
However, if the test samples are also provided during the fit,

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Please add a paragraph at the beginning of the docstring explaining how testing data can be passed to fit as an optimization.

This comment has been minimized.

@yidawang

yidawang Jan 26, 2017
Author Member

done

subsequent operations, e.g. getting decision values of the prediction
subsequent operations, e.g. getting decision values of the prediction.
test_data\_ may also be set in the fit method
if SVM.SVC with precomputed kernel and the test samples are known.

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Use the full package name in docstrings for modules outside BrainIAK, i.e., sklearn.svm.SVC.

This comment has been minimized.

@yidawang

yidawang Jan 25, 2017
Author Member

done

Parameters
----------
X: a list of numpy array in shape [num_TRs, num_voxels]
len(X) is the number of samples
assuming all elements of X has the same num_voxels value
start_voxel: int, default 0
the starting voxel id for correlation computation
num_voxels_1: int, default None

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Please use a descriptive name.

This comment has been minimized.

@yidawang

yidawang Jan 26, 2017
Author Member

done

which is set when the similarity matrix is constructed
portion by portion so the similarity vectors of the
test data have to be computed here.
If it is set, only those samples will be used to fit the model

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

Document also that this only applies to SVMs with precomputed kernels, where it must be present.

This comment has been minimized.

@yidawang

yidawang Jan 26, 2017
Author Member

done

@@ -27,25 +28,32 @@
logging.basicConfig(level=logging.INFO, format=format, stream=sys.stdout)
logger = logging.getLogger(__name__)

def example_of_aggregating_sim_matrix(raw_data, labels):
# aggregate the similarity matrix to save memory
use_clf = svm.SVC(kernel='precomputed', shrinking=False, C=1)

This comment has been minimized.

@mihaic

mihaic Jan 25, 2017
Contributor

use_clf seems like a Boolean name. How about svm_clf?

This comment has been minimized.

@yidawang

yidawang Jan 26, 2017
Author Member

done

yidawang added 2 commits Jan 26, 2017
@mihaic
mihaic approved these changes Jan 26, 2017
@mihaic mihaic merged commit ae41eba into brainiak:master Jan 26, 2017
2 of 3 checks passed
2 of 3 checks passed
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
linux Build finished.
Details
macos Build finished.
Details
@yidawang yidawang deleted the yidawang:fcma_partial_sim_matrix branch Jan 26, 2017
danielsuo pushed a commit that referenced this pull request Nov 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants