Created scaffolding script #2

Merged
merged 3 commits into brainhack-school2020:master on Jun 1, 2020

mschoettner
Contributor

Added a file called script-scaffolding.py that can be used as a
basis for each classifier that we want to train. The definitions of
the training and test sets and of the model have been left empty.
@anproulx @emilyemchen @guenounz Please check for errors

Updated script-scaffolding.py to include an argument parser. Also corrected an error and added the
nilearn cache to .gitignore.
Added a check for whether a feature file has already been saved; if it
exists, it is loaded.
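
The caching check mentioned in the last commit message could look roughly like the sketch below. The `--features-file` flag, the default file name `features.npy`, and the `compute_features` placeholder are assumptions made for illustration; they are not taken from script-scaffolding.py.

```python
import argparse
import os

import numpy as np


def compute_features():
    # Hypothetical placeholder for the expensive feature extraction step.
    return np.arange(6, dtype=float).reshape(2, 3)


def load_or_compute_features(path):
    """Load features from `path` if it exists, otherwise compute and save them."""
    if os.path.exists(path):
        return np.load(path)
    features = compute_features()
    # `path` should end in ".npy"; np.save appends that suffix otherwise.
    np.save(path, features)
    return features


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train a classifier (sketch).")
    # Hypothetical flag; the real script may use different argument names.
    parser.add_argument("--features-file", default="features.npy")
    return parser.parse_args(argv)
```

A script would then call `load_or_compute_features(parse_args().features_file)` once at startup, so the extraction only runs when no saved file is found.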
Contributor

@anproulx anproulx left a comment

Super!!! I have one small comment about the part where you extract the target vector: it is not necessary to iterate to extract the data, since sklearn is designed to work with pandas DataFrames. To extract the Y variable you can simply do: Y=phenotypic["DX_GROUP"], and you will be able to use that in your model afterwards. I can make the changes if you'd like.
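
As a small illustration of this suggestion, a DataFrame column can be handed to a scikit-learn estimator directly. The toy `phenotypic` table, the `AGE_AT_SCAN` feature column, the made-up values, and the choice of `LogisticRegression` are assumptions for the example; only the `DX_GROUP` column name comes from the comment.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the phenotypic table (values are made up for illustration).
phenotypic = pd.DataFrame({
    "AGE_AT_SCAN": [10.0, 12.0, 30.0, 35.0, 11.0, 33.0],
    "DX_GROUP": [1, 1, 2, 2, 1, 2],
})

# No loop needed: selecting the column returns a pandas Series.
Y = phenotypic["DX_GROUP"]
X = phenotypic[["AGE_AT_SCAN"]]

# scikit-learn estimators accept pandas objects as input.
model = LogisticRegression().fit(X, Y)
predictions = model.predict(X)
```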

Contributor

@emilyemchen emilyemchen left a comment

Thanks so much for this! A few questions that I'm not sure about so I won't request changes:

  1. If we're not plotting, we don't need %matplotlib inline, right? I have that in my script but maybe it isn't necessary.
  2. You did say subfolder ABIDE_pcp, but maybe we should specify that the super folder is called nilearn_data?

@emilyemchen emilyemchen merged commit 60c24c3 into brainhack-school2020:master Jun 1, 2020
Contributor

@anproulx anproulx left a comment

Later today I will add the code to remove half of the matrix, and I will apply PCA to reduce the number of features used in our prediction models (while still keeping 99% of the explained variance).
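
A minimal sketch of that plan, assuming the features are symmetric connectivity matrices (one per subject) so that one triangle carries all the information. The random toy data, the variable names, and the decision to drop the diagonal are assumptions for illustration, not the code that was eventually added.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_subjects, n_regions = 20, 6

# Toy symmetric matrices standing in for connectivity matrices (assumption).
raw = rng.normal(size=(n_subjects, n_regions, n_regions))
matrices = (raw + raw.transpose(0, 2, 1)) / 2

# Keep only the strictly lower triangle: the upper half duplicates it.
rows, cols = np.tril_indices(n_regions, k=-1)
X = matrices[:, rows, cols]  # one row of unique off-diagonal values per subject

# A float n_components keeps the fewest components that exceed that
# share of explained variance.
pca = PCA(n_components=0.99, svd_solver="full")
X_reduced = pca.fit_transform(X)
```

In a real pipeline the PCA would be fitted on the training set only and then applied to the test set with `pca.transform`, so that no information leaks from the test data.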

@mschoettner
Contributor Author

@emilyemchen

  1. Yes, exactly, %matplotlib inline makes plots appear below cells when you run them in a notebook. In a regular Python script it does not serve any purpose.
  2. Could be, I think I renamed the super folder (:

@mschoettner
Contributor Author

Later today I will add the code to remove half of the matrix, and I will apply PCA to reduce the number of features used in our prediction models (while still keeping 99% of the explained variance).

@anproulx that sounds awesome! Maybe open an issue for that for the sake of transparency :)


3 participants