Claim an sklearn algorithm to implement and troubleshoot #27

Open
dhimmel opened this issue Aug 7, 2016 · 21 comments

Comments

@dhimmel
Member

dhimmel commented Aug 7, 2016

In the July 26 meetup, we discussed having each team member in the machine learning group claim an algorithm. We've made lots of progress on the example notebook (1.TCGA-MLexample.ipynb) since then (see #18 & #25). Currently, 1.TCGA-MLexample.ipynb uses elastic net logistic regression implemented in SGDClassifier.

The goal of this issue is for people to:

  1. Claim an algorithm. See the list of classifiers in #5 (comment), "Create a table classifying the different algorithms". The main requirement is that the algorithm uses the sklearn API so we can use it in the pipeline. Make a comment here once you've chosen an algorithm.
  2. Create a modified version of 1.TCGA-MLexample.ipynb in an algorithms directory. So if I took the SVM classifier, I would copy 1.TCGA-MLexample.ipynb to algorithms/SVC-dhimmel.ipynb. Then I would make my edits to algorithms/SVC-dhimmel.ipynb to switch to an SVC classifier.
  3. Your goal should be to pick a good set of parameters for grid search. It would also be great if you could document what seems to work well about the algorithm (or if it doesn't seem to work well).

Best of luck! If you can work on this before the August 9 meetup then great! Otherwise make sure to bring a laptop with the cognoma-machine-learning environment installed.
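
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the contents of 1.TCGA-MLexample.ipynb: the synthetic data from make_classification, the pipeline step names, and the grid values are all assumptions chosen only to show the shape of the change when swapping in SVC.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the feature matrix and binary outcome (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Any estimator that follows the sklearn API can fill the "classify" step.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classify", SVC(kernel="rbf")),
])

# Hypothetical grid; the point of the exercise is to find good values.
param_grid = {
    "classify__C": [0.1, 1, 10],
    "classify__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=3)
search.fit(X_train, y_train)

# Evaluate the refit best model on held-out data.
test_auroc = roc_auc_score(y_test, search.decision_function(X_test))
print(search.best_params_, round(test_auroc, 3))
```

Because the classifier sits in a named pipeline step, trying another algorithm mostly means replacing the estimator and the keys in param_grid.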

@dhimmel
Member Author

dhimmel commented Aug 7, 2016

Tagging everyone who said they were interested in contributing to the machine learning part of the project in the introduction thread: @htcai @loucru1 @bmcgeehan @Umeshiso @swbiggs4 @danrieman @rramyr @Inquisitive-Geek @brankaj @yl565 @Ramaa-Nathan @ejsegall @FadiAlnabolsi @sameertipnis @VijYadav @ctipnis.

Update: also tagging @yigalron.

@htcai
Member

htcai commented Aug 7, 2016

Thanks for the sample notebook! I would like to implement Linear SVM with regularization.

@dhimmel
Member Author

dhimmel commented Aug 8, 2016

@htcai awesome. Can you specify which sklearn function(s) you plan to use for the model?

@brankaj
Member

brankaj commented Aug 8, 2016

Thanks for setting this up. I would like to try LASSO, implementation sklearn.linear_model.LassoCV.

@htcai
Member

htcai commented Aug 8, 2016

I consulted the chart posted by @yl565 . I plan to try sklearn.svm.LinearSVC.
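
For reference, a regularized linear SVM search with sklearn.svm.LinearSVC might look like the sketch below. The synthetic data and the grid values are assumptions, not anything taken from the project notebook; the step name "linearsvc" is the one make_pipeline generates automatically.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=5, random_state=0
)

# dual=False is needed for the L1 penalty with the default squared hinge loss.
model = make_pipeline(StandardScaler(), LinearSVC(dual=False, max_iter=5000))

# Smaller C means stronger regularization; values here are illustrative.
param_grid = {
    "linearsvc__penalty": ["l1", "l2"],
    "linearsvc__C": [0.01, 0.1, 1.0],
}

linear_search = GridSearchCV(model, param_grid, scoring="roc_auc", cv=3)
linear_search.fit(X, y)
print(linear_search.best_params_)
```

LinearSVC has no predict_proba, so the roc_auc scorer ranks samples by decision_function instead.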

@yigalron
Contributor

yigalron commented Aug 8, 2016

I plan to implement the Nearest Neighbors Classification

@dhimmel
Member Author

dhimmel commented Aug 8, 2016

@yigalron did you want to claim KNeighborsClassifier, RadiusNeighborsClassifier, or both?

@yigalron
Contributor

yigalron commented Aug 8, 2016

I'm planning to start with the KNeighborsClassifier

@VijYadav

VijYadav commented Aug 8, 2016

I plan to test Decision Tree CART (sklearn.tree.DecisionTreeClassifier) algorithm

@VijYadav

VijYadav commented Aug 8, 2016

Hi Daniel,
Has the "algorithms" directory been created already? I can't see it. Sorry, I am still learning about GitHub.

@dhimmel
Member Author

dhimmel commented Aug 8, 2016

@VijYadav, you'll have to create the directory, since it currently doesn't exist. I'll submit a pull request as an example.

@yigalron
Contributor

yigalron commented Aug 8, 2016

when trying to set up the conda environment on windows (conda env create --quiet --force --file environment.yml) I get an error:

yaml.scanner.ScannerError: mapping values are not allowed here
in "", line 7, column 19:
<head prefix="og: http://ogp.me/ns# fb: http://o ...
^

any ideas what I did wrong?
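
The line quoted in the error is an HTML `<head>` tag rather than YAML, which suggests the file handed to conda may have been a saved web page instead of the raw environment.yml. That reading is an assumption based only on the error text above. A quick way to check, using a hypothetical helper named looks_like_html and made-up sample strings:

```python
def looks_like_html(text: str) -> bool:
    """Return True if the beginning of `text` contains HTML markup."""
    head = text.lstrip().lower()[:2000]
    return head.startswith(("<!doctype html", "<html")) or "<head" in head


# Illustrative inputs only; the real environment.yml contents are not shown here.
yaml_text = "name: cognoma-machine-learning\ndependencies:\n  - python\n"
html_text = '<!DOCTYPE html>\n<html lang="en">\n<head prefix="og: http://ogp.me/ns#">\n'

print(looks_like_html(yaml_text))
print(looks_like_html(html_text))
```

To test a local file, pass its contents: `looks_like_html(open("environment.yml").read())`.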

dhimmel added a commit to dhimmel/machine-learning that referenced this issue Aug 8, 2016
@dhimmel
Member Author

dhimmel commented Aug 8, 2016

@yigalron can you file a new issue or comment on #15 with the conda installation issue? It's best to keep issues focused and uncluttered.

@yigalron
Contributor

yigalron commented Aug 8, 2016

OK; anyway it seems to have been a user error; I'm trying again

gwaybio pushed a commit that referenced this issue Aug 9, 2016
* Create a template and directory for algorithm

See #27

* Add notebook nomenclature example
@beelze-b
Contributor

beelze-b commented Aug 9, 2016

Hi all, I will claim AdaBoost.

@mans2singh
Contributor

Hey Folks - I will give RandomForestClassifier a shot. Mans

VijYadav mentioned this issue Aug 25, 2016
@yigalron
Contributor

I have an initial version of the K-Nearest neighbor algorithm, and I issued a pull request; not sure if it got into the master.
I won't be able to join the next meeting, but will continue to work on this remotely.

@George-Zipperlen
Contributor

I would like to claim spectral clustering, and give it a try

@KT12
Contributor

KT12 commented Nov 1, 2016

I would like to work on the multi-layer perceptron classifier.

sklearn.neural_network.MLPClassifier

@davidrichardsteinmetz

I'd like to claim LDA/QDA and give it a shot

@KT12
Contributor

KT12 commented Nov 17, 2016

I'll also take a look at the Passive Aggressive Classifier
