SGD learning rate tuning + pytorch implementation, simulations #81

Merged: 24 commits from sgd_simulations into greenelab:master on Jun 1, 2023

Conversation

@jjc2718 (Member) commented on May 26, 2023

Sorry in advance - this is kind of a large PR with no obvious way to split it up into smaller ones. I'll summarize the main changes below; if you want to review each of them individually, feel free to do that and approve once you've finished them all.

Main changes:

  • Tune learning rate for SGD optimizer: see changes to 01_stratified_classification/run_stratified_lasso_penalty.py, 01_stratified_classification/scripts/run_lasso_lr_compare.sh, and parts of pancancer_evaluation/prediction/classification.py and pancancer_evaluation/utilities/classify_utilities.py. This turns out to matter quite a bit for SGD's performance: once we use a slightly more sophisticated approach to tuning the learning rate (a constant learning rate plus a grid search, in this case), we get much better performance, on par with liblinear. (See the first sketch after this list.)

  • Try a pytorch implementation of SGD: we did this primarily to make sure the SGD performance/regularization dynamics weren't specific to the sklearn implementation. These changes are in 01_stratified_classification/run_stratified_nn.py and pancancer_evaluation/prediction/classification.py (primarily the train_mlp_lr function). We probably won't end up using these results for much in the paper, but it was a useful sanity check. (See the second sketch after this list.)

  • Try SGD and liblinear on some simulated data: these are the 01_stratified_classification/sgd_params/sim.ipynb and 01_stratified_classification/sgd_params/sim_lr.ipynb notebooks. I used these to iterate quickly on the learning rate changes and to compare our results to L2 regularization, but the results turn out to be somewhat different on real data, so I'm not sure how applicable this simulation approach is to the problem we're trying to address in our paper. I want to keep these scripts around for future reference, though. (See the third sketch after this list.)
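
To make the first bullet concrete, here is a minimal, hypothetical sketch of tuning a constant SGD learning rate with a grid search; the actual code in run_stratified_lasso_penalty.py and classification.py may use a different parameter grid and CV setup:

```python
# Hypothetical sketch: grid search over a constant learning rate for
# sklearn's SGDClassifier (not the repo's exact code).
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

sgd = SGDClassifier(
    loss="log_loss",           # logistic regression objective
    penalty="l1",              # lasso penalty, matching the stratified classification setup
    alpha=1e-4,                # example regularization strength
    learning_rate="constant",  # fixed step size instead of sklearn's default schedule
    max_iter=1000,
)

# Example grid of constant learning rates (eta0); the real grid is an assumption here.
param_grid = {"eta0": [1e-4, 1e-3, 1e-2, 1e-1, 1.0]}

search = GridSearchCV(sgd, param_grid, scoring="roc_auc", cv=5)
# search.fit(X_train, y_train)  # X_train / y_train are assumed to exist
```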
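For the pytorch bullet, here is a minimal sketch of what an SGD-trained model with an explicit L1 penalty could look like; train_mlp_lr in classification.py is the actual implementation and may be structured differently:

```python
# Hypothetical sketch of a pytorch training loop using torch.optim.SGD with an
# explicit L1 penalty (weight_decay in pytorch's SGD is L2, so L1 is added by hand).
import torch
import torch.nn as nn

def train_sgd_l1(X, y, lr=0.01, l1_lambda=1e-4, n_epochs=100):
    """X: float tensor of shape (n_samples, n_features); y: float tensor of 0/1 labels."""
    model = nn.Linear(X.shape[1], 1)  # logistic regression as a single linear layer
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(n_epochs):
        optimizer.zero_grad()
        logits = model(X).squeeze(1)
        loss = loss_fn(logits, y) + l1_lambda * model.weight.abs().sum()
        loss.backward()
        optimizer.step()
    return model
```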
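And for the simulation bullet, a hypothetical sketch of the kind of comparison the sim notebooks run; the actual data generator and parameter settings in sim.ipynb / sim_lr.ipynb may differ:

```python
# Hypothetical sketch: compare liblinear and constant-learning-rate SGD on
# synthetic classification data (not the notebooks' exact setup).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=500,
                           n_informative=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "liblinear": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "sgd": SGDClassifier(loss="log_loss", penalty="l1", alpha=1e-4,
                         learning_rate="constant", eta0=0.01),
}

for name, clf in models.items():
    clf.fit(X_train, y_train)
    auc = roc_auc_score(y_test, clf.decision_function(X_test))
    print(f"{name}: test AUC = {auc:.3f}")
```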

I have a summary of the main plots/conclusions in these slides, in case they help a bit with putting the results in context: https://docs.google.com/presentation/d/1LRBq_ciFeS503J8-GeH51l-1p4RTdWJPn_LcGhHNjgM/edit?usp=sharing. Let me know if you have questions!

@review-notebook-app commented: Check out this pull request on ReviewNB to see visual diffs and provide feedback on the Jupyter notebooks.

@jjc2718 requested a review from @arielah on May 26, 2023 at 19:11
@arielah left a comment:

I'm glad you checked this out! Some interesting results in here, and they'll definitely be useful for the paper.

@jjc2718 merged commit 0564770 into greenelab:master on Jun 1, 2023 (1 check passed).
@jjc2718 deleted the sgd_simulations branch on Jun 1, 2023 at 17:24.