Updating LOIO analysis to include all features spaces #46
@@ -0,0 +1,1258 @@ | |||
{ |
Yeah, I agree. I will remove!
@@ -0,0 +1,1258 @@ | |||
{ |
Great! It is maybe worth noting that this is mostly Roshan's code ported over to this new file.
@@ -0,0 +1,1258 @@ | |||
{ |
Line #60. straified_k_folds = StratifiedKFold(n_splits=10, shuffle=False)
Consider removing this code
Great point! This must have stuck around and I forgot to delete it. Thanks for noting. (I will also delete the GridSearchCV object call.)
@@ -0,0 +1,1258 @@ | |||
{ |
Line #62. # create logistic regression model with following parameters
Same thing here, consider removing this code
💯 thanks!
@@ -0,0 +1,1258 @@ | |||
{ |
Line #94. C=model.C,
Obviously it works, but I'm surprised you can access the C parameter, since it is not listed as an attribute in the scikit-learn 1.1.1 LogisticRegression documentation.
Interesting! Thanks for digging into this and raising it. I confirmed that I am using 1.1.1 (see below), but maybe saving and/or loading the model with joblib exposes the attribute again? Maybe this process also exposes parameters (C is a parameter in the 1.1 docs) 🤷
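For reference, a minimal sketch of what is likely going on here: scikit-learn estimators conventionally store each constructor parameter as an instance attribute of the same name, so `model.C` should be readable with or without a joblib round trip. The "Attributes" section of the docs lists only fitted attributes (the ones with a trailing underscore), which would explain why `C` appears under "Parameters" instead. The value `C=0.5` and the file name below are assumptions for illustration, not values from the notebook.

```python
import os
import tempfile

import joblib
from sklearn.linear_model import LogisticRegression

# Hypothetical value; the notebook's actual C is not shown in this thread.
model = LogisticRegression(C=0.5)

# Constructor parameters are stored as same-named attributes, even before fitting.
c_before = model.C
c_param = model.get_params()["C"]

# Round-trip through joblib to check that the attribute is unchanged afterwards.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")  # hypothetical file name
    joblib.dump(model, path)
    reloaded = joblib.load(path)

c_after = reloaded.C
print(c_before, c_param, c_after)
```

If the three printed values match, joblib is not what exposes `C`; it is available directly on the estimator.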
@@ -0,0 +1,1258 @@ | |||
{ |
Line #7. # define combinations to test over
You may also consider storing these in a dictionary.
Yeah, it might be more elegant to do so. However, it is a relatively minor change that would cause additional compute time and likely isn't worth pursuing. Roshan also uses itertools.product() on these combinations, so the code already iterates over them efficiently and concisely, which is one of the primary benefits of dictionary storage. I will skip doing this one, thanks!
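For anyone revisiting this later, the two approaches can be combined. Below is a minimal sketch, assuming hypothetical combination names and values (the notebook's actual lists are not shown in this thread), of storing the options in a dictionary and still iterating with itertools.product():

```python
import itertools

# Hypothetical combinations; names and values are placeholders for illustration.
combinations = {
    "model_type": ["final", "shuffled_baseline"],
    "feature_type": ["CP", "DP", "CP_and_DP"],
}

# product() over the dictionary's value lists yields every combination once,
# in the same order that nested for-loops over the lists would.
names = list(combinations)
runs = [
    dict(zip(names, values))
    for values in itertools.product(*combinations.values())
]

for run in runs:
    print(run)
```

Each `run` is a small dictionary keyed by option name, so adding another option only requires a new dictionary entry.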
LGTM @gwaybio! Just some minor comments. Let me know if you have any questions!
Thanks for the review @MattsonCam - I will go ahead and merge!
I also add shuffled data to this analysis. This PR depends on merging #40 first. After #40 is merged, I will update the visualization.