Updating LOIO analysis to include all feature spaces #46

Merged
gwaybio merged 7 commits into WayScience:main from loio-all
Oct 30, 2023

Member

@gwaybio gwaybio commented Oct 24, 2023

This PR also adds shuffled data to the analysis. It depends on #40 being merged first; after #40 is merged, I will update the visualization.

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

May also consider removing this


Member Author

Yeah, I agree. I will remove!

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Very well documented, easy to follow


Member Author

Great! It may be worth noting that this is mostly Roshan's code ported over to this new file.

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Line #60.            straified_k_folds = StratifiedKFold(n_splits=10, shuffle=False)

Consider removing this code


Member Author

Great point! This must have stuck around, and I forgot to delete it. Thanks for noting. (I will also delete the GridSearchCV object call.)
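
For context, here is a minimal sketch of why the leftover `StratifiedKFold` and `GridSearchCV` objects are not needed in a leave-one-out loop that reuses an already chosen `C`. It assumes LOIO stands for leave-one-image-out; the toy data, the `image_ids` labels, and the use of `LeaveOneGroupOut` are illustrative assumptions, not the notebook's actual code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

# Toy data (hypothetical): 12 cells, 3 features, 3 images with 4 cells each.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))
y = np.tile([0, 1], 6)  # both classes appear in every image
image_ids = np.repeat(["img_a", "img_b", "img_c"], 4)

# Hyperparameters are fixed up front, so no grid search or inner k-fold is required.
fixed_C = 1.0

held_out = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=image_ids):
    # Retrain on every image except the held-out one.
    model = LogisticRegression(C=fixed_C, max_iter=200)
    model.fit(X[train_idx], y[train_idx])
    # Predict probabilities for the cells of the held-out image only.
    probs = model.predict_proba(X[test_idx])
    held_out.append((image_ids[test_idx][0], probs.shape))
```

Each iteration holds out all cells from one image, so the only cross-validation structure is the image grouping itself.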

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Line #62.            # create logistic regression model with following parameters

Same thing here, consider removing this code


Member Author

💯 thanks!

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Line #68.            grid_search_cv = GridSearchCV(

Consider removing this code



@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Line #94.                        C=model.C,

Obviously it works, but I'm surprised you can access the C parameter, since it is not listed as an attribute in the scikit-learn 1.1.1 LogisticRegression documentation


Member Author

Interesting! Thanks for digging into this and raising it. I confirmed that I am using 1.1.1 (see below). Maybe saving and/or loading the model with joblib exposes the attribute again? Or maybe constructor parameters are exposed as well (C is listed as a parameter in the 1.1 docs) 🤷

[Image: screenshot confirming scikit-learn version 1.1.1]
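
A small check may help settle the question above. By scikit-learn convention, constructor parameters are stored on the estimator as instance attributes of the same name, while the "Attributes" section of the docs lists only fitted attributes (those with a trailing underscore, such as `coef_`). The sketch below uses an arbitrary `C=0.5` and a temporary file path, both chosen only for illustration, and shows that `C` is readable before fitting and before any joblib round trip.

```python
import os
import tempfile

import joblib
from sklearn.linear_model import LogisticRegression

# Unfitted estimator with an arbitrary (illustrative) regularization strength.
model = LogisticRegression(C=0.5, max_iter=200)

# Constructor parameters are plain instance attributes, readable right away.
c_direct = model.C
c_from_params = model.get_params()["C"]

# Fitted attributes such as coef_ do not exist until fit() is called.
has_coef_before_fit = hasattr(model, "coef_")

# A joblib round trip preserves the parameter but is not what exposes it.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.joblib")
    joblib.dump(model, path)
    reloaded = joblib.load(path)
c_reloaded = reloaded.C
```

So `model.C` should work on any `LogisticRegression` instance, whether or not it was saved and reloaded.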

@@ -0,0 +1,1258 @@
{
Member

@MattsonCam MattsonCam Oct 30, 2023

Line #7.    # define combinations to test over

May also consider storing these in a dictionary


Member Author

@gwaybio gwaybio Oct 30, 2023

Yeah, it might be more elegant to do so. However, it is a relatively minor change that would require additional compute time, so it likely isn't worth pursuing. Roshan's code also already uses itertools.product() on these combinations, so it iterates over them efficiently with little code, which is one of the primary benefits dictionary storage would offer. I will skip this one, thanks!
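
To illustrate the trade-off discussed here, the sketch below iterates the same combinations both ways. The list names and values (`feature_types`, `shuffled_options`) are hypothetical placeholders, not the notebook's actual variables.

```python
import itertools

# Hypothetical option lists standing in for the notebook's combinations.
feature_types = ["CP", "DP", "CP_and_DP"]
shuffled_options = [False, True]

# Approach 1: separate lists combined directly with itertools.product().
list_combos = list(itertools.product(feature_types, shuffled_options))

# Approach 2: the same options stored in a dictionary, so each combination
# carries its parameter names along with the values.
param_grid = {"feature_type": feature_types, "shuffled": shuffled_options}
dict_combos = [
    dict(zip(param_grid, values))
    for values in itertools.product(*param_grid.values())
]

for combo in dict_combos:
    print(combo)
```

Both approaches visit the same combinations in the same order; the dictionary version mainly adds named keys, which supports the point that the change is cosmetic.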

Member

@MattsonCam MattsonCam left a comment

LGTM @gwaybio! Just some minor comments. Let me know if you have any questions!

Member Author

gwaybio commented Oct 30, 2023

Thanks for the review @MattsonCam - I will go ahead and merge!

@gwaybio gwaybio merged commit 9066225 into WayScience:main Oct 30, 2023
@gwaybio gwaybio deleted the loio-all branch October 30, 2023 19:58