Visualize performance for subset featurespaces #43
gwaybio commented Oct 19, 2023
Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.
@jenna-tomkinson - the recent commits update the PR curves for the balanced model. Thanks!
LGTM! I had a few comments to address but mainly just questions for my understanding.
@@ -139,12 +167,20 @@ f1_score_df <- readr::read_tsv(
         "Phenotypic_Class" = "c",
         "data_split" = "c",
         "shuffled" = "c",
-        "feature_type" = "c"
+        "feature_type" = "c",
+        "balance_type" = "c"
Based on the plots, I am assuming that the balance type is no longer included, so could this be removed from the code since we have shifted to class-balanced models?
On this specific command, I am loading a dataset using readr. IMO, readr does an even better job than pandas at specifying data types (AKA dtype). The line you are highlighting reads in the column named "balance_type" (and other columns) as a character dtype.
It is true that we only use balanced models in this visualization, but in order to do so, we need to make sure the column is loaded properly (specifically, as a character), so that we can filter later.
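As a rough illustration of the load-then-filter pattern described above, here is a minimal sketch. The toy file contents, the `F1_Score` column name, and the `"balanced"` / `"unbalanced"` values are assumptions made up for the example; they are not taken from the actual dataset or notebook.

```r
# Minimal sketch: declare column types on load, then filter on a character column.
# The toy data below is hypothetical and only stands in for the real TSV file.
toy_file <- tempfile(fileext = ".tsv")
writeLines(
    c(
        "Phenotypic_Class\tdata_split\tshuffled\tfeature_type\tbalance_type\tF1_Score",
        "Interphase\ttest\tFalse\tCP\tbalanced\t0.91",
        "Interphase\ttest\tFalse\tCP\tunbalanced\t0.88"
    ),
    toy_file
)

# "c" is readr's compact code for character and "d" for double
f1_score_df <- readr::read_tsv(
    toy_file,
    col_types = readr::cols(
        "Phenotypic_Class" = "c",
        "data_split" = "c",
        "shuffled" = "c",
        "feature_type" = "c",
        "balance_type" = "c",
        "F1_Score" = "d"
    )
)

# Because balance_type was loaded as character, a plain string comparison works
balanced_df <- dplyr::filter(f1_score_df, balance_type == "balanced")
```

So the column still needs to be declared in `col_types` even though only the balanced rows end up in the plots.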
…henotypic_profiling_model into visualize-subset-featurespaces
Thanks for the review @jenna-tomkinson! I will go ahead and merge now.