Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Envest/calculate and plot delta kappa #72

Merged
merged 7 commits into from Oct 15, 2021

Conversation

envest
Copy link
Contributor

@envest envest commented Oct 1, 2021

Brings #51 to a point where we can run all the scripts in full... fun weekend!

This PR modifies the existing 3-plot_category_kappa.R to allow for delta kappas to be calculated, then uses the same plotting code. The --null_model option signals to the script to look for both the regular model and null model and take the difference in their kappa values (regular kappa minus null kappa). Throughout the script, a FALSE value for null_model results in the same processing and output as before, and a TRUE value for null_model adds in additional steps via if() statement blocks.

:delta::kappa: Thanks!

Copy link
Collaborator

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We chatted about this yesterday and this looks good overall and consistent with that convo. I had a couple comments on potential improvements, but nothing that I feel I must re-review.

Comment on lines 98 to 100
left_join(null_array.list[[pair_index]],
by = c("perc.seq", "classifier", "norm.method")) %>%
mutate(delta_kappa = kappa.x - kappa.y) %>% # regular kappa - null kappa
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use suffix here so that this can be a little more explicit:

Suggested change
left_join(null_array.list[[pair_index]],
by = c("perc.seq", "classifier", "norm.method")) %>%
mutate(delta_kappa = kappa.x - kappa.y) %>% # regular kappa - null kappa
left_join(null_array.list[[pair_index]],
by = c("perc.seq", "classifier", "norm.method"),
suffix = c(".true", ".null")) %>%
mutate(delta_kappa = kappa.true - kappa.null) %>% # regular kappa - null kappa

If you make this change here, don't forget to apply it to the seq step below too!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh I like that!

Comment on lines +50 to +54
# check that we have ordered pairs of regular and null files for array and seq
array_seeds <- stringr::str_sub(array.files, -8, -5)
null_array_seeds <- stringr::str_sub(null_array.files, -8, -5)
seq_seeds <- stringr::str_sub(seq.files, -8, -5)
null_seq_seeds <- stringr::str_sub(null_seq.files, -8, -5)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... my instinct here is to suggest that we use other strings that we expect the seeds to be between to detect the seeds, but that seems potentially just as brittle to file name changes as the indexing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Not sure what the best solution is. One mitigation of this is we have previously forced the seed to be a four digit number, so as long as it's the last thing to come before .tsv, it will be captured here.

@envest envest merged commit f7b463a into master Oct 15, 2021
@envest envest deleted the envest/calculate_and_plot_delta_kappa branch October 15, 2021 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants