Automated ploidy cutoffs for karyotype assignment in sample_qc #127
Looks great! A few additional things to think about ...
thanks for the review, Laurent -- I made a few more changes based on your comments. let me know what you think!
This looks good! I have a few more things: some suggestions and some small problems caused by the change in interface.
thanks for the review -- I tried some new changes, let me know what you think! happy to also sit down and chat through this sometime instead, in case that's easier
Looking better and better!
utils/sample_qc.py
Outdated
    xx_x_stats.mean + (upper_SD_cutoff * xx_x_stats.stdev)
]
cutoffs["Y"] = [
    abs(xx_y_stats.mean - (upper_SD_cutoff * xx_y_stats.stdev)),
I don't think we want abs here. I think max(0, ...) is what we want in theory. Although I don't even think we need this: coverage can never be negative, so 0 is the same as any negative number in practice.
Or maybe I misunderstood the purpose of abs here...
I added this abs because I wanted to use the upper cutoff of Y ploidy in XX samples as the lower cutoff for a single copy of Y (any sample above this cutoff gets marked as having at least one copy of Y). Unfortunately, I realized when testing the code that the Y ploidy distribution in the XX samples was so tight that this value was negative. In that case max(0, ...) wouldn't work, as every sample would get called as having at least one copy of Y. Happy to change this, let me know what you think!
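To make the trade-off in this thread concrete, here is a minimal sketch with made-up ploidy values (not data from this PR). It compares the candidate lower cutoffs for a single copy of Y: the raw lower bound, abs(...), max(0, ...), and the upper bound mean + k*sd. The names `xx_y_ploidy`, `all_y_ploidy`, `upper_sd_cutoff`, and `count_called_with_y` are hypothetical, and calling a sample as carrying Y when its ploidy is strictly greater than the cutoff is an assumption, not necessarily what sample_qc.py does.

```python
from statistics import mean, stdev

# Hypothetical Y-ploidy estimates for XX samples: clustered just above zero.
xx_y_ploidy = [0.001, 0.002, 0.001, 0.003, 0.015]
# Hypothetical cohort: the XX samples above plus three XY samples.
all_y_ploidy = xx_y_ploidy + [0.95, 1.02, 0.98]

upper_sd_cutoff = 3  # assumed number of standard deviations

m = mean(xx_y_ploidy)
s = stdev(xx_y_ploidy)

# Candidate lower cutoffs for "at least one copy of Y".
candidates = {
    "mean - k*sd": m - upper_sd_cutoff * s,
    "abs(mean - k*sd)": abs(m - upper_sd_cutoff * s),
    "max(0, mean - k*sd)": max(0, m - upper_sd_cutoff * s),
    "mean + k*sd": m + upper_sd_cutoff * s,
}


def count_called_with_y(cutoff):
    """Count samples whose Y ploidy is strictly above the cutoff."""
    return sum(p > cutoff for p in all_y_ploidy)


for name, cutoff in candidates.items():
    print(f"{name}: cutoff={cutoff:.5f}, called with Y={count_called_with_y(cutoff)}")
```

With these toy numbers the lower bound is negative, so both it and max(0, ...) call every sample as carrying Y. abs(...) gives a positive but arbitrary threshold, while mean + k*sd is the actual upper edge of the XX distribution.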
thanks for the suggestions -- the code looks much tidier now! I think we're getting close, let me know what you think about the lower cutoff for single copy of Y (#127 (comment))
…X samples (changed '-' to '+')
I updated the cutoffs and unnested
A last bit of nit-picking about interface types for the cutoffs
…x_expr (sample_qc.py)
… additional logging statements in get_ploidy_cutoffs
Thanks for those suggestions -- I added the edits and an additional set of logging statements (just to send the XX/XY stats and X/Y cutoffs to stdout)
I decided to have the code check for X and Y for each case, as that made the most sense to me, but I'm open to any feedback!