Support for homologous recombination deficiency scores in Dx.R #159
I actually "reverse-engineered" the code (because it is really hard to read and understand what it does) when implementing it internally, using PureCN data as a source to calculate HRD scores. If you're interested, I can write down my notes (I'd have put up a PR, but I rewrote the implementation in Python as I was more familiar with it).
Hi @lbeltrame, that would be awesome. More than happy to add it as Python code for now.
Here are some notes on how scarHRD works and how I adapted it:
Preprocessing (before the HRD calculation)
Segment generation
Contrary to "popular" belief, this does not use the segmentation output. It generates "segments" of identical major and minor CN. It works like this:
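As a rough illustration of what "segments of identical major and minor CN" can mean, here is a minimal sketch that merges adjacent records on the same chromosome when they share the same major and minor copy number. The tuple layout and the function name `merge_identical_cn` are assumptions made for this example; this is not scarHRD's or PureCN's actual code, and the real procedure may apply additional filtering.

```python
from itertools import groupby


def merge_identical_cn(records):
    """Merge consecutive records sharing chromosome, major CN and minor CN.

    `records` is an iterable of (chrom, start, end, major_cn, minor_cn)
    tuples, assumed to be sorted by chromosome and start position.
    (Hypothetical layout, chosen only for illustration.)
    """
    merged = []
    # groupby only groups *consecutive* items, which is what we want here:
    # a run of neighbouring records with the same (chrom, major, minor).
    for (chrom, major, minor), run in groupby(
        records, key=lambda r: (r[0], r[3], r[4])
    ):
        run = list(run)
        # The merged segment spans from the first start to the last end.
        merged.append((chrom, run[0][1], run[-1][2], major, minor))
    return merged


# Toy input with made-up coordinates.
records = [
    ("chr1", 1, 100, 2, 1),
    ("chr1", 101, 200, 2, 1),
    ("chr1", 201, 300, 2, 0),
    ("chr2", 1, 150, 2, 0),
]
segments = merge_identical_cn(records)
```

With the toy input, the first two `chr1` records collapse into one segment, while the `chr2` record stays separate even though its copy numbers match the preceding `chr1` record.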
LOH score
Start with a LOH score of 0.
LST score (large scale transitions)
This was pretty hard to figure out. Start with an LST score of 0.
TAI (Telomeric allelic imbalance)
This is another confusing one, and I'm not sure I understood the logic completely (but my implementation produces identical results to scarHRD). Start with a TAI score of 0.
Wrapping up
Currently the script I use depends on pandas (which pulls in numpy as well) and requires a file with centromere coordinates. What would be the minimum Python version you consider acceptable (bear in mind that it will likely only work with Python 3.4 or later)?
Thanks so much Luca, that's super helpful. And sorry for the delayed response. I think for now I'll take whatever you have. At some point when I need it I might reimplement it in R to make the installation easier, but as a prototype that's awesome for now.
Super! I'll clean it up tomorrow and attach it here. Do you have a data file with centromere information already inside PureCN? That's needed for the TAI calculation (you need to know you're not crossing the centromere).
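To make the centromere check concrete, here is a minimal sketch of a helper that tests whether a segment crosses a centromere. It assumes "crossing" means the segment starts before the centromere and ends after it; the function name, the dictionary layout, and the coordinates are all made up for illustration and are not taken from PureCN or scarHRD.

```python
def crosses_centromere(chrom, start, end, centromeres):
    """Return True if [start, end] spans the centromere of `chrom`.

    `centromeres` maps chromosome name -> (cen_start, cen_end).
    Assumption: a segment "crosses" the centromere when it begins on one
    arm and ends on the other, i.e. it fully spans the centromere interval.
    """
    cen_start, cen_end = centromeres[chrom]
    return start < cen_start and end > cen_end


# Made-up coordinates, NOT real genome positions.
toy_centromeres = {"chrA": (1000, 1200)}

p_arm_only = crosses_centromere("chrA", 1, 900, toy_centromeres)
spanning = crosses_centromere("chrA", 500, 1500, toy_centromeres)
q_arm_only = crosses_centromere("chrA", 1300, 2000, toy_centromeres)
```

Keeping the coordinates in a plain dictionary means the check does not depend on how the centromere file is stored.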
Yes, I have the centromeres. Currently as a serialized RDA file in https://github.com/lima1/PureCN/tree/master/data though.
I'll see whether https://github.com/ofajardo/pyreadr can be used to handle this (I wanted to get rid of the dependency on a random file that might or might not be on someone's disk anyway).
Hi @lbeltrame, do you have a script for HRD scoring? I would be very grateful for help!
No, not at the moment, unfortunately (I switched institutions in the meantime).
@lbeltrame, thank you very much for your answer! I understand that you are a professional in this field; I come from the microbiome field. Maybe, as an experienced professional, you can recommend a tool or workflow for calculating an HRD score from the results of unpaired tumor samples?
Probably a wrapper around https://github.com/sztup/scarHRD