LOH / state4,cn=2 is never called #92
Comments
To enable state4, you need to modify the HMM file so that the expected
LRR value for that state is 0 (it is currently 100, so the state is never
called). Most users are not interested in copy-neutral LOH events, so this
state is disabled by default.
On Mon, Sep 26, 2022 at 6:58 AM Nicolai von Kügelgen wrote:
I am currently trying to set up a (mostly) automated CNV calling pipeline
including PennCNV. Installation (from source) worked without issue as does
running the program itself.
However, when investigating the results for a large set of samples (>150),
I never see even a single detected loss of heterozygosity (LOH) state,
which should have cn=2 (state4 in the HMM model, as far as I know). I am
working with data that has been analysed before (manually with GenomeStudio),
so I mostly know what to expect (and gain/loss CNVs match these expectations).
This is the command I'm using:
PennCNV-1.0.5/detect_cnv.pl -test -hmm PennCNV-1.0.5/lib/hhall.hmm -pfb
compiled-samples.pfb array-data-tsv/*.tsv -out
array-results/full_analysis_run.txt -log
array-results/full_analysis_run.log -confidence
I've created my PFB file by running compile_pfb.pl on all samples. I
imagine that this could be an issue, as the samples are all from cell
lines, several of which might be derived from one another. So overall
heterogeneity between samples is likely lower than usual for a dataset of
comparable size.
Thanks for that help. Sadly it did not work yet. I've changed this line in the hmm file:
To this:
Is there anything else I need to do? So far I still don't get any LOH calls. |
Can you show the actual output after running the command (the LOG and WARNING messages, for example) to help diagnose the issue?
This is what the log file shows for a single sample (there is little difference between samples, apart from exact numbers, some samples have a GC-waviness warning). The 'WARNING: Unrecognizable LRR or BAF values are treated as zero:' line is repeated quite often when I'm using unfiltered input files, but even if I pre-filter the input I still don't get LOH calls.
|
I do not see anything wrong in the LOG information, except the 52K records that are discarded. If you do not mind emailing one example file (together with your PFB), I can do some diagnosis to see what is wrong. You may also want to check whether your PFB file contains allele frequencies spanning the range from 0 to 1, rather than only the values 0 or 1 (lower heterozygosity than in typical sample collections is fine).
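The PFB check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of PennCNV: it assumes the PFB file is tab-delimited with a header row containing a `PFB` column (the example columns `Name`, `Chr`, `Position`, `PFB` are an assumption about the file layout), and `summarize_pfb` is a hypothetical helper name.

```python
import csv
import io


def summarize_pfb(lines):
    """Count PFB values that are fixed (0 or 1), intermediate, or out of range.

    `lines` is any iterable of tab-delimited text lines with a header row
    that includes a 'PFB' column (assumed layout, see the note above).
    """
    counts = {"fixed": 0, "intermediate": 0, "out_of_range": 0}
    for row in csv.DictReader(lines, delimiter="\t"):
        pfb = float(row["PFB"])
        if pfb < 0.0 or pfb > 1.0:
            counts["out_of_range"] += 1   # e.g. negative values
        elif pfb in (0.0, 1.0):
            counts["fixed"] += 1          # no heterozygosity at this marker
        else:
            counts["intermediate"] += 1   # informative allele frequency
    return counts


# Small in-memory example; for a real file, pass an open file handle instead.
example = io.StringIO(
    "Name\tChr\tPosition\tPFB\n"
    "snp1\t1\t1000\t0.0\n"
    "snp2\t1\t2000\t0.42\n"
    "snp3\t1\t3000\t-0.05\n"
)
print(summarize_pfb(example))
```

A file dominated by the `fixed` category would point to the "only 0 or 1" problem, while a non-zero `out_of_range` count flags values that should not occur in a frequency column.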
The discarded records are probes that don't have usable information; I will likely filter these out before running the analysis in the future. The PFB file also looks fine from what I can tell (the PFB values are continuous between 0 and 1). If you could look into this issue, that would be very much appreciated. I've checked, and there should be no problem with sending you one of the files. I'll assume I can send the files to your chop.edu email?
Hey Kai, I hope you got my email with the example files, if not please let me know so I can re-send it. |
I actually never received the example file. Please email a Google Drive
link to ***@***.*** directly. I checked my chop.edu email and do not see
it in the inbox or spam folder; I think Microsoft just flagged and deleted
the email without notifying me.
Since I had already included a Drive-like link in my last email, and the email address is censored in the comments here, I've now re-sent it to your Gmail address (taken from the GitHub commit logs).
Yes I got it!
I was able to generate LOH calls after editing 100 to 0 in the hhall.hmm file and after adding the "-loh" argument on the command line. I forgot to mention the latter requirement.
./detect_cnv.pl -test -hmm lib/hhall.hmm -pfb /mounted/test.pfb /mounted/204362030005_R07C01.signal.txt -gcmodel test.gcmodel -loh
chr12:95690404-95944219 numsnp=113 length=253,816 state4,cn=2 /mounted/204362030005_R07C01.signal.txt startsnp=rs2431014 endsnp=GSA-rs7486703
I noted that there are lots of negative PFB values, which cannot be right; they should be made positive. I am not sure how this happened, but perhaps there are negative BAF values in the input files that were used to generate the PFB? Also, this particular sample is very "wavy", which you can see from the figure below. You can consider using a gcmodel file to adjust for waviness.
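For an automated pipeline, call lines like the one above can be parsed to pick out the LOH calls. The following is a minimal sketch based only on the single output line shown in this thread; the regular expression and the helper names `parse_call` and `is_loh` are hypothetical and not part of PennCNV, and other output variants may need a different pattern.

```python
import re

# Pattern modelled on the one example line in this thread (an assumption,
# not an official description of the detect_cnv.pl output format).
CALL_PATTERN = re.compile(
    r"^(?P<chrom>chr\w+):(?P<start>\d+)-(?P<end>\d+)\s+"
    r"numsnp=(?P<numsnp>\d+)\s+"
    r"length=(?P<length>[\d,]+)\s+"
    r"state(?P<state>\d+),cn=(?P<cn>\d+)\s+"
    r"(?P<sample>\S+)\s+"
    r"startsnp=(?P<startsnp>\S+)\s+"
    r"endsnp=(?P<endsnp>\S+)"
)


def parse_call(line):
    """Parse one call line into a dict, or return None if it does not match."""
    match = CALL_PATTERN.match(line.strip())
    if match is None:
        return None
    call = match.groupdict()
    for key in ("start", "end", "numsnp", "state", "cn"):
        call[key] = int(call[key])
    # The length field uses thousands separators, e.g. "253,816".
    call["length"] = int(call["length"].replace(",", ""))
    return call


def is_loh(call):
    """True for copy-neutral LOH calls (state4 with cn=2)."""
    return call["state"] == 4 and call["cn"] == 2


line = (
    "chr12:95690404-95944219 numsnp=113 length=253,816 state4,cn=2 "
    "/mounted/204362030005_R07C01.signal.txt "
    "startsnp=rs2431014 endsnp=GSA-rs7486703"
)
call = parse_call(line)
print(call["chrom"], call["length"], is_loh(call))
```

Counting the calls for which `is_loh` returns true across all samples is a quick way to confirm that the HMM edit and the "-loh" argument have both taken effect.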
Thanks! The missing -loh option must have been the reason why I didn't get any LOH calls. I've also noticed that several of my samples are quite wavy, but I haven't gotten around to making the GC model files yet. The negative PFB values probably do come from negative BAF values (I didn't use GenomeStudio, which apparently restricts BAF to [0,1] automatically, while the underlying algorithm doesn't).
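One possible way to avoid negative PFB values is to restrict BAF to [0,1] before the PFB file is compiled, mirroring the restriction described above. The sketch below is an assumption about how that could be done in a preprocessing step; `clip_baf` is a hypothetical helper, not a PennCNV or GenomeStudio function, and whether clipping is appropriate depends on how the BAF values were produced.

```python
import math


def clip_baf(value):
    """Clamp a BAF value into [0, 1]; missing values (NaN) are left as-is."""
    if math.isnan(value):
        return value
    return min(1.0, max(0.0, value))


# Example: slightly out-of-range values are pulled back to the boundaries.
raw_baf = [-0.03, 0.0, 0.48, 1.0, 1.07]
clipped = [clip_baf(v) for v in raw_baf]
print(clipped)
```

Applying this to the BAF column of each signal file before running compile_pfb.pl should keep the averaged frequencies within the valid range.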