How to run with --cfrac 0.8 #3
Comments
Is this by any chance exome data? The error is basically caused by there [...]
--Carson
It's 30X whole genome called using GATK. There are ~600-1.8M variants/sample.
You may have issues with your sample IDs. They need to be identical with VCF [...] Also whole genome would normally have 2-3M variants per sample. Do you [...]
--Carson
Those 5 segments have higher than normal coverage. Oh, it requires 2MB segments. Got 275 and 461 in tumor and germline. I created a new variable to define the minimum number of variants per segment (set to 200 instead of the 1000 that was hardcoded in discovery_segments), and that worked.
I think your variant file might have been filtered to have somatic variants [...]
Thanks,
You were right. I’ll need to re-call the VCF; there are no SNVs which have the same genotype in tumor and germline.
Nevertheless, there are still a couple of things that could be done to make the script more robust. I got these two errors:
Use of uninitialized value $cmp in numeric eq (==) at /data5/sicotte/tools/WaveCNV-caller/bin/cnv_caller.pl line 2241.
Sort subroutine didn't return a numeric value at /data5/sicotte/tools/WaveCNV-caller/bin/cnv_caller.pl line 4866.
It's one of those things where I really just want to have more informative error messages as opposed to building in mechanisms that allow the code to continue beyond those failures.
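To illustrate the "informative error message" approach, here is a minimal Python sketch. It is not the WaveCNV Perl code: the record layout, the field names `chrom` and `start`, and the function name `sort_segments` are all hypothetical. The idea is to validate sort keys before sorting, so that a missing value stops the run with a message naming the offending record instead of surfacing as an undefined comparison inside the sort routine.

```python
def sort_segments(segments):
    """Sort segment records by (chrom, start), failing early on missing keys.

    `segments` is a hypothetical list of dicts with 'chrom' and 'start' keys;
    this layout is an assumption for illustration, not WaveCNV's data model.
    """
    for i, seg in enumerate(segments):
        for field in ("chrom", "start"):
            if seg.get(field) is None:
                # Stop here with a message that points at the bad record,
                # rather than continuing with an undefined comparison value.
                raise ValueError(
                    f"segment #{i} has no '{field}' value: {seg!r}; "
                    "cannot sort, check the input segment file"
                )
    # All keys are present, so the comparison is always well defined.
    return sorted(segments, key=lambda s: (str(s["chrom"]), int(s["start"])))
```

Calling `sort_segments` on a list containing a record with `start` set to `None` raises a `ValueError` that identifies the record by position.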
After re-calling variants with GATK (and aside from the --fasta bug I reported), I was able to run WaveCNV.pl.
Good to know. I'll look at those other issues.
Three pieces of feedback:
I notice the code always outputs an estimated purity, as in “cell” mode.
I had one sample that kept not finishing without any error message.. just the [...]
When I bumped the RAM from 15G to 20G then 31, it went further [...]
I've found that most of the time the cell mode value is different than the [...]
The copy fraction value affects the expected MAF at different CN [...]
No documentation is available other than the WaveCNV publication and [...]
If one of your samples has a lot of segments (100,000+), it can fail because [...]
--Carson
When I specify the input purity, it forces the "cell" option.
The error message says
You need more than 8 data samples. The number of data points must satisfy the relation N > 2xK**2 where K is the number of clusters. The smallest value for K is 2.
at /data5/sicotte/tools/WaveCNV-caller/bin/cnv_caller.pl line 5204.
In my version, line 5204 is
my ($ids, $centers) = $kmeans->kmeans();
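The relation quoted in the error message can be checked by hand: with the smallest allowed K = 2, N must exceed 2 × 2² = 8, so at least 9 data points are needed. Note that the message words the requirement in terms of data points handed to the clustering call. A minimal Python sketch of that check follows; the function names are hypothetical and this is not the code behind `$kmeans->kmeans()`.

```python
import math


def max_clusters(n_points):
    """Largest K with n_points > 2 * K**2, or 0 if even K = 2 is not allowed."""
    if n_points <= 0:
        return 0
    # n > 2*K**2  <=>  K**2 <= (n - 1) // 2  for integer n and K
    k = math.isqrt((n_points - 1) // 2)
    return k if k >= 2 else 0


def check_cluster_input(n_points, k=2):
    """Raise a descriptive error when the N > 2*K**2 relation is violated."""
    needed = 2 * k ** 2
    if n_points <= needed:
        raise ValueError(
            f"clustering with K={k} needs more than {needed} data points, "
            f"got {n_points}"
        )
```

For example, `max_clusters(8)` returns 0 and `max_clusters(9)` returns 2, which matches the "more than 8" wording of the error.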
I do have multiple samples, so how do I specify the input files to get this to work?
bin/cnv_caller.pl $seg $vcf --sid s_tumor.100 --bam_list $tbam --rid s_normal.100 --rbam_list $nbam --lfrag 200 --merge --smooth --tmp $thisdir/tmp.100 --cfrac 0.8