Joint Calling Problem #403
GRIDSS (currently) requires joint assembly of all samples. Only one assembly
It looks very much like GRIDSS is reading the assembly file containing the 40 individuals, recognising that it doesn't match the 60 samples provided, and immediately terminating (which is preferable to incorrectly allocating assembly support to the wrong samples). For very large cohorts, the total sequencing depth will be far too deep for the assembly to be reliable. In such scenarios, you'll need to do assembly in batches. #354 contains details of how you can trick GRIDSS into doing batched assembly. Unfortunately, you're going to have to regenerate your existing assembly.bam files since they've already been generated with the incorrect number of samples. Proper support for batched assembly is already on the backlog as issue #397.
That said, GRIDSS should be able to do joint assembly on 600x worth of samples (we've run it on ~1000x aggregate coverage for some tumour xenograft evolution analysis). Just make sure to check the log file and the
Dear GRIDSS team,
When I run GRIDSS on germline data for 60 individuals (diploid, genome size about 1.1GB and average depth 10X), the program reports the following error message:
Caused by: java.lang.RuntimeException: Fatal error: GRIDSS assembly does not have the expected number of input categories (found 40, expected 60). GRIDSS performs joint assembly and does not support per-input assembly. Make sure the same input and labels are specified in the same order for the assembly and variant calling steps.
I have checked the input labels, inputs, and assembly arguments several times to ensure all 60 of them are there, and when I run the program separately by population (8, 12, and 40 individuals) it runs successfully.
Your help and time are greatly appreciated!
Rose
The code is
$gridss --threads 8 -j $gridssjar --reference $ref_genome --output output_60indv \
  --repeatmaskerbed $repeatmasker \
  --assembly SRS589245 --assembly SRS589246 --assembly SRS589247 --assembly SRS589248 \
  --assembly SRS589249 --assembly SRS589250 --assembly SRS589251 --assembly SRS589252 \
  --assembly Clean_01 --assembly Clean_02 --assembly Clean_03 --assembly Clean_04 \
  --assembly Clean_05 --assembly Clean_06 --assembly Clean_07 --assembly Clean_08 \
  --assembly Clean_09 --assembly Clean_ns --assembly SRS420686 --assembly SRS524489 \
  --assembly GGS_174 --assembly GGS_175 --assembly GGS_176 --assembly GGS_2887 \
  --assembly GGS_2888 --assembly GGS_2890 --assembly GGS_2891 --assembly GGS_2892 \
  --assembly GGS_2893 --assembly GGS_2895 --assembly GGS_3001 --assembly GGS_3002 \
  --assembly GGS_3003 --assembly GGS_3004 --assembly GGS_3005 --assembly GGS_3006 \
  --assembly GGS_3007 --assembly GGS_3008 --assembly GGS_3009 --assembly GGS_3010 \
  --assembly GGS_3011 --assembly GGS_3012 --assembly GGS_3016 --assembly GGS_3017 \
  --assembly GGS_3028 --assembly GGS_3038 --assembly GGS_3040 --assembly GGS_3041 \
  --assembly GGS_3042 --assembly GGS_3043 --assembly GGS_3044 --assembly GGS_3045 \
  --assembly GGS_3046 --assembly GGS_3047 --assembly GGS_3050 --assembly GGS_3051 \
  --assembly GGS_3052 --assembly GGS_3061 --assembly GGS_3069 --assembly GGS_3072 \
  --labels SRS589245,SRS589246,SRS589247,SRS589248,SRS589249,SRS589250,SRS589251,SRS589252,Clean_01,Clean_02,Clean_03,Clean_04,Clean_05,Clean_06,Clean_07,Clean_08,Clean_09,Clean_ns,SRS420686,SRS524489,GGS_174,GGS_175,GGS_176,GGS_2887,GGS_2888,GGS_2890,GGS_2891,GGS_2892,GGS_2893,GGS_2895,GGS_3001,GGS_3002,GGS_3003,GGS_3004,GGS_3005,GGS_3006,GGS_3007,GGS_3008,GGS_3009,GGS_3010,GGS_3011,GGS_3012,GGS_3016,GGS_3017,GGS_3028,GGS_3038,GGS_3040,GGS_3041,GGS_3042,GGS_3043,GGS_3044,GGS_3045,GGS_3046,GGS_3047,GGS_3050,GGS_3051,GGS_3052,GGS_3061,GGS_3069,GGS_3072 \
  $midfileDIR/dedup_SRS589245.bam $midfileDIR/dedup_SRS589246.bam $midfileDIR/dedup_SRS589247.bam $midfileDIR/dedup_SRS589248.bam \
  $midfileDIR/dedup_SRS589249.bam $midfileDIR/dedup_SRS589250.bam $midfileDIR/dedup_SRS589251.bam $midfileDIR/dedup_SRS589252.bam \
  $midfileDIR/final_Clean_01.bam $midfileDIR/final_Clean_02.bam $midfileDIR/final_Clean_03.bam $midfileDIR/final_Clean_04.bam \
  $midfileDIR/final_Clean_05.bam $midfileDIR/final_Clean_06.bam $midfileDIR/final_Clean_07.bam $midfileDIR/final_Clean_08.bam \
  $midfileDIR/final_Clean_09.bam $midfileDIR/final_Clean_ns.bam $midfileDIR/dedup_SRS420686.bam $midfileDIR/dedup_SRS524489.bam \
  $midfileDIR/GGS_174_dedup.bam $midfileDIR/GGS_175_dedup.bam $midfileDIR/GGS_176_dedup.bam $midfileDIR/GGS_2887_dedup.bam \
  $midfileDIR/GGS_2888_dedup.bam $midfileDIR/GGS_2890_dedup.bam $midfileDIR/GGS_2891_dedup.bam $midfileDIR/GGS_2892_dedup.bam \
  $midfileDIR/GGS_2893_dedup.bam $midfileDIR/GGS_2895_dedup.bam $midfileDIR/GGS_3001_dedup.bam $midfileDIR/GGS_3002_dedup.bam \
  $midfileDIR/GGS_3003_dedup.bam $midfileDIR/GGS_3004_dedup.bam $midfileDIR/GGS_3005_dedup.bam $midfileDIR/GGS_3006_dedup.bam \
  $midfileDIR/GGS_3007_dedup.bam $midfileDIR/GGS_3008_dedup.bam $midfileDIR/GGS_3009_dedup.bam $midfileDIR/GGS_3010_dedup.bam \
  $midfileDIR/GGS_3011_dedup.bam $midfileDIR/GGS_3012_dedup.bam $midfileDIR/GGS_3016_dedup.bam $midfileDIR/GGS_3017_dedup.bam \
  $midfileDIR/GGS_3028_dedup.bam $midfileDIR/GGS_3038_dedup.bam $midfileDIR/GGS_3040_dedup.bam $midfileDIR/GGS_3041_dedup.bam \
  $midfileDIR/GGS_3042_dedup.bam $midfileDIR/GGS_3043_dedup.bam $midfileDIR/GGS_3044_dedup.bam $midfileDIR/GGS_3045_dedup.bam \
  $midfileDIR/GGS_3046_dedup.bam $midfileDIR/GGS_3047_dedup.bam $midfileDIR/GGS_3050_dedup.bam $midfileDIR/GGS_3051_dedup.bam \
  $midfileDIR/GGS_3052_dedup.bam $midfileDIR/GGS_3061_dedup.bam $midfileDIR/GGS_3069_dedup.bam $midfileDIR/GGS_3072_dedup.bam
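With this many arguments, a small pre-flight check may help confirm that the labels and input BAMs really do line up before launching a long run. The sketch below is not part of GRIDSS: `check_inputs` is a hypothetical helper, and its ordering test assumes each label appears somewhere in the corresponding BAM file name, as it does for the files in this thread. It only inspects the arguments, not the contents of any existing assembly file, so it would not detect a stale assembly BAM generated from a different number of samples.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of GRIDSS.

# check_inputs LABELS_CSV BAM...
# Succeeds only if there is exactly one label per BAM and, as a naming
# heuristic (an assumption about file naming), the i-th label occurs in
# the i-th BAM file name.
check_inputs() {
    local labels_csv="$1"; shift
    local -a labels bams=("$@")
    IFS=',' read -r -a labels <<< "$labels_csv"

    # One label per input file.
    if [ "${#labels[@]}" -ne "${#bams[@]}" ]; then
        echo "count mismatch: ${#labels[@]} labels, ${#bams[@]} BAMs" >&2
        return 1
    fi

    # Same order: compare each label with the file name at the same position.
    local i name
    for i in "${!labels[@]}"; do
        name=$(basename "${bams[$i]}")
        case "$name" in
            *"${labels[$i]}"*) ;;
            *) echo "position $((i + 1)): label ${labels[$i]} not found in $name" >&2
               return 1 ;;
        esac
    done
    echo "ok: ${#labels[@]} labels match ${#bams[@]} BAMs in order"
}

# Example with three of the samples from the command above
# (the "mid/" directory is a placeholder).
check_inputs "SRS589245,Clean_01,GGS_174" \
    mid/dedup_SRS589245.bam mid/final_Clean_01.bam mid/GGS_174_dedup.bam
```

Because the ordering test is a substring match, a label that is a prefix of another sample name (for example a hypothetical `GGS_30` against `GGS_3001_dedup.bam`) would pass incorrectly, so treat a success as a sanity check rather than proof.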