Commit

Merge branch 'master' of https://github.com/PapenfussLab/gridss
d-cameron committed May 20, 2018
2 parents 624a04f + 7cf86fa commit e8e5ce1
Showing 1 changed file with 4 additions and 33 deletions.
37 changes: 4 additions & 33 deletions Readme.md
@@ -322,39 +322,10 @@ WORKING_DIR/*output*.gridss.working/*output*.breakpoint.vcf | Raw unannotated va
Maven is used for build and dependency management which simplifies compile to the following steps:

* `git clone https://github.com/PapenfussLab/gridss`
* `cd gridss`
* `mvn clean package`

If GRIDSS was built successfully, a combined jar containing GRIDSS and all required libraries located at target/GRIDSS-_VERSION_-jar-with-dependencies.jar will have been created.


# Multi-mapping read alignment

**WARNING: Multi-mapping support has been deprecated and will be removed in a future release.** Our benchmarking results indicate that multi-mapping read alignment increases overall sensitivity by around 25% but reduces precision by 75%. Only use multi-mapping alignments if extremely low precision is acceptable for your use case (hint: it probably isn't).

~~GRIDSS supports input files that report multiple read alignments for each input read. GRIDSS supports multi-mapping read alignments in which each alignment is reported as an independent SAM record with the primary.secondary flags set as per the SAM specifications.~~

~~As the default GRIDSS configuration parameters are suitable for input BAMs in which only the best read alignment is reported, some configuration settings overrides must be supplied. If the aligner does not report a meaningful MAPQ for multi-mapping alignments (true for all aligners we have tested), the scoring model should be updated to ignore mapping quality. The mapping quality filter should be removed and the maximum coverage increased. A typical configuration file for a multi-mapping input looks as follows:~~


> ~~multimapping=true~~
> ~~maxCoverage=50000~~
> ~~minMapq=0~~
> ~~variantcalling.lowQuality=13~~
> ~~variantcalling.minScore=2~~
> ~~variantcalling.minSize=32~~
> ~~scoring.model=ReadCount~~
~~When multi-mapping mode is enabled, both `INPUT` and `INPUT_NAME_SORTED` must be supplied for all input files.~~

~~For multi-mapping input files, GRIDSS must perform the additional steps of uniquely assigning reads to assemblies and reads to variant calls. As this cannot be done in a streaming manner, GRIDSS caches this information in a large off-heap lookup table. This lookup uses a large amount of memory in addition to the 32GB heap. For human 50x WGS with up to 100 alignment locations per read (thus resulting in a 5TB input BAM file), approximately 300GB of memory is required for this lookup. This additional memory is only required for multi-mapping input files (e.g. mrFAST alignment, or the bowtie2 -k and -a options).~~

~~Due to 3rd party library dependencies, multi-mapping mode requires a full Java JDK installed and will not run with just a JRE.~~
If GRIDSS was built successfully, a combined jar containing GRIDSS and all required libraries located at target/GRIDSS-_VERSION_-gridss-jar-with-dependencies.jar will have been created.

# Error Messages

@@ -385,11 +356,11 @@ Can you run the bwa command exactly as it appears in the error message?
### (Too many open files)

GRIDSS has attempted to open too many files at once and the OS file handle limit has been reached.
On Linux, `ulimit -n` displays your current limit. This error is likely to be encountered if you have specified a large number of input files or threads but can also be encountered when processing many small contigs. The following solution is recommended:
On Linux, `ulimit -n` displays your current limit. This error is likely to be encountered if you have specified a large number of input files or threads. The following solution is recommended:
* Increase your OS limit on open file handles (eg `ulimit -n _<larger number>_`)
* Add `-Dgridss.defensiveGC=true` to the java command-line used for GRIDSS. Memory mapped file handles are not released to the OS until the buffer is garbage collected. This option adds a request for garbage collection whenever a file handle is no longer used.

Other options that have solve this problem include:
Other options that have solved this problem include:
* Reduce number of worker threads. A large number of input files being processed in parallel results in a large number of files open at the same time.
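
The recommendations above can be sketched as a short shell session. This is only an illustration, not part of the GRIDSS documentation: the target value `4096` is a placeholder, and the rest of the java command line is elided because it depends on your own invocation. Only `ulimit` and the `-Dgridss.defensiveGC=true` option come from the text above.

```shell
#!/usr/bin/env bash
# Inspect the current soft and hard limits on open file handles.
soft_limit=$(ulimit -Sn)
hard_limit=$(ulimit -Hn)
echo "soft limit: ${soft_limit}, hard limit: ${hard_limit}"

# Hypothetical target; choose a value that suits your number of inputs and threads.
target=4096

# Raise the soft limit for this shell session only if it is currently lower.
# An unprivileged user cannot raise it above the hard limit, so report a failure
# instead of aborting.
if [ "${soft_limit}" != "unlimited" ] && [ "${soft_limit}" -lt "${target}" ]; then
    ulimit -Sn "${target}" || echo "could not raise limit above hard limit ${hard_limit}" >&2
fi
echo "soft limit now: $(ulimit -Sn)"

# Then start GRIDSS from the same shell with the defensive GC option, for example:
#   java -Dgridss.defensiveGC=true ... (remainder of your usual GRIDSS command line)
```

The raised limit applies only to the current shell and the processes it starts, so GRIDSS must be launched from the same session.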

### Reference genome used by _input.bam_ does not match reference genome _reference.fa_. The reference supplied must match the reference used for every input.