When using the Bowtie2 indexer on very large genomes, e.g. the Axolotl genome (~32GB), the auto-detection of small/large genome sequences doesn't seem to work as expected:
...
Building a SMALL index
Reading reference sizes
Building a SMALL index
Reading reference sizes
Error: Reference sequence has more than 2^32-1 characters! Please build a large index by passing the --large-index option to bowtie2-build
Time reading reference sizes: 00:00:57
Total time for call to driver() for forward index: 00:01:00
Error: Encountered internal Bowtie 2 exception (#1)
Command: bowtie2-build --wrapper basic-0 -f --threads 2 genome_mfa.GA_conversion.fa BS_GA
Deleting "BS_GA.3.bt2" file written during aborted indexing attempt.
Deleting "BS_GA.4.bt2" file written during aborted indexing attempt.
Error: Reference sequence has more than 2^32-1 characters! Please build a large index by passing the --large-index option to bowtie2-build
Time reading reference sizes: 00:00:57
Total time for call to driver() for forward index: 00:01:00
Error: Encountered internal Bowtie 2 exception (#1)
Command: bowtie2-build --wrapper basic-0 -f --threads 2 genome_mfa.CT_conversion.fa BS_CT
Deleting "BS_CT.3.bt2" file written during aborted indexing attempt.
Deleting "BS_CT.4.bt2" file written during aborted indexing attempt.
Parent process: Failed to build index
It appears that we need to allow passing the indexing option --large-index through to bowtie2-build to make this work.
PS: It works in the default (single-core) indexing mode, i.e. it detects the genome size and automatically generates a large index. The wallclock time was roughly 2d 6h, and it took ~150GB of RAM.
It appears that HISAT2 is also failing, with the very same message (which still refers to bowtie2-build):
...Reading reference sizes
Reading reference sizes
Error: Reference sequence has more than 2^32-1 characters! Please build a large index by passing the --large-index option to bowtie2-build
Time reading reference sizes: 00:00:53
Total time for call to driver() for forward index: 00:00:57
Error: Encountered internal HISAT2 exception (#1)
Command: hisat2-build --wrapper basic-0 -f --threads 2 genome_mfa.CT_conversion.fa BS_CT
Deleting "BS_CT.1.ht2" file written during aborted indexing attempt.
Deleting "BS_CT.2.ht2" file written during aborted indexing attempt.
Deleting "BS_CT.3.ht2" file written during aborted indexing attempt.
Deleting "BS_CT.4.ht2" file written during aborted indexing attempt.
Parent process: Failed to build index
I have now added a new option, --large-index, to bismark_genome_preparation, which should hopefully fix the auto-detection problem. Tests are currently under way, but will probably take a day or two to complete. We should also consider reporting this to the Bowtie 2 and HISAT2 developers. Added here: 5de68d5.
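For anyone who wants to check up front whether a converted genome will need the flag, here is a minimal Python sketch. It is not what bismark_genome_preparation or bowtie2-build do internally. It just counts the sequence characters in a FASTA file, compares the total against the 2^32-1 limit quoted in the error message, and assembles a bowtie2-build command line accordingly. The helper names `count_reference_chars` and `build_index_command` are made up for illustration, and the assumption that every non-header character counts towards the limit is mine.

```python
SMALL_INDEX_LIMIT = 2**32 - 1  # limit quoted in the bowtie2-build error message


def count_reference_chars(fasta_path):
    """Count sequence characters in a FASTA file, skipping '>' header lines.

    Assumption: every non-header character counts towards the limit.
    """
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def build_index_command(fasta_path, index_prefix, threads=2,
                        limit=SMALL_INDEX_LIMIT):
    """Return a bowtie2-build argument list (hypothetical helper).

    --large-index is added explicitly when the reference exceeds `limit`,
    instead of relying on the auto-detection.
    """
    cmd = ["bowtie2-build", "-f", "--threads", str(threads)]
    if count_reference_chars(fasta_path) > limit:
        cmd.append("--large-index")
    cmd += [fasta_path, index_prefix]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. For example, `build_index_command("genome_mfa.CT_conversion.fa", "BS_CT")` would include `--large-index` for a genome of the Axolotl's size. Note that counting ~32GB of sequence this way takes a while.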