Error: Cannot read from buffer; Error: cannot load book-keeping #7012
Comments
@ccastane9, looks like a memory issue. Some questions -
|
@nalinigans, to answer your questions as best as I can (sorry, I'm a bit of a novice)
Newest output:
A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader
org.broadinstitute.hellbender.exceptions.UserException: Couldn't create GenomicsDBFeatureReader |
Your -Xmx option seems to be rather small. Can you try 32G or 48G? Also, if you can share the following files from the GenomicsDB array |
Hopefully I have attached the files correctly, I also tried increasing the -Xmx option but received the same error. Thanks again for all of your help! |
Thanks, we have reproduced the issue with your files. Did you see any errors logged during the GenomicsDBImport phase? What OS are you running on? Will you be able to help by re-running GenomicsDBImport with a debug version of the libtiledbgenomicsdb.so and
As a workaround for now, can you split the intervals passed to GenomicsDBImport - see https://gatk.broadinstitute.org/hc/en-us/articles/360035531852-Intervals-and-interval-lists? Splitting the chromosome into 2 or 3 roughly equal regions may help. |
Hi, I do not recall seeing any of those errors during the GenomicsDBImport phase, nor can I find any errors logged. I can recreate the database for chromosome 3 and tell you for sure whether any errors are logged. I am running GenomicsDBImport on a Linux v3.10.0-1127.19.1.el7.x86_64 amd64 server, and yes, I can rerun with a debug version. |
@ccastane9, what flavor of Linux is your server running on? |
@nalinigans scientific Linux 7.9 |
Hi, you have probably deleted the previous comment with the |
Yes, the "no space left on device" was temporary and has been fixed - I'm creating a new workspace with the --genomicsdb-shared-posixfs-optimizations turned on. |
Did you reproduce the issue after using |
@nalinigans I did, and the import was a success and I saw no errors pop up. However, I still cannot call variants. Below is the output for GenomicsDBImport, followed by the same errors from when I tried to call variants.
GenomicsDBImport output
A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader
org.broadinstitute.hellbender.exceptions.UserException: Couldn't create GenomicsDBFeatureReader |
@ccastane9, it looks like we are hitting the limits of zlib memory-wise. I had asked before, but as a workaround for now, can you split the intervals to GenomicsDBImport - see |
@nalinigans, would this be an appropriate interval command? The chromosome is roughly 121Mb, so I plan on passing 3 intervals to GenomicsDBImport. Or do I need to add commas?
/data1/_software/gatk-4.1.8.1/gatk --java-options "-Xmx16g" GenomicsDBImport --reference /data1/EquCab/_ECA30/Equus_caballus.EquCab3.0.dna_sm.toplevel.fa/ |
@ccastane9 The command you show is on the right track. Couple of things:
For completeness, I'll note that you don't necessarily have to have a single interval per workspace (though you may want to for scatter gather parallelism). You can specify multiple intervals per workspace. In your case, that could look something like |
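The "2 or 3 roughly equal regions" suggestion can be sketched as below. This is only an illustration: `split_contig` is a hypothetical helper, not part of GATK, and the contig name `1` and length `188260577` are assumptions read off the array name `1$1$188260577` in the error output, which appears to encode contig, start and end. Take the real length from your reference's `.fai` or `.dict` file.

```python
def split_contig(contig, length, parts):
    """Split positions 1..length into `parts` contiguous, roughly equal
    intervals, formatted as 1-based closed GATK-style strings."""
    if parts < 1 or length < parts:
        raise ValueError("need 1 <= parts <= length")
    base, extra = divmod(length, parts)
    intervals = []
    start = 1
    for i in range(parts):
        # The first `extra` intervals absorb the remainder, one base each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        intervals.append(f"{contig}:{start}-{end}")
        start = end + 1
    return intervals


if __name__ == "__main__":
    # Assumed contig name and length; replace with values for your reference.
    for interval in split_contig("1", 188260577, 3):
        print(interval)
```

Each resulting string can be passed to GenomicsDBImport with `-L`, either all to one workspace or one per workspace if you want to scatter the work.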
@mlathara thank you so much for the clarification, I will try to break this chromosome into multiple intervals for the GenomicsDBImport and once more try to call variants. Again, truly appreciate all of the help! |
@nalinigans @mlathara it seems that breaking into intervals during the GenomicsDBImport is solving the problem and allowing me to joint call variants. Thanks for all of the help in this! |
Actually, it seems it will let me call variants for part 1 of my chromosome, although I seem to get empty files for the part 2 and part 3 databases when trying to joint call the variants in those intervals. In my GenotypeGVCFs script I also specified intervals matching those used during GenomicsDBImport. GenotypeGVCFs itself does seem to be running without throwing errors, though, and may just be taking a while to start processing variants. Edit: it was just taking a while! |
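For anyone repeating this with one workspace per interval, here is a minimal sketch of how the matching GenotypeGVCFs calls might be generated so that each workspace is queried with the interval it was imported with. The helper name, workspace paths, reference path and output prefix are all hypothetical. It uses only the standard `-R`, `-V gendb://`, `-L` and `-O` arguments and reuses the `-Xmx16g` setting from this thread. It only builds the command strings and does not run them.

```python
import shlex


def genotype_commands(reference, workspaces, intervals, out_prefix):
    """Build one GenotypeGVCFs command per (workspace, interval) pair."""
    if len(workspaces) != len(intervals):
        raise ValueError("each workspace needs exactly one matching interval")
    commands = []
    for idx, (workspace, interval) in enumerate(zip(workspaces, intervals), start=1):
        args = [
            "gatk", "--java-options", "-Xmx16g", "GenotypeGVCFs",
            "-R", reference,
            "-V", f"gendb://{workspace}",
            "-L", interval,  # same interval that was given to GenomicsDBImport
            "-O", f"{out_prefix}.part{idx}.vcf.gz",
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical paths and intervals, for illustration only.
    for cmd in genotype_commands(
        "ref.fa",
        ["ws_part1", "ws_part2"],
        ["1:1-62753526", "1:62753527-125507052"],
        "joint",
    ):
        print(cmd)
```

As noted above, a part that looks empty at first may simply not have started emitting variants yet.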
What is the status of this issue? I have encountered exactly the same error with GATK 4.2.1.0. |
@dwuab which issue/error are you specifically referring to? As indicated in the last message before you posted, the previous user was able to use GenomicsDBImport and GenotypeGVCFs after following our suggestions to break up large chromosomes into smaller intervals. |
@mlathara The workaround of using multiple intervals does work (it took me more than one week to confirm, implying how annoying this bug could be), but several versions later, GATK still has this issue and could not even produce an informative error message. I could not find any mention of this issue in GATK's documentation. Is this bug going to be fixed? |
@dwuab, we are making some performance improvements with GenomicsDB and still are in the testing stage. Just wondering if you could try gatk from this branch |
@nalinigans Thanks. I tried the genomicsdb_142 branch. While I did not try it on chr1 and chr2, I tried it on chrX. With GATK 4.2.1.0, I encountered the problem described above while importing and joint-calling chrX, which is a bit weird since I had no problem with chr3, 4, etc. Now with genomicsdb_142, chrX has been imported with only one interval, and joint-calling is running fine at this moment. |
Hello,
I have been using the GenotypeGVCFs function to call variants on roughly 300 whole-genome-sequenced individuals. I have not run into any issues when calling variants for these individuals on the majority of chromosomes. However, when I use the same script for chromosomes 1, 2 and 3 of the species, I get the error "Couldn't create GenomicsDBFeatureReader", as in issue #6616, although I believe our issues may differ because I also get the errors "Cannot read from buffer" and "Cannot load book-keeping; Reading tile offsets failed".
Below is the computer output:
Using GATK jar /data1/_software/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Xmx16g -jar /data1/_software/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar GenotypeGVCFs --reference /data1/EquCab/_ECA30/Equus_caballus.EquCab3.0.dna_sm.toplevel.fa/ -V gendb://ECA3_GenomicsDB_260/1 -O ECA3_GenomicsDB_260.1.g.vcf.gz
13:56:51.939 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data1/_software/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Dec 21, 2020 1:56:52 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
13:56:52.185 INFO GenotypeGVCFs - ------------------------------------------------------------
13:56:52.186 INFO GenotypeGVCFs - The Genome Analysis Toolkit (GATK) v4.1.8.1
13:56:52.186 INFO GenotypeGVCFs - For support and documentation go to https://software.broadinstitute.org/gatk/
13:56:52.186 INFO GenotypeGVCFs - Executing as ccastane9@andersserver-01.cvm.tamu.edu on Linux v3.10.0-1127.19.1.el7.x86_64 amd64
13:56:52.186 INFO GenotypeGVCFs - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_275-b01
13:56:52.186 INFO GenotypeGVCFs - Start Date/Time: December 21, 2020 1:56:51 PM CST
13:56:52.186 INFO GenotypeGVCFs - ------------------------------------------------------------
13:56:52.186 INFO GenotypeGVCFs - ------------------------------------------------------------
13:56:52.187 INFO GenotypeGVCFs - HTSJDK Version: 2.23.0
13:56:52.187 INFO GenotypeGVCFs - Picard Version: 2.22.8
13:56:52.187 INFO GenotypeGVCFs - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:56:52.187 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:56:52.187 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:56:52.187 INFO GenotypeGVCFs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:56:52.187 INFO GenotypeGVCFs - Deflater: IntelDeflater
13:56:52.188 INFO GenotypeGVCFs - Inflater: IntelInflater
13:56:52.188 INFO GenotypeGVCFs - GCS max retries/reopens: 20
13:56:52.188 INFO GenotypeGVCFs - Requester pays: disabled
13:56:52.188 INFO GenotypeGVCFs - Initializing engine
13:56:53.115 INFO GenomicsDBLibLoader - GenomicsDB native library version : 1.3.0-e701905
[TileDB::Buffer] Error: Cannot read from buffer; End of buffer reached.
[TileDB::BookKeeping] Error: Cannot load book-keeping; Reading tile offsets failed.
13:57:15.762 INFO GenotypeGVCFs - Shutting down engine
[December 21, 2020 1:57:15 PM CST] org.broadinstitute.hellbender.tools.walkers.GenotypeGVCFs done. Elapsed time: 0.40 minutes.
Runtime.totalMemory()=2119696384
A USER ERROR has occurred: Couldn't create GenomicsDBFeatureReader
org.broadinstitute.hellbender.exceptions.UserException: Couldn't create GenomicsDBFeatureReader
at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:410)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:326)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:282)
at org.broadinstitute.hellbender.engine.VariantLocusWalker.initializeDrivingVariants(VariantLocusWalker.java:76)
at org.broadinstitute.hellbender.engine.VariantWalkerBase.initializeFeatures(VariantWalkerBase.java:67)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.VariantLocusWalker.onStartup(VariantLocusWalker.java:63)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.io.IOException: GenomicsDB JNI Error: VariantQueryProcessorException : Could not open array 1$1$188260577 at workspace: /data1/EquCab/GenomicsDB/ECA3_GenomicsDB_260/1
TileDB error message : [TileDB::BookKeeping] Error: Cannot load book-keeping; Reading tile offsets failed
at org.genomicsdb.reader.GenomicsDBQueryStream.jniGenomicsDBInit(Native Method)
at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:209)
at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:182)
at org.genomicsdb.reader.GenomicsDBQueryStream.<init>(GenomicsDBQueryStream.java:91)
at org.genomicsdb.reader.GenomicsDBFeatureReader.generateHeadersForQuery(GenomicsDBFeatureReader.java:200)
at org.genomicsdb.reader.GenomicsDBFeatureReader.<init>(GenomicsDBFeatureReader.java:85)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getGenomicsDBFeatureReader(FeatureDataSource.java:407)
... 12 more
I'm assuming the problem lies somewhere in the files for array 1$1$188260577, possibly the _book_keep.tbs.gz file, although I'm not sure how to go about troubleshooting it. I also recreated the database for these chromosomes (still using the same scripts as for the other chromosomes, where variant calling was successful) to see whether something went wrong during the initial database creation. I still received this error when I tried to call variants.
What is most confusing to me is that this issue isn't happening for every chromosome, just the first 3. Any advice to get over this hump is greatly appreciated, and let me know if there is more information you need to help troubleshoot.
Thanks,
Caitlin