bam_aux_update_int fails non-deterministically #127

charlesgregory opened this issue Dec 31, 2021 · 0 comments

Finally traced a bug in fade (blachlylab/fade#22) down to this function (exception thrown on line 447):

void opIndexAssign(T)(T value, string index)
if(!isArray!T || isSomeString!T)
{
    static if(isIntegral!T){
        auto err = bam_aux_update_int(b, index[0..2], value);
    }else static if(is(T==float)){
        auto err = bam_aux_update_float(b, index[0..2], value);
    }else static if(isSomeString!T){
        auto err = bam_aux_update_str(b, index[0..2], cast(int) value.length, value.ptr);
    }
    if(err == 0) return;
    switch(errno){
        case EINVAL:
            throw new Exception("The bam record's aux data is corrupt or an existing tag" ~
            " with the given ID is not of an integer type (c, C, s, S, i or I).");
        case ENOMEM:
            throw new Exception("The bam data needs to be expanded and either the attempt" ~
            " to reallocate the data buffer failed or the resulting buffer would be longer" ~
            " than the maximum size allowed in a bam record (2Gbytes).");
        case ERANGE:
        case EOVERFLOW:
            throw new Exception("val is outside the range that can be stored in an integer" ~
            " bam tag (-2147483647 to 4294967295).");
        case -1:
            throw new Exception("Something went wrong adding the bam tag.");
        case 0:
            return;
        default:
            throw new Exception("Something went wrong adding the bam tag.");
    }
}

bam_aux_update_int was non-deterministically failing with a return value of -1. It is supposed to set errno in all cases except when the aux data is corrupt. I checked the aux data manually and it appeared intact. If the aux data were corrupt, we should also get an hts_log warning from bam_aux_get, which bam_aux_update_int calls.

errno was being set to 4, which is:

enum EINTR              = 4;        /// Interrupted system call

This errno value is not used by any bam aux functions.
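
One possibility worth ruling out (an assumption on my part, not something confirmed in this issue) is that the EINTR is stale. errno is not cleared by calls that succeed, so a value left behind by an earlier interrupted system call on the same thread would still be visible when opIndexAssign reads it. A minimal sketch of clearing errno right before the call follows. `updateTag`, `CallResult` and `callWithCleanErrno` are hypothetical names for illustration only, and `updateTag` is a stand-in that fails without touching errno, not the real htslib function:

```d
import core.stdc.errno : errno, EINTR;

/// Hypothetical stand-in for a C call that returns -1 without setting errno.
/// This is NOT bam_aux_update_int; it only mimics the observed failure mode.
int updateTag()
{
    return -1;
}

/// Return code of the call plus the errno value observed after it.
struct CallResult
{
    int ret; // value returned by the wrapped call
    int err; // errno after a failed call, 0 if the call succeeded
}

/// Clears errno before calling `fn`, so any value read afterwards was set
/// during the call and is not left over from earlier code on this thread.
CallResult callWithCleanErrno(int function() fn)
{
    errno = 0;
    immutable ret = fn();
    return CallResult(ret, ret == 0 ? 0 : errno);
}
```

If errno reads 0 after a failed call wrapped this way, the 4 seen earlier came from somewhere other than the bam aux function, which would fit the "return -1 without setting errno" path.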

Furthermore, I found that running bam_aux_update_int again within the same opIndexAssign call would succeed. I am not sure whether this is a bug in htslib or in dhtslib. The htslib developers have been working out bugs in htslib over the past year with respect to sam multithreading, and fade is heavily multi-threaded. So potentially this is related to those issues?

My current workaround is to make the default switch case call this function recursively (and log it). This is obviously not ideal but does seem to fix the issue and generate valid bam output from fade.

        switch(errno){
            case EINVAL:
                throw new Exception("The bam record's aux data is corrupt or an existing tag" ~ 
                " with the given ID is not of an integer type (c, C, s, S, i or I).");
            case ENOMEM:
                throw new Exception("The bam data needs to be expanded and either the attempt" ~
                " to reallocate the data buffer failed or the resulting buffer would be longer" ~
                " than the maximum size allowed in a bam record (2Gbytes).");
            case ERANGE:
            case EOVERFLOW:
                throw new Exception("val is outside the range that can be stored in an integer" ~ 
                " bam tag (-2147483647 to 4294967295).");
            case -1:
                throw new Exception("Something went wrong adding the bam tag.");
            case 0:
                return;
            default:
                // No longer throw but call recursively
                hts_log_info(__FUNCTION__, "Undefined issue adding/updating bam tag. Trying again...");
                return opIndexAssign!T(value, index);
        }
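
The recursive retry above has no exit condition, so a failure that never clears would recurse until the stack overflows. A sketch of a bounded alternative, assuming an arbitrary limit of 3 attempts and a delegate in place of the htslib call. `retryUpdate` and `maxAttempts` are hypothetical names, not part of dhtslib:

```d
/// Calls `update` until it returns 0 or `maxAttempts` calls have been made.
/// Returns the number of attempts used; throws if every attempt failed.
int retryUpdate(scope int delegate() update, int maxAttempts = 3)
{
    foreach (attempt; 1 .. maxAttempts + 1)
    {
        if (update() == 0)
            return attempt; // succeeded on this attempt
    }
    throw new Exception("Adding/updating bam tag failed after retries.");
}
```

In opIndexAssign the delegate would wrap the relevant bam_aux_update_* call, and the returned attempt count could be logged whenever it is greater than 1.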

It should also be noted that in order to make the issue show up with my test data, I had to use mimalloc to override the default malloc and friends. However, the downstream fade user hit the issue in binaries built with default glibc allocation or musl-c allocation, so I highly doubt this is a bug associated with mimalloc use.
