Bam record should validate sequence and quality length before writing #59
Comments
Sure, if you'd like to do this, let's do it in the BAM record writer. The SAM record writer can be used as a reference: noodles/noodles-bam/src/writer/sam_record.rs, lines 101 to 124 in b4124cf
Looking at the specification there's another constraint that is enforced by the current version of
It's probably best to also require this.
To avoid writing corrupt or incompatible SAM/BAM files, an extra validation step is applied before writing. This makes sure that:

- The number of base qualities is equal to the sequence length, or the base qualities are empty
- The length of the CIGAR operations matches the sequence length

Closes zaeleus#59
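The two checks listed in this commit message could be sketched roughly as follows. This is a minimal, standalone illustration and not the code from the commit: `Kind`, `ValidationError`, and `validate` are hypothetical names, and the records are modeled as plain slices rather than the noodles types. It assumes that only the M, I, S, = and X operations count toward the read length, and that the CIGAR check is skipped when either the CIGAR or the sequence is absent.

```rust
/// Hypothetical CIGAR operation kinds (not the noodles types).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Match,
    Insertion,
    Deletion,
    Skip,
    SoftClip,
    HardClip,
    Pad,
    SequenceMatch,
    SequenceMismatch,
}

impl Kind {
    /// Returns whether the operation consumes bases of the read (M, I, S, =, X).
    fn consumes_read(self) -> bool {
        matches!(
            self,
            Kind::Match
                | Kind::Insertion
                | Kind::SoftClip
                | Kind::SequenceMatch
                | Kind::SequenceMismatch
        )
    }
}

/// Hypothetical error type for the two checks.
#[derive(Debug, PartialEq, Eq)]
enum ValidationError {
    QualityScoresLengthMismatch { sequence_len: usize, quality_scores_len: usize },
    CigarLengthMismatch { sequence_len: usize, cigar_read_len: usize },
}

/// Checks a record's fields before it is written.
fn validate(
    sequence: &[u8],
    quality_scores: &[u8],
    cigar: &[(Kind, usize)],
) -> Result<(), ValidationError> {
    // Base qualities may be omitted; otherwise there must be one per base.
    if !quality_scores.is_empty() && quality_scores.len() != sequence.len() {
        return Err(ValidationError::QualityScoresLengthMismatch {
            sequence_len: sequence.len(),
            quality_scores_len: quality_scores.len(),
        });
    }

    // Assumption: the check only applies when both CIGAR and sequence are present.
    if !cigar.is_empty() && !sequence.is_empty() {
        let cigar_read_len: usize = cigar
            .iter()
            .filter(|(kind, _)| kind.consumes_read())
            .map(|(_, len)| *len)
            .sum();

        if cigar_read_len != sequence.len() {
            return Err(ValidationError::CigarLengthMismatch {
                sequence_len: sequence.len(),
                cigar_read_len,
            });
        }
    }

    Ok(())
}
```

Under these assumptions, a 4-base read with CIGAR `2M3D1S` is rejected, because only 3 of its operation lengths consume read bases, while `4M` with no base qualities passes.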
In the encoded BAM record, a single field (l_seq) encodes the length of both the sequence and the base qualities. This is not validated when writing a bam::Record, which can result in corrupt BAM files.
According to the BAM spec, when base qualities are omitted, every quality byte should be written as 0xFF instead.
This can be done in the writer code, but it's probably cleaner to add a validate function on the record. Should I do a PR?
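For illustration, the writer-side handling of the quality field described in this issue might look like the sketch below. It is not the noodles implementation: `encode_quality_scores` is a hypothetical helper, and it takes the sequence length as a plain `usize`. It fills the field with 0xFF bytes when the base qualities are omitted and rejects any other length mismatch.

```rust
/// Builds the bytes of the BAM quality field for a record with `l_seq` bases.
///
/// Hypothetical helper: omitted base qualities are written as `l_seq` bytes
/// of 0xFF; present base qualities must have exactly `l_seq` entries.
fn encode_quality_scores(l_seq: usize, quality_scores: &[u8]) -> Result<Vec<u8>, String> {
    if quality_scores.is_empty() {
        // Base qualities omitted: fill the field with 0xFF.
        Ok(vec![0xff; l_seq])
    } else if quality_scores.len() == l_seq {
        Ok(quality_scores.to_vec())
    } else {
        Err(format!(
            "quality scores length mismatch: expected {}, got {}",
            l_seq,
            quality_scores.len()
        ))
    }
}
```

Returning an error rather than padding or truncating keeps the written quality field consistent with the single length field shared by the sequence and the base qualities.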