Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] not accepting BAM with empty header #2536

Merged
merged 2 commits into from
Apr 21, 2021
Merged

Conversation

eseiler
Copy link
Member

@eseiler eseiler commented Apr 20, 2021

Resolves #1201

@vercel
Copy link

vercel bot commented Apr 20, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/seqan/seqan3/4AdembVej9dTyC9fPALjAEcB4fSr
✅ Preview: https://seqan3-git-fork-eseiler-fix-bam3-seqan.vercel.app

@eseiler eseiler requested review from a team and MitraDarja and removed request for a team April 20, 2021 15:15
@MitraDarja MitraDarja requested review from a team and marehr and removed request for a team April 20, 2021 15:30
@marehr
Copy link
Member

marehr commented Apr 20, 2021

CI fails with:

../../../seqan3/include/seqan3/io/sam_file/format_bam.hpp:360:17: error: passing ‘const std::vector<std::__cxx11::basic_string<char> >’ as ‘this’ argument discards qualifiers [-fpermissive]
                 header.ref_ids().push_back(string_buffer);
                 ^~~~~~

include/seqan3/io/sam_file/format_bam.hpp Outdated Show resolved Hide resolved
include/seqan3/io/sam_file/format_bam.hpp Outdated Show resolved Hide resolved
Copy link
Member Author

@eseiler eseiler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up stuff?

string_buffer.resize(tmp32 - 1);
std::ranges::copy_n(std::ranges::begin(stream_view), tmp32 - 1, string_buffer.data()); // copy without \0 character
string_buffer.resize(l_name - 1);
std::ranges::copy_n(std::ranges::begin(stream_view), l_name - 1, string_buffer.data()); // copy without \0 character
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, this should copy the \0 character, because l_name is 1 + length including \0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean :)

Do want to copy that \0 too?

Copy link
Member Author

@eseiler eseiler Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says: We do not copy \0.
The code says: We copy \0.

So follow-up would be to double-check and, if we do not copy the \0, to add a comment as to why is that. Because l_name is one more than the length including \0, so l_name - 1 should include the \0.

Comment on lines +336 to +339
int32_t l_text{}; // length of header text including \0 character
int32_t n_ref{}; // number of reference sequences
int32_t l_name{}; // 1 + length of reference name including \0 character
int32_t l_ref{}; // length of reference sequence
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per specification, these should be uint32_t. Maybe there was a reason they are not, and maybe we can find that reason.
Just replacing does not work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, definitely a follow-up!

Copy link
Member

@marehr marehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This looks great!

@marehr marehr enabled auto-merge (squash) April 21, 2021 08:20
@eseiler
Copy link
Member Author

eseiler commented Apr 21, 2021

I'll do the follow-up issue once this is merged so that I can do permalinks to the code.

@codecov
Copy link

codecov bot commented Apr 21, 2021

Codecov Report

Merging #2536 (762dc09) into master (e9e55b3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2536   +/-   ##
=======================================
  Coverage   98.22%   98.22%           
=======================================
  Files         269      269           
  Lines       10526    10535    +9     
=======================================
+ Hits        10339    10348    +9     
  Misses        187      187           
Impacted Files Coverage Δ
include/seqan3/io/sam_file/format_bam.hpp 96.29% <100.00%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e9e55b3...762dc09. Read the comment docs.

@marehr marehr merged commit 1964715 into seqan:master Apr 21, 2021
@eseiler eseiler deleted the fix/bam3 branch May 14, 2021 14:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BAM unknown reference name
3 participants