Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IO] Change how seqan3::field::alignment works with SAM/BAM files #2907

Closed
joshuak94 opened this issue Dec 1, 2021 · 4 comments · Fixed by #3058
Closed

[IO] Change how seqan3::field::alignment works with SAM/BAM files #2907

joshuak94 opened this issue Dec 1, 2021 · 4 comments · Fixed by #3058
Labels
bug faulty or wrong behaviour of code question a user question how to do certain things

Comments

@joshuak94
Copy link
Contributor

Currently, seqan3::field::alignment is a default field of SAM/BAM files. However, it requires a reference sequence in addition to the SAM/BAM input (if I'm understanding correctly). For the following case, this results in an error:

template <typename traits_type, typename fields_type, typename format_type>
inline auto do_something(seqan3::sam_file_input<traits_type, fields_type, format_type> & input)
{
    ...
    auto results_list = input | std::views::take_while([file_position](auto & rec) {return file_position != -1;})
                              | std::views::take_while([end](auto & rec) {return std::make_tuple(rec.reference_id().value(), rec.reference_position().value()) < end;})
                              | std::views::filter([](auto & rec) {return !unmapped(rec);})
                              | seqan3::views::to<std::vector>;
    seqan3::debug_stream << "Results: " << results_list << '\n';
    return results_list;

}

There is an error when passing an input file of type seqan3::sam_file_input with default fields.

The issue is that the default fields includes the seqan3::field::alignment field. So the debug_stream will try and print out the alignment for the records. However, there is no alignment actually stored.

Solution: Remove the seqan3::field::alignment field from the default fields for sam_file_input/output types.

Additionally, it would be nice if there were a way to enforce that seqan3::field::alignment can only be provided alongside a reference sequence file. I don't think I see anything in the documentation about how that actually works. Or a "default" empty alignment should be stored, to not cause an error when accessing an empty alignment.

@joshuak94 joshuak94 added the question a user question how to do certain things label Dec 1, 2021
@joshuak94
Copy link
Contributor Author

Minimal example:

seqan3::sam_file_input input_file{input_path};

for (auto & record : input_file)
{
    seqan3::debug_stream << record << '\n';
}

Will cause an error.

@joshuak94 joshuak94 added the bug faulty or wrong behaviour of code label Feb 15, 2022
@smehringer
Copy link
Member

This issue will be automatically solved once we remove the seqan3::field::alignment from the sam_file.

See related PRs:

@eseiler eseiler changed the title Change how seqan3::field::alignment works with SAM/BAM files. [IO] Change how seqan3::field::alignment works with SAM/BAM files Oct 20, 2022
@Lionward
Copy link

Lionward commented Oct 17, 2023

Minimal example:

seqan3::sam_file_input input_file{input_path};

for (auto & record : input_file)
{
    seqan3::debug_stream << record << '\n';
}

Will cause an error.

is there a solve for this? I am trying to read a bam file and exctract its cigar information using this function, is there an alternative way to do it for now?

this is returning the following error in my case:

terminate called after throwing an instance of 'seqan3::file_open_error'
  what():  Trying to read from a bgzf file, but no ZLIB available.
Aborted (core dumped)

this is my script, which is identical to the one mentioned in the documentation provided by the authors:

#include <filesystem>
 
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/all.hpp>
 
int main()
{
    auto filename = std::filesystem::current_path() / "mybam_file.bam"; 
    std::cout << filename << std::endl;
    seqan3::sam_file_input fin{filename}; // default fields
 
    for (auto & record : fin){
        seqan3::debug_stream << record.cigar_sequence() << '\n'; // access cigar vector
    }
}

the bamfile is tested to be working

@smehringer
Copy link
Member

Hi @Lionward,

your issue seems to something else. The error says that you are trying to read BAM, but there is no ZLIB available. Check that Zlib is installed on the system your have build seqan3.

  • If not, install it and rebuild the seqan3 script from scratch.
  • If it is, please provide more information on which seqan3 version and compiler your are using and how your build steps are.

Best, Svenja

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug faulty or wrong behaviour of code question a user question how to do certain things
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants