-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[I/O] Remove column interface for sequence files #1412
Conversation
707f7bb
to
463bece
Compare
Rebased on master because the CHANGELOG caused a merge conflict. |
I had to change the solution of an alignment I/O tutorial exercise, because it was using the column interface. It is now realised with a for-loop over the file. |
Codecov Report
@@ Coverage Diff @@
## master #1412 +/- ##
==========================================
- Coverage 97.59% 97.58% -0.02%
==========================================
Files 232 232
Lines 8831 8783 -48
==========================================
- Hits 8619 8571 -48
Misses 212 212
Continue to review full report at Codecov.
|
|
||
auto mapq_filter = std::views::filter([] (auto & rec) { return get<field::MAPQ>(rec) >= 30; }); | ||
auto mapq_filter = std::views::filter([] (auto & rec) { return seqan3::get<seqan3::field::MAPQ>(rec) >= 30; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seqan3::get
works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I see no error...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, but I'm wondering why :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess because this get
has a seqan3::record
as argument...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this get is not a hidden friend of records
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work. Some questions and improvements that I have.
|
||
for (auto const & record : reference_file) | ||
{ | ||
ref_ids.push_back(seqan3::get<seqan3::field::ID>(record)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use std::move and capture the record by &&
std::vector<std::vector<dna5>> ref_seqs = get<field::SEQ>(reference_file); | ||
seqan3::sequence_file_input reference_file{tmp_dir/"reference.fasta"}; | ||
seqan3::concatenated_sequences<std::string> ref_ids{}; | ||
std::vector<std::vector<seqan3::dna5>> ref_seqs{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the sequences not stored in an concatenated set? It would make more sense here than for the ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I swap them!
seqan3::alignment_file_input mapping_file{tmp_dir/"mapping.sam", | ||
ref_ids, | ||
ref_seqs, | ||
seqan3::fields<seqan3::field::ID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please define the fields type befofe
|
||
auto mapq_filter = std::views::filter([] (auto & rec) { return get<field::MAPQ>(rec) >= 30; }); | ||
auto mapq_filter = std::views::filter([] (auto & rec) { return seqan3::get<seqan3::field::MAPQ>(rec) >= 30; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this get is not a hidden friend of records
auto & ref = get<0>(alignment); | ||
size_t sum_ref{}; | ||
std::ranges::for_each(ref.begin(), ref.end(), [&sum_ref] (auto c) { if (c == gap{}) ++sum_ref; }); | ||
std::ranges::for_each(ref.begin(), ref.end(), [&sum_ref] (auto c) { if (c == seqan3::gap{}) ++sum_ref; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use std::ranges::begin/end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I also use the range based for loop here.
size_t sum_read{}; | ||
std::ranges::for_each(read.begin(), read.end(), [&sum_read] (auto c) { if (c == gap{}) ++sum_read; }); | ||
std::ranges::for_each(read.begin(), read.end(), [&sum_read] (auto c) { if (c == seqan3::gap{}) ++sum_read; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use range based for loop here
@@ -135,8 +136,8 @@ auto const & get_or_ignore(record<field_types, field_ids> const & r) | |||
} | |||
|
|||
//!\copydoc seqan3::detail::get_or_ignore | |||
template <size_t i, typename ...types> | |||
auto & get_or_ignore(std::tuple<types...> & t) | |||
template <size_t i, template <tuple_like ...types_> typename tuple_like_t, typename ...types> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the this should be a tuple of tulles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it is not a tuple of tuples. The function was only valid for std::tuple
as input argument and I want to extend it for seqan3::common_pair. This is needed when we call for instance
fout = seqan3::views::zip(ids, seq_quals);
Then parameter t
has type seqan3::common_pair<string, vector<qualified>>
.
Instead of writing two additional overloads I generalised it to tuple_like
.
@@ -152,7 +152,7 @@ namespace seqan3 | |||
* The record-based interface treats the file as a range of tuples (the records), but in certain situations | |||
* you might have the data as columns, i.e. a tuple-of-ranges, instead of range-of-tuples. | |||
* | |||
* You can use column-based writing in that case, it uses operator=() : | |||
* You can use column-based writing in that case, it uses operator=() and views::zip(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* You can use column-based writing in that case, it uses operator=() and views::zip(): | |
* You can use column-based writing in that case, it uses operator=() and seqan3::views::zip(): |
8cfb404
to
6a9752d
Compare
I rebased on master to solve merge conflicts with the fields renaming (to lowercase). |
6a9752d
to
6be8f46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more change 💅 . Thank you.
seqan3::field::ref_id, | ||
seqan3::field::mapq, | ||
seqan3::field::alignment>{}}; | ||
auto const fields = seqan3::fields<seqan3::field::id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would only declare a type here with using. Then construct the type inplace of the constructor argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Can you please squash the last two commits. |
144b3ac
to
78ab4d2
Compare
Sure! |
I have to extend the record's
get_or_ignore
interface forcommon_pair
, which is the result ofranges::zip
with two arguments. Therefore, I use thetemplate template tuple_like
to cover bothstd::tuple
andcommon_pair
.Resolves #1411.