[I/O] Remove column interface for structure files #1398

Merged
merged 2 commits into from
Dec 6, 2019
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -33,7 +33,7 @@ If possible, provide tooling that performs the changes, e.g. a shell-script.
#### Core
* Added traits for "metaprogramming" with `seqan3::type_list` and type packs.

#### Input/Output
#### I/O

* Asynchronous input (background file reading) supported via seqan3::view::async_input_buffer.
* Reading field::CIGAR into a vector over seqan3::cigar is supported via seqan3::alignment_file_input.
@@ -58,6 +58,11 @@ If possible, provide tooling that performs the changes, e.g. a shell-script.
* **The `type_list` header has moved:**
If you included `<seqan3/core/type_list.hpp>` you need to change the path to `<seqan3/core/type_list/type_list.hpp>`.

#### I/O

* The field-based in- and output interface for structure files through std::get and std::tie has been removed.
Output can instead be achieved with seqan3::views::zip(); for input we will implement unzip() in the future.

#### Range

* **The `seqan3::concatenated_sequences::data()` function has been deprecated:**
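
The I/O entry in this changelog hunk says that column-wise input is gone and that `unzip()` is only planned. Until then, columns can be collected by walking the records once. The following is a minimal sketch of that idea in plain C++17, not SeqAn3 code: `record_t`, `columns_t` and `collect_columns` are hypothetical names chosen for illustration, and the record is assumed to carry only a sequence, an ID and a structure as strings.

```cpp
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one record of a structure file:
// (sequence, id, structure). This is not a SeqAn3 type.
using record_t = std::tuple<std::string, std::string, std::string>;

// Hypothetical tuple-of-ranges view of the same data, one vector per field.
struct columns_t
{
    std::vector<std::string> seqs;
    std::vector<std::string> ids;
    std::vector<std::string> structures;
};

// "Unzip" by hand: iterate the range of records once and append each field
// of a record to its own column.
template <typename record_range_t>
columns_t collect_columns(record_range_t && records)
{
    columns_t cols;
    for (auto && [seq, id, structure] : records)
    {
        cols.seqs.push_back(seq);
        cols.ids.push_back(id);
        cols.structures.push_back(structure);
    }
    return cols;
}
```

With a real file, the loop would run over the records that the input file yields. The sketch copies every field, which keeps the columns valid after the file object is gone.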
299 changes: 0 additions & 299 deletions include/seqan3/io/structure_file/input.hpp

Large diffs are not rendered by default.

167 changes: 1 addition & 166 deletions include/seqan3/io/structure_file/output.hpp
@@ -154,7 +154,7 @@ namespace seqan3
* The record-based interface treats the file as a range of tuples (the records), but in certain situations
* you might have the data as columns, i.e. a tuple-of-ranges, instead of range-of-tuples.
*
* You can use column-based writing in that case, it uses operator=() :
* You can use column-based writing in that case, it uses operator=() and views::zip():
*
* \include test/snippet/io/structure_file/structure_file_output_col_based.cpp
*
@@ -543,89 +543,6 @@ class structure_file_output
}
//!\}

/*!\name Tuple interface
* \brief Provides functions for field-based ("column"-based) writing.
* \{
*/
/*!\brief Write columns (wrapped in a seqan3::record) to the file.
* \tparam typelist Template argument to seqan3::record, each type must be a column (range-of-range).
* \tparam field_ids Template argument to seqan3::record, the IDs corresponding to the columns.
* \param[in] r The record of columns.
*
* \details
*
* \attention This is not part of the row-based file writing; the seqan3::record does not represent a file record,
* it is a tuple of the columns (with field information).
*
* ### Complexity
*
* Linear in the size of the columns.
*
* ### Exceptions
*
* Basic exception safety.
*
* ### Example
*
* \include test/snippet/io/structure_file/structure_file_output_col_based.cpp
*/
template <typename typelist, typename field_ids>
structure_file_output & operator=(record<typelist, field_ids> const & r)
{
write_columns(detail::range_wrap_ignore(detail::get_or_ignore<field::SEQ>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::ID>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::BPP>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::STRUCTURE>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::STRUCTURED_SEQ>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::ENERGY>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::REACT>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::REACT_ERR>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::COMMENT>(r)),
detail::range_wrap_ignore(detail::get_or_ignore<field::OFFSET>(r)));
return *this;
}

/*!\brief Write columns (wrapped in a std::tuple) to the file.
* \tparam arg_types The column types, each type must be a range-of-range.
* \param[in] t The tuple of columns.
*
* \details
*
* The columns are assumed to correspond to the field IDs given in selected_field_ids, however passing less
* is accepted if the format does not require all of them.
*
* ### Complexity
*
* Linear in the size of the columns.
*
* ### Exceptions
*
* Basic exception safety.
*
* ### Example
*
* \include test/snippet/io/structure_file/structure_file_output_col_based.cpp
*
*/
template <typename ...arg_types>
structure_file_output & operator=(std::tuple<arg_types...> const & t)
{
// index_of might return npos, but this will be handled well by get_or_ignore (and just return ignore)
write_columns(
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::SEQ)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::ID)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::BPP)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::STRUCTURE)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::STRUCTURED_SEQ)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::ENERGY)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::REACT)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::REACT_ERR)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::COMMENT)>(t)),
detail::range_wrap_ignore(detail::get_or_ignore<selected_field_ids::index_of(field::OFFSET)>(t)));
return *this;
}
//!\}

//!\brief The options are public and its members can be set directly.
structure_file_output_options options;

@@ -723,88 +640,6 @@ class structure_file_output
}, format);
}

//!\brief Write columns to file format, only tag-dispatch once.
template <std::ranges::input_range seq_type,
std::ranges::input_range id_type,
std::ranges::input_range bpp_type,
std::ranges::input_range structure_type,
std::ranges::input_range structured_seq_type,
std::ranges::input_range energy_type,
std::ranges::input_range react_type,
std::ranges::input_range comment_type,
std::ranges::input_range offset_type>
void write_columns(seq_type && seq,
id_type && id,
bpp_type && bpp,
structure_type && structure,
structured_seq_type && structured_seq,
energy_type && energy,
react_type && react,
react_type && react_error,
comment_type && comment,
offset_type && offset)
{
static_assert(!(detail::decays_to_ignore_v<reference_t<seq_type>> &&
detail::decays_to_ignore_v<reference_t<id_type>> &&
detail::decays_to_ignore_v<reference_t<bpp_type>> &&
detail::decays_to_ignore_v<reference_t<structure_type>> &&
detail::decays_to_ignore_v<reference_t<structured_seq_type>> &&
detail::decays_to_ignore_v<reference_t<energy_type>> &&
detail::decays_to_ignore_v<reference_t<react_type>> &&
detail::decays_to_ignore_v<reference_t<comment_type>> &&
detail::decays_to_ignore_v<reference_t<offset_type>>),
"At least one of the columns must not be set to std::ignore.");

static_assert(detail::decays_to_ignore_v<reference_t<structured_seq_type>> ||
(detail::decays_to_ignore_v<reference_t<seq_type>> &&
detail::decays_to_ignore_v<reference_t<structure_type>>),
"You may not select field::STRUCTURED_SEQ and either of field::SEQ and field::STRUCTURE "
"at the same time.");

assert(!format.valueless_by_exception());
std::visit([&] (auto & f)
{
if constexpr (!detail::decays_to_ignore_v<reference_t<structured_seq_type>>)
{
auto zipped = views::zip(structured_seq, id, bpp, energy, react, react_error, comment, offset);

for (auto && v : zipped)
{
f.write_structure_record(*secondary_stream,
options,
std::get<0>(v) | views::get<0>, // seq
std::get<1>(v), // id
std::get<2>(v), // bpp
std::get<0>(v) | views::get<1>, // structure
std::get<3>(v), // energy
std::get<4>(v), // react
std::get<5>(v), // react_error
std::get<6>(v), // comment
std::get<7>(v)); // offset
}
}
else
{
auto zipped = views::zip(seq, id, bpp, structure, energy, react, react_error, comment, offset);

for (auto && v : zipped)
{
f.write_structure_record(*secondary_stream,
options,
std::get<0>(v),
std::get<1>(v),
std::get<2>(v),
std::get<3>(v),
std::get<4>(v),
std::get<5>(v),
std::get<6>(v),
std::get<7>(v),
std::get<8>(v));
}
}
}, format);
}

//!\brief Befriend iterator so it can access the buffers.
friend iterator;
};
38 changes: 0 additions & 38 deletions test/snippet/io/structure_file/structure_file_input_col_read.cpp

This file was deleted.

test/snippet/io/structure_file/structure_file_output_col_based.cpp
@@ -1,12 +1,12 @@
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
#include <seqan3/range/container/concatenated_sequences.hpp>
#include <seqan3/range/views/zip.hpp>

using seqan3::operator""_rna5;
using seqan3::operator""_wuss51;
@@ -26,5 +26,5 @@ int main()

seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};

fout = std::tie(data_storage.sequences, data_storage.ids, data_storage.structures);
fout = seqan3::views::zip(data_storage.sequences, data_storage.ids, data_storage.structures);
}
90 changes: 0 additions & 90 deletions test/unit/io/structure_file/structure_file_input_test.cpp
@@ -385,96 +385,6 @@ TEST_F(structure_file_input_read, record_file_view)
EXPECT_EQ(counter, num_records);
}

TEST_F(structure_file_input_read, column_general)
{
structure_file_input fin{std::istringstream{input}, format_vienna{},
fields<field::SEQ, field::ID, field::BPP, field::STRUCTURE, field::ENERGY>{}};

auto & seqs = get<field::SEQ>(fin); // by field
auto & ids = get<1>(fin); // by index
auto & bpps = get<field::BPP>(fin);
auto & struc = get<typename decltype(fin)::structure_column_type>(fin); // by type
auto & energies = get<field::ENERGY>(fin);

ASSERT_EQ(seqs.size(), num_records);
ASSERT_EQ(ids.size(), num_records);
ASSERT_EQ(bpps.size(), num_records);
ASSERT_EQ(struc.size(), num_records);
ASSERT_EQ(energies.size(), num_records);

for (size_t idx = 0ul; idx < num_records; ++idx)
{
EXPECT_TRUE((std::ranges::equal(seqs[idx], seq_comp[idx])));
EXPECT_TRUE((std::ranges::equal(ids[idx], id_comp[idx])));
bpp_test(bpps[idx], interaction_comp[idx]);
EXPECT_TRUE((std::ranges::equal(struc[idx], structure_comp[idx])));
EXPECT_DOUBLE_EQ(energies[idx].value(), energy_comp[idx]);
}
}

TEST_F(structure_file_input_read, column_temporary)
{
structure_file_input{std::istringstream{input}, format_vienna{}};

auto seqs = get<field::SEQ>(structure_file_input{std::istringstream{input}, format_vienna{}});

ASSERT_EQ(seqs.size(), num_records);

for (size_t idx = 0ul; idx < num_records; ++idx)
{
EXPECT_TRUE((std::ranges::equal(seqs[idx], seq_comp[idx])));
}
}

TEST_F(structure_file_input_read, column_decomposed)
{
structure_file_input fin{std::istringstream{input}, format_vienna{},
fields<field::SEQ, field::ID, field::STRUCTURE, field::ENERGY, field::BPP>{}};

auto & [ seqs, ids, struc, energies, bpps ] = fin;

ASSERT_EQ(seqs.size(), num_records);
ASSERT_EQ(ids.size(), num_records);
ASSERT_EQ(struc.size(), num_records);
ASSERT_EQ(energies.size(), num_records);
ASSERT_EQ(bpps.size(), num_records);

for (size_t idx = 0ul; idx < num_records; ++idx)
{
EXPECT_TRUE((std::ranges::equal(seqs[idx], seq_comp[idx])));
EXPECT_TRUE((std::ranges::equal(ids[idx], id_comp[idx])));
EXPECT_TRUE((std::ranges::equal(struc[idx], structure_comp[idx])));
EXPECT_DOUBLE_EQ(energies[idx].value(), energy_comp[idx]);
bpp_test(bpps[idx], interaction_comp[idx]);
}
}

TEST_F(structure_file_input_read, column_decomposed_temporary)
{
auto && [ seqs, ids, struc, energies, bpps ] = structure_file_input{std::istringstream{input},
format_vienna{},
fields<field::SEQ,
field::ID,
field::STRUCTURE,
field::ENERGY,
field::BPP>{}};

ASSERT_EQ(seqs.size(), num_records);
ASSERT_EQ(ids.size(), num_records);
ASSERT_EQ(struc.size(), num_records);
ASSERT_EQ(energies.size(), num_records);
ASSERT_EQ(bpps.size(), num_records);

for (size_t idx = 0ul; idx < num_records; ++idx)
{
EXPECT_TRUE((std::ranges::equal(seqs[idx], seq_comp[idx])));
EXPECT_TRUE((std::ranges::equal(ids[idx], id_comp[idx])));
EXPECT_TRUE((std::ranges::equal(struc[idx], structure_comp[idx])));
EXPECT_DOUBLE_EQ(energies[idx].value(), energy_comp[idx]);
bpp_test(bpps[idx], interaction_comp[idx]);
}
}

// ----------------------------------------------------------------------------
// decompression
// ----------------------------------------------------------------------------
17 changes: 2 additions & 15 deletions test/unit/io/structure_file/structure_file_output_test.cpp
@@ -396,22 +396,9 @@ TEST_F(structure_file_output_rows, assign_structure_file_pipes)

struct structure_file_output_columns : public structure_file_output_rows{};

TEST_F(structure_file_output_columns, assign_record_of_columns)
TEST_F(structure_file_output_columns, assign_columns)
{
record<type_list<std::vector<rna5_vector>, std::vector<std::string>, std::vector<std::vector<wuss51>>>,
fields<field::SEQ, field::ID, field::STRUCTURE>> columns
{
seqs,
ids,
structures
};

assign_impl(columns);
}

TEST_F(structure_file_output_columns, assign_tuple_of_columns)
{
assign_impl(std::tie(seqs, ids, structures));
assign_impl(views::zip(seqs, ids, structures));
}

// ----------------------------------------------------------------------------