New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

[FEATURE] Make the final chunk only as long as the longest sequence in the collection. #1378

Merged

rrahn merged 2 commits into seqan:master from rrahn:feature/simd_view_stops_at_longest_sequence

Dec 15, 2019

Contributor

rrahn commented Nov 23, 2019

Before this change the last chunk always had a size of simd_traits<simd_t>::length, which caused some issues when dealing with the last chunk, since the last elements in the transformed sequence might not have carried any information anymore if the size of the longest sequence was not a multiple of the simd size. Instead, the new code tracks the end of the last column in the final chunk and returns a span that only covers the sequences.

rrahn requested a review from marehr

November 23, 2019 15:59

rrahn force-pushed the feature/simd_view_stops_at_longest_sequence branch from fddce63 to e93dcfc Compare

November 23, 2019 19:36


          [FEATURE] Make the final chunk only as long as the longest sequence i…

483c96d

…n the collection.

rrahn force-pushed the feature/simd_view_stops_at_longest_sequence branch from e93dcfc to 483c96d Compare

November 23, 2019 20:07

codecov bot commented Nov 23, 2019 •

edited

Loading

Codecov Report

Merging #1378 into master will increase coverage by 0.02%.
The diff coverage is 87.17%.

@@            Coverage Diff             @@
##           master    #1378      +/-   ##
==========================================
+ Coverage   97.54%   97.56%   +0.02%     
==========================================
  Files         231      231              
  Lines        8747     8755       +8     
==========================================
+ Hits         8532     8542      +10     
+ Misses        215      213       -2

Impacted Files	Coverage Δ
include/seqan3/core/simd/view_to_simd.hpp	`92.92% <87.17%> (+2.44%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4fa221...3c06b17. Read the comment docs.

rrahn mentioned this pull request

Feature/compute vectorised score for global alignment #1379

Merged

Member

marehr commented Nov 29, 2019

@rrahn I looked at the code cov and there are some untested places. Can you verify if it is intended that those places are not tested yet?

marehr reviewed

View reviewed changes

Member

marehr left a comment

First round of review, thank you for the pr. I think we should sit together next week and should look at that together in person.

include/seqan3/core/simd/view_to_simd.hpp Show resolved Hide resolved

include/seqan3/core/simd/view_to_simd.hpp Outdated

-                              uint8_t pos = j * chunk_size + i; // matrix entry to fill
-                              if (cached_sentinel[i] - cached_iter[i] >= max_size) // not in final block
+                              uint8_t pos = chunk_pos * chunk_size + sequence_pos; // matrix entry to fill
+                              size_t current_chunk_size = cached_sentinel[sequence_pos] - cached_iter[sequence_pos];

Member

marehr Nov 29, 2019

Isn't this remaining_elements and it can be bigger than chunk_size?

include/seqan3/core/simd/view_to_simd.hpp Outdated

-                              if (cached_sentinel[i] - cached_iter[i] >= max_size) // not in final block
+                              uint8_t pos = chunk_pos * chunk_size + sequence_pos; // matrix entry to fill
+                              size_t current_chunk_size = cached_sentinel[sequence_pos] - cached_iter[sequence_pos];
+                              max_size_of_last_chunk = std::max(max_size_of_last_chunk, current_chunk_size);

Member

marehr Nov 29, 2019

Can't this computed ahead of time? The remaining_elements will only decrease so only the first time this inner loop is executed, max_size_of_last_chunk will be set. All further execution of the outer loop the result will not change anymore and we waste computation.

include/seqan3/core/simd/view_to_simd.hpp Outdated

-                              if (cached_sentinel[i] - cached_iter[i] >= max_size) // not in final block
+                              uint8_t pos = chunk_pos * chunk_size + sequence_pos; // matrix entry to fill
+                              size_t current_chunk_size = cached_sentinel[sequence_pos] - cached_iter[sequence_pos];
+                              max_size_of_last_chunk = std::max(max_size_of_last_chunk, current_chunk_size);

Member

marehr Nov 29, 2019

Can't this computed a head of time? The remaining_elements will only decrease so only the first time this inner loop is executed, max_size_of_last_chunk will be set. All other times we waste computation.

Contributor Author

rrahn Nov 29, 2019

actually, it feels that the entire loop could be improved.
But that was not my primary goal here. So I didn't spent much time on making it real tight. Just wanted to get it working for now. When benchmarking suggests that there would be a measurable gain in doing this more optimal one can do it at a later point. But if you see immediate improvements we can discuss this and improve it.

include/seqan3/core/simd/view_to_simd.hpp Outdated Show resolved Hide resolved

include/seqan3/core/simd/view_to_simd.hpp Outdated

+                      if (final_chunk)
+                      { // Store the size of the last chunk.
+                          size_t size_of_last_chunk = max_size_of_last_chunk % chunk_size;

Member

marehr Nov 29, 2019

max_size_of_last_chunk is misleading, because it can be bigger than chunk_size. Or are you referring to a different chunk? If think max_remaining_elements would be better.

rrahn requested a review from marehr

December 4, 2019 11:05

marehr reviewed

View reviewed changes

include/seqan3/core/simd/view_to_simd.hpp Outdated Show resolved Hide resolved

include/seqan3/core/simd/view_to_simd.hpp Outdated Show resolved Hide resolved

rrahn requested a review from marehr

December 6, 2019 09:41

marehr reviewed

View reviewed changes

include/seqan3/core/simd/view_to_simd.hpp Show resolved Hide resolved

include/seqan3/core/simd/view_to_simd.hpp Outdated Show resolved Hide resolved

include/seqan3/core/simd/view_to_simd.hpp Show resolved Hide resolved

rrahn requested a review from marehr

December 9, 2019 22:53

Contributor Author

rrahn commented Dec 12, 2019

@marehr polite ping

marehr approved these changes

View reviewed changes

marehr requested a review from smehringer

December 13, 2019 11:53

smehringer approved these changes

View reviewed changes

include/seqan3/core/simd/view_to_simd.hpp Outdated

@@ @@ -357,25 +350,25 @@ class view_to_simd<urng_t, simd_t>::iterator_type @@
                       if constexpr (chunk_size == simd_traits<max_simd_type>::length / 2)  // upcast into 2 vectors.
                       {
                           return std::array{simd::upcast<simd_t>(extract_halve<0>(row)),  // 1. halve
-                                              simd::upcast<simd_t>(extract_halve<1>(row))}; // 2. halve
+                                            simd::upcast<simd_t>(extract_halve<1>(row))}; // 2. halve

Member

smehringer Dec 13, 2019

Suggested change

      
                                          simd::upcast<simd_t>(extract_halve<1>(row))}; // 2. halve
          
                                          simd::upcast<simd_t>(extract_halve<1>(row))}; // 2. half

halve is the verb ;)

include/seqan3/core/simd/view_to_simd.hpp

+                      size_t max_distance = 0;
+                      for (auto && [it, sent] : views::zip(iterators_before_update, cached_sentinel))
+                          max_distance = std::max<size_t>(std::ranges::distance(it, sent), max_distance);

Member

smehringer Dec 13, 2019

remove extra newline 💅


          [MISC] Improve the computation of the size of the final chunk.

3c06b17

rrahn force-pushed the feature/simd_view_stops_at_longest_sequence branch from 10d0f78 to 3c06b17 Compare

December 15, 2019 09:27

rrahn merged commit 6f7f3c2 into seqan:master

rrahn deleted the feature/simd_view_stops_at_longest_sequence branch

January 16, 2020 14:54

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment