Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove boost::iterator_adaptor from VectorWithOffset #1442

Merged
merged 8 commits into from
Jun 10, 2024

Conversation

KrisThielemans
Copy link
Collaborator

  • Remove the somewhat complicated boost::iterator_adaptor code. Looks like we don't need it at all. VectorWithOffset<T>::iterator is now a T*. This allows compilers to speed-up std::copy et al. automatically (by using memmove).
  • add extra timings to stir_timings to be able to check this.

Before the change, ProjDataInMemory::fill was about 1.5 slower than std::copy with std::vector, while now it is as fast. In addition, ProjDataFromStream::fill now takes far less CPU time (although difference in wall-clock time on my test machine is very small, probably as that is dominated by IO). This is likely because the get_empty_segment_by_view()/segment.fill()/set_segment() code is now faster.

Fixes #1259

@KrisThielemans KrisThielemans self-assigned this Jun 3, 2024
@KrisThielemans KrisThielemans added this to the v6.2 milestone Jun 3, 2024
- add more timings for basic building blocks (+=, *= etc)
- allow skipping image/projdata timings
- remove boost::iterator_adaptor, detail::VectorWithOffset_iter is
no longer needed, and begin() et al. now just return T*, allowing compilers
to speed up std::copy etc
- use std::make_reverse_iterator as opposed to the boost version
@KrisThielemans
Copy link
Collaborator Author

Some timings on a beefy machine (30 threads) with a large TOF file (GE Signa PET/MR, view mashing 3, 3 segments, 33 TOF bins, total elems: 676,889,136).

old timings (sorry, not all of them):

        copy_image                                                23.500                          24.319
        create_copy_proj_data_mem_to_mem                         770.000                        2224.292
        create_copy_proj_data_mem_to_file                      44740.000                        9448.194
        create_copy_proj_data_file_to_mem                      41920.000                        6138.730
        create_copy_proj_data_file_to_file                     39345.000                        7681.845

new timings:

        copy_image                                                25.000                          25.709
        copy_add_image                                            38.000                          37.767
        copy_mult_image                                           37.500                          37.745
        copy_std_vector_of_size_projdata                         200.000                         197.675
        create_vector_of_size_projdata                           110.000                         762.857
        create_proj_data_in_mem_no_init                          350.000                        1778.367
        create_proj_data_in_mem_init                             375.000                        1766.101
        copy_proj_data_mem_to_mem                                225.000                         224.398
        create_copy_proj_data_mem_to_mem                         580.000                        1985.701
        create_copy_proj_data_mem_to_file                       2835.000                        8502.535
        create_copy_proj_data_file_to_mem                       3320.000                        5450.137
        create_copy_proj_data_file_to_file                      1020.000                        7258.220
        copy_add_proj_data_mem                                   950.000                        2394.447
        copy_mult_proj_data_mem                                  995.000                        2389.107

Sadly, PMRT projection times are virtually identical (no CUDA on that machine).

removed from VectorWithOffset so have to add it here
@KrisThielemans
Copy link
Collaborator Author

@markus-jehl one more for you to check... Obviously building on master, so presumably will have the same problem as #1438

@KrisThielemans
Copy link
Collaborator Author

MacOS job fails 1 ctest with just

test fill(ProjDataInterfile)
Error. fill(ProjDataInterfile&)

@KrisThielemans
Copy link
Collaborator Author

MacOS job worked ok this time around, even though I didn't change the code (just the formatting of the test), so presumably this was just a borderline rounding case.

@KrisThielemans
Copy link
Collaborator Author

@markus-jehl I have merged your fix on here. I'm more optimistic about this one making some positive impact, even if it's only be removing code complexity 😄 (and one more dependency on boost).

@markus-jehl
Copy link
Contributor

@markus-jehl I have merged your fix on here. I'm more optimistic about this one making some positive impact, even if it's only be removing code complexity 😄 (and one more dependency on boost).

Perfect, I'll give this a try!

@KrisThielemans
Copy link
Collaborator Author

Codacy failure is because tmp below is "unused"

void create_std_vector()
  {
    std::vector<float> tmp(this->v1.size());
  }

@markus-jehl
Copy link
Contributor

again, no noticeable difference :-(

@KrisThielemans
Copy link
Collaborator Author

ok. I guess this will only show up for LAFOV and/or TOF scanners. Anyway, the code is cleaner. Thanks for checking!

@KrisThielemans KrisThielemans merged commit 7c2e5bc into UCL:master Jun 10, 2024
8 of 9 checks passed
@KrisThielemans KrisThielemans deleted the NoBoostIteratorAdaptor branch June 10, 2024 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std::copy of VectorWithOffset<float> does not fall back to memmove
2 participants