
[Search] Use a buffer to cache the intermediate results between search invocations such that memory allocations are reduced on multiple invocations of search. #29

Open
rrahn opened this issue Mar 27, 2020 · 9 comments
Labels
needs refinement A story that was not discussed and/or estimated by the team yet but is planned for upcoming sprints.

Comments

@rrahn (Contributor) commented Mar 27, 2020

Description

A thread-local buffer used for the search can have an impact on performance by avoiding unnecessary memory allocations, which also have to be synchronised by the OS.

Also see here for possible improvements: seqan/seqan3#1528

Acceptance Criteria

  • the buffer is always available automatically; nothing changes from the user's perspective.
  • no reallocations happen between search invocations
  • the buffer is cleared before the search is invoked, but not shrunk.

Tasks

  • how can this be designed and implemented?

Definition of Done

  • Implementation and design approved
  • Unit tests pass
  • Test coverage = 100%
  • Microbenchmarks added and/or affected microbenchmarks < 5% performance drop
  • API documentation added
  • Tutorial/teaching material added
  • Test suite compiles in less than 30 seconds (on Travis)
  • Changelog entry added
@rrahn rrahn added the needs refinement A story that was not discussed and/or estimated by the team yet but is planned for upcoming sprints. label Mar 27, 2020
@rrahn rrahn added this to the Sprint 1 milestone Mar 27, 2020
@rrahn rrahn modified the milestones: Sprint 1, Sprint 2 Apr 15, 2020
@rrahn rrahn added this to the Sprint 4 milestone May 12, 2020
@rrahn rrahn modified the milestones: Sprint 4, Sprint 5 May 26, 2020
@rrahn rrahn modified the milestones: Sprint 5, Sprint 6 Jun 8, 2020
@h-2 (Member) commented Jul 20, 2020

how can this be designed and implemented?

  • low-level utilities like .locate() on the iterators need an out-parameter version.
  • add a publicly documented struct called search_buffers whose members are public, but not part of the API and not visible in the user documentation.
  • one member should be a buffer for the occurrences that is passed to .locate() as an in-out parameter (.locate() appends to this buffer).
  • the actual member is not one buffer, but a vector of buffers (one for every thread).
  • regular search creates one object of type search_buffers and passes the respective member to downstream functions.
  • a config object is added that allows the user to create an object outside the function call. In this case the search doesn't create an object but uses the one provided by the config element.

So for regular usage nothing changes, but users can optionally pass in an external buffers object via the config system. The internals of this type are not exposed and can therefore be changed at any time.
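
A minimal sketch of what the list above could translate to (all names and signatures here are hypothetical, not actual SeqAn3 API; the real position type would depend on the index):

#include <cstddef>
#include <vector>

// Publicly documented struct, but the members are implementation details
// that are not part of the API and may change at any time.
struct search_buffers
{
    std::vector<std::vector<std::size_t>> occurrences_per_thread; // one buffer per thread
};

// Hypothetical out-parameter overload of locate(): appends to `out`
// instead of returning a freshly allocated vector on every call.
template <typename cursor_t>
void locate(cursor_t const & cursor, std::vector<std::size_t> & out)
{
    for (auto position : cursor.locate())
        out.push_back(position);
}

Regular search would then clear (but never shrink) the respective thread's buffer before each invocation, which is exactly the acceptance criterion above.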

@rrahn (Contributor, Author) commented Aug 10, 2020

Resolution 10.06.2020
  1. We will investigate a strategy using pmr::vector and a polymorphic memory resource to make it easy to provide a generic buffer from the outside without knowing the type.

  2. Agent-client pattern, as used for the IBF?
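
A rough illustration of the pmr idea (nothing here is SeqAn3 API; search_impl is a made-up stand-in): the algorithm allocates its intermediate results from a caller-provided memory resource, so the caller can reuse memory across invocations without ever naming the buffer's element type.

#include <cstddef>
#include <memory_resource>
#include <vector>

// Made-up stand-in for the search internals: the intermediate buffer is a
// pmr::vector, so all of its allocations go to the caller's resource.
void search_impl(std::pmr::memory_resource * resource)
{
    std::pmr::vector<std::size_t> occurrences{resource};
    occurrences.push_back(42); // placeholder for the actual work
}

int main()
{
    // The pool hands previously freed blocks back out, so repeated
    // invocations stop hitting the OS allocator after warm-up.
    std::pmr::unsynchronized_pool_resource pool{};

    for (int i = 0; i < 100; ++i)
        search_impl(&pool);
}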

@h-2 (Member) commented Aug 10, 2020

Hm, that sounds quite complicated (for the user), but I don't know what you have discussed yet. Why do we need polymorphism? We know the type of the buffers.

My proposal would look like this:

seqan3::search_buffers buffers; // the user doesn't know what's inside, but we do

seqan3::configuration const default_cfg = seqan3::search_cfg::max_error_total{zero_errors} |
                                          seqan3::search_cfg::max_error_substitution{zero_errors} |
                                          seqan3::search_cfg::max_error_insertion{zero_errors} |
                                          seqan3::search_cfg::max_error_deletion{zero_errors} |
                                          seqan3::search_cfg::output_query_id |
                                          seqan3::search_cfg::output_reference_id |
                                          seqan3::search_cfg::output_reference_begin_position |
                                          seqan3::search_cfg::hit_all |
                                          seqan3::search_cfg::with_buffers{buffers}; // <--------

// after that, search normally with the config

@h-2 (Member) commented Aug 10, 2020

Note that this whole issue is not about how results are returned from the top-level search but how intermediate results are stored internally (cursors and results of the cursors' locate operation).

@marehr (Member) commented Aug 10, 2020

Note that this whole issue is not about how results are returned from the top-level search but how intermediate results are stored internally (cursors and results of the cursors' locate operation).

We know.

The main point is:

  • Allocating internal memory is expensive; reusing (internal) memory would be better.
    • => This can only be achieved by passing that (internal) memory in from the outside.

We (briefly) discussed how we could model this with a sound interface. (Just passing in a std::vector is a bad interface, because it is too specific to the internals.)

I noticed that we already solved this problem for a different data structure (e.g. the IBF), where we used a client-agent pattern to model this.

And René noted that std::pmr::memory_resource might be a different way to implement this (passing internal storage in from the outside), but we need to check whether it actually fits our use case.
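
For reference, a simplified sketch of that client-agent pattern (reconstructed from memory with made-up names; the IBF's agent works along these lines): the agent owns the buffers, so repeated queries through the same agent reuse them, and one agent per thread avoids synchronisation.

#include <cstddef>
#include <vector>

class some_index // stand-in for the actual data structure
{
public:
    class agent_type
    {
        std::vector<std::size_t> buffer{}; // owned by the agent, reused across calls

    public:
        // Returns a reference into the agent-owned buffer; the result is
        // only valid until the next call on this agent.
        std::vector<std::size_t> const & query(std::size_t value)
        {
            buffer.clear();          // keep the capacity, drop the contents
            buffer.push_back(value); // placeholder for the actual lookup
            return buffer;
        }
    };

    agent_type make_agent() const { return {}; }
};

The user never sees the buffer type, only the agent, so the internals stay free to change.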

If you want and have time, we could talk about this in a web-call. 👍

@h-2 (Member) commented Aug 10, 2020

Allocating internal memory is expensive; reusing (internal) memory would be better.
=> This can only be achieved by passing that (internal) memory in from the outside.

Yep, that's why I propose to just create a named struct that can be used to create objects that are passed to the algorithm via the config. This struct can contain anything we need (and can be changed later on, since none of its members are part of the API).

If you want and have time, we could talk about this in a web-call. +1

Maybe we can do a Lambda call sometime soon and then have a 30-minute nerd session on this topic afterwards ;)

@rrahn (Contributor, Author) commented Aug 10, 2020

Well, that's the point. I am not completely into the details of the index cursors, but we often end up with different types depending on the configuration (and I am not sure what will come in the future: different indices, different cursors, etc.). So while these types usually depend on the configuration, you need to allocate a buffer for something whose type you don't know yet, and you quickly get hard-to-use interfaces with strange dependencies (get the correct buffer type for the configuration element, but then add it to the configuration again).

I have seen (but not pursued any further, i.e. it has to be investigated) that with the pmr memory resource the user can simply provide memory (chunks of std::byte). That is great, because it lets us manage and reuse the memory inside the algorithms without caring too much about the types. Of course we can wrap a default implementation of this inside a dedicated struct. But the user can also do more advanced things, such as providing memory specifically allocated on high-bandwidth memory or on the stack.
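
To make that concrete, a sketch of what user-provided memory could look like with the standard pmr machinery (search_impl is again a made-up stand-in, not SeqAn3 API):

#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical: the algorithm sees only a memory_resource and builds its
// configuration-dependent internal buffers on top of it.
void search_impl(std::pmr::memory_resource * resource)
{
    std::pmr::vector<std::size_t> positions{resource}; // concrete type stays internal
    positions.push_back(42);                           // placeholder for the actual work
}

int main()
{
    // The user hands over plain chunks of std::byte, here stack memory;
    // null_memory_resource() as upstream guarantees nothing else is used.
    std::array<std::byte, 1 << 16> stack_memory;
    std::pmr::monotonic_buffer_resource resource{stack_memory.data(), stack_memory.size(),
                                                 std::pmr::null_memory_resource()};

    search_impl(&resource);
}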

@h-2 (Member) commented Aug 10, 2020

It's true that the buffers might depend on the configuration at some point. My initial response would be to just put all potentially usable buffers into seqan3::search_buffers. It doesn't matter if some members aren't used by some configurations.

While the approach you suggest is powerful, it sounds like it will add complexity to the simple use case. And while some algorithms might profit from specialised storage, the things we are talking about for search are small, simple objects that need to be in main memory (AFAICT).

An alternative with a non-generic buffer type could be exposing the buffers type as a member typedef of the config, but this is already suboptimal usability-wise from my POV:

seqan3::configuration cfg = seqan3::search_cfg::max_error_total{zero_errors} |
                            seqan3::search_cfg::max_error_substitution{zero_errors} |
                            seqan3::search_cfg::max_error_insertion{zero_errors} |
                            seqan3::search_cfg::max_error_deletion{zero_errors} |
                            seqan3::search_cfg::output_query_id |
                            seqan3::search_cfg::output_reference_id |
                            seqan3::search_cfg::output_reference_begin_position |
                            seqan3::search_cfg::hit_all;

decltype(cfg)::buffers_type buffers;

cfg.set_buffers(buffers);

// after that, search normally with the config

edit: let's talk about this on a call sometime.

@marehr marehr added the spike label Sep 28, 2020
@marehr (Member) commented Sep 28, 2020

2020-09-28

We briefly discussed this, and we think this is a spike that needs a prototype before it can be discussed further.

@eseiler eseiler removed the spike label Oct 20, 2022
@smehringer smehringer changed the title Use a buffer to cache the intermediate results between search invocations such that memory allocations are reduced on multiple invocations of search. [Search] Use a buffer to cache the intermediate results between search invocations such that memory allocations are reduced on multiple invocations of search. Oct 20, 2022