
Thoughts on memory pools #1

Open
emmericp opened this issue Nov 2, 2017 · 1 comment

emmericp commented Nov 2, 2017

Memory pools should be

  • fast
  • simple
  • multi-threaded
  • allow bulk alloc/free

Currently they are only fast and simple. Let's see if we can get the other two properties as well without losing speed and simplicity.

A memory pool is just a memory allocator for a fixed number of fixed-size buffers. So its core is some data structure that keeps track of a fixed number of pointers. Reasonable data structures for that are ring buffers and stacks. Currently it's a stack.

Implementing bulk alloc/free for that stack is a trivial change. But multi-threaded fast stacks (especially with bulk operations) are extremely difficult (they are a standard example for lock-free data structures suffering from the ABA bug...). That would go against our design goal of keeping it simple...
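To make the "trivial change" concrete, here is a minimal single-threaded sketch of such a stack-based pool with bulk operations. All names and sizes (`pool_alloc_bulk`, `POOL_SIZE`, 2048-byte buffers) are illustrative, not ixy's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 64
#define BUF_SIZE  2048

// Backing memory plus a stack of free pointers into it.
static char buffers[POOL_SIZE][BUF_SIZE];
static void* free_stack[POOL_SIZE];
static size_t top; // number of free buffers currently on the stack

static void pool_init(void) {
    for (size_t i = 0; i < POOL_SIZE; i++)
        free_stack[i] = buffers[i];
    top = POOL_SIZE;
}

// Single alloc/free: O(1) pop/push.
static void* pool_alloc(void) {
    return top ? free_stack[--top] : NULL;
}

static void pool_free(void* buf) {
    free_stack[top++] = buf;
}

// The bulk variants are just a pointer-range copy off the top of the stack.
static size_t pool_alloc_bulk(void* bufs[], size_t n) {
    if (n > top)
        n = top; // partial allocation if the pool is running low
    top -= n;
    memcpy(bufs, &free_stack[top], n * sizeof(void*));
    return n;
}

static void pool_free_bulk(void* const bufs[], size_t n) {
    memcpy(&free_stack[top], bufs, n * sizeof(void*));
    top += n;
}
```

Note the LIFO behavior: a buffer that was just freed is the first one handed out again, which is exactly the temporal-locality advantage discussed below.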

So let's use a queue? Multi-threaded queues are relatively simple. However, queues suffer from a different problem: poor temporal data locality as they will cycle through all the references -- the stack will re-use buffers that you just recently used.
The fix for that is per-thread caches in the memory pool. There are two problems with that: it's probably no longer simple, and it doesn't work with all use cases. One scenario where this fails is a pipeline application where packets aren't sent out on the same thread that they were received on. And this is the primary motivation why we wanted a multi-threaded memory pool in the first place.
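Ignoring thread-safety entirely (a real multi-threaded queue needs atomic head/tail updates), the locality problem can be seen in a tiny ring-based free list. The names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4
#define BUF_SIZE  2048

static char buffers[POOL_SIZE][BUF_SIZE];
static void* ring[POOL_SIZE + 1]; // one slot kept empty to tell full from empty
static size_t head, tail;         // dequeue at head, enqueue at tail

static void ring_init(void) {
    for (size_t i = 0; i < POOL_SIZE; i++)
        ring[i] = buffers[i];
    head = 0;
    tail = POOL_SIZE;
}

static void* ring_alloc(void) {
    if (head == tail)
        return NULL; // empty
    void* buf = ring[head];
    head = (head + 1) % (POOL_SIZE + 1);
    return buf;
}

static void ring_free(void* buf) {
    ring[tail] = buf;
    tail = (tail + 1) % (POOL_SIZE + 1);
}
```

Free a buffer and immediately allocate again: the queue hands you a *different* buffer, because the one you just touched went to the back of the line. The stack above would hand you the same (still cache-hot) buffer.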

An interesting reference is DPDK, which defaults to a queue-based memory pool with thread-local caches -- and it has the same problem. Their solution for pipeline-based applications is a stack with a spin lock: http://dpdk.org/ml/archives/dev/2016-July/043106.html

Why not use some library? We want to keep ixy free of external dependencies: you should be able to go through the whole code and understand all parts, including the core data structures.

What are we going to do? Probably a stack with a spin lock. And benchmarking! Probably queue vs. stack. And single-threaded stack vs. multi-threaded stack in an uncontended scenario. And some realistic contention setup. NUMA?
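For reference, the "stack with a spin lock" candidate could look like the sketch below, using a C11 `atomic_flag` as the lock. This is an assumed shape, not ixy's implementation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE 64
#define BUF_SIZE  2048

static char buffers[POOL_SIZE][BUF_SIZE];
static void* free_stack[POOL_SIZE];
static size_t top;
static atomic_flag lock = ATOMIC_FLAG_INIT;

static void pool_init(void) {
    for (size_t i = 0; i < POOL_SIZE; i++)
        free_stack[i] = buffers[i];
    top = POOL_SIZE;
}

static void* pool_alloc(void) {
    // Spin until we win the lock; fine for short critical sections.
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ;
    void* buf = top ? free_stack[--top] : NULL;
    atomic_flag_clear_explicit(&lock, memory_order_release);
    return buf;
}

static void pool_free(void* buf) {
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ;
    free_stack[top++] = buf;
    atomic_flag_clear_explicit(&lock, memory_order_release);
}
```

The critical section is a single push or pop, so the lock is held for only a handful of instructions -- which is why a spin lock (rather than a mutex with syscall-based blocking) is the natural choice here, and why it dodges the ABA problem that plagues lock-free stacks.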


pudelkoM commented Nov 3, 2017

mhhh ok.

Should we adopt a composable allocator backend like 251fd957 (based on CppCon 2015: Andrei Alexandrescu “std::allocator...”) for this?
The drivers and consumers do not care how memory is managed behind the scenes, but having a unified API makes swapping the allocator a one-line change.
It also makes benchmarking a lot easier if the allocator can be specified as a runtime/compile-time flag.
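In C (Alexandrescu's talk is C++-centric), such a swappable backend could be a small function-pointer vtable. Everything here -- `struct allocator`, the `malloc` baseline backend -- is a hypothetical sketch, not code from 251fd957:

```c
#include <assert.h>
#include <stdlib.h>

// Callers go through this vtable and never see which backend is behind it.
struct allocator {
    void* (*alloc)(struct allocator* self);
    void  (*free_buf)(struct allocator* self, void* buf);
};

// One possible backend: plain malloc/free, useful as a benchmark baseline.
static void* malloc_alloc(struct allocator* self) {
    (void)self;
    return malloc(2048);
}

static void malloc_free(struct allocator* self, void* buf) {
    (void)self;
    free(buf);
}

static struct allocator malloc_backend = { malloc_alloc, malloc_free };

// Swapping the allocator (stack pool, ring pool, ...) is then a one-line
// change: point this at a different backend, or pick one from a flag.
static struct allocator* pool = &malloc_backend;
```

Usage is always `pool->alloc(pool)` / `pool->free_buf(pool, buf)`, so the stack-based and queue-based pools from the discussion above could be benchmarked against each other behind the same API.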
