[RFC, WIP] Global memorypool implementation #3644

Closed · wants to merge 14 commits

Conversation

@f18m (Contributor) commented Aug 28, 2019

See #3631 for the full discussion.
This PR just shows my current results with:

  • using moodycamel::ConcurrentQueue (an MPMC lock-free queue)
  • using an API that does not touch the ZMQ context API (no allocator is set inside the context) but instead extends the zmq_msg_t API (there is a new zmq_msg_init_allocator function)

Any comments are welcome!
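
For context, a rough usage sketch of the proposed API (a hypothetical illustration: only the zmq_msg_allocator_new declaration and the allocator-type constants are quoted later in this review; the zmq_msg_init_allocator signature shown here is an assumption):

```cpp
#include <cstring>
#include <zmq.h>

// Hedged sketch against this PR's draft API; zmq_msg_init_allocator's
// exact signature is an assumption.
int send_pooled (void *socket, const void *buf, size_t len)
{
    // Draft constructor quoted later in this review:
    //   ZMQ_EXPORT void *zmq_msg_allocator_new (int type_);
    void *allocator = zmq_msg_allocator_new (ZMQ_MSG_ALLOCATOR_GLOBAL_POOL);

    zmq_msg_t msg;
    // Assumed shape: like zmq_msg_init_size, but the buffer comes from the pool.
    if (zmq_msg_init_allocator (&msg, len, allocator) != 0)
        return -1;

    memcpy (zmq_msg_data (&msg), buf, len);
    // On success, ownership passes to libzmq; the buffer returns to the
    // pool when the message is released.
    return zmq_msg_send (&msg, socket, 0);
}
```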

src/concurrentqueue.h

```cpp
// Increment counter
auto prevVal = elementsCompletelyDequeued.fetch_add(count, std::memory_order_release);
assert(prevVal + count <= BLOCK_SIZE);
return prevVal + count == BLOCK_SIZE;
```

MEMORY_LEAK: memory dynamically allocated by call to malloc at line 3529, column 12 is not reachable after line 3530, column 10.

src/concurrentqueue.h

```cpp
// Increment counter
auto prevVal = elementsCompletelyDequeued.fetch_add(count, std::memory_order_release);
assert(prevVal + count <= BLOCK_SIZE);
return prevVal + count == BLOCK_SIZE;
```

MEMORY_LEAK: memory dynamically allocated by call to malloc at line 3522, column 12 is not reachable after line 3523, column 10.


I am aware of this double-posting of comments. Technically it is due to GitHub sending the pull-request event twice. Solutions are being considered, but unless the issue occurs painfully often I don't expect to give it priority. Please feel free to shout about any pain points.

Member:

What's worse is that the line numbers are wrong: the comment is made at line 1548, but it mentions line 3522, and there is no malloc call around line 1548.

Contributor Author:

Anyway, I think I found the reason for a possible memory leak (the global_memory_pool_t dtor was not yet implemented!) and fixed it in my last check-in.

Thank you for calling this out. I am now in contact with GitHub regarding this issue.

Details: I've investigated and determined it is a GitHub bug. The quickest way to tell that something is inconsistent is to click on the file src/concurrentqueue.h and see GitHub place the comment on the correct line (35xx instead of 15xx).

@somdoron (Member) commented:

As explained in the other pull request, a single-consumer/single-producer memory pool is very dangerous and will break ZeroMQ for existing users. I think we should either not include a built-in memory pool for the moment and only allow users to provide their own, or look for a safe alternative.

src/msg.cpp (outdated)

```cpp
_u.lmsg.metadata = NULL;
_u.lmsg.type = type_lmsg;
_u.lmsg.flags = 0;
_u.lmsg.allocator_was_used = 1;
```
Member:

I don't think we need another field; you can use the ffn and maybe another msg type?

Contributor Author:

Yes, OK, I can make this change by adding a message type such as "type_lmsg_from_allocator".

I was a bit scared by the idea of adding a new msg type, since I was not sure whether it would trigger side effects: there are a few methods like zmq::msg_t::is_lmsg(). I will probably need to find the places where they get called and check whether I also need to add a call to the new zmq::msg_t::is_lmsg_from_allocator() there, but that's doable of course.

src/msg.cpp (outdated)

```cpp
zmq_assert (alloc_ != NULL && size_ != 0);

_u.lmsg.metadata = NULL;
_u.lmsg.type = type_lmsg;
```
@somdoron (Member) commented Aug 29, 2019:

I think we should have a dedicated msg type for allocator-backed messages, just like the one for zero-copy messages:
https://github.com/zeromq/libzmq/blob/master/src/msg.hpp#L197
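
A minimal sketch of that idea (a hypothetical illustration: the new constant's name and value, and the helper, are assumptions; the real constants live at the link above):

```cpp
// Sketch only, not this PR's final code. Idea: tag pool-allocated large
// messages with their own type constant, the same way zero-copy messages
// use type_zclmsg in src/msg.hpp.
enum msg_type
{
    type_vsm = 101,    // content stored inline in the message
    type_lmsg = 102,   // content in malloc-ed memory
    type_zclmsg = 105, // zero-copy content with a user free function
    type_lmsg_from_allocator = 108 // hypothetical: content owned by an allocator
};

// Hypothetical helper mirroring the existing is_lmsg () style checks.
inline bool is_lmsg_from_allocator (msg_type t)
{
    return t == type_lmsg_from_allocator;
}
```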

@f18m (Contributor, Author) commented Aug 30, 2019

Hi @somdoron,

> As explained in the other pull request, a single-consumer/single-producer memory pool is very dangerous and will break ZeroMQ for existing users. I think we should either not include a built-in memory pool for the moment and only allow users to provide their own, or look for a safe alternative.

I tend to agree; indeed, in this WIP memory pool I used moodycamel::ConcurrentQueue, which is "a fast multi-producer, multi-consumer lock-free concurrent queue for C++11".
This kind of queue allows safe access from any kind of multithreaded context. Of course, in my proposed implementation there are zero locks, even on some variables like:

global_memory_pool_t::m_storage
allocator_t::_type
allocator_t::_tag

However, these variables should be set only during initial setup, not later when multiple threads are using these data structures.
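
To make the threading claim concrete, here is a condensed sketch of the pattern (not the PR's actual global_memory_pool_t; names are illustrative): blocks are handed out and returned through a moodycamel::ConcurrentQueue, which is safe to call from any thread, while the backing storage is written only during construction.

```cpp
#include <cstdlib>
#include "concurrentqueue.h" // moodycamel::ConcurrentQueue

// Condensed illustration of the pattern, not the PR's actual code.
class fixed_size_pool_t
{
  public:
    // Unsynchronized setup: runs once, before other threads see the pool.
    fixed_size_pool_t (size_t block_size, size_t count) :
        _storage (static_cast<char *> (malloc (block_size * count)))
    {
        for (size_t i = 0; i < count; i++)
            _free_blocks.enqueue (_storage + i * block_size);
    }

    // The dtor the PR was initially missing: releases the backing storage.
    ~fixed_size_pool_t () { free (_storage); }

    // MPMC-safe: any thread may take a block...
    void *allocate ()
    {
        void *block;
        return _free_blocks.try_dequeue (block) ? block : NULL;
    }

    // ...and any thread may return one.
    void deallocate (void *block) { _free_blocks.enqueue (block); }

  private:
    char *_storage; // written only in the ctor, read-only afterwards
    moodycamel::ConcurrentQueue<void *> _free_blocks;
};
```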


@f18m (Contributor, Author) commented Aug 31, 2019

Hi @somdoron, @sigiesec, @bluca,
a few points that I think are worth considering in an implementation like the one I proposed:

  1. The proposed pool differentiates among 6 "classes" of message sizes, from 0 to 8 kB (I will spend some time understanding which byte range actually provides a performance advantage over malloc/free, so the 8 kB limit is not well defined yet). How should messages larger than 8 kB be handled? Option a) the allocator just falls back to malloc/free. Option b) the zmq_msg_init_allocator() call fails instead, so that the user knows to use zmq_msg_init_size() instead.

  2. How do we avoid hardcoding the maximum number of active messages maintained by the memory pool? I think we have 3 options: a) dynamically grow the memory pool by malloc'ing a new big block of messages when the pool for a certain class is exhausted; b) just refuse to provide memory from exhausted pools; c) provide an option for the user to choose between a) and b). (A sketch combining these options appears after this list.)

  3. How do we retrieve statistics from the memory pool (always very useful IMO)? I would suggest a
     int zmq_msg_allocator_get_stats(void* allocator_, int size_class_, int stat_);
     method which might take size_class_ = [256|512|...] and stat_ = NUM_AVAILABLE_MSGS|MEMORY_POOL_SIZE.
     What do you think?

For comparison, the DPDK framework has chosen options 1b and 2b.

PS: I will run some performance benchmarks to check whether the advantage is still like the one I found in #3631.
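
To illustrate points 1 and 2 together, here is a hypothetical lookup an allocator like this could perform (the class boundaries and the take_from_class stub are assumptions; the sketch combines option 1a for oversized requests with option 2b on exhaustion):

```cpp
#include <cstdlib>

// Illustrative size classes; the PR uses 6 classes up to 8 kB, but the exact
// boundaries are still being tuned.
static const size_t size_classes[] = {256, 512, 1024, 2048, 4096, 8192};
static const size_t num_classes =
  sizeof (size_classes) / sizeof (size_classes[0]);

// Stub standing in for a per-class MPMC free list; NULL means exhausted.
static void *take_from_class (size_t class_index)
{
    (void) class_index;
    return NULL; // placeholder: a real pool would dequeue a free block here
}

// Option 1a (fall back to malloc for oversized requests) combined with
// option 2b (refuse when the matching class is exhausted).
void *pool_allocate (size_t size)
{
    for (size_t i = 0; i < num_classes; i++)
        if (size <= size_classes[i])
            return take_from_class (i);
    return malloc (size); // larger than 8 kB: plain malloc/free
}
```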


@somdoron (Member) commented Sep 1, 2019

@f18m

Regarding (1), I think falling back to malloc/free is better, as one of the users of this method is going to be the decoder classes.

Regarding (3), I would drop the msg from the method name, as this is a method of the allocator class.

```cpp
// default allocator using malloc/free
#define ZMQ_MSG_ALLOCATOR_DEFAULT 0
// using internally a SPSC queue (cannot be used with inproc maybe?) or perhaps an MPMC queue anyway
#define ZMQ_MSG_ALLOCATOR_PER_THREAD_POOL 1
```

Member:

I think we should drop this for now and stay with the global pool.

```cpp
// using internally a MPMC queue
#define ZMQ_MSG_ALLOCATOR_GLOBAL_POOL 2

ZMQ_EXPORT void *zmq_msg_allocator_new (int type_);
```

Member:

Maybe we can drop the msg from the class name?

@somdoron (Member) commented Sep 1, 2019

I wonder if we want to call it pool instead of allocator.

Also, how do you plan for the decoders to get hold of the allocator?

@sigiesec (Member) commented Sep 2, 2019

I think allocator is the appropriate abstract term here. A pool is a specific kind of allocator, but there might be other allocators that are custom and not based on a pool.

@mjvankampen (Contributor) commented:

Hi guys, I continued a bit here, as this PR seemed a bit dead. I was wondering how you (@f18m and the zmq maintainers) would like to proceed.

If you would like me to continue on this, I think it is better if I open a separate PR and we discuss what needs to change there. @f18m, if you want to continue, feel free to merge my branch if you think it is useful.

@f18m (Contributor, Author) commented May 5, 2020

Hi @mjvk,
it would be fantastic if you could complete this "concept" (memory pools)... unfortunately I have sort of run out of time for this PR, so I'm not planning to resume this work anytime soon... :(

@mjvankampen (Contributor) commented:

OK, cool, I'll try to pick this up. Your comments will be very welcome, of course. One open question is a C++03-compatible MPMC queue, for example; or I'll just add some guards.
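
For example, a build-time guard could look like this (a sketch; the macro name is made up):

```cpp
// Hypothetical guard: compile the moodycamel-based pool only when C++11 is
// available; otherwise the allocator API could fall back to malloc/free.
#if defined(__cplusplus) && __cplusplus >= 201103L
#define ZMQ_HAVE_CONCURRENT_QUEUE 1
#endif

#ifdef ZMQ_HAVE_CONCURRENT_QUEUE
#include "concurrentqueue.h"
#endif
```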

@bluca (Member) commented May 24, 2020

#3911 seems to have picked up these changes, so closing this one.

@bluca closed this May 24, 2020