Strange exception from unordered_set with memory_stack, on MSVC #144

Closed
jwdevel opened this issue Oct 25, 2022 · 9 comments
Comments

jwdevel (Contributor) commented Oct 25, 2022

With the Microsoft compiler (I'm using VC19, but I imagine it's the same for others), and with fairly simple usage of std::unordered_set with a memory_stack allocator, I get an exception coming from inside the STL.

I believe it is related to the extra implementation_offset reserved by foonathan.

I'm not sure if this represents a bug in the MSVC STL, a bug in foonathan::memory, or if this is some expected limitation.

Below is a simple failing test.
The insert() line throws std::length_error with the message "invalid hash bucket count".

#include <foonathan/memory/container.hpp>    // foonathan::memory::unordered_set
#include <foonathan/memory/memory_stack.hpp> // foonathan::memory::memory_stack
// (TEST_CASE comes from the test framework in use)

TEST_CASE("std unordered_set problem on MSVC") {
    using memory_stack_t = foonathan::memory::memory_stack<>;
    memory_stack_t myAlloc(1024);
    using MySet = foonathan::memory::unordered_set<int, memory_stack_t>;
    MySet mySet(myAlloc);
    for (int i = 0; i < 1000; ++i) {
        mySet.insert(i); // throws std::length_error: "invalid hash bucket count"
    }
}

This is the line in the STL that generates the error: https://github.com/microsoft/STL/blob/2f8342a3a57fb157d881c6a2d42a917d20e413f8/stl/inc/xhash#L1717

Feel free to step through it yourself, but here is my brief summary:

  • inserting enough items triggers a re-hash
  • the STL code assumes that a next-larger-power-of-two number of buckets will be available
  • the allocator's max_size comes back slightly smaller than a power of two, because of implementation_offset
  • that makes the STL code round it down to the next lower power of two (see _Floor_of_log_2(...)); a small sketch of this rounding follows the list
  • this causes it to raise the exception, since the number of buckets it wants exceeds the number it believes are available
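
Here is a small standalone sketch of that rounding effect (not the actual STL code; the 16-byte offset is just an illustrative stand-in for implementation_offset):

#include <cstddef>
#include <cstdio>

// floor-of-log-2, i.e. the rounding the STL applies to the allocator's max_size
unsigned floor_log2(std::size_t n) {
    unsigned r = 0;
    while (n >>= 1)
        ++r;
    return r;
}

int main() {
    const std::size_t exact  = std::size_t(1) << 20; // exactly a power of two
    const std::size_t offset = 16;                   // stand-in for implementation_offset
    std::printf("%u\n", floor_log2(exact));          // prints 20
    std::printf("%u\n", floor_log2(exact - offset)); // prints 19 - a whole power of two lower
}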

In particular, the MSVC STL source near that line has this comment:

// Don't violate power of 2, fits in half the bucket vector invariant:
// (we assume because vector must use single allocations; as a result, its max_size fits in a size_t)

I am not sure if this invariant is somehow guaranteed by the C++ standard, or is just an assumption made in the MSVC STL.
It all works fine with GCC's libstdc++, for what that's worth.

I wonder if the implementation_offset amount should be added on top of the basic grow-by-doubling calculation, rather than eating into it slightly?

foonathan (Owner) commented:

I wonder if the implementation_offset amount should be added on top of the basic grow-by-doubling calculation, rather than eating into it slightly?

The issue with that approach is that it can waste memory. For example, right now it allocates 4096 bytes, then 8192, then 16384 - all of which are multiples of the page size and can thus be served by allocating whole pages with a page allocator. If we then add the offset on top, we need to start an entire new page just to store the extra pointer.

I think the correct fix here is to allocate the extra capacity in your stack up front.

jwdevel (Contributor) commented Oct 29, 2022

I think the correct fix here is to allocate the extra capacity in your stack up front.

What does that look like, exactly? You mean in client code, have memory_stack_t myAlloc(1024 + X); where X accounts for implementation_offset?

foonathan (Owner) commented:

That should work, yes. You can use memory_stack::min_block_size(1024) to do the calculation for you.
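
For concreteness, here is the original test with that change applied (a sketch - the only difference is that the block size goes through min_block_size, so the first block has 1024 usable bytes after the implementation_offset is reserved):

TEST_CASE("std unordered_set with memory_stack and min_block_size") {
    using memory_stack_t = foonathan::memory::memory_stack<>;
    memory_stack_t myAlloc(memory_stack_t::min_block_size(1024)); // 1024 usable bytes
    using MySet = foonathan::memory::unordered_set<int, memory_stack_t>;
    MySet mySet(myAlloc);
    for (int i = 0; i < 1000; ++i) {
        mySet.insert(i); // no longer throws
    }
}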

jwdevel (Contributor) commented Oct 30, 2022

Thanks; that does seem to fix the issue for the case I posted.

However, when trying to apply the fix locally, I noticed another related case, using a node pool allocator:

// (assumes `using namespace foonathan::memory;` plus the memory_pool, heap_allocator
//  and container headers)
using MyPool = memory_pool<node_pool, growing_block_allocator<heap_allocator, 2, 1>>;
const auto kNodeSize = unordered_map_node_size<std::pair<int, char>>::value;
MyPool myAlloc(
    kNodeSize,
    1024); // number of nodes

using MyMap = unordered_map<int, char, MyPool>;

MyMap map(myAlloc);

for (int i = 0; i < 1000; ++i) {
    map[i] = (char)i;
}

This produces the same invalid hash bucket count error, but it's not clear to me how to fix this case.
Here, the caller is not specifying a number of bytes for the memory blocks.

Does this represent a bug inside memory_pool, then?

foonathan (Owner) commented:

Here, the caller is not specifying a number of bytes for the memory blocks.

They do; the second parameter is the number of bytes, not the number of nodes. You can likewise use min_block_size() to get a byte size that includes the implementation offset.

jwdevel (Contributor) commented Nov 4, 2022

Ah, sorry for the confusion. I am still seeing a crash though (see end).

I was thrown off by the fact that MyPool::min_block_size(nodeSize, numberOfNodes) takes a number of nodes.
And presumably that is the function one would call to generate the value (which is a number of bytes, as you say).

So I guess you mean a user should do something like this:

MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, 1000));  // 1000 is the number of nodes, who knows how many bytes you'll get

That does indeed work, for me.

However, if a user wanted to grow by (approximately) some number of bytes rather than nodes, it seems they would need to do their own math, perhaps like:

size_t approxNumNodesInBlock = 1024 / kNodeSize;
MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, approxNumNodesInBlock));

That case still crashes for me. The math inside min_block_size happens to work out to exactly 1024, which is the same value as in the (wrongly-commented) crash mentioned previously.

So, working backwards from that, here is another crash, which doesn't try to do any weird math in client code:

MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, 42));

The number 42 is chosen there to make the math work out to an even 1024 again, so it is the same crash, just achieved in a different way. Previously, with the number 1000, it was working by accident, it seems.

Or is there some other min_block_size you were referring to, which I should use for a node pool?

foonathan (Owner) commented:

I don't think there is anything I can do on the library side about it. MSVC wants to allocate a big power of two, which the allocator isn't designed to handle -- it also doesn't need to, since allocating big memory pages is perfectly fine with the default allocator.

The fundamental issue is that unordered_map is a hybrid container: it allocates both arrays and individual nodes, and the pool should only be used for the latter.
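
To see that hybrid behaviour concretely, here is a small standalone illustration using a plain std::allocator wrapper (nothing foonathan-specific; the names are made up):

#include <cstdio>
#include <memory>
#include <unordered_map>

// Logs every allocation the container makes. Running it shows many single-node
// requests (one per element) interleaved with a few large requests for the
// bucket array - only the former are a good fit for a node pool.
template <typename T>
struct logging_allocator : std::allocator<T> {
    template <typename U>
    struct rebind { using other = logging_allocator<U>; };

    logging_allocator() = default;
    template <typename U>
    logging_allocator(const logging_allocator<U>&) {}

    T* allocate(std::size_t n) {
        std::printf("allocate %zu x %zu bytes\n", n, sizeof(T));
        return std::allocator<T>::allocate(n);
    }
};

int main() {
    std::unordered_map<int, char, std::hash<int>, std::equal_to<int>,
                       logging_allocator<std::pair<const int, char>>> map;
    for (int i = 0; i < 1000; ++i)
        map[i] = static_cast<char>(i);
}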

jwdevel (Contributor) commented Nov 6, 2022

Ok, fair enough; thank you for sticking with me through the explanation (:

So it seems like a node pool allocator is just a bad fit for the use-case.

I was just thrown off by the fact that MSVC fails where others did not. But I guess in theory gcc and others would have the same problem, given the right circumstances...

jwdevel closed this as completed Nov 6, 2022
MiguelCompany (Contributor) commented:
@jwdevel I've just seen this. On Fast DDS we are using a foonathan::memory::binary_segregator for some of our unordered_map instances. See here for an example code.
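
A rough sketch of what such a setup could look like - this is not the Fast DDS code, and it assumes the threshold()/binary_segregator API from foonathan/memory's segregator.hpp, so treat the exact signatures as illustrative:

#include <utility>
#include <foonathan/memory/container.hpp>
#include <foonathan/memory/heap_allocator.hpp>
#include <foonathan/memory/memory_pool.hpp>
#include <foonathan/memory/segregator.hpp>

int main() {
    using namespace foonathan::memory;

    // node-sized requests go to the pool; anything larger (e.g. the bucket array)
    // falls through to the heap allocator
    const auto node_size = unordered_map_node_size<std::pair<int, char>>::value;
    using pool_t  = memory_pool<node_pool>;
    using alloc_t = binary_segregator<threshold_segregatable<pool_t>, heap_allocator>;

    alloc_t alloc(threshold(node_size,
                            pool_t(node_size, pool_t::min_block_size(node_size, 1000))),
                  heap_allocator{});

    unordered_map<int, char, alloc_t> map(alloc);
    for (int i = 0; i < 1000; ++i)
        map[i] = static_cast<char>(i);
}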
