Strange exception from unordered_set with memory_stack, on MSVC #144

Closed
jwdevel opened this issue Oct 25, 2022 · 9 comments
Comments

jwdevel (Contributor) commented Oct 25, 2022

With the Microsoft compiler (I'm using VC19, but I imagine it's the same for others), and with fairly simple usage of std::unordered_set with a memory_stack allocator, I get an exception coming from inside the STL.

I believe it is related to the extra implementation_offset reserved by foonathan.

I'm not sure if this represents a bug in the MSVC STL, a bug in foonathan::memory, or if this is some expected limitation.

Below is a simple failing test.
The insert() line throws std::length_error with the message "invalid hash bucket count".

#include <foonathan/memory/container.hpp>    // foonathan::memory::unordered_set
#include <foonathan/memory/memory_stack.hpp> // foonathan::memory::memory_stack
// (TEST_CASE comes from the test framework in use)

TEST_CASE("std unordered_set problem on MSVC") {
    using memory_stack_t = foonathan::memory::memory_stack<>;
    memory_stack_t myAlloc(1024);
    using MySet = foonathan::memory::unordered_set<int, memory_stack_t>;
    MySet mySet(myAlloc);
    for (int i = 0; i < 1000; ++i) {
        mySet.insert(i); // throws std::length_error: "invalid hash bucket count"
    }
}

This is the line in the STL that generates the error: https://github.com/microsoft/STL/blob/2f8342a3a57fb157d881c6a2d42a917d20e413f8/stl/inc/xhash#L1717

Feel free to step through it yourself, but here is my brief summary:

  • inserting enough items triggers a re-hash
  • the STL code assumes that a next-larger-power-of-two number of buckets will be available
  • the allocator's max_size comes back slightly smaller than a power of two, because of implementation_offset
  • that makes the STL code round it down to the next lower power of two (see _Floor_of_log_2(...)); a small sketch of this rounding follows the list
  • this causes it to raise the exception, since the number of buckets it wants exceeds the number it believes are available
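
Here is a small standalone sketch of that rounding effect (not the actual STL code; the 16-byte offset is just an illustrative stand-in for implementation_offset):

#include <cstddef>
#include <cstdio>

// floor-of-log-2, i.e. the rounding the STL applies to the allocator's max_size
unsigned floor_log2(std::size_t n) {
    unsigned r = 0;
    while (n >>= 1)
        ++r;
    return r;
}

int main() {
    const std::size_t exact  = std::size_t(1) << 20; // exactly a power of two
    const std::size_t offset = 16;                   // stand-in for implementation_offset
    std::printf("%u\n", floor_log2(exact));          // prints 20
    std::printf("%u\n", floor_log2(exact - offset)); // prints 19 - a whole power of two lower
}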

In particular, the MSVC STL source near that line has this comment:

// Don't violate power of 2, fits in half the bucket vector invariant:
// (we assume because vector must use single allocations; as a result, its max_size fits in a size_t)

I am not sure if this invariant is somehow guaranteed by the C++ standard, or is just an assumption made in the MSVC STL.
It all works fine with GCC's libstdc++, for what that's worth.

I wonder if the implementation_offset amount should be added on top of the basic grow-by-doubling calculation, rather than eating into it slightly?

foonathan (Owner) commented:

I wonder if the implementation_offset amount should be added on top of the basic grow-by-doubling calculation, rather than eating into it slightly?

The issue with that approach is that it can waste memory. For example, right now it allocates 4096 bytes, then 8192, then 16384 - all of which are multiples of the page size and can thus be served by allocating whole pages with a page allocator. If we then add the offset on top, we need to start an entire new page just to store the extra pointer.

I think the correct fix here is to allocate the extra capacity in your stack up front.

jwdevel (Contributor) commented Oct 29, 2022

I think the correct fix here is to allocate the extra capacity in your stack up front.

What does that look like, exactly? You mean in client code, have memory_stack_t myAlloc(1024 + X); where X accounts for implementation_offset?

foonathan (Owner) commented:

That should work, yes. You can use memory_stack::min_block_size(1024) to do the calculation for you.
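
For concreteness, here is the original test with that change applied (a sketch - the only difference is that the block size goes through min_block_size, so the first block has 1024 usable bytes after the implementation_offset is reserved):

TEST_CASE("std unordered_set with memory_stack and min_block_size") {
    using memory_stack_t = foonathan::memory::memory_stack<>;
    memory_stack_t myAlloc(memory_stack_t::min_block_size(1024)); // 1024 usable bytes
    using MySet = foonathan::memory::unordered_set<int, memory_stack_t>;
    MySet mySet(myAlloc);
    for (int i = 0; i < 1000; ++i) {
        mySet.insert(i); // no longer throws
    }
}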

jwdevel (Contributor) commented Oct 30, 2022

Thanks; that does seem to fix the issue for the case I posted.

However, when trying to apply the fix locally, I noticed another related case, using a node pool allocator:

// (assumes `using namespace foonathan::memory;` plus the memory_pool, heap_allocator
//  and container headers)
using MyPool = memory_pool<node_pool, growing_block_allocator<heap_allocator, 2, 1>>;
const auto kNodeSize = unordered_map_node_size<std::pair<int, char>>::value;
MyPool myAlloc(
    kNodeSize,
    1024); // number of nodes

using MyMap = unordered_map<int, char, MyPool>;

MyMap map(myAlloc);

for (int i = 0; i < 1000; ++i) {
    map[i] = (char)i;
}

This produces the same invalid hash bucket count error, but it's not clear to me how to fix this case.
Here, the caller is not specifying a number of bytes for the memory blocks.

Does this represent a bug inside memory_pool, then?

foonathan (Owner) commented:

Here, the caller is not specifying a number of bytes for the memory blocks.

They do; the second parameter is the number of bytes, not the number of nodes. You can likewise use min_block_size() to get a byte size that includes the implementation offset.

jwdevel (Contributor) commented Nov 4, 2022

Ah, sorry for the confusion. I am still seeing a crash though (see end).

I was thrown off by the fact that MyPool::min_block_size(nodeSize, numberOfNodes) takes a number of nodes.
And presumably that is the function one would call to generate the value (which is a number of bytes, as you say).

So I guess you mean a user should do something like this:

MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, 1000));  // 1000 is the number of nodes, who knows how many bytes you'll get

That does indeed work, for me.

However, if a user wanted to grow by (approximately) some number of bytes rather than nodes, it seems they would need to do their own math, perhaps like:

size_t approxNumNodesInBlock = 1024 / kNodeSize;
MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, approxNumNodesInBlock));

That case still crashes for me. The math inside min_block_size happens to work out to exactly 1024, which is the same value as in the (wrongly-commented) crash mentioned previously.

So, working backwards from that, here is another crash, which doesn't try to do any weird math in client code:

MyPool myAlloc(
	kNodeSize,
	MyPool::min_block_size(kNodeSize, 42));

The number 42 is chosen there to make the math work out to an even 1024 again, so it is the same crash, just achieved in a different way. Previously, with the number 1000, it was working by accident, it seems.

Or is there some other min_block_size you were referring to, which I should use for a node pool?

foonathan (Owner) commented:

I don't think there is anything I can do on the library side about it. MSVC wants to allocate a big power of two, which the allocator isn't designed to handle -- it also doesn't need to, since allocating big memory pages is perfectly fine with the default allocator.

The fundamental issue is that unordered_map is a hybrid container: it allocates both arrays and individual nodes, and the pool should only be used for the latter.
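
To see that hybrid behaviour concretely, here is a small standalone illustration using a plain std::allocator wrapper (nothing foonathan-specific; the names are made up):

#include <cstdio>
#include <memory>
#include <unordered_map>

// Logs every allocation the container makes. Running it shows many single-node
// requests (one per element) interleaved with a few large requests for the
// bucket array - only the former are a good fit for a node pool.
template <typename T>
struct logging_allocator : std::allocator<T> {
    template <typename U>
    struct rebind { using other = logging_allocator<U>; };

    logging_allocator() = default;
    template <typename U>
    logging_allocator(const logging_allocator<U>&) {}

    T* allocate(std::size_t n) {
        std::printf("allocate %zu x %zu bytes\n", n, sizeof(T));
        return std::allocator<T>::allocate(n);
    }
};

int main() {
    std::unordered_map<int, char, std::hash<int>, std::equal_to<int>,
                       logging_allocator<std::pair<const int, char>>> map;
    for (int i = 0; i < 1000; ++i)
        map[i] = static_cast<char>(i);
}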

jwdevel (Contributor) commented Nov 6, 2022

Ok, fair enough; thank you for sticking with me through the explanation (:

So it seems like a node pool allocator is just a bad fit for the use-case.

I was just thrown off by the fact that MSVC fails where others did not. But I guess in theory gcc and others would have the same problem, given the right circumstances...

jwdevel closed this as completed Nov 6, 2022
MiguelCompany (Contributor) commented:
@jwdevel I've just seen this. On Fast DDS we are using a foonathan::memory::binary_segregator for some of our unordered_map instances. See here for an example code.
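
A rough sketch of what such a setup could look like - this is not the Fast DDS code, and it assumes the threshold()/binary_segregator API from foonathan/memory's segregator.hpp, so treat the exact signatures as illustrative:

#include <utility>
#include <foonathan/memory/container.hpp>
#include <foonathan/memory/heap_allocator.hpp>
#include <foonathan/memory/memory_pool.hpp>
#include <foonathan/memory/segregator.hpp>

int main() {
    using namespace foonathan::memory;

    // node-sized requests go to the pool; anything larger (e.g. the bucket array)
    // falls through to the heap allocator
    const auto node_size = unordered_map_node_size<std::pair<int, char>>::value;
    using pool_t  = memory_pool<node_pool>;
    using alloc_t = binary_segregator<threshold_segregatable<pool_t>, heap_allocator>;

    alloc_t alloc(threshold(node_size,
                            pool_t(node_size, pool_t::min_block_size(node_size, 1000))),
                  heap_allocator{});

    unordered_map<int, char, alloc_t> map(alloc);
    for (int i = 0; i < 1000; ++i)
        map[i] = static_cast<char>(i);
}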
