Increase ObjectManager.MaxArraySize to improve BinaryFormatter deserialization time #17949
To support fix-ups during BinaryFormatter deserialization, ObjectManager maintains a table of ObjectHolders. This table is an array, where each element/bucket of the array is a linked list of ObjectHolders, and where the right bucket is found via
There are a variety of ways to address this, but a very simple stop-gap is to increase the MaxArraySize. Doing so doesn't impact small deserializations, as the array wouldn't have grown beyond MaxArraySize anyway. For large deserializations, we allocate more memory for the array, but only in proportion to the number of objects added to the table. The primary downside is that such a large array is more likely to end up in the large object heap (LOH), which makes collecting the array more expensive, but nowhere near as expensive as the O(N^2) algorithm that results from a smaller array of long lists.
Thus, this mitigation simply increases the MaxArraySize, from 4K to 1M.
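To make the trade-off concrete, here is a minimal Python sketch of a chained table whose backing array stops growing at a cap. It is a toy model, not the ObjectManager code: the class name `BucketTable`, the doubling growth policy, the initial size of 16, and the modulo bucket selection are all assumptions made for illustration.

```python
class BucketTable:
    """Toy model: an array of buckets, each bucket a chain of (id, holder) pairs."""

    def __init__(self, max_array_size, initial_size=16):
        self.max_array_size = max_array_size
        self.buckets = [[] for _ in range(initial_size)]
        self.count = 0

    def _index(self, object_id):
        # Assumption: the bucket is picked by modulo; the real rule may differ.
        return object_id % len(self.buckets)

    def add(self, object_id, holder):
        # Grow (by doubling) only while the array is below the cap.
        if self.count >= len(self.buckets) and len(self.buckets) < self.max_array_size:
            self._grow()
        self.buckets[self._index(object_id)].append((object_id, holder))
        self.count += 1

    def _grow(self):
        new_size = min(len(self.buckets) * 2, self.max_array_size)
        old_buckets = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for chain in old_buckets:
            for object_id, holder in chain:
                self.buckets[object_id % new_size].append((object_id, holder))

    def find(self, object_id):
        # Walk the chain; return the holder and how many entries were visited.
        steps = 0
        for candidate_id, holder in self.buckets[self._index(object_id)]:
            steps += 1
            if candidate_id == object_id:
                return holder, steps
        return None, steps

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)


def build_table(n, max_array_size):
    """Insert n sequential ids into a table capped at max_array_size buckets."""
    table = BucketTable(max_array_size)
    for object_id in range(n):
        table.add(object_id, f"holder-{object_id}")
    return table
```

With 20,000 sequential ids in this model, `build_table(20000, 4096)` ends up with chains up to 5 entries long, while `build_table(20000, 1 << 20)` stops growing at 32,768 buckets and keeps every chain at a single entry. Once the cap is hit, chain length grows linearly with the number of objects, which is the behavior the larger cap postpones.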
For the repro in #16991, before this change I get:
and after I get:
If I increase the size further, from 500K to 1M Book objects in the list, before I get:
and after I get:
For super huge graphs, we could still end up with long bucket lists and thus still exhibit some polynomial behavior, but the chances/impact of that are significantly decreased.
Apr 5, 2017
Just to point out how bad the performance is:
I just can't understand stephentoub's O(N^2). O(N) should be possible, I think.
You mean you don't understand my analysis that says the current implementation is O(N^2)? Worst case, you're doing a lookup for N items, and since those N items are stored in lists that are being walked, each lookup can be O(N), hence O(N^2).
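A small Python sketch of that counting argument, assuming a fixed number of buckets and modulo bucket selection (both are illustrative assumptions, and `total_lookup_steps` is a hypothetical helper, not part of any library):

```python
def total_lookup_steps(n, bucket_count):
    """Count chain entries visited when each of n ids is looked up once.

    Assumption: the number of buckets is fixed and ids map to buckets by modulo.
    """
    chains = [[] for _ in range(bucket_count)]
    for object_id in range(n):
        chains[object_id % bucket_count].append(object_id)

    steps = 0
    for object_id in range(n):
        # Walk the chain from its head until the id is found.
        for candidate in chains[object_id % bucket_count]:
            steps += 1
            if candidate == object_id:
                break
    return steps
```

With 4 buckets, 1,000 ids cost 125,500 steps and 2,000 ids cost 501,000 steps, so doubling N roughly quadruples the work. With a single bucket the total is exactly N(N+1)/2, which is the worst case described above.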
Sorry, I was unclear.
I argue that just increasing the buffer size is not a satisfactory solution.
As I stated in the PR, it's a workaround that helps to mitigate the problem, with few downsides. If you'd like to submit a PR that does better, we'd welcome it. But we're not planning to spend a lot of time working on BinaryFormatter, which was brought to .NET Core primarily for compatibility and is not a technology we're heavily investing in moving forward.
I might consider taking a deep dive into BinaryFormatter, but having browsed the code, I realize it's not an easy task to understand or get a grip on it...
I have 2 reasons for pushing this issue
Now my trust in .NET decreases. (I recently also experienced bad performance in string.IndexOf, for which I found a workaround.)