[API Proposal]: ArrayRecord.TotalElementsCount #106644
Labels: api-approved, area-System.Formats.Nrbf, binaryformatter-migration, blocking-release, in-pr
Background and motivation
The NRBF format can represent multiple nulls with a single record (one byte for the record type and another four bytes for the null count). This allows attackers to send a small payload that represents a very large array. For example, a jagged array whose serialized representation is just 90 bytes can take more than 50 GB to instantiate.
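The arithmetic behind this amplification can be sketched as follows. This is a rough illustration, not a model of the actual decoder: the class and member names are hypothetical, and it assumes 8-byte object references (a 64-bit process) and ignores array headers.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class NullRunAmplification
{
    // Size of the record on the wire as described above:
    // 1 byte for the record type + 4 bytes for the null count.
    public const int RecordSizeInBytes = 1 + 4;

    // Assumption: each null occupies one 8-byte reference slot (64-bit process).
    public const int ReferenceSizeInBytes = 8;

    // Bytes needed just for the reference slots of the nulls the record stands for.
    public static long EstimateReferenceBytes(int nullCount)
    {
        if (nullCount < 0) throw new ArgumentOutOfRangeException(nameof(nullCount));
        return (long)nullCount * ReferenceSizeInBytes;
    }

    // How many bytes of memory each byte of payload expands to (integer division).
    public static long AmplificationFactor(int nullCount)
        => EstimateReferenceBytes(nullCount) / RecordSizeInBytes;
}
```

Under these assumptions, a single 5-byte record claiming one billion nulls corresponds to about 8 GB of reference slots.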
During the threat modeling session, it turned out that we currently don’t expose any API that would let users avoid this particular trap.
API Proposal
Providing TotalElementsCount is going to allow the users to perform such checks (and also avoid the need to compute it on their own for multi-dimensional arrays). By elements I mean the T values stored by the innermost arrays.

namespace System.Formats.Nrbf;

public abstract partial class ArrayRecord : SerializationRecord
{
    public abstract System.ReadOnlySpan<int> Lengths { get; }
+   public virtual long TotalElementsCount { get; }
}
API Usage
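A minimal sketch of the kind of guard the proposal is meant to enable. Since TotalElementsCount is only proposed here, the sketch computes the count by hand from per-dimension lengths, as a caller would have to do today with Lengths. The class, method names, and the limit parameter are hypothetical. It covers rectangular (multi-dimensional) arrays only and does not descend into the inner arrays of a jagged array.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ElementCountGuard
{
    // Product of the per-dimension lengths of a rectangular array.
    public static long CountElements(ReadOnlySpan<int> lengths)
    {
        bool hasZero = false;
        foreach (int length in lengths)
        {
            if (length < 0) throw new ArgumentOutOfRangeException(nameof(lengths));
            if (length == 0) hasZero = true;
        }
        if (hasZero) return 0; // any empty dimension means no elements at all

        long total = 1;
        foreach (int length in lengths)
        {
            total = checked(total * length); // throws OverflowException past long.MaxValue
        }
        return total;
    }

    // Returns false when the array is larger than the caller is willing to materialize.
    public static bool IsWithinLimit(ReadOnlySpan<int> lengths, long maxElements)
    {
        try { return CountElements(lengths) <= maxElements; }
        catch (OverflowException) { return false; }
    }
}
```

A caller could run such a check on a decoded record before asking for the array, and reject the payload when the check fails.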
Alternative Designs
It's not obvious what total element count means for jagged arrays of jagged arrays. To avoid that, we could provide an API that would return the total allocated bytes.
The API would need to take references into account (for every array record, GetArray allocates the array only once). In the following example, the allocated bytes would be 2 GB (rather than 6 GB). Would that be a problem for users who would like to use the new API as an initial guard and then process the payload later without any checks?
We don't provide any similar API in the BCL, and I suspect it would be hard to make it always return a 100% exact value: we would need to take many runtime implementation details into account, such as object headers, method table pointers, and alignment. We could also just document that the API does not take these extra fields into account and returns an estimated value. But would that make us vulnerable to an attack where the payload contains MANY arrays with just 1 element each?
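The concern about many tiny arrays can be illustrated with a rough calculation. Everything here is an assumption for the sake of the example: the names are hypothetical, and the 24-byte per-array overhead and 8-byte alignment are approximate figures for a 64-bit runtime, not values guaranteed by any specification.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class AllocationEstimateSketch
{
    // Assumption: about 24 bytes of header, method table pointer, and length per array.
    public const int AssumedArrayOverheadInBytes = 24;

    // Assumption: object sizes are rounded up to a multiple of 8 bytes.
    public const int AssumedAlignmentInBytes = 8;

    // Estimate that counts only the element data, as an "estimated value" API might.
    public static long PayloadOnlyBytes(long arrayCount, int elementsPerArray, int elementSize)
        => checked(arrayCount * elementsPerArray * elementSize);

    // Estimate that also counts the assumed per-array overhead and alignment padding.
    public static long WithOverheadBytes(long arrayCount, int elementsPerArray, int elementSize)
    {
        long perArray = AssumedArrayOverheadInBytes + (long)elementsPerArray * elementSize;
        long aligned = (perArray + AssumedAlignmentInBytes - 1)
            / AssumedAlignmentInBytes * AssumedAlignmentInBytes;
        return checked(arrayCount * aligned);
    }
}
```

Under these assumptions, one million single-element int arrays would be reported as 4 MB of element data while occupying roughly 32 MB, so an estimate that ignores overhead could be off by a large factor for this shape of payload.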
Risks
No response
Other
#106629 contains a working impl