Preface
Variable-length integers in TON have a surprising definition, and implementing them by hand is error-prone.
The TL-B schema for VarInteger/VarUInteger parameterizes the type by n, a non-inclusive upper bound on the integer's size in bytes. That is, VarUInteger 32 means "an integer up to 31 bytes long", or "a 248-bit integer". The syntax #< n means "the number of bits necessary to represent the numbers 0..n-1".
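The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the definition as described here; the helper names `length_prefix_bits` and `max_value_bits` are made up for this example and do not come from any TON library.

```python
def length_prefix_bits(n: int) -> int:
    """Width of the `#< n` length field: bits needed to represent 0..n-1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (n - 1).bit_length()


def max_value_bits(n: int) -> int:
    """Widest value VarUInteger n may hold: at most n-1 bytes."""
    return (n - 1) * 8


# VarUInteger 32: the length field covers 0..31, the value is up to 31 bytes.
print(length_prefix_bits(32), max_value_bits(32))
```

For VarUInteger 32 this gives a 5-bit length prefix and a value of at most 248 bits, matching the reading above.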
The problem
Compare VarUInteger 7 and VarUInteger 8. Both use a 3-bit length prefix, representing sizes in the ranges 0..6 bytes (up to 48 bits) and 0..7 bytes (up to 56 bits) respectively. The encodings of VarUInteger 5, 6 and 7 are strict subsets of the encodings of VarUInteger 8, so a correct TL-B encoder and decoder has to enforce the actual specified bounds at runtime. This is indeed what the C++ TL-B compiler does.
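A small sketch of that comparison, under the same reading of `#< n`. The helpers `length_prefix_bits` and `length_allowed` are hypothetical names used only for illustration.

```python
def length_prefix_bits(n: int) -> int:
    """Bits needed to represent the length values 0..n-1."""
    return (n - 1).bit_length()


def length_allowed(length_bytes: int, n: int) -> bool:
    """The schema-level bound: the byte length must be strictly below n."""
    return 0 <= length_bytes < n


# n -> (prefix width in bits, largest permitted byte length)
table = {n: (length_prefix_bits(n), n - 1) for n in (5, 6, 7, 8)}
for n, (bits, max_len) in table.items():
    print(f"VarUInteger {n}: {bits}-bit prefix, up to {max_len} bytes")

# A length field of 7 (binary 111) fits in 3 bits for every row above,
# but only VarUInteger 8 actually permits it.
print(length_allowed(7, 8), length_allowed(7, 7))
```

All four types share the same prefix width, so the prefix alone cannot tell a decoder which bound applies; only the explicit `length < n` check can.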
Current APIs in ton-core, tonweb and tongo depart from the original definition in the TL-B schema and let the user specify the bit width of the length prefix. Instead of specifying the non-inclusive upper bound in bytes (n), the user is expected to specify ceil(log2(n)), the number of bits in the internal length prefix. Such an API makes it impossible to enforce the actual bounds for types such as VarUInteger 7 (see, for example, the definition of StorageUsed). This means the user may accidentally encode a larger number than permitted or accept malformed input data.
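To illustrate the hazard, here is a minimal sketch of a writer parameterized by prefix width. It is not the code of ton-core, tonweb or tongo; `write_var_uint_by_prefix_bits` is a hypothetical function that only mimics the API shape described above, and it returns the encoding as a string of `0`/`1` characters for readability.

```python
def write_var_uint_by_prefix_bits(value: int, prefix_bits: int) -> str:
    """Hypothetical prefix-width-style writer (illustration only)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    length = (value.bit_length() + 7) // 8  # minimal size in bytes
    # The only check available here: does the length fit the prefix?
    if length >= (1 << prefix_bits):
        raise ValueError("value does not fit the length prefix")
    prefix = format(length, f"0{prefix_bits}b") if prefix_bits else ""
    body = format(value, f"0{length * 8}b") if length else ""
    return prefix + body


# Intended type: VarUInteger 7, so the caller passes ceil(log2(7)) = 3.
# 2**48 needs 7 bytes, one more than VarUInteger 7 allows, yet it is accepted.
encoded = write_var_uint_by_prefix_bits(2**48, 3)
print(encoded[:3], len(encoded) - 3)
```

The writer has no way to reject this value, because the argument 3 is the same for n = 5, 6, 7 and 8.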
Suggestion
Redefine the varint APIs in terms of the upper byte boundary n, and add checks on the actual number size when writing and reading varints.
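One possible shape for such an API, as a sketch only. The names `store_var_uint` and `load_var_uint` and the bit-string representation are assumptions made for this example; a real implementation would work on the library's own cell builder and slice types.

```python
from typing import Tuple


def store_var_uint(value: int, n: int) -> str:
    """Encode value as VarUInteger n, returning a string of '0'/'1' bits."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if value < 0:
        raise ValueError("value must be non-negative")
    length = (value.bit_length() + 7) // 8  # minimal size in bytes
    if length >= n:  # enforce the schema bound, not just the prefix width
        raise ValueError(f"value needs {length} bytes, VarUInteger {n} allows {n - 1}")
    prefix_bits = (n - 1).bit_length()
    prefix = format(length, f"0{prefix_bits}b") if prefix_bits else ""
    body = format(value, f"0{length * 8}b") if length else ""
    return prefix + body


def load_var_uint(bits: str, n: int) -> Tuple[int, str]:
    """Decode VarUInteger n from a bit string; return (value, remaining bits)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    prefix_bits = (n - 1).bit_length()
    if len(bits) < prefix_bits:
        raise ValueError("truncated length prefix")
    length = int(bits[:prefix_bits], 2) if prefix_bits else 0
    if length >= n:  # reject lengths the prefix can express but the type forbids
        raise ValueError(f"length {length} out of range for VarUInteger {n}")
    end = prefix_bits + length * 8
    if len(bits) < end:
        raise ValueError("truncated value")
    value = int(bits[prefix_bits:end], 2) if length else 0
    return value, bits[end:]
```

With this shape, `store_var_uint(2**48 - 1, 7)` succeeds, `store_var_uint(2**48, 7)` raises, and a 7-byte input that decodes fine as VarUInteger 8 is rejected when read as VarUInteger 7.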