Extended packing and extracting library for value types #5056

Amxx · 2024-05-27T12:08:52Z

Fixes #5053
Alternative to #5051

Missing tests for the replace functions.

PR Checklist

Tests
Documentation
Changeset entry (run npx changeset add)

changeset-bot · 2024-05-27T12:08:56Z

⚠️ No Changeset found

Latest commit: 038e2dc

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes changesets to release 1 package

Name	Type
openzeppelin-solidity	Minor

ernestognw

I like the approach and how it looks. However, the Packing.sol file itself is already 437 kb (around 25% of the current size of the library). Beyond the file itself, I guess the compilation artifacts we distribute will grow as well, so probably more.

In a past design, I was suggesting the following size combinations I thought were enough:

const BYTES_PACK_SIZES = {
  32: [
    ['uint128', 'uint128'],
    ['uint192', 'uint64'],
    ['uint64', 'uint64', 'uint64', 'uint64'],
  ],
  24: [
    ['uint96', 'uint96'],
    ['uint144', 'uint48'],
    ['uint48', 'uint48', 'uint48', 'uint48'],
  ],
  16: [
    ['uint64', 'uint64'],
    ['uint96', 'uint32'],
    ['uint32', 'uint32', 'uint32', 'uint32'],
  ],
  12: [
    ['uint48', 'uint48'],
    ['uint72', 'uint24'],
    ['uint24', 'uint24', 'uint24', 'uint24'],
  ],
  8: [
    ['uint32', 'uint32'],
    ['uint48', 'uint16'],
    ['uint16', 'uint16', 'uint16', 'uint16'],
  ],
  4: [
    ['uint16', 'uint16'],
    ['uint24', 'uint8'],
    ['uint8', 'uint8', 'uint8', 'uint8'],
  ],
  2: [['uint8', 'uint8']],
};

The idea is to split the most common byte types (2,4,8,12,16,24,32) in 4 (equal) and 2 (unequal) chunks. I realized pairs like [uint48,uint16] and [uint16,uint48] were redundant.
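As a quick sanity check on a table like the one above, the sketch below verifies that every layout fills its container exactly. It uses only a subset of the table, and `bitsOf` and `findMismatches` are hypothetical helper names, not part of the generation scripts.

```javascript
// Subset of the BYTES_PACK_SIZES table quoted above (container size in bytes -> layouts).
const BYTES_PACK_SIZES = {
  8: [
    ['uint32', 'uint32'],
    ['uint48', 'uint16'],
    ['uint16', 'uint16', 'uint16', 'uint16'],
  ],
  4: [
    ['uint16', 'uint16'],
    ['uint24', 'uint8'],
    ['uint8', 'uint8', 'uint8', 'uint8'],
  ],
  2: [['uint8', 'uint8']],
};

// Hypothetical helper: bit width of a `uintN` type name.
const bitsOf = type => Number(type.replace('uint', ''));

// Hypothetical helper: list every layout whose fields do not add up to the container size.
function findMismatches(table) {
  const mismatches = [];
  for (const [size, layouts] of Object.entries(table)) {
    for (const layout of layouts) {
      const totalBits = layout.reduce((sum, type) => sum + bitsOf(type), 0);
      if (totalBits !== Number(size) * 8) mismatches.push({ size: Number(size), layout });
    }
  }
  return mismatches;
}

// An empty array means every layout fills its container exactly.
console.log(findMismatches(BYTES_PACK_SIZES));
```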

The current design feels like the right approach but we can reasonably shave off a lot of redundancy from duplicated pack functions (eg. bytes3 | bytes4 vs bytes4 | bytes3), types such as PackedBytes1 that aren't needed in my opinion, or even extremely specific uses like bytes2 | bytes27 that aren't worth putting in the library.

What do you think of starting with selecting the combination of types we think are useful and keep the product of them? My list is based on the following:

Bytes (32, 16, 8, 4, 2, 1) as starting point. Users may be able to compose them
Add extra common use cases like ERC4337 nonce packing (bytes24 | bytes8), in that case 24 is missing
12 and 6 to split 24 (we can get rid of these)

Additionally, we can add 20 and 12 for addresses.

With this setup, it's possible that we can reduce this file size substantially while keeping enough usage freedom.

Amxx · 2024-05-29T08:22:40Z

> I guess the compilation artifacts we distribute will grow as well so probably more.
Compilation artifacts for libraries should be small. All the functions are internals.

I checked, and the Packing.json artifact is only 720 bytes. (Math.json is similar, Strings.json is 1.1k)

About removing some variants to reduce size:

IMO symmetric constructions such as bytes8 | bytes24 vs bytes24 | bytes8 are not a duplication. If you want to build a "meta nonce" (as in 4337) from a bytes8 key and a bytes24 value, the bytes24 | bytes8 helper is not going to help.
If we remove some variants, we should be ready for when people come and ask us "why is this variant that I want not available". Not having some is an opinionated choice. Whether we want to make it or not, I'm not 100% sure.
About the PackedBytes1, I'm not sure what the impact of removing it would be. Having it makes things more uniform:
Just like you would do "I want to extract 20 bytes from this thing, at this location, and interpret it as an address", you would also do "I want to extract 1 byte from this thing, at this location, and interpret it as a uint8". IMO there is value to keeping it.
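To illustrate the ordering point made earlier in this comment (bytes8 | bytes24 versus bytes24 | bytes8), here is a minimal sketch that models a bytes32 word as a 256-bit BigInt. `packLeftRight` is a hypothetical helper, not a library function, and the key and value are arbitrary example numbers.

```javascript
// Model a bytes32 word as a 256-bit BigInt; the leftmost byte is the most significant.
// Hypothetical helper: place `left` in the high bytes and `right` in the low bytes.
function packLeftRight(left, leftBytes, right, rightBytes) {
  if (leftBytes + rightBytes !== 32) throw new RangeError('fields must fill 32 bytes');
  if ((left >> BigInt(8 * leftBytes)) !== 0n) throw new RangeError('left does not fit');
  if ((right >> BigInt(8 * rightBytes)) !== 0n) throw new RangeError('right does not fit');
  return (left << BigInt(8 * rightBytes)) | right;
}

const toHex32 = word => word.toString(16).padStart(64, '0');

const key = 0x1122334455667788n; // 8-byte key (example value)
const value = 0xaan; // 24-byte value (example value)

// bytes8 | bytes24: the key occupies the first 8 bytes.
const keyFirst = packLeftRight(key, 8, value, 24);
// bytes24 | bytes8: the key occupies the last 8 bytes.
const valueFirst = packLeftRight(value, 24, key, 8);

console.log(toHex32(keyFirst));
console.log(toHex32(valueFirst));
console.log(keyFirst === valueFirst); // false: the two layouts are not interchangeable
```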

Something that we never considered, but that may be an alternative is to:

not ship the procedurally generated solidity files in the npm package
ship the generation code in the npm package
automatically run the generation code when the package is pulled.
That would reduce the size of the package, but also add some moving pieces. We would have to be super careful about dependencies.

IMO next steps are

determine which version we want to keep (if not all)
decide on the delivery (if we feel what we want to distribute is too big)

Amxx · 2024-05-29T08:25:56Z

Proposal:

include types:
- bytes1
- bytes2
- bytesXX with XX a multiple of 4, from bytes4 to bytes32
packing:
- 1 + 1 <> 2
- 2 + 2 <> 4
- X + Y <> Z, for X and Y multiples of 4
  - this includes 16+16<>32, but also 12+20<>32, and 4+8<>12

(that is equivalent to saying pack X+Y<>Z if and only if all X, Y and Z are included types)

extract / replace
- extract / replace any of the included type from any of the included types that are bigger

This results in a file of 1165 lines (41k). SafeCast.sol is 35k, Math.sol is 28k, ...
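The selection rule in this proposal can be sketched as a short enumeration. This is only an illustration of the rule as stated (pack X+Y<>Z if and only if X, Y and Z are all included sizes); the constant names are hypothetical and this is not the actual generation script.

```javascript
// Included sizes (in bytes): 1, 2, and every multiple of 4 from 4 to 32.
const SIZES = [1, 2, ...Array.from({ length: 8 }, (_, i) => 4 * (i + 1))];

// pack X + Y <> Z if and only if X, Y and Z are all included sizes.
// Ordered pairs are kept, so 12+20 and 20+12 both appear.
const PACKS = SIZES.flatMap(left =>
  SIZES.filter(right => SIZES.includes(left + right)).map(right => ({
    left,
    right,
    total: left + right,
  })),
);

// extract / replace any included size from any strictly bigger included size.
const EXTRACTS = SIZES.flatMap(outer =>
  SIZES.filter(inner => inner < outer).map(inner => ({ outer, inner })),
);

console.log(PACKS.length); // 30 ordered pack combinations
console.log(EXTRACTS.length); // 45 (outer, inner) combinations
```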

frangio · 2024-05-29T15:09:33Z

I'm not a big fan of the reliance on user defined types in this library. I understand that overloaded function names don't work, but it seems very uncomfortable for users to have to wrap and unwrap values to use the library.

Have you considered disambiguating by naming convention? For example, extract_8_2, replace_8_2, pack_8_12.

Amxx · 2024-05-29T16:04:14Z

> Have you considered disambiguating by naming convention? For example, extract_8_2, replace_8_2, pack_8_12

I think we liked the idea of having the same name when possible, and deducing the right version from the input parameters. We also think that this will possibly be used with both bytesXX and uintXXX (and address) as input and output, so the UDVT would be a good intermediary representation.

I guess that is not set in stone, we could replace the UDVT with bytesXX, add bytesXX<>uintXXX conversion/casting function, and specialize all the names ...

I prefer the current way, but it's not a very strong opinion.

frangio · 2024-05-29T18:16:39Z

I think the benefit of having the same name may be lost if it requires a high level of verbosity like this:

Packing.pack(left.asPackedBytes2(), right.asPackedBytes2()).extract2(0).asUint16()

IMO simpler usage should be prioritized:

uint16(Packing.pack_2_2(left, right).extract_4_2(0))

The cast is not ideal. Perhaps extractBytes2_4(uint8) -> bytes2 and extractUint16_4(uint8) -> uint16 should be separate functions.

Amxx · 2024-05-29T18:24:53Z

> The cast is not ideal. Perhaps extractBytes2_4(uint8) -> bytes2 and extractUint16_4(uint8) -> uint16 should be separate functions.

But then if you want to extract a uint from a bytes, or vice versa, that is 4 functions you need ...

Use case: accountGasLimits and gasFees in PackedUserOperation.sol are each two uint128 values packed into a bytes32.

(cf this utils)

> Packing.pack(left.asPackedBytes2(), right.asPackedBytes2()).extract2(0).asUint16()

This is testing code, I don't expect that to ever be done in production. IMO production cases are:

// somewhere in contract X
somevariable = Packing.pack(x.asPackedBytes16(), y.asPackedBytes16()).asBytes32();

// somewhere in another contract (or at least another function)
parsed = somevariable.asPackedBytes32().extract16(0).asUint128();

Would we prefer this instead?

// somewhere in contract X
somevariable = Packing.pack_16_16(bytes16(x), bytes16(y));

// somewhere in another contract (or at least another function)
parsed = uint128(somevariable.extract_32_16(0));

I'm honestly not sure.
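For readers who want to see what the second form computes, here is a minimal JavaScript model of packing two uint128 values into a bytes32 and reading one back. The functions mirror the names `pack_16_16` and `extract_32_16` from the snippet above but are stand-ins written for illustration, treating bytes32 as a 256-bit BigInt whose byte 0 is the most significant; they are not the Solidity implementation and the example values are arbitrary.

```javascript
const MASK_128 = (1n << 128n) - 1n;

// Stand-in for pack_16_16: `left` fills bytes 0..15, `right` fills bytes 16..31.
function pack_16_16(left, right) {
  return ((left & MASK_128) << 128n) | (right & MASK_128);
}

// Stand-in for extract_32_16: read 16 bytes starting at byte `offset` (0..16).
function extract_32_16(self, offset) {
  if (offset < 0 || offset > 16) throw new RangeError('offset out of range');
  // Shift the requested 16 bytes down to the low end, then mask them out.
  return (self >> BigInt(8 * (16 - offset))) & MASK_128;
}

const x = 150000n; // example uint128 value
const y = 21000n; // example uint128 value

const somevariable = pack_16_16(x, y);
console.log(extract_32_16(somevariable, 0) === x); // true: bytes 0..15 hold x
console.log(extract_32_16(somevariable, 16) === y); // true: bytes 16..31 hold y
```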

frangio · 2024-05-29T19:49:25Z

> somevariable = Packing.pack_16_16(bytes16(x), bytes16(y));

Why do you need the cast there?

Amxx · 2024-05-29T21:21:12Z

x and y may be uint128

frangio · 2024-05-30T18:58:11Z

@Amxx @ernestognw How do you feel about the new API?

ernestognw · 2024-05-30T19:46:14Z

> Something that we never considered, but that may be an alternative is to:

> not ship the procedurally generated solidity files in the npm package

> ship the generation code in the npm package

> automatically run the generation code when the package is pulled.
> That would reduce the size of the package, but also add some moving pieces. We would have to be super careful about dependencies.

Yeah this is a great idea but I share the concerns with dependencies. I've seen people modifying our procedural generation scripts for custom types. I don't remember the contest but I think it was Checkpoints with a different key size (eg. uint64). So I guess we don't need to distribute them formally.

> IMO next steps are

> determine which version we want to keep (if not all)

> decide on the delivery (if we feel what we want to distribute is too big)

I do prefer to distribute a single, "small enough" library in a Solidity file. I like the current API since I felt initially that we were creating UDVTs for types already defined in the language. I couldn't formulate a concern from that initially because I tend to prefer verbosity.

Reading the current API is less clear in my opinion, especially considering @Amxx example:

parsed = uint128(somevariable.extract_32_16(0));
// vs
parsed = somevariable.asPackedBytes32().extract16(0).asUint128();

I think the second is clearer but honestly it's a matter of taste. I'd be ok shipping either.

> The cast is not ideal. Perhaps extractBytes2_4(uint8) -> bytes2 and extractUint16_4(uint8) -> uint16 should be separate functions.

> But then if you want to extract a uint from a bytes, or vice versa, that is 4 functions you need ...

I also strongly agree with this. Providing the casting variants would be too much in my opinion

Amxx · 2024-05-30T20:57:13Z

We could replicate the second syntax using the current library if we add (in another PR, let's not do that now) a Cast library that provides all the asXXX primitives you would need

library Cast {
    function asUint256(bytes32 self) internal pure returns (uint256) { return uint256(self); }
    function asBytes32(uint256 self) internal pure returns (bytes32) { return bytes32(self); }
    function asAddress(bytes20 self) internal pure returns (address) { return address(self); }
    // ...
}

ernestognw · 2024-06-04T00:18:59Z

scripts/generate/templates/Packing.js

+  ) internal pure returns (bytes${outer} result) {
+    bytes${inner} oldValue = extract_${outer}_${inner}(self, offset);
+    assembly ("memory-safe") {
+      result := xor(self, shr(mul(8, offset), xor(oldValue, value)))


The double xor trick might require an explanation somewhere, but I get that we're avoiding comments on purpose.

For the record:

the "inner" xor (xor(oldValue, value)) computes the "difference" between the old and the new values.

then we shift this difference to place it at the right location

then the "outer" xor applies this difference to self
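The three steps above can be checked with a small JavaScript model. It assumes the EVM convention that a bytesN value is left-aligned in a 256-bit word; `toWord`, `toHex`, `extract` and `replace` are hypothetical helpers written for this sketch (with no bounds checks), not the generated Solidity.

```javascript
const WORD_BITS = 256n;
const WORD_MASK = (1n << WORD_BITS) - 1n;

// Left-align an n-byte hex value in a 256-bit word, as the EVM does for bytesN.
const toWord = (hex, n) => BigInt('0x' + hex) << (WORD_BITS - 8n * BigInt(n));
// Read the n leftmost bytes of a word back as a hex string.
const toHex = (word, n) =>
  (word >> (WORD_BITS - 8n * BigInt(n))).toString(16).padStart(2 * n, '0');

// Read `inner` bytes starting at byte `offset`; the result is left-aligned.
function extract(self, offset, inner) {
  // Mask with only the top `inner` bytes set.
  const innerMask = WORD_MASK ^ (WORD_MASK >> (8n * BigInt(inner)));
  return (self << (8n * BigInt(offset))) & WORD_MASK & innerMask;
}

// Double-xor replace: xor(self, shr(8 * offset, xor(oldValue, value))).
function replace(self, value, offset, inner) {
  const oldValue = extract(self, offset, inner);
  const difference = oldValue ^ value; // inner xor: bits that must flip
  const shifted = difference >> (8n * BigInt(offset)); // move to the target location
  return self ^ shifted; // outer xor: apply the difference to self
}

// Replace bytes 1..2 of the bytes4 value 0xaabbccdd with 0x1234.
const self = toWord('aabbccdd', 4);
const result = replace(self, toWord('1234', 2), 1, 2);
console.log(toHex(result, 4)); // aa1234dd
```

The trick relies on `value` being clean, meaning it has no bits set outside its `inner` leftmost bytes; otherwise the shifted difference would also flip bits to the right of the replaced range.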

ernestognw

LGTM. I added a small guide in the utilities docs and a usage example.

frangio · 2024-06-11T14:54:12Z

Looks nice 👍

cairoeth

looks great! are there any packing use cases in 5.1 that we should update to use this lib?

ernestognw · 2024-06-11T18:06:48Z

Alright, thanks everyone for your input. I'm happy with the result and glad we took the time to iterate it before making a release 🙌🏻

Amxx added 8 commits May 22, 2024 20:37

starting work on a procedurally generated packing library

a660637

update

96a3540

update

54f4755

update

8905c27

re-design

575df4d

reduce fuzzing runs

56e1858

first hardhat testing

1bcc626

unit tests for coverage

9f1c891

Amxx mentioned this pull request May 27, 2024

Procedurally generate packing libraries #5053

Closed

4 tasks

Amxx changed the title ~~Extended packing and extracting library for bytesXx~~ Extended packing and extracting library for bytesXX May 27, 2024

Amxx added 3 commits May 27, 2024 14:11

Merge branch 'master' into procedural/packing

5e8fea4

coverage

3f0ffae

Merge remote-tracking branch 'amxx/procedural/packing' into procedura…

df69368

…l/packing

Amxx force-pushed the procedural/packing branch from 6d053ae to df69368 Compare May 27, 2024 12:30

Amxx changed the title ~~Extended packing and extracting library for bytesXX~~ Extended packing and extracting library for value types May 27, 2024

Amxx added 2 commits May 27, 2024 15:12

add support for address

257b579

refactor tests

dedb389

Amxx requested review from ernestognw May 27, 2024 18:41

add Packing.replace

75d096c

ernestognw reviewed May 29, 2024

Amxx requested a review from frangio May 29, 2024 08:29

Amxx added 2 commits May 29, 2024 10:36

reduce supported types in Packing.sol (proposal)

c30cea3

fuzzing replace

796b79d

refactor: use bytesXX and explicit casting instead of UDVT

7e96042

Amxx force-pushed the procedural/packing branch from 8e009a6 to 7e96042 Compare May 29, 2024 18:51

Amxx added 2 commits May 30, 2024 15:20

unit testing / coverage

bb68006

update changeset from previous PR

ce79f8e

Amxx added this to the 5.1 milestone Jun 3, 2024

ernestognw mentioned this pull request Jun 3, 2024

Procedurally generate packing libraries #5051

Closed

3 tasks

ernestognw reviewed Jun 4, 2024

ernestognw requested a review from cairoeth June 4, 2024 00:22

ernestognw added 6 commits June 10, 2024 13:19

Add usage guide in utils docs

bdd2a5f

Adjust example without UDVTs

4002c8b

Fix example and generation

6c61ccf

Codespell

db4a4bb

Merge branch 'master' into procedural/packing

905cc40

Fix packing?

038e2dc

ernestognw approved these changes Jun 10, 2024

cairoeth approved these changes Jun 11, 2024

cairoeth mentioned this pull request Jun 11, 2024

Update packings to use utility library #5075

Open

ernestognw merged commit dc62599 into OpenZeppelin:master Jun 11, 2024
21 checks passed

Amxx deleted the procedural/packing branch June 11, 2024 20:00
