Skip to content
This repository has been archived by the owner on Feb 26, 2024. It is now read-only.

Various improvements to truffle opcode and @truffle/code-utils #5408

Merged
merged 13 commits into from
Aug 10, 2022

Conversation

haltman-at
Copy link
Contributor

@haltman-at haltman-at commented Aug 6, 2022

OK, I maybe stuffed a bit too many different things into this one. XD (Note: Contains breaking changes to @truffle/code-utils.)

This PR:

  1. Fixes various bugs in truffle opcode and @truffle/code-utils
  2. Reworks the interface to @truffle/code-utils
  3. Adds handling to @truffle/code-utils for Vyper 0.3.4
  4. Adds handling to truffle opcode for really old Vyper versions :)
  5. A few smaller things

In order to understand how this PR came about, it's useful to understand a bit of the history here. Originally, the parseCode function exported by @truffle/code-utils took only one argument, the bytecode to be disassembled. It also attempted to strip the metadata off the end. However, metadata stripping is unreliable, so at some point I added a second argument. The second argument is the number of instructions expected -- as might be gotten from a source map -- and if not passed it behaves the old way, where it attempts to strip the metadata. I also altered the various invocations of parseCode to pass this second argument, so we wouldn't actually be using the metadata stripping.

This is a terrible interface! I had planned to make a breaking change and redo it when it came time to add EOF support, but, um, I ended up doing it now. See, the reason this PR came about is I noticed a bug in Truffle Opcode: When I changed it to use the source map, I also made it so that it assumed the source map was present; if it was absent, it would crash. Oops.

So, first things first, I've changed Truffle Opcode so it no longer makes that assumption. Now, if the source map is present, it will pass the number of instructions, but if it's not present, it will attempt to perform metadata stripping instead.

Of course, as mentioned, I've also reworked the interface to parseCode; now, instead of the old way where metadata stripping is turned on if you don't pass an instruction count, instead the second argument is now an options argument, and both of these are now options. Obviously, this is a breaking change. (OK, I guess I could have preserved the old interface, but I didn't really see any reason to.) Since I was already making breaking changes to @truffle/code-utils, I also went ahead and removed some exports that were only there for compatibility. Of course I had to update the various invocations of parseCode as well, but that was straightforward.

(Note that my intent is that, when we implement EOF support, EOF bytecode will simply ignore these options, as the purpose of both options is for separating code from data, something that EOF does on its own.)

While I was at it, I also made it so that Truffle Opcode can use old-style Vyper sourcemaps as well as Solidity sourcemaps... OK, semi-old-style, ones recent enough that our shim can handle it. Sorry, the really old ones are still getting ignored! I also added special handling to parseCode so that, when metadata stripping is turned on, it will also strip out Vyper 0.3.4's idiosyncratic metadata format (that is now gone in Vyper 0.3.5, hooray). Note I did not make any changes to @truffle/codec's metadata handling for this; Codec's concern is bytecode matching, not disassembly, and since this special metadata is fixed, Codec doesn't need to give it any special treatment.

I also added two tests to @truffle/code-utils; one of the special metadata stripping for Vyper 0.3.4, and one of the more normal Solidity metadata stripping (which wasn't there before!). And it's a good thing I added that latter test, because it turns out that, uh, our Solidity metadata stripping was actually broken. (Good thing we were no longer using it anywhere! But now we will be again...) The problem is that we were using an old version of cbor which didn't accept Uint8Arrays. (This problem didn't affect Codec, as it was passing in hex strings instead.) So I updated cbor to 7.0.6; not to 8.x because 8.1.0 failed to build for some reason (I guess I could have tried an earlier 8.x version, but I didn't bother). That solved the problem. It also let me ditch the @types/cbor dependency, as cbor now supplies its own types. (Scratch that, see comments below -- I ended up not updating cbor and using a Buffer.from() instead.)

Also, it turns out that the @truffle/code-utils dependency was missing in core, so I added it. I wonder why depcheck didn't catch that...? @cds-amal you may want to know about this.

And that (together with some minor style and comment improvements) basically covers it! Note that I only added tests of parseCode, not of truffle opcode itself; I did the testing of the new truffle opcode cases (no sourcemap, old Vyper-style sourcemap) manually.

@haltman-at
Copy link
Contributor Author

Ugh, dammit, looks like cbor 7.0.6 is causing problems elsewhere. OK, I'll have to figure out what to do about that...

@haltman-at
Copy link
Contributor Author

haltman-at commented Aug 6, 2022

Ugh -- if I downgrade to cbor 6.0.1, then the bootstrap succeeds... except it takes forever and the resulting bundle size of Truffle Contract is huge. O_o

Meanwhile, if I downgrade to cbor 5.2.0, we hit the original problem.

Possible options here:

  1. Switch to a different CBOR-parsing package, because there are a number of them out there (cbor-x, cborg, and cbor-sync, to name a few)
  2. Stay on cbor 5.x, but work around the bug with a conversion.

I'm going to go with (2) for now, but wondering if other people have opinions!

(Note: We used to use borc for handling CBOR but it caused some serious problems -- it could crash Node on large input -- so we're definitely not switching back to that one.)

Copy link
Contributor

@benjamincburns benjamincburns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with this code, so I have a big pile of questions that probably more represents my lack of a deep knowledge of this part of our codebase moreso than shortcomings in this PR. Much gratitude in advance for bearing with me there! 😄

packages/code-utils/package.json Show resolved Hide resolved
packages/code-utils/src/index.ts Show resolved Hide resolved
packages/code-utils/src/index.ts Show resolved Hide resolved
packages/code-utils/src/index.ts Show resolved Hide resolved
packages/code-utils/test/index.test.ts Show resolved Hide resolved
packages/codec/lib/contexts/utils.ts Outdated Show resolved Hide resolved
packages/codec/package.json Show resolved Hide resolved
packages/core/lib/commands/opcode/run.js Show resolved Hide resolved
packages/code-utils/src/index.ts Show resolved Hide resolved
Copy link
Contributor

@benjamincburns benjamincburns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this with a couple of caveats.

It's my strong belief that if a code change doesn't introduce new bugs and reflects a net positive in value, we shouldn't block it from merging. We can always make further improvements later. That said, I think if we're seeing maintenance issues with the cbor package, we should investigate moving to one of the other CBOR parser packages. I don't think it's wise or necessary to block this change on completion of that effort, however.

As for my other unresolved comments, I think the same applies. We don't have a stated max cyclomatic complexity metric in our dev-handbook, so I consider that requested change optional.

Similarly, I don't think it's within the scope of the problem that's being solved here to fix the opaque/unreadable/write-only-code hex strings in the test suite, so I regard that requested improvement as optional as well.

@cds-amal
Copy link
Member

cds-amal commented Aug 9, 2022

Also, it turns out that the @truffle/code-utils dependency was missing in core, so I added it. I wonder why depcheck didn't catch that...? @cds-amal you may want to know about this.

Whoa! Thanks, @haltman-at!

Copy link
Member

@cds-amal cds-amal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'm approving pending resolution of @benjamincburns 's unresolved conversations.

packages/code-utils/src/index.ts Show resolved Hide resolved
packages/code-utils/src/index.ts Show resolved Hide resolved
packages/code-utils/test/index.test.ts Show resolved Hide resolved
@@ -109,8 +109,10 @@ var SourceMapUtils = {
}

//because we might be dealing with a constructor with arguments, we do
//*not* remove metadata manually
let instructions = CodeUtils.parseCode(binary, numInstructions);
//*not* pass attemptStripMetadata under any circumstances
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to explain why metadata is preserved for potential cstor with artuments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...so, I was going to explain why, but on writing out the explanation, I realized this doesn't actually make sense anymore! So I'll explain why I did it this way, and then I'll go change it. :P

Suppose you have a constructor, and you attempt to remove the metadata. But there are arguments on the end of the constructor -- meaning it's the arguments on the end, not the metadata! So the attempt to remove the metadata could go very badly. More specifically, if the last two bytes get interpreted as a large length, then a whole lot could be incorrectly removed, way more than just metadata!

The reason why this no longer applies is because these days we only remove valid metadata. The probability of us removing code by accident is quite low. That said, if the last two bytes are small, we could notionally remove parts of the arguments by accident... but who cares, arguments are data and not code!

Hm, I dunno. Maybe it's best to leave this as-is after all. This case honestly doesn't really matter that much. But I can at least explain it more?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants