Minimize impact on performance #20

Open
sam-goodwin opened this issue Aug 14, 2022 · 3 comments

@sam-goodwin

SWC currently emits array literals for s-expressions, which seems to be impacting performance, especially when the transformer is enabled on node_modules.

To fix, there are various avenues to explore:

  1. Emit a string instead of an array literal. Lexing and parsing JavaScript is much more expensive than parsing a JSON string, so we can embed the data as a string and parse it at runtime using JSON.parse. This is a common technique used by bundlers to optimize embedded literal data.
  2. Same as (1), except using something more optimal than JSON, e.g. FlatBuffers.
  3. Only emit references to free variables and then parse the function.toString(). We could even use SWC's JS bindings to do the parsing. This seems to be the most compact form.
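
A rough sketch of option (1), for illustration only. The `[kind, span, ...children]` tuple layout, the `SExpr` type, and the `lazyAst` helper are assumptions made up for this example, not the transformer's actual output. Decoding lazily is an extra design choice layered on top of the idea, so that modules whose AST is never inspected skip the parse entirely.

```typescript
// Hypothetical s-expression shape: [kind, span, ...children].
type SExpr = string | number | null | SExpr[];

// Conceptually what is emitted today: a nested array literal that the
// JavaScript engine has to lex and parse as code when the module loads.
const asLiteral: SExpr = ["FunctionDecl", [0, 42], ["Identifier", [9, 12], "foo"]];

// Option (1): emit the same data as a single JSON string literal instead.
const asJson = '["FunctionDecl",[0,42],["Identifier",[9,12],"foo"]]';

// Decode on first access and cache the result.
function lazyAst(json: string): () => SExpr {
  let cached: SExpr | undefined;
  return () => {
    if (cached === undefined) cached = JSON.parse(json) as SExpr;
    return cached;
  };
}

const getAst = lazyAst(asJson);
console.log(getAst());
```

Whether this actually helps would need to be measured on real output, since the gain depends on how large the embedded literals are.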
@sam-goodwin
Author

I did some research into serialization formats and I came to the conclusion that our best option is likely MessagePack. I also considered BSON, ProtoBuf, FlatBuffers and pure JSON.

Reasons for MessagePack:

  1. by far the fastest serialization format I could find
  2. doesn't require a schema, unlike ProtoBuf or FlatBuffers, which is perfect for our s-expression encoding.
  3. faster and more compact than BSON

The most obvious question is: why not JSON? IMO, JSON is unacceptable because we store a lot of numbers for the Span. Encoding numbers as text is highly inefficient, as it takes 1 byte per character.
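
To make the size argument concrete, here is a small sketch comparing the bytes a non-negative integer takes as JSON text with the bytes it takes in MessagePack's unsigned integer formats. The function names are made up for this example. It only counts bytes; it says nothing about parse speed.

```typescript
// Bytes needed to write a non-negative integer as JSON text: one per digit.
function jsonBytes(n: number): number {
  return String(n).length;
}

// Bytes needed by MessagePack's unsigned integer formats.
function msgpackUintBytes(n: number): number {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("expected a non-negative integer");
  }
  if (n <= 0x7f) return 1;       // positive fixint
  if (n <= 0xff) return 2;       // uint 8
  if (n <= 0xffff) return 3;     // uint 16
  if (n <= 0xffffffff) return 5; // uint 32
  return 9;                      // uint 64
}

for (const n of [7, 100, 1000, 70000, 123456]) {
  console.log(n, jsonBytes(n), msgpackUintBytes(n));
}
// prints:
// 7 1 1
// 100 3 1
// 1000 4 3
// 70000 5 5
// 123456 6 5
```

JSON also spends one extra byte on a comma between array elements, which MessagePack does not need.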

The challenge with moving to byte encoding is that I don't know how to represent undefined.

@thantos
Contributor

thantos commented Aug 25, 2022

> The challenge with moving to byte encoding is that I don't know how to represent undefined.

0 or 255 (max byte)

@sam-goodwin
Author

It will have to be compatible with the MessagePack encoding.
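
For context on the compatibility constraint: in MessagePack the byte 0x00 already means the integer 0 (positive fixint) and 0xff already means -1 (negative fixint), so neither can serve as a bare marker for undefined. The format does have nil (0xc0), which maps naturally to null, and extension types with application-defined type codes. One possible approach, sketched below under stated assumptions, is to reserve an extension type for undefined. The type code `1`, the `Value` type, and this hand-rolled encoder and decoder (which cover only a tiny subset of the format) are all hypothetical and not taken from any library.

```typescript
type Value = undefined | null | number | Value[];

// Hypothetical application-defined extension type code for `undefined`.
const UNDEFINED_EXT_TYPE = 1;

function encode(value: Value, out: number[] = []): number[] {
  if (value === undefined) {
    // fixext 1: marker 0xd4, then the type code, then one data byte.
    out.push(0xd4, UNDEFINED_EXT_TYPE, 0x00);
  } else if (value === null) {
    out.push(0xc0); // nil
  } else if (typeof value === "number") {
    if (!Number.isInteger(value) || value < 0 || value > 0x7f) {
      throw new RangeError("sketch only handles positive fixint (0..127)");
    }
    out.push(value); // positive fixint encodes as the value itself
  } else {
    if (value.length > 15) {
      throw new RangeError("sketch only handles fixarray (up to 15 items)");
    }
    out.push(0x90 | value.length); // fixarray
    for (const item of value) encode(item, out);
  }
  return out;
}

function decode(bytes: number[], pos = { i: 0 }): Value {
  const b = bytes[pos.i++];
  if (b <= 0x7f) return b;      // positive fixint
  if (b === 0xc0) return null;  // nil
  if (b === 0xd4) {             // fixext 1
    const type = bytes[pos.i++];
    pos.i++;                    // skip the single data byte
    if (type === UNDEFINED_EXT_TYPE) return undefined;
    throw new Error(`unknown ext type ${type}`);
  }
  if ((b & 0xf0) === 0x90) {    // fixarray
    const items: Value[] = [];
    for (let n = b & 0x0f; n > 0; n--) items.push(decode(bytes, pos));
    return items;
  }
  throw new Error(`unsupported marker 0x${b.toString(16)}`);
}

const bytes = encode([0, undefined, null]);
console.log(bytes.map((x) => x.toString(16).padStart(2, "0")).join(" "));
// prints: 93 00 d4 01 00 c0
```

A standard MessagePack decoder that does not know the extension would still parse this stream and surface the value as an opaque extension, which is the sense in which the output stays compatible.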
