Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(parser): remove TokenValue::BigInt from Token #1924

Merged
merged 1 commit into from
Jan 8, 2024

Conversation

Boshen
Copy link
Member

@Boshen Boshen commented Jan 7, 2024

This PR partially fixes #1803 and is part of #1880.

BigInt is removed from the Token value, so that the token size can be
reduced once we removed all the variants.

Token is now also Copy, which removes all the clone and drop calls.

This yields 5% performance improvement for the parser.

Copy link
Member Author

Boshen commented Jan 7, 2024

@github-actions github-actions bot added A-parser Area - Parser A-ast Area - AST A-codegen Area - Code Generation A-prettier Area - Prettier labels Jan 7, 2024
@Boshen Boshen requested a review from Dunqing January 7, 2024 12:50
@Dunqing

This comment was marked as resolved.

Copy link

codspeed-hq bot commented Jan 7, 2024

CodSpeed Performance Report

Merging #1924 will not alter performance

Comparing 01-07-refactor_parser_parse_BigInt_lazily (dad94f7) with main (c6eb519)

Summary

✅ 14 untouched benchmarks

@Boshen Boshen force-pushed the 01-07-refactor_parser_parse_BigInt_lazily branch from ef1143c to ae5f337 Compare January 7, 2024 13:13
@Boshen Boshen force-pushed the 01-07-refactor_parser_parse_BigInt_lazily branch from ae5f337 to 23d8767 Compare January 7, 2024 13:17
@Boshen
Copy link
Member Author

Boshen commented Jan 7, 2024

@Boshen
Copy link
Member Author

Boshen commented Jan 7, 2024

image
image

@overlookmotel
Copy link
Collaborator

overlookmotel commented Jan 7, 2024

Very nice performance gain! I assume that's due to reduced size of Token (and possibly also Token now being Copy and non-drop).

However...

I don't think BigintLiteral now containing a Box<BigInt> rather than just BigInt solves the memory leak.

To explain my thinking...

oxc_allocator::Box only contains a reference &'alloc mut T, not an owned T. So (if I understand right) it has no implicit Drop impl. And oxc_allocator::Box does not have an explicit Drop impl either.

This is different to Bumpalo's Box which ensures the content of the Box is dropped when the Box is with an explicit Drop impl:
https://docs.rs/bumpalo/latest/src/bumpalo/boxed.rs.html#337-344

So you could fix the memory leak by adding a Drop impl to oxc_allocator::Box. But I don't think you want to do that! If Box is drop, then Expression and Statement become drop, which means Program is drop too. So it would have the sizeable negative consequence that dropping an AST then involves traversing the entire AST to hunt for any Boxes which contain drop types.

The fact that it doesn't do this is I think a significant factor in why OXC is so fast, and I had assumed oxc_allocator::Box was designed this way intentionally to achieve this.

The only viable solution I can see to the memory leak, which preserves OXC's speed, is to replace BigInt with an ArenaBigInt<'a> which stores its internal content in the arena - so then there's no data stored external to the arena that needs to be dropped.

Therefore, I don't think the change from BigInt to Box<BigInt> gains anything, and is in fact marginally less performant than before (not that it matters hugely because bigints are so rare in JS code).

All the (very significant) gain in this PR is due to the other change - removing BigInt from TokenValue.

Or am I misunderstanding?

@overlookmotel
Copy link
Collaborator

By the way, if just removing Bigint from TokenValue gets a 5% speedup, that's a very positive sign that removing TokenValue entirely is going to be MASSIVE.

@Boshen
Copy link
Member Author

Boshen commented Jan 7, 2024

I copied the arena implementation from jsparagus https://github.com/mozilla-spidermonkey/jsparagus/blob/master/crates/ast/src/arena.rs

We'll have to experiment with the drop problem. oxc is being integrated into Rolldown (an incremental compiler), we are going to get into real trouble if memory leaks 💥


You're right about Box<BigInt>, let me drop the box from this PR for now, as it adds no functionality.

@Boshen Boshen force-pushed the 01-07-refactor_parser_parse_BigInt_lazily branch 2 times, most recently from d9a662f to 6f8a72d Compare January 7, 2024 15:00
This PR partially fixes #1803 and is part of #1800.

The BigInt value is now boxed, so it can be dropped along with the AST.

It is also removed from the `Token` value, so that the token size can be
reduced once we removed all the variants.
@Boshen Boshen force-pushed the 01-07-refactor_parser_parse_BigInt_lazily branch from 6f8a72d to dad94f7 Compare January 7, 2024 15:01
@Boshen Boshen merged commit 7eb2573 into main Jan 8, 2024
18 checks passed
@Boshen Boshen deleted the 01-07-refactor_parser_parse_BigInt_lazily branch January 8, 2024 04:37
@Boshen Boshen changed the title refactor(parser): parse BigInt lazily refactor(parser): remove TokenValue::BigInt from Token Jan 8, 2024
@Boshen Boshen mentioned this pull request Jan 8, 2024
IWANABETHATGUY pushed a commit to IWANABETHATGUY/oxc that referenced this pull request May 29, 2024
This PR partially fixes oxc-project#1803 and is part of oxc-project#1880.

BigInt is removed from the `Token` value, so that the token size can be
reduced once we removed all the variants.

`Token` is now also `Copy`, which removes all the `clone` and `drop`
calls.

This yields 5% performance improvement for the parser.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ast Area - AST A-codegen Area - Code Generation A-parser Area - Parser A-prettier Area - Prettier
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak in arena with Atoms
3 participants