Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(semantic): Add index mapping from span to reference id #270

Merged
merged 2 commits into from
Apr 10, 2023

Conversation

YangchenYe323
Copy link
Contributor

As a temporary workaround, save mappings from Span to ResolvedReferenceId to enable searching for semantic information given an IdentifierReference

@YangchenYe323
Copy link
Contributor Author

Benchmark is taking forever 😓 Maybe we'll fall back to a linear search instead

@Boshen
Copy link
Member

Boshen commented Apr 8, 2023

@YangchenYe323 I found where your algorithm complexity blew up:

https://github.com/Boshen/oxc/blob/024f1a1552516243ab97e60e5a404e14874ca91e/crates/oxc_semantic/src/scope/builder.rs#L69-L85

You don't need to walk up the scope tree here. Pushing the unresolved references to the parent scope is enough. because the parent scope exit will pick up this routine again and carry the unresolved references upwards. Any IdentifierReference is an unresolved reference when the Top scope's unresolved_references contains this IdentifierReference.

I think we should move scope.unresolved_references to ScopeBuilder, and then move it to ScopeTree?

Note: For you need to actively check for Top scope in order to find the correct scope to add the unresolved_references.

@YangchenYe323
Copy link
Contributor Author

You don't need to walk up the scope tree here. Pushing the unresolved references to the parent scope is enough. because the parent scope exit will pick up this routine again and carry the unresolved references upwards. Any IdentifierReference is an unresolved reference when the Top scope's unresolved_references contains this IdentifierReference.

I think we should move scope.unresolved_references to ScopeBuilder, and then move it to ScopeTree?

Note: For you need to actively check for Top scope in order to find the correct scope to add the unresolved_references.

I've made a new commit. Is it what you're suggesting? I experimented with it and found that it didn't give much boost in performace:

Screen Shot 2023-04-08 at 1 10 28 PM

The overall complexity would remain the same either way because the total number of references we're resolving is the same. The depth of the scope tree is not a dominating factor.

@YangchenYe323
Copy link
Contributor Author

YangchenYe323 commented Apr 8, 2023

The overall complexity would remain the same either way because the total number of references we're resolving is the same. The depth of the scope tree is not a dominating factor.

For babylon.max.js, the index's capacity grows to as many as 114688 entries, if each entry is a (Span, ResolvedReferenceId) pair, the index is 2MB.

@Boshen
Copy link
Member

Boshen commented Apr 9, 2023

I wonder if it's due to hashing two u32 resulting too many hash collisions, FxHash is horrible in this regard ...

Maybe we'll fall back to a linear search instead

Shall we try with a Vec, sort it in the final build phase, and do binary search for lookup instead?

@YangchenYe323
Copy link
Contributor Author

I wonder if it's due to hashing two u32 resulting too many hash collisions, FxHash is horrible in this regard ...

Maybe we'll fall back to a linear search instead

Shall we try with a Vec, sort it in the final build phase, and do binary search for lookup instead?

Unfortunately we can't. Currently ResolvedReferenceId is the index into the resolved_reference vector. If we sort it the ids all point to wrong index, and the references in the Symbol are invalidated as well.

But I found a workaround: use BTreeMap. The overhead seems negligible.

@YangchenYe323 YangchenYe323 force-pushed the feat/reference_to_global branch 2 times, most recently from c26ae88 to 4204002 Compare April 9, 2023 20:00
@YangchenYe323
Copy link
Contributor Author

I wonder if it's due to hashing two u32 resulting too many hash collisions, FxHash is horrible in this regard ...

I couldn't reproduce this issue in isolated micro benchmarks (FxHashMap is a lot faster than BTreeMaps). But the memory footprint of FxHashMap is a lot larger than BtreeMap in our case:
Screen Shot 2023-04-09 at 3 52 26 PM

@Boshen Boshen merged commit 35b19e7 into oxc-project:main Apr 10, 2023
@Boshen
Copy link
Member

Boshen commented Apr 10, 2023

The benchmark is finally back to normal. We can finally focus our energy on the HIR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants