
Peg parser optimizations #2878

Closed
dolik-rce opened this issue Feb 19, 2021 · 9 comments

Comments

@dolik-rce
Contributor

Following up from discussion in #2866 about peg parser speed and possible optimization, I'd like to use this issue to share some of my findings and to discuss possibilities for optimization.

As a first experiment, I wrote a simple script that optimizes the grammar. It reduces the number of rules in the grammar, which in turn means fewer allocations and less overhead. The real-world results are quite good: for the Kotlin parser, the grammar shrinks to less than half its size (from 517 to 222 rules) and the runtime is ~40% faster.
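To illustrate the kind of rewrite such a script can perform, the hypothetical rules below (invented for illustration, not taken from the actual Kotlin grammar) show how trivial wrapper rules that are referenced only once can be inlined into their caller, removing a level of indirection and the allocations that go with it:

```
# before: three rules, two of them trivial wrappers
sign    <- '+' / '-'
digits  <- [0-9]+
number  <- sign? digits

# after: the wrappers are inlined, leaving a single rule
number  <- ('+' / '-')? [0-9]+
```

Fewer rules means fewer rule invocations per input character, which is where the reduced allocation overhead comes from.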

This approach is very basic and could be improved considerably, especially if it were implemented directly in packcc. However, that is probably not going to happen, since one of packcc's goals is to generate readable parsers, which would no longer be true if the grammar were heavily optimized.

Another option is to add a script like this to the ctags build process, preprocessing the peg files before they are compiled. The full original grammar could still be used in debug mode, to make development easier.

The 40% speed-up is nice, but it is still orders of magnitude slower than the custom C parsers. I'll definitely continue to look into the packcc internals; I believe there must be some way to make the generated parsers faster (and to reduce the memory overhead).

@dolik-rce
Contributor Author

I've discovered a couple of places where realloc is called unnecessarily. Fixing that shaves off about 13% of the runtime. I will send a PR to packcc soon.

@masatake: By the way, do you have any plans to migrate to the upstream packcc?

@dolik-rce
Contributor Author

I also have a very strong suspicion that the allocations can be optimized much further. By logging all de/allocations, I discovered that there are significantly more calls to free than to malloc. For a very simple Kotlin script containing just the single line val x=1, there are:

  • 4615 calls to malloc
  • 43 calls to realloc
  • 5273 calls to free

The log file also shows that there is many small (64 bytes or less) objects (lr answers, captures, thunks, chunks, ...) that are alocated only to be immediately freed again as the algortihm backtracks. I have a strong suspicion, that a simple memory pool (or object pool) would help a lot. I'll try to create some proof of concept implementation, at least for some of the structures, just to be able to measure the performance impact.

@masatake
Member

@masatake: By the way, do you have any plans to migrate to the upstream packcc?

Yes. We should use the upstream version.
However, there are some ctags specific changes in our version.
See #2866 (comment) .

@masatake
Member

I found that the object pool must be implemented inside struct foo_context_tag.
A file-scope static pool_t pool64 = ... is not allowed, because being thread-safe and reentrant is one of the features of packcc.

@dolik-rce
Contributor Author

I implemented a very simple fixed-size memory pool and modified packcc to use it for all allocations of pcc_lr_answer_t, which is the most frequently allocated type. The resulting Kotlin parser runs about 10% faster. Converting all of the allocations could yield (by my wild guess) a 30-35% speedup.

However, more testing is needed, since the pool is not yet suitable for general use. Some of the Kotlin files I use for testing require quite a lot of memory, while others need only a few thousand simultaneously allocated objects. There is no reasonable limit to set as a default size, so the pool must be dynamic, which complicates the implementation quite a bit. Only proper testing will reveal whether the pool's overhead ends up lower than using malloc/free in the first place...

@dolik-rce
Contributor Author

Yes. We should use the upstream version.

Ok, I will suggest the optimizations directly to upstream. You can either merge them to ctags later or use them after ctags switches to upstream packcc.

Thread-safe and reentrant is one of the features of packcc.

I wasn't aware of that, but since I followed the general code style used in packcc, the pool actually turned out to be both thread-safe and reentrant :-)

@masatake
Member

All the changes made by @dolik-rce have been merged into the upstream project.
Through the merging process, I watched whether the upstream project is healthy.
I think it is mostly healthy.

Now we must move to the upstream version.
The first thing we have to do is solve universal-ctags/packcc#5:

  1. Export all changes made in https://github.com/universal-ctags/packcc to the upstream project.
  2. Rebase https://github.com/universal-ctags/packcc onto the upstream.
  3. Update the build system of ctags.

@masatake
Member

I listed the changes developed in the u-ctags project:
universal-ctags/packcc#5

@masatake
Member

masatake commented Jun 1, 2021

I think now we can close this.
