address_lists_parser.rb requires a lot of memory (~30 mb) #1342

Closed
schneems opened this issue May 29, 2019 · 5 comments

@schneems

Loading address_lists_parser.rb accounts for about 1/3 of the startup memory for CodeTriage.

TOP: 105.7266 MiB
  mail/parsers: 31.9609 MiB
    mail/parsers/address_lists_parser: 29.7266 MiB

Measured using the derailed_benchmarks gem on codetriage/codetriage.

It looks like it is generated by a parser generator. Is there some way we can reduce that memory overhead?
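For readers who want a rough way to reproduce this kind of per-require number without extra gems, here is a minimal sketch. It is not the same measurement that derailed_benchmarks reports: it only approximates Ruby heap growth with `ObjectSpace.memsize_of_all` from the standard `objspace` extension. The helper names `heap_growth_bytes` and `require_cost_mib` are hypothetical, and `"json"` is just an example feature to require.

```ruby
require "objspace"

# Approximate growth of the Ruby heap (in bytes) caused by the given block.
# Assumption: memsize_of_all is a rough proxy; it does not capture all
# memory the process may hold (for example, allocator overhead).
def heap_growth_bytes
  GC.start
  before = ObjectSpace.memsize_of_all
  yield
  GC.start
  ObjectSpace.memsize_of_all - before
end

# Approximate cost of requiring a feature, in MiB.
# If the feature is already loaded, the result will be close to zero.
def require_cost_mib(feature)
  bytes = heap_growth_bytes { require feature }
  (bytes / (1024.0 * 1024.0)).round(4)
end

puts format("%s: %.4f MiB", "json", require_cost_mib("json"))
```

The number only means something the first time a feature is required in a process, so run it in a fresh interpreter for each file you want to compare.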

@schneems schneems changed the title address_lists_parser.rb requires a lot of memory (30+ mb) address_lists_parser.rb requires a lot of memory (~30 mb) May 29, 2019
@ahorek
Contributor

ahorek commented May 30, 2019

There's an option to recompile the Ragel sources for lower memory consumption, but it hurts performance; see #1215.

@jeremy any plans to support ragel 7?

@schneems
Author

schneems commented May 30, 2019

The Ragel Bitmap PR looks pretty promising. Doesn’t look like it affects performance.

I tested with derailed, and instead of 30 MB I’m seeing about 5 MB, which is a pretty significant improvement.

Edit: it totally affects performance.

@jeremy
Collaborator

jeremy commented May 30, 2019

Remember #812, #815, #912? 😅

Lazy-loading the parsers is desirable, considering that most usage is building and delivering messages, not parsing them. But the lazy-loading has to work consistently.
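As a minimal sketch of what lazy-loading a heavy parser can look like in plain Ruby, the example below uses `Module#autoload`. It is an illustration only and not how the mail gem wires up its parsers: `LazyDemo`, `HeavyAddressParser`, and the temporary file are hypothetical stand-ins for a large generated parser.

```ruby
require "tmpdir"

# Hypothetical stand-in for a large generated parser file.
PARSER_DIR = Dir.mktmpdir("lazy_parser_demo")
PARSER_PATH = File.join(PARSER_DIR, "heavy_address_parser.rb")

File.write(PARSER_PATH, <<~RUBY)
  module LazyDemo
    module HeavyAddressParser
      # Stand-in for big generated transition tables.
      TABLE = Array.new(10_000) { |i| i }.freeze

      def self.parse(list)
        list.split(",").map(&:strip)
      end
    end
  end
RUBY

module LazyDemo
  # Register the constant without loading the file yet.
  autoload :HeavyAddressParser, PARSER_PATH

  # True once the parser file has actually been loaded.
  def self.parser_loaded?
    autoload?(:HeavyAddressParser).nil?
  end
end

# A process that only builds and delivers messages never pays for the parser.
LOADED_BEFORE = LazyDemo.parser_loaded?

# The first reference to the constant triggers the load.
PARSED = LazyDemo::HeavyAddressParser.parse("a@example.com, b@example.com")
LOADED_AFTER = LazyDemo.parser_loaded?

puts "loaded before first use: #{LOADED_BEFORE}"
puts "loaded after first use:  #{LOADED_AFTER}"
```

The memory cost moves from boot time to first use, so an app that parses mail on every request still pays it; the saving is for processes that never touch the parser.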

@jeremy jeremy closed this as completed May 30, 2019
@ahorek
Contributor

ahorek commented May 30, 2019

Well, it depends. For our helpdesk we use the mail gem for parsing incoming messages a lot.
Ragel has never generated performant code in Ruby, but the question is whether 30 MB per instance really matters these days...

@schneems
Author

schneems commented Jun 4, 2019

😅 Sorry for the duplicate issue spam. I was working with someone on derailed benchmarks and mis-remembered that somehow I had gotten away without the memory increase for https://www.codetriage.com.

While it doesn't look like there is an obvious replacement (one that satisfies all memory and speed requirements), it did generate some good discussion and eyeballs in #1343. Whether or not any changes end up getting merged, there are some new techniques in there that I've not seen before.

> the question is if 30MB per instance really matters these days...

It matters more for memory-constrained implementations, FaaS and PaaS. Usually there's a line somewhere, and if you're over it by even a few MB, you'll start to swap and incur enormous perf penalties.

I certainly understand that for some, absolute performance is important. I also understand that one of the goals of the project is to not have a C-extension dependency.

Thanks for your time and maintenance!
