Obvious Memory Usage Reductions #386

Closed
TheBlueMatt opened this issue May 17, 2020 · 8 comments · Fixed by #388
It seems like one of the biggest memory users is just the gunicorn workers, so it would be nice to be able to reduce the gunicorn worker count with an env variable (I set it down to 1 and didn't notice any performance change). That, plus setting RABBITMQ_IO_THREAD_POOL_SIZE: 16 in docker-compose.yaml, seems to have reduced memory usage by about a full GB.
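For reference, roughly what that looks like in docker-compose.yaml. The service names below are placeholders for whichever services actually run RabbitMQ and gunicorn, and GUNICORN_WORKERS is a hypothetical variable since no such knob exists yet (hence this issue):

```yaml
# Sketch only: adjust service names to match the stock docker-compose.yaml.
services:
  rabbitmq:
    environment:
      RABBITMQ_IO_THREAD_POOL_SIZE: 16   # lowered to save memory
  backend:
    environment:
      GUNICORN_WORKERS: 1                # hypothetical env var; the worker count is currently not configurable
```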

slowr self-assigned this May 17, 2020
slowr (Member) commented May 17, 2020

Hey @TheBlueMatt, thanks for reaching out and for the feedback.

We have a wide variety of users who can actually benefit from RabbitMQ's larger thread pool size, so it wouldn't be ideal to just reduce it by default (the extra gunicorn workers are probably less beneficial, though).

That said, what do you think of adding variables for the gunicorn worker count and the RabbitMQ pool size to the .env file? The defaults can stay as they are now so we don't break running deployments, and you can keep your own customized copy with the lower values. That way it will also be easier to configure when downloading new versions.
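Something along these lines in .env, with docker-compose.yaml passing the values through via the usual ${VAR} substitution. The names and defaults below are just a sketch; the final ones would be settled in the PR:

```
# Sketch of possible .env additions; values shown are placeholders, not the current defaults.
GUNICORN_WORKERS=4
RABBITMQ_IO_THREAD_POOL_SIZE=128
```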

slowr added the configuration, enhancement (New feature or request), and p/medium (Medium priority) labels May 17, 2020
slowr added this to To do in build-system via automation May 17, 2020
TheBlueMatt (Author) commented May 17, 2020

Right, that's what I was figuring - just make them options. Though, really, the real issue is the bgpstreamlive stuff - it uses about as much memory as everything else combined. It's easy enough to just turn it off and use RIPE RIS, though I'm not sure why it could possibly use so much memory when it appears to do the same thing as the RIS feed listener.

slowr (Member) commented May 18, 2020

> Right, that's what I was figuring - just make them options. Though, really, the real issue is the bgpstreamlive stuff - it uses about as much memory as everything else combined. It's easy enough to just turn it off and use RIPE RIS, though I'm not sure why it could possibly use so much memory when it appears to do the same thing as the RIS feed listener.

BGPStream offers more reliability as a different source of truth and improves coverage, since it serves not only RIPE RIS updates but also Route Views, RIPE dumps, and real-time BMP data.

Hope that makes more sense now. I will open a PR with a quick fix to introduce the two variables you mentioned.

TheBlueMatt (Author) commented May 18, 2020

> BGPStream offers more reliability as a different source of truth and improves coverage, since it serves not only RIPE RIS updates but also Route Views, RIPE dumps, and real-time BMP data.

Right, I'd love to use BGPStream instead, but I don't really get how it uses 1.5-2GB of memory to monitor 10 prefixes. For now it's not really worth dedicating that much to it (at least when the RIPE RIS direct and bgpstreamkafka BMP listeners use 100x less memory and appear to have about as much incoming data, not to mention the redundancy of reading via two separate listener processes instead of one common codebase).

slowr (Member) commented May 19, 2020

The developers of BGPStream suggested trying the LIBTRACEIO=buffers=1 option when running the monitor container. Can you try adding it to the docker-compose.yaml environment variables and check the memory consumption?

From the devs:

> Can you try setting LIBTRACEIO=buffers=1 when you launch the monitor process?
>
> With our limited testing we found that setting it reduces memory consumption significantly. The downside is that it slows down the processing of dump files (by about 1/3 in the limited test cases).
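Concretely, something like this under the monitor service in docker-compose.yaml (assuming the service is indeed named monitor in your compose file):

```yaml
services:
  monitor:
    environment:
      LIBTRACEIO: buffers=1   # per the BGPStream devs: lower memory, ~1/3 slower dump processing
```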

build-system automation moved this from To do to Done May 19, 2020
TheBlueMatt (Author) commented

> Can you try setting LIBTRACEIO=buffers=1 when you launch the monitor process?

This appears to still use 100x or so more memory than bgpstreamkafka and riperis. I assume it's trying to do something much more than simply follow a stream of BGP updates (does it, say, hold the full table in memory?); maybe there's a way to turn off whatever feature it's trying to provide that the others do not?

alistairking (Collaborator) commented

This is because you're comparing the memory usage of a true realtime stream (BMP from bgpstreamkafka and RIS-Live) against a stream generated from many dump files.

In this case BGPStream is creating a stream of updates from the MRT dump files (it has to completely read all files from all collectors, since they are indexed by time and not by prefix). To do this it needs to open many files at the same time so that the stream comes out in chronological order. Even with only a (relatively) small buffer per file, this adds up when you consider that there are ~54 collectors across RV and RIS, and that for each collector BGPStream often needs to open multiple files to provide the best sorting (hence the ~100x is expected).

This gives data that is redundant (when things are operating correctly) with the RIS Live stream, but it adds all the RV data that is not (yet) available via a live stream.

If the memory consumption is too much for your system, I'd suggest you:

  • configure bgpstreamlive to only give data from RV (this will only ~halve memory usage); see the config sketch after this list
  • don't use bgpstreamlive (at the cost of missing out on data from almost all RV collectors)
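
For the first option, a rough sketch of how the monitors section of the ARTEMIS config could look; the exact keys are an assumption on my part, so double-check against the documentation:

```yaml
# Hypothetical sketch: keep RIS Live, restrict bgpstreamlive to Route Views only.
monitors:
  riperis: [""]        # all RIS Live collectors
  bgpstreamlive:
    - routeviews       # omitting "ris" here skips the RIS MRT dumps, which RIS Live already covers
```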

TheBlueMatt (Author) commented

Ah! Ok, thanks for explaining it. That makes way more sense. In any case, I'll just stick to missing RV data for now. Thanks again for adding the parameters, btw.
