Obvious Memory Usage Reductions #386

Closed
TheBlueMatt opened this issue May 17, 2020 · 8 comments · Fixed by #388
It seems like one of the biggest memory users is just the gunicorn workers, so it would be nice to be able to reduce the gunicorn worker count with an env variable (I set it down to 1 and didn't notice any performance change). That, plus setting RABBITMQ_IO_THREAD_POOL_SIZE: 16 in docker-compose.yaml, seems to have reduced memory usage by about a full GB.
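For reference, roughly what that looks like in docker-compose.yaml. The service names below are placeholders for whichever services actually run RabbitMQ and gunicorn, and GUNICORN_WORKERS is a hypothetical variable since no such knob exists yet (hence this issue):

```yaml
# Sketch only: adjust service names to match the stock docker-compose.yaml.
services:
  rabbitmq:
    environment:
      RABBITMQ_IO_THREAD_POOL_SIZE: 16   # lowered to save memory
  backend:
    environment:
      GUNICORN_WORKERS: 1                # hypothetical env var; the worker count is currently not configurable
```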

slowr self-assigned this May 17, 2020
slowr (Member) commented May 17, 2020

Hey @TheBlueMatt, thanks for reaching out and for the feedback.

We have a wide variety of users who can actually benefit from RabbitMQ's larger thread pool size, so it wouldn't be ideal to just reduce it by default (the extra gunicorn workers are probably less beneficial, though).

That said, what do you think of adding variables for the gunicorn worker count and the RabbitMQ pool size to the .env file? The defaults can stay as they are now so we don't break running deployments, and you can keep your own customized copy with the lower values. That way it will also be easier to configure when downloading new versions.
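Something along these lines in .env, with docker-compose.yaml passing the values through via the usual ${VAR} substitution. The names and defaults below are just a sketch; the final ones would be settled in the PR:

```
# Sketch of possible .env additions; values shown are placeholders, not the current defaults.
GUNICORN_WORKERS=4
RABBITMQ_IO_THREAD_POOL_SIZE=128
```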

slowr added the configuration, enhancement (New feature or request), and p/medium (Medium priority) labels May 17, 2020
slowr added this to To do in build-system via automation May 17, 2020
TheBlueMatt (Author) commented May 17, 2020

Right, that's what I was figuring - just make them options. Though, really, the real issue is the bgpstreamlive stuff - it uses about as much memory as everything else combined. It's easy enough to just turn it off and use RIPE RIS, though I'm not sure why it could possibly use so much memory when it appears to do the same thing as the RIS feed listener.

slowr (Member) commented May 18, 2020

> Right, that's what I was figuring - just make them options. Though, really, the real issue is the bgpstreamlive stuff - it uses about as much memory as everything else combined. It's easy enough to just turn it off and use RIPE RIS, though I'm not sure why it could possibly use so much memory when it appears to do the same thing as the RIS feed listener.

BGPStream offers more reliability as a different source of truth and improves coverage, since it serves not only RIPE RIS updates but also Route Views, RIPE dumps, and real-time BMP data.

Hope that makes more sense now. I will open a PR with a quick fix to introduce the two variables you mentioned.

TheBlueMatt (Author) commented May 18, 2020

> BGPStream offers more reliability as a different source of truth and improves coverage, since it serves not only RIPE RIS updates but also Route Views, RIPE dumps, and real-time BMP data.

Right, I'd love to use BGPStream instead, but I don't really get how it uses 1.5-2GB of memory to monitor 10 prefixes. For now it's not really worth dedicating that much to it (at least when the RIPE RIS direct and bgpstreamkafka BMP listeners use 100x less memory and appear to have about as much incoming data, not to mention the redundancy of reading via two separate listener processes instead of one common codebase).

slowr (Member) commented May 19, 2020

The developers of BGPStream suggested trying the LIBTRACEIO=buffers=1 option when running the monitor container. Can you try adding it to the docker-compose.yaml environment variables and check the memory consumption?

From the devs:

> Can you try setting LIBTRACEIO=buffers=1 when you launch the monitor process?
>
> With our limited testing we found that setting it reduces memory consumption significantly. The downside is that it slows down the processing of dump files (by about 1/3 in the limited test cases).
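Concretely, something like this under the monitor service in docker-compose.yaml (assuming the service is indeed named monitor in your compose file):

```yaml
services:
  monitor:
    environment:
      LIBTRACEIO: buffers=1   # per the BGPStream devs: lower memory, ~1/3 slower dump processing
```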

build-system automation moved this from To do to Done May 19, 2020
TheBlueMatt (Author) commented

> Can you try setting LIBTRACEIO=buffers=1 when you launch the monitor process?

This appears to still use 100x or so more memory than bgpstreamkafka and riperis. I assume it's trying to do something much more than simply follow a stream of BGP updates (does it, say, hold the full table in memory?); maybe there's a way to turn off whatever feature it's trying to provide that the others do not?

alistairking (Collaborator) commented

This is because you're comparing the memory usage of a true realtime stream (BMP from bgpstreamkafka and RIS-Live) against a stream generated from many dump files.

In this case BGPStream is creating a stream of updates from the MRT dump files (it has to completely read all files from all collectors, since they are indexed by time and not by prefix). To do this it needs to open many files at the same time so that the stream comes out in chronological order. Even with only a (relatively) small buffer per file, this adds up when you consider that there are ~54 collectors across RV and RIS, and that for each collector BGPStream often needs to open multiple files to provide the best sorting (hence the ~100x is expected).

This gives data that is redundant (when things are operating correctly) with the RIS Live stream, but it adds all the RV data that is not (yet) available via a live stream.

If the memory consumption is too much for your system, I'd suggest you:

  • configure bgpstreamlive to only give data from RV (this will only ~halve memory usage); see the config sketch after this list
  • don't use bgpstreamlive (at the cost of missing out on data from almost all RV collectors)
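
For the first option, a rough sketch of how the monitors section of the ARTEMIS config could look; the exact keys are an assumption on my part, so double-check against the documentation:

```yaml
# Hypothetical sketch: keep RIS Live, restrict bgpstreamlive to Route Views only.
monitors:
  riperis: [""]        # all RIS Live collectors
  bgpstreamlive:
    - routeviews       # omitting "ris" here skips the RIS MRT dumps, which RIS Live already covers
```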

TheBlueMatt (Author) commented

Ah! Ok, thanks for explaining it. That makes way more sense. In any case, I'll just stick to missing RV data for now. Thanks again for adding the parameters, btw.
