
Growing RAM usage in recvContent #272

Closed
Dieterbe opened this issue May 30, 2017 · 3 comments

@Dieterbe

commented May 30, 2017

Hello amqp maintainers,
I run an open source project (https://github.com/graphite-ng/carbon-relay-ng) that uses this library, and one of my users is running into a memory leak problem:

go tool pprof carbon-relay-ng carbon-relay-ng.heap
Entering interactive mode (type "help" for commands)
(pprof) top30
77518.24MB of 78140.35MB total (99.20%)
Dropped 123 nodes (cum <= 390.70MB)
      flat  flat%   sum%        cum   cum%
76608.96MB 98.04% 98.04% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Channel).recvContent
  909.28MB  1.16% 99.20%   909.28MB  1.16%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.newDelivery
         0     0% 99.20%   909.28MB  1.16%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Channel).dispatch
         0     0% 99.20% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).demux
         0     0% 99.20% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).dispatchN
         0     0% 99.20% 77687.28MB 99.42%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).reader
         0     0% 99.20% 78139.85MB   100%  runtime.goexit
(pprof) top30 -cum
77518.24MB of 78140.35MB total (99.20%)
Dropped 123 nodes (cum <= 390.70MB)
      flat  flat%   sum%        cum   cum%
         0     0%     0% 78139.85MB   100%  runtime.goexit
         0     0%     0% 77687.28MB 99.42%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).reader
76608.96MB 98.04% 98.04% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Channel).recvContent
         0     0% 98.04% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).demux
         0     0% 98.04% 77518.24MB 99.20%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Connection).dispatchN
         0     0% 98.04%   909.28MB  1.16%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Channel).dispatch
  909.28MB  1.16% 99.20%   909.28MB  1.16%  github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.newDelivery
(pprof) list amqp.*Channel.*recvContent
Total: 76.31GB
ROUTINE ======================== github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp.(*Channel).recvContent in /home/ubuntu/go/src/github.com/graphite-ng/carbon-relay-ng/vendor/github.com/streadway/amqp/channel.go
   74.81GB    75.70GB (flat, cum) 99.20% of Total
         .          .    365:	case *headerFrame:
         .          .    366:		// drop and reset
         .          .    367:		return ch.transition((*Channel).recvMethod)
         .          .    368:
         .          .    369:	case *bodyFrame:
   74.81GB    74.81GB    370:		ch.body = append(ch.body, frame.Body...)
         .          .    371:
         .          .    372:		if uint64(len(ch.body)) >= ch.header.Size {
         .          .    373:			ch.message.setContent(ch.header.Properties, ch.body)
         .          .    374:			ch.dispatch(ch.message) // termination state
         .   909.28MB    375:			return ch.transition((*Channel).recvMethod)
         .          .    376:		}
         .          .    377:
         .          .    378:		return ch.transition((*Channel).recvContent)
         .          .    379:	}
         .          .    380:
(pprof) 

The user also reported:

I can correlate the memory growth with inbound bandwidth: at ~60Mbps it's ok, when it goes past ~65Mbps it starts to grow, at ~70Mbps it grows like crazy... top says 136% cpu used (the box has 20 cores)

Our amqp consumption routine is single-threaded (see https://github.com/graphite-ng/carbon-relay-ng/blob/fa4a07caa9a9f85463441cfb66d615bc475a320b/cmd/carbon-relay-ng/carbon-relay-ng.go#L193 and https://github.com/graphite-ng/carbon-relay-ng/blob/fa4a07caa9a9f85463441cfb66d615bc475a320b/input/amqp.go#L33), so my hypothesis is this: as long as we can handle the ingest volume, there is no problem. But once the volume gets too high, our consumption routine maxes out a single core at 100% CPU, the amqp library keeps receiving messages faster than we can consume them, and that backlog is what allocates all this memory. The remaining 36% CPU is just the other work the relay does.
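
To illustrate what I mean (this is not the library's actual code, just a self-contained sketch of the failure mode I suspect): an unbounded buffer sitting between a fast producer and a slower single consumer grows without limit as soon as inflow exceeds outflow.

package main

import "time"

// buffer forwards messages from in to out without ever blocking the sender;
// everything the receiver hasn't taken yet is kept in a slice that can grow
// without bound.
func buffer(in <-chan []byte, out chan<- []byte) {
	var backlog [][]byte
	for {
		var next []byte
		var dst chan<- []byte
		if len(backlog) > 0 {
			next, dst = backlog[0], out
		}
		select {
		case msg, ok := <-in:
			if !ok {
				return
			}
			backlog = append(backlog, msg)
		case dst <- next: // never taken while backlog is empty (dst is nil)
			backlog = backlog[1:]
		}
	}
}

func main() {
	in := make(chan []byte)
	out := make(chan []byte)
	go buffer(in, out)

	// fast producer: stands in for the connection reader pushing deliveries
	go func() {
		for {
			in <- make([]byte, 4096)
		}
	}()

	// slow single-threaded consumer: stands in for our amqp input routine;
	// once it can't keep up, the backlog (and the heap) only ever grows
	for range out {
		time.Sleep(time.Millisecond)
	}
}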

Note that this profile was produced with amqp library d75c3a3, dated 2017-02-03T13:40:38Z. I have since updated the carbon-relay-ng project to use the latest amqp master (graphite-ng/carbon-relay-ng@3e87664); it would be nicer to reproduce with that, though I didn't see any changes obviously related to this code.

Now, the route in the relay was configured like so:

[amqp]
amqp_enabled = true
amqp_host = "host"
amqp_port = 5672
amqp_user = "foo"
amqp_password = "bar"
amqp_vhost = "/graphite"
amqp_exchange = "aggr_exchange"
amqp_queue = "metrictank002-carbon-relay-ng"
amqp_key = "#"
amqp_durable = false
amqp_exclusive = true

As for how that translates into the options passed to the amqp library, it's probably easiest and clearest if I just link you to our code at https://github.com/graphite-ng/carbon-relay-ng/blob/fa4a07caa9a9f85463441cfb66d615bc475a320b/input/amqp.go (the StartAMQP and consumeAMQP functions).
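
To save you a click, it amounts to roughly the following (a simplified sketch based on the config above, not the exact relay source; url and handle are stand-ins for how we build the URI and process each message):

package main

import "github.com/streadway/amqp"

// consume is a rough approximation of our StartAMQP/consumeAMQP; treat it as
// a sketch, not the actual relay source.
func consume(url string, handle func([]byte)) error {
	conn, err := amqp.Dial(url)
	if err != nil {
		return err
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		return err
	}

	q, err := ch.QueueDeclare(
		"metrictank002-carbon-relay-ng", // amqp_queue
		false,                           // durable     (amqp_durable = false)
		false,                           // auto-delete
		true,                            // exclusive   (amqp_exclusive = true, assuming it applies here)
		false,                           // no-wait
		nil,
	)
	if err != nil {
		return err
	}

	// amqp_key = "#", amqp_exchange = "aggr_exchange"
	if err := ch.QueueBind(q.Name, "#", "aggr_exchange", false, nil); err != nil {
		return err
	}

	deliveries, err := ch.Consume(
		q.Name,
		"",    // consumer tag
		true,  // auto-ack: no flow control from our side
		false, // exclusive
		false, // no-local
		false, // no-wait
		nil,
	)
	if err != nil {
		return err
	}

	for d := range deliveries {
		handle(d.Body) // a single goroutine drains the deliveries channel
	}
	return nil
}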

PS: this issue looks somewhat similar to #149 and #264, but I'm not sure.

Personally I know little about RabbitMQ/AMQP (this functionality was contributed by others, but I'm trying to support it), so I'm not even sure whether we're simply misusing the library; to you it might be obvious.

Thanks!

@michaelklishin

Collaborator

commented May 30, 2017

Your code seems to rely on automatic acknowledgement mode and no prefetch (see tutorial 2). RabbitMQ will therefore deliver to it as fast as possible, and there is no way for the consumer to throttle the delivery rate.

Have you considered using manual acknowledgements and a bounded prefetch of, say, 200 or 300?
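
Something along these lines (an untested sketch; queueName and process stand in for your own code):

package main

import "github.com/streadway/amqp"

// consumeWithBackpressure uses a bounded prefetch plus manual acknowledgements,
// so the broker never has more than the prefetch count of unacknowledged
// deliveries in flight for this consumer and effectively slows down to the
// consumer's pace.
func consumeWithBackpressure(ch *amqp.Channel, queueName string, process func([]byte)) error {
	// prefetch count of 300, no byte limit, applied per consumer
	if err := ch.Qos(300, 0, false); err != nil {
		return err
	}

	deliveries, err := ch.Consume(
		queueName,
		"",    // consumer tag
		false, // auto-ack off: we acknowledge manually below
		false, // exclusive
		false, // no-local
		false, // no-wait
		nil,
	)
	if err != nil {
		return err
	}

	for d := range deliveries {
		process(d.Body)
		if err := d.Ack(false); err != nil { // ack after processing so the broker sends more
			return err
		}
	}
	return nil
}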

@michaelklishin changed the title from "memory leak in recvContent potentially due to misconfiguration?" to "Growing RAM usage in recvContent" on May 30, 2017

@nvx


commented Jun 4, 2017

Unless you are cancelling consumers or closing channels, #264 is unrelated. That issue only leaks memory when there are in-flight messages at the moment a consumer is cancelled (the higher the prefetch, the bigger the effect) and you don't explicitly drain the consumer's delivery channel.
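
For reference, the drain looks roughly like this (a sketch; consumerTag and deliveries are whatever you used and got back when setting up the consumer):

package main

import "github.com/streadway/amqp"

// cancelAndDrain cancels a consumer and then keeps reading its delivery
// channel until the library closes it, so any buffered in-flight deliveries
// can be released instead of sitting in memory.
func cancelAndDrain(ch *amqp.Channel, consumerTag string, deliveries <-chan amqp.Delivery) error {
	if err := ch.Cancel(consumerTag, false); err != nil {
		return err
	}
	for range deliveries {
		// discard; the loop exits once the channel is closed after the cancel
	}
	return nil
}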

@Dieterbe

Author

commented Jun 6, 2017

Thanks for your help. If the causes of the problem are all on our side (which seems to be the case), feel free to close this ticket.
