
AsyncIterator causes memory leaks in production #124

Closed

TimSusa opened this issue Aug 22, 2018 · 18 comments

Comments

@TimSusa

TimSusa commented Aug 22, 2018

Hi,

this is not an issue directly related to your repository, but as a contributor to https://github.com/axelspringer/graphql-google-pubsub I can say we use an AsyncIterator very similar to yours. I wonder if you have run into similar issues?

We are seeing serious memory leaks in production, with about 100 users on the system in parallel. It takes about one hour until memory reaches its limit and a restart of the container is triggered. We only caught this because of our monitoring; otherwise nobody would have noticed.

Heap dumps suggest the async iterator is the culprit here:

[Screenshot: heap dump, 2018-08-22 11:33:41]

Here are my questions:

  1. Is somebody using this project in production?
  2. Did anybody observe similar problems?
  3. What could be the right way in rewriting that AsyncIterator?

By the way, I wonder why you are not interested in this pull request, which seems to address the problem described: #114
--> We will try that out tomorrow and come back with results.

@davidyaha
Owner

Hey @TimSusa! Thanks for opening this issue. I have been following #114, and it seems I would rather keep this package in line with the graphql-subscriptions package. The corresponding PR over there, apollographql/graphql-subscriptions#147, was never merged either.
Also, it seems the leak has a lot to do with how you implement your asyncIterator/asyncIterable, and I haven't faced issues with it thus far.

If that change will end up fixing the issues you had, let me know and I'll gladly merge it.
I think the concept of asyncIterator is not yet well known to the majority of JS developers, and that's a big part of the issue. I've only used it in the context of graphql-subscriptions, so I'm certainly no expert with it.

I would love to help try to reproduce your issue with this package, and if it reproduces, I will give it my full attention.

Thanks again and best of luck!

@dobesv

dobesv commented Sep 18, 2018

I am having the same issue, by the looks of it. I think it does have to do with how you use the asyncIterator.

@dobesv

dobesv commented Sep 18, 2018

From reading through:

tc39/proposal-async-iteration#126

I believe this issue is exposed if you are using async generator syntax sugar, e.g.:

const pubsub = new RedisPubSub(...);

async function* exampleSubscriptionResolveFunction() {
  const asyncIter = pubsub.asyncIterator('topic');
  for await (const elt of asyncIter) {
    if (elt.whatever !== 'bla') continue;
    yield elt;
  }
}

In this case, the return() call from graphql into the asyncIterable returned by this function isn't properly propagated to the pubsub-returned asyncIterator, so it never unsubscribes.
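For illustration, a minimal sketch of one workaround: skip the async generator and hand-roll a filtering iterator that forwards return()/throw() to the underlying pubsub iterator (filteredAsyncIterator, source, and predicate are names I made up, not part of this library):

const filteredAsyncIterator = (source, predicate) => ({
  async next() {
    // Keep pulling from the source until an event passes the filter or the source ends.
    for (;;) {
      const result = await source.next();
      if (result.done || predicate(result.value)) return result;
    }
  },
  return(value) {
    // Forward termination so the pubsub client unsubscribes immediately.
    return source.return
      ? source.return(value)
      : Promise.resolve({ value, done: true });
  },
  throw(error) {
    return source.throw ? source.throw(error) : Promise.reject(error);
  },
  [Symbol.asyncIterator]() {
    return this;
  },
});

// Hypothetical usage in a subscription resolver:
// subscribe: () => filteredAsyncIterator(pubsub.asyncIterator('topic'), elt => elt.whatever === 'bla'),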

@TimSusa
Author

TimSusa commented Sep 21, 2018

Hi,

First, I'm sorry for the delay; a few things came up on our side in the meantime. We found we could improve the situation by sending keep-alive messages to the client. This helped to reduce the surprisingly large number of dangling connections.
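(For context, a minimal sketch of what I mean by keep-alive, assuming the subscriptions-transport-ws transport; the 10-second interval and the schema/httpServer names are placeholders, not our exact configuration:)

const { SubscriptionServer } = require('subscriptions-transport-ws');
const { execute, subscribe } = require('graphql');

SubscriptionServer.create(
  {
    schema,           // placeholder: your executable schema
    execute,
    subscribe,
    keepAlive: 10000, // send a keep-alive message to every client every 10 seconds
  },
  { server: httpServer, path: '/graphql' } // placeholder: your HTTP server
);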

Furthermore, an AWS load balancer caused a lot of trouble by cancelling every connection after a fixed timeout. We are keeping an eye on that.

Second, we gave our Node process the "--optimize_for_size" flag, as discussed in this article: https://medium.com/@snird/do-not-use-node-js-optimization-flags-blindly-3cc8dfdf76fd

This helps the garbage collector run more frequently, which matters because we constantly receive new data.

Third, we improved our open-source project by adding unit tests and converting it to TypeScript: https://github.com/axelspringer/graphql-google-pubsub/commits/master

For now we will keep watching how things evolve.

Best,

Tim Susa

@TimSusa TimSusa closed this as completed Sep 21, 2018
@groundmuffin

groundmuffin commented Dec 19, 2018

I have the same issue in production: every time a message is received, memory increases.
This is the code of my subscribe function:

subscribe: (_, args, context, info) => {
	const { pubsub, REDIS_NOTIFICATIONS_CHANNEL, user_id } = context;
	const userIdAsStr = user_id.toString();
	const filterByUser = ({ channel } = {}) => channel === userIdAsStr;
	const iterator = pubsub.asyncIterator(REDIS_NOTIFICATIONS_CHANNEL);
	const filtered = withFilter(
		() => iterator,
		filterByUser
	);
	return filtered(_, args, context, info);
},

@dobesv

dobesv commented Dec 19, 2018

You might have to move your call to pubsub.asyncIterator into the callback passed to withFilter, to ensure that the iterator is not created if it is never used.
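For illustration, here is your snippet with that change applied (just a sketch of the suggestion, not a guaranteed fix):

subscribe: (_, args, context, info) => {
  const { pubsub, REDIS_NOTIFICATIONS_CHANNEL, user_id } = context;
  const userIdAsStr = user_id.toString();
  const filterByUser = ({ channel } = {}) => channel === userIdAsStr;
  // Create the iterator lazily, only when withFilter actually subscribes.
  const filtered = withFilter(
    () => pubsub.asyncIterator(REDIS_NOTIFICATIONS_CHANNEL),
    filterByUser
  );
  return filtered(_, args, context, info);
},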

@groundmuffin

> You might have to move your call to pubsub.asyncIterator into the callback passed to withFilter, to ensure that the iterator is not created if it is never used.

Unfortunately, the same result.
I tried combining Node's "--optimize_for_size" CLI option with the "NODE_OPTIONS=--max-old-space-size=XXX" environment variable (both V8 options), as @TimSusa mentioned, to encourage more aggressive garbage collection. This seems to improve things, but the leak persists.

I use similar code in another project, but with the graphql-rabbitmq-subscriptions pubsub instead (also based on AsyncIterator), and it does not leak.

@davidyaha
Owner

@groundmuffin @dobesv @TimSusa Please have a look at v2.1.2 and let us know if that fixed your issue. If it does, praise @jedwards1211 for the fix!

@jedwards1211
Contributor

It probably doesn't... I discovered that the main cause of memory leaks in our application was using async generators for things like:

async function * subscribe() {
  await checkUserPermissions()
  for await (const event of redisPubSub.asyncIterator('foo')) {
    yield {Foo: event}
  }
}

I'm considering making a babel plugin to fix this use case, but right now the only solution is to not use async generators.
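For comparison, a minimal sketch of the same resolver without an async generator, assuming the permission check can run up front and the event-to-payload mapping can move into a resolve function (fooSubscription is a made-up name):

const fooSubscription = {
  subscribe: async () => {
    await checkUserPermissions();
    // Return the pubsub iterator directly, so graphql can call return() on it
    // as soon as the client unsubscribes.
    return redisPubSub.asyncIterator('foo');
  },
  // Map the raw event into the shape the schema expects.
  resolve: (event) => ({ Foo: event }),
};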

@davidyaha
Owner

@jedwards1211 Not sure I am following; should we roll back the fix? Or are you saying that those are not directly related issues?

@dobesv

dobesv commented Dec 25, 2019

The fix might be helpful, but it won't fix the issue fully. The async generator and for await loop still won't pass through the fact that the caller is not iterating any more.

@jedwards1211
Contributor

No rollback needed; the issue I'm talking about is a separate thing. My PR still lowers the risk of memory leaks.

@ursualexandr

Having the same issue in production: when there are many subscribers, the service keeps getting restarted.

@jedwards1211
Contributor

@ursualexandr are you using any for await loops?

@ursualexandr

ursualexandr commented Feb 10, 2020

> @ursualexandr are you using any for await loops?

No, I don't think so:

Subscription: {
    NewSubscription: {
      subscribe: withFilter(
        () => pubsub.asyncIterator('subscription_topic'),
        (payload, variables, context: Context) => {
          const hasAccess = validation(context.profile!, variables.channelId);
          if (!hasAccess) {
            log.error('User does not have access');
            throw new Error('user does not have access');
          }
          return payload.channelId === variables.channelId;
        }
      )
    }
}

Does it matter if my pubsub.publish is async?

await pubsub.publish('subscription_topic', {
          data,
          channelId
})

@jedwards1211
Contributor

Whether publish is async definitely doesn't matter. It looks like withFilter swallows rejected promises from your filter function, treating them the same as returning false: https://github.com/apollographql/graphql-subscriptions/blob/master/src/with-filter.ts#L19

So this might be causing subs to pile up if you were expecting your error to terminate the subscription.
I'm not aware of any other risk of a memory leak in withFilter though...
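If you want the error handling to be explicit rather than swallowed, a sketch of the filter (reusing the validation and log names from your snippet) would be:

(payload, variables, context) => {
  try {
    const hasAccess = validation(context.profile, variables.channelId);
    // Returning false skips the event instead of throwing.
    return hasAccess && payload.channelId === variables.channelId;
  } catch (err) {
    log.error('Subscription filter failed', err);
    return false;
  }
}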

@ursualexandr

> https://github.com/apollographql/graphql-subscriptions/blob/master/src/with-filter.ts#L19

I've tried returning false instead of throwing an error, and the service still keeps restarting...

@julianguinard

julianguinard commented Apr 7, 2020

Any update on this? We also have the issue with the following code in production on our chat server:

NodeJS V13.2.0
graphql-redis-subscriptions: V2.2.1
graphql-subscriptions: V1.1.0

{
  joinedRoomWatcher: {
    subscribe: withFilter(
      (root, args, context) =>
        pubsub.asyncIterator(
          `${config.get('pubsub.prefix')}PARTICIPANT_JOINED_ROOM`
        ),
      async (payload, args, { identity }, info) => {
        // Assert identity matches w/ company rooms
        const { participant_id: participantId } = payload;
        const companiesIds = (await identity.getUser).companiesIds;
        return participantId && companiesIds.indexOf(participantId) !== -1;
      }
    ),
    resolve: (payload) => new Room(payload.room),
  }
}

Scenario:

  • Open a webpage that triggers that subscription once
  • Force GC in chrome devtools & make snapshot 1
  • Refresh the same webpage 10 times, then close it
  • Force GC in chrome devtools & make snapshot 2

=> Leaky result: after calling the same GraphQL subscription 10 times by refreshing the same window before closing it, PubSubAsyncIterator has retainers in memory that are never evicted, even though GC was forced before creating snapshots 1 & 2.

Chrome devtools analysis below:

[Screenshot: Chrome devtools memory snapshot comparison]

Memory consumption in production below:

[Screenshot: CloudWatch memory consumption graph]

UPDATE: apparently this had something to do with the WebSocket payload originating from the client not specifying a fixed ID when opening the subscription. Adding this ID, as in the image below, results in PubSubAsyncIterator no longer retaining memory after GC.

[Screenshot: subscription start payload with a fixed id]

Note that I was not using a particular GraphQL client library on the frontend for this leaky case, but the raw WebSocket API. Clients such as Apollo auto-add these IDs to subscriptions.
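For illustration, a sketch of the raw WebSocket flow with a fixed operation id, assuming the subscriptions-transport-ws ('graphql-ws') protocol; the endpoint, id, and query are placeholders:

const ws = new WebSocket('ws://localhost:4000/graphql', 'graphql-ws');

ws.onopen = () => {
  ws.send(JSON.stringify({ type: 'connection_init', payload: {} }));
  ws.send(JSON.stringify({
    id: '1', // fixed id identifying this subscription
    type: 'start',
    payload: { query: 'subscription { joinedRoomWatcher { id } }' },
  }));
};

// Before navigating away, stop the operation by the same id so the server
// can tear down the matching PubSubAsyncIterator:
// ws.send(JSON.stringify({ id: '1', type: 'stop' }));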
