
Memory leak #1531

Closed
usmanm opened this issue May 25, 2016 · 39 comments

@usmanm (Collaborator) commented May 25, 2016

May 25 07:22:33 vapipeline01 kernel: [733833.174735] [ 8820]  1001  8820  3149706  1516475    2992        0             0 pipeline-server
May 25 07:22:33 vapipeline01 kernel: [733833.174868] Out of memory: Kill process 8820 (pipeline-server) score 369 or sacrifice child
May 25 07:22:33 vapipeline01 kernel: [733833.179622] Killed process 8820 (pipeline-server) total-vm:12598824kB, anon-rss:6065816kB, file-rss:84kB
May 25 07:22:33 vapipeline01 pipeline[8814]: [83-1] LOG:  autovacuum launcher process (PID 8820) was terminated by signal 9: Killed
May 25 12:45:51 vapipeline01 pipeline[20481]: [8820-1] LOG:  out of file descriptors: Too many open files; release and retry
May 25 12:45:51 vapipeline01 pipeline[20481]: [8820-2] STATEMENT:  planedemo_altitude
May 25 13:36:14 vapipeline01 pipeline[20482]: [8820-1] LOG:  out of file descriptors: Too many open files; release and retry
May 25 13:36:14 vapipeline01 pipeline[20482]: [8820-2] STATEMENT:  planedemo_altitude
@usmanm usmanm added this to the 0.x.0 milestone May 25, 2016
@derekjn (Contributor) commented May 25, 2016

This might actually have been the culprit of the OOM crashes that we were seeing the other day.

@derekjn derekjn removed this from the 0.x.0 milestone May 28, 2016
@derekjn derekjn modified the milestone: 0.9.4 Jul 31, 2016
@derekjn (Contributor) commented Jul 31, 2016

Do we have any idea of a workload that can reproduce this? So far I haven't been able to.

@derekjn derekjn closed this as completed Aug 2, 2016
@focusaurus commented:

FYI, we are seeing what we believe to be this issue (or something similar) with PipelineDB v0.9.5 on Ubuntu Linux:

LOG: could not fork autovacuum worker process: Cannot allocate memory
LOG: autovacuum launcher process (PID 4280) was terminated by signal 9: Killed

We see this regularly: memory usage leaks continually until the whole process gets killed every few days.

@usmanm usmanm reopened this Oct 3, 2016
@derekjn (Contributor) commented Nov 5, 2016

This was most likely fixed by one of these commits:

4c300e0
ed109e0

@derekjn derekjn closed this as completed Nov 5, 2016
@sat commented Nov 10, 2016

@derekjn @usmanm we were seeing the same issue with the autovacuum launcher process on Ubuntu 14.04 and PipelineDB 0.9.5. We have updated to 0.9.6 and are still experiencing a memory leak, albeit a slower one.

[screenshot: memory usage graph, 2016-11-10 3:22:43 PM]

A 0.1 percent increase every 5 minutes.

[screenshot: memory usage graph, 2016-11-10 3:21:56 PM]

@derekjn derekjn reopened this Nov 10, 2016
@derekjn (Contributor) commented Nov 10, 2016

@sat can you share your continuous view definitions here or in a private Gitter channel?

@sat commented Nov 10, 2016

@derekjn let me know if you want me to provide more context

CREATE CONTINUOUS VIEW ml_score_signal_stw_view AS
SELECT coalesce(g.group_member_id, s.group_id) as group_id, signal_type,
       avg(score) as score_avg, min(score) as min_score, max(score) as max_score,
       percentile_cont(array[0.25, 0.5, 0.75]) WITHIN GROUP (ORDER BY score) as percentiles,
       max(sampled_at) as last_sampled_at, count(*) as signals_count
FROM ml_score_signal_stream s
-- TODO: Add sender_id to groups and use name as human reference
LEFT JOIN groups_flattened g ON g.group_id = s.group_id
WHERE (sampled_at >= clock_timestamp() - interval '1 minute')
GROUP BY coalesce(g.group_member_id, s.group_id), signal_type;
CREATE CONTINUOUS VIEW ml_score_signal_latest_view AS
SELECT group_id, keyed_max(last_sampled_at, score_avg) as score_avg,
keyed_max(last_sampled_at, min_score) as min_score,
keyed_max(last_sampled_at, max_score) as max_score,
keyed_max(last_sampled_at, percentile_25) as percentile_25,
keyed_max(last_sampled_at, percentile_50) as percentile_50,
keyed_max(last_sampled_at, percentile_75) as percentile_75,
max(last_sampled_at) as last_sampled_at,
keyed_max(last_sampled_at, signals_count) as signals_count
FROM ml_score_signal_latest_stream
GROUP BY group_id;
CREATE CONTINUOUS VIEW ml_score_signal_ttw_view AS
  SELECT coalesce(g.group_member_id, s.group_id) as group_id, minute(sampled_at) as minute, signal_type, avg(score) as score_avg, min(score) as min_score, max(score) as max_score, percentile_cont(array[0.25, 0.5, 0.75]) WITHIN GROUP (ORDER BY score) as percentiles, max(sampled_at) as last_sampled_at, count(*) as signals_count
  FROM ml_score_signal_stream s
  LEFT JOIN groups_flattened g ON g.group_id = s.group_id
  WHERE sampled_at >= clock_timestamp() - interval '1 hour'
  GROUP BY minute, coalesce(g.group_member_id, s.group_id), signal_type;
CREATE CONTINUOUS VIEW health_status_signal_group_latest_view AS
SELECT group_id, keyed_max(sampled_at, score) as score, max(sampled_at) as last_sampled_at FROM
health_status_signal_stream
GROUP BY group_id;
CREATE CONTINUOUS VIEW system_status_signal_group_latest_view AS
SELECT group_id, keyed_max(sampled_at, score) as score, avg(score) as avg_score, max(sampled_at) as last_sampled_at FROM
system_status_signal_stream
GROUP BY group_id;

@derekjn (Contributor) commented Nov 10, 2016

Thanks @sat! One more thing that would be helpful is a count(*) for each of those CVs' matrels so we can get an idea of how big they are. So,

SELECT count(*) FROM <cv name>_mrel
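
For example, for the first of the views posted above, that would presumably be:

SELECT count(*) FROM ml_score_signal_stw_view_mrel;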

@sat commented Nov 10, 2016

@derekjn here you go; if it makes it easier I can give you an SQL dump.
85 - ml_score_signal_stw_view_mrel
10 - ml_score_signal_latest_view_mrel
699 - ml_score_signal_ttw_view_mrel
6 - health_status_signal_group_latest_view_mrel
6 - system_status_signal_group_latest_view_mrel

@derekjn (Contributor) commented Nov 10, 2016

@sat after doing some initial investigation, a couple more things would be helpful:

  • CREATE STREAM statements (so we know the specific types of all CV columns); see the example sketch after this list
  • The CREATE TABLE statement for groups_flattened
  • The number of rows in groups_flattened
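
For reference only (this is not from the thread): a CREATE STREAM statement for the stream used by the views above might look like the sketch below. The column names come from those CV definitions, but the types are assumptions.

CREATE STREAM ml_score_signal_stream (
    group_id    uuid,         -- assumed type; could equally be text or bigint
    signal_type text,
    score       float8,       -- assumed numeric type for the score values
    sampled_at  timestamptz
);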

@sat commented Nov 10, 2016

@derekjn would a full schema / data dump be helpful?

@derekjn (Contributor) commented Nov 10, 2016

Sure, wouldn't hurt. Thanks!

@sat commented Nov 10, 2016

@derekjn my colleague @dominicpacquing will send it to you via PM in Gitter.

@derekjn (Contributor) commented Nov 14, 2016

Hi @sat,

I've run quite a few tests against the dump you guys sent us and haven't been able to reproduce any unexpected behavior, even though these tests are running on CVs that are several orders of magnitude larger than the sizes you previously indicated. A couple of notes:

  • When a sliding-window CV's output stream is being read by something (e.g. your transforms), the entire CV will be cached in memory because new values have to be "ticked" forward with time (this is not necessary for non-sliding-window CVs). This may account for the rising memory usage you're seeing, but with such small CVs it seems like memory consumption would still be relatively low.
  • You may want to try reworking your CVs to not use sliding windows, and instead group on something like minute(sampled_at) and use TTLs to clean up old rows (a sketch of this follows below).
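
As a rough sketch of that second suggestion (not taken from this thread; the view name is made up, and the ttl/ttl_column storage parameters assume a PipelineDB version that supports TTLs on continuous views), a non-sliding-window variant might look like:

CREATE CONTINUOUS VIEW ml_score_signal_minutely_view
WITH (ttl = '1 hour', ttl_column = 'minute')   -- old rows reaped instead of a sliding WHERE clause
AS SELECT minute(sampled_at) as minute, signal_type,
          avg(score) as score_avg, count(*) as signals_count
   FROM ml_score_signal_stream
   GROUP BY minute, signal_type;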

@sat commented Nov 14, 2016

@derekjn OK, at the moment we are using monit to terminate PipelineDB at 80% system memory usage; after it restarts, it exhibits the same slow leak. We can try TTLs for the tumbling windows we use, but unfortunately the sliding windows are needed.

The other factor that may not have been reproduced on your end is the use of the pipeline_kinesis extension.

@derekjn (Contributor) commented Nov 14, 2016

@sat gotcha. If the sliding-window CVs being cached for writing to output streams can't account for the memory consumption, then this is still a bug and we'll get to the bottom of it. A couple more questions as we home in on the issue:

  • What is the maximum theoretical number of rows you expect for sliding-window CVs that have something reading their output stream? I'd like to rule this out as a cause of increasing memory consumption.
  • Can you isolate the increasing memory consumption to a single process?

@derekjn derekjn changed the title AV launcher killed due to OOM Memory leak Nov 14, 2016
@derekjn derekjn removed the vacuum label Nov 14, 2016
@derekjn derekjn added this to the 0.9.x milestone Nov 14, 2016
@derekjn derekjn removed this from the 0.9.5 milestone Nov 14, 2016
@derekjn derekjn self-assigned this Nov 14, 2016
@sat commented Nov 14, 2016

@derekjn The number of rows: potentially hundreds to thousands. It depends on the join table (groups_flattened). At the moment we are only running it against something quite small (10 rows). The join used by the GROUP BY may be duplicating a lot of records, as essentially it is maintaining aggregates for a recursive grouping structure. E.g., like a graph, a leaf group also belongs to the group that contains it, and then to the group that contains that group, all the way up to the super group that contains all groups. In order to maintain the aggregates for all groups, that data point would need to be duplicated in all of its groups' aggregate rows. I can elaborate on this if it's not clear.
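
For illustration only (this table definition is not from the thread; the column names follow the joins in the CV definitions above, and the types are assumptions), a flattened hierarchy like the one described might look like:

CREATE TABLE groups_flattened (
    group_id        uuid,  -- matches the stream's group_id in the CV joins (type is a guess)
    group_member_id uuid   -- the grouping key that each matching event rolls up into
);
-- Each group can appear here once per group that (directly or transitively) contains it,
-- so a single incoming event may fan out into one aggregate row per level of the hierarchy.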

The increasing memory consumption can be isolated to the "autovacuum launcher process". Is this what you mean?

We will try the TTL on the views and report back also.

@derekjn (Contributor) commented Nov 14, 2016

@sat are you able to run PipelineDB under valgrind in your environment? Since I haven't been able to repro the autovac leak locally, it would be helpful if we could get a memory dump from you. You can use the massif tool for that:

valgrind --tool=massif pipelinedb -D <data dir> <args>

Once pipelinedb is shut down, this will create a file for each pid launched by the pipelinedb root process. Note that running under valgrind will slow things down substantially. If you could attach the autovacuum launcher process's massif file here, that should tell us a lot.
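
Not mentioned in the thread, but for reference: each resulting massif.out.<pid> file can be summarized with valgrind's ms_print tool, which prints a heap-usage graph and the dominant allocation stacks:

ms_print massif.out.<pid>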

@sat commented Nov 14, 2016

@derekjn sure, will do that now

@derekjn (Contributor) commented Nov 16, 2016

@sat were you guys able to take a memory dump with massif? I noticed there was a comment here about files not being created. Note that you need to cleanly stop valgrind (and therefore the pipelinedb process running under it) for the files to be created.

@sat commented Nov 16, 2016

@derekjn I saw that comment; it must be from someone with a related issue. Will take the memory dump today. Sorry for the delay.

@sat commented Nov 17, 2016

@derekjn does this help?
massif.out.4980.zip

@derekjn (Contributor) commented Nov 17, 2016

Definitely, here's the leak:

[screenshot: massif graph showing the leak]

It was a pretty slow one, which is why we weren't able to repro it. Super simple fix though, thank you so much for your help and patience!

@sat commented Nov 17, 2016

@derekjn, OK, so am I safe to run HEAD of master? The leak is slightly quicker when the system is under load (nothing was being ingested when I ran this).

@derekjn (Contributor) commented Nov 17, 2016

Yes, we'll also backport this to the latest 0.9.6 release. I'll let you know here when the releases are published.

@sat commented Nov 17, 2016

@derekjn ok thanks

@derekjn (Contributor) commented Nov 17, 2016

@sat the 0.9.6 releases containing the backported fix have been published.

@sat commented Nov 17, 2016

@derekjn, I used the updated 0.9.6 release and it still looks like I'm getting the same issue.

[screenshot: memory usage graph, 2016-11-18 9:26:22 AM]

@derekjn (Contributor) commented Nov 17, 2016

So, right when I mentioned that the 0.9.6 releases were updated, the nightly build probably hadn't been updated yet. Can you run this?

SELECT pipeline_version();

@sat commented Nov 17, 2016

@derekjn "PipelineDB 0.9.6 at revision 0fbff92 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit"

@derekjn (Contributor) commented Nov 17, 2016

Hmm, that revision should definitely contain the fix. The cause of the leak was very obvious once we got the dump, but I'll double check that there isn't anything else.

Also note that if you downloaded the nightly binary last night after the fix went out, it would not have contained the fix. Only the 0.9.6 release was patched; is it possible that you were using the nightly version?

@sat commented Nov 18, 2016

@derekjn I built it based on the following: https://www.pipelinedb.com/download/0.9.6/ubuntu14

@derekjn (Contributor) commented Nov 18, 2016

Ok, I'm double checking the fix from last night, will report back shortly.

@derekjn (Contributor) commented Nov 18, 2016

@sat I haven't been able to repro the issue on the latest release binaries. What is the y axis on the chart you attached, GB? If so, it looks like an increase of less than 1 GB over an 8-hour period, which could easily be legitimate and not a leak. Or have you been able to isolate it to the autovac process?

@sat commented Nov 18, 2016

@derekjn you can use the same dump. The Y axis is %. I can leave it running for the weekend; here is today only:

[screenshot: memory usage graph, 2016-11-18 1:46:39 PM]

I can run valgrind against it again if it runs out of memory over the weekend. There is nothing being ingested into the views at present, so I'm not sure what could account for the increased memory usage.

@derekjn (Contributor) commented Nov 18, 2016

Hmm, what exactly do you mean by "use the same dump"? If there's less than a 1% increase in memory usage over a long period of time, then there's nothing that would indicate a problem anywhere. Various system caches, shared buffers, etc. can easily accumulate small amounts of memory over time. I'll keep looking to see if there's a small leak anywhere, but it doesn't seem like it. If nothing is being ingested, then no part of PipelineDB is really even running, so this would be happening at the PostgreSQL level.

@sat commented Nov 18, 2016

@derekjn same dump as my colleague gave you on Gitter. Yes, I think it's fine. I had a look at the memory consumption of PipelineDB and it's below 9%, and the autovacuum process is now well down the list of top memory consumers. Interestingly, the awslogs Python agent that is providing the statistics for those graphs (in CloudWatch) is using quite a bit of memory :(. Will monitor over the weekend and see.

@derekjn (Contributor) commented Nov 18, 2016

> @derekjn same dump as my colleague gave you on Gitter.

I'm not sure that I follow, since that dump was taken before the fix was pushed out and isn't relevant anymore. In any case, please let us know if you see a leak isolated to a specific PipelineDB process. Thanks, @sat!

@sat commented Nov 22, 2016

@derekjn all good
