
Memory usage and Garbage Collector #7

Closed
menardorama opened this issue Jun 1, 2017 · 14 comments

@menardorama

Hi,
after fixing the long historical data issue (even if I wouldn't have done it like that), I am facing another issue.

The process consumes all the memory on the server while waiting for the result from the DB.

It's as if the GC is not working at all.

The result for me is that I can't transfer the data, as it consumes the whole 32 GB of RAM and the OOM killer kills the process.

@zensqlmonitor
Owner

zensqlmonitor commented Jun 1, 2017

I ran different loads during the day and the memory footprint of the process stayed under 2 GB for the 4 tables, with 100k and 200k rows per batch.
It looks like the GC works properly.
Again, please create the indexes to avoid scanning the complete table, and think about separation of concerns: the ETL process should run on a separate server to avoid resource contention.
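As a concrete starting point, here is a minimal sketch of what "create the indexes" could look like, as a one-off Go helper run against the Zabbix database. The connection string and index names are placeholders; the table and column names come from the standard Zabbix history schema.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // PostgreSQL driver
)

func main() {
	// Placeholder DSN: point it at your Zabbix database.
	db, err := sql.Open("postgres", "postgres://zabbix:secret@localhost/zabbix?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Index the clock column of each history table so per-batch SELECTs
	// read only their time range instead of scanning the whole table.
	for _, tbl := range []string{"history", "history_uint", "history_str", "history_text"} {
		// IF NOT EXISTS requires PostgreSQL >= 9.5 (the thread uses 9.6).
		ddl := "CREATE INDEX IF NOT EXISTS " + tbl + "_clock_idx ON " + tbl + " (clock)"
		if _, err := db.Exec(ddl); err != nil {
			log.Fatalf("%s: %v", tbl, err)
		}
	}
}
```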

@zensqlmonitor
Owner

zensqlmonitor commented Jun 2, 2017

Here are some stats for a 12-hour run:

influxdb-zabbix process

  • RSS: Avg 194 MB - Max 337 MB
  • VSZ: Avg 594 MB - Max 785 MB
  • I/O Reads/Writes: 0
  • Threads: 9

PostgreSQL backend

  • Fetched per sec: Avg 7 K - Max 585 K
  • Blocks I/O Reads: Avg 114 K - Max 43 M
  • Blocks I/O Hits: Avg 1.9 M - Max 56 M
  • Temporary file bytes: Avg 837 MB - Max 18 GB

@menardorama
Author

Hi

Here it is after 3 minutes:
[screenshot: capture d'écran 2017-06-02 à 11:49:51]

@zensqlmonitor
Owner

What's your configuration?

@menardorama
Author

Basically the server has the same specs:

  • 2× Xeon 2.§ GHz
  • 32 GB of RAM
  • PostgreSQL 9.6 with partitioning

Latest version of InfluxDB
CentOS 7

@zensqlmonitor
Owner

zensqlmonitor commented Jun 2, 2017

Which Go version? Try updating to the latest version.
About your config file: how many input rows per batch?
Have you created the indexes?

@menardorama
Author

I am using Go 1.7.4, and the indexes have not been created.

Regarding the config:
inputrowsperbatch=50000
outputrowsperbatch=50000
interval=60

But regarding the indexes: they are just a nice-to-have and should not have any impact on the memory consumption on the client side.

The thing is, I have a 500 GB Zabbix DB (most of the data is in the history table) and I don't want to add more index weight.

Having a limit on the rows to return is a workaround, but not the real solution (from a DBA point of view...) for me.
A moving window based on the clock column would be lighter on the DB side, as the ORDER BY forces it to materialize all the results in memory, or worse, in a temp file.
For one year of history, the load on the DB is just too much.
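To make the suggestion concrete, here is a minimal sketch of the kind of clock-based window query meant here, assuming the standard Zabbix history columns (itemid, clock, value); the function name is illustrative, not the tool's actual code.

```go
package etl

import "database/sql"

// fetchWindow pulls one half-open clock window [start, end[ from history.
// Unlike "WHERE clock > $1 ORDER BY clock LIMIT n", a pure range predicate
// can be answered straight from an index on clock: no sort, no temp file.
// The caller advances the window (start += step) until it reaches "now".
func fetchWindow(db *sql.DB, start, end int64) (*sql.Rows, error) {
	return db.Query(
		`SELECT itemid, clock, value FROM history WHERE clock >= $1 AND clock < $2`,
		start, end)
}
```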

@zensqlmonitor
Owner

@menardorama a moving window based on the number of days is now implemented

@menardorama
Author

Hi

Thanks for your feedback, it's much better now on the DB side.

But there is still something wrong, which I think I've pointed out; I am not good enough in Go to propose a patch.

I'll try to explain my observation.

From what I understand, your app works in these steps:

  • Extract from the Zabbix DB
  • Process all rows retrieved by the SQL query and store them in memory
  • Load them into InfluxDB

My concern is that I have 57 million rows per week, and they do not fit in memory.

Another approach could be to process a batch of rows (at the fetch level) and insert them into InfluxDB as you go.

This would be more scalable than waiting for the full fetch.
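A minimal sketch of that idea, with illustrative names (writeToInflux stands in for whatever HTTP write the tool already does); database/sql already streams rows from the driver, so memory stays bounded by one batch rather than the whole result set:

```go
package etl

import (
	"database/sql"
	"fmt"
)

// streamToInflux scans rows as they arrive from the driver and flushes a
// batch of InfluxDB line-protocol points every batchSize rows, instead of
// buffering the full result set in memory first.
func streamToInflux(db *sql.DB, query string, batchSize int,
	writeToInflux func(lines []string) error) error {

	rows, err := db.Query(query)
	if err != nil {
		return err
	}
	defer rows.Close()

	batch := make([]string, 0, batchSize)
	for rows.Next() {
		var itemid, clock int64
		var value float64
		if err := rows.Scan(&itemid, &clock, &value); err != nil {
			return err
		}
		// One line-protocol point per row (timestamp in seconds,
		// so the write would use precision=s).
		batch = append(batch, fmt.Sprintf("history,itemid=%d value=%g %d", itemid, value, clock))
		if len(batch) == batchSize {
			if err := writeToInflux(batch); err != nil {
				return err
			}
			batch = batch[:0]
		}
	}
	if len(batch) > 0 {
		if err := writeToInflux(batch); err != nil {
			return err
		}
	}
	return rows.Err()
}
```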

Another idea would be to spool the result to a temp file at the fetch level and pass the filename to the InfluxDB processor.

Once again, I'm sorry, I am not good enough in Go to do it myself.

What do you think?

@zensqlmonitor
Owner

> My concern is that I have 57 million rows per week, and they do not fit in memory.

You can now split the dataset into multiple datasets with the config parameter daysperbatch.
For example, in the configuration file you set:

startdate="2017-01-01T00:00:00"
daysperbatch=15

=> the process will start with data whose timestamp falls in [2017-01-01, 2017-01-16[, then continue in 15-day increments: [2017-01-16, 2017-01-31[, etc.
This way, depending on the number of rows you get per day, you can adjust the batch size.
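For illustration, here is how those half-open windows advance (a standalone sketch, not the tool's code):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// startdate and daysperbatch as in the configuration above.
	start, _ := time.Parse("2006-01-02T15:04:05", "2017-01-01T00:00:00")
	daysPerBatch := 15

	for i := 0; i < 3; i++ {
		end := start.AddDate(0, 0, daysPerBatch)
		fmt.Printf("[%s, %s[\n", start.Format("2006-01-02"), end.Format("2006-01-02"))
		start = end
	}
	// Output:
	// [2017-01-01, 2017-01-16[
	// [2017-01-16, 2017-01-31[
	// [2017-01-31, 2017-02-15[
}
```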

Have you tested the latest version?

@menardorama
Author

Yes, my comment was regarding the latest version.

And I can't put more RAM in my server (it already has 32 GB).

Setting daysperbatch=1 helps a bit, but that is already 57 million rows and it consumes all the memory.

@zensqlmonitor
Owner

57 million for 1 day? You just said it was for 1 week. Anyway, that's huge...
I can't do better, and sorry, but I won't spool the result to disk.

@menardorama
Author

menardorama commented Jun 9, 2017 via email

@zensqlmonitor
Owner

zensqlmonitor commented Jun 9, 2017

Let's make it more granular.
I've just committed a moving window based on hours -> new parameter: hours per batch.
@menardorama could you please have a look?
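For the volumes discussed above, the configuration would presumably look something like the lines below (the exact parameter name follows the daysperbatch convention and is an assumption); at ~57 million rows per day, a 4-hour window comes to roughly 9.5 million rows per batch:

startdate="2017-01-01T00:00:00"
hoursperbatch=4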
