OOM killed #40

Closed
cmabastar opened this issue Nov 14, 2013 · 7 comments

Comments

@cmabastar

Hi @monnand

Nov 14 11:00:46 push-notification kernel: [1100089.772898] Node 0 DMA: 2*4kB 2*8kB 2*16kB 1*32kB 3*64kB 3*128kB 2*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 8344kB
Nov 14 11:00:46 push-notification kernel: [1100089.772920] Node 0 DMA32: 446*4kB 0*8kB 1*16kB 2*32kB 1*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 7816kB
Nov 14 11:00:46 push-notification kernel: [1100089.772945] 160 total pagecache pages
Nov 14 11:00:46 push-notification kernel: [1100089.772949] 0 pages in swap cache
Nov 14 11:00:46 push-notification kernel: [1100089.772956] Swap cache stats: add 0, delete 0, find 0/0
Nov 14 11:00:46 push-notification kernel: [1100089.772961] Free swap = 0kB
Nov 14 11:00:46 push-notification kernel: [1100089.772964] Total swap = 0kB
Nov 14 11:00:46 push-notification kernel: [1100089.784492] 985072 pages RAM
Nov 14 11:00:46 push-notification kernel: [1100089.784502] 23923 pages reserved
Nov 14 11:00:46 push-notification kernel: [1100089.784505] 132 pages shared
Nov 14 11:00:46 push-notification kernel: [1100089.784509] 956829 pages non-shared
Nov 14 11:00:46 push-notification kernel: [1100089.784513] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Nov 14 11:00:46 push-notification kernel: [1100089.784532] [ 672] 0 672 3729 106 0 -17 -1000 udevd
Nov 14 11:00:46 push-notification kernel: [1100089.784544] [ 852] 0 852 3728 106 0 -17 -1000 udevd
Nov 14 11:00:46 push-notification kernel: [1100089.784554] [ 853] 0 853 3728 106 0 -17 -1000 udevd
Nov 14 11:00:46 push-notification kernel: [1100089.784564] [ 1366] 0 1366 2280 126 0 0 0 dhclient
Nov 14 11:00:46 push-notification kernel: [1100089.784574] [ 1429] 0 1429 22265 52 0 -17 -1000 auditd
Nov 14 11:00:46 push-notification kernel: [1100089.784583] [ 1444] 0 1444 60832 641 0 0 0 rsyslogd
Nov 14 11:00:46 push-notification kernel: [1100089.784593] [ 1470] 81 1470 5353 59 0 0 0 dbus-daemon
Nov 14 11:00:46 push-notification kernel: [1100089.784605] [ 1559] 0 1559 18860 191 0 -17 -1000 sshd
Nov 14 11:00:46 push-notification kernel: [1100089.784613] [ 1577] 38 1577 7651 148 0 0 0 ntpd
Nov 14 11:00:46 push-notification kernel: [1100089.784623] [ 1584] 220 1584 37654 137 0 0 0 gmond
Nov 14 11:00:46 push-notification kernel: [1100089.784631] [ 1591] 215 1591 10296 150 0 0 0 nrpe
Nov 14 11:00:46 push-notification kernel: [1100089.784639] [ 1617] 219 1617 2708 29 0 0 0 epmd
Nov 14 11:00:46 push-notification kernel: [1100089.784647] [ 1764] 0 1764 29292 152 0 0 0 crond
Nov 14 11:00:46 push-notification kernel: [1100089.784655] [ 1790] 0 1790 5367 44 0 0 0 atd
Nov 14 11:00:46 push-notification kernel: [1100089.784663] [ 1808] 0 1808 1021 24 0 0 0 agetty
Nov 14 11:00:46 push-notification kernel: [1100089.784671] [ 1811] 0 1811 1018 23 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784679] [ 1814] 0 1814 1018 22 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784686] [ 1817] 0 1817 1018 23 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784694] [ 1819] 0 1819 1018 22 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784702] [ 1821] 0 1821 1018 23 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784710] [ 1823] 0 1823 1018 22 0 0 0 mingetty
Nov 14 11:00:46 push-notification kernel: [1100089.784718] [ 1839] 0 1839 35248 1638 0 0 0 munin-node
Nov 14 11:00:46 push-notification kernel: [1100089.784726] [ 2317] 0 2317 27639 126 0 0 0 uniqush-push
Nov 14 11:00:46 push-notification kernel: [1100089.784734] [ 2319] 0 2319 27572 51 0 0 0 bash
Nov 14 11:00:46 push-notification kernel: [1100089.784742] [ 2320] 0 2320 1163030 940269 0 0 0 uniqush-push
Nov 14 11:00:46 push-notification kernel: [1100089.784750] Out of memory: Kill process 2320 (uniqush-push) score 950 or sacrifice child
Nov 14 11:00:46 push-notification kernel: [1100089.784758] Killed process 2320 (uniqush-push) total-vm:4652120kB, anon-rss:3761076kB, file-rss:0kB

I can't pin down why, but it happens when we try to send a push to 500k devices. The process just runs out of memory, and we have a cron job to revive it. It happens from time to time.
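For anyone reading the log above: the total_vm and rss columns in the process table are counted in pages, not kB. The sketch below converts the row for pid 2320 into kB, assuming 4 kB pages (an assumption; the log does not state the page size). The helper name `pagesToKB` is made up for this illustration.

```go
package main

import "fmt"

// pageSizeKB is an assumption: 4 kB pages, the usual size on x86-64.
const pageSizeKB = 4

// pagesToKB converts a page count from the OOM killer's process table to kB.
func pagesToKB(pages int64) int64 {
	return pages * pageSizeKB
}

func main() {
	// Values copied from the row for pid 2320 (uniqush-push) in the log.
	const totalVMPages, rssPages = 1163030, 940269
	// Total RAM in pages, from the "985072 pages RAM" line.
	const ramPages = 985072

	fmt.Printf("total_vm: %d kB\n", pagesToKB(totalVMPages))
	fmt.Printf("rss:      %d kB\n", pagesToKB(rssPages))
	fmt.Printf("rss / RAM: %.1f%%\n", 100*float64(rssPages)/float64(ramPages))
}
```

Under that assumption the two converted values equal the total-vm:4652120kB and anon-rss:3761076kB figures on the "Killed process" line, so a single uniqush-push process held nearly all of the machine's RAM, with no swap configured.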

@ghost ghost assigned monnand Nov 14, 2013
@monnand
Member

monnand commented Nov 15, 2013

Sorry for my late reply. I was working on another project.

I think this is because the GC does not release memory back to the OS immediately. Let me see what I can do to make it more memory efficient. I really appreciate your report!
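One way to test that hypothesis, sketched here as an illustration rather than anything taken from uniqush-push: compare how much heap the Go runtime is actively using with how much it is holding idle without having returned it to the OS. The sketch uses the standard `runtime.ReadMemStats` and `debug.FreeOSMemory` calls; the `heapSnapshot` type and `snapshot` function are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// heapSnapshot holds the few MemStats fields relevant to this question.
type heapSnapshot struct {
	InUse    uint64 // bytes in in-use heap spans
	Retained uint64 // idle heap bytes not yet returned to the OS
}

// snapshot reads the runtime's memory statistics.
func snapshot() heapSnapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	retained := uint64(0)
	if m.HeapIdle > m.HeapReleased {
		retained = m.HeapIdle - m.HeapReleased
	}
	return heapSnapshot{InUse: m.HeapInuse, Retained: retained}
}

func main() {
	// Allocate and then drop about 64 MB to imitate a burst of work.
	buf := make([][]byte, 64)
	for i := range buf {
		buf[i] = make([]byte, 1<<20)
	}
	buf = nil

	runtime.GC()
	fmt.Printf("after GC:           %+v\n", snapshot())

	// Force a collection and ask the runtime to return as much as it can.
	debug.FreeOSMemory()
	fmt.Printf("after FreeOSMemory: %+v\n", snapshot())
}
```

If Retained stays large while InUse is small, delayed release is a plausible explanation for a high RSS. If InUse itself keeps growing, the memory is still referenced somewhere and the GC cannot reclaim it.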

@monnand
Member

monnand commented Nov 15, 2013

@cmabastar While I'm working on the code, you could mitigate this on your side. For example, you may want to add a swap partition to your instance so that memory pages can be moved to disk.

In the meantime, I will try to optimize the memory management part to use less memory.

A quick question: are you using a wildcard to send those 500K messages, or sending 500K individual requests?

@cmabastar
Author

Hi @monnand ,

OK, thanks for the tip. No, we are not using wildcards; the last time we did was around version 1.4.3, and it caused heavy crashes. Yes, we are still sending them individually.

@monnand
Member

monnand commented Nov 19, 2013

@cmabastar No worries. Sending them individually is the correct way (I believe) considering the volume of messages you need to send. I will try my best to improve the memory management part and will keep you updated. I hope I can release a new improved version by the end of this year. Thank you so much for your report!

@monnand
Member

monnand commented Sep 28, 2014

@cmabastar I believe this is related to connpool; the bug was reported and fixed in this thread. I have not uploaded a new release with this fix yet, but it is already available if you run go get -u github.com/uniqush/uniqush-push.

@mishan
Member

mishan commented Jun 4, 2015

What volume are you sending at? At about 200 GCM pushes / sec distributed across 6 uniqush-push instances, the RSS is over 700MB per instance and remains pretty constant -- it only varies with the volume of pushes. Subscribe / unsubscribe is very lightweight since it's Redis.

nobody 10449 30.4 18.6 1425972 730892 ? Sl May29 2337:41 /usr/bin/uniqush-push

@TysonAndre
Contributor

We stopped using connpool and moved to a worker pool a while ago.

We've also fixed various GCM inefficiencies.

Please open a new issue if you are still having problems with uniqush-push 2.5.0.
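For readers unfamiliar with the pattern, here is a minimal sketch of a fixed-size worker pool in Go. It is a generic illustration, not the uniqush-push implementation; `runPool` and the `send` callback are hypothetical names. The point is that the number of goroutines doing work is capped, so a burst of requests waits in line instead of each one allocating its own resources at once.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool processes jobs with a fixed number of workers and returns how
// many jobs were handled. send is a hypothetical stand-in for delivering
// one push notification.
func runPool(workers int, jobs []string, send func(string)) int {
	// Unbuffered channel: the producer blocks while all workers are busy.
	queue := make(chan string)
	var wg sync.WaitGroup
	var mu sync.Mutex
	handled := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range queue {
				send(job)
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}()
	}

	for _, j := range jobs {
		queue <- j
	}
	close(queue) // lets each worker's range loop finish
	wg.Wait()
	return handled
}

func main() {
	jobs := make([]string, 1000)
	for i := range jobs {
		jobs[i] = fmt.Sprintf("device-%d", i)
	}
	n := runPool(8, jobs, func(string) {})
	fmt.Println("handled:", n) // prints "handled: 1000"
}
```

With 8 workers, at most 8 sends are in flight at any moment, however many jobs are queued.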
