
after upgrading to 1.7.2 my synapse OOMs every minute #6587

Closed

pedro-nonfree opened this issue Dec 21, 2019 · 13 comments

Comments

@pedro-nonfree

pedro-nonfree commented Dec 21, 2019

Here is a long-running (1.5 years) Matrix instance, installed from the official matrix.org repos:

(screenshot: memory_leak_matrix)

(at other times I saw postgres: 11/main: synapse_user synapse ::1(38138) SELECT there instead)

And here is the dmesg output (it starts with the tail end of one OOM kill and is followed by the next one in full, so you get an idea of the time between two) [0]

This started when I upgraded to 1.7.2 on 2019-12-21 02:40:35 [1]; the first memory leak showed up at 2019-12-21 12:29:04. I then decided to upgrade the host from stretch (Debian oldstable) to buster (Debian stable).

Then I downgraded to 1.7.1 and still had the memory leak:

wget https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.7.1+buster1_amd64.deb
dpkg -i matrix-synapse-py3_1.7.1+buster1_amd64.deb

Then I downgraded to 1.7.0 and the memory leak continued with roughly the same timing; the first OOM takes a little longer to appear, but after that it happens as usual:

wget https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.7.0+buster1_amd64.deb
dpkg -i matrix-synapse-py3_1.7.0+buster1_amd64.deb

My configuration overrides the default one (it uses PostgreSQL and the LDAP auth provider) [3]

Thanks to Aaron and realitygaps in the Synapse Matrix channel for their initial support, which was crucial in getting this bug report started.

[0]

[Sat Dec 21 21:43:55 2019] Out of memory: Kill process 3547 (python) score 897 or sacrifice child
[Sat Dec 21 21:43:55 2019] Killed process 3547 (python) total-vm:4568952kB, anon-rss:3618040kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:43:55 2019] oom_reaper: reaped process 3547 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:46:51 2019] python invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[Sat Dec 21 21:46:51 2019] python cpuset=/ mems_allowed=0
[Sat Dec 21 21:46:51 2019] CPU: 0 PID: 3757 Comm: python Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[Sat Dec 21 21:46:51 2019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
[Sat Dec 21 21:46:51 2019] Call Trace:
[Sat Dec 21 21:46:51 2019]  dump_stack+0x5c/0x80
[Sat Dec 21 21:46:51 2019]  dump_header+0x6b/0x283
[Sat Dec 21 21:46:51 2019]  ? do_try_to_free_pages+0x2ec/0x370
[Sat Dec 21 21:46:51 2019]  oom_kill_process.cold.30+0xb/0x1cf
[Sat Dec 21 21:46:51 2019]  ? oom_badness+0xe9/0x140
[Sat Dec 21 21:46:51 2019]  out_of_memory+0x1a5/0x430
[Sat Dec 21 21:46:51 2019]  __alloc_pages_slowpath+0xbd8/0xcb0
[Sat Dec 21 21:46:51 2019]  __alloc_pages_nodemask+0x28b/0x2b0
[Sat Dec 21 21:46:51 2019]  filemap_fault+0x3bd/0x780
[Sat Dec 21 21:46:51 2019]  ? page_add_file_rmap+0x109/0x210
[Sat Dec 21 21:46:51 2019]  ? alloc_set_pte+0xf2/0x560
[Sat Dec 21 21:46:51 2019]  ? _cond_resched+0x15/0x30
[Sat Dec 21 21:46:51 2019]  __xfs_filemap_fault+0x6d/0x200 [xfs]
[Sat Dec 21 21:46:51 2019]  __do_fault+0x36/0x130
[Sat Dec 21 21:46:51 2019]  __handle_mm_fault+0xe6c/0x1270
[Sat Dec 21 21:46:51 2019]  handle_mm_fault+0xd6/0x200
[Sat Dec 21 21:46:51 2019]  __do_page_fault+0x249/0x4f0
[Sat Dec 21 21:46:51 2019]  ? async_page_fault+0x8/0x30
[Sat Dec 21 21:46:51 2019]  async_page_fault+0x1e/0x30
[Sat Dec 21 21:46:51 2019] RIP: 0033:0x53cf39
[Sat Dec 21 21:46:51 2019] Code: Bad RIP value.
[Sat Dec 21 21:46:51 2019] RSP: 002b:00007fff6f061790 EFLAGS: 00010202
[Sat Dec 21 21:46:51 2019] RAX: 000000000079c910 RBX: 0000000000000002 RCX: 0000000000000001
[Sat Dec 21 21:46:51 2019] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007fa53de76db8
[Sat Dec 21 21:46:51 2019] RBP: 00000000008a5780 R08: 00000000008a5780 R09: 0000000000000002
[Sat Dec 21 21:46:51 2019] R10: 00007fa56b2d0be0 R11: 0000000000000022 R12: 0000000000000001
[Sat Dec 21 21:46:51 2019] R13: 0000000000000002 R14: 0000000000598e40 R15: 00007fff6f061f28
[Sat Dec 21 21:46:51 2019] Mem-Info:
[Sat Dec 21 21:46:51 2019] active_anon:959454 inactive_anon:1978 isolated_anon:0
                            active_file:14 inactive_file:42 isolated_file:0
                            unevictable:0 dirty:0 writeback:0 unstable:0
                            slab_reclaimable:7449 slab_unreclaimable:9459
                            mapped:35816 shmem:36953 pagetables:3418 bounce:0
                            free:21726 free_pcp:62 free_cma:0
[Sat Dec 21 21:46:51 2019] Node 0 active_anon:3837816kB inactive_anon:7912kB active_file:56kB inactive_file:268kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:143364kB dirty:0kB writeback:0kB shmem:147812kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 53248kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[Sat Dec 21 21:46:51 2019] Node 0 DMA free:15840kB min:268kB low:332kB high:396kB active_anon:60kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 2937 3896 3896 3896
[Sat Dec 21 21:46:51 2019] Node 0 DMA32 free:54128kB min:50748kB low:63432kB high:76116kB active_anon:2964716kB inactive_anon:16kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129200kB managed:3043412kB mlocked:0kB kernel_stack:464kB pagetables:8480kB bounce:0kB free_pcp:104kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 0 959 959 959
[Sat Dec 21 21:46:51 2019] Node 0 Normal free:16432kB min:16564kB low:20704kB high:24844kB active_anon:873040kB inactive_anon:7896kB active_file:288kB inactive_file:0kB unevictable:0kB writepending:0kB present:1048576kB managed:982132kB mlocked:0kB kernel_stack:1824kB pagetables:5192kB bounce:0kB free_pcp:136kB local_pcp:0kB free_cma:0kB
[Sat Dec 21 21:46:51 2019] lowmem_reserve[]: 0 0 0 0 0
[Sat Dec 21 21:46:51 2019] Node 0 DMA: 2*4kB (UM) 1*8kB (U) 1*16kB (U) 0*32kB 3*64kB (UM) 2*128kB (UM) 2*256kB (UM) 1*512kB (M) 2*1024kB (UM) 0*2048kB 3*4096kB (M) = 15840kB
[Sat Dec 21 21:46:51 2019] Node 0 DMA32: 564*4kB (UME) 444*8kB (UME) 103*16kB (UME) 74*32kB (UE) 124*64kB (UME) 54*128kB (UE) 36*256kB (UME) 19*512kB (UME) 7*1024kB (UME) 0*2048kB 1*4096kB (M) = 54880kB
[Sat Dec 21 21:46:51 2019] Node 0 Normal: 1307*4kB (UME) 781*8kB (UME) 238*16kB (UME) 47*32kB (UME) 6*64kB (ME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17172kB
[Sat Dec 21 21:46:51 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Dec 21 21:46:51 2019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Dec 21 21:46:51 2019] 37022 total pagecache pages
[Sat Dec 21 21:46:51 2019] 0 pages in swap cache
[Sat Dec 21 21:46:51 2019] Swap cache stats: add 0, delete 0, find 0/0
[Sat Dec 21 21:46:51 2019] Free swap  = 0kB
[Sat Dec 21 21:46:51 2019] Total swap = 0kB
[Sat Dec 21 21:46:51 2019] 1048442 pages RAM
[Sat Dec 21 21:46:51 2019] 0 pages HighMem/MovableOnly
[Sat Dec 21 21:46:51 2019] 38079 pages reserved
[Sat Dec 21 21:46:51 2019] 0 pages hwpoisoned
[Sat Dec 21 21:46:51 2019] Tasks state (memory values in pages):
[Sat Dec 21 21:46:51 2019] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Sat Dec 21 21:46:51 2019] [    257]     0   257     6622      615   110592        0             0 systemd-journal
[Sat Dec 21 21:46:51 2019] [    278]     0   278     5514      260    65536        0         -1000 systemd-udevd
[Sat Dec 21 21:46:51 2019] [    372]   103   372     1706      107    53248        0             0 rpcbind
[Sat Dec 21 21:46:51 2019] [    374]   100   374    23270      232    86016        0             0 systemd-timesyn
[Sat Dec 21 21:46:51 2019] [    384]     0   384     4879      246    73728        0             0 systemd-logind
[Sat Dec 21 21:46:51 2019] [    385]     0   385    56456      291    81920        0             0 rsyslogd
[Sat Dec 21 21:46:51 2019] [    388]   105   388     2243      139    57344        0          -900 dbus-daemon
[Sat Dec 21 21:46:51 2019] [    390]     0   390      581       17    40960        0             0 acpid
[Sat Dec 21 21:46:51 2019] [    391]     0   391     2126       66    57344        0             0 cron
[Sat Dec 21 21:46:51 2019] [    396]     0   396    20540       92    57344        0             0 irqbalance
[Sat Dec 21 21:46:51 2019] [    403]     0   403     1372       44    53248        0             0 atd
[Sat Dec 21 21:46:51 2019] [    627]     0   627     1403       29    45056        0             0 agetty
[Sat Dec 21 21:46:51 2019] [    633]     0   633     3963      218    73728        0         -1000 sshd
[Sat Dec 21 21:46:51 2019] [    646]   111   646    53362     3130   151552        0          -900 postgres
[Sat Dec 21 21:46:51 2019] [    648]     0   648    30525      533   118784        0             0 nginx
[Sat Dec 21 21:46:51 2019] [    649]    33   649    30664      742   126976        0             0 nginx
[Sat Dec 21 21:46:51 2019] [    650]    33   650    30596      584   126976        0             0 nginx
[Sat Dec 21 21:46:51 2019] [    652]   111   652    53416     7486   339968        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    653]   111   653    53395    33117   405504        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    654]   111   654    53362     1578   131072        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    655]   111   655    53496     1215   188416        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    656]   111   656    17198      542   114688        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    657]   111   657    53462      707   135168        0             0 postgres
[Sat Dec 21 21:46:51 2019] [    764]   108   764     6988     2348    98304        0             0 lua5.2
[Sat Dec 21 21:46:51 2019] [   1013]   104  1013     5013      205    77824        0             0 exim4
[Sat Dec 21 21:46:51 2019] [   1431]     0  1431     4228      289    73728        0             0 sshd
[Sat Dec 21 21:46:51 2019] [   1434]     0  1434     5254      302    73728        0             0 systemd
[Sat Dec 21 21:46:51 2019] [   1435]     0  1435     5740      588    81920        0             0 (sd-pam)
[Sat Dec 21 21:46:51 2019] [   1448]     0  1448     2283      671    57344        0             0 bash
[Sat Dec 21 21:46:51 2019] [   1453]     0  1453     2103      273    57344        0             0 htop
[Sat Dec 21 21:46:51 2019] [   1604]     0  1604     4228      296    73728        0             0 sshd
[Sat Dec 21 21:46:51 2019] [   1610]     0  1610     2430      732    57344        0             0 bash
[Sat Dec 21 21:46:51 2019] [   3290]     0  3290     4228      285    69632        0             0 sshd
[Sat Dec 21 21:46:51 2019] [   3296]     0  3296     2316      713    49152        0             0 bash
[Sat Dec 21 21:46:51 2019] [   3476]     0  3476     4368      798    69632        0             0 vi
[Sat Dec 21 21:46:51 2019] [   3757]   110  3757  1120978   905539  7499776        0             0 python
[Sat Dec 21 21:46:51 2019] [   3850]   111  3850    53924    19314   405504        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3860]   111  3860    54039    15842   376832        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3868]   111  3868    53871    15040   380928        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3878]   111  3878    54022    17647   389120        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3885]   111  3885    53850    14661   372736        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3889]   111  3889    54058    19481   401408        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3890]   111  3890    54064    16440   385024        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3892]   111  3892    54157    17175   385024        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3895]   111  3895    53927    24754   409600        0             0 postgres
[Sat Dec 21 21:46:51 2019] [   3897]   111  3897    54120    35123   409600        0             0 postgres
[Sat Dec 21 21:46:51 2019] Out of memory: Kill process 3757 (python) score 898 or sacrifice child
[Sat Dec 21 21:46:51 2019] Killed process 3757 (python) total-vm:4483912kB, anon-rss:3622156kB, file-rss:0kB, shmem-rss:0kB
[Sat Dec 21 21:46:51 2019] oom_reaper: reaped process 3757 (python), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

[1]

Start-Date: 2019-12-19  17:41:49
Commandline: apt upgrade
Upgrade: git-man:amd64 (1:2.11.0-3+deb9u4, 1:2.11.0-3+deb9u5), git:amd64 (1:2.11.0-3+deb9u4, 1:2.11.0-3+deb9u5), matrix-synapse-py3:amd64 (1.6.1+stretch1, 1.7.1+stretch1)
End-Date: 2019-12-19  17:42:50

Start-Date: 2019-12-21  02:40:35
Commandline: apt upgrade
Upgrade: libsasl2-modules-db:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1), libsasl2-2:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1), matrix-synapse-py3:amd64 (1.7.1+stretch1, 1.7.2+stretch1), libsasl2-modules:amd64 (2.1.27~101-g0780600+dfsg-3, 2.1.27~101-g0780600+dfsg-3+deb9u1)
End-Date: 2019-12-21  02:41:17

(from /var/log/apt/history.log)

[2]
https://github.com/matrix-org/synapse/pull/6576/files

which in the Debian packaging corresponds to: /opt/venvs/matrix-synapse/lib/python3.5/site-packages/synapse/handlers/federation.py

[3]

#certificates
tls_certificate_path: "/etc/letsencrypt/live/example.com/fullchain.pem"
tls_private_key_path: "/etc/letsencrypt/live/example.com/privkey.pem"

# overridden: default is sqlite
database:
  name: psycopg2
  args:
    user: synapse_user
    password: synapse_user
    database: synapse
    host: localhost
    cp_min: 5
    cp_max: 10

password_providers:
 - module: "ldap_auth_provider.LdapAuthProvider"
   config:
     enabled: true
     uri: "ldaps://ldap.example.com"
     start_tls: true
     base: "o=webusers,dc=example,dc=com"
     attributes:
        uid: "uid"
        mail: "mail"
        name: "uid"

# overridden: default is false
allow_guest_access: True

# reverse proxy -> https://github.com/matrix-org/synapse#using-a-reverse-proxy-with-synapse
listeners:
  # Main HTTPS listener
  # For when matrix traffic is sent directly to synapse.
  -
    # The port to listen for HTTPS requests on.
    port: 8448

    # Local interface to listen on.
    # The empty string will cause synapse to listen on all interfaces.
    #bind_address: ''
    # include IPv6 -> src https://github.com/matrix-org/synapse/issues/1886
    bind_address: '::'

    # This is a 'http' listener, allows us to specify 'resources'.
    type: http

    tls: true

    # Use the X-Forwarded-For (XFF) header as the client IP and not the
    # actual client IP.
    x_forwarded: false

    # List of HTTP resources to serve on this listener.
    resources:
      -
        # List of resources to host on this listener.
        names:
          - client     # The client-server APIs, both v1 and v2

        # Should synapse compress HTTP responses to clients that support it?
        # This should be disabled if running synapse behind a load balancer
        # that can do automatic compression.
        compress: true

      - names: [federation]  # Federation APIs
        compress: false

  # Unsecure HTTP listener,
  # For when matrix traffic passes through loadbalancer that unwraps TLS.
  - port: 8008
    tls: false
    bind_address: '127.0.0.1'
    type: http

    #x_forwarded: true
    x_forwarded: false

    resources:
      - names: [client]
        compress: true
      - names: [federation]
        compress: false

# Enable communities
enable_group_creation: True
@ara4n
Member

ara4n commented Dec 21, 2019

can you please DM the logs from when it was leaking memory to @matthew:matrix.org

@ara4n
Member

ara4n commented Dec 22, 2019

The server is receiving a big federation transaction from matrix.org (with id 1576842288415) which contains 50 PDUs and 6 EDUs. The PDUs have lots of prev events; the server tries to go off and locate the missing events, but something gets wedged and it simply never completes. The request therefore gets retried, the retries stack up, and the server OOMs.

a) We shouldn't be wedging.
b) We should surely ignore duplicate federation txns coming in if we're already processing one.

I've suggested to @pedro-nonfree that he tries leaving some of the rooms mentioned in the bad transaction in case that unsticks things until we can figure out what's going wrong.

@pedro-nonfree
Author

pedro-nonfree commented Dec 22, 2019

Thanks @ara4n

Apart from finding the bug or the specific rooms that trigger this memory leak, Synapse itself should avoid memory leaks that make it explode every minute. (I hope that a) and b) deal with that.)

@pedro-nonfree
Author

pedro-nonfree commented Dec 22, 2019

As the affected room is public I can name it here: it is freenode/#openwrt-devel, bridged from IRC via matrix.org (Matrix room: !sAEmVCXWUHBxWiYCmu:matrix.org)

@anoadragon453
Member

b) We should surely ignore duplicate federation txns coming in if we're already processing one.

For reference, if we receive a transaction that we've already processed, we will return the cached response.

But there is no check for whether we're currently processing a given transaction.

@richvdh
Member

richvdh commented Dec 24, 2019

But there is no check for whether we're currently processing a given transaction.

I don't think that's correct: the relevant code is here.

@richvdh
Member

richvdh commented Dec 24, 2019

Having had a look at @pedro-nonfree's logs: I don't believe it is correct to conclude that the reason for the OOM is that we are trying to process the same transaction several times in parallel: the code I linked above appears to be doing its job correctly.

Rather, it looks like the code processing the incoming transaction is just wedging, without doing any logging. The last thing it logs in every case (apart from the calling server dropping the connection) is a successful response to a call to matrix://matrix.org/_matrix/federation/v1/state_ids/%21jWRVQAlVGjigKCRGwS%3Amatrix.org. That's a fairly large room, and I guess somehow it is getting stuck when processing that response.

I'll have to think a bit more about this. In the meantime @pedro-nonfree : if you'd like to try running synapse with debug logging enabled (essentially, change the root log level to DEBUG in your log config), it may produce some more helpful information.
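
For anyone following along, here is a minimal sketch of what that change could look like in the Synapse log config YAML (the file referenced by log_config in homeserver.yaml); the formatter and handler names below are illustrative assumptions and should be adapted to whatever your existing config defines:

version: 1

formatters:
  precise:
    format: '%(asctime)s - %(name)s - %(lineno)d - %(levelname)s - %(message)s'

handlers:
  console:
    class: logging.StreamHandler   # assumed handler; keep your existing file handler if you have one
    formatter: precise

root:
  level: DEBUG        # raised from the usual INFO so the wedged transaction processing gets logged
  handlers: [console]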

@ara4n
Member

ara4n commented Dec 24, 2019

thanks - please share logs asap (privately) once you've repro'd an OOM. be aware that DEBUG logs can contain personal data though.

@richvdh
Member

richvdh commented Dec 24, 2019

I'd recommend setting synapse.storage.SQL back to INFO - it logs a lot of data including access tokens and passwords.
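
Continuing the sketch from the previous comment (the logger name comes from this recommendation; the rest of the log config is assumed unchanged), the override would be an extra loggers section in the same YAML file:

loggers:
  synapse.storage.SQL:
    level: INFO   # keep SQL statement logging at INFO even with the root level at DEBUG,
                  # so access tokens and passwords stay out of the debug logs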

@richvdh
Member

richvdh commented Dec 24, 2019

right, so the final entries in @pedro-nonfree's logs are:

{GET-O-7} [matrix.org] Sending request: GET matrix://matrix.org/_matrix/federation/v1/state_ids/%21jWRVQAlVGjigKCRGwS%3Amatrix.org?event_id=%241577162865947686RYsuB%3Amatrix.org; timeout 60.000000s
{GET-O-7} [matrix.org] Completed: 200 OK
Loaded 27637 events (27199 rows)

so I'd guess that attempting to load 27637 events into memory at once is causing an OOM.

This is happening because the auth_chain being returned by matrix.org has 22327 events (plus the 5314 events in the state at that point).

I can't really believe that there are 22327 events in the auth chain for a single event, so something seems off there.

I now think that the fact this started happening after the upgrade to 1.7.2 is a coincidence: rather it was triggered by us deploying #6556 (or something similar) on matrix.org.

@richvdh
Member

richvdh commented Dec 25, 2019

I've raised #6597 to track the problem with matrix.org's response.

@pedro-nonfree
Author

pedro-nonfree commented Dec 25, 2019

@richvdh identified which room was causing the problems. I tried to leave it but could not; he kicked me from that room and the OOM cycle stopped.

So the problem was identified on the matrix.org server. I suggest that the other servers in the federation should also be prepared to receive insane state_ids responses.

We can think of this as an attack vector: a malicious user/server invites @user:example.com to a room, and from that room it sends insane state_ids responses that cause OOMs on the whole example.com Matrix server, which, as shown, is difficult to see, detect and fix. Hence Synapse should pre-process such a response, partitioning the data into chunks it can handle, or ask the other server to send it little by little, or reject it because it "looks invalid".

Lesson learned: it was a good idea to have the Matrix service mostly isolated. That's a big warning for people who run the server together with other, unrelated services.

@richvdh
Member

richvdh commented Dec 26, 2019

I'm going to close this in favour of #6597

@richvdh richvdh closed this as completed Dec 26, 2019