Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Suspected missing m.room.membership events after a gappy sync causing UISI #11102

Closed
BillCarsonFr opened this issue Oct 18, 2021 · 6 comments
Closed
Labels
A-Sync defects related to /sync S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@BillCarsonFr
Copy link
Member

I have been investigating the following UISI report :https://github.com/matrix-org/element-web-rageshakes/issues/6602

It looks like a gappy sync scenario, and that the membership event was not sent
image

The join event of the receiver was only received when he sent is first message in the room (he actually joined 5h before)

@richvdh
Copy link
Member

richvdh commented Oct 18, 2021

are onurays and adam on the same server? if not, I suspect it's #2165.

@erikjohnston
Copy link
Member

They are both on matrix.org

@DMRobertson DMRobertson added T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. A-Sync defects related to /sync S-Major Major functionality / product severely impaired, no satisfactory workaround. labels Oct 18, 2021
@erikjohnston
Copy link
Member

Some notes from investigation

Onuray join stream ordering: 2390145608, 2021-10-15 07:46:54Z
Adam message stream ordering: 2390646342, 2021-10-15 12:44:31Z

First sync of the day 2021-10-15 09:40:02,476 (doxy, sync31): s2368016989_757284961_8078984_1051647443_982723366_2666339_337741678_2062913065_209393

First sync at midday 2021-10-15 11:22:45,487 (grindy, sync18): s2390509478_757284961_2884_1063246301_999332906_2685745_342581158_2099604528_209609

From the rageshake logs it does appear that after the first sync of the day onuray is in the room (from rageshake logs logs-0001):

2021-10-15T09:40:02.872Z I LL: got 43 members from server for room !txUGtgnGUNdAWqUqmg:matrix.org
...
2021-10-15T09:40:05.808Z I got device keys for @onurays:matrix.org: {"GQXASNVAUE":{"algorithms":["m.olm.v1.curve25519-aes-sha2","m.megolm.v1.aes-sha2"],"device_id":"GQXASNVAUE","keys":{"curve25519:GQXASNVAUE":"yNfwWjE9Io4bHiG8oOgbfc+Jaa3GbUziCEskqnNKHA8","ed25519:GQXASNVAUE":"ouzPv973Vxo0v1snJCuhlQDsszf47y6unNhbrt2YDi4"},"signatures":{"@onurays:matrix.org":{"ed25519:GQXASNVAUE":"ObMO/GrbE+Bvp/cbSVEBbtCY2RJOrbQ5N2XQdabokj9vMtSJPomKALa9KGC+Gw09P1UX5Fo8s5SZKW4r/B1fCg","ed25519:Bf8ldqtlREY6Ad/36nW57oSEMbqNvP/uU/9ghyUMpEI":"HW/Ns208JwrbOtlCFUzbKDWRHBpjyw8+irI4SZaf+G03GAVTITksEo4aNzCXRcWyb4q6iiYrEDnCcGigT/aGAw"}},"user_id":"@onurays:matrix.org","unsigned":{"device_display_name":"Riot Desktop on macOS"}},"COUVTNGBQG":{"algorithms":["m.megolm.v1.aes-sha2","m.olm.v1.curve25519-aes-sha2"],"device_id":"COUVTNGBQG","keys":{"curve25519:COUVTNGBQG":"d8vvRaHuNV/bb+x7OVzZMzQDnWUTXQ5z4XZEodWfaH4","ed25519:COUVTNGBQG":"C84hoSukC+BleLKEmGrU1WgvLjWMXhr8ePuam+EB4l0"},"signatures":{"@onurays:matrix.org":{"ed25519:COUVTNGBQG":"iFdryml9CsDBDoVyO1oWU79z6IP2znO3kYF+2glvpGXocTzjzr9xkNhRpCA68j0COMDQrPDySx4laxcK4L4hDQ","ed25519:Bf8ldqtlREY6Ad/36nW57oSEMbqNvP/uU/9ghyUMpEI":"r32suRQDgj4/BffWIylNLC6EKvMRA7a4pgT0LzhbqOviS7VuAp9BtamVeuQ2kmsu1nfh68v2XZwTKI82FP1DCA"}},"user_id":"@onurays:matrix.org","unsigned":{"device_display_name":"Element Android"}},"AKJXKVMQSF":{"algorithms":["m.olm.v1.curve25519-aes-sha2","m.megolm.v1.aes-sha2"],"device_id":"AKJXKVMQSF","keys":{"curve25519:AKJXKVMQSF":"X1kQA8OppkShqC4aYZL/kLCO7CUQkNNTnX+Ad1bgFQ8","ed25519:AKJXKVMQSF":"s9T6Ri0vO2Gn1pSvaxR42LMmw7SvR5FXY9rtKaJ42JM"},"signatures":{"@onurays:matrix.org":{"ed25519:AKJXKVMQSF":"MnWoWv4KtW6WJ/PSQlIgRROG4BRM/BzVuQ+hArtxoSFriQnGwi0smQgV17j6HHRLQH1yxfQqZgkIg72oiXWpAg","ed25519:Bf8ldqtlREY6Ad/36nW57oSEMbqNvP/uU/9ghyUMpEI":"ua5xTjK8dH3jAssejhm2ocUSxj1a+h7nQRCbUqZajZOKiBeSSY47o8CWD2vFkdT7rj1pWs2GfSF+5kwks1XMDg"}},"user_id":"@onurays:matrix.org","unsigned":{"device_display_name":"Element Nightly Desktop (macOS)"}}}
...
2021-10-15T09:40:15.583Z I Sharing megolm keys for S8yqJerNPaMfoVmx8IMKvKH2kXZkFUQK82YwP4F5JOE in !txUGtgnGUNdAWqUqmg:matrix.org (slice 5/9) ["@onurays:matrix.org/GQXASNVAUE","@onurays:matrix.org/COUVTNGBQG","@onurays:matrix.org/AKJXKVMQSF","@ouchadam:matrix.org/TKGFSAGOQL","@ouchadam:matrix.org/FRMLZFBCSW","@ouchadam:matrix.org/XPWHKZVNKY","@ouchadam:matrix.org/ZSEXSAUXFA","@palid:matrix.org/AGWKOPUPBI","@palid:matrix.org/OCTKIQLNLS","@palid:matrix.org/QDWFOXVBYA","@palid:matrix.org/FNNTEDROQL","@palid:matrix.org/KTGUQWPUXX","@palid:matrix.org/AWXPNUQDMD","@palid:matrix.org/MVFYTPPOUU","@palid:matrix.org/PBLXNIIPRN","@poljar:matrix.org/BNZRRMNDGA","@poljar:matrix.org/IRYUCRQPMV","@stefan.ceriu:matrix.org/LXGQHGZYHY","@stefan.ceriu:matrix.org/KJLOQRIACI","@stefan.ceriu:matrix.org/VQNQQCRVWB","@stefan.ceriu:matrix.org/CPQJKJTPOC","@stefan.ceriu:matrix.org/OSRMNRVVNI","@stefan.ceriu:matrix.org/BQWRDQBLMX","@stefan.ceriu:matrix.org/LBQLTFSDEY","@stefan.ceriu:matrix.org/QGAUSYRDXU","@stefan.ceriu:matrix.org/NXANSKQULF","@stefan.ceriu:matrix.org/XCVRJJYAXM","@stefan.ceriu:matrix.org/MCMMDEJMPA","@stefan.ceriu:matrix.org/PJAIPQYWPB","@stefan.ceriu:matrix.org/NDHWAJXPHR","@stefan.ceriu:matrix.org/EYBJWNLFJH","@stefan.ceriu:matrix.org/TLOVYBJZPN","@stefan.ceriu:matrix.org/VKJYBLOPLX","@stefan.ceriu:matrix.org/MWKVTMLEXE","@stefan.ceriu:matrix.org/EHATUBKIIT","@stefan.ceriu:matrix.org/LUGUTBOHRD","@stefan.ceriu:matrix.org/BXRQIEZHDS","@stefan.ceriu:matrix.org/KDPXTSGDKW","@stefan.ceriu:matrix.org/AKBIQMTLUP","@stefan.ceriu:matrix.org/KITKRACGTA","@stefan.ceriu:matrix.org/XZOJNHNWUL","@stefan.ceriu:matrix.org/WFLSKZHPOW","@stefan.ceriu:matrix.org/DROINSUVUW"]
...

So my hunch is that the onuray join did come down in the first sync, but then also turned up at the second gappy sync, and that the client got confused and displayed that in the timeline?

@BillCarsonFr
Copy link
Member Author

Thx @erikjohnston, the timeline is not confused, I used this representation to reflect when the join event appears in logs.

So you are saying that an additional join event came down the sync with the first message, but it was also already sent previously?
I'll check for that.

@erikjohnston
Copy link
Member

So you are saying that an additional join event came down the sync with the first message, but it was also already sent previously? I'll check for that.

Yup, I think for lazy loaded memberships we do sometimes send down the membership events more than once, when on the server we don't know if we have already sent it. Note that they should show up in the state section, and not the timeline section.

@BillCarsonFr
Copy link
Member Author

Ok, I am closing.
The previous megolm was correctly shared to onuray, so it looks it's on the client that did shared correctly the new session

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Sync defects related to /sync S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

No branches or pull requests

4 participants