This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Joining rooms over federation can be very slow (SYN-293) #1211

Closed
matrixbot opened this issue Feb 28, 2015 · 36 comments
Labels
A-Federated-Join joins over federation generally suck
A-Performance Performance, both client-facing and admin-facing
O-Frequent Affects or can be seen by most users regularly or impacts most users' first experience
S-Major Major functionality / product severely impaired, no satisfactory workaround.
T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@matrixbot
Member

matrixbot commented Feb 28, 2015

Edit: Currently tracking related work in the milestones:

For an overview of the technical changes, see MSC3902.


Original Post:

I have Synapse/0.7.1-r2 installed from pip/github.
I am running nginx on a public server in front of it:

I can initiate 1:1 chats or join rooms on matrix.org from matrix.ytnoc.net. If I join an active room like #matrix-dev:matrix.org, it takes well over a minute to synchronize. I only had one instance where I saw all the user avatars and the channel name show up. The rest of the time, the channel name on the left is replaced by my local ID (@kb3dfz:matrix.ytnoc.net). See screenshot.

#test:matrix.org seems to be working just fine with federation.

I have attached homeserver.log

(Imported from https://matrix.org/jira/browse/SYN-293)

(Reported by John Hogemiller)

Attachments:

https://matrix.org/jira/secure/attachment/10115/icognito-1.png
https://matrix.org/jira/secure/attachment/10118/noalias.png
https://matrix.org/jira/secure/attachment/10116/redacted_homeserver.log
https://matrix.org/jira/secure/attachment/10113/Screen+Shot+2015-02-28+at+11.24.53+AM.png
https://matrix.org/jira/secure/attachment/10117/usernames+only.png

@matrixbot
Member Author

Jira watchers: @erikjohnston

@matrixbot
Member Author

Once I opened an incognito browser (for a different cache), I did get more, mainly avatars and images. The channel ID showed up, not the alias.

So I feel there's still some disconnect, but not a big one.

-- John Hogemiller

@matrixbot
Member Author

After 24 hours, the channel alias still hasn't shown up, just the ID (as seen in the screenshot). I can click on the cog and see the alias. I tried submitting it as is, changing it to be an alias on my server, and listing both: ["#matrix-dev:matrix.org", "#matrix-dev:matrix.ytnoc.net"]. In all cases, it reverts back to the list propagated from matrix.org.

-- John Hogemiller

@matrixbot
Member Author

Can a project admin edit the description? I think some of my initial issues (no avatars) were browser cache related, and the key issue remaining is that the channel alias won't show up.

-- John Hogemiller

@matrixbot
Member Author

Did you have multiple clients open? Were the ones that did not get the complete state the ones that you didn't use to join the room?

-- @erikjohnston

@matrixbot
Member Author

I initially had one client open. It did not get the room member's list/complete state (first screenshot). It showed the room as my username. Later, upon opening an incognito client, I got the member list, and it showed the room id and member list. Refreshing my main client got the room id and member list to show. m.room.alias can be viewed by clicking on the gear, but does not appear on the left channel list.

This finicky behavior seems to be based on room size.
Joining #test:matrix.org worked just fine.
Joining #matrix:matrix.org (today) shows me usernames on the left side (screenshot "usernames only"). Hitting the gear, no m.room.alias appears. I see the same in chrome, chrome incognito, firefox.

-- John Hogemiller

@matrixbot matrixbot changed the title from "Federated channel loses information, takes a long time to sync (SYN-293)" to "Federated channel loses information, takes a long time to sync (https://github.com/matrix-org/synapse/issues/1211)" on Nov 7, 2016
@matrixbot matrixbot changed the title from "Federated channel loses information, takes a long time to sync (https://github.com/matrix-org/synapse/issues/1211)" to "Federated channel loses information, takes a long time to sync (SYN-293)" on Nov 7, 2016
@richvdh richvdh changed the title from "Federated channel loses information, takes a long time to sync (SYN-293)" to "Joining rooms over federation can be very slow (SYN-293)" on Oct 11, 2017
@richvdh
Member

richvdh commented Apr 18, 2018

Having just done a join from a test server to matrix HQ, the join eventually completed with:

2018-04-18 13:55:23,582 - synapse.access.https.8447 - 93 - INFO - POST-726- None - 8447 - {@richvdh:matrixtest.sw1v.org} Processed request: 467006ms (53781ms, 4376ms) (12344977ms/16508ms/4567) 65B 200 "POST /_matrix/client/r0/join/%23matrix%3Amatrix.org? HTTP/1.1" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"

- that's 12345 seconds (about 3.4 hours) waiting for db connections. Given the whole request completed in "only" 467 seconds, presumably it is doing a bunch of that waiting in parallel.
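The arithmetic behind those figures can be checked with a short sketch. It pulls the timings out of the "Processed request" part of the log line quoted above. The comment above identifies the first figure in the second parenthesised group as time spent waiting for db connections; the names given to the other fields here are assumptions, as is the function name `summarize_request`.

```python
import re

# Timing portion of the log line quoted in this comment.
SAMPLE = "Processed request: 467006ms (53781ms, 4376ms) (12344977ms/16508ms/4567)"

TIMINGS = re.compile(
    r"Processed request: (\d+)ms \((\d+)ms, (\d+)ms\) \((\d+)ms/(\d+)ms/(\d+)\)"
)


def summarize_request(line):
    """Convert the millisecond timings in a 'Processed request' line to seconds/hours."""
    match = TIMINGS.search(line)
    if match is None:
        raise ValueError("no 'Processed request' timings found")
    # Only total_ms and db_wait_ms are used; the other field names are assumptions.
    total_ms, _cpu_user_ms, _cpu_sys_ms, db_wait_ms, _db_txn_ms, _db_txn_count = map(
        int, match.groups()
    )
    return {
        "total_s": total_ms / 1000,
        "db_wait_s": db_wait_ms / 1000,
        "db_wait_h": db_wait_ms / 1000 / 3600,
        # Cumulative wait divided by wall-clock time: a rough estimate of how many
        # operations were waiting for a connection at once, on average.
        "avg_parallel_waiters": db_wait_ms / total_ms,
    }


summary = summarize_request(SAMPLE)
```

For the sample line, the cumulative db wait is roughly 26 times the wall-clock duration of the request, which fits the guess that the waiting happens in parallel.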

I suspect we're handling each server separately and in parallel from the point of view of fetching server keys and persisting said keys to the db. We should apply more intelligence here.
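One way to picture "more intelligence" is to cap how many key fetches run at once and to persist all fetched keys in a single transaction rather than one per server. The sketch below is purely illustrative and is not Synapse's code: Synapse is built on Twisted, whereas this uses `asyncio` and an in-memory SQLite database for brevity, and `fetch_server_key`, the `server_keys` table, and every other name here are hypothetical.

```python
import asyncio
import sqlite3
from typing import List, Tuple


async def fetch_server_key(server_name: str) -> Tuple[str, str]:
    # Hypothetical stand-in for a federation key request; no network I/O happens.
    await asyncio.sleep(0)
    return server_name, "key-for-" + server_name


async def fetch_all_keys(
    server_names: List[str], max_in_flight: int = 10
) -> List[Tuple[str, str]]:
    # Bound the number of concurrent fetches instead of starting them all at once.
    limiter = asyncio.Semaphore(max_in_flight)

    async def fetch_one(name: str) -> Tuple[str, str]:
        async with limiter:
            return await fetch_server_key(name)

    return list(await asyncio.gather(*(fetch_one(n) for n in server_names)))


def persist_keys_in_one_transaction(conn, keys):
    # One connection and one transaction for the whole batch, rather than
    # each server's key competing for its own db connection.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO server_keys (server_name, key) VALUES (?, ?)",
            keys,
        )


def demo(server_names):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE server_keys (server_name TEXT PRIMARY KEY, key TEXT)")
    keys = asyncio.run(fetch_all_keys(server_names))
    persist_keys_in_one_transaction(conn, keys)
    return conn.execute("SELECT COUNT(*) FROM server_keys").fetchone()[0]
```

Calling `demo(["a.example", "b.example", "c.example"])` stores three rows using a single write transaction.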

@richvdh
Member

richvdh commented Apr 18, 2018

Also related: #3120

@enter-the-voiddddd

I am experiencing the same lag when federating to any room that has more than 100 users. Running synapse in docker with open file limits set to 20000 and running postgres database. Using nginx as a reverse proxy following the instructions here.

@reivilibre reivilibre added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label Aug 31, 2021
@Janhouse

I'm having a similar experience: I can't join any larger rooms (~1000 users) over a federated connection.
Should it actually be possible?

@FSG-Cat
Contributor

FSG-Cat commented Sep 23, 2021

Yes, it's completely possible. A lot of the Matrix devs and enthusiasts run their own homeservers, and quite a few are in rooms like HQ. HQ is currently at 25k members, so yes, joining 1000+ user rooms over federation is fully possible and works.

@nickian

nickian commented Oct 27, 2021

I'm experiencing this as well. Can't seem to join most matrix.org rooms. The Raspberry Pi one finally loaded randomly after errors in Element popping up saying "Can't join room." Is this a function of my server's resources, or something else? I'm only running on a virtual machine with 4GB memory and 2 CPUs, using SQLite.

@aaronraimist
Contributor

@nickian you should definitely be using Postgres if you want joining a room to take a reasonable amount of time: https://matrix-org.github.io/synapse/latest/setup/installation.html#using-postgresql

@nickian

nickian commented Oct 31, 2021

It randomly joined finally when I logged back in a half hour later, despite the error.

@BBaoVanC

BBaoVanC commented Nov 1, 2021

> It randomly joined finally when I logged back in a half hour later, despite the error.

Yeah it's a bit of a UI problem in Element. It shouldn't say there was an error, because the server is probably still trying to join anyways.

@deepbluev7
Contributor

If you people missed it, there is work being done on this: https://matrix.org/blog/2022/01/28/this-week-in-matrix-2022-01-28#synapse-website

@erikjohnston erikjohnston removed their assignment Feb 28, 2022
@tucnak

tucnak commented Mar 24, 2022

@richvdh First of all, thank you so much for putting in the work to battle this awful issue that has plagued Matrix since forever! Could you please be so kind as to provide a simple, up-to-date note or overview of the ongoing efforts, so that those of us eagerly awaiting, and possibly looking to contribute, can understand this better? I, for one, am very interested in resolving this, and I'm sure many homeserver maintainers would be happy to chip in too, but the codebase and its internals seem rather intimidating, given how little granularity there is in the issues filed for SYN-293.

@richvdh
Member

richvdh commented Mar 25, 2022

I've put some notes on progress up at https://hackmd.io/@richvdh/SkhcKnjKY https://hackmd.io/R20t3MTaSO6NvfPQh978QA.

@neoromantique

+1 This is very annoying and breaks new user UX immensely.

Even if it is not possible to speed it up, at least providing some feedback that it is happening in the background would be great

@tucnak

tucnak commented May 28, 2022

@richvdh 404 Not Found!

@mind-overflow

Hey, is there any progress on this? Joining big rooms still takes an annoyingly long time on homeservers.

@deepbluev7
Contributor

Please stop spamming this issue with "+1"s or "this affects me too". It's not helpful. If it affects you, add a thumbs-up to the issue.

As for progress, there is no writeup for it, but you can follow along by reading the changelogs and looking for lines containing "faster" and "join". For example, https://github.com/matrix-org/synapse/releases/tag/v1.60.0 mentions 2 things done as preparatory work.
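If you have a changelog saved as text, that search can be scripted. This is only a minimal sketch: the function name `faster_join_entries` is made up, and the sample lines in the usage note are invented for illustration rather than quoted from any real release.

```python
def faster_join_entries(changelog_lines):
    """Return the lines that mention both 'faster' and 'join', ignoring case."""
    hits = []
    for line in changelog_lines:
        lowered = line.lower()
        if "faster" in lowered and "join" in lowered:
            hits.append(line)
    return hits
```

For example, given the invented lines `["- Preparation for faster remote room joins.", "- Fix a typo in the docs.", "- Faster startup."]`, only the first one is returned.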

You can also take a look at https://github.com/matrix-org/synapse/milestone/6, which lists some of the work done on this. Some pull requests, like #12877, are also part of the faster joins work but are not explicitly listed.

So in general, you can find the current progress by using the search on this repository, since most PRs and issues mention "faster joins". As you can see it is actively being worked on, but also quite technical, so for most people the updates or explanations won't make sense.

Also, don't ask when it will be done. It will take longer every time you ask :3

@richvdh
Member

richvdh commented Jun 9, 2022

> @richvdh 404 Not Found!

Sorry about that; I've updated the link to my notepad. However, the milestones (https://github.com/matrix-org/synapse/milestone/6, https://github.com/matrix-org/synapse/milestone/8) are probably a better place to follow along.

@gorbehnare

Hello, I'm just trying to figure out if this is the problem I am seeing, or if something else is broken. I set up my first synapse server last week, and within two days I had everything working (including VoIP and video calls). By this point, I had already joined the public synapse-admins room and was able to chat, ask and answer questions there. As of this week, I keep getting "no known server" and "failed to join" errors after very long wait times when trying to join (seemingly) any public room.
In frustration, I scrapped the VM and installed a new VM with more RAM, a new domain, PostgreSQL, etc., all from scratch, and I am still facing this. What are the test cases for this? How can I find out if the issue is with my homeserver or just a general matrix issue here?

@richvdh
Member

richvdh commented Aug 1, 2022

> how can I find out if the issue is with my homeserver or just a general matrix issue here?

Find the /join request from your client to your server in your homeserver logs, then look at the other lines logged for that request (see https://matrix-org.github.io/synapse/latest/usage/administration/admin_faq.html#how-can-i-find-the-lines-corresponding-to-a-given-http-request-in-my-homeserver-log). They will give you a clue about why your join is failing.
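As a rough illustration of that procedure, the sketch below finds the request ID on any line that mentions a client `/join/` request and then returns every line tagged with the same ID. It assumes request IDs look like the `POST-726` tag in the access-log line quoted earlier in this thread; the function name and the regular expression are hypothetical, so adjust them to your own log format.

```python
import re

# Assumed request-ID shape: HTTP method, hyphen, counter (e.g. "POST-726").
REQUEST_ID = re.compile(r"\b(?:GET|POST|PUT|DELETE|OPTIONS)-\d+\b")


def lines_for_join_requests(log_lines):
    """Return all log lines that share a request ID with a client /join request."""
    log_lines = list(log_lines)

    # First pass: collect the IDs of requests whose path contains /join/.
    join_ids = set()
    for line in log_lines:
        if "/_matrix/client/" in line and "/join/" in line:
            join_ids.update(REQUEST_ID.findall(line))

    # Second pass: keep every line tagged with one of those IDs, in original order.
    return [
        line for line in log_lines
        if join_ids.intersection(REQUEST_ID.findall(line))
    ]
```

Two passes are used because the line that names the `/join` path may be logged after other lines belonging to the same request.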

@DMRobertson DMRobertson added A-Performance Performance, both client-facing and admin-facing S-Major Major functionality / product severely impaired, no satisfactory workaround. O-Frequent Affects or can be seen by most users regularly or impacts most users' first experience labels Sep 6, 2022
@clokep
Contributor

clokep commented May 17, 2023

It looks like the various bits of this have been implemented. All the milestones are empty. I'm sure there's more that could be done, but overall this seems good to file as separate, specific follow-ups.

@clokep clokep closed this as completed May 17, 2023