Send SNI indication to support vhosts over federation (SYN-620) #1491
(Imported from https://matrix.org/jira/browse/SYN-620)
changed the title
Support vhosts over federation (https://github.com/matrix-org/synapse/issues/1491)
Nov 7, 2016
Install docs should probably be made more clear to account for this issue:
Is not possible with reverse-proxying. The readme states:
But in actuality it's not unreliable, it's impossible.
For work conducted by the core team it comes down to a question of priority - right now that means dealing with the massive growth on matrix.org, hence the bias towards performance in the common case.
With that in mind, community contributions much appreciated :)
I'm aware this conversation may be long past, but I just wanted to throw my 2¢ out there:
Because SRV never got mainstream traction I would expect SRV resolution to take place in user-space and then any client to make a standard request to https://SRV_HOST:SRV_PORT and validate that they have an SSL certificate for SRV_HOST. Doing it as is currently implemented has some interesting implications:
The one positive I see with the current implementation is that it allows for SRV records to point to IP addresses, and works around any issues about what certs to validate there.
Unfortunately, until the whole ecosystem of federated servers has upgraded, SNI still can't be relied on, since older servers won't send it.
Putting a homeserver behind SNI right now will mean you can only federate with a subset of up to date servers.
Unfortunately, there's also not a good summary of the "versions" present on the broader matrix network, so it's difficult for server operators to know when they can rely on SNI.
This relates to issue matrix-org/matrix.org#67 to some degree.
As a start, a server operator should check
However, it still breaks communication with other servers that aren't new enough, and there's really not a great way for a server operator to make an informed decision about when there are few enough active servers in the broader matrix fediverse of older versions that they're okay breaking them.
As far as I know, good tooling for doing the above isn't available, so I'll be writing some once-off postgres queries and scripts to do it for my server, but it's not really reasonable to expect every server operator to do that before using an LB which requires SNI.
@euank That is the case with every new feature. This issue just mentioned sending SNI support - not requiring it on the receiving side. That will stay the case for some time.
@krombel this is different than other features. Most of the time, if my server has a new feature X, old servers will simply not use it, but I can still receive messages from users on those servers.
SNI is special in that if I require SNI, there is no degradation; all old servers will simply get a certificate error and I won't be able to communicate with them at all. To my knowledge, there hasn't been any other such feature that had such a negative impact to use.
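To make concrete what "sending SNI" means at the TLS level, here is a minimal sketch using Python's standard `ssl` module. It is not Synapse's implementation; the hostname `matrix.example.org` is a placeholder. It builds the first handshake message (the ClientHello) in memory, with and without a server name, so no network connection is needed.

```python
import ssl


def client_hello_bytes(server_hostname=None):
    """Return the raw ClientHello a TLS client would send first.

    With server_hostname set, the hello carries the SNI extension;
    with None, it is the kind of hello an SNI-less sender produces.
    """
    ctx = ssl.create_default_context()
    if server_hostname is None:
        # Hostname checking requires a name, so it has to be off
        # when we deliberately omit SNI.
        ctx.check_hostname = False
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its hello and now waits
        # for a reply that never comes in this offline sketch.
        pass
    return outgoing.read()


with_sni = client_hello_bytes("matrix.example.org")
without_sni = client_hello_bytes()
print(b"matrix.example.org" in with_sni)
print(b"matrix.example.org" in without_sni)
```

A load balancer that routes on SNI only has that name in the hello to go on. A hello without it gives the balancer nothing to pick a vhost by, which is why the failure is total rather than a graceful degradation.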
They're two sides of the same coin. People want it to be sent so they can host synapse as they do other http endpoints: behind a service proxy, load balancer, ingress controller, whatever. The issue title even mentions "support vhosts" which is another way to say "require SNI on the receiving side".
Perhaps we should create a new issue for the receiving side, which would basically be figuring out when to update the readme (here), that is to say figuring out what criterion we need to meet in the broader matrix network before we can recommend running synapse behind some SNI-aware LB.
I think it's worse. User-agents can be re-written (e.g. if it goes through certain proxies or other things) and are a less specified behaviour in server-server communication. The server-server api at least documents the api I referenced.
I decided to check the adoption of this feature from my view of the network to determine whether I could finally put synapse behind a regular load balancer like all the other http services I run. The tl;dr is that I don't feel I can rely on this issue being fixed for a large enough percentage of federation clients yet, so I can't.
I'm sharing the information I collected to decide this below in case anyone else following this issue for a similar reason finds it useful.
The hello-matrix site conveniently offers a list of servers and their version, so I went ahead and checked what that data shows:
$ curl -s "https://www.hello-matrix.net/public_servers.php?format=json" | jq '. | select(.last_response == 200).server_version' -r | sort | uniq -c
      6 null
      1 0.26.0
      2 0.29.0
      2 0.30.0
      1 0.31.2
      2 0.32.2
      1 0.33.0
      4 0.33.2.1
      1 0.33.3
      4 0.33.3.1
      8 0.33.4
     10 0.33.5.1
     32 0.33.6
      1 0.33.7rc2
At the time of writing, of the 75 active servers tracked by hello-matrix, 19 (~25%) are on versions too old to send SNI headers. The lower bound is 17% if the 'null' versioned servers all support it, but it seems more likely that those servers are running very old versions.
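The 19-of-75 figure can be reproduced from the counts above. The sketch below assumes the cut-off is 0.33.3 (the threshold used for the "< version 0.33.3" split later in this comment) and treats `null` as unknown; the function names are illustrative only.

```python
import re

# Assumed cut-off: versions below this are counted as not sending SNI.
SNI_MIN_VERSION = (0, 33, 3)

# Counts copied from the `uniq -c` output above; None stands for "null".
COUNTS = {
    None: 6, "0.26.0": 1, "0.29.0": 2, "0.30.0": 2, "0.31.2": 1,
    "0.32.2": 2, "0.33.0": 1, "0.33.2.1": 4, "0.33.3": 1,
    "0.33.3.1": 4, "0.33.4": 8, "0.33.5.1": 10, "0.33.6": 32,
    "0.33.7rc2": 1,
}


def parse_version(version):
    """Turn '0.33.7rc2' into (0, 33, 7); return None if unparseable."""
    if not version:
        return None
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def tally(counts):
    """Split server counts into too_old / new_enough / unknown."""
    result = {"too_old": 0, "new_enough": 0, "unknown": 0}
    for version, n in counts.items():
        parsed = parse_version(version)
        if parsed is None:
            result["unknown"] += n
        elif parsed < SNI_MIN_VERSION:
            result["too_old"] += n
        else:
            result["new_enough"] += n
    return result


t = tally(COUNTS)
total = sum(t.values())
print(t, total)
# Upper bound counts unknown servers as too old; lower bound does not.
print((t["too_old"] + t["unknown"]) / total, t["too_old"] / total)
```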
Now, this isn't really representative because most servers don't add themselves to the list, and those that do are probably more closely involved in the matrix ecosystem and upgrading.
$ psql .....
database=# COPY (SELECT server_name FROM server_keys_json) TO '/tmp/servers-from-keys.csv' WITH CSV DELIMITER ',';
$ wc -l /tmp/servers-from-keys.csv
3146 /tmp/servers-from-keys.csv
$ ./fetch-version-stats.sh < /tmp/servers-from-keys.csv > server_stats.csv
(script here if anyone wants it
Looking at the information from servers I have ever federated with, I get 1919 servers that no longer respond (typically because the operator is no longer running a synapse server for whatever reason), 446 that are < version 0.33.3, and 781 >= 0.33.3.
That means that putting synapse's federation endpoint behind a load balancer that requires SNI partitions me from about 36% of the still-active servers mine has interacted with (446 of 1,227).
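For anyone wanting to repeat this, here is a rough Python sketch of a version check. It is not the `fetch-version-stats.sh` script mentioned above. It assumes each homeserver answers directly on port 8448 (no SRV lookup) and uses the `GET /_matrix/federation/v1/version` endpoint from the server-server API. It also uses default certificate verification, so servers with self-signed certificates would be counted as unreachable.

```python
import http.client
import json
import urllib.request

# Assumption: direct connection on the default federation port.
VERSION_URL = "https://{server}:8448/_matrix/federation/v1/version"


def parse_version_response(body):
    """Extract the version string from a version response, or None."""
    try:
        return json.loads(body)["server"]["version"]
    except (ValueError, KeyError, TypeError):
        return None


def fetch_version(server, timeout=10):
    """Return the version a server reports, or None if it does not answer."""
    url = VERSION_URL.format(server=server)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_version_response(resp.read())
    except (OSError, http.client.HTTPException):
        return None


def share_too_old(too_old, new_enough):
    """Fraction of responding servers that are too old to send SNI."""
    return too_old / (too_old + new_enough)


# Figures from the comment above: 446 servers < 0.33.3, 781 >= 0.33.3.
print(round(share_too_old(446, 781), 3))
```

Servers that return `None` correspond to the "no longer respond" bucket and are left out of the denominator.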
Of course, there's still one more statistic which is more useful: what about servers I've recently interacted with?
Let's also look at the servers I've specifically talked to in the last 1 month period:
SELECT split_part(sender, ':', 2) as server
FROM events
WHERE received_ts > (extract(epoch from TIMESTAMP 'now'::timestamp - '1 month'::interval) * 1000)
GROUP BY server;
Throwing the output of that query through my stats process tells me that 17% of the servers I've specifically received events from over the last 1 month period are on synapse versions too old to send SNI headers.
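One caveat for anyone reusing the query: `split_part(sender, ':', 2)` returns only the second colon-separated field, so a server name with an explicit port (e.g. `example.org:8448`) would lose the port. A small Python sketch of the same extraction that keeps everything after the first colon, with made-up user IDs:

```python
def server_from_sender(sender):
    """Return the server part of a Matrix user ID such as '@alice:example.org'."""
    localpart, sep, server = sender.partition(":")
    if not sep or not localpart.startswith("@") or not server:
        raise ValueError("not a Matrix user ID: %r" % (sender,))
    return server


def distinct_servers(senders):
    """Deduplicate senders down to a sorted list of server names."""
    return sorted({server_from_sender(s) for s in senders})


print(distinct_servers([
    "@alice:example.org",
    "@bob:example.org",
    "@carol:example.net:8448",
]))
```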
In summary, I don't think we can yet rely on SNI headers for our synapse setups unless we're okay with partitioning ourselves off from a subset of the fediverse.
It does seem like the majority of servers are up to date, but there's still enough lagging behind that I'm personally going to continue to dedicate a wonky special ingress setup just to matrix.