Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Route key requests to federation senders #15121

Closed
wants to merge 12 commits into from

Conversation

DMRobertson
Copy link
Contributor

@DMRobertson DMRobertson commented Feb 21, 2023

Fixes #14805, which says:

Ideally only a federation_sender (and potentially the main process) would need that ability. Therefore, it would be nice if a given set of worker names could be configured as those that should go out and fetch keys from remote locations, for which other workers not in the set would then make a HTTP replication request to perform the operation. This has implications for high-security network environments.

(emphasis mine).

I am loathe to add a config option for this; I have chosen to route all key requests via federation senders.

@DMRobertson DMRobertson added the Z-Time-Tracked Element employees should track their time spent on this issue/PR. label Feb 21, 2023
async def _fetch_keys(
self, keys_to_fetch: List[_FetchKeyRequest]
) -> Dict[str, Dict[str, FetchKeyResult]]:
# For simplicity's sake, pick a random federation sender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would want to choose the same federation sender that's talking to that instance? 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I thought, but federation_shard_config is a ShardedWorkerHandlingConfig rather than a class RoutableShardedWorkerHandlingConfig; only the latter defines get_instance publicly.

Personally I find this all incomprehensible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of it is that if all workers have the same federation_sender_instances config, then federation_shard_config can be upgraded to a RoutableShardedWorkerHandlingConfig, since all workers will route requests the same way.

@@ -0,0 +1 @@
Route remote key requests via federation senders.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(As I've left this comment on #15133 and #15134 I should just copy it here...)

I think we also need to update:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upgrade notes will need amending too; I think federation senders will need to be declared in the instance_map.

David Robertson added 2 commits February 23, 2023 23:55
This will help to break an import cycle
So I can call it from federation senders
@DMRobertson DMRobertson force-pushed the dmr/key-requests-from-fed-senders branch from 37c633b to 615c55f Compare February 24, 2023 00:32
@DMRobertson DMRobertson force-pushed the dmr/key-requests-from-fed-senders branch from 615c55f to 8f5f5f8 Compare February 24, 2023 00:33
@DMRobertson DMRobertson force-pushed the dmr/key-requests-from-fed-senders branch from 8f5f5f8 to 4e3d2e8 Compare February 24, 2023 00:41
@DMRobertson
Copy link
Contributor Author

This should now be working well enough to start review.

I have one outstanding concern. If this change lands as-is, every worker that can possibly need to fetch keys will need an entry for the federation_sender in its instance map. That means adding upgrade notes, changing sytest and Synapse etc and is something to trip server admins up.

We could add a config option e.g. workers/make_key_requests_via_federation_sender, but

a) I am loathe to any new config option. Indeed, that's why I chose to exfiltrate these requests via federation senders rather than add some new complex worker class/role.

b) There is likely to be future traffic that we might want to route to the external internet via federation senders only; so we would probably end up adding similar config options to workers/make_profile_requests_via_federation_sender, resulting in further config option soup.
c)

Could use input from the team to make that call: @matrix-org/synapse-core

  • New config option, admins don't have to change their instance_maps
  • No new config option, admins likely have to change their instance_maps

@MatMaul
Copy link
Contributor

MatMaul commented Feb 24, 2023

I would favor an option, but it would be nice to make it generic. Something like workers/make_outgoing_requests_via_federation_sender.
But this only make sense when all the work is done.
Perhaps introduce an experimental option for now, and once we are confident that we cover all the requests, remove them for a generic one ?

@H-Shay
Copy link
Contributor

H-Shay commented Feb 24, 2023

My two cents is that it makes more sense to just do the second option: it means confusion and annoyance in the short term while everyone gets up to speed, rather than the endless confusion and annoyance that results from a continuously expanding and complex-ifying config.

@DMRobertson DMRobertson changed the title [WIP] Route key requests to federation senders Route key requests to federation senders Feb 24, 2023
@DMRobertson
Copy link
Contributor Author

There are some unit tests that need fixing up. I would like us to decide how to proceed on this (config option or not) before fixing that up. I'll chase up on Monday.

@erikjohnston
Copy link
Member

I'm in several minds about this. I'm not mad keen about the faff of ensuring that everyone has correctly set up their instance_map config. OTOH if we think that routing all federation traffic via a federation sender, then so be it.

OTOH, I'm not really that keen with this approach. It feels like we're going to play whack-a-mole with lots of the federation requests, rather than take a more systematic approach.

I wonder if we should take this opportunity to go down a "federation proxy" route, either as an external util or as a dedicated config option federation_proxy: <worker_name>. Though it still involves adding a new config option, it somehow seems less weird to me?

@DMRobertson
Copy link
Contributor Author

I wonder if we should take this opportunity to go down a "federation proxy" route, either as an external util or as a dedicated config option federation_proxy: <worker_name>.

So, what, something like this?

  • config option to a worker/group of workers as "federation outbound proxies"
  • define a generic endpoint on federation outbound proxies which makes an HTTP request to another homeserver and returns the response.
  • change the MatrixFederationAgent in the code to use this new endpoint if the current worker is not a federation outbound proxy.
  • admins can configure the federation outbound proxies to be e.g. their federation senders

Copy link
Contributor

@squahtx squahtx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright functionally.

Server admins having to update their config sounds like a lot of faff. The alternative of making it an option isn't great either. It sounded like we're exploring whether we can make a federation proxy as an alternative to this PR, is that right?

Comment on lines +618 to +624
instance_map:
- federation_sender1:
- host: localhost
- port: 1001
- federation_sender2:
- host: localhost
- port: 1002
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
instance_map:
- federation_sender1:
- host: localhost
- port: 1001
- federation_sender2:
- host: localhost
- port: 1002
instance_map:
- federation_sender1:
- host: localhost
- port: 1001
- federation_sender2:
- host: localhost
- port: 1002

async def _fetch_keys(
self, keys_to_fetch: List[_FetchKeyRequest]
) -> Dict[str, Dict[str, FetchKeyResult]]:
# For simplicity's sake, pick a random federation sender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of it is that if all workers have the same federation_sender_instances config, then federation_shard_config can be upgraded to a RoutableShardedWorkerHandlingConfig, since all workers will route requests the same way.

Comment on lines +52 to +63
We would normally return a group of FetchKeyResponse structs like the
normal code path does, but FetchKeyResponse holds a nacl.signing.VerifyKey
which is not JSON-serialisable. Instead, for each requested key we respond
with a boolean: `true` meaning we fetched this key, and `false` meaning we
didn't.

The response takes the form:

200 OK
{
"fetched_count": 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these comments still accurate? I can't figure out what the endpoint returns just from the code below, but it doesn't seem to match the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think I had iterated since then and not updated the comment.

@DMRobertson
Copy link
Contributor Author

It sounded like we're exploring whether we can make a federation proxy as an alternative to this PR, is that right?

That's right. Let me close this.

@DMRobertson DMRobertson closed this Mar 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Z-Time-Tracked Element employees should track their time spent on this issue/PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add the ability to configure which workers will issue POST /_matrix/key/v2/query requests
6 participants