Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support MSC3916 by adding a federation /download endpoint #17172

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from

Conversation

H-Shay
Copy link
Contributor

@H-Shay H-Shay commented May 8, 2024

MSC3916 outlines the addition of a new federation endpoint to serve media downloads to other servers. This PR implements that endpoint and the new multipart/form-data response format that it returns. Future PRs will implement the rest of MSC3916, including altering the new client-server /download endpoint to use the new federation endpoint for remote download requests.

Should be reviewable commit-by-commit

@H-Shay H-Shay requested a review from a team as a code owner May 8, 2024 22:38
@H-Shay
Copy link
Contributor Author

H-Shay commented May 8, 2024

Does this require more tests? If so, what should be tested?

@H-Shay
Copy link
Contributor Author

H-Shay commented May 9, 2024

complement run failing with CreateDirtyDeployment failed: CreateDirtyServer: Failed to deploy image complement-synapse : Error response from daemon: Conflict. The container name "/complement_fed_dirty_hs1" is already in use by container "08dc1e60340dfb37e078f29731cb4d102e1149a6afb2f5f62fc41564fd9caeac". You have to remove (or rename) that container to be able to reuse that name. - pretty sure that's not this PR? Normally I would just hit the re-run button but I no longer have that power...

@anoadragon453
Copy link
Member

complement run failing with CreateDirtyDeployment failed: CreateDirtyServer: Failed to deploy image complement-synapse : Error response from daemon: Conflict. The container name "/complement_fed_dirty_hs1" is already in use by container "08dc1e60340dfb37e078f29731cb4d102e1149a6afb2f5f62fc41564fd9caeac". You have to remove (or rename) that container to be able to reuse that name. - pretty sure that's not this PR? Normally I would just hit the re-run button but I no longer have that power...

I think that error is a red herring. In the Synapse logs, I'm seeing the following exception:

  Error during startup:
  Traceback (most recent call last):
  pusher1 | 2024-05-09 12:22:43,513 - synapse.replication.tcp.redis - 292 - INFO - sentinel - Connecting to redis server localhost:6379
    File "/usr/local/lib/python3.11/site-packages/synapse/app/_base.py", line 258, in wrapper
      await cb(*args, **kwargs)
    File "/usr/local/lib/python3.11/site-packages/synapse/app/_base.py", line 594, in start
      hs.start_listening()
    File "/usr/local/lib/python3.11/site-packages/synapse/app/generic_worker.py", line 247, in start_listening
      self._listen_http(listener)
    File "/usr/local/lib/python3.11/site-packages/synapse/app/generic_worker.py", line 187, in _listen_http
      resources[FEDERATION_PREFIX] = TransportLayerServer(self)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/synapse/federation/transport/server/__init__.py", line 75, in __init__
      self.register_servlets()
    File "/usr/local/lib/python3.11/site-packages/synapse/federation/transport/server/__init__.py", line 78, in register_servlets
      register_servlets(
    File "/usr/local/lib/python3.11/site-packages/synapse/federation/transport/server/__init__.py", line 318, in register_servlets
      servletclass(
    File "/usr/local/lib/python3.11/site-packages/synapse/federation/transport/server/federation.py", line 812, in __init__
      self.media_repo = self.hs.get_media_repository()
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/synapse/server.py", line 220, in _get
      dep = builder(self)
            ^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/synapse/server.py", line 686, in get_media_repository
      return MediaRepository(self)
             ^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/synapse/media/media_repository.py", line 92, in __init__
      self.max_upload_size = hs.config.media.max_upload_size
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  AttributeError: 'ContentRepositoryConfig' object has no attribute 'max_upload_size'

The error about conflicting containers is certainly unhelpful though...

@H-Shay
Copy link
Contributor Author

H-Shay commented May 9, 2024

Weird! If it's an attribute error why did only one of the three complement tests fail? Don't they use the same image? I'm gonna dig deeper but I'm surprised...

Edit: looks like this was due to workers trying to instantiate the federation download servlet (which relies on the media repo) when the media repo has been disabled via config. I wondered why this wasn't picked up by the Sytest worker runs but I suspect that it is due to the fact that sytest doesn't appear to use the configure_workers_and_start script which explicitly disables the media repo on most workers. Anyhow I think the error is resolved now :)

@anoadragon453 anoadragon453 self-requested a review May 10, 2024 10:44
Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fantastic start - a few questions below. In addition, this functionality should be gated before an experimental config option.

synapse/media/media_storage.py Show resolved Hide resolved
synapse/media/media_storage.py Show resolved Hide resolved
synapse/media/media_storage.py Outdated Show resolved Hide resolved
synapse/media/media_storage.py Show resolved Hide resolved
@@ -55,13 +57,21 @@ async def store_file(self, path: str, file_info: FileInfo) -> None:
"""

@abc.abstractmethod
async def fetch(self, path: str, file_info: FileInfo) -> Optional[Responder]:
async def fetch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat unclear to me - I couldn't tell from the config documentation what defines a media_storage_provider - looking at the code that parses the config it uses the load_module function which loads a Synapse module, and there is a Synapse module FileStorageProvderBackend that exists in Synapse. Nothing in the docs/module docs mentioned anything about third-party storage providers so I figured it was not an external thing, but I could be wrong? Any insight is appreciated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, indeed the docs don't provide much insight here.

https://github.com/matrix-org/synapse-s3-storage-provider/ is an external storage provider that is widely used (including on matrix.org), and does implement this method: https://github.com/matrix-org/synapse-s3-storage-provider/blob/2c46a764f700e6439afa11c00db827ddf21a9e89/s3_storage_provider.py#L136-L148. This change would cause a TypeError in that library.

What we've done in the past for the Module API in this situation is to either:

  1. Define a new function with the new arguments, i.e. fetch_with_media_info.
  2. Check the parameter count of the module's implementation of the function, and add arguments accordingly. Module API example, another example. Note that you don't need to define an async_wrapper - that was backwards-compatibility from the time before we had async functions everywhere.

2 is probably cleaner from an external perspective.

We'll should also add a note to the changelog signalling to media storage provider module authors that they can take advantage of the enhanced metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created an issue to document this at #17193

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that complicates things - the federation parameter that I have added does change the form of the response from fetch, which is responsible for streaming the file to the consumer. In the code I've added, if the federation parameter is True, fetch will stream the multipart response, rather than just the file as it did previously. So I don't think?? that we can just drop the parameter if the provider doesn't support it, otherwise the response could be incorrect. I guess in that case adding a new function would be the best way forward, although AFAICT that means that third-party storage modules will be incompatible with MSC3916 until they implement the new function supporting a multipart response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How have things like this been handled in the past, as it seems like we need to change the behavior of an external entity, since there are external storage providers and the storage provider is responsible for streaming the response?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the more I think about this the more I am concerned that this is a significant problem

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is too hacky, but we could add a wrapper around the reader returned by fetch.

In the case of a federation request, we return the multipart response headers and JSON part, then stream the file from the wrapped reader for the second part. Essentially you'd have one IO reader reading from the one returned by fetch.

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly there! Just a few small things now.

synapse/media/media_storage.py Show resolved Hide resolved
synapse/media/media_storage.py Show resolved Hide resolved
tests/federation/test_federation_media.py Show resolved Hide resolved
synapse/media/media_storage.py Show resolved Hide resolved
synapse/media/_base.py Show resolved Hide resolved
synapse/federation/transport/server/_base.py Show resolved Hide resolved
@@ -55,13 +57,21 @@ async def store_file(self, path: str, file_info: FileInfo) -> None:
"""

@abc.abstractmethod
async def fetch(self, path: str, file_info: FileInfo) -> Optional[Responder]:
async def fetch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, indeed the docs don't provide much insight here.

https://github.com/matrix-org/synapse-s3-storage-provider/ is an external storage provider that is widely used (including on matrix.org), and does implement this method: https://github.com/matrix-org/synapse-s3-storage-provider/blob/2c46a764f700e6439afa11c00db827ddf21a9e89/s3_storage_provider.py#L136-L148. This change would cause a TypeError in that library.

What we've done in the past for the Module API in this situation is to either:

  1. Define a new function with the new arguments, i.e. fetch_with_media_info.
  2. Check the parameter count of the module's implementation of the function, and add arguments accordingly. Module API example, another example. Note that you don't need to define an async_wrapper - that was backwards-compatibility from the time before we had async functions everywhere.

2 is probably cleaner from an external perspective.

We'll should also add a note to the changelog signalling to media storage provider module authors that they can take advantage of the enhanced metadata.

@@ -55,13 +57,21 @@ async def store_file(self, path: str, file_info: FileInfo) -> None:
"""

@abc.abstractmethod
async def fetch(self, path: str, file_info: FileInfo) -> Optional[Responder]:
async def fetch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created an issue to document this at #17193

tests/federation/test_federation_media.py Show resolved Hide resolved
@H-Shay
Copy link
Contributor Author

H-Shay commented May 14, 2024

It seems very unlikely to me that the sytest failure is a result of the last round of changes as I only added a test, newlines, and a comment...

@H-Shay H-Shay requested a review from anoadragon453 May 14, 2024 22:57
Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems very unlikely to me that the sytest failure is a result of the last round of changes as I only added a test, newlines, and a comment...

Looks like a flake. I've re-ran the tests.

@@ -55,13 +57,21 @@ async def store_file(self, path: str, file_info: FileInfo) -> None:
"""

@abc.abstractmethod
async def fetch(self, path: str, file_info: FileInfo) -> Optional[Responder]:
async def fetch(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is too hacky, but we could add a wrapper around the reader returned by fetch.

In the case of a federation request, we return the multipart response headers and JSON part, then stream the file from the wrapped reader for the second part. Essentially you'd have one IO reader reading from the one returned by fetch.

synapse/media/_base.py Show resolved Hide resolved
synapse/federation/transport/server/_base.py Show resolved Hide resolved
if media_info.media_length is not None:
request.setHeader(b"Content-Length", b"%d" % (media_info.media_length,))
request.setHeader(
b"Content-Type", b"multipart/form-data; boundary=%s" % responder.boundary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should multipart/mixed I think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question :) #17172 (comment)

It's in the process of being changed. @H-Shay it would reduce reviewer confusion to get that bit updated in the MSC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, looks like the MSC thread isn't clear - it should be multipart/mixed for now.

erikjohnston pushed a commit that referenced this pull request May 24, 2024
#17213)

[MSC3916](https://github.com/matrix-org/matrix-spec-proposals/blob/rav/authentication-for-media/proposals/3916-authentication-for-media.md)
adds new media endpoints under `_matrix/client`. This PR adds the
`/preview_url`, `/config`, and `/thumbnail` endpoints. `/download` will
be added in a follow-up PR once the work for the federation `/download`
endpoint is complete (see
#17172).

Should be reviewable commit-by-commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants