From b43927793ce8f425115bbddec9f914c297aba350 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sat, 24 Aug 2019 16:37:39 +0300 Subject: [PATCH 01/27] Proposal for asynchronous media uploads Signed-off-by: Tulir Asokan --- proposals/2246-asynchronous-uploads.md | 46 ++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) create mode 100644 proposals/2246-asynchronous-uploads.md diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md new file mode 100644 index 00000000000..1312be7c53b --- /dev/null +++ b/proposals/2246-asynchronous-uploads.md @@ -0,0 +1,46 @@ +# Asynchronous media uploads +Sending media to Matrix currently requires that clients first upload the media +to the content repository and then send the event. This is a problem for some +use cases, such as bridges that want to preserve message order, as reuploading +a large file would block all messages. + +## Proposal +This proposal proposes a way to send the event containing media before actually +uploading the media, which would make the aformentioned bridge message order +preservation possible without blocking all other messages behind a long upload. + +In the future, this new functionality could be used for streaming file +transfers, as requested in [#1885]. + +### Content repository behavior +The proposal adds two new endpoints to the content repository API and modifies +the behavior of the download endpoint. + +#### `POST /_matrix/media/r0/create` +Create a new MXC URI without content. Like `/upload`, this endpoint returns the +`content_uri` that can be used in events. Additionally, the response would +contain a `media_id` field with the media ID part of the MXC URI for +convenience (see [MXC format] in the spec). + +#### `PUT /_matrix/media/r0/upload/{mediaId}` +Upload content to a MXC URI that was created earlier. If the endpoint is called +with a media ID that already has content, the request should be rejected with +the error code `M_CANNOT_OVERRIDE_MEDIA`. + +#### Behavior change in `/download` +When another client tries to download media that has not yet been uploaded, the +content repository should stall the request until it is uploaded. Optionally, +content repository implementations may send the data that has already been +uploaded and stream data as it comes in from the sender. + +## Tradeoffs + +## Potential issues +Other clients may time out the download if the sender takes too long to upload +media. + +## Security considerations + + +[#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 +[MXC format]: https://matrix.org/docs/spec/client_server/latest#matrix-content-mxc-uris From a83c79c8e69af26a0ec85adaeb163cceef113e14 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sat, 24 Aug 2019 17:45:14 +0300 Subject: [PATCH 02/27] Add security consideration and mention possible /create request body Signed-off-by: Tulir Asokan --- proposals/2246-asynchronous-uploads.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 1312be7c53b..51c38d8ead4 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -22,6 +22,12 @@ Create a new MXC URI without content. Like `/upload`, this endpoint returns the contain a `media_id` field with the media ID part of the MXC URI for convenience (see [MXC format] in the spec). +This endpoint could have a JSON body that contains metadata, such as the mime +type of the media that's going to be uploaded. In the future, the body could +also contain access control settings (related: [MSC701]). + +TODO: decide if body is needed already and if yes, spec body schema + #### `PUT /_matrix/media/r0/upload/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with @@ -40,7 +46,9 @@ Other clients may time out the download if the sender takes too long to upload media. ## Security considerations +There may be DoS risks in creating lots of media IDs. [#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 [MXC format]: https://matrix.org/docs/spec/client_server/latest#matrix-content-mxc-uris +[MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From 9a395ed0ab19172cc77eaf648b52f8a8ff5a0ac7 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sun, 25 Aug 2019 20:57:28 +0300 Subject: [PATCH 03/27] Expand on security consideration --- proposals/2246-asynchronous-uploads.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 51c38d8ead4..de86b83cb9a 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -46,8 +46,9 @@ Other clients may time out the download if the sender takes too long to upload media. ## Security considerations -There may be DoS risks in creating lots of media IDs. - +There may be DoS risks in creating lots of media IDs. Especially if those media +IDs are then sent to rooms, but not given any content, the proposed stalling +method could cause a lot of open HTTP connections. [#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 [MXC format]: https://matrix.org/docs/spec/client_server/latest#matrix-content-mxc-uris From 29e3463e2e330b040870241f226326127e05828b Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Mon, 26 Aug 2019 22:51:10 +0300 Subject: [PATCH 04/27] Change error code for existing media PUT Co-Authored-By: Travis Ralston --- proposals/2246-asynchronous-uploads.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index de86b83cb9a..e675cc4a70a 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -31,7 +31,7 @@ TODO: decide if body is needed already and if yes, spec body schema #### `PUT /_matrix/media/r0/upload/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with -the error code `M_CANNOT_OVERRIDE_MEDIA`. +the error code `M_CANNOT_OVERWRITE_MEDIA`. #### Behavior change in `/download` When another client tries to download media that has not yet been uploaded, the From 7cf22bebd3179199f016a80540b1cde32a8abf95 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Mon, 26 Aug 2019 23:27:00 +0300 Subject: [PATCH 05/27] Add spam prevention error, example response and auth requirements --- proposals/2246-asynchronous-uploads.md | 37 +++++++++++++++++++------- 1 file changed, 27 insertions(+), 10 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index e675cc4a70a..fd07fd1ddc9 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -17,21 +17,39 @@ The proposal adds two new endpoints to the content repository API and modifies the behavior of the download endpoint. #### `POST /_matrix/media/r0/create` -Create a new MXC URI without content. Like `/upload`, this endpoint returns the -`content_uri` that can be used in events. Additionally, the response would -contain a `media_id` field with the media ID part of the MXC URI for -convenience (see [MXC format] in the spec). +Create a new MXC URI without content. Like `/upload`, this endpoint requires +auth and returns the `content_uri` that can be used in events. This endpoint could have a JSON body that contains metadata, such as the mime type of the media that's going to be uploaded. In the future, the body could also contain access control settings (related: [MSC701]). +To prevent spam, servers should limit calls to this endpoint if the user has +created many media IDs without uploading any content to them. The error code +for such a spam limit is `M_TOO_MANY_IDS_ASSIGNED`. Servers should also provide +configuration options to let high-traffic clients like application services +bypass these limits. The `rate_limited` flag in the appservice registration is +one potential way to do this. + +##### Example request TODO: decide if body is needed already and if yes, spec body schema -#### `PUT /_matrix/media/r0/upload/{mediaId}` +##### Example response +```json +{ + "content_uri": "mxc://example.com/AQwafuaFswefuhsfAFAgsw" +} +``` + +#### `PUT /_matrix/media/r0/upload/{serverName}/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with -the error code `M_CANNOT_OVERWRITE_MEDIA`. +the error code `M_CANNOT_OVERWRITE_MEDIA`. This endpoint too requires auth. + +If the upload is successful, an empty JSON object and status code 200 is +returned. If the serverName/mediaId combination is not known or not local, an +`M_NOT_FOUND` error is returned. For other errors, such as file size, file type +or user quota errors, the normal `/upload` rules apply. #### Behavior change in `/download` When another client tries to download media that has not yet been uploaded, the @@ -39,6 +57,8 @@ content repository should stall the request until it is uploaded. Optionally, content repository implementations may send the data that has already been uploaded and stream data as it comes in from the sender. +TODO: this should have at least an option to not stall if there's no data yet + ## Tradeoffs ## Potential issues @@ -46,10 +66,7 @@ Other clients may time out the download if the sender takes too long to upload media. ## Security considerations -There may be DoS risks in creating lots of media IDs. Especially if those media -IDs are then sent to rooms, but not given any content, the proposed stalling -method could cause a lot of open HTTP connections. + [#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 -[MXC format]: https://matrix.org/docs/spec/client_server/latest#matrix-content-mxc-uris [MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From 658aac8437cf28276be865e56711ae095f11c5ec Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sun, 2 Aug 2020 01:38:33 +0300 Subject: [PATCH 06/27] Add query parameters to control downloading --- proposals/2246-asynchronous-uploads.md | 40 ++++++++++++++++++++------ 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index fd07fd1ddc9..b07ea350415 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -14,7 +14,7 @@ transfers, as requested in [#1885]. ### Content repository behavior The proposal adds two new endpoints to the content repository API and modifies -the behavior of the download endpoint. +the download endpoint. #### `POST /_matrix/media/r0/create` Create a new MXC URI without content. Like `/upload`, this endpoint requires @@ -44,20 +44,35 @@ TODO: decide if body is needed already and if yes, spec body schema #### `PUT /_matrix/media/r0/upload/{serverName}/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with -the error code `M_CANNOT_OVERWRITE_MEDIA`. This endpoint too requires auth. +the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP status code 409. This endpoint +requires auth. If the upload is successful, an empty JSON object and status code 200 is returned. If the serverName/mediaId combination is not known or not local, an `M_NOT_FOUND` error is returned. For other errors, such as file size, file type or user quota errors, the normal `/upload` rules apply. -#### Behavior change in `/download` -When another client tries to download media that has not yet been uploaded, the -content repository should stall the request until it is uploaded. Optionally, -content repository implementations may send the data that has already been -uploaded and stream data as it comes in from the sender. +#### Changes to the `/download` endpoint +Two new query parameters are added: `max_stall` and `allow_stream`. -TODO: this should have at least an option to not stall if there's no data yet +* `max_stall` is an integer that specifies the maximum number of milliseconds + that the client is willing to wait for the upload to start (or finish, when + streaming is not enabled). The default value is 10000 (10 seconds). +* `allow_stream` is a boolean that specifies whether or not the content + repository should stream data as it comes in from the sender. Defaults to + `true`. + +When another client tries to download media that is currently being uploaded, +the content repository should send already uploaded data and stream new data as +it comes in from the sender. + +If the upload is not started after the time limit, the content repository +should return a `M_NOT_YET_UPLOADED` error. The error may include an additional +`retry_after_ms` field to suggest when the client should try again. + +If streaming is not supported by the server, or if it's disabled with the query +parameter, the content repository should act as if the upload has not started +until the upload is finished. ## Tradeoffs @@ -67,6 +82,15 @@ media. ## Security considerations +## Unstable prefix +While this MSC is not in a released version of the spec, implementations should +use `net.maunium.msc2246` as a prefix and as a `unstable_features` flag in the +`/versions` endpoint. + +* `POST /_matrix/media/unstable/net.maunium.msc2246/create` +* `PUT /_matrix/media/unstable/net.maunium.msc2246/upload/{serverName}/{mediaId}` +* `?net.maunium.msc2246.max_stall` +* `?net.maunium.msc2246.allow_stream` [#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 [MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From bbd7d089459e91085e77e5a95ed1c459c2347ad6 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sun, 2 Aug 2020 01:40:08 +0300 Subject: [PATCH 07/27] Rename max_stall to max_stall_ms --- proposals/2246-asynchronous-uploads.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index b07ea350415..65dfcc6c3cb 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -53,9 +53,9 @@ returned. If the serverName/mediaId combination is not known or not local, an or user quota errors, the normal `/upload` rules apply. #### Changes to the `/download` endpoint -Two new query parameters are added: `max_stall` and `allow_stream`. +Two new query parameters are added: `max_stall_ms` and `allow_stream`. -* `max_stall` is an integer that specifies the maximum number of milliseconds +* `max_stall_ms` is an integer that specifies the maximum number of milliseconds that the client is willing to wait for the upload to start (or finish, when streaming is not enabled). The default value is 10000 (10 seconds). * `allow_stream` is a boolean that specifies whether or not the content @@ -89,7 +89,7 @@ use `net.maunium.msc2246` as a prefix and as a `unstable_features` flag in the * `POST /_matrix/media/unstable/net.maunium.msc2246/create` * `PUT /_matrix/media/unstable/net.maunium.msc2246/upload/{serverName}/{mediaId}` -* `?net.maunium.msc2246.max_stall` +* `?net.maunium.msc2246.max_stall_ms` * `?net.maunium.msc2246.allow_stream` [#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 From 4d009a9cee18ebfccbe80a6617707e607ac08901 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Sun, 2 Aug 2020 01:44:32 +0300 Subject: [PATCH 08/27] Specify /create request body content --- proposals/2246-asynchronous-uploads.md | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 65dfcc6c3cb..f170b27153e 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -20,9 +20,9 @@ the download endpoint. Create a new MXC URI without content. Like `/upload`, this endpoint requires auth and returns the `content_uri` that can be used in events. -This endpoint could have a JSON body that contains metadata, such as the mime -type of the media that's going to be uploaded. In the future, the body could -also contain access control settings (related: [MSC701]). +The request body should be an empty JSON object. In the future, the body could +be used for metadata about the file, such as the mime type or access control +settings (related: [MSC701]). To prevent spam, servers should limit calls to this endpoint if the user has created many media IDs without uploading any content to them. The error code @@ -31,9 +31,6 @@ configuration options to let high-traffic clients like application services bypass these limits. The `rate_limited` flag in the appservice registration is one potential way to do this. -##### Example request -TODO: decide if body is needed already and if yes, spec body schema - ##### Example response ```json { From 1cbc04e143919d828e69788102cdc827cdc283a2 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Mon, 14 Mar 2022 12:29:43 +0200 Subject: [PATCH 09/27] Update links, versions and prefixes --- proposals/2246-asynchronous-uploads.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index f170b27153e..8cf4347fa21 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -10,13 +10,13 @@ uploading the media, which would make the aformentioned bridge message order preservation possible without blocking all other messages behind a long upload. In the future, this new functionality could be used for streaming file -transfers, as requested in [#1885]. +transfers, as requested in [matrix-spec#432]. ### Content repository behavior The proposal adds two new endpoints to the content repository API and modifies the download endpoint. -#### `POST /_matrix/media/r0/create` +#### `POST /_matrix/media/v3/create` Create a new MXC URI without content. Like `/upload`, this endpoint requires auth and returns the `content_uri` that can be used in events. @@ -38,7 +38,7 @@ one potential way to do this. } ``` -#### `PUT /_matrix/media/r0/upload/{serverName}/{mediaId}` +#### `PUT /_matrix/media/v3/upload/{serverName}/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP status code 409. This endpoint @@ -81,13 +81,13 @@ media. ## Unstable prefix While this MSC is not in a released version of the spec, implementations should -use `net.maunium.msc2246` as a prefix and as a `unstable_features` flag in the +use `fi.mau.msc2246` as a prefix and as a `unstable_features` flag in the `/versions` endpoint. -* `POST /_matrix/media/unstable/net.maunium.msc2246/create` -* `PUT /_matrix/media/unstable/net.maunium.msc2246/upload/{serverName}/{mediaId}` -* `?net.maunium.msc2246.max_stall_ms` -* `?net.maunium.msc2246.allow_stream` +* `POST /_matrix/media/unstable/fi.mau.msc2246/create` +* `PUT /_matrix/media/unstable/fi.mau.msc2246/upload/{serverName}/{mediaId}` +* `?fi.mau.msc2246.max_stall_ms` +* `?fi.mau.msc2246.allow_stream` -[#1885]: https://github.com/matrix-org/matrix-doc/issues/1885 +[matrix-spec#432]: https://github.com/matrix-org/matrix-spec/issues/432 [MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From c65f2bfcdecb995456c5c8b3bf287e02db0a4c3f Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Mon, 14 Mar 2022 12:44:35 +0200 Subject: [PATCH 10/27] Add recommended maximum delay for starting upload --- proposals/2246-asynchronous-uploads.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 8cf4347fa21..ce4849f5eca 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -31,10 +31,17 @@ configuration options to let high-traffic clients like application services bypass these limits. The `rate_limited` flag in the appservice registration is one potential way to do this. +The server may optionally enforce a maximum age for unused media IDs to delete +media IDs when the client doesn't start the upload in time, or when the upload +was interrupted and not resumed in time. The server should include the maximum +timestamp to start the upload in the `unused_expires_at` field in the response +JSON. The recommended default expiration for starting the upload is 1 minute. + ##### Example response ```json { - "content_uri": "mxc://example.com/AQwafuaFswefuhsfAFAgsw" + "content_uri": "mxc://example.com/AQwafuaFswefuhsfAFAgsw", + "unused_expires_at": 1647257217083 } ``` From 63cef508abe85cab3fc8e9e7e9c707fd16c3ada4 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Mon, 14 Mar 2022 12:44:47 +0200 Subject: [PATCH 11/27] Explicitly deny uploading to other users' media IDs --- proposals/2246-asynchronous-uploads.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index ce4849f5eca..80789598128 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -48,8 +48,9 @@ JSON. The recommended default expiration for starting the upload is 1 minute. #### `PUT /_matrix/media/v3/upload/{serverName}/{mediaId}` Upload content to a MXC URI that was created earlier. If the endpoint is called with a media ID that already has content, the request should be rejected with -the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP status code 409. This endpoint -requires auth. +the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP status code 409. The endpoint +should also reject upload requests from users other than the user who created +the media ID. This endpoint requires auth. If the upload is successful, an empty JSON object and status code 200 is returned. If the serverName/mediaId combination is not known or not local, an From 8ccf85fac720d28b11ea86746e7afb525fd49c1e Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Fri, 18 Mar 2022 13:14:49 +0200 Subject: [PATCH 12/27] Prevent spam by ratelimiting instead of limiting number of IDs --- proposals/2246-asynchronous-uploads.md | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 80789598128..5960145b9d2 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -24,19 +24,16 @@ The request body should be an empty JSON object. In the future, the body could be used for metadata about the file, such as the mime type or access control settings (related: [MSC701]). -To prevent spam, servers should limit calls to this endpoint if the user has -created many media IDs without uploading any content to them. The error code -for such a spam limit is `M_TOO_MANY_IDS_ASSIGNED`. Servers should also provide -configuration options to let high-traffic clients like application services -bypass these limits. The `rate_limited` flag in the appservice registration is -one potential way to do this. - The server may optionally enforce a maximum age for unused media IDs to delete media IDs when the client doesn't start the upload in time, or when the upload was interrupted and not resumed in time. The server should include the maximum timestamp to start the upload in the `unused_expires_at` field in the response JSON. The recommended default expiration for starting the upload is 1 minute. +The server should also rate limit requests to create media. When combined with +the maximum age for created media IDs, it effectively prevents spam by creating +lots of unused ids. + ##### Example response ```json { From 12e907b9535c5f2ddd13bdf436897064936816bf Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Fri, 18 Mar 2022 13:15:17 +0200 Subject: [PATCH 13/27] Remove streaming requirement It's complicated --- proposals/2246-asynchronous-uploads.md | 48 ++++++++++++-------------- 1 file changed, 23 insertions(+), 25 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 5960145b9d2..a3ce49cfd05 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -54,45 +54,43 @@ returned. If the serverName/mediaId combination is not known or not local, an `M_NOT_FOUND` error is returned. For other errors, such as file size, file type or user quota errors, the normal `/upload` rules apply. -#### Changes to the `/download` endpoint -Two new query parameters are added: `max_stall_ms` and `allow_stream`. - -* `max_stall_ms` is an integer that specifies the maximum number of milliseconds - that the client is willing to wait for the upload to start (or finish, when - streaming is not enabled). The default value is 10000 (10 seconds). -* `allow_stream` is a boolean that specifies whether or not the content - repository should stream data as it comes in from the sender. Defaults to - `true`. - -When another client tries to download media that is currently being uploaded, -the content repository should send already uploaded data and stream new data as -it comes in from the sender. - -If the upload is not started after the time limit, the content repository -should return a `M_NOT_YET_UPLOADED` error. The error may include an additional -`retry_after_ms` field to suggest when the client should try again. - -If streaming is not supported by the server, or if it's disabled with the query -parameter, the content repository should act as if the upload has not started -until the upload is finished. - -## Tradeoffs +The client should include a `Content-Length` header to ensure the server knows +if the file was uploaded entirely. If the server receives a different amount of +data than specified in the header, the upload must fail. + +#### Changes to the `/download` and `/thumbnail` endpoints +A new query parameter, `max_stall_ms` is added to the endpoints that can +download media. It's an integer that specifies the maximum number of +milliseconds that the client is willing to wait to start receiving data. +The default value is 20000 (20 seconds). + +If the data is not available before the specified time is up, the content +repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 404 status code. +The error may include an additional `retry_after_ms` field to suggest when the +client should try again. + +For the `/download` endpoint, the server could also stream data directly as it +is being uploaded. However, streaming creates several implementation and spec +complications (e.g. how to stream if the media repo has multiple workers, what +to do if the upload is interrupted), so specifying exactly how streaming works +is left for another MSC. ## Potential issues Other clients may time out the download if the sender takes too long to upload media. +## Alternatives + ## Security considerations ## Unstable prefix While this MSC is not in a released version of the spec, implementations should -use `fi.mau.msc2246` as a prefix and as a `unstable_features` flag in the +use `fi.mau.msc2246` as a prefix and as an `unstable_features` flag in the `/versions` endpoint. * `POST /_matrix/media/unstable/fi.mau.msc2246/create` * `PUT /_matrix/media/unstable/fi.mau.msc2246/upload/{serverName}/{mediaId}` * `?fi.mau.msc2246.max_stall_ms` -* `?fi.mau.msc2246.allow_stream` [matrix-spec#432]: https://github.com/matrix-org/matrix-spec/issues/432 [MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From d582bb309a8a85d5f32905ce234aae2711d6556a Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Fri, 18 Mar 2022 13:35:17 +0200 Subject: [PATCH 14/27] Add unstable prefixes for new error codes --- proposals/2246-asynchronous-uploads.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index a3ce49cfd05..28fbc3b3836 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -91,6 +91,8 @@ use `fi.mau.msc2246` as a prefix and as an `unstable_features` flag in the * `POST /_matrix/media/unstable/fi.mau.msc2246/create` * `PUT /_matrix/media/unstable/fi.mau.msc2246/upload/{serverName}/{mediaId}` * `?fi.mau.msc2246.max_stall_ms` +* `FI.MAU.MSC2246_NOT_YET_UPLOADED` +* `FI.MAU.MSC2246_CANNOT_OVERWRITE_MEDIA` [matrix-spec#432]: https://github.com/matrix-org/matrix-spec/issues/432 [MSC701]: https://github.com/matrix-org/matrix-doc/issues/701 From 173edf38f0e586fa5fe413d5494be72319279fe4 Mon Sep 17 00:00:00 2001 From: Tulir Asokan Date: Fri, 18 Mar 2022 15:00:00 +0200 Subject: [PATCH 15/27] Add missing words --- proposals/2246-asynchronous-uploads.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 28fbc3b3836..fe6249a4399 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -14,7 +14,7 @@ transfers, as requested in [matrix-spec#432]. ### Content repository behavior The proposal adds two new endpoints to the content repository API and modifies -the download endpoint. +the download and thumbnail endpoints. #### `POST /_matrix/media/v3/create` Create a new MXC URI without content. Like `/upload`, this endpoint requires From f438754b55ed300f10ea5f1380e0c6c86c4f86ff Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:03:54 -0600 Subject: [PATCH 16/27] Change /create endpoint to use v1 Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index fe6249a4399..011656cd814 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -16,7 +16,7 @@ transfers, as requested in [matrix-spec#432]. The proposal adds two new endpoints to the content repository API and modifies the download and thumbnail endpoints. -#### `POST /_matrix/media/v3/create` +#### `POST /_matrix/media/v1/create` Create a new MXC URI without content. Like `/upload`, this endpoint requires auth and returns the `content_uri` that can be used in events. From 725675c7f0fa8ae5f709757f4365d4a6f719806b Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:17:54 -0600 Subject: [PATCH 17/27] Reorganize /upload spec and integrate feedback * Explicitly specify that M_NOT_FOUND should be used for expired media * Explicitly specify that M_FORBIDDEN should be used when a user other than the one who created the media ID tries to upload to it * Remove content-length failure note Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 30 ++++++++++++++------------ 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 011656cd814..1f38b7c7d93 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -43,20 +43,22 @@ lots of unused ids. ``` #### `PUT /_matrix/media/v3/upload/{serverName}/{mediaId}` -Upload content to a MXC URI that was created earlier. If the endpoint is called -with a media ID that already has content, the request should be rejected with -the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP status code 409. The endpoint -should also reject upload requests from users other than the user who created -the media ID. This endpoint requires auth. - -If the upload is successful, an empty JSON object and status code 200 is -returned. If the serverName/mediaId combination is not known or not local, an -`M_NOT_FOUND` error is returned. For other errors, such as file size, file type -or user quota errors, the normal `/upload` rules apply. - -The client should include a `Content-Length` header to ensure the server knows -if the file was uploaded entirely. If the server receives a different amount of -data than specified in the header, the upload must fail. +Upload content to a MXC URI that was created earlier. This endpoint requires +auth. If the upload is successful, an empty JSON object and status code 200 is +returned. + +If the endpoint is called with a media ID that already has content, the request +should be rejected with the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP +status code 409. + +If the upload request comes from a user other than the one who created the media +ID, the request should be rejected with an `M_FORBIDDEN` error. + +If the serverName/mediaId combination is not known, not local, or expired, an +`M_NOT_FOUND` error is returned. + +For other errors, such as file size, file type or user quota errors, the normal +`/upload` rules apply. #### Changes to the `/download` and `/thumbnail` endpoints A new query parameter, `max_stall_ms` is added to the endpoints that can From d55f1f924524ab0645088712f6430315f4fd813c Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:19:33 -0600 Subject: [PATCH 18/27] Rename max_stall_ms -> timeout_ms Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 1f38b7c7d93..7209cd942ab 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -61,7 +61,7 @@ For other errors, such as file size, file type or user quota errors, the normal `/upload` rules apply. #### Changes to the `/download` and `/thumbnail` endpoints -A new query parameter, `max_stall_ms` is added to the endpoints that can +A new query parameter, `timeout_ms` is added to the endpoints that can download media. It's an integer that specifies the maximum number of milliseconds that the client is willing to wait to start receiving data. The default value is 20000 (20 seconds). @@ -92,7 +92,7 @@ use `fi.mau.msc2246` as a prefix and as an `unstable_features` flag in the * `POST /_matrix/media/unstable/fi.mau.msc2246/create` * `PUT /_matrix/media/unstable/fi.mau.msc2246/upload/{serverName}/{mediaId}` -* `?fi.mau.msc2246.max_stall_ms` +* `?fi.mau.msc2246.timeout_ms` * `FI.MAU.MSC2246_NOT_YET_UPLOADED` * `FI.MAU.MSC2246_CANNOT_OVERWRITE_MEDIA` From 955177b45d983c2c465f7e9dbc5cf4fb67368ca9 Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:20:12 -0600 Subject: [PATCH 19/27] Mention that maximum value for timeout_ms should be imposed by the server Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 7209cd942ab..a8eb6162b29 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -64,7 +64,8 @@ For other errors, such as file size, file type or user quota errors, the normal A new query parameter, `timeout_ms` is added to the endpoints that can download media. It's an integer that specifies the maximum number of milliseconds that the client is willing to wait to start receiving data. -The default value is 20000 (20 seconds). +The default value is 20000 (20 seconds). The server can and should impose a +maximum value for this parameter. If the data is not available before the specified time is up, the content repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 404 status code. From 823fcca849bfc22d1b29d6b8a8f1ef5818706edd Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:21:21 -0600 Subject: [PATCH 20/27] Mention that the timeout_ms can be ignored if the media exists already Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index a8eb6162b29..efb1e03593d 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -67,6 +67,9 @@ milliseconds that the client is willing to wait to start receiving data. The default value is 20000 (20 seconds). The server can and should impose a maximum value for this parameter. +If the media is available immediately (for example in the case of a +non-asynchronous upload), the server should ignore this parameter. + If the data is not available before the specified time is up, the content repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 404 status code. The error may include an additional `retry_after_ms` field to suggest when the From 3b0002680c09ca48ab805b1d136762200c202eef Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:21:55 -0600 Subject: [PATCH 21/27] Change M_NOT_YET_UPLOADED to use 504 status code Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index efb1e03593d..33e7914897c 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -71,7 +71,7 @@ If the media is available immediately (for example in the case of a non-asynchronous upload), the server should ignore this parameter. If the data is not available before the specified time is up, the content -repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 404 status code. +repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 504 status code. The error may include an additional `retry_after_ms` field to suggest when the client should try again. From 9627af27396f85454f78eb22ce9e095e9a0bce49 Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Fri, 8 Jul 2022 15:22:46 -0600 Subject: [PATCH 22/27] Remove retry_after_ms optional field Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 33e7914897c..dcbed5d024f 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -72,8 +72,6 @@ non-asynchronous upload), the server should ignore this parameter. If the data is not available before the specified time is up, the content repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 504 status code. -The error may include an additional `retry_after_ms` field to suggest when the -client should try again. For the `/download` endpoint, the server could also stream data directly as it is being uploaded. However, streaming creates several implementation and spec From 045c21e533311e7d32d24e348aeb996d7e26b17c Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Thu, 30 Mar 2023 08:36:46 +0200 Subject: [PATCH 23/27] Make unused_expires_at the deadline for the upload to complete rather than start Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index dcbed5d024f..d0d8a02c04c 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -27,8 +27,9 @@ settings (related: [MSC701]). The server may optionally enforce a maximum age for unused media IDs to delete media IDs when the client doesn't start the upload in time, or when the upload was interrupted and not resumed in time. The server should include the maximum -timestamp to start the upload in the `unused_expires_at` field in the response -JSON. The recommended default expiration for starting the upload is 1 minute. +timestamp to complete the upload in the `unused_expires_at` field in the +response JSON. The recommended default expiration for completing the upload is 1 +minute. The server should also rate limit requests to create media. When combined with the maximum age for created media IDs, it effectively prevents spam by creating From 011031b422a4e0c5e86b3a14ddb6cfbff5257072 Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Thu, 30 Mar 2023 08:37:41 +0200 Subject: [PATCH 24/27] Add notes about suggested rate-limiting techniques Namely, allowing a limited number of concurrent uploads Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 61 +++++++++++++++++++++++--- 1 file changed, 54 insertions(+), 7 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index d0d8a02c04c..f445eeae533 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -31,9 +31,18 @@ timestamp to complete the upload in the `unused_expires_at` field in the response JSON. The recommended default expiration for completing the upload is 1 minute. -The server should also rate limit requests to create media. When combined with -the maximum age for created media IDs, it effectively prevents spam by creating -lots of unused ids. +##### Rate Limiting + +The server should rate limit requests to create media. + +The server should limit the number of concurrent *pending media uploads* a given +user can have. A pending media upload is a created MXC URI that (a) is not +expired (the `unused_expires_at` timestamp has not passed) and (b) the media has +not yet been uploaded for. + +In both cases, the server should respond with `M_LIMIT_EXCEEDED` optionally +providing details in the `error` field, but servers may wish to obscure the +exact limits that are used and not provide such details. ##### Example response ```json @@ -58,6 +67,10 @@ ID, the request should be rejected with an `M_FORBIDDEN` error. If the serverName/mediaId combination is not known, not local, or expired, an `M_NOT_FOUND` error is returned. +If the MXC's `unused_expires_at` is reached before the upload completes, the +server may either respond immediately with `M_NOT_FOUND` or allow the upload to +continue. + For other errors, such as file size, file type or user quota errors, the normal `/upload` rules apply. @@ -65,13 +78,17 @@ For other errors, such as file size, file type or user quota errors, the normal A new query parameter, `timeout_ms` is added to the endpoints that can download media. It's an integer that specifies the maximum number of milliseconds that the client is willing to wait to start receiving data. -The default value is 20000 (20 seconds). The server can and should impose a -maximum value for this parameter. +The default value is 20000 (20 seconds). The content repository can and should +impose a maximum value for this parameter. The content repository can also +choose to respond before the timeout if it desires. If the media is available immediately (for example in the case of a -non-asynchronous upload), the server should ignore this parameter. +non-asynchronous upload), the content repository should ignore this parameter. + +If the MXC has expired, the content repository should respond with `M_NOT_FOUND` +and a HTTP 404 status code. -If the data is not available before the specified time is up, the content +If the data is not available when the server chooses to respond, the content repository returns a `M_NOT_YET_UPLOADED` error with a HTTP 504 status code. For the `/download` endpoint, the server could also stream data directly as it @@ -88,6 +105,36 @@ media. ## Security considerations +The primary attack vector that must be prevented is a malicious user creating a +large number of MXC URIs and sending them to a room without uploading the +corresponding media. Clients in that room would then attempt to download the +media, holding open connections to the server and potentially exhausting the +number of available connections. + +This attack vector is stopped in multiple ways: + +1. Limits on `/create` prevent users from creating MXC URIs too quickly and also + require them to finish uploading files (or let some of their MXCs expire) + before creating new MXC URIs. + +2. Servers are free to respond to `/download` and `/thumbnail` requests before + the `timeout_ms` has been reached and respond with `M_NOT_YET_UPLOADED`. For + example, if the server is under connection count pressure, it can choose to + respond to waiting download connections with `M_NOT_YET_UPLOADED` to free + connections in the pool. + +3. Once the media is expired, servers can respond immediately to `/download` and + `/thumbnail` requests with `M_NOT_FOUND`. + +## Future work + +Future MSCs might wish to address large file uploads. One approach would be to +add metadata to the `/create` call via a query parameter (for example +`?large_file_upload=true`. Servers would have the ability to impose restrictions +on how many such "large file" uploads a user can have concurrently. For such a +situation, the server would likely send a more generous `unused_expires_at` +timestamp to allow for a long-running upload. + ## Unstable prefix While this MSC is not in a released version of the spec, implementations should use `fi.mau.msc2246` as a prefix and as an `unstable_features` flag in the From 6cb7e31073a1db9cd3d0c14f407a64ec76e01666 Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Thu, 30 Mar 2023 14:45:29 +0200 Subject: [PATCH 25/27] Recommend 24 hours instead of 1 minute Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index f445eeae533..4f58604a92d 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -28,8 +28,9 @@ The server may optionally enforce a maximum age for unused media IDs to delete media IDs when the client doesn't start the upload in time, or when the upload was interrupted and not resumed in time. The server should include the maximum timestamp to complete the upload in the `unused_expires_at` field in the -response JSON. The recommended default expiration for completing the upload is 1 -minute. +response JSON. The recommended default expiration is 24 hours which should be +enough time to accommodate users on poor connection who find a better connection +to complete the upload. ##### Rate Limiting From 7652f59cefe692f1b8c1a93327e10cce5e033eb2 Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Thu, 20 Apr 2023 00:23:36 -0600 Subject: [PATCH 26/27] Clarify that rate limiting can apply on /create and /upload Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index 4f58604a92d..d54c03ebecb 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -18,7 +18,8 @@ the download and thumbnail endpoints. #### `POST /_matrix/media/v1/create` Create a new MXC URI without content. Like `/upload`, this endpoint requires -auth and returns the `content_uri` that can be used in events. +auth, can be rate limited, and returns the `content_uri` that can be used in +events. The request body should be an empty JSON object. In the future, the body could be used for metadata about the file, such as the mime type or access control @@ -56,7 +57,7 @@ exact limits that are used and not provide such details. #### `PUT /_matrix/media/v3/upload/{serverName}/{mediaId}` Upload content to a MXC URI that was created earlier. This endpoint requires auth. If the upload is successful, an empty JSON object and status code 200 is -returned. +returned. Rate limiting additionally can apply here. If the endpoint is called with a media ID that already has content, the request should be rejected with the error code `M_CANNOT_OVERWRITE_MEDIA` and HTTP From 098dd9000a1903e3ceee408bdca742b5a85b654c Mon Sep 17 00:00:00 2001 From: Sumner Evans Date: Thu, 20 Apr 2023 00:24:02 -0600 Subject: [PATCH 27/27] Clarify that unused_expires_at is a POSIX millisecond timestamp Signed-off-by: Sumner Evans --- proposals/2246-asynchronous-uploads.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/2246-asynchronous-uploads.md b/proposals/2246-asynchronous-uploads.md index d54c03ebecb..2e844eeb967 100644 --- a/proposals/2246-asynchronous-uploads.md +++ b/proposals/2246-asynchronous-uploads.md @@ -1,4 +1,4 @@ -# Asynchronous media uploads +# MSC2246: Asynchronous media uploads Sending media to Matrix currently requires that clients first upload the media to the content repository and then send the event. This is a problem for some use cases, such as bridges that want to preserve message order, as reuploading @@ -28,10 +28,10 @@ settings (related: [MSC701]). The server may optionally enforce a maximum age for unused media IDs to delete media IDs when the client doesn't start the upload in time, or when the upload was interrupted and not resumed in time. The server should include the maximum -timestamp to complete the upload in the `unused_expires_at` field in the -response JSON. The recommended default expiration is 24 hours which should be -enough time to accommodate users on poor connection who find a better connection -to complete the upload. +POSIX millisecond timestamp to complete the upload in the `unused_expires_at` +field in the response JSON. The recommended default expiration is 24 hours which +should be enough time to accommodate users on poor connection who find a better +connection to complete the upload. ##### Rate Limiting