Skip to content

Conversation

@andywirv
Copy link
Contributor

@andywirv andywirv commented Sep 7, 2023

There appears to be a silent loss of the metadata.metadata on write. Adding JSON.stringify as used in create https://github.com/tus/tus-node-server/blob/main/packages/gcs-store/index.ts#L44 fixed the issue I was seeing.

Sorry not added a test but not had a chance to get it all set up. The metadata.metadata is there on create but will be removed as the file is written. Other data such as offset is retained correctly

const TEST_METADATA = 'filetype dmlkZW8vbXA0,filename dGVzdC5tcDQ='
gcloud storage objects describe gs://{bucket}/8614d69d16b7344cd24e530dfd20672c
bucket: {bucket}
crc32c: cuffvA==
etag: CPysvYzMmIEDEAE=
generation: '1694092967761532'
id: {bucket}/8614d69d16b7344cd24e530dfd20672c/1694092967761532
kind: storage#object
md5Hash: ILjxSjUwj4okFluJpikGmQ==
mediaLink: https://storage.googleapis.com/download/storage/v1/b/{bucket}/o/8614d69d16b7344cd24e530dfd20672c?generation=1694092967761532&alt=media
metadata:
  metadata: '{"filetype":"video/mp4","filename":"test.mp4"}'
  offset: '0'
  size: '960244'
  sizeIsDeferred: 'false'
  tus_version: 1.0.0
metageneration: '1'

@Murderlon
Copy link
Member

Hi, sorry for the late response. I think this makes sense, however GCS might be in need of a similar fix as in S3: #470. Quickly looking it seems GCS too doesn't properly use a .info file but puts everything on metadata, which might be a problem too if the SDK sends it as a query string.

@andywirv
Copy link
Contributor Author

andywirv commented Sep 21, 2023

Hey @Murderlon . There is no built-in concept of a .info file with GCS. You could, in theory, add a .info file alongside the primary file. The GCS standard is to use the metadata fields as key value pairs.
https://cloud.google.com/storage/docs/metadata#custom-metadata

Limit on max size : https://cloud.google.com/storage/quotas#objects

Maximum combined size of all custom metadata keys and values per object | 8 KiB

The current tus-node-server implementation stores all of the metadata under a single metadata key which is not ideal but is ok. I think it would be an improvement to have each of metadata field extracted and added, maybe with a tus- extension, as independent key.

This MR only addresses the metadata (metadata.metadata to be precise) being overwritten on the first upload/update of data to the GCS object. The initial create of the object currently adds stringified metadata so any issue with length etc of the metadata fields and the GCS SDK would show up there first.

We are now using this patch in production and it works correctly/as expected

@Murderlon Murderlon changed the title fix: ensure that gcs metadata is (re)encoded so it is not lost @tus/gcs-store: ensure metadata is (re)encoded Sep 21, 2023
@Murderlon Murderlon linked an issue Sep 21, 2023 that may be closed by this pull request
@Murderlon
Copy link
Member

Thanks for clarifying. I'll make a release for it now as well.

@Murderlon Murderlon merged commit 8a080c0 into tus:main Sep 21, 2023
@Murderlon
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GCS Store Upload-Metadata lost on update

2 participants