Compress report content at rest #2291

@davidjgoss

Since the initial post-SmartBear launch of Cucumber Reports, we've done some architectural shuffling to get the functionality we need as economically as possible. That has gotten us to the point where 99% of our AWS costs are in S3, within which the breakdown is like this:

[Image: S3 cost breakdown]

Currently we're uploading, storing and downloading the report content - Cucumber messages as JSON lines - in clear text. Messages are naturally pretty good candidates for compression, and some rough testing I did with gzip reduced the size by about 90% on average. I wouldn't expect the real-world effect to be this good, because users often have large binary attachments, and since these are already base64-encoded the gains on them would be much smaller. But still, it's likely this would noticeably reduce our S3 spend.
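To make the rough comparison above concrete, here is a minimal sketch using Node's built-in `zlib`. The message shapes are simplified, hypothetical stand-ins rather than real Cucumber messages, and the random bytes stand in for an attachment that is already compressed (such as a screenshot), so the exact ratios will differ from real reports.

```typescript
import { gzipSync } from "node:zlib";
import { randomBytes } from "node:crypto";

// Ratio of gzipped size to original size (smaller means better compression).
function gzipRatio(content: string): number {
  const raw = Buffer.from(content, "utf8");
  return gzipSync(raw).length / raw.length;
}

// Hypothetical, simplified message-like JSON lines (not the real schema).
const textLines = Array.from({ length: 500 }, (_, i) =>
  JSON.stringify({
    testStepFinished: {
      testStepId: `step-${i}`,
      testStepResult: { status: "PASSED", duration: { seconds: 0, nanos: 1000 * i } },
    },
  })
).join("\n");

// Random bytes as a stand-in for incompressible binary data, base64-encoded
// the way attachments are embedded in the JSON.
const attachmentLine = JSON.stringify({
  attachment: {
    body: randomBytes(64 * 1024).toString("base64"),
    contentEncoding: "BASE64",
    mediaType: "image/png",
  },
});

const textRatio = gzipRatio(textLines);
const attachmentRatio = gzipRatio(attachmentLine);

console.log(`text messages: ${(textRatio * 100).toFixed(1)}% of original size`);
console.log(`base64 attachment: ${(attachmentRatio * 100).toFixed(1)}% of original size`);
```

Repetitive JSON shrinks dramatically, while for incompressible data gzip can win back roughly the base64 encoding overhead and little more, so reports dominated by attachments would see a far smaller reduction.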

The reports service supports serving gzipped content today without any changes, although I've opened cucumber/cucumber-reports#42 to document and verify that. In short, the Cucumber implementation is free to specify the content type and encoding of the upload request, and when the report loads in the browser, gzip is decoded automatically.
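As a rough illustration of what "specifying the encoding of the upload request" could look like on the client side, here is a sketch that gzips the JSON lines and prepares matching headers. This is not the code from the cucumber-js PR: `buildGzippedUpload` is a hypothetical helper, and the `Content-Type` value is a placeholder assumption rather than what the service necessarily expects.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

interface GzippedUpload {
  headers: Record<string, string>;
  body: Buffer;
}

// Hypothetical helper: compress the report content and declare the encoding,
// so that whatever later serves the object can send the same header back.
function buildGzippedUpload(
  ndjson: string,
  contentType = "application/x-ndjson" // placeholder assumption
): GzippedUpload {
  return {
    headers: {
      "Content-Type": contentType,
      "Content-Encoding": "gzip",
    },
    body: gzipSync(Buffer.from(ndjson, "utf8")),
  };
}

const sample = '{"testRunStarted":{}}\n{"testRunFinished":{"success":true}}\n';
const upload = buildGzippedUpload(sample);

// Round-trip check: decompressing the body yields the original content.
const roundTrip = gunzipSync(upload.body).toString("utf8");
console.log(roundTrip === sample, upload.headers["Content-Encoding"]);

// Usage, assuming the presigned URL accepts a PUT (not executed here):
// await fetch(presignedUrl, { method: "PUT", headers: upload.headers, body: upload.body });
```

Because the encoding is declared as a standard `Content-Encoding` header, a browser that receives it back on download decompresses the response transparently, which matches the behaviour described above.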

I've proved this out in cucumber-js via cucumber/cucumber-js#2687 and tested that against the production service. It's a pretty small diff.

Do we foresee any issues with landing this change and doing a similar one for JVM and Ruby too?

Another way to go about this would be to do it in the service, either by replacing the presigned URL with a Lambda function to gzip the content at the point of upload, or doing it afterwards in response to an event. This would get us the gains sooner, since we wouldn't have to wait for user adoption of new versions. But I suspect (without hard evidence, to be fair) that the savings in storage cost would be at least partially offset by increases in data transfer and Lambda costs, plus it would be more work to build and maintain in the first place. S3, for its part, does not have a way to automatically compress content at rest.
