
S3 Bucket Deployments with content encoding based on file extension. #7090

Open
hleb-albau opened this issue Mar 31, 2020 · 7 comments
Labels
@aws-cdk/aws-s3-deployment effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2

Comments

@hleb-albau

Currently, there is a contentEncoding?: string option on the BucketDeployment construct (system-defined Content-Encoding metadata to be set on all objects in the deployment). It would be nice to have the possibility to specify contentEncoding according to a mapping by file extension. Example: for files with the extension .br specify "Content-Encoding: br", for .gzip files "Content-Encoding: gzip", and so on.
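To make the ask concrete, here is a purely hypothetical sketch of what such an option could look like (content_encoding_by_extension is not an existing BucketDeployment prop; the name is made up for illustration, mirroring the Python CDK style used later in this thread):

# Hypothetical only: content_encoding_by_extension is NOT an existing prop of
# BucketDeployment; it merely illustrates the requested extension-to-encoding mapping.
from aws_cdk import aws_s3_deployment as s3deploy

s3deploy.BucketDeployment(
    self, "DeployWebsite",                      # 'self' is the surrounding Stack/Construct
    sources=[s3deploy.Source.asset("./dist")],
    destination_bucket=website_bucket,          # an existing s3.Bucket in the stack
    content_encoding_by_extension={             # hypothetical prop
        ".br": "br",
        ".gzip": "gzip",
    },
)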

Use Case

We use an S3 + CloudFront pair to serve a static website. To provide better performance, br files are served based on the Accept-Encoding header, so our files have two copies (e.g. index.html and index.html.br). Currently, we have to use the aws cli to deploy the differently encoded files with the right headers. If the BucketDeployment construct supported a content-encoding-by-file-extension option, it would be a more easy-to-go static hosting option.

  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@hleb-albau hleb-albau added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Mar 31, 2020
@SomayaB SomayaB added the @aws-cdk/aws-s3 Related to Amazon S3 label Mar 31, 2020
@iliapolo
Contributor

Hi @hleb-albau

This is definitely an interesting use-case. The problem is that bucket deployments run:

aws s3 sync --delete --content-type=<content-type> {sourceDir} {targetBucket}

The command does not allow specifying different content types for different files. Splitting into different source directories won't work either because of the (necessary) --delete flag.

Are you having issues with anything other than br files? The aws cli actually determines the content type of each individual file automatically, by delegating to Python's standard library. However, support for the br file extension was only just added to CPython, and is not yet released outside of an alpha version.

One possible solution would be to have the bucket deployment lambda add a specific entry for br files to one of the known mime type files on Linux, which should make the cli detect it properly, avoiding the need to pass Content-Type altogether.
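For reference, a minimal sketch of what that detection looks like on the Python side (illustration only, not the CLI's actual code path):

# Python's mimetypes module (which the aws cli delegates to) reports the encoding
# separately from the type once the '.br' suffix is known. CPython 3.9+ ships a
# built-in '.br' entry; on older runtimes it has to be registered first.
import mimetypes

db = mimetypes.MimeTypes()
db.encodings_map.setdefault(".br", "br")  # no-op on 3.9+, registers it otherwise

print(db.guess_type("util.js.br"))
# -> e.g. ('application/javascript', 'br'): type from '.js', encoding from '.br'
#    (the exact type string depends on the Python version / system mime.types files)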

This seems like the most pragmatic solution for now.

WDYT?

@iliapolo iliapolo added effort/medium Medium work item – several days of effort response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed effort/medium Medium work item – several days of effort labels Apr 12, 2020
@hleb-albau
Author

Thanks for the response!

Besides the content-encoding header, which determines the content compression (br, gz, etc.), we also have the content-type header.

Example for the file util.js.br: content-type: application/javascript, content-encoding: br

Right now our deployment process runs as follows:

  1. use the cdk bucket deployment to deploy all files.
  2. redeploy the compressed files via the cli with the following flags:
aws s3 cp ./dist s3://{BUCKET_NAME} \
  --exclude="*" --include="*.js.br" \
  --content-encoding br \
  --content-type="application/javascript" \
  --cache-control "max-age=31536000" \
  --metadata-directive REPLACE --recursive

So I wonder if the cli can detect both headers properly.

@SomayaB SomayaB added @aws-cdk/aws-s3-deployment and removed @aws-cdk/aws-s3 Related to Amazon S3 needs-triage This issue or PR still needs to be triaged. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Apr 23, 2020
@iliapolo
Contributor

iliapolo commented May 13, 2020

Hi @hleb-albau - Yeah, seems like there's no way around this.

We can probably support this use-case by doing what you did with exclude/include.

Stay tuned 👍

Thanks!

@iliapolo iliapolo added the effort/medium Medium work item – several days of effort label May 13, 2020
@iliapolo
Contributor

relates also to #4687

@iliapolo iliapolo added the p2 label Aug 29, 2020
@peabnuts123

Just to add to this conversation, here is the script I am using to achieve this right now:

# Clear out / upload everything first
echo "[Phase 1] Sync everything"
aws s3 sync . "s3://${s3_bucket_name}" --acl 'public-read' --delete

# Brotli-compressed files
# - general (upload everything brotli-compressed as "binary/octet-stream" by default)
echo "[Phase 2] Brotli-compressed files"
aws s3 cp . "s3://${s3_bucket_name}" \
  --exclude="*" --include="*.br" \
  --acl 'public-read' \
  --content-encoding br \
  --content-type="binary/octet-stream" \
  --metadata-directive REPLACE --recursive;

# - javascript (ensure javascript has correct content-type)
echo "[Phase 3] Brotli-compressed JavaScript"
aws s3 cp . "s3://${s3_bucket_name}" \
  --exclude="*" --include="*.js.br" \
  --acl 'public-read' \
  --content-encoding br \
  --content-type="application/javascript" \
  --metadata-directive REPLACE --recursive;

@Simonl9l

Simonl9l commented Jan 15, 2023

It would also be good if the upload detected the file encoding and added a charset to the content type, defaulting to utf-8.
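A rough sketch of that idea (the helper below is hypothetical, not anything the CLI or CDK provides; it simply defaults text types to utf-8 rather than actually sniffing the bytes):

# Hypothetical helper: append a default charset to text content types.
# Nothing here is existing CLI/CDK behaviour; it only illustrates the suggestion.
import mimetypes


def content_type_with_charset(path, default_charset="utf-8"):
    ctype, _encoding = mimetypes.guess_type(path)
    if ctype and ctype.startswith("text/"):
        return f"{ctype}; charset={default_charset}"
    return ctype


print(content_type_with_charset("index.html"))  # e.g. 'text/html; charset=utf-8'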

@andresionek91

andresionek91 commented Aug 3, 2023

I solved it with a Custom Resource. This one changes the CacheControl but the logic is the same for other metadata.

# imports assumed for CDK v2 (aws-cdk-lib for Python)
from uuid import uuid4

from aws_cdk import aws_iam as iam, custom_resources as cr
from aws_cdk.aws_s3_deployment import BucketDeployment

s3_deployment = BucketDeployment(...

copy_object_changing_cache = cr.AwsSdkCall(
    service="S3",
    action="copyObject",
    parameters={
        "Bucket":bucket.bucket_name,
        "CopySource": f"{bucket.bucket_name}/remote.js",
        "Key": "remote.js",
        "MetadataDirective": "REPLACE",
        "CacheControl": "no-cache, no-store",
        "Metadata": {"object-hash": uuid4().hex[:8]}  # Important to trigger update in cloudformation
    },
    physical_resource_id=cr.PhysicalResourceId.of("ChangeObjectCacheControl"),
)

change_cache_role = iam.Role(
    scope=self,
    id="ChangeCacheRole",
    assumed_by=iam.ServicePrincipal(service="lambda.amazonaws.com"),
    inline_policies={
        "AllowCopyRemoteJs": iam.PolicyDocument(
            statements=[
                iam.PolicyStatement(
                    actions=[
                        "s3:PutObject", 
                        "s3:CopyObject", 
                        "s3:GetObject", 
                        "s3:DeleteObject"
                    ],
                    resources=[bucket.arn_for_objects("remote.js")],
                    effect=iam.Effect.ALLOW,
                ),
            ],
        )
    },
)

change_cache = cr.AwsCustomResource(
    scope=self,
    id="ChangeObjectCacheControl",
    role=change_cache_role,
    on_create=copy_object_changing_cache,
    on_update=copy_object_changing_cache,
)
change_cache.node.add_dependency(s3_deployment)
