CDN Hosting at Google Cloud Storage

Peter Krautzberger edited this page Sep 21, 2015 · 34 revisions

Note: these are notes about using Google Cloud Storage (GCS) to host a copy of MathJax. We use GCS in combination with CloudFlare, so these instructions are not always optimal for a pure GCS setup.

Preliminaries

Copying

Notes:

  • https://developers.google.com/storage/docs/gsutil/commands/cp
  • gsutil option: -m multithreaded (higher load+cost but faster)
  • cp options: recursively (-R), compressed (-z), verbose (-v)
    • compression is necessary to get gzipped delivery. Google Cloud Storage does not compress on the fly, but it should decompress on the fly for old browsers that can't accept compressed files. We could probably skip it, since CloudFlare will compress files that are not already compressed.

The basic recursive operation is

    gsutil -m cp -rz js latest gs://cdn.mathjax.org/mathjax/
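To confirm that a file uploaded with -z is actually delivered gzipped, one option is to request only the headers and look for Content-Encoding. This is a rough sketch, not part of our release process; the function name and the URL in the usage comment are made up for illustration.

```shell
#!/bin/sh
# served_gzipped URL
# Succeeds if the response headers contain "Content-Encoding: gzip" when the
# client announces gzip support. The function name is hypothetical.
served_gzipped() {
  curl -sI -H 'Accept-Encoding: gzip' "$1" \
    | tr -d '\r' \
    | grep -qi '^content-encoding: *gzip$'
}

# Usage (placeholder URL, replace with a real object URL):
#   served_gzipped "https://storage.googleapis.com/our-bucket/mathjax/latest/MathJax.js" \
#     && echo "gzipped" || echo "not gzipped"
```

Behind CloudFlare the result reflects what CloudFlare sends, so test against the GCS origin if you want to check the bucket itself.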

Note: not all metadata is preserved by default, especially when updating a bucket from local files. The one-time tasks described below take care of public access, caching, and CORS; see the next section for MIME types.

This also works within GCS, i.e., bucket to bucket. If you copy within GCS from bucket to bucket, you can use the -p option to preserve metadata (note: this incurs higher costs).

    gsutil -m cp -rp gs://bucket1/file1 gs://bucket2/

(You'll still need to set CORS once for each bucket, though; see the One-time tasks section.)

MIME-type headers

Notes:

WOFF, EOT, and SVG content-type headers are detected correctly during upload to GCS.

  • For OTF, set the header manually after upload:

      gsutil setmeta -r -h "Content-Type:font/opentype" gs://our-bucket/mathjax/**.otf
    

Either change it as described or see advice here to do this automatically on upload.
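One way to set the header during upload is gsutil's top-level -h option, which attaches headers to the objects a cp creates. This is a sketch under assumptions and not necessarily what the linked advice describes; the function name, the GSUTIL override variable, and the paths in the usage comment are illustrative.

```shell
#!/bin/sh
# Allow the gsutil binary to be overridden (e.g. for a dry run).
GSUTIL=${GSUTIL:-gsutil}

# upload_otf LOCAL_DIR DEST
# Uploads every .otf file under LOCAL_DIR to DEST/<relative path> with an
# explicit Content-Type. DEST must not end in a slash. Hypothetical helper.
upload_otf() {
  find "$1" -name '*.otf' | while IFS= read -r f; do
    # </dev/null keeps the command from consuming the file list on stdin.
    "$GSUTIL" -h "Content-Type:font/opentype" cp "$f" "$2/$f" < /dev/null
  done
}

# Usage, run from the directory that contains "latest":
#   upload_otf latest gs://our-bucket/mathjax
```

This uploads the OTF files one at a time, so it is slower than the recursive copy; it is only meant for the handful of font files that need the header.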

Testing

  • CORS allow-origin [vital]
    • A simple test: switch www.mathjax.org over to the GCS origin (bucket.storage.googleapis.com) and use Sauce Labs to check rendering across IE, FF, Chrome, Safari.
    • Since CORS is set at the bucket level, we should not have to worry about it for individual files.
  • font mimetypes [non-vital]
    • TODO write script to curl -I all woff and otf files, compare header.
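A possible starting point for the TODO above: fetch the headers with curl -I and compare the reported Content-Type with the expected one. This is only a sketch; the function names are invented here, and the expected type for each extension is passed in by the caller rather than assumed.

```shell
#!/bin/sh
# content_type_of
# Reads raw response headers on stdin and prints the Content-Type value.
content_type_of() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "content-type" { print $2; exit }'
}

# check_type URL EXPECTED
# Prints OK or MISMATCH for one URL; returns non-zero on mismatch.
check_type() {
  actual=$(curl -sI "$1" | content_type_of)
  if [ "$actual" = "$2" ]; then
    echo "OK $1"
  else
    echo "MISMATCH $1: got '$actual', expected '$2'"
    return 1
  fi
}

# Usage (bucket name is the placeholder used on this page):
#   gsutil ls 'gs://our-bucket/mathjax/**.otf' \
#     | sed 's|^gs://|https://storage.googleapis.com/|' \
#     | while IFS= read -r url; do check_type "$url" font/opentype; done
```

The same loop can be run for the woff files with whatever Content-Type we expect for them.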

One-time tasks

Setting a default ACL (access control list)

Notes:

The GCS default for access control is project-private. GCS does not allow for ACLs to persist when files are overwritten.

Since we update files in latest with every release and files in beta during beta runs, we need to prevent GCS from marking those new files as private (breaking public access to the CDN).

The solution is to set the default ACL for the bucket once following the instructions in the gsutil documentation.

To set this just use

    gsutil defacl set public-read gs://our-bucket

Set CORS headers

Notes:

Create an XML file with

    <?xml version="1.0" encoding="UTF-8"?>
    <CorsConfig>
      <Cors>
        <Origins>
          <Origin>*</Origin>
        </Origins>
        <Methods>
          <Method>GET</Method>
          <Method>POST</Method>
          <Method>HEAD</Method>
        </Methods>
        <ResponseHeaders>
          <ResponseHeader>*</ResponseHeader>
        </ResponseHeaders>
        <MaxAgeSec>3600</MaxAgeSec>
      </Cors>
    </CorsConfig>

Save as cors.xml and run

    gsutil cors set cors.xml gs://our-bucket
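Newer gsutil releases expect the CORS configuration as JSON rather than XML. Assuming such a release, the sketch below writes a JSON policy with the same origins, methods, response headers, and max age as the XML file and applies it. The function names, the cors.json file name, and the GSUTIL override variable are illustrative.

```shell
#!/bin/sh
# Allow the gsutil binary to be overridden (e.g. for a dry run).
GSUTIL=${GSUTIL:-gsutil}

# write_cors_json FILE
# Writes a JSON CORS policy mirroring the XML configuration to FILE.
write_cors_json() {
  cat > "$1" <<'EOF'
[
  {
    "origin": ["*"],
    "method": ["GET", "POST", "HEAD"],
    "responseHeader": ["*"],
    "maxAgeSeconds": 3600
  }
]
EOF
}

# apply_cors BUCKET
# Writes cors.json in the current directory and applies it to BUCKET.
apply_cors() {
  write_cors_json cors.json && "$GSUTIL" cors set cors.json "$1"
}

# Usage:
#   apply_cors gs://our-bucket
#   gsutil cors get gs://our-bucket   # print the configuration now in effect
```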

Set up caching

Notes:

The default caching time for public files is 1h. That's good enough for us since caching really happens on CloudFlare.

If you need to change it use something like

    gsutil -m setmeta -r -h "Cache-Control:public, max-age=3600" gs://our-bucket/mathjax/

CNAME

To use a CNAME for a bucket, the bucket needs to be named like the CNAME, see https://developers.google.com/storage/docs/reference-uris#cname.

You also need to verify domain ownership; see https://support.google.com/webmasters/answer/35179?hl=en.

Logging

We don't log on GCS anymore since we combine it with CloudFlare (and get their stats).

For the record, instructions are at https://developers.google.com/storage/docs/gsutil/commands/logging