DM-5901: Robust Edition purges by having consistent surrogate-keys for each Edition #7

Closed
wants to merge 42 commits into from


jonathansick

LTD Keeper needs to purge Fastly when an Edition is rebuilt. Currently the surrogate-key for the build is also used to cover editions. This means that the key needed to purge an edition is the same as that for a build. Hence, purging an edition means that the system needs to purge the surrogate key of the previous build.

We're seeing situations where the surrogate key that Keeper is purging is not the one that needs to be purged. A more robust configuration would be for each edition to have a stable surrogate-key that can be unambiguously purged.

This PR

  1. Adds a surrogate-key column to the Edition model
  2. Changes the S3 copy rebuild code to change the surrogate-key header
  3. Changes the rebuild code to purge based on the edition's surrogate-key
  4. Enables Alembic migrations for Flask (Flask-Migrate) to deal with the new schema

The surrogate key is intended to be included in the
x-amz-meta-surrogate-key header of all objects in S3 belonging to a
build. This allows us to instantly purge a build from Fastly (for
example, when pointing an Edition to a new build).

Since surrogate keys only need to be unique, I'm just using a uuid4
expressed as a hex string. That way the value is always just 32
characters and still always unique.
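
As a minimal sketch of the key format described above (the function name `new_surrogate_key` is hypothetical, not necessarily what Keeper uses), a uuid4 rendered as hex yields a 32-character string:

```python
import uuid


def new_surrogate_key() -> str:
    """Return a random surrogate key: a uuid4 as a 32-character hex string."""
    return uuid.uuid4().hex


key = new_surrogate_key()
print(key, len(key))
```

The hex form has no hyphens, so it can be dropped into a header value or a URL path without escaping.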

A surrogate key is automatically issued by a POST
/products/<product>/builds/.

Includes the API key for the Fastly account and the ID of the service that
serves LSST the Docs content from S3.

See https://docs.fastly.com/api/config#service for more info about
service ids.
The API call is documented at
https://docs.fastly.com/api/purge#purge_077dfb4aa07f49792b13c87647415537

LTD Mason inserts an x-amz-meta-surrogate-key value in the header
of build objects. This surrogate key is maintained in the Build resource
by Keeper. This allows us to purge the build for an Edition when an
edition is re-pointed.

- Create a FastlyService client class to handle API calls.
- Create a FastlyError exception for API errors.
- The FastlyService.purge_key() method specially handles surrogate-key
  based purges.
- Test the purge requests via the responses package.
- Add responses as an official dependency.
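
The client described in the list above might look roughly like the following. This is a sketch under assumptions, not the code in this PR: it uses only the standard library, and the `send` argument is a hypothetical hook that lets a test substitute a fake transport (the PR itself mocks HTTP with the responses package instead). The URL follows Fastly's purge-by-surrogate-key endpoint, authenticated with the `Fastly-Key` header.

```python
import urllib.error
import urllib.request


class FastlyError(Exception):
    """Raised when the Fastly API answers with a non-200 status."""


class FastlyService:
    def __init__(self, service_id, api_key, send=None):
        self.service_id = service_id
        self.api_key = api_key
        # `send` is a hypothetical test hook; by default use real HTTP.
        self._send = send or self._send_http

    def purge_url(self, surrogate_key):
        return "https://api.fastly.com/service/{0}/purge/{1}".format(
            self.service_id, surrogate_key)

    def purge_key(self, surrogate_key):
        """Purge every cached object tagged with ``surrogate_key``."""
        headers = {"Fastly-Key": self.api_key, "Accept": "application/json"}
        status, body = self._send(self.purge_url(surrogate_key), headers)
        if status != 200:
            raise FastlyError(body)
        return body

    @staticmethod
    def _send_http(url, headers):
        request = urllib.request.Request(url, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(request) as response:
                return response.status, response.read().decode("utf-8")
        except urllib.error.HTTPError as err:
            return err.code, err.read().decode("utf-8")


def fake_send(url, headers):
    # Stand-in transport so the example never touches the network.
    return 200, '{"status": "ok"}'


service = FastlyService("SERVICE_ID", "API_KEY", send=fake_send)
print(service.purge_url("abc123"))
print(service.purge_key("abc123"))
```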
I haven't settled on logging best practices but this is one approach to
logging Fastly purges:

1. info log the request url
2. log the json response on errors

Note that I use logging directly rather than the Flask app's logging
since the Fastly module may be used/tested outside the app context.

We'll have to specifically activate loggers such as the fastly module's
later in a logging config step.
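
A small sketch of that approach, using the standard logging module directly. The logger name `fastly_sketch`, the `log_purge` helper, and the `ListHandler` class are all illustrative assumptions; the handler only stands in for whatever the later logging config step attaches.

```python
import json
import logging

# A plain named logger, usable with or without a Flask app context.
log = logging.getLogger("fastly_sketch")


def log_purge(url, status_code, response_json):
    """Info-log the request URL; log the JSON response only on errors."""
    log.info("Fastly purge request: %s", url)
    if status_code != 200:
        log.error("Fastly purge failed: %s", json.dumps(response_json))


class ListHandler(logging.Handler):
    """Collects (level, message) pairs; a stand-in for a real handler."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


# The "activation" step: without a level and handler, info logs are dropped.
handler = ListHandler()
log.addHandler(handler)
log.setLevel(logging.INFO)

url = "https://api.fastly.com/service/SERVICE_ID/purge/abc123"
log_purge(url, 200, {"status": "ok"})
log_purge(url, 401, {"msg": "denied"})
print(handler.messages)
```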
- LICENSE for the main license
- licenses/ for third-party licenses
- licenses/README.rst explains that licenses/ is for third party
  licenses

This helps keep the configuration environment driven.

This can be used to copy objects from a directory to another in the
same bucket. The main use case for this will be in minting new editions
from builds.

Implements Edition.rebuild() method (called during a PATCH /edition/) to
copy the new build to the edition's directory and purge the previous
build from the Fastly cache.

- Add encrypted AWS credentials to Travis (.travis.yml)
- Add a test for delete_directory that uploads files to S3, deletes
  a directory and then tests what paths are available on S3.
- Test directory copying against a real S3 bucket. Looks at bucket
  contents to ensure that the original destination was overwritten with
  the source contents.
- Refactored utility code for uploading test files to S3 based on a list
  of relative file paths
- Tests that assertion errors are raised when the source and destination
  directories are not independent of each other.
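
To illustrate the independence check that the last test exercises, here is a pure-Python sketch that plans a directory copy without touching S3. The helper names and the example key layout are assumptions for illustration; the real s3.copy_directory works against a bucket.

```python
def _as_prefix(path):
    """Normalize a directory path into an S3-style key prefix."""
    return path.strip("/") + "/"


def directories_are_independent(src, dest):
    """True if neither directory is equal to or nested inside the other."""
    s, d = _as_prefix(src), _as_prefix(dest)
    return not (s.startswith(d) or d.startswith(s))


def plan_copy(keys, src, dest):
    """Map each key under ``src`` to its new key under ``dest``."""
    assert directories_are_independent(src, dest), \
        "source and destination directories must be independent"
    s, d = _as_prefix(src), _as_prefix(dest)
    return {key: d + key[len(s):] for key in keys if key.startswith(s)}


keys = [
    "pipelines/builds/1/index.html",
    "pipelines/builds/1/api/index.html",
    "pipelines/builds/2/index.html",
]
print(plan_copy(keys, "pipelines/builds/1", "pipelines/v/main"))
```

Copying a directory into its own subdirectory (or onto itself) would otherwise keep matching freshly written keys, which is why the overlap is rejected up front.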
The permission_required decorator needs to be bypassed when the
IGNORE_AUTH configuration is set (i.e., in development mode).
Previously a Product resource would be created where the domain a
product was served from was specified independently of the slug. With
fastly regex-based redirects we need to tie the slug (and therefore the
directory in the bucket) to the domain a product is served from.

Thus have the user specify the root domain for documentation and the
root domain that Fastly serves from, and then compute the domain
specific to the product itself.

This suggests that perhaps the Fastly service ID, bucket name, fastly
root domain and doc root domain should all be refactored into a single
row of a separate table (i.e., a 'site settings').
This regex allows lowercase letters, numbers and hyphens to be used as
the slug (and thus the bucket directory and subdomain). However the slug
must start with a letter and can't end with a hyphen.
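
One way to express these rules is shown below. The pattern is an illustrative assumption rather than the exact regex in this commit; `fullmatch` is used so no explicit anchors are needed.

```python
import re

# Starts with a lowercase letter; may contain lowercase letters, digits and
# hyphens; must not end with a hyphen.
PRODUCT_SLUG = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_product_slug(slug):
    return PRODUCT_SLUG.fullmatch(slug) is not None


for candidate in ["pipelines", "ltd-keeper", "1abc", "abc-", "ABC"]:
    print(candidate, is_valid_product_slug(candidate))
```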
For consistency with the environment-variable based production config.
This takes advantage of the Travis CI AWS credentials set up for the S3
unit tests. With this we have now deprecated all integration tests;
everything is run by py.test/Travis.
Tests whether the AWS credentials are in place (to skip S3 tests when
they are missing, for example).
When a build is uploaded we query the edition table for editions that
track the git refs matching the git refs of the build that was just
uploaded.

Includes validation in a test.
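
The matching step can be sketched in plain Python as follows. The `Edition` dataclass and its `tracked_refs` field are hypothetical stand-ins for the database model, and exact, ordered equality of the ref lists is an assumption about what "matching" means here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Edition:
    slug: str
    tracked_refs: List[str]


def editions_tracking(editions, build_git_refs):
    """Return the editions whose tracked refs equal the build's git refs."""
    wanted = list(build_git_refs)
    return [edition for edition in editions if edition.tracked_refs == wanted]


editions = [
    Edition("main", ["master"]),
    Edition("DM-1234", ["tickets/DM-1234"]),
]
matches = editions_tracking(editions, ["tickets/DM-1234"])
print([edition.slug for edition in matches])
```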
Setting the edition's published url (e.g. product.lsst.io/v/edition-slug)
from the slug rather than allowing a url to be set independently allows
us to use regex at the Fastly level to point URLs at the S3 bucket
directories.

Since the slug is tied to bucket directories this involves copying the
old S3 directory for the edition to the new directory specified by the
slug and then deleting the old edition directory.

Like for editions, the published_url is driven by the slug of the build
and domain of the product. This field is dynamically generated and
maintains consistency with the Fastly URL layout.

Slug is main; title is Latest, tracks master branch

'main' edition is always published at the product's root URL rather than
in the /v/ subdirectory.
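
A sketch of how an edition's published URL could be derived from the product domain and the edition slug. The function name, the https scheme, and the lack of a trailing slash are assumptions; only the /v/ layout and the special case for 'main' come from the commit messages.

```python
def edition_published_url(product_domain, edition_slug):
    """Compute an edition's URL from the product domain and edition slug."""
    if edition_slug == "main":
        # The default edition lives at the product's root URL.
        return "https://{0}".format(product_domain)
    return "https://{0}/v/{1}".format(product_domain, edition_slug)


print(edition_published_url("pipelines.lsst.io", "main"))
print(edition_published_url("pipelines.lsst.io", "DM-1234"))
```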
coveralls.io consumes the coverage data from py.test --cov and displays
it in a useful format after a Travis run.

is_authorized is intended to be used in contexts where you want a
True/False boolean indicating a user has permission. This function helps
in cases where IGNORE_AUTH=True is set, or a user has not authenticated
at all.
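
The behaviour described for is_authorized can be sketched without Flask as below. Passing the config and user explicitly, and representing permissions as a set of strings, are simplifications for illustration; the real function presumably reads them from the app and request context.

```python
def is_authorized(config, user, permission):
    """Return True/False for whether ``user`` holds ``permission``."""
    if config.get("IGNORE_AUTH", False):
        # Development mode: every check passes.
        return True
    if user is None:
        # Nobody authenticated: answer False instead of raising.
        return False
    return permission in user.get("permissions", set())


admin = {"username": "admin", "permissions": {"admin_editions"}}
print(is_authorized({}, admin, "admin_editions"))
print(is_authorized({}, None, "admin_editions"))
print(is_authorized({"IGNORE_AUTH": True}, None, "admin_editions"))
```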
This function can be used to automatically create an edition slug from a
build's git_refs. I've added a customization to deal with LSST DM's
tickets/DM-# ticket branches, creating slugs like DM-1234.

Also adds a slug validator for editions/builds.

I'll allow editions and builds to have more lax slugs (can start with
numbers; can use uppercase in their names) than product slugs since
product slugs are used as a subdomain.
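
A rough sketch of the slug derivation. The function name, joining multiple refs with a hyphen, and replacing slashes in other branch names are assumptions; only the tickets/DM-# special case is taken from the description above.

```python
import re

# Matches LSST DM ticket branches such as "tickets/DM-1234".
TICKET_BRANCH = re.compile(r"tickets/(DM-[0-9]+)")


def auto_slugify_edition(git_refs):
    """Derive an edition slug from a build's list of git refs."""
    joined = "-".join(git_refs)
    match = TICKET_BRANCH.fullmatch(joined)
    if match:
        return match.group(1)
    # Slashes cannot appear in a single URL path segment.
    return joined.replace("/", "-")


print(auto_slugify_edition(["tickets/DM-1234"]))
print(auto_slugify_edition(["master"]))
```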
This is only done if the user posting the build has admin_editions powers
and an edition does not already exist that tracks this git ref set.

For non-SSL contexts it was necessary to prepend the product's domain
(e.g. pipelines.lsst.io) to the Fastly CDN domain. However, with TLS we
don't do this. It would be nice to create a switch to allow for
deployments that don't use the TLS version of Fastly; consider this for
later.

Mixed up the username and password secrets in the Kubernetes deployment
secrets.
Originally for routes I placed the permissions_required decorator above
the login_required decorator. I had assumed by conventional Python
wisdom that the decorators would be applied from the bottom up, i.e.,

login -> permissions -> route decorator

However it seems that the opposite was happening:

route decorator -> permissions -> login

This meant that the permissions decorator was failing because the user
wasn't set yet by the login method. Thus I've reversed the order of the
login and permissions decorators. It seems to work correctly now.
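
The ordering can be reproduced in isolation. Decorators are applied bottom-up when the function is defined, but that makes the topmost decorator the outermost wrapper, so its code runs first on every call. The two decorators below are simplified stand-ins (no arguments, no Flask), not the ones in Keeper:

```python
import functools

calls = []


def login_required(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("login")
        return func(*args, **kwargs)
    return wrapper


def permission_required(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("permissions")
        return func(*args, **kwargs)
    return wrapper


@login_required        # outermost wrapper: runs first at call time
@permission_required   # applied first, but runs second
def view():
    calls.append("view")
    return "ok"


view()
print(calls)  # ['login', 'permissions', 'view']
```

With login on top, the user is set before the permission check looks at it.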
- PRODUCT_SLUG_PATTERN: remove unnecessary grouping and ensure that
  product begins with at least one lower case letter and ends with a
  letter or number
- PATH_SLUG_PATTERN: use + to ensure at least a match
- TICKET_BRANCH_PATTERN: again use + to ensure a match in ticket number
The GET methods for the collections of builds and editions weren't
properly filtering by product slug. This commit adds true joins between
the Build and Product (and Edition and Product) tables and then filters
the joined Product against the slug supplied in the URL.

D'oh! This should have been tested earlier.
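
The join-and-filter idea can be shown with plain SQL through the standard sqlite3 module. The table and column names here are hypothetical and simplified; the actual fix is written against the SQLAlchemy models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, slug TEXT UNIQUE);
CREATE TABLE builds (
    id INTEGER PRIMARY KEY,
    slug TEXT,
    product_id INTEGER REFERENCES products(id)
);
INSERT INTO products (id, slug) VALUES (1, 'pipelines'), (2, 'other');
INSERT INTO builds (slug, product_id) VALUES ('b1', 1), ('b2', 1), ('b3', 2);
""")


def build_slugs_for_product(conn, product_slug):
    """List build slugs belonging to the product named in the URL."""
    rows = conn.execute(
        "SELECT builds.slug FROM builds "
        "JOIN products ON builds.product_id = products.id "
        "WHERE products.slug = ? ORDER BY builds.id",
        (product_slug,),
    ).fetchall()
    return [row[0] for row in rows]


print(build_slugs_for_product(conn, "pipelines"))  # ['b1', 'b2']
```

Without the join condition and the filter on the product's slug, the query would return builds from every product.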

Using create_all() conflicts with Alembic migrations. Since migrations
are created in development mode this usage of create_all is
counter-productive. Instead we'll settle on creating databases only
through alembic db upgrades.

- Add Flask-Migrate 1.8 requirement
- Hook up Flask-Migrate in run.py
- Add migrations/ via ./run.py db init (only needed this once)
- Document usage in run.py
- Have flake8 ignore migrations directory

This migration creates the DB from scratch to the configuration
currently specified in code.

Previously the s3.copy_directory function would only ensure that
metadata was replicated from the original object to a new object.
Now we have a use case for Editions where the original surrogate-key
needs to be replaced with one standard for that edition.

This patch ensures that old metadata is propagated, while still updating
surrogate-key.

Note that this requires an extra HEAD request on each object before
copying it.
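
The metadata handling can be sketched as a pure function. The name `metadata_for_copy` is hypothetical, and the `surrogate-key` dictionary key assumes the metadata is held without the x-amz-meta- prefix. In S3, changing metadata during a copy also requires asking for the metadata to be replaced rather than copied, which is why the source object's metadata has to be read first.

```python
def metadata_for_copy(source_metadata, surrogate_key=None):
    """Carry over the source object's metadata, optionally replacing
    only the surrogate key."""
    metadata = dict(source_metadata)  # never mutate the caller's dict
    if surrogate_key is not None:
        metadata["surrogate-key"] = surrogate_key
    return metadata


# `source` stands in for the metadata returned by the HEAD request.
source = {"surrogate-key": "build-key", "content-language": "en"}
print(metadata_for_copy(source, surrogate_key="edition-key"))
print(metadata_for_copy(source))
```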
Previously Editions inherited the surrogate key from the source build.
The problem with this is that it permits some degree of unreliability if
the surrogate key for objects actually being served by Fastly for that
edition gets out of sync with the surrogate key for the last used build.
Having a stable surrogate key for each edition obviates this entire
vector of bugs and makes it easy to manually purge an edition.

- Adds surrogate_key column to editions.
- Editions populate their own surrogate key on their next rebuild (once
  all production editions have this key the column should be set to
  nullable=false).
- Use new s3.copy_directory option to specify a surrogate_key.
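
The lifecycle in the list above could be sketched like this. The `Edition` class, its attributes, and the `copy_directory` and `purge_key` callables are simplified stand-ins passed in for illustration; they are not the model or signatures from this PR.

```python
import uuid


class Edition:
    def __init__(self, slug, surrogate_key=None):
        self.slug = slug
        self.surrogate_key = surrogate_key  # None for pre-migration rows
        self.build_slug = None

    def rebuild(self, build_slug, copy_directory, purge_key):
        """Point the edition at a new build, then purge its cached objects."""
        if self.surrogate_key is None:
            # Populated once, on the first rebuild after the migration.
            self.surrogate_key = uuid.uuid4().hex
        copy_directory(build_slug, self.slug, surrogate_key=self.surrogate_key)
        purge_key(self.surrogate_key)
        self.build_slug = build_slug


copied, purged = [], []


def fake_copy(src, dest, surrogate_key=None):
    copied.append((src, dest, surrogate_key))


edition = Edition("main")
edition.rebuild("b1", fake_copy, purged.append)
edition.rebuild("b2", fake_copy, purged.append)
print(purged[0] == purged[1])  # True: the same key is purged every time
```

Because the key belongs to the edition rather than to whichever build was copied last, a manual purge only needs the edition's key.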

In production, all editions may need to be rebuilt, followed by a Fastly
purge of everything, to get everything in sync.

@coveralls

coveralls commented May 2, 2016

Coverage Status

Changes Unknown when pulling 4bd9a43 on tickets/DM-5901 into * on master*.
