DM-5901: Robust Edition purges by having consistent surrogate-keys for each Edition #7
Closed
The surrogate key is intended to be included in the x-amz-meta-surrogate-key header of all objects in S3 belonging to a build. This allows us to instantly purge a build from Fastly (for example, when pointing an Edition to a new build). Since surrogate keys only need to be unique, I'm just using a uuid4 expressed as a hex string. That way the value is always exactly 32 characters and still unique. A surrogate key is automatically issued by a POST /products/<product>/builds/.
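A minimal sketch of the key generation described above, using only the standard library. The function name `new_surrogate_key` is hypothetical and not taken from the Keeper code base.

```python
import uuid


def new_surrogate_key() -> str:
    """Return a random uuid4 as a 32-character hexadecimal string.

    Hypothetical helper name; the source only states that a uuid4
    hex string is used as the surrogate key.
    """
    return uuid.uuid4().hex


key = new_surrogate_key()
print(len(key))  # 32
```

The `.hex` attribute drops the hyphens of the canonical UUID form, which is why the length is always 32.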
Includes the API key for the Fastly account and the service ID of the Fastly service that serves LSST the Docs content from S3. See https://docs.fastly.com/api/config#service for more info about service IDs.
The API call is documented at https://docs.fastly.com/api/purge#purge_077dfb4aa07f49792b13c87647415537
LTD Mason inserts an x-amz-meta-surrogate-key value in the header of build objects. This surrogate key is maintained in the Build resource by Keeper. This allows us to purge the build for an Edition when an edition is re-pointed.
- Create a FastlyService client class to handle API calls.
- Create a FastlyError exception for API errors.
- The FastlyService.purge_key() method specially handles surrogate-key based purges.
- Test the purge requests via the responses package.
- Add responses as an official dependency.
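As a rough illustration of a surrogate-key purge, the sketch below only builds the HTTP request and does not send it. It assumes Fastly's documented `POST /service/<service_id>/purge/<key>` endpoint with a `Fastly-Key` header. The function name `build_purge_request` is hypothetical, and the standard library's `urllib` is swapped in for the HTTP client the FastlyService class uses, to keep the example self-contained.

```python
import urllib.request
from urllib.parse import quote

FASTLY_API = "https://api.fastly.com"


def build_purge_request(service_id: str, api_key: str,
                        surrogate_key: str) -> urllib.request.Request:
    """Build (but do not send) a surrogate-key purge request.

    Hypothetical helper; not the FastlyService.purge_key() implementation.
    """
    url = "{0}/service/{1}/purge/{2}".format(
        FASTLY_API,
        quote(service_id, safe=""),
        quote(surrogate_key, safe=""),
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Fastly-Key": api_key, "Accept": "application/json"},
    )


# Placeholder credentials for illustration only.
req = build_purge_request("SERVICE_ID", "API_KEY",
                          "d290f1ee6c544b0190e6d701748f0851")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would perform the purge; an error response would be the natural place to raise something like FastlyError.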
I haven't settled on logging best practices but this is one approach to logging Fastly purges:
1. info log the request url
2. log the json response on errors
Note that I use logging directly rather than the Flask app's logging since the Fastly module may be used/tested outside the app context. We'll have to specifically activate loggers such as the fastly module's later in a logging config step.
- LICENSE for the main license
- licenses/ for third-party licenses
- licenses/README.rst explains that licenses/ is for third-party licenses
This helps keep the configuration environment driven.
This can be used to copy objects from a directory to another in the same bucket. The main use case for this will be in minting new editions from builds.
Implements the Edition.rebuild() method (called during a PATCH /edition/) to copy the new build to the edition's directory and purge the previous build from the Fastly cache.
- Add encrypted AWS credentials to Travis (.travis.yml)
- Add a test for delete_directory that uploads files to S3, deletes a directory and then tests what paths are available on S3.
- Test directory copying against a real S3 bucket. Looks at bucket contents to ensure that the original destination was overwritten with the source contents.
- Refactored utility code for uploading test files to S3 based on a list of relative file paths
- Tests that assertion errors are raised when the source and destination directories are not independent of each other.
The permission_required decorator needs to be bypassed when the IGNORE_AUTH configuration is set (i.e., in development mode).
Previously a Product resource would be created where the domain a product was served from was specified independently of the slug. With Fastly regex-based redirects we need to tie the slug (and therefore the directory in the bucket) to the domain a product is served from. Thus we have the user specify the documentation root domain and the root domain that Fastly serves from, and then compute the domain specific to the product itself. This suggests that perhaps the Fastly service ID, bucket name, Fastly root domain and doc root domain should all be refactored into a single row of a separate table (i.e., a 'site settings').
This regex allows lowercase letters, numbers and hyphens to be used as the slug (and thus the bucket directory and subdomain). However the slug must start with a letter and can't end with a hyphen.
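The rule above can be sketched as a regular expression. This is an illustrative pattern written from the description only, not the project's actual PRODUCT_SLUG_PATTERN. The name `is_valid_product_slug` is hypothetical, and accepting a single-letter slug is an assumption.

```python
import re

# Starts with a lowercase letter; may continue with lowercase letters,
# digits and hyphens; must not end with a hyphen.
# Illustrative only; not the pattern used in the Keeper code base.
PRODUCT_SLUG = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def is_valid_product_slug(slug: str) -> bool:
    """Return True if the whole string satisfies the slug rule."""
    return PRODUCT_SLUG.fullmatch(slug) is not None


for candidate in ["pipelines", "sqr-001", "1abc", "abc-", "Abc"]:
    print(candidate, is_valid_product_slug(candidate))
```

Using `fullmatch` avoids having to anchor the pattern with `^` and `$`.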
For consistency with the environment-variable based production config.
This takes advantage of the Travis CI AWS credentials set up for the S3 unit tests. With this we have now deprecated all integration tests; everything is run by py.test/Travis.
Tests to ensure that the AWS credentials are in place (to skip for testing, for example).
When a build is uploaded we query the edition table for editions that track the git refs matching the git refs of the build that was just uploaded. Includes validation in a test.
Setting the edition's published URL (e.g. product.lsst.io/v/edition-slug) from the slug, rather than allowing a URL to be set independently, allows us to use regex at the Fastly level to point URLs at the S3 bucket directories.
Since the slug is tied to bucket directories this involves copying the old S3 directory for the edition to the new directory specified by the slug and then deleting the old edition directory.
Like for editions, the published_url is driven by the slug of the build and domain of the product. This field is dynamically generated and maintains consistency with the Fastly URL layout.
Slug is main; title is Latest, tracks master branch
'main' edition is always published at the product's root URL rather than in the /v/ subdirectory.
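The URL layout for editions described in these commits could be sketched as follows. The function name `edition_published_url` is hypothetical, and the `https://` scheme is an assumption (the source gives only host and path).

```python
def edition_published_url(product_domain: str, edition_slug: str) -> str:
    """Compute an edition's published URL from the product domain and slug.

    Hypothetical helper: the 'main' edition lives at the product root,
    every other edition under /v/<slug>. The https scheme is assumed.
    """
    if edition_slug == "main":
        return "https://{0}".format(product_domain)
    return "https://{0}/v/{1}".format(product_domain, edition_slug)


print(edition_published_url("product.lsst.io", "main"))
print(edition_published_url("product.lsst.io", "edition-slug"))
```

Because the URL is derived rather than stored, it cannot drift from the bucket directory layout that the Fastly regex expects.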
coveralls.io consumes the coverage data from py.test --cov and displays it in a useful format after a Travis run.
is_authorized is intended to be used in contexts where you want a True/False boolean indicating a user has permission. This function helps in cases where IGNORE_AUTH=True is set, or a user has not authenticated at all.
This function can be used to automatically create an edition slug from a build's git_refs. I've added a customization to deal with LSST DM's tickets/DM-# ticket branches, creating slugs like DM-1234. Also adds a slug validator for editions/builds. I'll allow editions and builds to have more lax slugs (can start with numbers; can use uppercase in their names) than product slugs since product slugs are used as a subdomain.
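A rough sketch of the slug derivation described above. The function name `slug_from_git_refs`, the replacement of other punctuation with hyphens, and the joining of multiple refs with a hyphen are all assumptions for illustration. Only the tickets/DM-# handling comes from the commit message.

```python
import re

# Matches LSST DM ticket branches such as "tickets/DM-1234".
TICKET_BRANCH = re.compile(r"tickets/(DM-[0-9]+)")


def slug_from_git_refs(git_refs):
    """Build an edition slug from a list of git ref names (illustrative)."""
    parts = []
    for ref in git_refs:
        match = TICKET_BRANCH.fullmatch(ref)
        if match:
            # Ticket branches keep only the ticket identifier.
            parts.append(match.group(1))
        else:
            # Assumption: collapse any other punctuation into hyphens.
            parts.append(re.sub(r"[^A-Za-z0-9]+", "-", ref).strip("-"))
    return "-".join(parts)


print(slug_from_git_refs(["tickets/DM-1234"]))  # DM-1234
print(slug_from_git_refs(["master"]))           # master
```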
This is only done if the user posting the build has admin_editions powers and an edition does not already exist that tracks this git ref set.
For non-SSL contexts it was necessary to prepend the product domain (e.g. pipelines.lsst.io) to the Fastly CDN domain. However, with TLS we don't do this. It would be nice to create a switch to allow for deployments that don't use the TLS version of Fastly... consider this for later.
Mixed up the username and password secrets in the Kubernetes deployment secrets.
Originally for routes I placed the permissions_required decorator above the login_required decorator. I had assumed by conventional Python wisdom that the decorators would be applied from the bottom up, i.e., login -> permissions -> route decorator. However, it seems that the opposite was happening: route decorator -> permissions -> login. This meant that the permissions decorator was failing because the user wasn't set yet by the login method. Thus I've reversed the order of the login and permissions decorators. It seems to work correctly now.
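The behaviour can be reproduced without Flask. Decorators are applied bottom-up when the function is defined, but at call time the outermost (topmost) wrapper runs first. The two decorators below are simplified stand-ins written for this illustration, not the Keeper implementations.

```python
import functools

calls = []


def login_required(func):
    """Stand-in decorator that records when its wrapper runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("login")
        return func(*args, **kwargs)
    return wrapper


def permission_required(func):
    """Stand-in decorator that records when its wrapper runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("permission")
        return func(*args, **kwargs)
    return wrapper


@login_required          # outermost: its wrapper executes first
@permission_required     # innermost: executes after login
def view():
    calls.append("view")
    return "ok"


view()
print(calls)  # ['login', 'permission', 'view']
```

So a check that depends on the user being set has to sit below the decorator that sets it.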
- PRODUCT_SLUG_PATTERN: remove unnecessary grouping and ensure that the product slug begins with at least one lowercase letter and ends with a letter or number
- PATH_SLUG_PATTERN: use + to ensure at least one match
- TICKET_BRANCH_PATTERN: again use + to ensure a match in the ticket number
The GET methods for the collections of builds and editions weren't properly filtering by product slug. This commit adds true joins between the Build and Product (and Edition and Product) tables and then filters the joined Product against the slug supplied in the URL. D'oh! This should have been tested earlier.
Using create_all() conflicts with Alembic migrations. Since migrations are created in development mode this usage of create_all is counter-productive. Instead we'll settle on creating databases only through alembic db upgrades.
- Add Flask-Migrate 1.8 requirement
- Hook up Flask-Migrate in run.py
- Add migrations/ via ./run.py db init (only needed this once)
- Document usage in run.py
- Have flake8 ignore migrations directory
This migration creates the DB from scratch to the configuration currently specified in code.
Includes db migration.
Previously the s3.copy_directory function would only ensure that metadata was replicated from the original object to a new object. Now we have a use case for Editions where the original surrogate-key needs to be replaced with one standard for that edition. This patch ensures that old metadata is propagated, while still updating surrogate-key. Note that this requires an extra HEAD request on each object before copying it.
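The metadata handling described above amounts to copying the source object's metadata and overriding one entry. A minimal sketch follows. The function name `metadata_for_copy` is hypothetical, and the dictionary is assumed to use keys without the x-amz-meta- prefix, as boto3 returns them from a HEAD request.

```python
def metadata_for_copy(source_metadata, surrogate_key=None):
    """Return the metadata to attach to the copied object.

    Hypothetical helper. Existing metadata is preserved; if a
    surrogate_key is given it replaces the source object's value.
    """
    metadata = dict(source_metadata)  # copy so the input is not mutated
    if surrogate_key is not None:
        metadata["surrogate-key"] = surrogate_key
    return metadata


source = {"surrogate-key": "build-key", "custom": "value"}
print(metadata_for_copy(source, surrogate_key="edition-key"))
```

With S3, the resulting dictionary would need to be passed to the copy call together with a metadata directive of REPLACE, since a plain copy carries over the source metadata unchanged.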
Previously Editions inherited the surrogate key from the source build. The problem with this is that it permits some degree of unreliability if the surrogate key for objects actually being served by Fastly for that edition gets out of sync with the surrogate key for the last used build. Having a stable surrogate key for each edition obviates this entire vector of bugs and makes it easy to manually purge an edition.
- Adds surrogate_key column to editions.
- Editions populate their own surrogate key on their next rebuild (once all production editions have this key the column should be set to nullable=False).
- Use new s3.copy_directory option to specify a surrogate_key.
In production, all editions may need to be rebuilt, followed by a Fastly purge of everything, to get everything in sync.
Changes Unknown when pulling 4bd9a43 on tickets/DM-5901 into * on master.
LTD Keeper needs to purge Fastly when an Edition is rebuilt. Currently the surrogate-key for the build is also used to cover editions. This means that the key needed to purge an edition is the same as that for a build. Hence purging an edition means that the system needs to purge the surrogate key of the previous build.
We're seeing situations where the surrogate key that Keeper is purging is not the one that needs to be purged. A more robust configuration would be for each edition to have a stable surrogate-key that can be unambiguously purged.
This PR