Rework error handling #374

Merged
risicle merged 3 commits into main from ris-error-handling
May 7, 2025

risicle (Member) commented Apr 29, 2025

https://trello.com/c/EmuRWKZy/1037-experiment-with-removing-mrap-from-document-downloads-bucket

Best reviewed commit-by-commit.

This gets document-download-api acting more-or-less sensibly when encountering missing documents and/or documents that are supposed to be expired.

Will require some frontend changes to adapt to the endpoints returning anything other than 400 for these cases. (See alphagov/document-download-frontend#292)

The app-logic Expiration handling added here is a straight port of the logic that the frontend currently implements, though it could stand to be a lot stronger, potentially directly interpreting the retention-period tags and comparing them against a tag-stored last-modified timestamp (which doesn't exist yet). Doing this would reduce our 100% dependence on the correct operation of the lifecycle rules and the s3 native last-modified timestamp (which isn't guaranteed to always be there if we e.g. use s3 backup). But as I say, that stuff isn't done yet - this PR just ports the existing logic across.

risicle added 3 commits April 29, 2025 12:09
don't squash all store errors to 400 status code - if the intention
was to prevent information leak, it failed to do this because it
included the error string in the response. do this by making
DocumentStoreError a superclass of other store exception types so
exception handlers can continue to listen for the general
DocumentStoreError.

return 410 for keys that have a delete marker, 404 for documents
with no such key, 403 for documents that are blocked.

notably also catch NoSuchKey from check_for_blocked_document's
get_object_tagging calls and turn this into a DocumentNotFound
error, because in practice this will be the first problem a
request will encounter when trying to access a missing document.
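A minimal sketch of the exception hierarchy this commit message describes, not the code from the PR itself. `DocumentStoreError` and `DocumentNotFound` are named above; `DocumentExpired`, `DocumentBlocked`, the `suggested_status_code` attribute, the `status_for` helper and the 400 fallback on the base class are hypothetical names and choices made for illustration.

```python
class DocumentStoreError(Exception):
    """Base class: handlers that catch this still see every subclass."""

    # Assumed fallback for store errors with no more specific type.
    suggested_status_code = 400


class DocumentNotFound(DocumentStoreError):
    """No such key in the store."""

    suggested_status_code = 404


class DocumentExpired(DocumentStoreError):  # hypothetical name
    """The key has a delete marker."""

    suggested_status_code = 410


class DocumentBlocked(DocumentStoreError):  # hypothetical name
    """The document is blocked."""

    suggested_status_code = 403


def status_for(error: Exception) -> int:
    """Pick an HTTP status for an error raised while serving a document."""
    if isinstance(error, DocumentStoreError):
        return error.suggested_status_code
    # Anything that is not a store error is treated as a server fault.
    return 500
```

Because every specific type subclasses `DocumentStoreError`, an existing `except DocumentStoreError` block keeps working while newer handlers can distinguish the cases.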
…ation gracefully

instead of causing a crash this should simply return the metadata
with the expiration missing, allowing the client to make up its
own mind what to do. in most normal cases this is only used for
informational display, and in case of a maintenance issue or
misconfiguration it's better to have this information missing
rather than being unable to serve any documents to anyone (a
certain P1)
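One way the "missing rather than crashing" behaviour could look, as a sketch under stated assumptions. It assumes the store hands back S3's `Expiration` value in its usual `expiry-date="...", rule-id="..."` form; `parse_expiry`, `build_metadata` and the `size_in_bytes` / `available_until` keys are hypothetical names, not taken from the PR.

```python
import re
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional

_EXPIRY_RE = re.compile(r'expiry-date="([^"]+)"')


def parse_expiry(expiration_header: Optional[str]) -> Optional[datetime]:
    """Return the expiry datetime, or None if it is absent or unparseable."""
    if not expiration_header:
        return None
    match = _EXPIRY_RE.search(expiration_header)
    if match is None:
        return None
    try:
        return parsedate_to_datetime(match.group(1))
    except (TypeError, ValueError):
        # Malformed date: treat it the same as a missing header.
        return None


def build_metadata(head_response: dict) -> dict:
    """Build response metadata, omitting the expiry when it is unknown."""
    metadata = {"size_in_bytes": head_response.get("ContentLength")}
    expiry = parse_expiry(head_response.get("Expiration"))
    if expiry is not None:
        metadata["available_until"] = expiry.date().isoformat()
    return metadata
```

With this shape, a bucket whose lifecycle rules are missing or misconfigured still yields usable metadata; the client simply sees no expiry field.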
… app logic and Expiration metadata

this check was previously done in document-download-frontend, but
it really belongs in the document-download-api if anywhere, if
only because the api's download endpoints are publicly accessible
and a document denied access by just the frontend could always
just be downloaded via the api.

note this check still relies on the s3 bucket's lifecycle policies
being in place and implicitly the s3 object's last modified
timestamp. this will only "rescue" us in cases where a delete
marker has not yet been added by the lifecycle rule for some reason.
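A rough sketch of what an app-level expiry check of this kind might look like. The exact comparison ported from the frontend is not shown in this PR description, so the rule below (refuse once the expiry moment has been reached) is an assumption, and `DocumentExpired` and `ensure_not_expired` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Optional


class DocumentExpired(Exception):  # hypothetical name
    """Raised when a document's expiry moment has already passed."""


def ensure_not_expired(
    expiry: Optional[datetime], now: Optional[datetime] = None
) -> None:
    """Raise DocumentExpired if `expiry` is known and not in the future.

    Both datetimes are expected to be timezone-aware.
    """
    if expiry is None:
        # No Expiration metadata: nothing to compare against, so this check
        # cannot help and the lifecycle rules remain the only safeguard.
        return
    if now is None:
        now = datetime.now(timezone.utc)
    if expiry <= now:
        raise DocumentExpired(f"document expired at {expiry.isoformat()}")
```

Since the expiry value itself comes from the bucket's lifecycle rules, a check like this only covers the window between the expiry moment and the delete marker being added.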
@risicle risicle marked this pull request as draft April 29, 2025 12:14
@risicle risicle marked this pull request as ready for review May 1, 2025 08:52
@robinjam robinjam left a comment

This is a massive improvement, nice one

@risicle risicle merged commit af97c0b into main May 7, 2025
3 checks passed
@risicle risicle deleted the ris-error-handling branch May 7, 2025 14:10

2 participants