CI documentation state change events (attempt 2) #237
Closed
so, this has proven to be far more challenging than initially thought (#235). some notes in no particular order:
we really should be sharding build logs by date rather than by package. this is something we need to do regardless of the state change APIs.
our API clients (e.g. https://swiftonserver.com) want a linear storyline that goes something like “Fetch → Build → Link → Render”. however, it seems everything in the database is laid out to inhibit this sort of workflow. in retrospect, this is unsurprising because these four components were designed to operate independently, and it is extraordinarily difficult to reliably collate activity across the various subsystems into a coherent “plot line”.
each subsystem is designed to keep operating even if the other subsystems are down or disabled, and to recover the backlog when they have been offline for some time (e.g. for maintenance). for external clients, this means it is easy to “lose the plot” so to speak.
we have two choices of collection we could conceivably subscribe to in order to broadcast the state change events - `Snapshots` and `VolumeMetadata`. both schemas are designed to be written atomically, and neither of them has the capability to represent build state, including build failure. it would not be feasible to give them the ability to represent build state, because their schemas are very hostile to the sort of streaming updates that build state involves.
in practical terms, subscribing to them means API clients would only receive notifications of successful builds; they would hang forever if any of the steps in the documentation pipeline fail. in a sense, this is how Unidoc was originally designed to operate. for example, it is supposed to be okay to build symbol graphs even if the linker is offline - the linker may come back online again later and process the backlogged symbol graph.
currently, the way we track build state is by streaming the updates to `BuildMetadata`. this collection is per-package, because the version to build is not chosen until midway through the build process. subscribing to `BuildMetadata` would tell us about build failures, but build state isn’t what our API clients are actually interested in - it is the link state that they care about, since they want to render the linked docs. there is also a gap between the time a build completes and the time the docs are ready to render, and there is still the possibility that linking itself could fail.
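to make the “hang forever” problem concrete, here is a minimal Python sketch of how an API client might fold events into a single storyline. this is not Unidoc code: `Stage`, `Event`, and `outcome` are hypothetical names, and the sketch assumes an event stream that reports failures at every stage, which none of the collections discussed above provides today.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Stage(Enum):
    # hypothetical names for the server-side steps of the storyline;
    # rendering is what the client does once linking has succeeded
    FETCH = "fetch"
    BUILD = "build"
    LINK = "link"


@dataclass(frozen=True)
class Event:
    # hypothetical event shape; ok=False marks a failure at `stage`
    stage: Stage
    ok: bool


def outcome(events: Iterable[Event]) -> Optional[str]:
    """Return a terminal outcome, or None if the client must keep waiting."""
    for event in events:
        if not event.ok:
            # any failure ends the storyline immediately
            return f"failed at {event.stage.value}"
        if event.stage == Stage.LINK:
            # a successful link means the docs can be rendered
            return "ready to render"
    return None


# a build failure, as seen by a failure-aware stream
full = [Event(Stage.FETCH, True), Event(Stage.BUILD, False)]
# a success-only subscription never delivers the failure event
success_only = [event for event in full if event.ok]

print(outcome(full))          # failed at build
print(outcome(success_only))  # None
```

the second call returning `None` is the case where a client subscribed only to successful writes would wait indefinitely.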