CI documentation state change events (attempt 2) #237

Closed
wants to merge 1 commit into from

@tayloraswift (Owner) commented May 26, 2024

so, this has proven to be far more challenging than initially thought (#235). some notes in no particular order:

  • we really should be sharding build logs by date rather than by package. this is something we need to do regardless of the state change APIs.

    • build logs should be per-edition, not per-package.
  • our API clients (e.g. https://swiftonserver.com) want a linear storyline that goes something like “Fetch → Build → Link → Render”; however, everything in the database seems to be laid out in a way that inhibits this sort of workflow. in retrospect, this is unsurprising because these four components were designed to operate independently, and it is extraordinarily difficult to reliably collate activity across the various subsystems into a coherent “plot line”.

  • each subsystem is designed to keep operating even if the other subsystems are down or disabled, and to recover the backlog when they have been offline for some time (e.g. for maintenance). for external clients, this means it is easy to “lose the plot” so to speak.

  • we have two choices of collection we could conceivably subscribe to in order to broadcast the state change events - Snapshots and VolumeMetadata. both schemas are designed to be written atomically, and neither of them has the capability to represent build state, including build failure. it would not be feasible to give them the ability to represent build state, because their schemas are very hostile to the sort of streaming updates that build state involves.

  • in practical terms, subscribing to them means API clients would only receive notifications of successful builds; they would hang forever if any of the steps in the documentation pipeline fail. in a sense, this is how Unidoc was originally designed to operate. for example, it is supposed to be okay to build symbol graphs even if the linker is offline - the linker may come back online again later and process the backlogged symbol graph.

  • currently, the way we track build state is by streaming the updates to BuildMetadata. this collection is per-package, because the version to build is not chosen until midway through the build process. subscribing to BuildMetadata would tell us about build failures, but build state isn’t what our API clients are actually interested in - it is the link state that they care about, since they want to render the linked docs. there is also a gap between the time a build completes and the time the docs are ready to render, and there is still the possibility that linking itself could fail.
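
to make the collation problem above a bit more concrete, here is a rough sketch (in Python, purely illustrative) of folding three independently written facts into one client-facing phase. every name here (`Phase`, `BuildRecord`, `collate`) is hypothetical and not part of Unidoc; the sketch assumes a build record loosely modeled on BuildMetadata, and it ignores the fact that BuildMetadata is per-package rather than per-version.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    PENDING = "pending"            # nothing observed yet: backlogged, or silently failed
    BUILDING = "building"
    BUILD_FAILED = "build_failed"
    BUILT = "built"                # a snapshot exists, but docs are not linked yet
    LINKED = "linked"              # docs are ready to render


@dataclass
class BuildRecord:
    """Hypothetical streaming build record, loosely modeled on BuildMetadata."""
    in_progress: bool
    failed: bool


def collate(build: Optional[BuildRecord], has_snapshot: bool, has_volume: bool) -> Phase:
    """Fold three independently written facts into a single phase."""
    # The atomically written collections can only ever report success.
    if has_volume:
        return Phase.LINKED
    if has_snapshot:
        return Phase.BUILT
    # Only the build record can report failure or progress.
    if build is None:
        return Phase.PENDING
    if build.failed:
        return Phase.BUILD_FAILED
    if build.in_progress:
        return Phase.BUILDING
    return Phase.PENDING
```

without the build record, `collate(None, False, False)` can only ever say `PENDING`, which is the “hang forever” case: a failed build and a backlogged build look identical. and even with it, the gap between `BUILT` and `LINKED` has no failure state in this sketch, which mirrors the link failure problem.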

@tayloraswift (Owner, Author) commented

i think it is okay to direct API clients to use polling for volume link state - linking usually completes within seconds of a successful build. the polling could take place on a short enough interval that we wouldn’t have to solve the link failure problem at all - clients could simply give up after some number (3?) of attempts.
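
a minimal sketch of what that client-side polling might look like, in Python. the `is_linked` callback stands in for whatever request the client would make to check volume link state (no such endpoint is specified here), and the 2 second interval is an assumption, not a recommendation.

```python
import time
from typing import Callable


def poll_link_state(
    is_linked: Callable[[], bool],
    attempts: int = 3,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check link state up to `attempts` times, then give up.

    Returns True as soon as the docs are linked, and False if every
    attempt came back negative. A False result does not distinguish
    "linking failed" from "linking is slow"; the client just stops waiting.
    """
    for attempt in range(attempts):
        if is_linked():
            return True
        # Do not sleep after the final attempt.
        if attempt + 1 < attempts:
            sleep(interval)
    return False
```

the point of the bounded loop is that the client never needs to learn *why* linking did not happen; it treats a timeout as failure.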

the longest phase of the documentation pipeline is likely to be the compilation phase. to prevent clients from pinging the server at a high frequency during the entire workflow, they could instead subscribe to the build state at a lower frequency using a long-poll mechanism. but this has weird edge cases too. for example, if the build finishes very quickly, it could appear to the client that the build has never started at all. things could also get weird if someone else is already building the same package concurrently.
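
one possible way to blunt the “finished before i started listening” edge case is to attach a monotonically increasing sequence number to build state, and have clients long-poll for “anything newer than the sequence i last saw”. this is only a sketch of that idea in Python, not something this PR implements; `BuildStateFeed` and its methods are hypothetical names, and the in-process `threading.Condition` stands in for whatever the server would really use.

```python
import threading
from typing import Tuple


class BuildStateFeed:
    """Hypothetical in-memory feed of build state for a single package."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._sequence = 0      # bumped on every state change
        self._state = "idle"

    def publish(self, state: str) -> int:
        """Record a state change and wake every waiting long-poll."""
        with self._cond:
            self._sequence += 1
            self._state = state
            self._cond.notify_all()
            return self._sequence

    def wait_for_change(self, since: int, timeout: float) -> Tuple[int, str]:
        """Block until the sequence exceeds `since`, or until `timeout` elapses.

        Always returns the current (sequence, state); the caller compares the
        returned sequence against `since` to tell a change from a timeout.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._sequence > since, timeout=timeout)
            return self._sequence, self._state
```

a client that recorded sequence 0 before requesting a build, and then long-polls with `since=0`, gets an immediate answer even if the build already started and finished, because the sequence has moved on. the concurrent-build case is not solved by this: a build started by someone else bumps the same sequence, and the sketch has no notion of who requested what.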
