ReleaseDraft: upload diagnostics panel #81

Open
AudioSkyWalker wants to merge 2 commits into main from draft-upload-diagnostics

Conversation

@AudioSkyWalker
Contributor

Summary

Surfaces the upload/finalize/preview logs delivery-kid already persists on each content draft, so a failed transcode or pin shows what actually broke instead of a static "Preview transcoding failed" one-liner.

The plumbing exists — Justin's recent maybelle-config commits (f6ef786 persist upload + finalize logs, ada00c1 capture coconut progress events as preview_log) write upload_log, finalize_log, and preview_log to draft.json and expose them via GET /draft-content/{id}. The wiki side wasn't reading any of them. This PR closes that gap.

What it does

  • Adds a <div id="rd-diagnostics"> placeholder right after the status banner in ReleaseDraftContentHandler::fillParserOutput()
  • New initDiagnostics() in releaseDraft.js fetches /draft-content/{draft_id} on page load, polls every 10s while in-flight, stops once the draft reaches a terminal state
  • Renders:
    • Red banner + auto-expanded details when status is upload_failed / finalize_failed, or preview_status is failed — banner shows the last error field, details list every log entry with timestamp/stage/message and a monospace <pre> for structured error text
    • Amber banner + collapsed details while uploading / finalizing / preview transcoding
    • Hidden when the draft is clean and has no logs
  • Album drafts use /draft-album/{id} (which doesn't expose these fields) and are skipped — only content-typed drafts (video, blue-railroad, other) get the panel
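
The banner rules above can be sketched as a small decision function. This is a language-neutral illustration written in Python, not the renderer in releaseDraft.js (which is JavaScript). The failure statuses come from this description; the in-flight spellings (`uploading`, `finalizing`, `transcoding`), the returned dict shape, and the "logs but no banner" case are assumptions made for the sketch.

```python
# Statuses named in the PR description.
FAILED_STATUSES = {"upload_failed", "finalize_failed"}
# Assumed spellings for the in-flight states; the PR only names them in prose.
IN_FLIGHT_STATUSES = {"uploading", "finalizing"}
PREVIEW_IN_FLIGHT = "transcoding"
LOG_FIELDS = ("upload_log", "finalize_log", "preview_log")


def classify(draft):
    """Decide how the diagnostics panel should look for one draft payload.

    Returns None when the panel should stay hidden, otherwise a dict with
    the banner colour, whether <details> starts expanded, and whether the
    page should keep polling.
    """
    status = draft.get("status")
    preview = draft.get("preview_status")

    failed = status in FAILED_STATUSES or preview == "failed"
    in_flight = status in IN_FLIGHT_STATUSES or preview == PREVIEW_IN_FLIGHT
    has_logs = any(draft.get(field) for field in LOG_FIELDS)

    if failed:
        # Red banner, details auto-expanded.
        return {"banner": "red", "expanded": True, "poll": in_flight}
    if in_flight:
        # Amber banner, details collapsed, keep polling.
        return {"banner": "amber", "expanded": False, "poll": True}
    if has_logs:
        # Assumption: a clean draft with logs shows collapsed details only.
        return {"banner": None, "expanded": False, "poll": False}
    return None  # clean draft, no logs: panel hidden
```

For example, `classify({"status": "upload_failed"})` yields a red, expanded panel with polling stopped, while a finalized draft with no logs yields `None`.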

Preview

Standalone HTML rendering all five states (clean / in-flight / upload_failed / finalize_failed / preview_status: failed) against synthetic fixture data, using the verbatim renderer + CSS that ships in this PR:

https://skyler2.hunter.cryptograss.live

Test plan

  • Special:DeliverVideo upload of a clean clip → ReleaseDraft page loads, amber "Preview transcoding in progress…" banner appears, log entries stream as Coconut webhooks fire, panel collapses (or hides) once preview_status: ready
  • Upload an unparseable file (e.g. .txt renamed to .mp4) → red "Upload failed" banner with the ffprobe error from upload_log, details auto-expanded
  • Finalize with a deliberately bad strategy (e.g. force HEVC source through a Coconut tier that doesn't support it) → red "Finalize failed" banner pulled from finalize_log
  • Album draft (Special:DeliverRecord) → no panel rendered (intentional — /draft-album doesn't expose logs)
  • Clean finalized draft → no panel, no clutter

Future polish (not in this PR)

  • Friendlier timestamps ("5 min ago" vs raw ISO 8601) and friendlier status labels
  • "Delete this draft" / "Try again" affordance next to the error banner so users can act on a failed upload without leaving detritus
  • Collapse runs of Coconut: job.progress events into the latest one

…anel

Justin's pinning service already persists upload_log, finalize_log, and
preview_log on each content draft and exposes them via /draft-content/{id}.
The wiki side wasn't reading them — a failed transcode or pin showed up
on the draft page as a single static "Preview transcoding failed" line
with no detail.

Adds an Upload Diagnostics panel (placeholder rendered server-side,
populated by JS on load) that pulls all three logs from delivery-kid:
- Red banner with the last error when status is upload_failed,
  finalize_failed, or preview_status is failed
- Info banner while uploading / finalizing / preview transcoding
- Collapsible <details>, auto-expanded on any *_failed state and
  collapsed otherwise, listing each log entry with timestamp, stage,
  message, progress, and structured error text

Polls /draft-content every 10s while in-flight, then stops once the
draft reaches a terminal state. Album drafts use a different endpoint
without logs and are skipped.
@jMyles
Contributor

jMyles commented May 6, 2026

So, this pulls information (release draft state and logs, etc) from delivery-kid on page load? Does it like, get stored on pickipedia itself? What happens if delivery-kid is nuked and rebuilt? Will we lose history?

@AudioSkyWalker
Contributor Author

Yep jswizzle, both right — fetched live from /draft-content/{id} on page load and polled every 10s while in-flight. Nothing gets stored on the wiki side from these fetches; the panel is a read-only viewer over delivery-kid's draft.json.

Which means yes, lil bubba — if delivery-kid gets nuked and rebuilt the diagnostics history goes with it. The ReleaseDraft YAML keeps draft_id + user-supplied metadata, and on successful finalize the Release page captures the CID/gateway URL, but the upload/finalize/preview log trails live only in draft.json. That's the current design — pickipedia treats delivery-kid as the source of truth for in-flight upload state, with the recent 30-day TTL bound.

If we want diagnostic history to survive a delivery-kid rebuild, cleanest path I'd suggest is mirroring on terminal state: when status hits finalized / upload_failed / finalize_failed, the wiki snapshots the logs into the ReleaseDraft YAML via a wiki edit. Doable from the existing finalize flow — showFinalizeResult in releaseDraft.js:822 already writes the page on success, just need to extend it to include the log arrays (and add a similar hook on the upload-error path in deliverVideo.js). Happy to spike that as a follow-up PR if losing history feels painful — just want to know whether you'd rather snapshot it or keep the wiki YAML lean.

@jMyles
Contributor

jMyles commented May 6, 2026

Yeah, that sounds reasonable. You say that it already writes to the wiki - writes as who?

@AudioSkyWalker
Contributor Author

Writes as the currently logged-in wiki user, jswizzle — straight mw.Api().postWithEditToken() with the user's session cookie + CSRF token. The user who clicked Upload authors the draft creation (createReleaseDraftPage at deliverVideo.js:304), the user who clicked Finalize authors the post-finalize edit (showFinalizeResult at releaseDraft.js:822). No service identity in this path.

Which has an implication worth flagging before I spike anything, lil bubba — a browser-side snapshot inherits a gap: if the uploader closes the tab before the SSE reaches error / complete, no edit fires and the logs still go down with delivery-kid. Most users will hang around through finalize, but a transcoding failure that surfaces 10 minutes in is squarely in "user has wandered off" territory.

If we want it bulletproof against tab-close, the alternative is delivery-kid writing back as a bot account — the Blue Railroad bot pattern already exists (the LocalSettings has an exempt-from-verification group for trusted bots, and enrich.py already documents that the caller is responsible for the wiki write, so half the muscle is there). Tradeoff is provisioning a bot credential for the pinning service and tightening scope on what a bot is allowed to touch on a ReleaseDraft.

So three flavors to pick from:

  1. Browser-only snapshot — easiest, has the tab-close gap. Fine if we expect rebuilds to be rare and most users stick around for finalize.
  2. delivery-kid writes back as a bot — robust, needs a credential + a scope rule (only release-draft-yaml namespace, only for drafts the bot is finalizing).
  3. Skip the snapshot — accept that diagnostics are ephemeral until/unless a rebuild actually costs us something we want back.

Which way do you want to push?

@jMyles
Contributor

jMyles commented May 6, 2026

I think 2. Is that easy?

@AudioSkyWalker
Contributor Author

Honest read, jswizzle: moderate. Not a one-liner, but most of the muscle's in place. The Blue Railroad import tool already does exactly this dance (tools/blue-railroad-import/blue_railroad_import/wiki_client.py uses mwclient with site.login(username, password), then page.save()), so we have a working pattern to crib from. Concrete steps:

  • Bot account + bot password on pickipedia (BotPasswords special page → bot group + exempt-from-verification) — 5 min
  • Credentials in delivery-kid — slot into secrets/vault.yml next to DELIVERY_KID_API_KEY, deployed via existing ansible — 15 min
  • pickipedia_client.py in pinning-service — port the MWClientWrapper pattern from blue-railroad-import — couple hours
  • Hook points at the four terminal-state transitions (upload_failed, finalize_failed, finalized, preview_status: failed/ready) — fire next to where draft.json already gets written — couple hours

The one design call I want to flag before I spike, lil bubba — where do the snapshotted logs land on the wiki page?

  1. Same page, extend the YAML: upload_log:, finalize_log:, and preview_log: get appended to the existing ReleaseDraft YAML. Pro: one fetch, JS already has it. Con: the bot and the user are both writing the same page. If the user is editing trim points or venue when the bot fires, we get an edit conflict; the wiki API does base-revision detection, but we'd need a retry-with-merge loop on the bot side.

  2. Sub-page: ReleaseDraft:<id>/diagnostics with raw log JSON. Pro: zero conflict surface, bot owns its own page. Con: extra fetch from JS, slightly more rendering glue.

I lean sub-page on the conflict-avoidance argument — bots stomping in-progress user edits is the kind of thing that becomes a recurring papercut, and ReleaseDraft:<id>/diagnostics is also a clean place to land machine-readable run history without polluting the human-edited YAML. But if you'd rather keep everything in one YAML and accept the conflict-handling complexity, I can wire that up too.

Total estimate: half a day for sub-page, a full day if we go same-page-with-conflict-handling. Want me to spike it as a follow-up PR after this one merges?

@jMyles
Contributor

jMyles commented May 6, 2026

Yeah sub-page sounds good. Make it so.

@AudioSkyWalker
Contributor Author

Roger that jswizzle — sub-page it is. Spike list, in order:

  1. Bot account + bot password on pickipedia (DeliveryKidBot or whatever you'd prefer to call it) in the bot + exempt-from-verification groups
  2. Creds into maybelle-config/secrets/vault.yml, surfaced as PICKIPEDIA_BOT_PASSWORD in the pinning-service env
  3. pickipedia_client.py in delivery-kid/pinning-service/app/services/ — port the MWClientWrapper pattern from blue-railroad-import/wiki_client.py
  4. Hooks at the four terminal-state transitions in routes/content.py and routes/coconut.py — write to ReleaseDraft:<id>/diagnostics with the persisted upload_log / finalize_log / preview_log as raw JSON
  5. Wire the diagnostics-panel JS in this PR to merge live /draft-content data with the snapshotted sub-page (live wins while the draft is in-flight, sub-page fills in once delivery-kid forgets)

Quick housekeeping question for you, lil bubba — good to merge this PR as-is (read-only viewer over delivery-kid's live state) and do the snapshot persistence as a clean follow-up PR? Or hold this open and bundle?

@jMyles
Contributor

jMyles commented May 6, 2026

I think let's see what you can do for the bot pattern/sub page first, then we'll merge.

@AudioSkyWalker
Contributor Author

On it jswizzle. Couple of gates I'll need your hand on to land it end-to-end:

  1. Bot account on pickipedia.xyz: DeliveryKidBot (or whatever name you'd prefer) provisioned with bot + exempt-from-verification group memberships, plus a BotPasswords-issued password. The account creation + group grants need an admin; I can prep instructions but can't do it solo.
  2. Vault password for maybelle-config/secrets/vault.yml — to slot the bot password in for ansible to pick up at deploy time.

Solo work I can spike right now without waiting on those, lil bubba:

  • delivery-kid/pinning-service/app/services/pickipedia_client.py: mwclient-based wiki write helper, ported from the MWClientWrapper in blue-railroad-import/wiki_client.py. Reads bot creds from env (PICKIPEDIA_BOT_USER / PICKIPEDIA_BOT_PASSWORD); no-ops gracefully when unset so dev environments without the creds don't crash
  • Terminal-state hooks in routes/content.py and routes/coconut.py that call the client on upload_failed / finalize_failed / finalized / preview_status: ready|failed, writing JSON to ReleaseDraft:<id>/diagnostics
  • Tests in pinning-service/tests/test_pickipedia_client.py patterning off the existing test_coconut.py / test_staging.py
  • This PR's JS extended to merge live /draft-content with the snapshotted ReleaseDraft:<id>/diagnostics sub-page — live wins while the draft is in-flight, sub-page fills in once delivery-kid forgets
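
A rough sketch of the write helper described in the list above, assuming the env var names given there. It is not the code in pickipedia_client.py: the function names, the payload shape, the edit summary, and the default mwclient script path are all hypothetical. The `snapshot_at` field and the ReleaseDraft:<id>/diagnostics title follow this thread. mwclient is imported lazily so the no-op path works without it installed.

```python
import json
import os
from datetime import datetime, timezone

WIKI_HOST = "pickipedia.xyz"  # host named in this thread; script path is assumed


def build_snapshot(draft_id, draft):
    """Return (page_title, page_text) for the diagnostics sub-page."""
    title = f"ReleaseDraft:{draft_id}/diagnostics"
    payload = {
        "draft_id": draft_id,
        "status": draft.get("status"),
        "preview_status": draft.get("preview_status"),
        "snapshot_at": datetime.now(timezone.utc).isoformat(),
        "upload_log": draft.get("upload_log", []),
        "finalize_log": draft.get("finalize_log", []),
        "preview_log": draft.get("preview_log", []),
    }
    return title, json.dumps(payload, indent=2)


def write_snapshot(draft_id, draft, env=os.environ):
    """Write the snapshot to the wiki; return False (no-op) without creds."""
    user = env.get("PICKIPEDIA_BOT_USER")
    password = env.get("PICKIPEDIA_BOT_PASSWORD")
    if not user or not password:
        # Dev environments without bot creds: skip quietly instead of crashing.
        return False

    import mwclient  # third-party; only needed when creds are present

    site = mwclient.Site(WIKI_HOST)  # assumes mwclient's default script path
    site.login(user, password)
    title, text = build_snapshot(draft_id, draft)
    site.pages[title].save(text, summary="Snapshot upload diagnostics")
    return True
```

Calling `write_snapshot("abc", draft, env={})` returns `False` without touching the network, which is the behaviour the "no-ops gracefully when unset" bullet asks for.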

The maybelle-config side will land as a separate PR over there; the wiki-side JS merge gets added to this PR. I'll ping back here when both are up so you can eyeball before granting the bot account.

Heading in now.

@jMyles
Contributor

jMyles commented May 6, 2026

No no, just use the existing Blue Railroad Imports.

@AudioSkyWalker
Contributor Author

Heard jswizzle — calling off the bot recruitment drive.

Blue Railroad bot is hereby promoted from "hauls Blue Railroad videos in" to "also writes the after-action report on every draft she touches." Same engine, second job, no raise. She'll figure it out — she's a workhorse.

Cuts the spike from ~half a day to maybe two hours, lil bubba — skipping the bot provisioning + vault dance entirely, just porting the creds + wiki_client.py pattern straight over. The cryptograss civil service is spared the addition of a redundant bureaucrat, and I'm spared explaining to a freshly-minted DeliveryKidBot why her LinkedIn says she's been employed for negative four hours.

Heading in.

…draft

When initDiagnostics' fetch of /draft-content/{id} returns 404, fall
back to reading ReleaseDraft:{id}/diagnostics via the MediaWiki API.
That sub-page is written by delivery-kid's pickipedia_client at every
terminal-state transition (cryptograss/maybelle-config branch
delivery-kid-snapshot-diagnostics) and survives a delivery-kid storage
rebuild.

The renderer marks snapshot-sourced data with "(snapshot)" in the
disclosure summary and surfaces snapshot_at + a "wiki snapshot —
delivery-kid no longer has live data" note in the meta block, so users
know they're looking at a frozen view rather than live state.
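
The fallback order described in this commit message can be sketched as follows. The real logic lives in initDiagnostics in releaseDraft.js; this Python version only illustrates the decision. The two fetch callables, the returned dict shape, and the "Upload diagnostics" label text are hypothetical; the 404 trigger and the "(snapshot)" marker come from the message above.

```python
from typing import Callable, Optional, Tuple

LiveFetch = Callable[[str], Tuple[int, Optional[dict]]]  # (HTTP status, body)
SnapshotFetch = Callable[[str], Optional[dict]]          # None if no sub-page


def load_diagnostics(draft_id: str,
                     fetch_live: LiveFetch,
                     fetch_snapshot: SnapshotFetch) -> Optional[dict]:
    """Prefer live delivery-kid data; fall back to the wiki snapshot on 404."""
    status_code, body = fetch_live(draft_id)
    if status_code == 200 and body is not None:
        return {"source": "live", "data": body}
    if status_code == 404:
        # delivery-kid no longer knows this draft: try the wiki sub-page.
        snapshot = fetch_snapshot(draft_id)
        if snapshot is not None:
            return {"source": "snapshot", "data": snapshot}
    return None  # nothing to render


def summary_label(result: dict) -> str:
    """Disclosure summary text; snapshot data is marked as such."""
    base = "Upload diagnostics"  # assumed label text
    return f"{base} (snapshot)" if result["source"] == "snapshot" else base
```

Only a 404 triggers the fallback in this sketch, so a transient 5xx from delivery-kid does not silently swap live state for a frozen view.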
@AudioSkyWalker
Contributor Author

Spike's up jswizzle. Two PRs to look at:

  • delivery-kid side: cryptograss/maybelle-config#86 (pickipedia_client.py, hooks at the four terminal states, ansible env-var plumbing, 14 new tests; existing 20 still pass, all 34 green)
  • wiki side (this PR): new commit 82814c4 adds the snapshot fallback to initDiagnostics — when /draft-content returns 404 the JS pulls ReleaseDraft:{id}/diagnostics via the MediaWiki API and renders it with a "(snapshot)" disclosure label + a "wiki snapshot — delivery-kid no longer has live data" note in the meta block

Bot reuse landed exactly as you called it, lil bubba — Blue Railroad's Magent@magent BotPassword does double duty, no new credential, no vault dance. The pinning-service no-ops cleanly when the password env var is unset so dev environments without creds still run.

Hooks run as fire-and-forget asyncio.create_task calls at each terminal transition, so wiki blips never block the route or mask the underlying upload outcome. The finalize finally block snapshots BEFORE the rmtree runs on success (since the in-memory state is the only copy of finalize_log once draft.json is wiped).

Ready for review on both. Once they merge, redeploy delivery-kid with --rebuild so the new mwclient dep gets pulled into the image, and you should see ReleaseDraft:<uuid>/diagnostics pages start materializing on terminal states.
