Worklog

tavlean edited this page Jul 2, 2026 · 3 revisions

Newest entries first. Every session that changes the extension appends: date, what changed, why, gotchas for the next model. (Migrated from the extension repo's docs/WORKLOG.md when docs moved to this wiki, 2026-07-02.)

2026-07-02 (on GitHub) — Live API verified; prod content-type bug caught & fixed; docs policy codified

  • Tav published tavlean/rankedagi-raycast and pushed the site commits. https://rankedagi.com/api/export is LIVE and verified: 208 models / 78 benchmarks / 13 families with levels; the empty-levels fix is in effect (Sonnet 5 ships no levels array). Roadmap + Data-Contract updated; the extension now works against production with zero configuration.
  • Prod-only bug caught during verification: the static host serves the payload as application/octet-stream, and useFetch's default parser falls back to text() for non-JSON content types, so every list would have rendered empty in production while local dev (proper JSON header) worked. Fixed with an explicit parseResponse that calls response.json() in useDataset() (extension commit d153d8e); the AI-tools path already parsed explicitly. Gotcha now in Data Contract: consumers must never rely on this endpoint's content-type.
  • README rewritten store-facing (localhost dev instructions removed — endpoint is live); Data URL preference description genericized (was carrying a session-specific port).
  • CLAUDE.md added to the extension repo with Tav's docs policy: read the wiki before starting any work; update affected pages in the same session the work lands; create new pages when needed; stale docs are bugs, so fix them on sight; twin-doc rule for contract changes. (CLAUDE.md is excluded from the store PR at R4, noted there.)
  • Site repo: docs/api-export.md gained the content-type note (twin-doc rule). This wiki pushed to github.com/tavlean/rankedagi-raycast.wiki.git.
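
A minimal sketch of the content-type gotcha above, assuming a hypothetical helper named parseExportResponse (not the extension's actual code). It reads the body as text and parses it as JSON without consulting the Content-Type header, so an application/octet-stream response still yields data. ResponseLike is an illustrative structural type that the standard fetch Response satisfies.

```typescript
// Minimal structural type (illustrative); the standard fetch Response satisfies it.
interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

// Hypothetical helper: parse the body as JSON regardless of Content-Type.
async function parseExportResponse<T>(response: ResponseLike): Promise<T> {
  if (!response.ok) {
    throw new Error(`Export request failed with status ${response.status}`);
  }
  // Read raw text first so the parse never depends on the header.
  const body = await response.text();
  try {
    return JSON.parse(body) as T;
  } catch {
    throw new Error("Export payload is not valid JSON");
  }
}
```

A function of this shape could be passed as the parseResponse option of useFetch, which is where the entry above says the fix landed. The generic parameter T is left to the caller because the payload types live in the extension.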

2026-07-02 (late night) — Repos renamed for GitHub clarity

  • Tav: a repo named just "rankedagi" in GitHub Desktop would be indistinguishable from the main site. Renamed: extension repo → rankedagi-raycast, this wiki → rankedagi-raycast.wiki (GitHub pairs a wiki to its repo as <repo>.wiki.git, so the names must stay in lockstep). Every path reference updated across all three repos + session memory.
  • The Raycast manifest name stays "rankedagi" — that's the extension's Store slug and its folder name inside raycast/extensions; it is independent of the GitHub repo name. Don't "fix" it to match.
  • Ready for GitHub: create tavlean/rankedagi-raycast, push the extension repo; enable its wiki, then push this repo to github.com/tavlean/rankedagi-raycast.wiki.git (DevServers flow).

2026-07-02 (night) — Empty-levels fix (both layers) + docs moved to this wiki

  • Tav added test reasoning levels to Sonnet 5 with no benchmark values; Raycast showed a blank Reasoning Levels table. Fixed at BOTH layers, matching the site's own addendum-17 rule (zero-real-value levels never render publicly). Site: /api/export excludes reasoning-level rows with zero real benchmark values, and a family whose only levels are empty gets no levels array (site commit 930f221, +1 unit test, verified live: sonnet-5 lost its levels array, the 13 real families kept theirs). Extension: the levels table renders only columns/levels with ≥1 value and is omitted entirely otherwise (commit 8ee126d), so even stale cached data can't draw a blank table.
  • Docs restructure per Tav (DevServers pattern): the extension repo must stay doc-free — it gets PR'd to the Raycast Store verbatim. Everything moved to this wiki (local git repo until the GitHub repo + wiki exist): Home / Roadmap / Data-Contract / Architecture / Worklog / Raycast-Docs-Research + api-export-sample.json. Extension repo keeps only extension files plus an untracked .claude/PROJECT_BRIEF.md pointer.
  • Site repo gained docs/api-export.md — the producer-side contract doc Tav asked for ("how is the data exposed?"). Twin pages rule: docs/api-export.md (site) ↔ Data Contract (here) — update both.
  • R4 unblocked on identity: author: "tavlean" confirmed via DevServers; fork tavlean/raycast-extensions exists; screenshots convention = metadata/ folder.
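
The extension-side half of the empty-levels fix above can be sketched roughly as follows. LevelRow, LevelsTable and buildLevelsTable are hypothetical names and shapes chosen for illustration; the real payload types may differ (check api-export-sample.json).

```typescript
// Hypothetical shapes; the real payload may differ.
type LevelRow = { level: string; scores: Record<string, number | null> };
type LevelsTable = { columns: string[]; rows: LevelRow[] };

// A "real" value is a finite number; null and missing entries do not count.
const isRealValue = (value: number | null | undefined): value is number =>
  typeof value === "number" && isFinite(value);

function buildLevelsTable(rows: LevelRow[]): LevelsTable | null {
  // Keep only levels that carry at least one real benchmark value.
  const keptRows = rows.filter((row) =>
    Object.keys(row.scores).some((name) => isRealValue(row.scores[name])),
  );
  if (keptRows.length === 0) return null; // nothing to show: omit the table

  // Keep only columns with a value in at least one surviving level,
  // in first-seen order.
  const columns: string[] = [];
  for (const row of keptRows) {
    for (const name of Object.keys(row.scores)) {
      if (isRealValue(row.scores[name]) && columns.indexOf(name) === -1) {
        columns.push(name);
      }
    }
  }
  return { columns, rows: keptRows };
}
```

Returning null rather than an empty table lets the caller skip the section entirely, which is what keeps stale cached data from drawing a blank table.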

2026-07-02 (evening) — Repo moved to ~/Development/RaycastExtensions/rankedagi

  • Tav: the extension belongs in his RaycastExtensions folder, not Tavlean/. Moved (git history intact); all cross-repo path references updated on both sides. The site repo stays at ~/Development/Tavlean/RankedAGI.
  • Roadmap "Later" section restructured to track the site's score-record v2 follow-ons: per-score sources/receipts (nearest-term — the site already serves a slimmed prerendered /api/score-provenance), confidence/disagreeing-source display, level-semantics re-check after the site's Phase 6.

2026-07-02 (later, same session) — R3 built: four AI tools + ai.yaml

  • Tools search-models / get-model / rank-models-by-benchmark / compare-models in src/tools/, manifest tools array, ai.yaml with instructions + 3 evals. Executed by Codex from a spec brief; review found no bugs this round. Build + lint clean.
  • Architecture note: tools can't use React hooks, so data flows through src/lib/dataset.ts (plain fetch + Cache namespace "dataset", 1 h TTL, stale fallback on fetch failure). Shared logic extracted to constants.ts / collections.ts / resolve.ts; data.ts (the hook) consumes the same derivations — one source of truth.
  • Per current docs, AI instructions/evals belong in root ai.yaml, NOT package.json.ai (older examples show the latter — don't regress this).
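
As a rough sketch of the caching behaviour described in the architecture note above (1 h TTL, stale fallback on fetch failure): the Map below is an in-memory stand-in for the persistent Cache the extension uses, and loadWithStaleFallback, CacheEntry and the injectable clock are hypothetical names introduced for illustration.

```typescript
type CacheEntry<T> = { savedAt: number; value: T };

// In-memory stand-in (assumption); the extension uses a persistent cache instead.
const store = new Map<string, string>();
const TTL_MS = 60 * 60 * 1000; // 1 h, per the worklog

async function loadWithStaleFallback<T>(
  key: string,
  fetcher: () => Promise<T>,
  now: () => number = Date.now, // injectable clock, handy for tests
): Promise<T> {
  const raw = store.get(key);
  const cached = raw ? (JSON.parse(raw) as CacheEntry<T>) : undefined;

  // Fresh hit: skip the network entirely.
  if (cached && now() - cached.savedAt < TTL_MS) return cached.value;

  try {
    const value = await fetcher();
    store.set(key, JSON.stringify({ savedAt: now(), value }));
    return value;
  } catch (error) {
    // Fetch failed: serve the expired copy if one exists, else surface the error.
    if (cached) return cached.value;
    throw error;
  }
}
```

Storing entries as JSON strings mirrors a string-valued cache; because the function uses no React hooks, both the AI tools and the view commands could call something of this shape.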

2026-07-02 — Repo founded; R1 + R2 built (extension works headless; Raycast try pending)

  • Session flow: Tav's brief → Raycast docs research (see the Raycast-Docs-Research page, cited) → alignment answers locked (separate repo / public Store / v1 = two commands + AI tools) → roadmap → build.
  • R1 (site repo): /api/export prerendered endpoint, unit-tested, committed there (3b452b2). Live after the next site deploy; until then use a local site dev server via the Data URL preference.
  • R2 (extension repo): full extension built — manifest (2 view commands + dataUrl preference), src/lib/ data layer (useFetch stale-while-revalidate), Search Models (composite dropdown, rank accessories, detail with composites/benchmarks/levels tables + metadata + links), Search Benchmarks (RAGI Composites section + category sections, top-20 ranking tables). npm run build and npm run lint pass clean.
  • Executed by Codex from a spec-grade brief; review caught one real bug: links in the payload are {title, url} OBJECTS, not strings — types + rendering fixed (also reasoningEffort is a number). Gotcha: always code against the payload sample (api-export-sample.json in this wiki), not assumptions.
  • Icon: assets/icon.png (site's 512×512 logo, own background — works light/dark; ray lint resolves icons from assets/).
  • Data facts that bite: percent scores are FRACTIONS 0–1 (×100 for display); benchmark display name = name alone (never name + subtitle — doubles the qualifier); ascending: true means lower is better.
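
The payload gotchas in this entry (fractional percents, ascending meaning lower is better, links as {title, url} objects) can be illustrated with a small sketch. Score, formatPercent, rankScores and linkToMarkdown are hypothetical names; only the {title, url} link shape and the ascending flag come from the notes above.

```typescript
type Link = { title: string; url: string }; // links are objects, not strings
type Score = { model: string; value: number }; // hypothetical shape

// Percent scores arrive as fractions in 0–1; multiply by 100 for display.
function formatPercent(fraction: number, digits = 1): string {
  return `${(fraction * 100).toFixed(digits)}%`;
}

// ascending: true means lower is better, so the smallest value ranks first.
function rankScores(scores: Score[], ascending: boolean): Score[] {
  const sorted = scores.slice(); // copy so the input is not mutated
  sorted.sort((a, b) => (ascending ? a.value - b.value : b.value - a.value));
  return sorted;
}

// Render a link object as a Markdown link.
function linkToMarkdown(link: Link): string {
  return `[${link.title}](${link.url})`;
}
```

For example, formatPercent(0.5) gives "50.0%", and rankScores with ascending set to true puts the lowest value at the top of the ranking.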