-
Notifications
You must be signed in to change notification settings - Fork 0
Worklog
tavlean edited this page Jul 2, 2026
·
3 revisions
Newest entries first. Every session that changes the extension appends: date, what changed, why, gotchas for the next model. (Migrated from the extension repo's docs/WORKLOG.md when docs moved to this wiki, 2026-07-02.)
2026-07-02 (on GitHub) — Live API verified; prod content-type bug caught & fixed; docs policy codified
- Tav published
tavlean/rankedagi-raycastand pushed the site commits.https://rankedagi.com/api/exportis LIVE and verified: 208 models / 78 benchmarks / 13 families with levels; the empty-levels fix is in effect (Sonnet 5 ships no levels array). Roadmap + Data-Contract updated; the extension now works against production with zero configuration. -
Prod-only bug caught during verification: the static host serves the payload as
application/octet-stream, anduseFetch's default parser falls back totext()for non-JSON content types — every list would have rendered empty in production while local dev (proper JSON header) worked. Fixed with an explicitparseResponse→response.json()inuseDataset()(extension commitd153d8e); the AI-tools path already parsed explicitly. Gotcha now in Data Contract: consumers must never rely on this endpoint's content-type. - README rewritten store-facing (localhost dev instructions removed — endpoint is live); Data URL preference description genericized (was carrying a session-specific port).
-
CLAUDE.mdadded to the extension repo with Tav's docs policy: read the wiki before starting any work; update affected pages in the same session work lands; create new pages when needed; stale docs are bugs — fix them on sight; twin-doc rule for contract changes. (CLAUDE.md is excluded from the store PR at R4, noted there.) - Site repo:
docs/api-export.mdgained the content-type note (twin-doc rule). This wiki pushed togithub.com/tavlean/rankedagi-raycast.wiki.git.
- Tav: a repo named just "rankedagi" in GitHub Desktop would be indistinguishable from the main site. Renamed: extension repo →
rankedagi-raycast, this wiki →rankedagi-raycast.wiki(GitHub pairs a wiki to its repo as<repo>.wiki.git, so the names must stay in lockstep). Every path reference updated across all three repos + session memory. -
The Raycast manifest
namestays"rankedagi"— that's the extension's Store slug and its folder name insideraycast/extensions; it is independent of the GitHub repo name. Don't "fix" it to match. - Ready for GitHub: create
tavlean/rankedagi-raycast, push the extension repo; enable its wiki, then push this repo togithub.com/tavlean/rankedagi-raycast.wiki.git(DevServers flow).
- Tav added test reasoning levels to Sonnet 5 with no benchmark values; Raycast showed a blank Reasoning Levels table. Fixed at BOTH layers, matching the site's own addendum-17 rule (zero-real-value levels never render publicly): site —
/api/exportexcludes reasoning-level rows with zero real benchmark values, and a family whose only levels are empty gets nolevelsarray (site commit930f221, +1 unit test, verified live: sonnet-5 lost its levels array, the 13 real families kept theirs); extension — the levels table renders only columns/levels with ≥1 value and is omitted entirely otherwise (commit8ee126d), so even stale cached data can't draw a blank table. - Docs restructure per Tav (DevServers pattern): the extension repo must stay doc-free — it gets PR'd to the Raycast Store verbatim. Everything moved to this wiki (local git repo until the GitHub repo + wiki exist): Home / Roadmap / Data-Contract / Architecture / Worklog / Raycast-Docs-Research +
api-export-sample.json. Extension repo keeps only extension files plus an untracked.claude/PROJECT_BRIEF.mdpointer. - Site repo gained
docs/api-export.md— the producer-side contract doc Tav asked for ("how is the data exposed?"). Twin pages rule:docs/api-export.md(site) ↔ Data Contract (here) — update both. - R4 unblocked on identity:
author: "tavlean"confirmed via DevServers; forktavlean/raycast-extensionsexists; screenshots convention =metadata/folder.
- Tav: the extension belongs in his
RaycastExtensionsfolder, notTavlean/. Moved (git history intact); all cross-repo path references updated on both sides. The site repo stays at~/Development/Tavlean/RankedAGI. - Roadmap "Later" section restructured to track the site's score-record v2 follow-ons: per-score sources/receipts (nearest-term — the site already serves a slimmed prerendered
/api/score-provenance), confidence/disagreeing-source display, level-semantics re-check after the site's Phase 6.
- Tools
search-models/get-model/rank-models-by-benchmark/compare-modelsinsrc/tools/, manifesttoolsarray,ai.yamlwith instructions + 3 evals. Executed by Codex from a spec brief; review found no bugs this round. Build + lint clean. - Architecture note: tools can't use React hooks, so data flows through
src/lib/dataset.ts(plain fetch +Cachenamespace "dataset", 1 h TTL, stale fallback on fetch failure). Shared logic extracted toconstants.ts/collections.ts/resolve.ts;data.ts(the hook) consumes the same derivations — one source of truth. - Per current docs, AI instructions/evals belong in root
ai.yaml, NOTpackage.json.ai(older examples show the latter — don't regress this).
- Session flow: Tav's brief → Raycast docs research (Raycast Docs Research, cited) → alignment answers locked (separate repo / public Store / v1 = two commands + AI tools) → roadmap → build.
-
R1 (site repo):
/api/exportprerendered endpoint, unit-tested, committed there (3b452b2). Live after the next site deploy; until then use a local site dev server via the Data URL preference. -
R2 (extension repo): full extension built — manifest (2 view commands +
dataUrlpreference),src/lib/data layer (useFetchstale-while-revalidate), Search Models (composite dropdown, rank accessories, detail with composites/benchmarks/levels tables + metadata + links), Search Benchmarks (RAGI Composites section + category sections, top-20 ranking tables).npm run buildandnpm run lintpass clean. - Executed by Codex from a spec-grade brief; review caught one real bug:
linksin the payload are{title, url}OBJECTS, not strings — types + rendering fixed (alsoreasoningEffortis a number). Gotcha: always code against the payload sample (api-export-sample.jsonin this wiki), not assumptions. - Icon:
assets/icon.png(site's 512×512 logo, own background — works light/dark;ray lintresolves icons fromassets/). - Data facts that bite: percent scores are FRACTIONS 0–1 (×100 for display); benchmark display name =
namealone (never name + subtitle — doubles the qualifier);ascending: truemeans lower is better.