Appwrite Arena announcement + documentation #2805
Walkthrough
Replaces and remaps the project icon set across multiple outputs: SCSS variables, generated CSS/SCSS selectors, and the icons info.json. Many icon identifiers are renamed and reassigned to different glyph codepoints; a new
Estimated code review effort: 4 (Complex) | ~45 minutes
Pre-merge checks: 3 passed
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/routes/docs/tooling/arena/+page.markdoc (1)
15-16: Keep the docs page versionless.
The 191/165/26 counts will go stale as soon as Arena adds or rebalances questions. Since this lives under /docs, I'd keep the dated numbers in the announcement post and make this page evergreen.
♻️ Suggested wording
-Arena evaluates each model using a pool of **191 questions** spanning **9 Appwrite service categories**:
+Arena evaluates each model using a pool of questions spanning Appwrite service categories such as:
-165 multiple-choice questions with a single correct answer. These scores are fully reproducible with no judge bias, providing a reliable baseline for comparison across models.
+Multiple-choice questions with a single correct answer. These scores are fully reproducible with no judge bias, providing a reliable baseline for comparison across models.
-26 open-ended questions scored from 0 to 1 by an AI judge using rubrics and reference answers. These questions test reasoning and real-world usage patterns that cannot be captured by multiple-choice alone. Scores may have slight variance due to the nature of AI-based evaluation.
+Open-ended questions scored from 0 to 1 by an AI judge using rubrics and reference answers. These questions test reasoning and real-world usage patterns that cannot be captured by multiple-choice alone. Scores may have slight variance due to the nature of AI-based evaluation.
Also applies to: 40-44
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/icons/output/web-icon.css`:
- Around line 23-223: The stylesheet is missing a centralized .web-icon-button
utility causing inconsistent styles where components (e.g., usages alongside
.web-icon-search, .web-icon-chevron-down, etc.) currently rely on local
component CSS; add a single .web-icon-button rule to web-icon.css that defines
the shared button container styles (display, size, alignment, cursor,
background/hover states, and accessible focus outline) so all components using
the .web-icon-button class get consistent appearance and behavior; ensure the
selector targets .web-icon-button and nested .web-icon-* pseudo elements
(before) so the icon glyphs render correctly and test in components like
MainFooter and changelog to confirm no local overrides are still required.
---
Nitpick comments:
In `@src/routes/docs/tooling/arena/+page.markdoc`:
- Around line 15-16: Remove hard-coded numeric counts from the docs page line
containing "Arena evaluates each model using a pool of **191 questions**
spanning **9 Appwrite service categories**" (and the other occurrences of 191 /
165 / 26) and replace them with versionless wording such as "a pool of questions
across multiple Appwrite service categories" or "a regularly updated pool of
questions spanning Appwrite service categories" so the page remains evergreen;
locate these exact phrases in +page.markdoc and update the copy accordingly.
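
One way to keep this nitpick from regressing is a small check that flags hard-coded counts in the docs copy. The sketch below is illustrative only: `findHardCodedCounts` and `COUNT_PATTERNS` are hypothetical names, and the patterns are assumptions tuned to the phrases quoted in this review, not an existing lint rule in the repository.

```typescript
// Hypothetical patterns for the dated counts quoted in the review comment.
const COUNT_PATTERNS: RegExp[] = [
  /\*\*\d+\s+questions\*\*/g, // e.g. "**191 questions**"
  /\*\*\d+\s+Appwrite service categories\*\*/g, // e.g. "**9 Appwrite service categories**"
  /^\d+\s+(?:multiple-choice|open-ended)\s+questions/gm, // e.g. "165 multiple-choice questions"
];

// Returns every hard-coded count phrase found in the given page source.
function findHardCodedCounts(markdoc: string): string[] {
  const hits: string[] = [];
  for (const pattern of COUNT_PATTERNS) {
    // With the global flag, String.prototype.match returns all matches or null.
    const matches = markdoc.match(pattern) || [];
    for (const m of matches) {
      hits.push(m);
    }
  }
  return hits;
}
```

An empty result means the page uses versionless wording; any hit points at copy that will need updating whenever the question pool changes.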
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c8f9109c-fe52-4375-8842-68c2c5f1779b
⛔ Files ignored due to path filters (8)
- src/icons/optimized/arena.svg is excluded by !**/*.svg
- src/icons/output/web-icon.eot is excluded by !**/*.eot
- src/icons/output/web-icon.svg is excluded by !**/*.svg
- src/icons/output/web-icon.symbol.svg is excluded by !**/*.svg
- src/icons/output/web-icon.ttf is excluded by !**/*.ttf
- src/icons/output/web-icon.woff is excluded by !**/*.woff
- src/icons/output/web-icon.woff2 is excluded by !**/*.woff2
- src/icons/svg/arena.svg is excluded by !**/*.svg
📒 Files selected for processing (10)
- src/icons/output/_variables.scss
- src/icons/output/info.json
- src/icons/output/web-icon.css
- src/icons/output/web-icon.scss
- src/lib/components/ui/icon/sprite/sprite.svelte
- src/lib/components/ui/icon/types.ts
- src/routes/blog/post/announcing-appwrite-arena/+page.markdoc
- src/routes/docs/Sidebar.svelte
- src/routes/docs/tooling/arena/+layout.svelte
- src/routes/docs/tooling/arena/+page.markdoc
Hmm, if those are swords, they are missing handles. Was this AI-made, or imported from some icon doc? Might be worth checking with the design team.
This was AI-generated. In the past (especially for skills), the design team advised me to use AI to generate these, as they are placed on the main docs page temporarily and will be moved to the AI section of the docs soon!
165 multiple-choice questions with a single correct answer. These scores are fully reproducible with no judge bias, giving you a reliable baseline for comparing models.

## AI-judged (open-ended)
Might be worth mentioning that if readers prefer not to see AI-judged questions as part of the benchmark, they can easily be toggled off under the "Scoring" filter.
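
To illustrate what such a toggle changes, here is a minimal sketch of a score that can include or exclude AI-judged questions. It assumes an unweighted mean over per-question scores; Arena's actual aggregation is not described in this thread, and `QuestionResult` and `overallScore` are hypothetical names.

```typescript
// Hypothetical result shape: multiple-choice answers are right or wrong,
// AI-judged answers carry a judge score between 0 and 1.
type QuestionResult =
  | { kind: "multiple-choice"; correct: boolean }
  | { kind: "ai-judged"; score: number };

// Unweighted mean of per-question scores (an assumption, not Arena's formula).
// When includeAiJudged is false, only multiple-choice questions count.
function overallScore(results: QuestionResult[], includeAiJudged: boolean): number {
  const scores = results
    .filter((r) => includeAiJudged || r.kind === "multiple-choice")
    .map((r) => (r.kind === "multiple-choice" ? (r.correct ? 1 : 0) : r.score));
  if (scores.length === 0) return 0; // avoid dividing by zero on an empty selection
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```

With the toggle off, the result depends only on the reproducible multiple-choice answers, which is the baseline the docs copy describes.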
Each model is tested in two contexts:

- **Without Skills**: The model answers using only its built-in training data.
- **With Skills**: The model answers with access to Appwrite's [Skills files](/docs/tooling/skills), which provide up-to-date SDK and API context.
Not sure if this is relevant or too technical: we provide skills as two tools, TypeScript and CLI. The system prompt mentions their availability, and we do not require the AI to use them if it doesn't want to.
The current benchmark also only tests JavaScript, with other languages coming in the future.
I think it would make it a bit hard to understand tbh
This PR adds Appwrite Arena to the documentation and adds an announcement blog post.