Skip to content

test: evals audit datasets to dev-facing cases#16433

Merged
denolfe merged 4 commits intomainfrom
ai/evals-audit-datasets
Apr 30, 2026
Merged

test: evals audit datasets to dev-facing cases#16433
denolfe merged 4 commits intomainfrom
ai/evals-audit-datasets

Conversation

@denolfe
Copy link
Copy Markdown
Member

@denolfe denolfe commented Apr 30, 2026

Overview

Trims test/evals/datasets/ so the eval suites measure knowledge a developer applies while building an application with Payload, not knowledge a Payload-monorepo contributor needs. Also adds shorthand npm scripts for running individual eval suites.

Key Changes

  • Trimmed conventions/qa.ts from 10 cases to 1

    • Dropped 9 cases lifted from CLAUDE.md (types vs interfaces, boolean naming, function vs class, translation paths, afterEach cleanup, conventional-commits, dev-server flags, auto-login creds, single-object-parameter convention).
    • Kept the payload.logger.error shape case, the only one that describes a call shape a Payload consumer writes in their own code.
  • Removed plugins/official/qa.ts

    • 11 reference-doc QA cases ("what does plugin X do") testing recall, not application. The borderline MCP-config case is already covered by plugins/official/codegen.ts via real code generation.
    • eval.official-plugins.spec.ts updated to drop the QA registration; codegen registration unchanged.
  • Corrected the audience map in EvalDashboard/audience.ts

    • negative retagged from maintainers to users. Six of seven retained negative cases are dev-facing (debugging your own broken config); the map can't split sub-arrays, so users is the better representative tag.
    • Removed three category keys (commits, structure, testing) that no longer appear in any dataset after the conventions trim.
  • Added test:eval:<suite> shorthand scripts

    • One per suite (building-plugins, collections, config, conventions, fields, graphql, local-api, negative, official-plugins, rest-api). Each delegates to the :skill variant, matching the project-wide default.

Design Decisions

The dividing line is "would a developer consuming payload from npm encounter this?" If no, the case is contributor-only and removed.

Three pre-existing categories were intentionally kept in scope but untouched:

  • negative/codegen.ts negativeInvalidInstructionDataset is an eval-pipeline self-test (it verifies tsc rejects bad types) and is preserved as-is.
  • plugins/qa.ts and plugins/codegen.ts stay because developers may colocate plugins inside their own project structure.
  • Other dead audience-map keys ('access-control', admin, 'building-plugins', conventions, hooks, 'official-plugins', translations) were dead before this audit and were left to keep the diff focused.

conventions/qa.ts and eval.conventions.spec.ts are kept rather than deleted so the surviving coding-category case still runs as a registered suite.


@denolfe denolfe changed the title test(evals): audit datasets to dev-facing cases test: audit datasets to dev-facing cases Apr 30, 2026
@denolfe denolfe changed the title test: audit datasets to dev-facing cases test: evals audit datasets to dev-facing cases Apr 30, 2026
@github-actions
Copy link
Copy Markdown
Contributor

📦 esbuild Bundle Analysis for payload

This analysis was generated by esbuild-bundle-analyzer. 🤖

Meta File Out File Size (raw) Note
packages/next/meta_index.json esbuild/index.js 985.42 KB 🆕 Added
packages/payload/meta_index.json esbuild/index.js 1.39 MB 🆕 Added
packages/payload/meta_shared.json esbuild/exports/shared.js 191.30 KB 🆕 Added
packages/richtext-lexical/meta_client.json esbuild/exports/client_optimized/index.js 287.18 KB 🆕 Added
packages/ui/meta_client.json esbuild/exports/client_optimized/index.js 1.19 MB 🆕 Added
packages/ui/meta_shared.json esbuild/exports/shared_optimized/index.js 16.32 KB 🆕 Added
Largest paths These visualization shows top 20 largest paths in the bundle.

Meta file: packages/next/meta_index.json, Out file: esbuild/index.js

Path Size
../../node_modules ${{\color{Goldenrod}{ ████████████████████▌ }}}$ 82.3%, 807.63 KB
dist/views/Version ${{\color{Goldenrod}{ █▎ }}}$ 5.3%, 51.49 KB
dist/views/Dashboard ${{\color{Goldenrod}{ ▌ }}}$ 2.2%, 21.38 KB
dist/views/Document ${{\color{Goldenrod}{ ▍ }}}$ 1.7%, 16.59 KB
dist/views/List ${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 11.38 KB
dist/views/Root ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 9.90 KB
dist/views/Versions ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 6.17 KB
dist/views/API ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 6.13 KB
dist/elements/Nav ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 5.96 KB
dist/views/Account ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 5.55 KB
dist/elements/DocumentHeader ${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 4.81 KB
dist/views/Login ${{\color{Goldenrod}{ }}}$ 0.4%, 4.40 KB
dist/layouts/Root ${{\color{Goldenrod}{ }}}$ 0.3%, 3.41 KB
dist/views/ForgotPassword ${{\color{Goldenrod}{ }}}$ 0.3%, 3.13 KB
dist/views/CreateFirstUser ${{\color{Goldenrod}{ }}}$ 0.3%, 2.81 KB
dist/templates/Default ${{\color{Goldenrod}{ }}}$ 0.3%, 2.64 KB
dist/views/BrowseByFolder ${{\color{Goldenrod}{ }}}$ 0.3%, 2.61 KB
dist/views/CollectionFolders ${{\color{Goldenrod}{ }}}$ 0.2%, 2.44 KB
dist/views/ResetPassword ${{\color{Goldenrod}{ }}}$ 0.2%, 2.40 KB
dist/views/Logout ${{\color{Goldenrod}{ }}}$ 0.2%, 1.94 KB
(other) ${{\color{Goldenrod}{ ████▍ }}}$ 17.7%, 173.12 KB

Meta file: packages/payload/meta_index.json, Out file: esbuild/index.js

Path Size
../../node_modules ${{\color{Goldenrod}{ █████████████████▏ }}}$ 68.8%, 951.98 KB
dist/fields/hooks ${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 44.07 KB
dist/collections/operations ${{\color{Goldenrod}{ ▋ }}}$ 2.9%, 39.96 KB
dist/versions/migrations ${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 18.50 KB
dist/auth/operations ${{\color{Goldenrod}{ ▎ }}}$ 1.1%, 15.63 KB
dist/fields/config ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 14.16 KB
dist/globals/operations ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 13.32 KB
dist/utilities/configToJSONSchema.js ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 13.13 KB
dist/queues/operations ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 12.43 KB
dist/fields/validations.js ${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 10.57 KB
dist/bin/generateImportMap ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 9.08 KB
dist/collections/config ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 8.91 KB
dist/config/orderable ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 8.00 KB
dist/uploads/fetchAPI-multipart ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.80 KB
dist/index.js ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.79 KB
dist/database/migrations ${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 7.54 KB
dist/collections/endpoints ${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 6.23 KB
dist/config/sanitize.js ${{\color{Goldenrod}{ }}}$ 0.4%, 5.86 KB
dist/auth/strategies ${{\color{Goldenrod}{ }}}$ 0.4%, 5.50 KB
dist/queues/config ${{\color{Goldenrod}{ }}}$ 0.4%, 5.31 KB
(other) ${{\color{Goldenrod}{ ███████▊ }}}$ 31.2%, 431.87 KB

Meta file: packages/payload/meta_shared.json, Out file: esbuild/exports/shared.js

Path Size
../../node_modules ${{\color{Goldenrod}{ ███████████████████▊ }}}$ 79.4%, 148.89 KB
dist/fields/validations.js ${{\color{Goldenrod}{ █▍ }}}$ 5.6%, 10.57 KB
dist/config/orderable ${{\color{Goldenrod}{ ▍ }}}$ 1.7%, 3.13 KB
dist/fields/baseFields ${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 2.79 KB
dist/utilities/deepCopyObject.js ${{\color{Goldenrod}{ ▎ }}}$ 1.4%, 2.54 KB
dist/auth/cookies.js ${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 1.55 KB
dist/utilities/flattenTopLevelFields.js ${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 1.42 KB
dist/fields/config ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 1.28 KB
dist/utilities/getVersionsConfig.js ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 1.04 KB
dist/utilities/flattenAllFields.js ${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 943 B
dist/folders/utils ${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 916 B
dist/utilities/unflatten.js ${{\color{Goldenrod}{ }}}$ 0.4%, 779 B
dist/utilities/sanitizeUserDataForEmail.js ${{\color{Goldenrod}{ }}}$ 0.4%, 713 B
dist/utilities/getFieldPermissions.js ${{\color{Goldenrod}{ }}}$ 0.3%, 651 B
dist/collections/config ${{\color{Goldenrod}{ }}}$ 0.3%, 570 B
dist/bin/generateImportMap ${{\color{Goldenrod}{ }}}$ 0.3%, 561 B
dist/auth/sessions.js ${{\color{Goldenrod}{ }}}$ 0.3%, 525 B
dist/fields/getFieldPaths.js ${{\color{Goldenrod}{ }}}$ 0.3%, 485 B
dist/utilities/getSafeRedirect.js ${{\color{Goldenrod}{ }}}$ 0.2%, 423 B
dist/utilities/deepMerge.js ${{\color{Goldenrod}{ }}}$ 0.2%, 413 B
(other) ${{\color{Goldenrod}{ █████▏ }}}$ 20.6%, 38.74 KB

Meta file: packages/richtext-lexical/meta_client.json, Out file: esbuild/exports/client_optimized/index.js

Path Size
dist/features/blocks ${{\color{Goldenrod}{ ███▏ }}}$ 12.8%, 36.34 KB
dist/lexical/plugins ${{\color{Goldenrod}{ ██▉ }}}$ 11.5%, 32.65 KB
dist/lexical/ui ${{\color{Goldenrod}{ ██▏ }}}$ 8.6%, 24.36 KB
dist/features/experimental_table ${{\color{Goldenrod}{ ██ }}}$ 8.3%, 23.70 KB
dist/packages/@lexical ${{\color{Goldenrod}{ █▋ }}}$ 6.7%, 18.99 KB
dist/features/link ${{\color{Goldenrod}{ █▋ }}}$ 6.5%, 18.53 KB
dist/features/toolbars ${{\color{Goldenrod}{ █▍ }}}$ 5.7%, 16.08 KB
dist/features/upload ${{\color{Goldenrod}{ █▏ }}}$ 4.9%, 13.77 KB
dist/features/textState ${{\color{Goldenrod}{ ▉ }}}$ 3.9%, 11.08 KB
dist/features/relationship ${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 9.03 KB
dist/lexical/utils ${{\color{Goldenrod}{ ▊ }}}$ 3.1%, 8.79 KB
dist/features/converters ${{\color{Goldenrod}{ ▋ }}}$ 2.9%, 8.36 KB
dist/features/debug ${{\color{Goldenrod}{ ▋ }}}$ 2.6%, 7.40 KB
dist/utilities/fieldsDrawer ${{\color{Goldenrod}{ ▋ }}}$ 2.5%, 7.15 KB
dist/lexical/config ${{\color{Goldenrod}{ ▍ }}}$ 1.8%, 5.08 KB
dist/features/lists ${{\color{Goldenrod}{ ▍ }}}$ 1.8%, 5.00 KB
dist/features/format ${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 3.46 KB
dist/lexical/LexicalEditor.js ${{\color{Goldenrod}{ ▎ }}}$ 1.1%, 3.23 KB
dist/field/Field.js ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 2.81 KB
dist/lexical/nodes ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 2.66 KB
(other) ${{\color{Goldenrod}{ █████████████████████▊ }}}$ 87.2%, 247.61 KB

Meta file: packages/ui/meta_client.json, Out file: esbuild/exports/client_optimized/index.js

Path Size
../../node_modules ${{\color{Goldenrod}{ ████████████▎ }}}$ 49.2%, 579.12 KB
dist/elements/FolderView ${{\color{Goldenrod}{ ▋ }}}$ 2.5%, 29.38 KB
dist/elements/BulkUpload ${{\color{Goldenrod}{ ▌ }}}$ 2.4%, 28.24 KB
dist/elements/WhereBuilder ${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 17.36 KB
dist/views/Edit ${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 17.30 KB
dist/forms/Form ${{\color{Goldenrod}{ ▎ }}}$ 1.4%, 15.91 KB
dist/fields/Relationship ${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 15.79 KB
dist/elements/Table ${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 15.77 KB
dist/fields/Upload ${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 14.22 KB
dist/fields/Blocks ${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 13.90 KB
dist/elements/QueryPresets ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 10.36 KB
dist/elements/PublishButton ${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 9.11 KB
dist/providers/Folders ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.46 KB
dist/elements/HTMLDiff ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.38 KB
dist/elements/ListHeader ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.07 KB
dist/fields/Array ${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 7.71 KB
dist/views/CollectionFolder ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.50 KB
dist/views/List ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.36 KB
dist/elements/ReactSelect ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.33 KB
dist/elements/LivePreview ${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.03 KB
(other) ${{\color{Goldenrod}{ ████████████▋ }}}$ 50.8%, 597.28 KB

Meta file: packages/ui/meta_shared.json, Out file: esbuild/exports/shared_optimized/index.js

Path Size
dist/graphics/Logo ${{\color{Goldenrod}{ █████ }}}$ 20.0%, 3.12 KB
../../node_modules ${{\color{Goldenrod}{ ████▎ }}}$ 17.0%, 2.65 KB
dist/graphics/Icon ${{\color{Goldenrod}{ ██▍ }}}$ 9.8%, 1.52 KB
dist/utilities/formatDocTitle ${{\color{Goldenrod}{ ██▏ }}}$ 8.5%, 1.32 KB
dist/providers/TableColumns ${{\color{Goldenrod}{ █▍ }}}$ 5.5%, 862 B
dist/utilities/groupNavItems.js ${{\color{Goldenrod}{ █▎ }}}$ 5.2%, 814 B
dist/utilities/getGlobalData.js ${{\color{Goldenrod}{ █▏ }}}$ 4.9%, 762 B
dist/utilities/api.js ${{\color{Goldenrod}{ █▏ }}}$ 4.8%, 756 B
dist/elements/Translation ${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 493 B
dist/utilities/handleTakeOver.js ${{\color{Goldenrod}{ ▋ }}}$ 2.8%, 440 B
dist/utilities/traverseForLocalizedFields.js ${{\color{Goldenrod}{ ▋ }}}$ 2.6%, 399 B
dist/elements/withMergedProps ${{\color{Goldenrod}{ ▌ }}}$ 2.2%, 339 B
dist/utilities/getVisibleEntities.js ${{\color{Goldenrod}{ ▌ }}}$ 2.1%, 329 B
dist/utilities/getNavGroups.js ${{\color{Goldenrod}{ ▍ }}}$ 1.9%, 301 B
dist/elements/WithServerSideProps ${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 232 B
dist/utilities/handleGoBack.js ${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 180 B
dist/fields/mergeFieldStyles.js ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 159 B
dist/utilities/handleBackToDashboard.js ${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 152 B
dist/forms/Form ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 147 B
dist/utilities/abortAndIgnore.js ${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 146 B
(other) ${{\color{Goldenrod}{ ████████████████████ }}}$ 80.0%, 12.51 KB
Details

Next to the size is how much the size has increased or decreased compared with the base branch of this PR.

  • ‼️: Size increased by 20% or more. Special attention should be given to this.
  • ⚠️: Size increased in acceptable range (lower than 20%).
  • ✅: No change or even downsized.
  • 🗑️: The out file is deleted: not found in base branch.
  • 🆕: The out file is newly found: will be added to base branch.

@denolfe denolfe marked this pull request as ready for review April 30, 2026 14:59
@denolfe denolfe merged commit f221f6c into main Apr 30, 2026
171 of 172 checks passed
@denolfe denolfe deleted the ai/evals-audit-datasets branch April 30, 2026 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant