test: evals audit datasets to dev-facing cases by denolfe · Pull Request #16433 · payloadcms/payload

denolfe · 2026-04-30T14:44:27Z

Overview

Trims test/evals/datasets/ so the eval suites measure knowledge a developer applies while building an application with Payload, not knowledge a Payload-monorepo contributor needs. Also adds shorthand npm scripts for running individual eval suites.

Key Changes

Trimmed conventions/qa.ts from 10 cases to 1
- Dropped 9 cases lifted from CLAUDE.md (types vs interfaces, boolean naming, function vs class, translation paths, afterEach cleanup, conventional-commits, dev-server flags, auto-login creds, single-object-parameter convention).
- Kept the payload.logger.error shape case, the only one that describes a call shape a Payload consumer writes in their own code.
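For context, the retained case is about how application code calls `payload.logger.error`. This PR does not quote the expected shape, so the sketch below assumes the pino-style object form with `msg` and `err` keys. `StubLogger`, `ErrorLogInput`, and `riskyOperation` are hypothetical stand-ins, not Payload types.

```typescript
// Hypothetical input type: a plain message, or one object carrying message and error.
type ErrorLogInput = string | { msg: string; err?: unknown }

// Hypothetical stand-in for the logger on a Payload instance; it only records calls.
class StubLogger {
  readonly entries: { msg: string; err?: unknown }[] = []

  error(input: ErrorLogInput): void {
    // Normalize both accepted shapes into one record.
    this.entries.push(typeof input === 'string' ? { msg: input } : input)
  }
}

const logger = new StubLogger()

// Hypothetical operation that always fails, to exercise the catch branch.
function riskyOperation(): never {
  throw new Error('connection refused')
}

try {
  riskyOperation()
} catch (err) {
  // Assumed shape: a single object argument, not a message followed by the error.
  logger.error({ err, msg: 'Failed to sync document' })
}
```

Keeping the error inside the object lets a structured logger serialize it alongside the message, which is presumably why the case checks the call shape rather than the wording.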
Removed plugins/official/qa.ts
- 11 reference-doc QA cases ("what does plugin X do") testing recall, not application. The borderline MCP-config case is already covered by plugins/official/codegen.ts via real code generation.
- eval.official-plugins.spec.ts updated to drop the QA registration; codegen registration unchanged.
Corrected the audience map in EvalDashboard/audience.ts
- negative retagged from maintainers to users. Six of seven retained negative cases are dev-facing (debugging your own broken config); the map can't split sub-arrays, so users is the better representative tag.
- Removed three category keys (commits, structure, testing) that no longer appear in any dataset after the conventions trim.
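The two audience-map edits can be sketched as one pass over the map. Everything here is illustrative: the type names, the `auditAudienceMap` helper, and the tags on the three removed keys are assumptions. Only the `negative` retag and the names of the removed keys come from this PR, and the list of live categories is cut down to one entry for brevity.

```typescript
type Audience = 'maintainers' | 'users'
type AudienceMap = Record<string, Audience>

// Assumed pre-audit state. The tags on commits/structure/testing are placeholders.
const preAudit: AudienceMap = {
  commits: 'maintainers',
  negative: 'maintainers',
  structure: 'maintainers',
  testing: 'maintainers',
}

// Drop keys that no dataset category references, then apply per-key retags.
// The map holds one tag per category, so a mixed category gets its majority audience.
function auditAudienceMap(
  map: AudienceMap,
  liveCategories: readonly string[],
  retags: Partial<AudienceMap>,
): AudienceMap {
  const live = new Set(liveCategories)
  const result: AudienceMap = {}
  for (const [category, audience] of Object.entries(map)) {
    if (!live.has(category)) continue // dead key: no dataset uses it
    result[category] = retags[category] ?? audience
  }
  return result
}

const postAudit = auditAudienceMap(preAudit, ['negative'], { negative: 'users' })
```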
Added test:eval:<suite> shorthand scripts
- One per suite (building-plugins, collections, config, conventions, fields, graphql, local-api, negative, official-plugins, rest-api). Each delegates to the :skill variant, matching the project-wide default.
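As a rough illustration of the delegation pattern, the sketch below generates one shorthand entry per suite. The suite names are the ones listed above; the `pnpm run` command form and the `buildShorthandScripts` helper are assumptions and are not copied from the repository's package.json.

```typescript
// Suite names as listed in this PR.
const suites = [
  'building-plugins',
  'collections',
  'config',
  'conventions',
  'fields',
  'graphql',
  'local-api',
  'negative',
  'official-plugins',
  'rest-api',
] as const

// Build `test:eval:<suite>` entries that delegate to `test:eval:<suite>:skill`.
// The command string is an assumed form, shown only to make the delegation concrete.
function buildShorthandScripts(names: readonly string[]): Record<string, string> {
  const scripts: Record<string, string> = {}
  for (const name of names) {
    scripts[`test:eval:${name}`] = `pnpm run test:eval:${name}:skill`
  }
  return scripts
}

const shorthandScripts = buildShorthandScripts(suites)
```

Under these assumptions, running `test:eval:fields` would execute the `:skill` variant of the fields suite.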

Design Decisions

The dividing line is "would a developer consuming payload from npm encounter this?" If no, the case is contributor-only and removed.

Three pre-existing categories were intentionally kept in scope but untouched:

- negativeInvalidInstructionDataset in negative/codegen.ts is an eval-pipeline self-test (it verifies that tsc rejects bad types) and is preserved as-is.
- plugins/qa.ts and plugins/codegen.ts stay because developers may colocate plugins inside their own project structure.
- Other dead audience-map keys ('access-control', admin, 'building-plugins', conventions, hooks, 'official-plugins', translations) were already dead before this audit and were left in place to keep the diff focused.

conventions/qa.ts and eval.conventions.spec.ts are kept rather than deleted so the surviving coding-category case still runs as a registered suite.

To see the specific tasks where the Asana app for GitHub is being used, see below:
- https://app.asana.com/0/0/1214427219166342

github-actions · 2026-04-30T14:53:37Z

📦 esbuild Bundle Analysis for payload

This analysis was generated by esbuild-bundle-analyzer. 🤖

Meta File	Out File	Size (raw)	Note
packages/next/meta_index.json	esbuild/index.js	985.42 KB	🆕 Added
packages/payload/meta_index.json	esbuild/index.js	1.39 MB	🆕 Added
packages/payload/meta_shared.json	esbuild/exports/shared.js	191.30 KB	🆕 Added
packages/richtext-lexical/meta_client.json	esbuild/exports/client_optimized/index.js	287.18 KB	🆕 Added
packages/ui/meta_client.json	esbuild/exports/client_optimized/index.js	1.19 MB	🆕 Added
packages/ui/meta_shared.json	esbuild/exports/shared_optimized/index.js	16.32 KB	🆕 Added

Largest paths

This visualization shows the top 20 largest paths in the bundle.

Meta file: packages/next/meta_index.json, Out file: esbuild/index.js

Path	Size
../../node_modules	${{\color{Goldenrod}{ ████████████████████▌ }}}$ 82.3%, 807.63 KB
dist/views/Version	${{\color{Goldenrod}{ █▎ }}}$ 5.3%, 51.49 KB
dist/views/Dashboard	${{\color{Goldenrod}{ ▌ }}}$ 2.2%, 21.38 KB
dist/views/Document	${{\color{Goldenrod}{ ▍ }}}$ 1.7%, 16.59 KB
dist/views/List	${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 11.38 KB
dist/views/Root	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 9.90 KB
dist/views/Versions	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 6.17 KB
dist/views/API	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 6.13 KB
dist/elements/Nav	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 5.96 KB
dist/views/Account	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 5.55 KB
dist/elements/DocumentHeader	${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 4.81 KB
dist/views/Login	${{\color{Goldenrod}{ }}}$ 0.4%, 4.40 KB
dist/layouts/Root	${{\color{Goldenrod}{ }}}$ 0.3%, 3.41 KB
dist/views/ForgotPassword	${{\color{Goldenrod}{ }}}$ 0.3%, 3.13 KB
dist/views/CreateFirstUser	${{\color{Goldenrod}{ }}}$ 0.3%, 2.81 KB
dist/templates/Default	${{\color{Goldenrod}{ }}}$ 0.3%, 2.64 KB
dist/views/BrowseByFolder	${{\color{Goldenrod}{ }}}$ 0.3%, 2.61 KB
dist/views/CollectionFolders	${{\color{Goldenrod}{ }}}$ 0.2%, 2.44 KB
dist/views/ResetPassword	${{\color{Goldenrod}{ }}}$ 0.2%, 2.40 KB
dist/views/Logout	${{\color{Goldenrod}{ }}}$ 0.2%, 1.94 KB
(other)	${{\color{Goldenrod}{ ████▍ }}}$ 17.7%, 173.12 KB

Meta file: packages/payload/meta_index.json, Out file: esbuild/index.js

Path	Size
../../node_modules	${{\color{Goldenrod}{ █████████████████▏ }}}$ 68.8%, 951.98 KB
dist/fields/hooks	${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 44.07 KB
dist/collections/operations	${{\color{Goldenrod}{ ▋ }}}$ 2.9%, 39.96 KB
dist/versions/migrations	${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 18.50 KB
dist/auth/operations	${{\color{Goldenrod}{ ▎ }}}$ 1.1%, 15.63 KB
dist/fields/config	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 14.16 KB
dist/globals/operations	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 13.32 KB
dist/utilities/configToJSONSchema.js	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 13.13 KB
dist/queues/operations	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 12.43 KB
dist/fields/validations.js	${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 10.57 KB
dist/bin/generateImportMap	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 9.08 KB
dist/collections/config	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 8.91 KB
dist/config/orderable	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 8.00 KB
dist/uploads/fetchAPI-multipart	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.80 KB
dist/index.js	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.79 KB
dist/database/migrations	${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 7.54 KB
dist/collections/endpoints	${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 6.23 KB
dist/config/sanitize.js	${{\color{Goldenrod}{ }}}$ 0.4%, 5.86 KB
dist/auth/strategies	${{\color{Goldenrod}{ }}}$ 0.4%, 5.50 KB
dist/queues/config	${{\color{Goldenrod}{ }}}$ 0.4%, 5.31 KB
(other)	${{\color{Goldenrod}{ ███████▊ }}}$ 31.2%, 431.87 KB

Meta file: packages/payload/meta_shared.json, Out file: esbuild/exports/shared.js

Path	Size
../../node_modules	${{\color{Goldenrod}{ ███████████████████▊ }}}$ 79.4%, 148.89 KB
dist/fields/validations.js	${{\color{Goldenrod}{ █▍ }}}$ 5.6%, 10.57 KB
dist/config/orderable	${{\color{Goldenrod}{ ▍ }}}$ 1.7%, 3.13 KB
dist/fields/baseFields	${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 2.79 KB
dist/utilities/deepCopyObject.js	${{\color{Goldenrod}{ ▎ }}}$ 1.4%, 2.54 KB
dist/auth/cookies.js	${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 1.55 KB
dist/utilities/flattenTopLevelFields.js	${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 1.42 KB
dist/fields/config	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 1.28 KB
dist/utilities/getVersionsConfig.js	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 1.04 KB
dist/utilities/flattenAllFields.js	${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 943 B
dist/folders/utils	${{\color{Goldenrod}{ ▏ }}}$ 0.5%, 916 B
dist/utilities/unflatten.js	${{\color{Goldenrod}{ }}}$ 0.4%, 779 B
dist/utilities/sanitizeUserDataForEmail.js	${{\color{Goldenrod}{ }}}$ 0.4%, 713 B
dist/utilities/getFieldPermissions.js	${{\color{Goldenrod}{ }}}$ 0.3%, 651 B
dist/collections/config	${{\color{Goldenrod}{ }}}$ 0.3%, 570 B
dist/bin/generateImportMap	${{\color{Goldenrod}{ }}}$ 0.3%, 561 B
dist/auth/sessions.js	${{\color{Goldenrod}{ }}}$ 0.3%, 525 B
dist/fields/getFieldPaths.js	${{\color{Goldenrod}{ }}}$ 0.3%, 485 B
dist/utilities/getSafeRedirect.js	${{\color{Goldenrod}{ }}}$ 0.2%, 423 B
dist/utilities/deepMerge.js	${{\color{Goldenrod}{ }}}$ 0.2%, 413 B
(other)	${{\color{Goldenrod}{ █████▏ }}}$ 20.6%, 38.74 KB

Meta file: packages/richtext-lexical/meta_client.json, Out file: esbuild/exports/client_optimized/index.js

Path	Size
dist/features/blocks	${{\color{Goldenrod}{ ███▏ }}}$ 12.8%, 36.34 KB
dist/lexical/plugins	${{\color{Goldenrod}{ ██▉ }}}$ 11.5%, 32.65 KB
dist/lexical/ui	${{\color{Goldenrod}{ ██▏ }}}$ 8.6%, 24.36 KB
dist/features/experimental_table	${{\color{Goldenrod}{ ██ }}}$ 8.3%, 23.70 KB
dist/packages/@lexical	${{\color{Goldenrod}{ █▋ }}}$ 6.7%, 18.99 KB
dist/features/link	${{\color{Goldenrod}{ █▋ }}}$ 6.5%, 18.53 KB
dist/features/toolbars	${{\color{Goldenrod}{ █▍ }}}$ 5.7%, 16.08 KB
dist/features/upload	${{\color{Goldenrod}{ █▏ }}}$ 4.9%, 13.77 KB
dist/features/textState	${{\color{Goldenrod}{ ▉ }}}$ 3.9%, 11.08 KB
dist/features/relationship	${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 9.03 KB
dist/lexical/utils	${{\color{Goldenrod}{ ▊ }}}$ 3.1%, 8.79 KB
dist/features/converters	${{\color{Goldenrod}{ ▋ }}}$ 2.9%, 8.36 KB
dist/features/debug	${{\color{Goldenrod}{ ▋ }}}$ 2.6%, 7.40 KB
dist/utilities/fieldsDrawer	${{\color{Goldenrod}{ ▋ }}}$ 2.5%, 7.15 KB
dist/lexical/config	${{\color{Goldenrod}{ ▍ }}}$ 1.8%, 5.08 KB
dist/features/lists	${{\color{Goldenrod}{ ▍ }}}$ 1.8%, 5.00 KB
dist/features/format	${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 3.46 KB
dist/lexical/LexicalEditor.js	${{\color{Goldenrod}{ ▎ }}}$ 1.1%, 3.23 KB
dist/field/Field.js	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 2.81 KB
dist/lexical/nodes	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 2.66 KB
(other)	${{\color{Goldenrod}{ █████████████████████▊ }}}$ 87.2%, 247.61 KB

Meta file: packages/ui/meta_client.json, Out file: esbuild/exports/client_optimized/index.js

Path	Size
../../node_modules	${{\color{Goldenrod}{ ████████████▎ }}}$ 49.2%, 579.12 KB
dist/elements/FolderView	${{\color{Goldenrod}{ ▋ }}}$ 2.5%, 29.38 KB
dist/elements/BulkUpload	${{\color{Goldenrod}{ ▌ }}}$ 2.4%, 28.24 KB
dist/elements/WhereBuilder	${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 17.36 KB
dist/views/Edit	${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 17.30 KB
dist/forms/Form	${{\color{Goldenrod}{ ▎ }}}$ 1.4%, 15.91 KB
dist/fields/Relationship	${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 15.79 KB
dist/elements/Table	${{\color{Goldenrod}{ ▎ }}}$ 1.3%, 15.77 KB
dist/fields/Upload	${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 14.22 KB
dist/fields/Blocks	${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 13.90 KB
dist/elements/QueryPresets	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 10.36 KB
dist/elements/PublishButton	${{\color{Goldenrod}{ ▏ }}}$ 0.8%, 9.11 KB
dist/providers/Folders	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.46 KB
dist/elements/HTMLDiff	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.38 KB
dist/elements/ListHeader	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 8.07 KB
dist/fields/Array	${{\color{Goldenrod}{ ▏ }}}$ 0.7%, 7.71 KB
dist/views/CollectionFolder	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.50 KB
dist/views/List	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.36 KB
dist/elements/ReactSelect	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.33 KB
dist/elements/LivePreview	${{\color{Goldenrod}{ ▏ }}}$ 0.6%, 7.03 KB
(other)	${{\color{Goldenrod}{ ████████████▋ }}}$ 50.8%, 597.28 KB

Meta file: packages/ui/meta_shared.json, Out file: esbuild/exports/shared_optimized/index.js

Path	Size
dist/graphics/Logo	${{\color{Goldenrod}{ █████ }}}$ 20.0%, 3.12 KB
../../node_modules	${{\color{Goldenrod}{ ████▎ }}}$ 17.0%, 2.65 KB
dist/graphics/Icon	${{\color{Goldenrod}{ ██▍ }}}$ 9.8%, 1.52 KB
dist/utilities/formatDocTitle	${{\color{Goldenrod}{ ██▏ }}}$ 8.5%, 1.32 KB
dist/providers/TableColumns	${{\color{Goldenrod}{ █▍ }}}$ 5.5%, 862 B
dist/utilities/groupNavItems.js	${{\color{Goldenrod}{ █▎ }}}$ 5.2%, 814 B
dist/utilities/getGlobalData.js	${{\color{Goldenrod}{ █▏ }}}$ 4.9%, 762 B
dist/utilities/api.js	${{\color{Goldenrod}{ █▏ }}}$ 4.8%, 756 B
dist/elements/Translation	${{\color{Goldenrod}{ ▊ }}}$ 3.2%, 493 B
dist/utilities/handleTakeOver.js	${{\color{Goldenrod}{ ▋ }}}$ 2.8%, 440 B
dist/utilities/traverseForLocalizedFields.js	${{\color{Goldenrod}{ ▋ }}}$ 2.6%, 399 B
dist/elements/withMergedProps	${{\color{Goldenrod}{ ▌ }}}$ 2.2%, 339 B
dist/utilities/getVisibleEntities.js	${{\color{Goldenrod}{ ▌ }}}$ 2.1%, 329 B
dist/utilities/getNavGroups.js	${{\color{Goldenrod}{ ▍ }}}$ 1.9%, 301 B
dist/elements/WithServerSideProps	${{\color{Goldenrod}{ ▍ }}}$ 1.5%, 232 B
dist/utilities/handleGoBack.js	${{\color{Goldenrod}{ ▎ }}}$ 1.2%, 180 B
dist/fields/mergeFieldStyles.js	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 159 B
dist/utilities/handleBackToDashboard.js	${{\color{Goldenrod}{ ▎ }}}$ 1.0%, 152 B
dist/forms/Form	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 147 B
dist/utilities/abortAndIgnore.js	${{\color{Goldenrod}{ ▏ }}}$ 0.9%, 146 B
(other)	${{\color{Goldenrod}{ ████████████████████ }}}$ 80.0%, 12.51 KB

Details

Next to the size is how much the size has increased or decreased compared with the base branch of this PR.

‼️: Size increased by 20% or more. Special attention should be given to this.
⚠️: Size increased in acceptable range (lower than 20%).
✅: No change or even downsized.
🗑️: The out file is deleted: not found in base branch.
🆕: The out file is newly found: will be added to base branch.

denolfe added 4 commits April 29, 2026 16:35

test(evals): trim conventions QA dataset to dev-facing logger case

b0712ec

test(evals): remove official-plugins QA dataset (reference-doc lookups)

771456e

chore: correct negative audience and sweep dead category keys in audience map

6538392

chore: add test:eval:<suite> shorthand scripts (default to :skill variant)

02c1ccb

github-actions Bot added the created-by: Payload team label Apr 30, 2026

denolfe changed the title ~~test(evals): audit datasets to dev-facing cases~~ test: audit datasets to dev-facing cases Apr 30, 2026

denolfe changed the title ~~test: audit datasets to dev-facing cases~~ test: evals audit datasets to dev-facing cases Apr 30, 2026

denolfe marked this pull request as ready for review April 30, 2026 14:59

denolfe merged commit f221f6c into main Apr 30, 2026
171 of 172 checks passed

denolfe deleted the ai/evals-audit-datasets branch April 30, 2026 15:00
