fix: skip validation for defaultTest to allow partial test case properties #4732

Open. Wants to merge 11 commits into base: main

Conversation

mldangelo (Member) commented Jul 2, 2025

Relates to #4478

Previously, defaultTest was validated as a complete test case, causing errors when it only contained partial properties like options.provider.embedding.

This change adds an isDefaultTest parameter to readTest() that skips validation when loading defaultTest configurations, allowing it to contain only the properties needed for merging with actual test cases.
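
For illustration, a partial defaultTest of the kind described above might look like the following, written here as a TypeScript object rather than a config file. This is only a sketch: the `PartialDefaultTest` type is a trimmed-down stand-in invented for this example, and the provider id string is a placeholder, not a value taken from this PR.

```typescript
// Hypothetical, trimmed-down shape for illustration only.
interface PartialDefaultTest {
  vars?: Record<string, unknown>;
  assert?: Array<{ type: string; value?: unknown; threshold?: number }>;
  options?: {
    provider?: string | { embedding?: { id: string; config?: Record<string, unknown> } };
  };
}

// Only options.provider.embedding is set: no vars, no assert, no description.
const defaultTest: PartialDefaultTest = {
  options: {
    provider: {
      embedding: { id: 'example:embedding:placeholder-model' }, // placeholder id
    },
  },
};

// A regular test case supplies the fields that defaultTest leaves out.
const testCase: PartialDefaultTest = {
  vars: { question: 'What is the capital of France?' },
  assert: [{ type: 'similar', value: 'Paris', threshold: 0.8 }],
};

// defaultTest has only "options"; testCase has "vars" and "assert".
console.log(Object.keys(defaultTest), Object.keys(testCase));
```

Validating `defaultTest` on its own as if it were a complete test case is what previously produced the error, since it has none of the properties a standalone test case carries.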

mldangelo added 8 commits July 2, 2025 01:28
…rties

Fixes #4478

Previously, defaultTest was validated as a complete test case, causing errors
when it only contained partial properties like options.provider.embedding.
This change adds an isDefaultTest parameter to readTest() that skips validation
when loading defaultTest configurations.

- Added isDefaultTest parameter to readTest function
- Updated config loader to pass true when loading defaultTest
- Added comprehensive test coverage for various provider configurations
…ons

- Add tests to verify defaultTest with embedding provider gets loaded correctly
- Add tests to verify defaultTest options are merged with test case options
- Fix TypeScript errors in tests (use 'similar' instead of 'similarity')

…and loadApiProviders

- Mock readPrompts and loadApiProviders to return expected formats for test scenarios
- Mock readTests to return specific test data for defaultTest test case
- These changes ensure tests match the actual behavior of resolveConfigs

use-tusk bot (Contributor) commented Jul 2, 2025

⏩ No test execution environment matched (1866a1b)

| Commit | Status | Output | Created (UTC) |
| --- | --- | --- | --- |
| b117e06 | ⏩ No test execution environment matched | Output | Jul 2, 2025 5:54AM |
| b5c0722 | ⏩ No test execution environment matched | Output | Jul 2, 2025 6:04AM |
| 9c3347b | ⏩ No test execution environment matched | Output | Jul 2, 2025 6:26AM |
| 1866a1b | ⏩ No test execution environment matched | Output | Jul 2, 2025 2:52PM |

gru-agent bot (Contributor) commented Jul 2, 2025

TestGru Assignment

Summary

| Link | CommitId | Status | Reason |
| --- | --- | --- | --- |
| Detail | b117e06 | 🚫 Skipped | |

History Assignment

Files

| File | Pull Request |
| --- | --- |
| src/util/config/load.ts | 🚫 Skipped (There's no need to update the test code) |
| src/util/testCaseReader.ts | 🚫 Skipped (There's no need to update the test code) |

coderabbitai bot (Contributor) commented Jul 2, 2025

Walkthrough

The changes update the readTest function to accept an optional isDefaultTest parameter. When set to true, validation for required test case properties is bypassed, allowing default test configurations with only provider options (including embedding providers) to be loaded without error. The resolveConfigs function is modified to pass this flag when loading defaultTest. Additional tests are added to verify this behavior for various provider configurations and to ensure correct merging and overriding of defaultTest options in test cases. Extensive mocking and new test cases are included to cover these scenarios.
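
The behaviour summarised in the walkthrough can be sketched roughly as follows. This is a simplified stand-in, not the code from src/util/testCaseReader.ts: the names `TestCaseSketch`, `hasRequiredProps`, and `readTestSketch` are invented for the example, the validation rule is an assumption, and the real function is likely asynchronous and takes more parameters.

```typescript
// Hypothetical, simplified stand-ins for illustration.
interface TestCaseSketch {
  description?: string;
  vars?: Record<string, unknown>;
  assert?: Array<{ type: string; value?: unknown }>;
  options?: Record<string, unknown>;
}

// Assumed validation rule: a regular test case needs at least one of these
// properties. The actual check is not quoted in this thread.
function hasRequiredProps(test: TestCaseSketch): boolean {
  return test.description !== undefined || test.vars !== undefined || test.assert !== undefined;
}

function readTestSketch(test: TestCaseSketch, isDefaultTest = false): TestCaseSketch {
  // Skip the required-properties check for defaultTest, which may be partial.
  if (!isDefaultTest && !hasRequiredProps(test)) {
    throw new Error('Test case is missing required properties');
  }
  return test;
}

const partial: TestCaseSketch = {
  options: { provider: { embedding: { id: 'placeholder' } } },
};

// Loading as defaultTest succeeds even though only options is present.
const loaded = readTestSketch(partial, true);

// Loading the same object as a regular test case still fails validation.
let rejected = false;
try {
  readTestSketch(partial);
} catch {
  rejected = true;
}

console.log(loaded === partial, rejected); // true true
```

Because the flag defaults to `false`, existing callers that pass no second argument keep the stricter behaviour.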

Assessment against linked issues

Objective Addressed Explanation
Allow defaultTest to override embedding provider without triggering missing property validation (#4478)
Ensure defaultTest with only provider options (including embedding) does not throw validation error (#4478)
Correct merging and overriding of defaultTest.options into individual test case options (#4478)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes were found. All modifications and additions are directly related to the objectives in the linked issue, focusing on the handling and validation of defaultTest configurations and their impact on test execution.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

npm error Exit handler never called!
npm error This is an error with npm itself. Please report this error at:
npm error https://github.com/npm/cli/issues
npm error A complete log of this run can be found in: /.npm/_logs/2025-07-02T05_56_05_040Z-debug-0.log

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/util/testCaseReader.test.ts (1)

631-644: Potential test duplication.

This test appears to duplicate the functionality already tested in "readTest with string input (path to test config)" starting at line 434. Consider whether this additional test case is necessary or if it should be consolidated.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b4020b and b117e06.

📒 Files selected for processing (5)
  • src/util/config/load.ts (1 hunks)
  • src/util/testCaseReader.ts (2 hunks)
  • test/evaluator.test.ts (1 hunks)
  • test/util/config/load.test.ts (5 hunks)
  • test/util/testCaseReader.test.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)

**/*.{ts,tsx}: Use TypeScript with strict type checking.

📄 Source: CodeRabbit Inference Engine (CLAUDE.md)

List of files the instruction was applied to:

  • src/util/testCaseReader.ts
  • test/util/testCaseReader.test.ts
  • src/util/config/load.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts

**/*.{js,jsx,ts,tsx}: Follow established import order with @trivago/prettier-plugin-sort-imports.
Use consistent curly braces for all control statements.
Prefer const over let; avoid var.
Use object shorthand syntax whenever possible.
Use async/await for asynchronous code.
Use consistent error handling with proper type checks.

📄 Source: CodeRabbit Inference Engine (CLAUDE.md)

List of files the instruction was applied to:

  • src/util/testCaseReader.ts
  • test/util/testCaseReader.test.ts
  • src/util/config/load.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts

**/*.{ts,tsx}: Prefer not to introduce new TypeScript types; use existing interfaces whenever possible

📄 Source: CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)

List of files the instruction was applied to:

  • src/util/testCaseReader.ts
  • test/util/testCaseReader.test.ts
  • src/util/config/load.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts

test/**/*.{test,spec}.{js,jsx,ts,tsx}: Follow Jest best practices with describe/it blocks.

📄 Source: CodeRabbit Inference Engine (CLAUDE.md)

List of files the instruction was applied to:

  • test/util/testCaseReader.test.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts

**/*.{test,spec}.{js,ts,tsx}: Avoid disabling or skipping tests unless absolutely necessary and documented

📄 Source: CodeRabbit Inference Engine (.cursor/rules/gh-cli-workflow.mdc)

List of files the instruction was applied to:

  • test/util/testCaseReader.test.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts

test/**/*.{test,spec}.ts: Mock as few functions as possible to keep tests realistic
Never increase the function timeout - fix the test instead
Organize tests in descriptive describe and it blocks
When writing expectations, prefer assertions on entire objects rather than individual keys
Clean up after tests to prevent side effects (e.g., use afterEach(() => { jest.resetAllMocks(); }))
Run tests with --randomize flag to ensure your mocks setup and teardown don't affect other tests
Use Jest's mocking utilities rather than complex custom mocks
Prefer shallow mocking over deep mocking
Mock external dependencies but not the code being tested
Reset mocks between tests to prevent test pollution
For database tests, use in-memory instances or proper test fixtures
Test both success and error cases for each provider
Mock API responses to avoid external dependencies in tests
Validate that provider options are properly passed to the underlying service
Test error handling and edge cases (rate limits, timeouts, etc.)
Ensure provider caching behaves as expected
Always include both --coverage and --randomize flags when running tests
Run tests in a single pass (no watch mode for CI)
Ensure all tests are independent and can run in any order
Clean up any test data or mocks after each test

📄 Source: CodeRabbit Inference Engine (.cursor/rules/jest.mdc)

List of files the instruction was applied to:

  • test/util/testCaseReader.test.ts
  • test/evaluator.test.ts
  • test/util/config/load.test.ts
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-06-30T13:44:03.652Z
Learning: Applies to **/*.{test,spec}.{js,ts,tsx} : Avoid disabling or skipping tests unless absolutely necessary and documented
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test both success and error cases for each provider
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure provider caching behaves as expected
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Use Jest's mocking utilities rather than complex custom mocks
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock API responses to avoid external dependencies in tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Reset mocks between tests to prevent test pollution
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test error handling and edge cases (rate limits, timeouts, etc.)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Prefer shallow mocking over deep mocking
src/util/testCaseReader.ts (13)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-06-30T13:44:03.652Z
Learning: Applies to **/*.{test,spec}.{js,ts,tsx} : Avoid disabling or skipping tests unless absolutely necessary and documented
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test error handling and edge cases (rate limits, timeouts, etc.)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests in a single pass (no watch mode for CI)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Always include both `--coverage` and `--randomize` flags when running tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Never increase the function timeout - fix the test instead
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test both success and error cases for each provider
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : For database tests, use in-memory instances or proper test fixtures
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Organize tests in descriptive `describe` and `it` blocks
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock API responses to avoid external dependencies in tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : When writing expectations, prefer assertions on entire objects rather than individual keys
test/util/testCaseReader.test.ts (14)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test both success and error cases for each provider
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure provider caching behaves as expected
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test error handling and edge cases (rate limits, timeouts, etc.)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-06-30T13:44:03.652Z
Learning: Applies to **/*.{test,spec}.{js,ts,tsx} : Avoid disabling or skipping tests unless absolutely necessary and documented
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:19.000Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure all tests are independent and can run in any order
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Always include both `--coverage` and `--randomize` flags when running tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Organize tests in descriptive `describe` and `it` blocks
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock API responses to avoid external dependencies in tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Reset mocks between tests to prevent test pollution
Learnt from: CR
PR: promptfoo/promptfoo#0
File: CLAUDE.md:0-0
Timestamp: 2025-06-30T13:43:03.637Z
Learning: Applies to test/**/*.{test,spec}.{js,jsx,ts,tsx} : Follow Jest best practices with describe/it blocks.
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Use Jest's mocking utilities rather than complex custom mocks
src/util/config/load.ts (10)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests in a single pass (no watch mode for CI)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Never increase the function timeout - fix the test instead
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Always include both `--coverage` and `--randomize` flags when running tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock external dependencies but not the code being tested
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure provider caching behaves as expected
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:19.000Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure all tests are independent and can run in any order
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-06-30T13:44:03.652Z
Learning: Applies to **/*.{test,spec}.{js,ts,tsx} : Avoid disabling or skipping tests unless absolutely necessary and documented
test/evaluator.test.ts (12)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test both success and error cases for each provider
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure provider caching behaves as expected
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:19.000Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure all tests are independent and can run in any order
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test error handling and edge cases (rate limits, timeouts, etc.)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Reset mocks between tests to prevent test pollution
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Organize tests in descriptive `describe` and `it` blocks
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : When writing expectations, prefer assertions on entire objects rather than individual keys
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Clean up after tests to prevent side effects (e.g., use afterEach(() => { jest.resetAllMocks(); }))
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : For database tests, use in-memory instances or proper test fixtures
test/util/config/load.test.ts (17)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock external dependencies but not the code being tested
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock API responses to avoid external dependencies in tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Mock as few functions as possible to keep tests realistic
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Reset mocks between tests to prevent test pollution
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Ensure provider caching behaves as expected
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Use Jest's mocking utilities rather than complex custom mocks
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Validate that provider options are properly passed to the underlying service
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Run tests with `--randomize` flag to ensure your mocks setup and teardown don't affect other tests
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Prefer shallow mocking over deep mocking
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Clean up after tests to prevent side effects (e.g., use afterEach(() => { jest.resetAllMocks(); }))
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/examples.mdc:0-0
Timestamp: 2025-06-30T13:43:50.304Z
Learning: When developing or testing examples locally, use 'npm run local' commands instead of 'npx promptfoo@latest' to ensure local changes are tested
Learnt from: CR
PR: promptfoo/promptfoo#0
File: CLAUDE.md:0-0
Timestamp: 2025-06-30T13:43:03.637Z
Learning: Use CommonJS modules (type: "commonjs" in package.json).
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test both success and error cases for each provider
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-06-30T13:44:03.652Z
Learning: After resolving conflicts, run the full test suite from root: `npm test`
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:19.000Z
Learning: Applies to test/**/*.{test,spec}.ts : Clean up any test data or mocks after each test
Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/jest.mdc:0-0
Timestamp: 2025-06-30T13:44:18.999Z
Learning: Applies to test/**/*.{test,spec}.ts : Test error handling and edge cases (rate limits, timeouts, etc.)
Learnt from: CR
PR: promptfoo/promptfoo#0
File: CLAUDE.md:0-0
Timestamp: 2025-06-30T13:43:03.637Z
Learning: Applies to test/**/*.{test,spec}.{js,jsx,ts,tsx} : Follow Jest best practices with describe/it blocks.
🧬 Code Graph Analysis (2)
test/util/testCaseReader.test.ts (1)
src/util/testCaseReader.ts (1)
  • readTest (235-285)
src/util/config/load.ts (1)
src/util/testCaseReader.ts (1)
  • readTest (235-285)
⏰ Context from checks skipped due to timeout of 90000ms (18)
  • GitHub Check: webui tests
  • GitHub Check: Redteam
  • GitHub Check: Redteam Custom Enterprise Server
  • GitHub Check: Test on Node 18.x and ubuntu-latest
  • GitHub Check: Test on Node 24.x and ubuntu-latest
  • GitHub Check: Test on Node 20.x and windows-latest
  • GitHub Check: Test on Node 18.x and windows-latest
  • GitHub Check: Share Test
  • GitHub Check: Test on Node 20.x and ubuntu-latest
  • GitHub Check: Test on Node 18.x and macOS-latest
  • GitHub Check: Cursor BugBot
  • GitHub Check: Build Docs
  • GitHub Check: Build on Node 18.x
  • GitHub Check: Build on Node 22.x
  • GitHub Check: Build on Node 24.x
  • GitHub Check: Build on Node 20.x
  • GitHub Check: Style Check
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (12)
src/util/config/load.ts (1)

526-526: LGTM! Correctly implements the default test validation skip.

This change properly passes the isDefaultTest flag as true when loading default test configurations, allowing partial test case properties without triggering validation errors. The implementation aligns perfectly with the PR objective to fix issue #4478.

src/util/testCaseReader.ts (2)

238-238: LGTM! Clean backward-compatible parameter addition.

The optional isDefaultTest parameter with a default value of false maintains backward compatibility while enabling the new functionality to skip validation for default test configurations.


264-264: LGTM! Proper conditional validation implementation.

The validation logic correctly skips the required properties check when isDefaultTest is true, allowing partial default test configurations to be loaded without errors. The inline comment clearly explains the purpose of this conditional behavior.

test/evaluator.test.ts (1)

2919-3030: Well-structured tests for defaultTest merging functionality.

These tests appropriately verify the runtime merging behavior of defaultTest options during evaluation, which complements the enhanced loading and validation mechanisms introduced in the PR. The test scenarios cover both basic merging and override behavior effectively.

Key strengths:

  • Proper Jest structure with describe/it blocks
  • Realistic mock provider setup with appropriate token usage
  • Comprehensive assertions using toEqual() for object comparisons
  • Good coverage of both merging and override scenarios
  • Follows the guideline of asserting on entire objects rather than individual properties

The tests align well with the PR objective of allowing partial defaultTest configurations while ensuring proper merging behavior during evaluation.
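
As a rough picture of the merging and override behaviour these tests exercise, the sketch below assumes a shallow spread in which test-case values win over defaultTest values. The merge rule, the `mergeWithDefault` helper, the type names, and the `prefix` key are all illustrative assumptions; the evaluator's actual merge logic is not shown in this thread.

```typescript
// Hypothetical shapes; option keys other than "provider" are illustrative.
interface OptionsSketch {
  provider?: unknown;
  [key: string]: unknown;
}

interface TestSketch {
  vars?: Record<string, unknown>;
  options?: OptionsSketch;
}

// Assumed merge rule: shallow spread, with the test case winning on conflicts.
function mergeWithDefault(defaultTest: TestSketch, test: TestSketch): TestSketch {
  return {
    vars: { ...defaultTest.vars, ...test.vars },
    options: { ...defaultTest.options, ...test.options },
  };
}

const defaultTest: TestSketch = {
  options: { provider: { embedding: { id: 'placeholder-a' } }, prefix: 'default' },
};

const testCase: TestSketch = {
  vars: { question: 'example' },
  options: { prefix: 'override' },
};

const merged = mergeWithDefault(defaultTest, testCase);
console.log(JSON.stringify(merged.options));
// {"provider":{"embedding":{"id":"placeholder-a"}},"prefix":"override"}
```

The provider from defaultTest carries through untouched, while the conflicting `prefix` takes the test case's value, which corresponds to the two scenarios (merge and override) the review mentions.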

test/util/testCaseReader.test.ts (5)

534-552: LGTM: Well-structured test for embedding provider validation skip.

This test effectively verifies that defaultTest configurations with embedding provider options bypass validation, which aligns with the PR objectives.


554-565: LGTM: Good coverage for model-graded eval provider scenario.

This test case properly validates that string-based provider configurations for model-graded evaluation work correctly with the isDefaultTest flag.


567-593: LGTM: Comprehensive test for text provider configuration.

This test case thoroughly validates complex nested provider configurations, ensuring the entire structure is preserved when validation is skipped for default tests.


595-618: LGTM: Good test coverage for object-based provider configuration.

This test case validates that object-based provider configurations (with id and config properties) work correctly with the validation skip functionality.


620-629: LGTM: Important negative test case for validation behavior.

This test ensures that the existing validation logic remains intact when isDefaultTest is false, which is crucial for maintaining backward compatibility.

test/util/config/load.test.ts (3)

62-138: LGTM: Comprehensive mocking setup for integration testing.

The mocking setup is well-structured and properly handles the new isDefaultTest parameter in the readTest mock. The realistic mock implementations will help ensure accurate integration testing.
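
One way a mock can account for the extra parameter is to record it, so a test can assert that the loader passed `true` only for defaultTest. The sketch below uses a hand-rolled recording stub so that it runs without Jest; in the real test file this would presumably be a Jest mock. `readTestStub` and `loadConfigSketch` are invented names, and the loader is a hypothetical simplification rather than resolveConfigs itself.

```typescript
interface RecordedCall {
  input: unknown;
  isDefaultTest: boolean;
}

const calls: RecordedCall[] = [];

// Hand-rolled recording stub standing in for a mocked readTest.
function readTestStub(input: unknown, isDefaultTest = false): unknown {
  calls.push({ input, isDefaultTest });
  return input;
}

// Hypothetical loader: passes true only when reading defaultTest.
function loadConfigSketch(config: { defaultTest?: unknown; tests?: unknown[] }) {
  const defaultTest =
    config.defaultTest === undefined ? undefined : readTestStub(config.defaultTest, true);
  // Wrap the call in an arrow function so map's index argument is not
  // mistaken for the isDefaultTest flag.
  const tests = (config.tests ?? []).map((t) => readTestStub(t));
  return { defaultTest, tests };
}

loadConfigSketch({
  defaultTest: { options: { provider: 'placeholder' } },
  tests: [{ vars: { a: 1 } }],
});

console.log(calls.map((c) => c.isDefaultTest)); // [ true, false ]
```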


1319-1397: LGTM: Comprehensive integration test for embedding provider configuration.

This test effectively validates the end-to-end loading of defaultTest configurations with embedding providers. The test setup is realistic and the assertions thoroughly verify both the defaultTest and regular test loading.


1399-1439: LGTM: Good integration test for model-graded eval provider.

This test provides good coverage for the string-based provider configuration scenario in defaultTest. The test structure is consistent and the assertions properly verify the provider configuration is preserved.

Labels
None yet
Projects
None yet
1 participant