Merged
31 changes: 31 additions & 0 deletions .github/workflows/docs-check.yml
@@ -0,0 +1,31 @@
name: Docs Check

on:
  pull_request:
    paths:
      - 'docs/**'
      - '.github/workflows/docs-check.yml'

jobs:
  build:
    name: Build Documentation
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: docs
    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: 20
          cache: npm
          cache-dependency-path: docs/package-lock.json

      - name: Install dependencies
        run: npm ci

      - name: Build site
        run: npm run build
4 changes: 2 additions & 2 deletions docs/docs/advanced/mutation-testing.md
@@ -143,6 +143,6 @@ A mutation is considered "detected" if **any** metric on **any** test input prod

As a rule of thumb:

-- **>80% detection rate** -- strong evaluation coverage
+- **80%+ detection rate** -- strong evaluation coverage
- **50--80% detection rate** -- acceptable but review undetected mutations
-- **<50% detection rate** -- evaluation metrics need improvement
+- **Under 50% detection rate** -- evaluation metrics need improvement
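
For reference, the rule of thumb in this hunk can be sketched as a small classifier. This is only an illustration: `DetectionRateGuide` and `assess` are hypothetical names, not part of the library, and the sketch assumes that exactly 80% counts as strong (as the revised "80%+" wording suggests) and that the rate is passed as a fraction between 0 and 1.

```java
// Hypothetical helper illustrating the rule-of-thumb bands; not library API.
class DetectionRateGuide {

    // rate is the fraction of mutations detected, in the range [0, 1].
    static String assess(double rate) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be in [0, 1]: " + rate);
        }
        if (rate >= 0.80) {
            return "strong evaluation coverage";
        }
        if (rate >= 0.50) {
            return "acceptable but review undetected mutations";
        }
        return "evaluation metrics need improvement";
    }

    public static void main(String[] args) {
        // Print the assessment for one value in each band.
        for (double rate : new double[] {0.85, 0.65, 0.30}) {
            System.out.println(rate + " -> " + assess(rate));
        }
    }
}
```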
2 changes: 1 addition & 1 deletion docs/docs/advanced/red-teaming.md
@@ -77,7 +77,7 @@ void agentShouldHandleBoundaryInputs(AgentTestCase testCase) {

Built-in boundary tests:
- Empty string
-- Extremely long input (>100k characters)
+- Extremely long input (100k+ characters)
- Special characters and unicode
- SQL/HTML/JSON injection strings
- Null bytes and control characters
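
To make the categories in this list concrete, here is a minimal sketch of one sample input per category. The class `BoundaryInputSamples` and the specific strings are assumptions chosen for illustration; they are not the library's actual built-in test data.

```java
import java.util.List;

// Hypothetical sample inputs, one or more per category in the list above.
class BoundaryInputSamples {

    static List<String> samples() {
        return List.of(
            "",                                   // empty string
            "a".repeat(100_000),                  // extremely long input (100k characters)
            "\u00e9\u4f60\u597d \ud83d\ude00 ~!@#$%^&*()", // special characters and unicode
            "' OR '1'='1",                        // SQL injection string
            "<script>alert(1)</script>",          // HTML injection string
            "{\"key\": \"value\"",                // malformed JSON fragment
            "abc\0def\7"                          // null byte and a control character (BEL)
        );
    }

    public static void main(String[] args) {
        // Print only the lengths, since several samples are unprintable or very long.
        for (String s : samples()) {
            System.out.println("sample length: " + s.length());
        }
    }
}
```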
8 changes: 4 additions & 4 deletions docs/docs/advanced/statistical-analysis.md
@@ -102,10 +102,10 @@ Stability assessments are based on the coefficient of variation (CV):

| CV Range | Assessment |
|---|---|
-| CV <= 0.05 | Highly stable |
-| CV <= threshold | Stable |
-| CV <= threshold x 2 | Moderately unstable |
-| CV > threshold x 2 | Highly unstable |
+| CV &lt;= 0.05 | Highly stable |
+| CV &lt;= threshold | Stable |
+| CV &lt;= threshold x 2 | Moderately unstable |
+| CV &gt; threshold x 2 | Highly unstable |

A warning is emitted if fewer than 3 runs are provided.
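
The table and the warning above can be read as the following sketch. It assumes the rows are evaluated top to bottom, that CV is the sample standard deviation divided by the mean, and that `threshold` is supplied by the caller; `StabilitySketch` and its method names are hypothetical and the library's actual computation may differ.

```java
import java.util.Arrays;

// Hypothetical illustration of CV-based stability bands; not library API.
class StabilitySketch {

    // CV = sample standard deviation / mean (assumed definition).
    static double coefficientOfVariation(double[] scores) {
        if (scores.length < 2) {
            throw new IllegalArgumentException("need at least 2 runs to compute CV");
        }
        double mean = Arrays.stream(scores).average().orElseThrow();
        if (mean == 0.0) {
            throw new IllegalArgumentException("CV is undefined for a zero mean");
        }
        double sumSquares = Arrays.stream(scores)
                .map(s -> (s - mean) * (s - mean))
                .sum();
        return Math.sqrt(sumSquares / (scores.length - 1)) / mean;
    }

    // Applies the table rows in order, first match wins.
    static String assess(double cv, double threshold) {
        if (cv <= 0.05) return "Highly stable";
        if (cv <= threshold) return "Stable";
        if (cv <= threshold * 2) return "Moderately unstable";
        return "Highly unstable";
    }

    public static void main(String[] args) {
        double[] scores = {0.82, 0.79};
        if (scores.length < 3) {
            // Mirrors the documented warning for fewer than 3 runs.
            System.err.println("warning: fewer than 3 runs provided");
        }
        double cv = coefficientOfVariation(scores);
        System.out.println(assess(cv, 0.15)); // 0.15 is an arbitrary example threshold
    }
}
```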
