Commit be06e44

Copilot and DevilTea committed
feat: add comprehensive benchmarks for all steps and benchmark creation guide
Co-authored-by: DevilTea <16652879+DevilTea@users.noreply.github.com>
1 parent 060af54 commit be06e44

4 files changed: +773 -3 lines changed

Lines changed: 356 additions & 0 deletions
@@ -0,0 +1,356 @@
# How to Create a Benchmark for a Step

This guide explains how to create performance benchmarks for step plugins in the Valchecker framework. Benchmarks are essential for tracking performance improvements and regressions over time.

## Overview

Benchmarks measure the execution speed (operations per second) of steps under controlled conditions. They help identify performance bottlenecks and validate that optimizations actually improve performance.

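To make "operations per second" concrete, here is a minimal sketch of how such a figure can be derived from a timed loop. `measureOpsPerSecond` is a hypothetical helper written for illustration only; it is not part of Valchecker or Vitest, and real benchmark runners add warm-up and statistical sampling on top of this idea.

```typescript
// Hypothetical helper, not part of Valchecker or Vitest.
// `now` is injectable so the timing logic can be checked with a fake clock.
function measureOpsPerSecond(
  fn: () => void,
  iterations: number,
  now: () => number = Date.now,
): number {
  if (iterations <= 0)
    throw new Error('iterations must be positive')

  const start = now()
  for (let i = 0; i < iterations; i++)
    fn()
  const elapsedMs = now() - start

  // A clock that is too coarse for a short run reports zero elapsed time.
  if (elapsedMs <= 0)
    return Number.POSITIVE_INFINITY

  // iterations per millisecond, scaled to iterations per second
  return (iterations / elapsedMs) * 1000
}
```

In practice the `bench` function handles this measurement; the sketch only shows what the reported number means.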
## When to Create Benchmarks

You should create a benchmark:

- **For every new step**: When implementing a new step plugin
- **For performance-critical operations**: Steps that are likely to be used frequently
- **After optimization**: To measure and validate performance improvements
- **For comparison**: To compare different implementation approaches

## Benchmark File Location

Benchmarks are located in the `/benchmarks` directory at the repository root:

```
benchmarks/
├── core.bench.ts        # Core functionality benchmarks
├── steps.bench.ts       # Common step benchmarks
├── all-steps.bench.ts   # Comprehensive step coverage
└── vitest.config.ts     # Benchmark configuration
```

## Basic Benchmark Structure

Use Vitest's `bench` function to define benchmarks:

```typescript
import { bench, describe } from 'vitest'
import { createValchecker, stepName } from '@valchecker/internal'

describe('Step Category - Operation Name', () => {
  bench('description of what is being benchmarked', () => {
    // Template: replace stepName and dependencies with the actual steps
    const v = createValchecker({ steps: [stepName, ...dependencies] })
    const schema = v.stepName(/* parameters */)
    schema.execute(/* test data */)
  })
})
```

## Step-by-Step Guide

### 1. Import Required Dependencies

```typescript
import { bench, describe } from 'vitest'
import { createValchecker, yourStep } from '@valchecker/internal'
```

### 2. Create a Describe Block

Group related benchmarks together:

```typescript
describe('Step Category - Your Step', () => {
  // benchmarks go here
})
```

**Category examples**:
- Type Validators (`string`, `number`, `boolean`)
- String Operations (`toLowercase`, `startsWith`)
- Numeric Constraints (`min`, `max`, `integer`)
- Array Operations (`toFiltered`, `toSorted`)
- Object Operations (`object`, `strictObject`)
- Composition (`union`, `intersection`)
- Advanced Operations (`check`, `transform`, `fallback`)

### 3. Define Individual Benchmarks

Each benchmark should test a specific use case:

```typescript
bench('stepName - specific scenario', () => {
  const v = createValchecker({ steps: [yourStep] })
  const schema = v.yourStep(/* params */)
  schema.execute(/* representative data */)
})
```

## Benchmark Naming Convention

Use descriptive names that clearly indicate what is being measured:

```typescript
// Good names
bench('string - basic validation')
bench('string - with startsWith constraint')
bench('array - 100 objects')
bench('object - 10 field validation')
bench('transform - multiple transformations')

// Bad names
bench('test1')
bench('string')
bench('benchmark')
```

## Choosing Test Data

Select test data that represents realistic use cases:

### Basic Operations
```typescript
bench('string - basic validation', () => {
  const v = createValchecker({ steps: [string] })
  const schema = v.string()
  schema.execute('hello world') // Simple, representative string
})
```

### Array Operations
```typescript
bench('array - 10 strings', () => {
  const v = createValchecker({ steps: [array, string] })
  const schema = v.array(v.string())
  // Use realistic array size
  schema.execute(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])
})
```

### Object Operations
```typescript
bench('object - 3 field validation', () => {
  const v = createValchecker({ steps: [object, string, number] })
  const schema = v.object({
    name: v.string(),
    age: v.number(),
    email: v.string(),
  })
  // Representative object
  schema.execute({ name: 'John', age: 30, email: 'john@example.com' })
})
```

## Common Benchmark Patterns

### Testing Different Scales

Benchmark the same operation at different scales to understand how its cost grows with input size:

```typescript
describe('Array Operations - Scalability', () => {
  // Build inputs once, outside the timed callbacks, so fixture creation is not measured
  const data10 = Array.from({ length: 10 }, (_, i) => `item${i}`)
  const data100 = Array.from({ length: 100 }, (_, i) => `item${i}`)
  const data1000 = Array.from({ length: 1000 }, (_, i) => `item${i}`)

  bench('array - 10 elements', () => {
    const v = createValchecker({ steps: [array, string] })
    const schema = v.array(v.string())
    schema.execute(data10)
  })

  bench('array - 100 elements', () => {
    const v = createValchecker({ steps: [array, string] })
    const schema = v.array(v.string())
    schema.execute(data100)
  })

  bench('array - 1000 elements', () => {
    const v = createValchecker({ steps: [array, string] })
    const schema = v.array(v.string())
    schema.execute(data1000)
  })
})
```

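One way to read results gathered at several scales is to convert each ops/sec figure into a per-element cost and compare them. The sketch below assumes hypothetical helper names (`nsPerElement`, `scalingRatio`) and made-up result figures; none of it comes from the Valchecker codebase.

```typescript
// Hypothetical result shape: one entry per benchmarked input size.
interface ScaleResult {
  size: number
  opsPerSec: number
}

// Nanoseconds spent per element, derived from ops/sec at a given input size.
function nsPerElement({ size, opsPerSec }: ScaleResult): number {
  return 1e9 / opsPerSec / size
}

// Ratio between the highest and lowest per-element cost.
// A value near 1 suggests roughly linear scaling across the measured sizes.
function scalingRatio(results: ScaleResult[]): number {
  if (results.length === 0)
    throw new Error('at least one result is required')
  const costs = results.map(nsPerElement)
  return Math.max(...costs) / Math.min(...costs)
}

// Made-up figures for illustration only.
const linearResults: ScaleResult[] = [
  { size: 10, opsPerSec: 100_000 },
  { size: 100, opsPerSec: 10_000 },
  { size: 1000, opsPerSec: 1_000 },
]
console.log(scalingRatio(linearResults))
```

A ratio well above 1 indicates that larger inputs cost disproportionately more per element, which may be worth investigating.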
### Testing Chained Operations

Benchmark steps in isolation and when chained:

```typescript
describe('Chained String Operations', () => {
  bench('toLowercase - alone', () => {
    const v = createValchecker({ steps: [string, toLowercase] })
    const schema = v.string().toLowercase()
    schema.execute('HELLO WORLD')
  })

  bench('toLowercase + startsWith - chained', () => {
    const v = createValchecker({ steps: [string, toLowercase, startsWith] })
    const schema = v.string().toLowercase().startsWith('hello')
    schema.execute('HELLO WORLD')
  })
})
```

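Comparing the isolated and chained figures gives a rough estimate of what the extra step costs. The following sketch uses hypothetical helper names and made-up ops/sec values; it simply converts each figure to a mean time per operation and subtracts.

```typescript
// Mean time per operation in microseconds for a given ops/sec figure.
function meanMicros(opsPerSec: number): number {
  return 1e6 / opsPerSec
}

// Approximate extra time per operation introduced by the chained step(s).
function chainedOverheadMicros(aloneOpsPerSec: number, chainedOpsPerSec: number): number {
  return meanMicros(chainedOpsPerSec) - meanMicros(aloneOpsPerSec)
}

// Made-up figures: 500,000 ops/sec alone versus 250,000 ops/sec chained.
console.log(chainedOverheadMicros(500_000, 250_000))
```

Because both figures carry their own margin of error, a small difference should be treated as noise rather than as a measured cost.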
### Success vs Failure Paths

Benchmark both success and failure scenarios when relevant:

```typescript
describe('Fallback Operations', () => {
  bench('fallback - success path (no fallback needed)', () => {
    const v = createValchecker({ steps: [string, fallback] })
    const schema = v.string().fallback('default')
    schema.execute('hello') // Valid input
  })

  bench('fallback - failure path (fallback applied)', () => {
    const v = createValchecker({ steps: [string, fallback] })
    const schema = v.string().fallback('default')
    schema.execute(42) // Invalid input triggers fallback
  })
})
```

## Running Benchmarks

### Run All Benchmarks
```bash
pnpm bench
```

### Run Specific Benchmark File
```bash
pnpm bench all-steps.bench.ts
```

### Watch Mode (for development)
```bash
pnpm bench:watch
```

## Interpreting Results

Benchmark output shows:
- **ops/sec**: Operations per second (higher is better)
- **mean**: Average execution time
- **min/max**: Fastest and slowest execution times
- **margin**: Statistical margin of error

Example output:
```
✓ benchmarks all-steps.bench.ts > Type Validators
  · string - basic validation  415,374 ops/sec ±8.34%
  · number - basic validation  391,341 ops/sec ±8.59%
```

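The relationship between these figures can be sketched from a list of per-run timings. This is an illustration only: `summarize` is a hypothetical function, and it assumes the margin is a relative margin of error at roughly 95% confidence using a normal approximation (critical value 1.96). The benchmark runner may use a different critical value or sampling strategy.

```typescript
interface SampleStats {
  mean: number // average time per run, in milliseconds
  min: number
  max: number
  opsPerSec: number
  marginPercent: number // relative margin of error, in percent
}

// Summarize per-run timings (milliseconds). Assumes a normal approximation.
function summarize(samplesMs: number[]): SampleStats {
  const n = samplesMs.length
  if (n < 2)
    throw new Error('need at least two samples')

  const mean = samplesMs.reduce((sum, x) => sum + x, 0) / n
  // Sample variance (n - 1 in the denominator)
  const variance = samplesMs.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (n - 1)
  const standardError = Math.sqrt(variance / n)
  const marginOfError = 1.96 * standardError // assumed 95% critical value

  return {
    mean,
    min: Math.min(...samplesMs),
    max: Math.max(...samplesMs),
    opsPerSec: 1000 / mean,
    marginPercent: (marginOfError / mean) * 100,
  }
}
```

A margin of several percent, as in the example output, means two results within that range of each other should not be read as a real difference.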
## Benchmark Checklist

When creating a benchmark for a step:

- [ ] **Location**: Benchmark is in the appropriate file (`all-steps.bench.ts` or a category-specific file)
- [ ] **Imports**: All required steps and utilities are imported
- [ ] **Describe Block**: Benchmark is grouped in a descriptive `describe()` block
- [ ] **Naming**: Benchmark name clearly describes what is being measured
- [ ] **Test Data**: Uses realistic, representative test data
- [ ] **Scale**: Tests relevant scales (e.g., small, medium, large arrays)
- [ ] **Scenarios**: Covers important scenarios (success/failure, chained/isolated)
- [ ] **Execution**: Benchmark actually executes the step (calls `.execute()`)
- [ ] **No Side Effects**: Benchmark doesn't modify external state
- [ ] **Consistency**: Follows the same patterns as existing benchmarks

## Best Practices

### DO:
- ✅ Use realistic test data
- ✅ Test at multiple scales when relevant
- ✅ Group related benchmarks together
- ✅ Use descriptive names
- ✅ Benchmark the actual execution path users will use
- ✅ Include both simple and complex scenarios
- ✅ Run benchmarks before and after optimizations

### DON'T:
- ❌ Use trivial or unrealistic test data
- ❌ Create benchmarks that don't actually execute the step
- ❌ Benchmark setup/initialization code (measure only the execution)
- ❌ Modify external state within benchmarks
- ❌ Use random data (it makes results inconsistent between runs)
- ❌ Copy-paste benchmarks without understanding them
- ❌ Ignore benchmark failures or warnings

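When a benchmark needs varied input but must stay reproducible, one option is a seeded pseudo-random generator instead of `Math.random()`. The sketch below uses the widely known mulberry32 generator; `makeFixture` and its parameters are hypothetical names chosen for illustration, not Valchecker utilities.

```typescript
// mulberry32: a small seeded PRNG returning values in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0
  return () => {
    a = (a + 0x6D2B79F5) | 0
    let t = Math.imul(a ^ (a >>> 15), a | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Hypothetical fixture builder: the same seed always yields the same array.
function makeFixture(length: number, seed = 1): string[] {
  const random = mulberry32(seed)
  return Array.from({ length }, () => `item${Math.floor(random() * 1000)}`)
}

// Build the fixture once, outside any timed callback.
const fixture = makeFixture(100, 42)
console.log(fixture.length)
```

Because the seed is fixed, every run validates identical data, so differences in ops/sec reflect the code rather than the input.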
## Example: Complete Benchmark for a New Step

Here's a complete example for a hypothetical `capitalize` step:

```typescript
import { bench, describe } from 'vitest'
import { capitalize, createValchecker, string, toLowercase } from '@valchecker/internal'

describe('String Operations - Capitalize', () => {
  // Built once, outside the timed callback, so string creation is not measured
  const longString = 'this is a very long string with many words to test performance'.repeat(10)

  // Basic operation
  bench('capitalize - single word', () => {
    const v = createValchecker({ steps: [string, capitalize] })
    const schema = v.string().capitalize()
    schema.execute('hello')
  })

  // Multiple words
  bench('capitalize - multiple words', () => {
    const v = createValchecker({ steps: [string, capitalize] })
    const schema = v.string().capitalize()
    schema.execute('hello world from valchecker')
  })

  // Long string
  bench('capitalize - long string', () => {
    const v = createValchecker({ steps: [string, capitalize] })
    const schema = v.string().capitalize()
    schema.execute(longString)
  })

  // Chained operations
  bench('capitalize - chained with toLowercase', () => {
    const v = createValchecker({ steps: [string, toLowercase, capitalize] })
    const schema = v.string().toLowercase().capitalize()
    schema.execute('HELLO WORLD')
  })
})
```

## Integration with Development Workflow

### 1. Create the Step
Follow [How to Define a Step](./how-to-define-create-step.md).

### 2. Write Tests
Follow [How to Test a Step](./how-to-test-a-step.md).

### 3. Create Benchmark
Follow this guide to create performance benchmarks.

### 4. Run Verification
```bash
# Ensure tests pass
pnpm test

# Run benchmarks
pnpm bench

# Check for performance regressions
# Compare ops/sec with previous baseline
```

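The baseline comparison in the last step can be automated with a small script. The sketch below is an assumption-laden illustration: `findRegressions`, the record shape, and the 10% default tolerance are all hypothetical, and the tolerance should be chosen with the reported margin of error in mind.

```typescript
interface RegressionReport {
  name: string
  baseline: number // ops/sec recorded earlier
  current: number // ops/sec from the latest run
  changePercent: number // negative means slower
}

// Flag benchmarks whose ops/sec dropped by more than the tolerance.
function findRegressions(
  baseline: Record<string, number>,
  current: Record<string, number>,
  tolerancePercent = 10, // assumed default, not a project setting
): RegressionReport[] {
  const reports: RegressionReport[] = []
  for (const name of Object.keys(baseline)) {
    const before = baseline[name]
    const after = current[name]
    // Skip benchmarks that are missing from either run.
    if (before === undefined || after === undefined)
      continue
    const changePercent = ((after - before) / before) * 100
    if (changePercent < -tolerancePercent)
      reports.push({ name, baseline: before, current: after, changePercent })
  }
  return reports
}

// Made-up figures for illustration only.
const regressions = findRegressions(
  { 'string - basic validation': 400_000, 'number - basic validation': 390_000 },
  { 'string - basic validation': 300_000, 'number - basic validation': 385_000 },
)
console.log(regressions.map(r => r.name))
```

With these made-up numbers only the first benchmark is flagged, since a drop of about 1% falls well inside the tolerance.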
## Maintaining Benchmarks

- **Update benchmarks** when step implementation changes significantly
- **Add benchmarks** for new scenarios or edge cases
- **Review benchmarks** during code reviews
- **Track performance** over time to catch regressions
- **Document significant changes** in benchmark results

## Additional Resources

- [Vitest Benchmarking Guide](https://vitest.dev/guide/features.html#benchmarking)
- [PERFORMANCE_REPORT.md](../benchmarks/PERFORMANCE_REPORT.md) - Performance optimization analysis
- [How to Test a Step](./how-to-test-a-step.md) - Testing guide
- [How to Define a Step](./how-to-define-create-step.md) - Step creation guide

agents_guides/how-to-define-create-step.md

Lines changed: 20 additions & 2 deletions
@@ -284,5 +284,23 @@ Follow this checklist to ensure your step is correctly implemented:
 - [ ] **Message Resolution**: All error messages use `resolveMessage()` to support custom, default, and global messages.
 - [ ] **Tree-Shaking**: The export includes the `/* @__NO_SIDE_EFFECTS__ */` comment.
 - [ ] **File Organization**: The step is in its own file (`/steps/stepName/stepName.ts`) and exported from `/steps/index.ts`.
-
-> **Note**: Each step file should contain only one step method. This means each plugin will register only one method on the validation chain.
+- [ ] **Tests**: Comprehensive tests are written following [How to Test a Step](./how-to-test-a-step.md).
+- [ ] **Benchmarks**: Performance benchmarks are created following [How to Create a Benchmark](./how-to-create-benchmark.md).
+
+> **Note**: Each step file should contain only one step method. This means each plugin will register only one method on the validation chain.
+
+## Development Workflow
+
+The recommended workflow for creating a step is:
+
+1. **Define the step** following this guide
+2. **Write comprehensive tests** following [How to Test a Step](./how-to-test-a-step.md)
+3. **Create performance benchmarks** following [How to Create a Benchmark](./how-to-create-benchmark.md)
+4. **Run verification**:
+   ```bash
+   pnpm lint       # Ensure code style compliance
+   pnpm typecheck  # Verify TypeScript types
+   pnpm test       # Ensure all tests pass
+   pnpm bench      # Measure performance
+   ```
+5. **Review and iterate** based on test results and benchmark performance
