st speed
Compares AI provider performance across a container: generation time, tokens per second, fact-checking throughput, and consistency. Useful for choosing a provider when speed matters.
Run after: st-bang st-cross
Related: st-stones st-cross st-heatmap Multi-Model
Multi-model (0.9.0+): when same-make agents (e.g. `anthropic-opus` and `anthropic-sonnet`) appear in a container, `st-speed` shows one row per agent with `make:model` labels for disambiguation. See Multi-Model.
st-speed report.json # Full performance summary (all AIs)
st-speed --agent gemini report.json # Filter display to one AI
st-speed --agent openai --ai-caption report.json # All-AI summary + caption written by OpenAI
st-speed --ai-short report.json # All-AI summary + short caption (default AI)
st-speed --history crypto/*.json # Trend analysis across multiple files
st-speed --csv report.json # Export raw timing data to CSV
st-speed projector_sonos_options.json
Performance Summary: projector_sonos_options.json
======================================================================
Story Generation:
AI Time Tokens Tok/s Samples
openai 00:18 1631 88.57 1
perplexity 00:23 2725 115.57 1
gemini 00:53 11141 207.56 1
anthropic 01:17 4421 57.37 1
xai (cache) 3687 — 1
Fact-Checking Performance:
AI Avg Median Min Max StdDev Samples Segments
openai 01:53 01:43 00:50 03:09 50.4s 5 29/job
perplexity 02:59 02:31 01:41 05:16 83.2s 5 29/job
gemini 05:21 04:58 02:51 08:32 124.4s 5 29/job
xai 06:16 05:53 03:22 09:50 140.0s 5 29/job
anthropic 10:15 09:46 05:04 16:45 251.9s 5 29/job
Note: Each sample is one complete fact-check job.
'Segments' shows avg AI calls per job (typically 20-50 paragraphs).
======================================================================
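As a rough guide to how the columns relate: the Tok/s values above are consistent with tokens divided by elapsed seconds (the displayed time is truncated to whole seconds, so dividing the displayed values gives a slightly different figure), and the fact-checking columns are ordinary summary statistics over per-job durations. A minimal Python sketch follows. The helper names are hypothetical, the durations are made up, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics

def fmt_mmss(seconds: float) -> str:
    """Format a duration as MM:SS, truncating fractional seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

def tokens_per_second(tokens: int, seconds: float) -> float:
    """Throughput; returns 0.0 when no time was recorded (e.g. a cache hit)."""
    return tokens / seconds if seconds > 0 else 0.0

def summarize(durations: list[float]) -> dict[str, float]:
    """Summary statistics over per-job durations in seconds."""
    return {
        "avg": statistics.mean(durations),
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
        # Sample standard deviation (an assumption); needs two or more samples.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }

# Illustrative values only, not taken from a real container.
jobs = [60.0, 90.0, 120.0, 150.0, 180.0]
stats = summarize(jobs)
print(fmt_mmss(stats["avg"]), fmt_mmss(stats["median"]), f"{stats['stdev']:.1f}s")
print(f"{tokens_per_second(1000, 12.5):.2f}")
```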
Using --agent openai --ai-caption generates the caption with OpenAI but still shows all
providers in the performance table:
st-speed --agent openai --ai-caption projector_sonos_options.json
Performance Summary: projector_sonos_options.json
======================================================================
Story Generation:
AI Time Tokens Tok/s Samples
openai 00:18 1631 88.57 1
perplexity 00:23 2725 115.57 1
gemini 00:53 11141 207.56 1
anthropic 01:17 4421 57.37 1
xai (cache) 3687 — 1
Fact-Checking Performance:
AI Avg Median Min Max StdDev Samples Segments
openai 01:53 01:43 00:50 03:09 50.4s 5 29/job
perplexity 02:59 02:31 01:41 05:16 83.2s 5 29/job
gemini 05:21 04:58 02:51 08:32 124.4s 5 29/job
xai 06:16 05:53 03:22 09:50 140.0s 5 29/job
anthropic 10:15 09:46 05:04 16:45 251.9s 5 29/job
Note: Each sample is one complete fact-check job.
'Segments' shows avg AI calls per job (typically 20-50 paragraphs).
Detailed Caption (generated by openai):
──────────────────────────────────────────────────────────────────────
OpenAI leads the speed race on both fronts — wrapping up story
generation in under 20 seconds and completing a full 29-segment
fact-check in under 2 minutes on average. Perplexity is a close
second for generation but falls to nearly 3 minutes per fact-check.
Gemini, xAI, and Anthropic bring up the rear, with Anthropic averaging
over 10 minutes per job and a standard deviation of over 4 minutes —
making it the least predictable choice. For time-sensitive workflows,
OpenAI or Perplexity are the clear picks; Gemini sits in the middle
ground. Anthropic's high variance suggests it should only be used
where response time is not a constraint.
──────────────────────────────────────────────────────────────────────
======================================================================
Key behaviour:
When combined with an `--ai-*` flag, `--agent` selects which AI writes the caption; it does not filter the performance table. All providers are still shown so you get the full comparison. Use `--agent` without any `--ai-*` flag if you want to filter the display to a single provider.
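
The rule above can be sketched as a small decision function. This illustrates the documented behaviour only and is not st-speed's actual code; `resolve_agent_roles`, `AgentRoles`, and the use of `None` for "not given" are hypothetical.

```python
from typing import NamedTuple, Optional

class AgentRoles(NamedTuple):
    writer: Optional[str]          # provider that writes AI content, if any
    display_filter: Optional[str]  # provider the table is limited to, if any

def resolve_agent_roles(agent: Optional[str], ai_flag_set: bool) -> AgentRoles:
    """Map --agent and the presence of an --ai-* flag to the two roles."""
    if ai_flag_set:
        # Content is requested: --agent picks the writer, the table stays complete.
        return AgentRoles(writer=agent or "auto", display_filter=None)
    # No content requested: --agent (if given) only filters the display.
    return AgentRoles(writer=None, display_filter=agent)

print(resolve_agent_roles("openai", ai_flag_set=True))
print(resolve_agent_roles("gemini", ai_flag_set=False))
```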
| Flag | Output | Length |
|---|---|---|
| `--ai-title` | Punchy headline | ≤ 10 words |
| `--ai-short` | One-paragraph summary | ≤ 80 words |
| `--ai-caption` | Two-paragraph detailed caption | 100–160 words |
| `--ai-summary` | Technical summary with recommendations | 120–200 words |
| `--ai-story` | Full narrative report (saved to JSON) | 800–1200 words |
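
If you post-process generated content, the length targets in the table can be checked with a simple word count. This is a sketch that assumes the limits count whitespace-separated words; `WORD_LIMITS` and `within_limit` are hypothetical names and not part of st-speed.

```python
# (min_words, max_words) per content flag, copied from the table above.
WORD_LIMITS = {
    "--ai-title": (0, 10),
    "--ai-short": (0, 80),
    "--ai-caption": (100, 160),
    "--ai-summary": (120, 200),
    "--ai-story": (800, 1200),
}

def within_limit(flag: str, text: str) -> bool:
    """True when the text's word count falls inside the flag's range."""
    low, high = WORD_LIMITS[flag]
    return low <= len(text.split()) <= high

print(within_limit("--ai-title", "OpenAI leads the speed race"))
```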
Combine any content flag with --agent <provider> to choose who writes it:
st-speed --agent anthropic --ai-summary report.json
st-speed --agent gemini --ai-story report.json
| Option | Description |
|---|---|
| `file.json [file.json …]` | Path to one or more JSON container files |
| `--agent AI` | AI for content generation (default: auto). When used with `--ai-*` flags, selects which AI generates the content but does not filter the performance display. Without `--ai-*` flags, also filters the display to one provider. |
| `--csv CSV` | Export raw timing data to a CSV file |
| `--history` | Analyze trends across multiple files |
| `--cache` | Enable API response caching (default for AI content generation) |
| `--no-cache` | Disable API response caching (forces fresh AI calls) |
| `-q`, `--quiet` | Minimal output |
| `-v`, `--verbose` | Verbose output (show generation details) |
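
For readers scripting around the tool, the option table maps naturally onto an argparse definition. The following is reconstructed from the table only and is not st-speed's source; the `dest` names, the help strings, and the choice to model the `--ai-*` flags as independent booleans are assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="st-speed")
    parser.add_argument("files", nargs="+", metavar="file.json",
                        help="one or more JSON container files")
    parser.add_argument("--agent", metavar="AI", default="auto",
                        help="AI for content generation")
    parser.add_argument("--csv", metavar="CSV",
                        help="export raw timing data to a CSV file")
    parser.add_argument("--history", action="store_true",
                        help="analyze trends across multiple files")
    # --cache / --no-cache share one destination; caching defaults to on.
    parser.add_argument("--cache", dest="cache", action="store_true", default=True)
    parser.add_argument("--no-cache", dest="cache", action="store_false")
    parser.add_argument("-q", "--quiet", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    for name in ("title", "short", "caption", "summary", "story"):
        parser.add_argument(f"--ai-{name}", action="store_true")
    return parser

args = build_parser().parse_args(
    ["--agent", "openai", "--ai-caption", "report.json"]
)
print(args.agent, args.ai_caption, args.files)
```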
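st-speed reads timing{} dicts from data[] entries (generation) and fact[].timing dicts
(fact-checking). Timing is written by st-gen / st-fact on every non-cached call
and is absent on cache hits, which is why a cached generation shows (cache) instead of a time, as in the xai row above.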
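
A minimal sketch of that lookup, assuming a missing `timing` key marks a cache hit. The container fragment is made up, and the `make` and `elapsed` keys are hypothetical; check a real container before relying on any field name.

```python
import json

def split_by_timing(entries: list[dict]) -> tuple[list[dict], int]:
    """Return (timing dicts found, number of entries without one)."""
    timings = [e["timing"] for e in entries if isinstance(e.get("timing"), dict)]
    return timings, len(entries) - len(timings)

# Made-up container fragment; "make" and "elapsed" are hypothetical key names.
container = json.loads("""
{
  "data": [
    {"make": "openai", "timing": {"elapsed": 18.4}},
    {"make": "xai"}
  ]
}
""")

fresh, cached = split_by_timing(container["data"])
print(len(fresh), "fresh,", cached, "cached")
```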
For fact-checks, elapsed time is extrapolated from fresh segments so that partially-cached
runs remain comparable with fully-fresh ones (see extract_fact_check_timing() in source).
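
One plausible reading of that extrapolation is linear scaling: time per fresh segment multiplied by the total segment count. The sketch below shows that arithmetic only. It is not a copy of extract_fact_check_timing(), and the function and parameter names are hypothetical.

```python
from typing import Optional

def extrapolate_elapsed(fresh_seconds: float, fresh_segments: int,
                        total_segments: int) -> Optional[float]:
    """Estimate the fully-fresh duration of a partially-cached job."""
    if fresh_segments <= 0:
        return None  # every segment was cached; nothing to extrapolate from
    return fresh_seconds / fresh_segments * total_segments

# 10 of 29 segments ran fresh in 40 s, so the whole job is estimated at 116 s.
print(extrapolate_elapsed(40.0, 10, 29))
```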