st speed

st-speed — Analyze AI provider performance and speed

Compares AI provider performance across a container: generation time, tokens per second, fact-checking throughput, and consistency. Useful for choosing a provider when speed matters.

Run after: st-bang, st-cross

Related: st-stones, st-cross, st-heatmap, Multi-Model

Multi-model (0.9.0+): when same-make agents (e.g. anthropic-opus and anthropic-sonnet) appear in a container, st-speed shows one row per agent with make:model labels for disambiguation. See Multi-Model.
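
The labelling rule can be sketched in a few lines of Python. This is an illustration only, not st-speed's actual code: the function name `row_labels`, the `(make, model)` tuple input, and the choice to show a bare make when it is unique in the container are all assumptions.

```python
from collections import Counter


def row_labels(agents: list[tuple[str, str]]) -> list[str]:
    """Return one display label per agent.

    Assumption: a make that appears once is shown by make alone, while
    makes that appear more than once are shown as make:model.
    """
    makes = Counter(make for make, _model in agents)
    return [
        f"{make}:{model}" if makes[make] > 1 else make
        for make, model in agents
    ]


if __name__ == "__main__":
    # Hypothetical container contents; the model names are placeholders.
    agents = [("anthropic", "opus"), ("anthropic", "sonnet"), ("openai", "default")]
    for label in row_labels(agents):
        print(label)
```

With the sample list above, the two Anthropic agents get separate `anthropic:opus` and `anthropic:sonnet` rows while `openai` keeps its short label.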

Usage

st-speed report.json                          # Full performance summary (all AIs)
st-speed --agent gemini report.json              # Filter display to one AI
st-speed --agent openai --ai-caption report.json # All-AI summary + caption written by OpenAI
st-speed --ai-short report.json               # All-AI summary + short caption (default AI)
st-speed --history crypto/*.json              # Trend analysis across multiple files
st-speed --csv report.json                    # Export raw timing data to CSV

Example output

Basic performance summary

st-speed projector_sonos_options.json

Performance Summary: projector_sonos_options.json
======================================================================

Story Generation:
AI            Time    Tokens    Tok/s    Samples

openai        00:18     1631    88.57          1
perplexity    00:23     2725   115.57          1
gemini        00:53    11141   207.56          1
anthropic     01:17     4421    57.37          1
xai (cache)            3687        —          1

Fact-Checking Performance:
AI            Avg     Median    Min     Max     StdDev    Samples    Segments

openai        01:53    01:43    00:50   03:09    50.4s          5    29/job
perplexity    02:59    02:31    01:41   05:16    83.2s          5    29/job
gemini        05:21    04:58    02:51   08:32   124.4s          5    29/job
xai           06:16    05:53    03:22   09:50   140.0s          5    29/job
anthropic     10:15    09:46    05:04   16:45   251.9s          5    29/job

Note: Each sample is one complete fact-check job.
      'Segments' shows avg AI calls per job (typically 20-50 paragraphs).
======================================================================
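
To make the columns concrete, here is a small Python sketch of the arithmetic behind them. It assumes that Tok/s is simply tokens divided by elapsed seconds and that the Time column truncates to whole seconds; both assumptions are consistent with the rows above but are not confirmed from the source. The helper names (`implied_seconds`, `format_mmss`, `summarize`) are hypothetical, and the use of the sample standard deviation is a guess.

```python
import statistics


def implied_seconds(tokens: int, tok_per_s: float) -> float:
    """Elapsed seconds implied by a token count and a tokens-per-second rate."""
    return tokens / tok_per_s


def format_mmss(seconds: float) -> str:
    """Format seconds as MM:SS, truncating the fractional part (assumption)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def summarize(durations: list[float]) -> dict[str, float]:
    """Summary statistics over per-job durations in seconds (needs >= 2 samples)."""
    return {
        "avg": statistics.mean(durations),
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
        # Sample standard deviation; st-speed may use the population form.
        "stdev": statistics.stdev(durations),
    }


if __name__ == "__main__":
    # (tokens, tok/s) copied from the Story Generation table above.
    rows = {
        "openai": (1631, 88.57),
        "perplexity": (2725, 115.57),
        "gemini": (11141, 207.56),
        "anthropic": (4421, 57.37),
    }
    for name, (tokens, rate) in rows.items():
        print(f"{name:<12} {format_mmss(implied_seconds(tokens, rate))}")
```

Running it reproduces the Time column for the four non-cached providers (00:18, 00:23, 00:53, 01:17). The cached xai row has no timing, so no rate can be derived for it.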

With AI-generated caption

Using --agent openai --ai-caption generates the caption with OpenAI but still shows all providers in the performance table:

st-speed --agent openai --ai-caption projector_sonos_options.json

Performance Summary: projector_sonos_options.json
======================================================================

Story Generation:
AI            Time    Tokens    Tok/s    Samples

openai        00:18     1631    88.57          1
perplexity    00:23     2725   115.57          1
gemini        00:53    11141   207.56          1
anthropic     01:17     4421    57.37          1
xai (cache)            3687        —          1

Fact-Checking Performance:
AI            Avg     Median    Min     Max     StdDev    Samples    Segments

openai        01:53    01:43    00:50   03:09    50.4s          5    29/job
perplexity    02:59    02:31    01:41   05:16    83.2s          5    29/job
gemini        05:21    04:58    02:51   08:32   124.4s          5    29/job
xai           06:16    05:53    03:22   09:50   140.0s          5    29/job
anthropic     10:15    09:46    05:04   16:45   251.9s          5    29/job

Note: Each sample is one complete fact-check job.
      'Segments' shows avg AI calls per job (typically 20-50 paragraphs).

Detailed Caption (generated by openai):
──────────────────────────────────────────────────────────────────────
OpenAI leads the speed race on both fronts — wrapping up story
generation in under 20 seconds and completing a full 29-segment
fact-check in under 2 minutes on average. Perplexity is a close
second for generation but falls to nearly 3 minutes per fact-check.
Gemini, xAI, and Anthropic bring up the rear, with Anthropic averaging
over 10 minutes per job and a standard deviation of over 4 minutes —
making it the least predictable choice. For time-sensitive workflows,
OpenAI or Perplexity are the clear picks; Gemini sits in the middle
ground. Anthropic's high variance suggests it should only be used
where response time is not a constraint.
──────────────────────────────────────────────────────────────────────
======================================================================

Key behaviour: --agent selects which AI writes the caption — it does not filter the performance table. All providers are always shown so you get the full comparison. Use --agent without any --ai-* flag if you want to filter the display to a single provider.

AI content options

Flag	Output	Length
`--ai-title`	Punchy headline	≤ 10 words
`--ai-short`	One-paragraph summary	≤ 80 words
`--ai-caption`	Two-paragraph detailed caption	100–160 words
`--ai-summary`	Technical summary with recommendations	120–200 words
`--ai-story`	Full narrative report (saved to JSON)	800–1200 words

Combine any content flag with --agent <provider> to choose who writes it:

st-speed --agent anthropic --ai-summary report.json
st-speed --agent gemini --ai-story report.json

Options

Option	Description
`file.json [file.json …]`	Path to one or more JSON container files
`--agent AI`	AI for content generation (default: auto). When used with `--ai-*` flags, selects which AI generates the content but does not filter the performance display. Without `--ai-*` flags, also filters the display to one provider.
`--csv CSV`	Export raw timing data to a CSV file
`--history`	Analyze trends across multiple files
`--cache`	Enable API response caching (default for AI content generation)
`--no-cache`	Disable API response caching (forces fresh AI calls)
`-q`, `--quiet`	Minimal output
`-v`, `--verbose`	Verbose output (show generation details)

For developers

Reads timing{} dicts from data[] entries (generation) and fact[].timing dicts (fact-checking). Timing is written by st-gen / st-fact on every non-cached call and is absent on cache hits.
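
A minimal Python sketch of that collection step follows. Only the key names `data`, `fact`, and `timing` come from the description above. The nesting of `fact[]` under each `data[]` entry, the function name `collect_timings`, and the `elapsed` field in the sample are assumptions, so adjust them to the real container layout.

```python
from typing import Any


def collect_timings(container: dict[str, Any]) -> tuple[list[dict], list[dict]]:
    """Gather generation and fact-check timing dicts from a container.

    Entries without a timing dict (cache hits) are skipped.
    Assumption: fact[] lives inside each data[] entry.
    """
    generation: list[dict] = []
    fact_checks: list[dict] = []
    for entry in container.get("data", []):
        timing = entry.get("timing")
        if timing:  # absent on cache hits
            generation.append(timing)
        for fact in entry.get("fact", []):
            fact_timing = fact.get("timing")
            if fact_timing:
                fact_checks.append(fact_timing)
    return generation, fact_checks


if __name__ == "__main__":
    # Hypothetical container; "elapsed" is a placeholder field name.
    sample = {
        "data": [
            {"timing": {"elapsed": 18.4}, "fact": [{"timing": {"elapsed": 113.0}}]},
            {"fact": [{}]},  # cached generation and cached fact-check
        ]
    }
    gen, facts = collect_timings(sample)
    print(len(gen), len(facts))
```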

For fact-checks, elapsed time is extrapolated from fresh segments so that partially-cached runs remain comparable with fully-fresh ones (see extract_fact_check_timing() in source).
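
One plausible reading of this extrapolation is a linear scale-up: take the average time per fresh segment and multiply by the total segment count. The Python sketch below shows that idea only. It is not the body of extract_fact_check_timing(), and the function and parameter names are hypothetical.

```python
from typing import Optional


def extrapolate_elapsed(
    fresh_elapsed_seconds: float,
    fresh_segments: int,
    total_segments: int,
) -> Optional[float]:
    """Estimate the full-job time from the fresh (non-cached) segments.

    Assumption: every segment costs about the same, so the per-segment
    average over fresh segments scales linearly to the whole job.
    Returns None when no fresh segment was timed.
    """
    if fresh_segments <= 0:
        return None
    per_segment = fresh_elapsed_seconds / fresh_segments
    return per_segment * total_segments


if __name__ == "__main__":
    # Hypothetical job: 10 of 29 segments were fresh and took 60 s in total.
    print(extrapolate_elapsed(60.0, 10, 29))
```

A fully cached job has nothing to extrapolate from, which is why the sketch returns None in that case instead of a time.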
