Experiments/baseline#1

Merged
2imi9 merged 3 commits into main from experiments/baseline
Mar 27, 2026
@2imi9 2imi9 commented Mar 27, 2026

No description provided.

2imi9 and others added 3 commits March 27, 2026 15:11
- make_figures.py generates all paper figures from results_baseline.tsv
  and results_enhanced.tsv (fig1 head-to-head, fig2 efficiency, fig3 cost,
  progress.png README teaser)
- progress.png: new teaser showing baseline vs web-enhanced convergence
  curves with paper-citation annotations on the 5 kept improvements
  (replaces karpathy's original placeholder image)

Run: uv run python make_figures.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Separate training cost (always local) from code agent cost (optional API)
- Add per-experiment and 53-run cost estimates for Claude Haiku, Sonnet, GPT-4o
- Remove inaccurate "no cloud billing, no API tokens" claim
- Show that even with GPT-4o API, local training is still ~5x cheaper than full cloud H100

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reverting our custom progress.png while we redesign the experiments
to start both conditions from the same raw config for a proper A/B
comparison. Will regenerate figures once new enhanced_v2 data is ready.

make_figures.py is kept for when we re-run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@2imi9 2imi9 merged commit 5a3e806 into main Mar 27, 2026
2imi9 added a commit that referenced this pull request Apr 1, 2026
The original train.py selected the FA3 repo by compute capability:
- varunneal/flash-attention-3 for Hopper (SM 9.0)
- kernels-community/flash-attn3 for all other GPUs

This was lost during the V4 rewrite. Restored now.

Note: neither repo supports Blackwell (SM 12.0) yet, so Blackwell GPUs
fall through to FlexAttention. See #1, #4.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
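
The selection rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from train.py, which is not shown here. The function name `select_fa3_repo` and the two constants are hypothetical, and returning `None` for Blackwell is an assumption about how the FlexAttention fallback might be signalled. In practice the `(major, minor)` tuple would typically come from `torch.cuda.get_device_capability()`.

```python
from typing import Optional, Tuple

# Repo names are taken from the commit message; the constant names are hypothetical.
HOPPER_REPO = "varunneal/flash-attention-3"
DEFAULT_REPO = "kernels-community/flash-attn3"


def select_fa3_repo(capability: Tuple[int, int]) -> Optional[str]:
    """Pick an FA3 kernel repo for a CUDA compute capability (major, minor).

    Returns None when no FA3 repo is expected to work, in which case the
    caller would use FlexAttention instead (assumed convention).
    """
    major, minor = capability
    if major == 12:
        # Blackwell (SM 12.0): neither repo supports it yet, per the commit note.
        return None
    if (major, minor) == (9, 0):
        # Hopper (SM 9.0) uses the Hopper-specific build.
        return HOPPER_REPO
    # All other GPUs use the community build.
    return DEFAULT_REPO
```

Keeping the rule in a pure function that takes the capability tuple as an argument means it can be checked without a GPU present.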