
Dreamer-pixels lab: notebook execution + 3 asset figures #20

Merged
ChatGPU merged 2 commits into main from claude/epic-ritchie-A7YtN
May 27, 2026

@ChatGPU (Owner) commented May 27, 2026

Final run of the Dreamer-pixels reproduction lab.

The end-to-end notebook runs in ~5 min 10 s on CPU, of which world-model pretraining takes ~70 s and the full 8-cycle Dreamer loop ~232 s. Three assets are generated:

  • reconstruction_grid.png: the world model (WM) recovers cart/pole visual structure from cycle 1 onward, with slight ghosting only on frame 0, where the initial deterministic state h_0 = 0.
  • latent_vs_real_rollout.png: imagination tracks the real env for ~5 steps, then drifts; the pixel-MSE subplot shows a ~2× jump, which bounds the trustworthy imagination horizon.
  • return_vs_steps.png: return hovers near the random baseline (~25 vs ~20 for random). The README documents this honestly: at the chosen CPU budget the imagination horizon is too short to credit-assign a balance policy, but the architecture follows DreamerV1 (encoder + RSSM with deterministic h and stochastic Gaussian z + decoder + reward and continue heads, λ-returns in latent imagination), plus KL balancing with α = 0.8 as introduced in DreamerV2.
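
To make the KL balancing term concrete, here is a minimal sketch for a single 1-D Gaussian latent. It is not the lab's code: the function names `kl_gauss` and `balanced_kl_mean_grads` are hypothetical, and the gradients are written out analytically instead of relying on an autograd stop-gradient. It assumes the usual balanced form, loss = α · KL(sg(posterior) ‖ prior) + (1 − α) · KL(posterior ‖ sg(prior)), where sg is stop-gradient.

```python
import math


def kl_gauss(mu_q, std_q, mu_p, std_p):
    """KL(q || p) for 1-D Gaussians q = N(mu_q, std_q^2), p = N(mu_p, std_p^2)."""
    return (
        math.log(std_p / std_q)
        + (std_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * std_p ** 2)
        - 0.5
    )


def balanced_kl_mean_grads(mu_q, std_q, mu_p, std_p, alpha=0.8):
    """Gradients of the balanced KL with respect to the two means.

    q is the posterior, p is the prior. The alpha-weighted term only moves
    the prior (the posterior is held fixed), and the (1 - alpha)-weighted
    term only moves the posterior (the prior is held fixed). std_q is kept
    in the signature for symmetry; the mean gradients do not depend on it.
    """
    d_kl_d_mu_p = (mu_p - mu_q) / std_p ** 2  # pulls the prior mean toward the posterior
    d_kl_d_mu_q = (mu_q - mu_p) / std_p ** 2  # pulls the posterior mean toward the prior
    return {
        "prior_mean": alpha * d_kl_d_mu_p,
        "posterior_mean": (1.0 - alpha) * d_kl_d_mu_q,
    }


# Illustrative numbers only: posterior N(1, 1) against prior N(0, 1).
kl = kl_gauss(1.0, 1.0, 0.0, 1.0)
grads = balanced_kl_mean_grads(1.0, 1.0, 0.0, 1.0, alpha=0.8)
print(kl, grads)
```

The loss value itself is unchanged by balancing, since α · KL + (1 − α) · KL = KL. Only the gradient split changes: with α = 0.8 the prior is trained toward the posterior four times as strongly as the posterior is regularized toward the prior.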

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7


Generated by Claude Code

claude added 2 commits May 27, 2026 17:35
Trim the world model's training context so that the lab finishes on
CPU well under the 8-minute ceiling without losing the latent
imagination story.

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7

The Dreamer-pixels reproduction lab finished its full pipeline:
- 12-cell notebook runs ~5 min 10 s on CPU.
- World-model pretraining ~70 s; full Dreamer 8-cycle loop ~232 s.
- assets/reconstruction_grid.png shows the WM recovers cart/pole
  visual structure from the first cycle onward (slight ghosting only
  on frame 0 where h_0 = 0).
- assets/latent_vs_real_rollout.png shows the imagination tracks the
  real env for ~5 steps then drifts; the pixel-MSE subplot exhibits
  the ~2x jump that puts a hard cap on imagination depth.
- assets/return_vs_steps.png shows return hovering near the random
  baseline (~25 vs ~20). README documents this honestly: at the chosen
  CPU budget the imagination horizon is too short to credit-assign a
  balance policy, but the architecture follows DreamerV1 (encoder +
  RSSM with deterministic h + stochastic Gaussian z + decoder + reward
  + continue heads, lambda-returns in latent imagination), plus
  DreamerV2-style KL balancing with alpha=0.8.

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7
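
The lambda-returns mentioned in the commit message can be sketched with the standard backward recursion, V_t = r_t + d_t * ((1 - lambda) * v_{t+1} + lambda * V_{t+1}), bootstrapped from the critic's value at the end of the imagined rollout. This is a pure-Python sketch under stated assumptions, not the notebook's implementation: the name `lambda_returns`, the argument layout, and the default `lam=0.95` are illustrative, and `discounts[t]` stands for gamma times the continue head's prediction.

```python
def lambda_returns(rewards, values, bootstrap, discounts, lam=0.95):
    """Compute lambda-returns over an imagined rollout of length H.

    rewards[t], discounts[t]: predicted reward and discount at step t.
    values[t]: critic value v(s_t) for t = 0..H-1 (values[0] is unused,
               since the recursion only needs the value of the next state).
    bootstrap: critic value v(s_H) of the state after the last step.
    """
    horizon = len(rewards)
    if not (len(values) == horizon and len(discounts) == horizon):
        raise ValueError("rewards, values and discounts must have equal length")

    next_values = list(values[1:]) + [bootstrap]  # v(s_{t+1}) for each t
    returns = [0.0] * horizon
    running = bootstrap  # lambda-return of the step after the last one
    for t in reversed(range(horizon)):
        running = rewards[t] + discounts[t] * (
            (1.0 - lam) * next_values[t] + lam * running
        )
        returns[t] = running
    return returns


# Illustrative numbers only: a 3-step imagined rollout.
print(lambda_returns([1.0, 1.0, 1.0], [5.0, 6.0, 7.0], 8.0, [0.9, 0.9, 0.9]))
```

Setting `lam=0` reduces this to one-step TD targets, and `lam=1` gives the discounted sum of rewards plus the bootstrapped final value. Intermediate values trade off reliance on the critic against reliance on the imagined rewards, which matters when the imagination is only trustworthy for a few steps.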