Hide & Seek 2.0 — v1.9.0 (it actually learns: CPU self-play)

@GeFAA GeFAA released this 19 Jun 22:06
· 1 commit to main since this release

Live: https://gefaa.github.io/hide-and-seek-2/ (hard-refresh with Ctrl+Shift+R to bypass the cached copy).

It actually learns now — no more scripted scenarios

The demo behaviour is no longer scripted or simulated. It is produced by agents
that learn through self-play on the CPU (see learn/; tabular Q-learning, no GPU
required). Two agents start from random policies and improve by playing each other:

seeker sight-rate vs a random hider:  0.11 -> 0.70   (learned to hunt)
hider  evasion    vs a random seeker:  0.90 -> 0.99   (learned to evade)

Both start at chance level and finish well above it, which is the evidence that
learning takes place. To reproduce the numbers, run:

    python -m learn.train
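For readers who want a feel for what tabular Q-learning self-play looks like, here is a minimal sketch. It is not the code in learn/: the 5x5 grid, the sight rule (Manhattan distance of at most 1), the zero-sum reward, and every function name below are assumptions made for illustration only.

```python
import random
from collections import defaultdict

SIZE = 5  # hypothetical 5x5 grid, not the repo's actual map
MOVES = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]  # stay, then four directions


def move(pos, action):
    """Apply a move and clamp the result to the grid."""
    x = min(max(pos[0] + MOVES[action][0], 0), SIZE - 1)
    y = min(max(pos[1] + MOVES[action][1], 0), SIZE - 1)
    return (x, y)


def sees(seeker, hider):
    """Assumed sight rule: hider is visible within Manhattan distance 1."""
    return abs(seeker[0] - hider[0]) + abs(seeker[1] - hider[1]) <= 1


def new_table():
    """Q-table mapping each state to one value per action, initialised to 0."""
    return defaultdict(lambda: [0.0] * len(MOVES))


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose(q, state, rng, epsilon):
    """Epsilon-greedy action; q=None stands for a uniformly random agent."""
    if q is None or rng.random() < epsilon:
        return rng.randrange(len(MOVES))
    values = q[state]
    return values.index(max(values))


def play(seeker_q, hider_q, rng, steps=20, epsilon=0.0, learn=False):
    """Play one episode; return the fraction of steps with the hider in sight.

    With learn=True both agents must have a Q-table (not None).
    """
    seeker, hider = (0, 0), (SIZE - 1, SIZE - 1)
    seen = 0
    for _ in range(steps):
        state = (seeker, hider)
        a_s = choose(seeker_q, state, rng, epsilon)
        a_h = choose(hider_q, state, rng, epsilon)
        seeker, hider = move(seeker, a_s), move(hider, a_h)
        reward = 1.0 if sees(seeker, hider) else 0.0
        seen += int(reward)
        if learn:
            nxt = (seeker, hider)
            q_update(seeker_q, state, a_s, reward, nxt)
            q_update(hider_q, state, a_h, -reward, nxt)  # zero-sum assumption
    return seen / steps


def train(episodes=2000, seed=0):
    """Self-play: both tables learn at once from the same episodes."""
    rng = random.Random(seed)
    seeker_q, hider_q = new_table(), new_table()
    for _ in range(episodes):
        play(seeker_q, hider_q, rng, epsilon=0.2, learn=True)
    return seeker_q, hider_q


def sight_rate(seeker_q, hider_q, episodes=200, seed=1):
    """Average sight rate over greedy evaluation episodes."""
    rng = random.Random(seed)
    return sum(play(seeker_q, hider_q, rng) for _ in range(episodes)) / episodes
```

Comparing `sight_rate(None, None)` with `sight_rate(seeker_q, None)` after `train()` mirrors the "vs a random hider" measurement above. The values this toy produces will not match the release's figures, since the environment and hyperparameters here are invented.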

In the viewer

  • Watch plays real rollouts of the trained policies: a trained seeker
    catching a random hider, a trained hider evading a random seeker, and an
    untrained baseline. Each ends in a clear SEEKERS WIN / HIDERS WIN banner.
  • Learning plots the measured training curve, not a synthetic one: the seeker
    climbs to ~70%, then the hider learns to evade and takes over the arms race.

The full GPU JAX/MAPPO stack remains in the repo for larger-scale training;
learn/ is the CPU-only counterpart that runs anywhere.