Waypoint-1.5 Support#33

Open
lapp0 wants to merge 42 commits into main from wp-1.5

Conversation


@lapp0 lapp0 commented Mar 18, 2026

Waypoint-1.5 required changes

  • TAEHV integration
  • Update the KV cache for Waypoint-1.5 `frame_idx` (backwards compatible with Waypoint-1), plus a `pos_id` change including `f_pos` (incremental frame position)
  • `load_state_dict` converts the weights to a format compatible with the inference engine if the model is WP1.5
  • `auto_aspect_ratio` defaults to `True`; impacts WP1.5 only, enforcing that inputs/outputs are 720p or 360p
  • Update the README to document Waypoint-1.5

Misc changes not specific to Waypoint-1.5

  • `COMPILE_OPTIONS` for more throughput
  • Enable "state snapshots" (or "game checkpoints") via `get_state` and `load_state`
  • Load directly to GPU to minimize CPU memory overhead
  • Clydes dynamic angle computation, allowing infinite-length generation with no memory impact
  • Allow `load_weights=False` to create a randomly initialized model for benchmarking
  • Fix a torch warning by converting controller inputs with `torch.as_tensor`
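The snapshot mechanism can be sketched with a toy stand-in; the `get_state`/`load_state` names come from this PR, but the engine internals below are purely illustrative:

```python
import copy


class ToyEngine:
    """Stand-in for the inference engine; only the snapshot API mirrors the PR."""

    def __init__(self):
        self.frames = []

    def append_frame(self, frame):
        self.frames.append(frame)

    def get_state(self):
        # Deep-copy so later generation can't mutate the checkpoint
        return copy.deepcopy(self.frames)

    def load_state(self, state):
        self.frames = copy.deepcopy(state)


engine = ToyEngine()
engine.append_frame("seed")
checkpoint = engine.get_state()   # "game checkpoint" before a risky branch
engine.append_frame("explored")
engine.load_state(checkpoint)     # rewind; engine.frames is back to ["seed"]
```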

examples/

MalarzDawid and others added 30 commits February 22, 2026 11:54

* fix: uv sync issue with python version 3.9

* fix: VRAM explosion

* refactor: init on gpu device directly

* fix: don't use fbgemm on windows for now

* feat: OrthoRoPEAngles

* fix: NoCastModule OrthoRoPEAngles

* fix: remove pos_ids from args

* fix: remove old src rope replacement patch

* fix: remove out of scope ae changes

* fix: remove out of scope text encoder changes

* fix: patch_model pos_ids

---------

Co-authored-by: Philpax <me@philpax.me>
feat: use built triton-windows fork to fix long-path issue
@lapp0 lapp0 marked this pull request as ready for review March 19, 2026 20:03
## Waypoint-1.5 Behavior
All interfaces are identical between Waypoint-1 (or 1.1) and Waypoint-1.5 **except** the following:

In Waypoint-1.5, the `img` passed to `append_frame(...)` and returned by `gen_frame(...)` is now a sequence of 4 frames. Waypoint-1.5 applies temporal compression and generates 4 frames for every controller input.
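For frame pacing, this means each `gen_frame` call yields a block of frames that should be displayed one at a time at the original frame interval, not all at once. A minimal sketch (the 60 fps figure and the helper function are illustrative, not part of the engine API):

```python
def frame_timestamps(block_idx, n=4, fps=60.0):
    """Display times (seconds) for the n frames produced by one gen_frame call.

    block_idx: how many gen_frame calls have completed before this one.
    """
    return [(block_idx * n + i) / fps for i in range(n)]


# First block plays at t = 0, 1/60, 2/60, 3/60; the next starts at 4/60.
first = frame_timestamps(0)
second = frame_timestamps(1)
```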

@philpax philpax Mar 19, 2026


describe the implications this has for frame pacing; what's the correct way to feed inputs and display the rendered frames to the user?

@@ -77,14 +77,25 @@ for controller_input in [
img = engine.gen_frame(ctrl=controller_input)


this probably needs to be updated for 4-frame use, or this snippet should be deleted entirely and pointed at one of the examples

Collaborator Author


IMO, the Waypoint-1.5 clarification below on the nature of img is sufficient



I'd add a comment pointing to the clarification below, so they have an idea of what to expect for the shape of img

"https://gist.github.com/user-attachments/assets/68c943a4-008a-4c25-948c-c81ab4c47d21",
])
frame = cv2.imdecode(np.frombuffer(urllib.request.urlopen(url).read(), np.uint8), cv2.IMREAD_COLOR)
engine.append_frame(torch.from_numpy(np.repeat(frame[None], 4, axis=0)))


branch on whether it's a WP-1 or WP-1.5 model and change the append behaviour accordingly; add a comment indicating that we're repeating to meet the 4-frame requirement for WP-1.5
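A minimal sketch of that branching; a `temporal_compression` config attribute is an assumption here, and plain lists stand in for the real frame tensors:

```python
class DummyConfig:
    """Stand-in config; temporal_compression is a hypothetical attribute name."""
    temporal_compression = 4


def seed_frames(frame, config):
    """Repeat a single seed frame to fill one temporal block (1 for WP-1)."""
    n = getattr(config, "temporal_compression", 1)
    # With the real engine this would be np.repeat(frame[None], n, axis=0)
    return [frame] * n


wp15_seed = seed_frames("img", DummyConfig())  # 4 copies for WP-1.5
wp1_seed = seed_frames("img", object())        # single frame for WP-1
```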

Collaborator Author


IMHO, new users shouldn't be directed towards WP1 at all. Make integration possible, but move away from backwards compatible examples.



Hmm; yeah, that's fine.

However, at the risk of potentially causing you a lot of grief, perhaps the solution is to make WP-1 usage work against [1, H, W, 3], so that consuming code always works with [N, H, W, 3] shapes, where N is 1 for WP-1, or config.temporal_compression for newer models? (including updating all of the examples to work over config.temporal_compression/the tensor shape, as opposed to a hard-coded 4)

This is already a breaking change for consumers, so this kind of unification makes sense to me, and it means that downstream code should Just Work:tm: (in the sense that you pace out at gentime/N, which gracefully degrades when N=1).
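That unification might look like the following sketch, where consumer code only ever sees a leading block axis; pure-Python lists stand in for [N, H, W, 3] tensors:

```python
def write_block(out, block):
    """Consume one generation step's output.

    block: a length-N sequence of frames; N == 1 for WP-1,
    config.temporal_compression for newer models.
    """
    for frame in block:  # degrades gracefully when N == 1
        out.append(frame)


frames_out = []
write_block(frames_out, ["f0", "f1", "f2", "f3"])  # WP-1.5-style block
write_block(frames_out, ["f4"])                    # WP-1-style block
```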

if __name__ == "__main__":
gen_vid()
# Set seed frame
url = random.choice([


can we include seed images within the repo and point to local files instead? easier to hack / see what's going on

Collaborator Author


Thoughts on me pointing to Biome repo images?



That's fine once we've locked them in, which we haven't quite yet done (stalled on generating decently-ID images). Ok to leave this as-is for now, and then we can replace them just before release, I think.

with iio.imopen("out.mp4", "w", plugin="pyav") as out:
out.write(engine.gen_frame().cpu().numpy(), fps=60, codec="libx264")
for ctrl in controller_sequence:
out.write(engine.gen_frame(ctrl=ctrl).cpu().numpy())


not super obvious how the inputs map to frames here, especially in the 4-frame model; does this mean that we're supplying one of the inputs once every four frames? how would I do multiple different inputs within that four-frame block?

Collaborator Author


Is this sufficient?

    four_frames = engine.gen_frame().cpu().numpy()  # int8 [4, H, W, 3]
    out.write(four_frames, fps=60, codec="libx264")



I'd use config.temporal_compression for clarity



Er, wait, that suggestion's for the seed frame; I think that still doesn't address my issue, which is "how do the inputs that I, as a user, get mapped to inputs under the temporally compressed regime?" Do I bundle together the last four frames of inputs? Do I only send inputs from the current frame, so that only one-fourth of the inputs make it through?

@@ -1,12 +1,17 @@
"""
Additional Dependencies: pytest-benchmark


can we move all of the additional deps into pyproject.toml using https://docs.astral.sh/uv/concepts/projects/dependencies/#dependency-groups so that users can do uv run --dev pytest examples/benchmark.py? ditto for the other examples

we want getting up to speed with WE to be as easy as possible; ideally, you clone a repo and run uv run --dev examples/gen_sample.py Overworld/Waypoint-1.5-1B with no additional steps to see the WE do its thing (uv run should do all the intermediate work)

Collaborator Author

@lapp0 lapp0 Mar 20, 2026


Updated pyproject, now the following comments work:

# MODEL_URI="Overworld/Waypoint-1.5-1B" uv run --dev pytest examples/benchmark.py
# uv run --dev examples/gen_sample.py Overworld/Waypoint-1.5-1B

(Model should be Overworld-Models/MR160k for now)

else None
)
w_amax = lin.weight.data.clone().amax().float().squeeze()
w_amax = lin.weight.data.abs().amax()
Collaborator Author


Out of scope, minor bug fix
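For context on the fix, a pure-Python analogue (the real code operates on a torch weight tensor): `amax()` without `abs()` returns the largest signed value, which can miss a negative weight of larger magnitude:

```python
weights = [-3.5, 1.2, 0.8]

buggy = max(weights)                  # 1.2 -- largest signed value only
fixed = max(abs(w) for w in weights)  # 3.5 -- true maximum magnitude
```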


@ScottieFox ScottieFox left a comment


So far, the branch is stable in anticipation of WP1.5 model behavior and its communication with server.py as loaded into the .stream service. The .stream product is not exposed to the end user, so all further changes should take BIOME as their primary consideration, as long as functional compatibility exists between both.
