Add Structured Output Support to Policy #46

Jack-Khuu · 2025-08-13T02:28:16Z

Add structured output (aka guided decoding) example to policy

python src/forge/actors/policy.py (With structured output toggled)

User: What is 3+5?
Assistant: Positive
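
For readers wondering why an arithmetic prompt yields a label: with guided decoding the model may only answer from a fixed set of choices, so the most likely free-form answer is never emitted. The sketch below is a toy illustration in plain Python, not the vLLM implementation. The choice list ["Positive", "Negative"], the scores, and the helper name constrained_choice are assumptions made for illustration only.

```python
import math


def constrained_choice(scores: dict[str, float], allowed: list[str]) -> str:
    """Return the highest-scoring candidate that is in `allowed`.

    Candidates outside `allowed` are ignored entirely, which mimics
    masking them out before picking an answer.
    """
    if not allowed:
        raise ValueError("allowed must contain at least one choice")
    # Allowed choices with no score are treated as -inf (least preferred).
    return max(allowed, key=lambda choice: scores.get(choice, -math.inf))


# Hypothetical scores a model might assign for "What is 3+5?".
scores = {"8": 4.0, "Positive": 0.5, "Negative": -1.0}

# Without a constraint, the top-scoring candidate wins.
unconstrained = max(scores, key=scores.get)

# With the (assumed) choice constraint, only the two labels compete.
constrained = constrained_choice(scores, ["Positive", "Negative"])

print("unconstrained:", unconstrained)
print("constrained:", constrained)
```

The constrained answer is always one of the allowed strings, even when the prompt has nothing to do with them.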

ebsmothers

One question, otherwise looks good

ebsmothers · 2025-08-13T13:44:45Z

src/forge/actors/policy.py

+    sampling_params = None
+    if guided_decoding:
+        # Add config for structured output
+        vllm_args = await policy_actor.get_vllm_args.choose()


noob q: why are we running .choose here? It seems to me the more idiomatic thing to do here would be .call, but maybe I'm missing the point. (Also, I assume calling all actors is not a bottleneck here anyway, as this just returns an instance variable.)

You use choose when you just want one of the actors to do something. Basically like "if rank == 0: do"

From my understanding, call also triggers the entire Mesh, while, as @pbontrager mentioned, choose and call_one are more direct (or, when no arg is given, it just picks one at random)
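
The distinction discussed in this thread can be sketched with a toy mesh in plain asyncio. This is not the actor library's actual API: the Endpoint and ToyPolicy classes are hypothetical stand-ins, and only the method names call, choose, and get_vllm_args come from the discussion above.

```python
import asyncio
import random


class ToyPolicy:
    """Hypothetical actor that counts how often its endpoint is hit."""

    def __init__(self, rank: int):
        self.rank = rank
        self.vllm_args = {"model": "toy-model"}
        self.calls = 0

    async def get_vllm_args(self):
        self.calls += 1
        return self.vllm_args


class Endpoint:
    """Toy endpoint over a list of actors (an assumption, not a real API)."""

    def __init__(self, actors, method_name: str):
        self._actors = actors
        self._method_name = method_name

    async def call(self):
        # Fan out to every actor and gather one result per actor.
        coros = (getattr(a, self._method_name)() for a in self._actors)
        return await asyncio.gather(*coros)

    async def choose(self):
        # Run on a single, randomly picked actor and return its result.
        actor = random.choice(self._actors)
        return await getattr(actor, self._method_name)()


async def demo():
    actors = [ToyPolicy(rank) for rank in range(4)]
    endpoint = Endpoint(actors, "get_vllm_args")

    one = await endpoint.choose()
    touched_by_choose = sum(a.calls for a in actors)

    everything = await endpoint.call()
    touched_by_call = sum(a.calls for a in actors) - touched_by_choose
    return one, touched_by_choose, len(everything), touched_by_call


one, touched_by_choose, n_results, touched_by_call = asyncio.run(demo())
print(one, touched_by_choose, n_results, touched_by_call)
```

In this toy, choose touches one actor and returns a single value, while call touches all four and returns a list, which is why choose fits a read of shared configuration.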

pbontrager

Thanks!

pbontrager · 2025-08-13T17:05:35Z

src/forge/actors/policy.py

+    sampling_params = None
+    if guided_decoding:
+        # Add config for structured output
+        vllm_args = await policy_actor.get_vllm_args.choose()


You use choose when you just want one of the actors to do something. Basically like "if rank == 0: do"

pbontrager · 2025-08-13T17:07:07Z

src/forge/actors/policy.py

    print("Model running")

-    prompt = "Tell me a joke"
+    prompt = "What is 3+5?" if guided_decoding else "Tell me a joke"


Can you spin these things off as proper tests at some point? @ebsmothers, is running something like this too big for our current test setup?

We can throw them on the H100s and it should be fine once we get things more fleshed out

pbontrager · 2025-08-13T17:13:03Z

src/forge/actors/policy.py

-    router = await router_mesh.spawn("policy_router", PolicyRouter, policy=policy_actor)
+
+    sampling_params = None
+    if guided_decoding:


We need to think of how we want to do this from the config in the future. Maybe some sampling param builders that the user can call?

I'm kinda down to just lean into the vLLM convention to start (read from generation.json to populate, then let users override the kwargs)

This is probably something we ask for upstream
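
One way the "populate from a file, then let users override" idea in this thread could look is sketched below. It is only an illustration under stated assumptions: the helper name build_sampling_kwargs, the SAMPLING_KEYS allow-list, and the file contents are hypothetical, and the result is a plain dict rather than a real sampling params object.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical allow-list of keys that are forwarded as sampling kwargs.
SAMPLING_KEYS = ("temperature", "top_p", "top_k", "max_tokens")


def build_sampling_kwargs(config_path, **overrides):
    """Load defaults from a JSON file, then apply user overrides on top."""
    path = Path(config_path)
    defaults = json.loads(path.read_text()) if path.exists() else {}

    # Keep only recognized sampling keys; ignore unrelated config entries.
    kwargs = {key: defaults[key] for key in SAMPLING_KEYS if key in defaults}

    unknown = set(overrides) - set(SAMPLING_KEYS)
    if unknown:
        raise TypeError(f"unknown sampling overrides: {sorted(unknown)}")

    # User-supplied values win over file defaults.
    kwargs.update(overrides)
    return kwargs


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "generation.json"
    config_path.write_text(
        json.dumps({"temperature": 0.7, "top_p": 0.9, "eos_token_id": 2})
    )
    kwargs = build_sampling_kwargs(config_path, temperature=0.0, max_tokens=16)

print(kwargs)
```

Rejecting unknown overrides up front keeps typos from being silently dropped. Whether that strictness is wanted is a design choice the thread leaves open.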

Add structured output support to Policy

2bdeb2f

Jack-Khuu requested review from allenwang28, ebsmothers, joecummings and pbontrager August 13, 2025 02:28

meta-cla bot added the CLA Signed label Aug 13, 2025

ebsmothers approved these changes Aug 13, 2025


pbontrager approved these changes Aug 13, 2025


Jack-Khuu merged commit d7f0c35 into main Aug 13, 2025
1 check passed

Jack-Khuu deleted the guided-decoding branch August 13, 2025 18:13

photomz pushed a commit to photomz/forge that referenced this pull request Oct 25, 2025

Add Structured Output Support to Policy (meta-pytorch#46)

73f4bd0
