
@TerryChan
Contributor

In the room_selector example, the runner executes the first task (tasks[0]) on every step instead of advancing to the next one.
As a result, the example's output is inconsistent: the expected_choice and the final result do not match, yet the model is still assigned a high reward.
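
To illustrate the pattern described above, here is a minimal sketch. It is not the actual room_selector code: `Task`, `run_buggy`, `run_fixed`, and the sample `expected_choice` values are hypothetical names and data chosen for illustration. It only shows how a hard-coded index replays the first task, while indexing by the step counter advances through the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    # Hypothetical stand-in for a task record; not the example's real schema.
    task_id: int
    expected_choice: str


def run_buggy(tasks: List[Task], num_steps: int) -> List[int]:
    # Bug pattern: the index is hard-coded, so every step replays tasks[0].
    return [tasks[0].task_id for _ in range(num_steps)]


def run_fixed(tasks: List[Task], num_steps: int) -> List[int]:
    # Fix pattern: the step counter selects the task, so each step advances.
    # The modulo wraps around if there are more steps than tasks (an assumption).
    return [tasks[step % len(tasks)].task_id for step in range(num_steps)]


tasks = [Task(0, "A101"), Task(1, "B202"), Task(2, "C303")]
buggy_ids = run_buggy(tasks, 3)  # same task on every step
fixed_ids = run_fixed(tasks, 3)  # one task per step
```

With the hard-coded index, any per-step bookkeeping such as expected_choice can drift away from the task that was actually run, which is one way a mismatch like the one reported here could arise.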

@ultmaster
Contributor

/ci

@github-actions

github-actions bot commented Nov 4, 2025

🚀 CI Watcher for correlation id-3486595127-mhkq7yy9 triggered by comment 3486595127
🏃‍♀️ Tracking 1 workflow run(s):

✅ All runs completed.

@ultmaster ultmaster merged commit 44dbfde into microsoft:main Nov 5, 2025
8 checks passed
totoluo pushed a commit to totoluo/agent-lightning that referenced this pull request Nov 14, 2025

2 participants