Skip to content

feat: add multi-turn api and stateful obs wrapper#4

Merged
lkevinzc merged 1 commit intomainfrom
obs-wrapper
Jun 5, 2025
Merged

feat: add multi-turn api and stateful obs wrapper#4
lkevinzc merged 1 commit intomainfrom
obs-wrapper

Conversation

@lkevinzc
Copy link
Contributor

@lkevinzc lkevinzc commented Jun 5, 2025

Example output of ChatTemplatedObservation:

---------- observation ----------
<|im_start|>user
You are playing Guess The Number.
You have to guess the number between 1 and 20 (inclusive) within 5 turns.
As you enter your guess, the game will provide you with hints such as the target number is 'higher' or 'lower'.
You may provide your response in any manner. Only the number that is wrapped inside \boxed{} will be considered as your guess. For example, \boxed{5}.
As you play, the history of your guesses will be appended below. Use the information to complete the game before you run out of guesses.
Enter your guess.<|im_end|>
<|im_start|>assistant

---------- action ----------
\boxed{3}
---------- reward ----------
0.1
==============================
---------- observation ----------
<|im_start|>user
You are playing Guess The Number.
You have to guess the number between 1 and 20 (inclusive) within 5 turns.
As you enter your guess, the game will provide you with hints such as the target number is 'higher' or 'lower'.
You may provide your response in any manner. Only the number that is wrapped inside \boxed{} will be considered as your guess. For example, \boxed{5}.
As you play, the history of your guesses will be appended below. Use the information to complete the game before you run out of guesses.
<|im_end|>
<|im_start|>assistant
\boxed{3}<|im_end|>
<|im_start|>user
At turn 1, you guessed 3, and the target number is higher than 3.
Enter your guess.<|im_end|>
<|im_start|>assistant

---------- action ----------
\boxed{5}
---------- reward ----------
0.1

@lkevinzc lkevinzc merged commit 84a9446 into main Jun 5, 2025
@lkevinzc lkevinzc deleted the obs-wrapper branch June 5, 2025 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant