Accessing oracle policy commands for tw-cooking games #272

Closed
vmicheli opened this issue Jul 16, 2021 · 5 comments

Comments

@vmicheli

Hey,

I'm unable to access the oracle policy commands for tw-cooking games, which were introduced a couple of months ago (#261).

I generated a game with:

tw-make tw-cooking --recipe 3 --take 3 --cook --cut --open --go 12 --split train --output tw_games/tw-game.z8 --seed 11985

and tried to play it with:

tw-play --hint tw_games/tw-game.z8

but oracle policy commands are not displayed.

Am I doing something wrong with the game generation or the play command?

@MarcCote
Contributor

MarcCote commented Jul 16, 2021

Weird! I just tried your two commands on the master branch and got:

Oracle: [0/11|(0): take red hot pepper > go east > open screen door > cook red hot pepper with BBQ > go east > go north > take red apple from counter > cook red apple with stove > take knife from counter > slice red apple with knife > chop red hot pepper with knife > open fridge > take white onion from fridge > go south > go west > cook white onion with BBQ > chop white onion with knife > go east > go north > prepare meal > eat meal]

Edit: master is in sync with 1.4.3
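
For anyone who wants to reuse the hint line shown above outside of tw-play, here is a minimal sketch that splits it into individual commands. The line format is an assumption inferred from the single sample in this thread; the helper name `parse_oracle_hint` is hypothetical and not part of TextWorld, and the three numbers in the prefix are returned uninterpreted because their meaning is not stated here.

```python
import re

# Assumed format, inferred only from the one hint line quoted in this thread:
#   Oracle: [<a>/<b>|(<c>): cmd > cmd > ... > cmd]
HINT_PATTERN = re.compile(r"^Oracle: \[(\d+)/(\d+)\|\((\d+)\): (.*)\]$")


def parse_oracle_hint(line):
    """Return (counters, commands) parsed from one oracle hint line.

    `counters` holds the three integers of the prefix as-is; this sketch
    does not attempt to interpret them.
    """
    match = HINT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("unrecognized oracle hint line")
    first, second, third, body = match.groups()
    # Commands are separated by " > " in the sample output.
    commands = [cmd.strip() for cmd in body.split(" > ") if cmd.strip()]
    return (int(first), int(second), int(third)), commands
```

Splitting on " > " relies on no command containing that separator, which holds for the sample trajectory but is not guaranteed in general.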

@vmicheli
Author

I just tried the commands again and it works now. That's even weirder, haha.

Anyway, it's time for some long-range language modeling. I'll let you know if I get interesting results!

@MarcCote
Contributor

OK. Let me know if that happens again. Maybe there's some stochastic bug hidden in the oracle's trajectory computation! Also, as you can see, the oracle assumes initial knowledge of the recipe (I couldn't find a workaround yet).

While writing this, I realized we could split the game into two tasks:

  • Task 1: find the cookbook
  • Task 2: complete the recipe while starting in the kitchen (i.e., the location of the cookbook).

@MarcCote
Contributor

Hmm, I also just noticed something about the oracle's trajectory above: there's no "examine cookbook" command :(

@vmicheli
Author

At the moment I'm collecting the data by playing the game myself with the assistance of the oracle. Hopefully a few dozen demonstrations will be enough before moving on to RL.

But if we wanted to automate the data collection, then, as you pointed out, the agent would first need to find the cookbook (task 1), examine it, and then proceed with the recipe (task 2).
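
A rough sketch of what the automated collection loop could look like, under stated assumptions: `env` behaves like a TextWorld environment (`reset()` returns a game state, `step(command)` returns `(game_state, reward, done)`), and each game state exposes `feedback` and an up-to-date `policy_commands` list. The function name `collect_demonstration` is hypothetical. As noted earlier in the thread, the oracle assumes the recipe is already known, so the recorded commands would not include finding or examining the cookbook.

```python
def collect_demonstration(env, max_steps=100):
    """Follow the oracle's next suggested command until the game ends.

    Assumptions (not confirmed in this thread): game states support
    dict-style access to "feedback" and "policy_commands", and the
    policy is refreshed after every step.
    """
    state = env.reset()
    demonstration = []
    for _ in range(max_steps):
        plan = state["policy_commands"]
        if not plan:
            break  # The oracle has nothing left to suggest.
        command = plan[0]
        observation = state["feedback"]
        state, reward, done = env.step(command)
        # Store what the player saw together with the command taken.
        demonstration.append((observation, command))
        if done:
            break
    return demonstration


# Hypothetical usage, assuming TextWorld is installed and the game file exists:
#   import textworld
#   infos = textworld.EnvInfos(feedback=True, policy_commands=True)
#   env = textworld.start("tw_games/tw-game.z8", infos)
#   demo = collect_demonstration(env)
```

Keeping the loop independent of how `env` is built makes it easy to try with a stub environment before running it on real games.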


2 participants