Paper: Play to Grade: Testing Coding Games as Classifying Markov Decision Process
Blog post: https://anie.me/play2grade/
SAIL Blog post: https://ai.stanford.edu/blog/play-to-grade/
A correct program:
An incorrect program:
We include our training and model code for the two environments: Car and Bounce. We provide Jupyter notebooks so that our experiments are easy to re-run.
Note that we did not fix a random seed for our experiments, so the results you get will differ slightly from those we report in the paper. Since we ran each experiment 3 or 5 times, the difference should not be significant.
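If you want your own re-runs to be repeatable, one option is to fix a seed before training. The sketch below is not part of this repository: `seed_everything` is a hypothetical helper, and it assumes the notebooks draw randomness from Python's `random`, NumPy, and PyTorch. Even with a fixed seed, some GPU operations can remain nondeterministic.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG that is available (hypothetical helper, not in this repo)."""
    random.seed(seed)
    try:
        import numpy as np  # optional: only seeded if NumPy is installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional: only seeded if PyTorch is installed
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


# Seeding twice with the same value reproduces the same draws.
seed_everything(0)
first = [random.random() for _ in range(3)]
seed_everything(0)
second = [random.random() for _ in range(3)]
print(first == second)
```

Calling a helper like this at the top of a notebook, before any model or environment is created, is usually enough for the seed to take effect.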
We recommend Python >= 3.7.
pip install -r requirements.txt
Training is relatively quick and is done in the Jupyter notebook inside each folder:
- "Car Experiments.ipynb"
- "Bounce Experiment.ipynb"
Training requires at least one GPU (in the notebooks, many function calls have "cuda=True").
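Before opening the notebooks, it can help to confirm that a GPU is visible. This is a minimal sketch that assumes the notebooks are built on PyTorch (suggested by the "cuda=True" arguments, but not confirmed here); `cuda_available` is a hypothetical helper name.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is importable and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if cuda_available():
    print("CUDA device found; calls with cuda=True should be usable.")
else:
    print("No CUDA device found; calls with cuda=True are likely to fail.")
```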
You can run our simulator by:
cd bounce
python bounce.py
This simulator takes in student programs, represented as JSON files. We include 10 reference programs in ./bounce/programs/*.json. The simulator lets you play the game yourself using your keyboard (arrow keys).
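To see which reference programs are available, you can list and parse the JSON files directly. The sketch below uses only the standard library and makes no assumption about the JSON schema; `list_programs` is a hypothetical helper, not a function from this repository.

```python
import json
from pathlib import Path


def list_programs(program_dir):
    """Return {file name: parsed JSON} for every .json file in program_dir."""
    programs = {}
    for path in sorted(Path(program_dir).glob("*.json")):
        with path.open() as f:
            programs[path.name] = json.load(f)
    return programs


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    for name, content in list_programs("bounce/programs").items():
        print(name, type(content).__name__)
```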
Load a JSON program and play it yourself! The following lines are included in bounce.py.
program = Program()
program.set_correct()
# program.load("programs/miss_paddle_no_launch_ball.json")
# program.load("programs/hit_goal_no_point.json")
# program.load("programs/empty.json")
# program.load("programs/multi_ball.json")
# program.load("programs/ball_through_wall.json")
# program.load("programs/goal_bounce.json")
# program.load("programs/multi_ball2.json")
# program.load("programs/paddle_not_bounce.json")
game = Bounce(program)
game.run()
Note that you need a monitor in order to play; the simulator cannot be played in a headless server environment.
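A quick way to tell whether a Linux session has a display is to look at the environment before launching the simulator. This is only a heuristic sketch: `has_display` is a hypothetical helper, it assumes an X11 or Wayland session on Linux, and on other platforms it simply assumes a desktop is present.

```python
import os
import sys


def has_display(env=None, platform=None):
    """Heuristic: True if a graphical display appears to be available."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        # Assumption: cannot tell on Windows/macOS, so assume a desktop session.
        return True
    # X11 sets DISPLAY; Wayland sets WAYLAND_DISPLAY.
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


print("Display available:", has_display())
```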
We additionally provide the bounce_var.py simulator, which includes speed setting changes. Otherwise, this simulator is exactly the same as the simulator in bounce.py.
We also provide the bounce_theme.py simulator, which includes both speed setting changes and thematic changes. This simulator differs from bounce.py and bounce_var.py in paddle sizes, etc. bounce_theme.py is a faithful recreation of the Code.org simulator, but it is also the most challenging setting for a vision-based RL agent.
We include the 10 training programs in ./bounce/programs/*.json. These programs use the same format as the programs in the full dataset.
A note on naming: in the code, SAE (Sequential AutoEncoder) refers to "HoareLSTM" in the paper, and "hoarelstm" refers to "Contrastive HoareLSTM" in the paper.