Skip to content

Using Burlap RL library for GA Tech OMSC cs7641 project4

Notifications You must be signed in to change notification settings

robododge/omscs_ml_a4_burlap

Repository files navigation

OMSCS ML CS7641 Burlap Templates

Implementes templates to use the Java-based Reienforcement Learning alogrithm's provied in the BURLAP libaray from Brown University.

Setup and run

  • This project uses java and gradle. Make sure you have a recent version of Java JDK installed (recommend JDK 15 or higher)
  • Install gradle
  • Run gradle build ./gradlew build

Run the demos

  • ./gradlew helloGridWorld Open the BURLAP GridWorld hellow world explorer, keys:
    • A-West, D-East, W-Up, S-Down
  • ./gradlew blockDudeViewer Run BURLAP's BlockDude, keys:
    • a - West, d - East, w - jump up
    • s - pickup, x - putdown
  • ./gradlew demoExperiment Runs the complete demo experiments in RunExperiments.java

Import into you IDE

  • Intellij - import new gradle project, select the root directory of this project
  • Eclipse - (no tested)

Create and run your experiments.

A sample experiment has been provided in RunExperiments.java Edit this file to setup various experiment sizes, current examples:

  1. Setup Large & Small GridWorldExperiments
  2. Setup the Level1 & Level2 BlockDude experiments

Also, three MDP solver alogorithms are provided:

  1. Value Iteraion Experiments (use the VISettings class to set hyperparametrs)
  2. Policy Iteration Experiments (use the PISettings class to set hyperparametrs)
  3. Q-Learning Experimnets (use the QSettings class to set hyperparametrs)

For running your experiments, you can just execute the main() of the RunExperiments.java class from your IDE.

Experiment output

A CSV writer is attached to each experiment, the output filename of each experiment is controlled by a "shortName" which is configured as part of your experiment type settings, PISettings, VISettings or QSettings. This short name will provide a filename prefix for each of the experiment runs.

Example file output output/smprob-24105858/blockdude/

Metrics Captured: Each experiment type has the ability to capture metrics collected during the iteraions of the experiments here is sample of metrics collected:

  1. "iter" - iteration id
  2. "delta" - delta value found at each iteration
  3. "wallclock" - wallclock time spent in each iteration, milliseconds for VI/PI, but nanosecond for QLearning
  4. "evals" - the number of VI evals done within a single policy step for PI
  5. "numSteps" - for QLearning, number of steps during last episode of learning

About

Using Burlap RL library for GA Tech OMSC cs7641 project4

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published