GP-RL

GP-RL is a simple demo of the GPSARSA reinforcement learning algorithm described in "Reinforcement Learning with Gaussian Processes" by Yaakov Engel et al.

In the learning environment, the agent is placed in a 1.0 × 1.0 continuous 2D maze and moves a distance of 0.1 per step until it reaches the goal. At every step the agent can choose any direction in [0, 2*PI], and the actual direction is the chosen direction plus random noise in [0, PI/6].
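The movement rule above can be sketched in a few lines of Java. This is only an illustration, not the project's code: the class name MazeStep, the uniform noise distribution, and the clamping to the unit square are assumptions, and obstacles are ignored.

```java
import java.util.Random;

// Hypothetical sketch of one environment step; not taken from the GP-RL sources.
final class MazeStep {
    static final double STEP_LENGTH = 0.1;          // distance moved per step
    static final double MAX_NOISE = Math.PI / 6;    // upper bound of the direction noise

    // Returns the new {x, y} position after one noisy step.
    static double[] step(double x, double y, double chosenDirection, Random rng) {
        // Assumption: noise is drawn uniformly; nextDouble() covers [0, 1),
        // so the noise lies in [0, PI/6).
        double noise = rng.nextDouble() * MAX_NOISE;
        double actualDirection = chosenDirection + noise;
        // Assumption: the agent is kept inside the 1.0 x 1.0 maze by clamping.
        double newX = clamp(x + STEP_LENGTH * Math.cos(actualDirection));
        double newY = clamp(y + STEP_LENGTH * Math.sin(actualDirection));
        return new double[] {newX, newY};
    }

    static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }
}
```

Because the noise is never negative under this reading, the actual direction is always rotated counter-clockwise from the chosen one, which the learned policy has to compensate for.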

Under the default settings the agent gets a reward of -1.0 for every step outside the goal region and a reward of 10.0 when it reaches the goal. After that the agent is placed at a random position to start a new episode.
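As a rough illustration of the default reward scheme, the sketch below assumes a circular goal region; the class name MazeReward, the goal centre and radius parameters, and the undiscounted sum are all assumptions made for the example and are not taken from the project.

```java
// Hypothetical sketch of the default reward scheme; not taken from the GP-RL sources.
final class MazeReward {
    static final double STEP_REWARD = -1.0;   // every step outside the goal region
    static final double GOAL_REWARD = 10.0;   // the step that reaches the goal

    // Assumption: the goal region is a disc with the given centre and radius.
    static boolean inGoal(double x, double y, double goalX, double goalY, double goalRadius) {
        return Math.hypot(x - goalX, y - goalY) <= goalRadius;
    }

    static double reward(double x, double y, double goalX, double goalY, double goalRadius) {
        return inGoal(x, y, goalX, goalY, goalRadius) ? GOAL_REWARD : STEP_REWARD;
    }

    // Undiscounted sum of rewards for an episode that reaches the goal on its last step.
    static double undiscountedReturn(int totalSteps) {
        return (totalSteps - 1) * STEP_REWARD + GOAL_REWARD;
    }
}
```

For example, an episode that reaches the goal on its 8th step collects 7 × (-1.0) + 10.0 = 3.0 in total, so shorter paths yield higher returns.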

The demo has a GUI which displays the agent's movement and the learned policy. There is also a console panel for testing different learning parameters and changing the obstacles and goals.

Compile & Run

The main RL algorithm is written in Scala and the GUI part in Java, so a JVM is needed to run the program. SBT must be installed to handle dependencies and compilation.

Use sbt compile to download the dependencies and compile the project, and use sbt run to start the main GUI.

Note

Breeze is used for number crunching, which in turn calls netlib-java for linear algebra computation. Netlib-java can use either a native or a pure Java implementation of BLAS and LAPACK. The user can select one on the java command line by setting -Dcom.github.fommil.netlib.BLAS=com.github.fommil.netlib.F2jBLAS -Dcom.github.fommil.netlib.LAPACK=com.github.fommil.netlib.F2jLAPACK -Dcom.github.fommil.netlib.ARPACK=com.github.fommil.netlib.F2jARPACK, and netlib-java tries to read these system properties during initialization.

Reading system properties, however, violates the security restrictions placed on Java applets. To be used in an applet, the corresponding code in com.github.fommil.netlib.{BLAS, LAPACK, ARPACK} must therefore be fixed to the pure Java implementation (the one named in the FALLBACK fields). I managed to do this by recompiling netlib-java with Maven (only the core module is needed, and the Java code is generated from the Fortran code by a plugin) and substituting the resulting core-1.1.2.jar in the local ivy2 repository, after which the applet can be deployed in a browser.
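For a desktop run (not an applet), the same three properties can also be set programmatically before any Breeze or netlib-java class is loaded. The sketch below is an assumption-laden illustration: the class name NetlibPureJava is hypothetical, and it only uses the standard System.setProperty call with the property names and values quoted above. It does not help in the applet case, where a sandbox typically denies writing system properties as well.

```java
// Hypothetical helper; equivalent to passing the three -D flags on the command line.
final class NetlibPureJava {
    // Must run before netlib-java initializes, since the properties
    // are read during its initialization.
    static void forcePureJava() {
        System.setProperty("com.github.fommil.netlib.BLAS",
                "com.github.fommil.netlib.F2jBLAS");
        System.setProperty("com.github.fommil.netlib.LAPACK",
                "com.github.fommil.netlib.F2jLAPACK");
        System.setProperty("com.github.fommil.netlib.ARPACK",
                "com.github.fommil.netlib.F2jARPACK");
    }
}
```

Calling NetlibPureJava.forcePureJava() as the first statement of the program's entry point would be one way to use it, assuming nothing has touched netlib-java earlier.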
