This Go module provides functionality to access OpenAI Gym in Go using cgo and the C Python API. The API has been kept very similar to that of OpenAI Gym, except that certain aspects have been made more "Go-like". For example, action and observation spaces are unexported and must be accessed through their respective getter methods. Additionally, many functions return Go `error`s where the OpenAI Gym API does not.
This module simply provides Go bindings for OpenAI Gym. It uses a Python interpreter embedded in the Go code, so the actual gym code running under the hood is still Python. Don't expect Go-level performance. If you want reinforcement learning environments implemented completely in Go, see my GoLearn: Reinforcement Learning in Go module.
Current State: Classic control and MuJoCo environments work as returned by `gym.make()` in Python. Rendering of environments, either through the `Render()` method or through `PixelObservationWrapper`s, does not work. Environments must have either `Box` or `Discrete` observation and action spaces; other spaces have not been implemented. Environments with other spaces will still work, you just won't be able to inspect their observation or action spaces with the `ObservationSpace()` and `ActionSpace()` methods respectively.
If all you need is to be able to call the Python functions/methods `gym.make()`, `env.step()`, `env.reset()`, and `env.seed()`, then this module is exactly what you need. If you need some of the fancier OpenAI Gym tools, like all their wrappers, stay tuned! Those are soon to come!
Currently, any wrappers that deal with multi-dimensional arrays are not supported. This includes `PixelObservationWrapper`s and `FrameStack` wrappers. Only single-dimensional state observations are supported.
This package has the following dependencies:
- OpenAI Gym
- Python 3.7 (currently tested with Python 3.7.9)
- Python 3.7-dev (automatically installed when installing Python 3.7 from source)
- Go-Python: Go bindings for the C API of CPython3
  - Along with all its dependencies
- pkg-config
The `pkg-config` program looks at the paths listed in the `PKG_CONFIG_PATH` environment variable. One of those paths must contain the `python3.pc` package configuration file. On my Ubuntu installation, the `python3.pc` file is in `/usr/local/lib/pkgconfig`. To add this directory to the `PKG_CONFIG_PATH` environment variable, run the following on the command line:

```bash
export PKG_CONFIG_PATH="$PKG_CONFIG_PATH":/usr/local/lib/pkgconfig
```

or put that line in your `.zshrc` or `.bashrc` file so the environment variable is set automatically whenever a terminal is started.
Once all dependencies have been installed, you're ready to start using GoGym!

Warning: this module only works with Python 3.7. No other version of Python is currently supported by Go-Python.
To install Python 3.7 (along with the Python 3.7-dev package) from source:

- Head over to the Python 3.7 Download Page and download one of the compressed archives.
- Extract the archive.
- Enter the extracted directory.
- Run `./configure --enable-shared --enable-optimizations`
- Run `sudo make install`
- Enjoy your new Python 3.7 installation! Why not install `gnureadline`?
```go
// Example: run 10 random steps in the Ant-v2 environment. This
// snippet assumes the GoGym identifiers (Make, etc.) are in scope
// and that "fmt" has been imported.
env, err := Make("Ant-v2")
if err != nil {
	panic(err)
}

// Reset the environment to get its starting state
_, err = env.Reset()
if err != nil {
	panic(err)
}

// Step through the environment with randomly sampled actions
for i := 0; i < 10; i++ {
	obs, reward, done, err := env.Step(env.ActionSpace().Sample())
	if err != nil {
		panic(err)
	}
	fmt.Println(obs, reward, done)
}
```
- The rendering functionality of OpenAI Gym is currently not supported. For some reason, the C Python API cannot find the `gym.error` package.
- You may need to link the Python 3.7 library for `cgo`: `#cgo LDFLAGS: -lpython3` or `#cgo LDFLAGS: -lpython3.7` (see the sketch after this list).
- If using many environments concurrently in the same process, the dreaded Python GIL will ensure that performance decreases. Try to limit the number of environments per process to 1 to ensure the best performance (in fact, this limitation exists when running OpenAI Gym in Python too).
- Since Go-Python provides bindings only for the Python C API and not the NumPy C API, Python `list`s are passed as actions to the `gym` environments instead of NumPy `ndarray`s. Casting the Python `list`s to NumPy `ndarray`s would just be an extra, unneeded step.
- So far, only Gym environments which satisfy the regular Gym interface (having `Step()`, `Reset()`, and `Seed()` methods) can be constructed. Any others (e.g. the Algorithmic environments) will result in a panic. This means that MuJoCo, classic control, and Atari environments should work.
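A minimal sketch of where such a linker directive lives, for the note above about linking the Python 3.7 library (the package name and the `pkg-config` directive here are illustrative assumptions, not necessarily how GoGym itself is configured):

```go
// Sketch of a cgo preamble that links against the Python 3.7 library.
// Either #cgo line on its own may be enough on your system.
package gogym // illustrative package name

// #cgo pkg-config: python3
// #cgo LDFLAGS: -lpython3.7
// #include <Python.h>
import "C"
```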
- Make all environments work, even those that do not implement the regular environment API
- Add all wrappers
- Add all spaces
- Depending on the observation space type, `Step()` should construct the appropriate observation (vector, dict, tuple) and return a structure of that type. E.g. if the environment is wrapped by a `PixelObservationWrapper`, then the returned observation is actually a Python `dict[string]np.array`, so we should return an analogous structure. `Step()` will return an `interface{}`. Then, for a composite type, we construct an associated value for each index (e.g. if `observation["pixels"]` is an `np.array`, we return a `[]float64` at that index), etc.
- To use any environments with nD (n > 1) observation dimensions, we will need to use the NumPy C API to iterate over the array and turn it into a `[]float64`. It will then be up to the agent to reshape that into the appropriate shape for its input.
- Get rid of go-python3 and just use the Python C API directly. This way, GoGym will work with newer versions of Python too, and it will just be nicer.
- Implement functionality using the NumPy C API to create `BoxSpace`s that have n-dimensional shapes. This will also allow us to use wrappers that return observations with n-dimensional shapes. Basically, we'll just return the `[]float64` and the client will have to reshape it (a sketch of such reshaping follows this list).
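For reference, the reshaping left to the client could look something like the following hypothetical helper, assuming a flat, row-major `[]float64` (this helper is not part of GoGym):

```go
package main

import "fmt"

// reshape2D is a hypothetical helper (not part of GoGym): it views a
// flat, row-major []float64 observation as a rows x cols grid,
// illustrating the reshaping left to the client.
func reshape2D(flat []float64, rows, cols int) ([][]float64, error) {
	if len(flat) != rows*cols {
		return nil, fmt.Errorf("reshape2D: got %d values, want %d", len(flat), rows*cols)
	}
	out := make([][]float64, rows)
	for r := 0; r < rows; r++ {
		// Each row is a sub-slice sharing the flat slice's backing array
		out[r] = flat[r*cols : (r+1)*cols]
	}
	return out, nil
}

func main() {
	flat := []float64{1, 2, 3, 4, 5, 6}
	grid, err := reshape2D(flat, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(grid) // [[1 2 3] [4 5 6]]
}
```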