Powered by stable-baselines
OBS_SHAPE
- observation shape as a JSON array
ACTION_SHAPE
- action shape as an int or a JSON array
PORT
- port to listen for action requests on; defaults to 80
POLICY
- Stable Baselines PPO policy to use; defaults to MlpPolicy
SAVE_STEPS
- save the model every SAVE_STEPS steps; defaults to 1000
MODEL_PATH
- load/save path for the Stable Baselines PPO model; defaults to "models/model"
RESET
- if true, create a new model instead of loading an existing one
VERBOSE
- Stable Baselines PPO2 verbosity level (int)
N_STEPS
- number of steps to run between training updates; defaults to 2048
BATCH_SIZE
- batch size when training; defaults to 64
N_EPOCHS
- number of epochs to train; defaults to 10
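
As a rough illustration of how the variables above could be read, here is a minimal sketch in Python. It is not the project's actual code: `Config`, `parse_shape`, and `load_config` are hypothetical names, and a few behaviours are assumptions made only for the example (OBS_SHAPE and ACTION_SHAPE are treated as required, VERBOSE falls back to 0, and RESET counts as true only for the string "true", case-insensitively). The other defaults are the ones listed above.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass
class Config:
    """Hypothetical container for the settings listed above."""
    obs_shape: Tuple[int, ...]
    action_shape: Tuple[int, ...]
    port: int
    policy: str
    save_steps: int
    model_path: str
    reset: bool
    verbose: int
    n_steps: int
    batch_size: int
    n_epochs: int


def parse_shape(raw: str) -> Tuple[int, ...]:
    """Accept either a bare int ("2") or a JSON array ("[4, 2]")."""
    value = json.loads(raw)
    if isinstance(value, int):
        return (value,)
    return tuple(int(v) for v in value)


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from an environment-like mapping."""
    return Config(
        # Assumption: both shapes are required, so a missing one raises KeyError.
        obs_shape=parse_shape(env["OBS_SHAPE"]),
        action_shape=parse_shape(env["ACTION_SHAPE"]),
        port=int(env.get("PORT", "80")),
        policy=env.get("POLICY", "MlpPolicy"),
        save_steps=int(env.get("SAVE_STEPS", "1000")),
        model_path=env.get("MODEL_PATH", "models/model"),
        # Assumption: only the string "true" (any case) enables RESET.
        reset=env.get("RESET", "false").strip().lower() == "true",
        # Assumption: the README states no default for VERBOSE; 0 is used here.
        verbose=int(env.get("VERBOSE", "0")),
        n_steps=int(env.get("N_STEPS", "2048")),
        batch_size=int(env.get("BATCH_SIZE", "64")),
        n_epochs=int(env.get("N_EPOCHS", "10")),
    )
```

Passing a plain dict instead of `os.environ`, for example `load_config({"OBS_SHAPE": "[4]", "ACTION_SHAPE": "2"})`, makes it easy to try a configuration without touching the real environment.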
TODO
- more documentation (env vars, request/response)
- support LSTM/CNN policies
- test :)
- figure out where the first observation should come from
- allow done to be set through a route, e.g. /done