Define serialization format for behavioral cloning datasets #5

cswinter · 2021-11-23T04:57:19Z

Define a serialization format for behavioral cloning datasets of entity-gym observation/action pairs and add helper methods to serialize/deserialize datasets. Probably using MessagePack.

cswinter · 2021-11-26T23:53:30Z

Partial progress in #29
Still missing:

- serialize ActionSpace
- serialize ObsSpace
- Record full logprobs, not just logprob of selected action
- Record chosen action
- Capture returns rather than instantaneous rewards
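
For the last item, here is a minimal sketch of what turning instantaneous rewards into returns could look like. It assumes a single finished episode and a discount factor `gamma`. The helper name `discounted_returns` is hypothetical and is not part of entity-gym or enn-ppo.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Hypothetical helper: discounted return-to-go for each step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards: each step's return is its own reward plus the
    # discounted return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

Storing these values next to each observation/action pair would let a dataset consumer filter or weight samples by outcome without replaying the episode.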

cswinter · 2022-01-19T04:35:26Z

Some more progress in #147, we now record the full logits and chosen actions.
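
As a rough illustration of why recording the full logits is enough to recover the log-probabilities asked for earlier, the sketch below applies a numerically stable log-softmax to one action's logits. The function name `log_softmax` and the plain-list representation are assumptions made for this example; the actual recorder works with its own buffer types.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Convert raw logits for one categorical action into log-probabilities."""
    # Subtract the max before exponentiating to avoid overflow.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]
```

Given a recorded chosen action index `a`, `log_softmax(logits)[a]` is the log-probability of the selected action, so recording it separately becomes optional once logits are stored.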

Adds a new `supervised.py` script to enn-ppo which trains a model from samples recorded by another policy. Also makes various improvements to the sample recorder:

- add `--eval-capture-samples`/`--eval-capture-logits` options to record samples/logits during eval to a file
- add `--eval-on-step-0` arg to enable/disable running eval on the first step
- add `--codecraft-only-opponent` to run an eval with only a loaded eval policy against itself (this is slightly hacky, I'm planning to remove all the CodeCraft-specific options later)
- include action and observation spaces when recording samples
- fix `RaggedBufferBool` getting deserialized to `None`
- misc fixes to the `SampleRecorder` and `Trace`

Resolves #5, #6, and #8.

cswinter assigned cswinter and unassigned cswinter Nov 26, 2021

cswinter mentioned this issue Feb 21, 2022

Behavioral cloning #175

Merged

cswinter closed this as completed in #175 Feb 22, 2022
