Fraser-Greenlee/rl_repl_agent

PPO-Seq2Seq

Train a Seq2Seq model on samples generated by a PPO agent.

The Seq2Seq model learns to match the output of a Python REPL, while an RL model generates the samples it trains on.
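As a rough illustration of the data flow described above, the sketch below builds (sample, REPL output) pairs that a Seq2Seq model could be trained on. It is not this repository's code: `repl_output`, `sample_expression`, and `make_pairs` are hypothetical names, the random sampler merely stands in for the PPO policy, and the real environment, reward, and model are not shown here.

```python
import contextlib
import io
import random


def repl_output(source):
    """Evaluate one expression roughly as a REPL would and return the text it shows."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # Assumption: samples are single expressions; builtins are disabled.
            result = eval(source, {"__builtins__": {}}, {})
        if result is not None:
            buffer.write(repr(result))
    except Exception as exc:
        # Report only the error class name, as a simple target string.
        return type(exc).__name__
    return buffer.getvalue()


def sample_expression(rng):
    """Hypothetical stand-in for the RL policy: emit a random arithmetic expression."""
    left, right = rng.randint(0, 9), rng.randint(0, 9)
    op = rng.choice(["+", "-", "*", "//"])
    return f"{left} {op} {right}"


def make_pairs(n, seed=0):
    """Generate n (sample, REPL output) pairs for Seq2Seq training."""
    rng = random.Random(seed)
    samples = [sample_expression(rng) for _ in range(n)]
    return [(s, repl_output(s)) for s in samples]


if __name__ == "__main__":
    for source, target in make_pairs(5):
        print(f"{source!r} -> {target!r}")
```

In a PPO setup the sampler would be replaced by a learned policy, and some signal derived from the Seq2Seq model (for example its prediction error) could serve as the reward; that choice is an assumption here, not something the README states.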

About

Agent for my RL-REPL environment.
