Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning
Setup: the same as MAR.
Distilling:
bash run_batch.sh
RL:
bash rl.sh
bash eval.sh
If you have any questions, feel free to contact me through email (yxgu25@stu.pku.edu.cn). Enjoy!
