Skip to content

Use loop break instead of sys.exit() on early stop#123

Merged
dnddnjs merged 1 commit into
masterfrom
cleanup/training-loop-exit
May 17, 2026
Merged

Use loop break instead of sys.exit() on early stop#123
dnddnjs merged 1 commit into
masterfrom
cleanup/training-loop-exit

Conversation

@dnddnjs
Copy link
Copy Markdown
Contributor

@dnddnjs dnddnjs commented May 17, 2026

Summary

  • Replace sys.exit() in the three CartPole training scripts with a clean loop exit so the function returns normally instead of tearing down the process.
  • DQN/A2C use a solved flag (nested while/for); PPO has a single outer loop so a plain break is enough.
  • The final torch.save now runs through one code path whether training ends from early stop or from exhausting EPISODES.

Why

sys.exit() raises SystemExit, which:

  • Skips env.close() and any outer cleanup
  • Makes the training function impossible to import/unit-test (the test process dies too)
  • Kills the kernel under Jupyter/IPython

Credit to @AlexisBogroff in #84 for flagging the pattern on the original TF code; this PR applies the same idea to the current PyTorch tree.

Test plan

  • python 2-cartpole/1-dqn.py — confirm it stops cleanly when the 10-episode running mean crosses 490 and the saved model loads via --test
  • Same for 2-cartpole/2-a2c.py and 2-cartpole/3-ppo.py

sys.exit() raises SystemExit which short-circuits env.close() and any
outer cleanup, can't be unit-tested, and kills the kernel under
Jupyter/IPython. Replace with a `solved` flag (DQN/A2C, nested loops)
or a plain break (PPO, single loop) so the function returns normally
and the final save runs through the same path as the EPISODES-exhausted
case.

Credit to @AlexisBogroff in #84 for flagging the pattern; applied to
the current PyTorch tree.
@dnddnjs dnddnjs merged commit 565a1fe into master May 17, 2026
@dnddnjs dnddnjs deleted the cleanup/training-loop-exit branch May 17, 2026 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant