Update the TicTacToe environment #1192

Merged
Merged 27 commits into Farama-Foundation:master from ttt_update on May 3, 2024

Conversation


@dm-ackerman dm-ackerman commented Mar 20, 2024

Description

Update the TicTacToe environment to v4

Major changes:

  • Fix termination handling to work with Stable-Baselines3 (SB3). The environment now trains as expected with the SB3 PPO code
  • Rewrite of underlying board code
  • Add test functions for the underlying board code
  • Bump version to 4

Minor changes:

  • Add some code comments
  • Remove unneeded code
  • Make the board class more encapsulated
  • Create the rendering screen only when rendering is enabled
  • Minor performance improvements (~30% faster)
  • Streamline the win-detection code
  • Remove the redundant recalculation of winning configurations

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have run pytest -v and no errors are present.
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Rewards are only set once, so they only need to be accumulated once.
There is no need to modify the accumulated rewards if they are not set.
For unknown reasons, this is required for Stable-Baselines3
training to work correctly. It does not appear to impact other
usage, as the agents are still looped over correctly after the
game ends, just in a different order.
This makes the env code less cluttered and better encapsulates
the board behavior. It also expands the checks for a valid move.
The winning configurations never change; there is no reason to recalculate them constantly.
Removes the duplicate calls to the winner check.
This keeps the env from needing to access the internals
of the board to get the moves available.
This causes problems with test cases that have invalid
board configurations to test win lines.
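The commit notes above mention precomputing the winning configurations and streamlining win detection. A minimal sketch of that idea, assuming a 3x3 board stored as a flat list (the names `WIN_LINES` and `check_winner` are illustrative, not the actual board API):

```python
from typing import List, Optional

# The eight winning lines of a 3x3 board never change, so build them
# once at module load instead of recalculating on every win check.
WIN_LINES: List[List[int]] = (
    [[r * 3, r * 3 + 1, r * 3 + 2] for r in range(3)]  # rows
    + [[c, c + 3, c + 6] for c in range(3)]            # columns
    + [[0, 4, 8], [2, 4, 6]]                           # diagonals
)

def check_winner(squares: List[int]) -> Optional[int]:
    """Return the winning mark (1 or 2), or None. 0 means an empty square."""
    for a, b, c in WIN_LINES:
        if squares[a] != 0 and squares[a] == squares[b] == squares[c]:
            return squares[a]
    return None
```

Because `WIN_LINES` is computed a single time, each win check is a flat scan over eight index triples, matching the commit's point that there is no reason to recalculate the configurations constantly.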
@dm-ackerman (Contributor, Author):

The AgileRL tutorial is broken again. I think I can fix that, and I'll pin the version so it won't be an issue in the future.

@dm-ackerman (Contributor, Author):

Tests are failing due to AgileRL version change. #1193 should resolve that.

board.play_turn(0, outside_space)

# 2) move by unknown agent should be rejected
for unknown_agent in [-1, 2]:
-    with pytest.raises(BadTicTacToeMoveException):
+    with pytest.raises(AssertionError):
Collaborator:

Why pytest.raises? I haven't seen that before.

@dm-ackerman (Contributor, Author):

As far as I know, that's the recommended pytest way to confirm that a block of code raises a specific assertion. Pytest itself doesn't raise the exception. It catches and ignores that exception (because it's expected) but raises an error if the code in that block doesn't raise the expected error.

Collaborator:

Ok, fair. We should probably also go and make the rest of the code adhere to that, because I'm pretty sure it isn't done elsewhere, but I can see it being good practice. Not a problem here, tbh; since it's so small, I think it's fine to have here, and then we can open an issue or future PRs to fix other code to adhere to this practice.

This now always switches agents in each step. The previous change was
itself a bug that appeared to "fix" the behaviour caused by a bug in the
SB3 tutorial.

The agent should be swapped every step, as is done now.
The change to v4 was due to the agent handling at the end of an episode.

The other changes to the env don't change its behaviour, so
it is left at v3.
@dm-ackerman changed the title from "Update the TicTacToe environment to v4" to "Update the TicTacToe environment" on May 3, 2024
@dm-ackerman (Contributor, Author):

Apparently, the failure to train with SB3 wasn't a bug in the env; it was an SB3 tutorial bug. I removed the change that "fixed" the env because it was wrong. I believe the rest of the changes are valid, but I don't think they require a version bump, so I moved it back to v3.

@elliottower elliottower merged commit 98e8c20 into Farama-Foundation:master May 3, 2024
46 checks passed
@dm-ackerman dm-ackerman deleted the ttt_update branch May 6, 2024 16:22