Policy Gradient Learning tic-tac-toe

#Siraj Raval Coding challenge 09/12/17

Python code to train agents to play tic-tac-toe using policy gradients.

How to use

Run

$ ./train_agents.py training_games learning_rate

to train the agents, where 'training_games' is the number of training games to play and 'learning_rate' is the learning rate used for the gradient updates to the policy network.
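To illustrate what the learning rate controls, here is a minimal sketch of a single policy gradient (REINFORCE) update for a linear softmax policy. It is not the code in train_agents.py: the function names, the 9x9 weight layout, and the board encoding are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(weights, state, action, reward, learning_rate):
    """Apply one REINFORCE step to a linear softmax policy (hypothetical layout).

    weights: 9x9 list; weights[a][i] links board cell i to the score of move a.
    state:   list of 9 numbers (assumed encoding: 1 own mark, -1 opponent, 0 empty).
    Returns the action probabilities computed before the update.
    """
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    probs = softmax(logits)
    for a in range(len(weights)):
        # Gradient of log pi(action | state) with respect to logit a.
        grad_logit = (1.0 if a == action else 0.0) - probs[a]
        for i, s in enumerate(state):
            # Gradient ascent on expected reward, scaled by the learning rate.
            weights[a][i] += learning_rate * reward * grad_logit * s
    return probs
```

A positive reward makes the chosen move more likely in that state and a negative reward makes it less likely; a larger learning rate makes each such step bigger. The sketch does not mask illegal moves, which a real agent would need to handle.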

After training, you can run

$ ./test_agents.py num_trials

to test the agents' performance against each other, where 'num_trials' is the number of games to be played by the agents.
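As a rough picture of what such a test run involves, the sketch below plays a number of games between two agents and tallies the outcomes. It is not the code in test_agents.py: the random agent stands in for a trained policy, and every name here is hypothetical.

```python
import random

# All eight three-in-a-row lines on a board indexed 0..8, row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_agent(board, rng):
    """Stand-in policy (assumption): pick uniformly among the empty cells."""
    return rng.choice([i for i, cell in enumerate(board) if cell == " "])


def play_game(agent_x, agent_o, rng):
    """Play one game with X moving first; return 'X', 'O', or 'draw'."""
    board = [" "] * 9
    for turn in range(9):
        mark, agent = ("X", agent_x) if turn % 2 == 0 else ("O", agent_o)
        board[agent(board, rng)] = mark
        if winner(board):
            return mark
    return "draw"


def evaluate(agent_x, agent_o, num_trials, seed=0):
    """Play num_trials games and count wins for each side and draws."""
    rng = random.Random(seed)
    tally = {"X": 0, "O": 0, "draw": 0}
    for _ in range(num_trials):
        tally[play_game(agent_x, agent_o, rng)] += 1
    return tally
```

Calling evaluate(random_agent, random_agent, 1000) gives a baseline tally that trained agents can be compared against.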

You can also run

$ ./play_agent.py agent1 agent2

to play against a trained agent or watch two agents play each other, where 'agent1' and 'agent2' are the types of the two agents.

For more information about a script's arguments, run

$ ./script_name.py -h

Requirements

Python 3 is required. Install the dependencies listed in requirements.txt, for example with pip:

$ pip install -r requirements.txt

I-NicKK/Tic-Tac-Toe
