carrom-with-wfnn

CS 747 ASSIGNMENT 4

CARROM BOARD SIMULATION USING REINFORCEMENT LEARNING

The agent uses a WIRE FITTED NEURAL NETWORK (WFNN) model for predicting the values of actions.

The WFNN model consists of two parts: a neural network and a wire-fitting interpolator.
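As a rough illustration of what a wire-fitting interpolator does, the sketch below uses the standard wire-fitting formula: the network emits a set of "wires", each an (action, q) pair, and Q for an arbitrary action is a distance-weighted average of the wire q values. This is a minimal sketch and not the code from this repository. The function name `wire_fit_q`, the smoothing constants `c` and `eps`, and the three-component action vectors are all assumptions made for the example.

```python
def wire_fit_q(action, wire_actions, wire_qs, c=0.1, eps=1e-6):
    """Interpolate Q(state, action) from the wires produced for one state.

    wire_actions: list of action vectors, one per wire (hypothetical layout).
    wire_qs:      list of q values, one per wire.
    c, eps:       assumed smoothing constants, not taken from the repository.
    """
    q_max = max(wire_qs)
    weighted_sum = 0.0
    weight_total = 0.0
    for u_i, q_i in zip(wire_actions, wire_qs):
        # Squared distance between the queried action and this wire's action.
        sq_dist = sum((a - b) ** 2 for a, b in zip(action, u_i))
        # Wires with a lower q than the best wire are pushed "further away".
        d_i = sq_dist + c * (q_max - q_i) + eps
        weighted_sum += q_i / d_i
        weight_total += 1.0 / d_i
    return weighted_sum / weight_total


# Hypothetical wires for a single state.
wire_actions = [[0.1, 0.2, 0.3], [0.5, 0.5, 0.5], [0.9, 0.1, 0.7]]
wire_qs = [1.0, 2.0, 0.5]

q_at_best = wire_fit_q([0.5, 0.5, 0.5], wire_actions, wire_qs)
q_elsewhere = wire_fit_q([0.3, 0.3, 0.3], wire_actions, wire_qs)
```

With this formula the interpolated value at the action of the best wire is approximately that wire's q value, so the greedy action can be read directly from the network output without a separate search over actions.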

Training works in 3 steps:

1. Feed the state into the neural network. From the output of the neural network, find the action with the highest Q value and execute it. Record an experience composed of the initial state, the action, the next state and the reward received as a result of the action.
2. Compute a new estimate of the Q value using the maximum Q value over actions in the next state (the one-step Q-learning equation).
3. Fit the wires with an interpolated curve and calculate the wire fitter's partial derivatives to obtain the desired action and Q values. Lastly, train the neural network on these desired values using backpropagation.
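The first two steps above can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the repository's implementation: the names `Experience`, `greedy_wire` and `one_step_q_target`, the discount factor `gamma=0.9`, and the sample numbers are all hypothetical. The sketch also assumes the network output has already been split into per-wire actions and q values, and it does not cover the partial-derivative and backpropagation step.

```python
from collections import namedtuple

# One recorded interaction with the board (step 1).
Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


def greedy_wire(wire_actions, wire_qs):
    """Return the (action, q) pair of the wire with the highest q value."""
    best = max(range(len(wire_qs)), key=lambda i: wire_qs[i])
    return wire_actions[best], wire_qs[best]


def one_step_q_target(reward, next_wire_qs, gamma=0.9, terminal=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a') (step 2).

    With wire fitting, the maximum over actions is the largest wire q value.
    gamma is an assumed discount factor.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_wire_qs)


# Hypothetical wires for the current state and for the next state.
action, q = greedy_wire([[0.1, 0.2, 0.3], [0.5, 0.5, 0.5]], [1.0, 2.0])
exp = Experience(state="s0", action=action, reward=1.0, next_state="s1")
target = one_step_q_target(exp.reward, next_wire_qs=[0.5, 2.0, 1.0], gamma=0.9)
```

The value `target` is what the Q value of the executed action would be moved towards in step 3.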

With the best set of weights found during training, the agent clears the board in 26 turns.

I have provided a script.sh that runs the simulation directly using the saved optimum weights, so no training is required.

To view the backpropagation (training) part, use the commented-out lines in script.sh.


TEAM NAME: INNOVATORS

Team members: MEHTA NIHAR NIKHIL VENKAT KALYAN
