Reversi-AI

Two different implementations of "perfect information" Artificial Intelligence: one that learns, using a Q-Learning/SARSA reinforcement learner, and one that searches, using the Minimax algorithm with Alpha-beta pruning.
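
To give a sense of the searching side: Alpha-beta pruning lets Minimax skip branches that cannot change the final decision. The sketch below only illustrates that general technique and is not the code in this repository; Board, Move, legalMoves(), applyMove(), evaluate() and isGameOver() are hypothetical names.

// Illustrative only: generic Minimax with Alpha-beta pruning over a game tree.
// Board, Move, legalMoves(), applyMove(), evaluate() and isGameOver() are
// hypothetical names, not the classes used in this project.
static int alphaBeta(Board board, int depth, int alpha, int beta, boolean maximizing) {
    if (depth == 0 || board.isGameOver()) {
        return board.evaluate();              // heuristic value of this position
    }
    if (maximizing) {
        int best = Integer.MIN_VALUE;
        for (Move m : board.legalMoves()) {
            best = Math.max(best, alphaBeta(board.applyMove(m), depth - 1, alpha, beta, false));
            alpha = Math.max(alpha, best);
            if (alpha >= beta) {
                break;                        // beta cutoff: the minimizer will never allow this line
            }
        }
        return best;
    } else {
        int best = Integer.MAX_VALUE;
        for (Move m : board.legalMoves()) {
            best = Math.min(best, alphaBeta(board.applyMove(m), depth - 1, alpha, beta, true));
            beta = Math.min(beta, best);
            if (beta <= alpha) {
                break;                        // alpha cutoff: the maximizer already has something better
            }
        }
        return best;
    }
}

The learning side instead maintains a table of state-action values and updates it with the standard SARSA rule, Q(s,a) ← Q(s,a) + α[r + γ·Q(s',a') − Q(s,a)], where a' is the action actually taken in the next state.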

Reversi Alphabeta vs SARSA

This GIF demonstrates a learning AI playing against a searching AI. Can you guess which AI is the winner?

Making a move

My first objective for this project was to create the game, with all of its rules, playable by two human players. I ran into many challenges along the way; in particular, validating a move proved difficult to implement.
(Board images: the initial board, the squares where dark may play, and the board after dark's move.)

This is a very basic example of a valid move, largely because it is the first move of the game. As the game becomes more complex and moves begin to influence the board in more than one direction, verifying the validity of moves becomes more difficult.

The code for checking a move is below. It requires looking in all 8 directions, horizontal, vertical, and diagonal (for a piece on a corner or an edge of the board, only 3 or 5 of those directions stay on the board; the bounds check in the loop handles this).

public ArrayList<Position> checkMove(int row, int col) {
    ArrayList<Position> allflip = new ArrayList<Position>();
    int x = row, y = col;

    //--------------------------------------------------
    //code for checking in different cardinal directions
    //--------------------------------------------------

    for (int currentDirection : directionsToLook) {

        // Walk outward from the candidate square in this direction,
        // collecting opponent pieces that would be flipped.
        Position currentPos = new Position(x, y);
        ArrayList<Position> toflip = new ArrayList<Position>();

        moveDir(currentPos, currentDirection);

        // Keep stepping while we remain on the 8x8 board.
        while ((currentPos.getRow() < 8) && (currentPos.getRow() >= 0)
                && (currentPos.getCol() < 8) && (currentPos.getCol() >= 0)) {

            char currentPiece = gameBoard.getBoardCoor(currentPos.getRow(), currentPos.getCol());

            if (currentPiece == opponent()) {
                // Opponent piece: tentatively mark it for flipping and keep going.
                toflip.add(currentPos.asPiece(opponent()));
                moveDir(currentPos, currentDirection);
            } else if (currentPiece != gameBoard.getWhoseMove()) {
                // Empty square: the run is not bracketed, so nothing flips in this direction.
                break;
            } else {
                // One of the mover's own pieces: the run is bracketed, so these flips are valid.
                allflip.addAll(toflip);
                break;
            }
        }
    }
    return allflip;
}
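
For a sense of how the result is used: checkMove returns every piece that would be flipped, so applying a move amounts to placing the new piece and flipping that list. The sketch below is a hypothetical illustration rather than the project's actual code; setBoardCoor is an assumed setter on the board, and a real version would also verify that the target square is empty.

// Hypothetical sketch of applying a move with checkMove's result.
// setBoardCoor(row, col, piece) is an assumed setter, not necessarily the real API.
public boolean makeMove(int row, int col) {
    ArrayList<Position> flips = checkMove(row, col);
    if (flips.isEmpty()) {
        return false;                                    // nothing would flip: the move is illegal
    }
    char mover = gameBoard.getWhoseMove();
    gameBoard.setBoardCoor(row, col, mover);             // place the new piece
    for (Position p : flips) {
        gameBoard.setBoardCoor(p.getRow(), p.getCol(), mover);  // flip each captured piece
    }
    return true;
}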

The class "TestReversi" is where the game can be started from. Settings can be modified in this class, currently by commenting or uncommenting parts of code. TestReversi is currently set up so that a human player can play against the reinforcement learner. Make a move by specifying a row and a col like so: 3 4 (row 3, col 4)

Features to add in the future: ask the user at startup whether they would like to play against a searching AI, a learning AI, or another human, or whether they would like to watch two AIs play against each other (removing the need to comment and uncomment code to change who plays).
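
As a sketch of how that startup choice could work (Player, HumanPlayer, MinimaxPlayer and SarsaPlayer are hypothetical names, purely for illustration):

import java.util.Scanner;

// Hypothetical opponent-selection menu; the player classes are made-up names.
public class OpponentMenuSketch {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.println("Choose an opponent: 1) Human  2) Searching AI (Alpha-beta)  3) Learning AI (SARSA)");
        int choice = in.nextInt();
        Player opponent;
        if (choice == 2) {
            opponent = new MinimaxPlayer();
        } else if (choice == 3) {
            opponent = new SarsaPlayer();
        } else {
            opponent = new HumanPlayer();
        }
        // ... start the game with the chosen opponent ...
    }
}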