Erica Dominic, Erica Detemmerman, and Isaac Haberman
The game of googol is the quintessential optimal stopping problem: the player's only task is to determine the best time to quit the game. We trained four agents to play the game of googol using common reinforcement learning algorithms: first-visit Monte Carlo, SARSA, Q-learning, and deep Q-learning. All agents learn to play the game of googol with success rates between 19% and 37%, with the deep Q-learning agent proving the most successful. Furthermore, we demonstrate that all agents are able to transfer the knowledge accumulated while playing one version of the game to another version of the game. We also conducted a small number of trials in which human participants were asked to play the game of googol. Collectively, human participants attained a success rate of 35%, comparable to the deep Q-learning agent; however, as trends in stopping choices show, human players' strategies do not correspond to those observed in the reinforcement learning agents.
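For readers unfamiliar with the game, the sketch below simulates one common formulation: n hidden numbers are revealed one at a time, and the player wins only by stopping on the overall maximum. The classic baseline strategy (observe roughly n/e values, then stop at the first value exceeding the maximum seen so far) attains a success rate near 37%, the same ballpark as the agents reported above. This is an illustrative sketch, not the environment or agents used in the report; in particular, drawing values uniformly at random is an assumption, since the original game allows arbitrary numbers.

```python
import random

def play_googol(n=100, cutoff=None, rng=random):
    """Play one game: win only by stopping on the maximum of n hidden values.

    Values are drawn uniformly at random (an assumption; the original game
    allows an adversary to write down arbitrary numbers).
    """
    values = [rng.random() for _ in range(n)]
    best = max(values)
    if cutoff is None:
        cutoff = round(n / 2.718281828)  # classic 1/e observation phase
    # Observe the first `cutoff` values without stopping.
    threshold = max(values[:cutoff]) if cutoff > 0 else float("-inf")
    # Stop at the first later value that beats everything observed so far.
    for v in values[cutoff:]:
        if v > threshold:
            return v == best
    # Forced to accept the last value if nothing beat the threshold.
    return values[-1] == best

def success_rate(trials=10000, n=100):
    """Estimate the win probability of the 1/e-cutoff strategy."""
    wins = sum(play_googol(n) for _ in range(trials))
    return wins / trials
```

Running `success_rate()` should return a value near 0.37, consistent with the well-known 1/e bound for this class of stopping problems.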
See Report.pdf for more information.