presentation MCTS
TeXitoi committed Nov 28, 2011
1 parent 5ed19bc commit bb868d2
Showing 2 changed files with 46 additions and 5 deletions.
36 changes: 31 additions & 5 deletions tex/S32-qualification-abstract.tex
@@ -30,11 +30,36 @@ \section{Method}

\subsection{Monte Carlo Tree Search}

presentation

usage in the game of Go

our adaptations to optimisation problems
Kocsis and Szepesv\'ari~\cite{kocsis2006bandit} proposed guiding the
exploration of a tree with multi-armed bandit techniques, using Monte
Carlo simulations to evaluate nodes. This method is known as UCT
(Upper Confidence bounds applied to Trees, after its main formula)
and, more generally, as MCTS (Monte Carlo Tree Search). Thanks to
this method, MoGo, a Go-playing program, became competitive with
human players~\cite{gelly2007contribution}. Because this method
explores a tree while handling the exploration--exploitation
trade-off, we decided to base our optimisation method on MCTS.
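
For reference, the selection rule usually associated with UCT can be
written as follows. This abstract does not state the formula, so the
notation ($\bar{X}_j$, $n$, $n_j$, $C$) is an assumption borrowed from
the standard UCB1-style presentation and is not necessarily the exact
variant used in our method:
\[
  j^{\ast} = \arg\max_{j} \left( \bar{X}_j + C \sqrt{\frac{\ln n}{n_j}} \right),
\]
where $\bar{X}_j$ is the mean evaluation observed so far for child
$j$, $n_j$ is the number of times child $j$ has been visited, $n$ is
the number of visits of its parent, and $C > 0$ is a constant that
balances exploration against exploitation.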

In outline, the method works as follows:
\begin{itemize}
\item UCT chooses a node to expand.
\item The node is expanded, and one Monte Carlo simulation is run from
each newly created node. Each simulation returns an evaluation of
its node between 0 and 1.
\item The evaluations returned by the simulations are used to update
the knowledge of the problem.
\end{itemize}
This sequence is repeated until the stopping condition is met.
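
As an illustration of the update step, one common choice is an
incremental mean maintained at every node $v$ on the path from the
root to the simulated node. The text above does not specify the
update, so this is only a sketch under that assumption, and the
symbols $r$ (the evaluation returned by the simulation), $n_v$ (the
visit count of $v$) and $\bar{X}_v$ (the mean evaluation of $v$) are
hypothetical notation:
\[
  n_v \leftarrow n_v + 1, \qquad
  \bar{X}_v \leftarrow \bar{X}_v + \frac{r - \bar{X}_v}{n_v}.
\]
Since every $r$ lies in $[0,1]$, each $\bar{X}_v$ is an average of
values in $[0,1]$ and therefore stays in $[0,1]$ as well.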

We adapted MCTS to optimisation problems. Our main modification is
that we handle solved nodes. As a consequence, if our method runs
for long enough, it stops with the optimal solution.

The method is composed of two modules:
\begin{itemize}
\item a search tree;
\item a Monte Carlo simulation system.
\end{itemize}

\subsection{Search Tree}

@@ -103,6 +128,7 @@ \section{Perspectives}

\section{Conclusion}

\bibliographystyle{plain}
\bibliography{bibliography}

\end{document}
15 changes: 15 additions & 0 deletions tex/bibliography.bib
@@ -0,0 +1,15 @@
@INPROCEEDINGS{kocsis2006bandit,
author = {Levente Kocsis and Csaba Szepesv{\'a}ri},
title = {Bandit based Monte-Carlo Planning},
booktitle = {ECML-06, number 4212 in LNCS},
year = {2006},
pages = {282--293},
publisher = {Springer}
}

@phdthesis{gelly2007contribution,
author = "Sylvain Gelly",
title = "Une contribution {\`a} l'apprentissage par renforcement; application au computer {Go}",
school = "Universit{\'e} Paris-Sud",
year = 2007
}
