presentation MCTS

gturri · Nov 28, 2011 · bb868d2
1 parent 5ed19bc
commit bb868d2

Showing 2 changed files with 46 additions and 5 deletions.
diff --git a/tex/S32-qualification-abstract.tex b/tex/S32-qualification-abstract.tex
@@ -30,11 +30,36 @@ \section{Method}
 
 \subsection{Monte Carlo Tree Search}
 
-presentation
-
-usage in the game of Go
-
-our adaptations to optimisation problems
+Kocsis and Szepesv\'ari~\cite{kocsis2006bandit} proposed guiding the
+exploration of a tree with multi-armed bandit techniques, using Monte
+Carlo simulations to evaluate each node.  This method is known as UCT
+(for Upper Confidence bounds applied to Trees, after the main formula
+of the method) or MCTS (for Monte Carlo Tree Search).  Thanks to this
+method, MoGo, a Go artificial intelligence, became competitive with
+humans~\cite{gelly2007contribution}.  Because this method explores a
+tree while handling the exploration--exploitation trade-off, we
+decided to base our optimisation method on MCTS.
+
+Basically, the method works as follows:
+\begin{itemize}
+\item UCT chooses a node to expand.
+\item The node is expanded, and one Monte Carlo simulation is run from
+  each newly created node. Each Monte Carlo simulation returns an
+  evaluation of its node between 0 and 1.
+\item The evaluations returned by the simulations are used to update
+  the knowledge of the problem.
+\end{itemize}
+This sequence is repeated until the stopping condition is met.
+
+We adapted MCTS to optimisation problems. Our main modification is
+that we manage solved nodes.  As a consequence, if our method runs
+for long enough, it stops with the optimal solution.
+
+The method is composed of two modules:
+\begin{itemize}
+\item a search tree;
+\item a Monte Carlo simulation system.
+\end{itemize}
 
 \subsection{Search Tree}
 
@@ -103,6 +128,7 @@ \section{Perspectives}
 
 \section{Conclusion}
 
+\bibliographystyle{plain}
 \bibliography{bibliography}
 
 \end{document}
diff --git a/tex/bibliography.bib b/tex/bibliography.bib
@@ -0,0 +1,15 @@
+@INPROCEEDINGS{kocsis2006bandit,
+    author = {Levente Kocsis and Csaba Szepesv{\'a}ri},
+    title = {Bandit based {Monte-Carlo} Planning},
+    booktitle = {European Conference on Machine Learning (ECML 2006), LNCS 4212},
+    year = {2006},
+    pages = {282--293},
+    publisher = {Springer}
+}
+
+@phdthesis{gelly2007contribution,
+   author = "Sylvain Gelly",
+   title = "Une contribution {\`a} l'apprentissage par renforcement; application au computer {Go}",
+   school = "Universit{\'e} Paris-Sud",
+   year = 2007
+}
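
Note on the "main formula" mentioned in the Monte Carlo Tree Search subsection: the abstract names UCT after its selection formula but does not state it. A minimal sketch of the standard UCB1-style rule is given below for reference. It is not part of this commit, and the symbols are notation assumed here: $\bar{X}_j$ is the mean evaluation (between 0 and 1) of child $j$, $n_j$ its visit count, $n$ the visit count of the parent, and $C > 0$ an exploration constant whose value the abstract does not specify.

```latex
% Illustrative only: UCB1-style child selection as commonly used by UCT.
% All symbols are assumed notation, not taken from the abstract:
%   \bar{X}_j : mean evaluation of child j, in [0, 1]
%   n_j       : number of visits of child j
%   n         : number of visits of the parent node
%   C         : exploration constant (C > 0), value left unspecified
\begin{equation}
  j^{*} = \arg\max_{j} \left( \bar{X}_j + C \sqrt{\frac{\ln n}{n_j}} \right)
\end{equation}
```

The first term favours children with good evaluations so far (exploitation); the second grows for rarely visited children (exploration), which is the trade-off the abstract refers to.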