Add docs to line tac toe
Zachary Marion committed Dec 30, 2018
1 parent e28834f commit 24c24fc
Showing 11 changed files with 264 additions and 81 deletions.
Binary file modified docs/.doctrees/api/algorithms.doctree
Binary file not shown.
Binary file modified docs/.doctrees/environment.pickle
Binary file not shown.
90 changes: 62 additions & 28 deletions docs/_modules/algorithms/mcts.html

Large diffs are not rendered by default.

15 changes: 9 additions & 6 deletions docs/_modules/games/tictactoe.html
Original file line number Diff line number Diff line change
Expand Up @@ -202,10 +202,6 @@ <h1>Source code for games.tictactoe</h1><div class="highlight"><pre>
<span class="k">return</span> <span class="o">-</span><span class="mi">1</span>
<span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">heuristic</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">heuristic</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">_</span><span class="p">):</span>
<span class="sd">&#39;&#39;&#39; Stubbed for now &#39;&#39;&#39;</span>
<span class="k">return</span> <span class="mi">0</span>

<span class="k">def</span> <span class="nf">next_state</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">s</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
<span class="n">copy</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="n">copy</span><span class="p">()</span>
<span class="n">copy</span><span class="p">[</span><span class="n">a</span><span class="p">]</span> <span class="o">=</span> <span class="n">p</span>
Expand All @@ -223,7 +219,13 @@ <h1>Source code for games.tictactoe</h1><div class="highlight"><pre>
<span class="k">def</span> <span class="nf">to_hash</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">s</span><span class="p">):</span>
<span class="k">return</span> <span class="nb">hash</span><span class="p">(</span><span class="nb">tuple</span><span class="p">(</span><span class="n">s</span><span class="p">))</span>

<span class="k">def</span> <span class="nf">is_winner</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">s</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">heuristic</span><span class="p">(</span><span class="n">_</span><span class="p">):</span>
<span class="sd">&#39;&#39;&#39; Stubbed for now &#39;&#39;&#39;</span>
<span class="k">return</span> <span class="mi">0</span>

<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">is_winner</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
<span class="sd">&#39;&#39;&#39;</span>
<span class="sd"> Return whether a particular player has won the game. Ideally this would</span>
<span class="sd"> be generalized to an nxn board.</span>
Expand All @@ -237,7 +239,8 @@ <h1>Source code for games.tictactoe</h1><div class="highlight"><pre>
<span class="p">(</span><span class="n">s</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span> <span class="ow">and</span> <span class="n">s</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span> <span class="ow">and</span> <span class="n">s</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span><span class="p">)</span> <span class="ow">or</span>
<span class="p">(</span><span class="n">s</span><span class="p">[</span><span class="mi">8</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span> <span class="ow">and</span> <span class="n">s</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span> <span class="ow">and</span> <span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">p</span><span class="p">))</span>

<span class="k">def</span> <span class="nf">stringify_player</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">tile</span><span class="p">):</span>
<span class="nd">@staticmethod</span>
<span class="k">def</span> <span class="nf">stringify_player</span><span class="p">(</span><span class="n">tile</span><span class="p">):</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">enumerate</span><span class="p">([</span><span class="s1">&#39;X&#39;</span><span class="p">,</span> <span class="s1">&#39;O&#39;</span><span class="p">]))</span>
<span class="k">return</span> <span class="n">mapping</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">tile</span><span class="p">,</span> <span class="s1">&#39; &#39;</span><span class="p">)</span></div>
</pre></div>
Expand Down
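The `is_winner` docstring in the hunk above notes that the check would ideally be generalized to an nxn board. A minimal sketch of one way to do that follows, assuming the state `s` is a flat, row-major sequence of length n*n, as the hard-coded indices in the diff suggest. The name `is_winner_nxn` is hypothetical and is not part of this commit.

```python
import math


def is_winner_nxn(s, p):
    """Return True if player p fills a whole row, column, or diagonal.

    Assumes s is a flat, row-major board of length n*n (hypothetical
    generalization; the committed code hard-codes the 3x3 case).
    """
    n = math.isqrt(len(s))
    if n == 0 or n * n != len(s):
        raise ValueError("board length must be a positive perfect square")

    # Index lists for every line that can win the game.
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    diags = [
        [i * n + i for i in range(n)],            # top-left to bottom-right
        [i * n + (n - 1 - i) for i in range(n)],  # top-right to bottom-left
    ]
    return any(all(s[i] == p for i in line) for line in rows + cols + diags)
```

On a 3x3 board the two diagonal index lists are `[0, 4, 8]` and `[2, 4, 6]`, the same cells the committed code compares in its last two clauses.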
108 changes: 102 additions & 6 deletions docs/api/algorithms.html
Expand Up @@ -197,7 +197,45 @@ <h1>Algorithms<a class="headerlink" href="#algorithms" title="Permalink to this
<dl class="method">
<dt id="algorithms.MCTS.best_action">
<code class="descname">best_action</code><span class="sig-paren">(</span><em>g</em>, <em>s</em>, <em>p</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.best_action"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.best_action" title="Permalink to this definition"></a></dt>
<dd><p>Returns the best action for a given player in a given game state</p>
<dd><p>Get the best action for a given player in a given game state</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – The game</li>
<li><strong>s</strong> (<em>state</em>) – The current state of the game</li>
<li><strong>p</strong> (<em>int</em>) – The current player</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The best action given the current knowledge of the game</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">int</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

<dl class="method">
<dt id="algorithms.MCTS.execute_episode">
<code class="descname">execute_episode</code><span class="sig-paren">(</span><em>g</em>, <em>c_punt=1.4</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.execute_episode"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.execute_episode" title="Permalink to this definition"></a></dt>
<dd><p>Execute a single iteration of the search and update the internal state
based on the generated examples</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – The game</li>
<li><strong>c_punt</strong> (<em>float</em>) – The degree of exploration. Defaults to 1.4</li>
</ul>
</td>
</tr>
</tbody>
</table>
</dd></dl>

<dl class="method">
Expand Down Expand Up @@ -233,15 +271,35 @@ <h1>Algorithms<a class="headerlink" href="#algorithms" title="Permalink to this
</table>
</dd></dl>

<dl class="method">
<dl class="staticmethod">
<dt id="algorithms.MCTS.random_playout">
<code class="descname">random_playout</code><span class="sig-paren">(</span><em>g</em>, <em>s</em>, <em>p</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.random_playout"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.random_playout" title="Permalink to this definition"></a></dt>
<em class="property">static </em><code class="descname">random_playout</code><span class="sig-paren">(</span><em>g</em>, <em>s</em>, <em>p</em>, <em>max_moves=1000</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.random_playout"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.random_playout" title="Permalink to this definition"></a></dt>
<dd><p>Perform a random playout and return the winner</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – The game</li>
<li><strong>s</strong> (<em>any</em>) – The state of the game to start the playout from</li>
<li><strong>p</strong> (<em>player</em>) – The player whose turn it currently is</li>
<li><strong>max_moves</strong> (<em>int</em>) – Maximum number of moves before the function exits</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first">The winner of the game, or -1 if there is not one</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">int</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

<dl class="method">
<dt id="algorithms.MCTS.search">
<code class="descname">search</code><span class="sig-paren">(</span><em>g</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.search"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.search" title="Permalink to this definition"></a></dt>
<code class="descname">search</code><span class="sig-paren">(</span><em>g</em>, <em>num_iters=100</em>, <em>verbose=False</em>, <em>c_punt=1.4</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.search"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.search" title="Permalink to this definition"></a></dt>
<dd><p>Play out a certain number of games, each time updating our win and play
counts for any state that we visit during the game. As we continue to
play, num_wins / num_plays for a given state should begin to converge on
Expand All @@ -250,26 +308,64 @@ <h1>Algorithms<a class="headerlink" href="#algorithms" title="Permalink to this
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – Game to train on</td>
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
<li><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – Game to train on</li>
<li><strong>num_iters</strong> (<em>int</em>) – Number of search iterations</li>
<li><strong>verbose</strong> (<em>bool</em>) – Whether or not to render a progress bar</li>
<li><strong>c_punt</strong> (<em>float</em>) – The degree of exploration. Defaults to 1.4</li>
</ul>
</td>
</tr>
</tbody>
</table>
</dd></dl>

<dl class="method">
<dt id="algorithms.MCTS.search_episode">
<code class="descname">search_episode</code><span class="sig-paren">(</span><em>g</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.search_episode"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.search_episode" title="Permalink to this definition"></a></dt>
<code class="descname">search_episode</code><span class="sig-paren">(</span><em>g</em>, <em>c_punt=1.4</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.search_episode"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.search_episode" title="Permalink to this definition"></a></dt>
<dd><p>We play a game by starting in the board's starting state and then
choosing a random move. We then move to the next state, keeping
track of which moves we chose. At the end of the game we go through
our visited list and update the values of wins and plays so that we
have a better understanding of which states are good and which are bad</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>g</strong> (<a class="reference internal" href="core.html#core.Game" title="core.Game"><em>Game</em></a>) – Game to search</li>
<li><strong>c_punt</strong> (<em>float</em>) – The degree of exploration. Defaults to 1.4</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first"><dl class="docutils">
<dt>List of examples where each entry is of the format</dt>
<dd><p class="first last"><code class="code docutils literal notranslate"><span class="pre">[player,</span> <span class="pre">state_hash,</span> <span class="pre">reward]</span></code></p>
</dd>
</dl>
</p>
</td>
</tr>
<tr class="field-odd field"><th class="field-name">Return type:</th><td class="field-body"><p class="first last">list</p>
</td>
</tr>
</tbody>
</table>
</dd></dl>

<dl class="method">
<dt id="algorithms.MCTS.update">
<code class="descname">update</code><span class="sig-paren">(</span><em>examples</em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/algorithms/mcts.html#MCTS.update"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#algorithms.MCTS.update" title="Permalink to this definition"></a></dt>
<dd><p>Backpropagate the result of the training episodes</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>examples</strong> (<em>list</em>) – List of examples where each entry is of the format
<code class="code docutils literal notranslate"><span class="pre">[player,</span> <span class="pre">state_hash,</span> <span class="pre">reward]</span></code></td>
</tr>
</tbody>
</table>
</dd></dl>

</dd></dl>
Expand Down
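The docs in this file describe `search_episode` as returning examples of the form `[player, state_hash, reward]`, and `update` as backpropagating them into win and play counts. The sketch below shows how such bookkeeping might look. The class `WinPlayTable`, its attributes, and the helper `ucb1_score` are hypothetical names, and reading `c_punt` as a UCB1-style exploration weight is an assumption; the diff of `mcts.html` is not rendered here, so the actual implementation may differ.

```python
import math
from collections import defaultdict


class WinPlayTable:
    """Hypothetical stand-in for the win/play statistics described in the docs."""

    def __init__(self):
        self.plays = defaultdict(int)
        self.wins = defaultdict(float)

    def update(self, examples):
        # Each example has the documented shape [player, state_hash, reward].
        for player, state_hash, reward in examples:
            key = (player, state_hash)
            self.plays[key] += 1
            self.wins[key] += reward

    def win_rate(self, player, state_hash):
        # num_wins / num_plays for a state, or 0.0 if it was never visited.
        key = (player, state_hash)
        plays = self.plays.get(key, 0)
        return self.wins.get(key, 0.0) / plays if plays else 0.0


def ucb1_score(wins, plays, parent_plays, c_punt=1.4):
    """UCB1-style score (assumed formula): win rate plus an exploration bonus."""
    if plays == 0:
        return math.inf  # unvisited states are tried first
    return wins / plays + c_punt * math.sqrt(math.log(parent_plays) / plays)
```

With `c_punt=0.0` the score reduces to the plain win rate, which is one way to see why a larger value favors less-visited states.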
11 changes: 10 additions & 1 deletion docs/genindex.html
Expand Up @@ -156,6 +156,7 @@ <h1 id="index">Index</h1>
<div class="genindex-jumpbox">
<a href="#A"><strong>A</strong></a>
| <a href="#B"><strong>B</strong></a>
| <a href="#E"><strong>E</strong></a>
| <a href="#F"><strong>F</strong></a>
| <a href="#G"><strong>G</strong></a>
| <a href="#H"><strong>H</strong></a>
Expand Down Expand Up @@ -200,6 +201,14 @@ <h2 id="B">B</h2>
</ul></td>
</tr></table>

<h2 id="E">E</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api/algorithms.html#algorithms.MCTS.execute_episode">execute_episode() (algorithms.MCTS method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="F">F</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
Expand Down Expand Up @@ -299,7 +308,7 @@ <h2 id="P">P</h2>
<h2 id="R">R</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api/algorithms.html#algorithms.MCTS.random_playout">random_playout() (algorithms.MCTS method)</a>
<li><a href="api/algorithms.html#algorithms.MCTS.random_playout">random_playout() (algorithms.MCTS static method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
Expand Down
Binary file modified docs/objects.inv
Binary file not shown.
