Merge pull request #27 from sankhaMukherjee/dev
added documentation
sankhaMukherjee committed Jun 5, 2019
2 parents dafce6b + 65f1e05 commit 620442f
Showing 5 changed files with 139 additions and 2 deletions.
48 changes: 48 additions & 0 deletions docs/_modules/lib/agents/Agent_DQN.html
@@ -152,6 +152,54 @@ Source code for lib.agents.Agent_DQN
<span class="kn">import</span> <span class="nn">sys</span>

<div class="viewcode-block" id="Agent_DQN"><a class="viewcode-back" href="../../../lib.agents.html#lib.agents.Agent_DQN.Agent_DQN">[docs]</a><span class="k">class</span> <span class="nc">Agent_DQN</span><span class="p">:</span>
<span class="sd">&#39;&#39;&#39;A class allowing the training of the DQN</span>

<span class="sd"> This class is intended to be used by functions within the ``lib.agents.trainAgents``</span>
<span class="sd"> module.</span>
<span class="sd"> </span>
<span class="sd"> The DQN algorithm was first proposed over some years ago and was slated to be used for </span>
<span class="sd"> improving the state of affairs of traditional reinforcement learning and extending it</span>
<span class="sd"> to deep reinforcement learning. This class allows you to easily set up a DQN learning</span>
<span class="sd"> framework. This class does not care about the type of environment. Just that the action</span>
<span class="sd"> an agent is able to take is one of a finite number of actions, each action at a particular</span>
<span class="sd"> state has an associated Q-value. This algorithm attempts to find theright Q value for each</span>
<span class="sd"> action.</span>

<span class="sd"> The class itself does not care about the specifics of the state, and the Qnetworks that </span>
<span class="sd"> calculate the results. It is up to the user to specify the right environment and the </span>
<span class="sd"> associated networks that will allow the algorithm to solve the Bellman equation.</span>
<span class="sd"> </span>
<span class="sd"> [link to paper](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf)</span>
<span class="sd"> </span>
<span class="sd"> Parameters</span>
<span class="sd"> ----------</span>
<span class="sd"> env : instance of an Env class</span>
<span class="sd"> The environment that will be used for generating the result of a particulat action</span>
<span class="sd"> in the current state</span>
<span class="sd"> memory : instance of the Memory class</span>
<span class="sd"> The environment that will allow one to store and retrieve previously held states that</span>
<span class="sd"> can be used to train upon.</span>
<span class="sd"> qNetworkSlow : neural network instance</span>
<span class="sd"> This is a neural network instance that can be used for converting a state into a</span>
<span class="sd"> set of Q-values. This is the slower version, used for making a prediction, and is </span>
<span class="sd"> never trained. Its parameters are slowly updated over time to slowly allow it to </span>
<span class="sd"> converge to the right value</span>
<span class="sd"> qNetworkFast : neural network instance</span>
<span class="sd"> This is the instance of the faster network that can be used for training Q-learning</span>
<span class="sd"> algorithm. This is the main network that implements the Bellman equation.</span>
<span class="sd"> numActions : int</span>
<span class="sd"> The number of discrete actions that the current environment can accept.</span>
<span class="sd"> gamma : float</span>
<span class="sd"> The discount factor. currently not used</span>
<span class="sd"> device : str, optional</span>
<span class="sd"> the device where you want to run your algorithm, by default &#39;cpu&#39;. If you want to run</span>
<span class="sd"> the optimization of a particular GPU, you may specify that. For example with &#39;cuda:0&#39;</span>
<span class="sd"> </span>
<span class="sd"> Raises</span>
<span class="sd"> ------</span>
<span class="sd"> type</span>
<span class="sd"> [description]</span>
<span class="sd"> &#39;&#39;&#39;</span>

<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">env</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">qNetworkSlow</span><span class="p">,</span> <span class="n">qNetworkFast</span><span class="p">,</span> <span class="n">numActions</span><span class="p">,</span> <span class="n">gamma</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="s1">&#39;cpu&#39;</span><span class="p">):</span>
<span class="sd">&#39;&#39;&#39;A class allowing the training of the DQN</span>
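Given the constructor signature documented above, usage would presumably look something like the sketch below. This is only an illustration, not code from the repository: SimpleEnv, SimpleMemory, makeQNetwork, stateSize, and the gamma value are hypothetical stand-ins, since the commit does not show the Env, Memory, or network interfaces.

    # Minimal usage sketch (assumptions marked); only the Agent_DQN import path
    # and its constructor signature come from the documentation above.
    import torch.nn as nn
    from lib.agents.Agent_DQN import Agent_DQN

    class SimpleEnv:        # hypothetical placeholder for the user-supplied Env class
        pass

    class SimpleMemory:     # hypothetical placeholder for the user-supplied Memory class
        pass

    stateSize, numActions = 8, 4    # assumed sizes for illustration

    def makeQNetwork():
        # Two networks with the same architecture are created: the slow (target)
        # copy used for predictions and the fast copy that is actually trained.
        return nn.Sequential(
            nn.Linear(stateSize, 64), nn.ReLU(),
            nn.Linear(64, numActions))

    agent = Agent_DQN(
        env          = SimpleEnv(),
        memory       = SimpleMemory(),
        qNetworkSlow = makeQNetwork(),   # used for predictions, never trained directly
        qNetworkFast = makeQNetwork(),   # trained; its weights drift into the slow copy
        numActions   = numActions,
        gamma        = 0.99,             # discount factor (documented as currently unused)
        device       = 'cpu')            # or e.g. 'cuda:0' to run on a GPU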
2 changes: 1 addition & 1 deletion docs/_sources/index.rst.txt
@@ -1,5 +1,5 @@
.. src documentation master file, created by
-   sphinx-quickstart on Wed Jun 5 12:34:41 2019.
+   sphinx-quickstart on Wed Jun 5 12:41:08 2019.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
41 changes: 41 additions & 0 deletions docs/lib.agents.html
@@ -189,6 +189,47 @@ Submodules
<dt id="lib.agents.Agent_DQN.Agent_DQN">
<em class="property">class </em><code class="descclassname">lib.agents.Agent_DQN.</code><code class="descname">Agent_DQN</code><span class="sig-paren">(</span><em>env</em>, <em>memory</em>, <em>qNetworkSlow</em>, <em>qNetworkFast</em>, <em>numActions</em>, <em>gamma</em>, <em>device='cpu'</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/lib/agents/Agent_DQN.html#Agent_DQN"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#lib.agents.Agent_DQN.Agent_DQN" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">object</span></code></p>
<p>A class allowing the training of the DQN</p>
<p>This class is intended to be used by functions within the <code class="docutils literal notranslate"><span class="pre">lib.agents.trainAgents</span></code>
module.</p>
<p>The DQN algorithm was first proposed over some years ago and was slated to be used for
improving the state of affairs of traditional reinforcement learning and extending it
to deep reinforcement learning. This class allows you to easily set up a DQN learning
framework. This class does not care about the type of environment. Just that the action
an agent is able to take is one of a finite number of actions, each action at a particular
state has an associated Q-value. This algorithm attempts to find theright Q value for each
action.</p>
<p>The class itself does not care about the specifics of the state, and the Qnetworks that
calculate the results. It is up to the user to specify the right environment and the
associated networks that will allow the algorithm to solve the Bellman equation.</p>
<p>[link to paper](<a class="reference external" href="https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf">https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf</a>)</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
<li><strong>env</strong> (<em>instance of an Env class</em>) – The environment that will be used for generating the result of a particulat action
in the current state</li>
<li><strong>memory</strong> (<em>instance of the Memory class</em>) – The environment that will allow one to store and retrieve previously held states that
can be used to train upon.</li>
<li><strong>qNetworkSlow</strong> (<em>neural network instance</em>) – This is a neural network instance that can be used for converting a state into a
set of Q-values. This is the slower version, used for making a prediction, and is
never trained. Its parameters are slowly updated over time to slowly allow it to
converge to the right value</li>
<li><strong>qNetworkFast</strong> (<em>neural network instance</em>) – This is the instance of the faster network that can be used for training Q-learning
algorithm. This is the main network that implements the Bellman equation.</li>
<li><strong>numActions</strong> (<em>int</em>) – The number of discrete actions that the current environment can accept.</li>
<li><strong>gamma</strong> (<em>float</em>) – The discount factor. currently not used</li>
<li><strong>device</strong> (<em>str</em><em>, </em><em>optional</em>) – the device where you want to run your algorithm, by default ‘cpu’. If you want to run
the optimization of a particular GPU, you may specify that. For example with ‘cuda:0’</li>
</ul>
</td>
</tr>
<tr class="field-even field"><th class="field-name">Raises:</th><td class="field-body"><p class="first last"><code class="xref py py-exc docutils literal notranslate"><span class="pre">type</span></code> – [description]</p>
</td>
</tr>
</tbody>
</table>
<dl class="method">
<dt id="lib.agents.Agent_DQN.Agent_DQN.checkTrainingMode">
<code class="descname">checkTrainingMode</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="reference internal" href="_modules/lib/agents/Agent_DQN.html#Agent_DQN.checkTrainingMode"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#lib.agents.Agent_DQN.Agent_DQN.checkTrainingMode" title="Permalink to this definition"></a></dt>
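The qNetworkSlow/qNetworkFast description above follows the usual DQN pattern: the fast network is trained against Bellman targets computed with the slow (target) network, and the slow network's weights are then nudged toward the fast network's. The sketch below is a generic PyTorch illustration of that update, not the repository's implementation; the function name dqnUpdate, the batch layout, and the gamma and tau values are assumptions.

    # Generic DQN update sketch; not code from this commit.
    import torch
    import torch.nn.functional as F

    def dqnUpdate(qNetworkFast, qNetworkSlow, optimizer, batch, gamma=0.99, tau=1e-3):
        # batch is assumed to hold column tensors: states (N, stateSize),
        # actions (N, 1) long, rewards (N, 1), nextStates (N, stateSize), dones (N, 1)
        states, actions, rewards, nextStates, dones = batch

        # Bellman target: r + gamma * max_a Q_slow(s', a), cut off at terminal states.
        with torch.no_grad():
            qNext    = qNetworkSlow(nextStates).max(dim=1, keepdim=True)[0]
            qTargets = rewards + gamma * qNext * (1 - dones)

        # Q-values the fast network currently assigns to the actions that were taken.
        qExpected = qNetworkFast(states).gather(1, actions)

        loss = F.mse_loss(qExpected, qTargets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Soft update: move the slow network's parameters a small step toward the fast one's.
        for slowParam, fastParam in zip(qNetworkSlow.parameters(), qNetworkFast.parameters()):
            slowParam.data.copy_(tau * fastParam.data + (1.0 - tau) * slowParam.data)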
