diff --git a/README.md b/README.md
index decc491..4115ee4 100644
--- a/README.md
+++ b/README.md
@@ -10,16 +10,33 @@ be easily configured that vary assumptions about: user preferences and item
familiarity; user latent state and its dynamics; and choice models and other
user response behavior. We outline how RecSim offers value to RL and RS
researchers and practitioners, and how it can serve as a vehicle for
-academic-industrial collaboration.
+academic-industrial collaboration. For a detailed description of the RecSim
+architecture, please read [Ie et al.](https://arxiv.org/abs/1909.04847). Please
+cite the paper if you use the code from this repository in your work.
+
+### Bibtex
+
+```
+@article{ie2019recsim,
+ title={RecSim: A Configurable Simulation Platform for Recommender Systems},
+ author={Eugene Ie and Chih-wei Hsu and Martin Mladenov and Vihan Jain and Sanmit Narvekar and Jing Wang and Rui Wu and Craig Boutilier},
+ year={2019},
+ eprint={1909.04847},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG}
+}
+```
+
## Disclaimer
This is not an officially supported Google product.
## What's new
-* **12/13/2019:** Added (abstract) classes for both multi-user environments and
- agents. Added bandit algorithms for generalized linear models.
+
+* **12/13/2019:** Added (abstract) classes for both multi-user environments
+ and agents. Added bandit algorithms for generalized linear models.
## Installation and Sample Usage
@@ -64,10 +81,12 @@ You could also find the simulated logs in /tmp/recsim/episode_logs.tfrecord
## Tutorials
-To get started, please check out our Colab tutorials. In [**RecSim:
-Overview**](recsim/colab/RecSim_Overview.ipynb), we give a brief overview about
-RecSim. We then talk about each configurable component:
-[**environment**](recsim/colab/RecSim_Developing_an_Environment.ipynb) and
+To get started, please check out our Colab tutorials. In
+[**RecSim: Overview**](recsim/colab/RecSim_Overview.ipynb),
+we give a brief overview of RecSim. We then talk about each configurable
+component:
+[**environment**](recsim/colab/RecSim_Developing_an_Environment.ipynb)
+and
[**recommender agent**](recsim/colab/RecSim_Developing_an_Agent.ipynb).
## Documentation
diff --git a/docs/api_docs/python/_redirects.yaml b/docs/api_docs/python/_redirects.yaml
index ebf9185..2bf1a1e 100644
--- a/docs/api_docs/python/_redirects.yaml
+++ b/docs/api_docs/python/_redirects.yaml
@@ -1,3 +1,9 @@
redirects:
+- from: /recsim/api_docs/python/recsim/environments/interest_exploration/FLAGS
+ to: /recsim/api_docs/python/recsim/environments/interest_evolution/FLAGS
+- from: /recsim/api_docs/python/recsim/environments/long_term_satisfaction/FLAGS
+ to: /recsim/api_docs/python/recsim/environments/interest_evolution/FLAGS
- from: /recsim/api_docs/python/recsim/simulator/environment/SingleUserEnvironment
to: /recsim/api_docs/python/recsim/simulator/environment/Environment
+- from: /recsim/api_docs/python/recsim/simulator/runner_lib/FLAGS
+ to: /recsim/api_docs/python/recsim/environments/interest_evolution/FLAGS
diff --git a/docs/api_docs/python/_toc.yaml b/docs/api_docs/python/_toc.yaml
index 15d1b10..8ef5af0 100644
--- a/docs/api_docs/python/_toc.yaml
+++ b/docs/api_docs/python/_toc.yaml
@@ -201,6 +201,8 @@ toc:
path: /recsim/api_docs/python/recsim/environments/interest_evolution/clicked_watchtime_reward
- title: create_environment
path: /recsim/api_docs/python/recsim/environments/interest_evolution/create_environment
+ - title: FLAGS
+ path: /recsim/api_docs/python/recsim/environments/interest_evolution/FLAGS
- title: IEvResponse
path: /recsim/api_docs/python/recsim/environments/interest_evolution/IEvResponse
- title: IEvUserDistributionSampler
diff --git a/docs/api_docs/python/index.md b/docs/api_docs/python/index.md
index 58f1905..d0ecd1b 100644
--- a/docs/api_docs/python/index.md
+++ b/docs/api_docs/python/index.md
@@ -77,6 +77,7 @@
* recsim.document.CandidateSet
* recsim.environments
* recsim.environments.interest_evolution
+* recsim.environments.interest_evolution.FLAGS
* recsim.environments.interest_evolution.IEvResponse
* recsim.environments.interest_evolution.IEvUserDistributionSampler
* recsim.environments.interest_evolution.IEvUserModel
@@ -89,6 +90,7 @@
* recsim.environments.interest_evolution.create_environment
* recsim.environments.interest_evolution.total_clicks_reward
* recsim.environments.interest_exploration
+* recsim.environments.interest_exploration.FLAGS
* recsim.environments.interest_exploration.IEClusterUserSampler
* recsim.environments.interest_exploration.IEDocument
* recsim.environments.interest_exploration.IEResponse
@@ -98,6 +100,7 @@
* recsim.environments.interest_exploration.create_environment
* recsim.environments.interest_exploration.total_clicks_reward
* recsim.environments.long_term_satisfaction
+* recsim.environments.long_term_satisfaction.FLAGS
* recsim.environments.long_term_satisfaction.LTSDocument
* recsim.environments.long_term_satisfaction.LTSDocumentSampler
* recsim.environments.long_term_satisfaction.LTSResponse
@@ -116,6 +119,7 @@
* recsim.simulator.recsim_gym.RecSimGymEnv
* recsim.simulator.runner_lib
* recsim.simulator.runner_lib.EvalRunner
+* recsim.simulator.runner_lib.FLAGS
* recsim.simulator.runner_lib.Runner
* recsim.simulator.runner_lib.TrainRunner
* recsim.simulator.runner_lib.load_gin_configs
diff --git a/docs/api_docs/python/recsim.md b/docs/api_docs/python/recsim.md
index 6b09e9f..fa6dc20 100644
--- a/docs/api_docs/python/recsim.md
+++ b/docs/api_docs/python/recsim.md
@@ -5,7 +5,10 @@
# Module: recsim
+
+
__init__
+recsim.agent.AbstractEpisodicRecommenderAgent(
+ action_space, summary_writer=None
)
-```
+
-Initializes AbstractEpisodicRecommenderAgent.
-
-#### Args:
+
-* `action_space`: A gym.spaces object that specifies the format of
- actions.
-* `summary_writer`: A Tensorflow summary writer to pass to the agent
- for in-agent training statistics in Tensorboard.
+
+
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`summary_writer` + | ++A Tensorflow summary writer to pass to the agent +for in-agent training statistics in Tensorboard. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation=None
)
-```
+
Signals the end of the episode to the agent.
-#### Args:
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of
- the episode.
+
+
+ Args | |
---|---|
+`reward` + | ++A float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
step
+Records the most recent transition and returns the agent's next action. We store the observation of the last time step since we want to store it with the reward. -#### Args: - -* `reward`: The reward received from the agent's most recent action as - a float. -* `observation`: A dictionary that includes the most recent - observations. - -#### Returns: + + +@abc.abstractmethod
+step( + reward, observation ) -``` +
Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- and is used when we save TensorFlow objects by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agent.AbstractHierarchicalAgentLayer(
+ action_space, *base_agent_ctors
)
-```
+
+
+
-Initializes AbstractRecommenderAgent.
+
-#### Args:
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+Records the most recent transition and returns the agent's next action. We store the observation of the last time step since we want to store it with the reward. -#### Args: - -* `reward`: The reward received from the agent's most recent action as - a float. -* `observation`: A dictionary that includes the most recent - observations. - -#### Returns: + + +@abc.abstractmethod
+step( + reward, observation ) -``` +
Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agent.AbstractMultiUserEpisodicRecommenderAgent(
+ action_space
+)
+
-```python
-__init__(action_space)
-```
+
-Initializes AbstractMultiUserEpisodicRecommenderAgent.
+
-#### Args:
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation=None
)
-```
+
Signals the end of the episode to the agent.
-#### Args:
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of
- the episode.
+
+
+ Args | |
---|---|
+`reward` + | ++A float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
step
+Records the most recent transition and returns the agent's next action. We store the observation of the last time step since we want to store it with the reward. -#### Args: - -* `reward`: The reward received from the agent's most recent action as - a float. -* `observation`: A dictionary that includes the most recent - observations. - -#### Returns: + + +@abc.abstractmethod
+step( + reward, observation ) -``` +
Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- and is used when we save TensorFlow objects by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agent.AbstractRecommenderAgent(
+ action_space
+)
+
-```python
-__init__(action_space)
-```
+
-Initializes AbstractRecommenderAgent.
+
-#### Args:
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+Returns a self-contained bundle of the agent's state. @@ -67,60 +80,115 @@ This is used for checkpointing. It will return a dictionary containing all non-TensorFlow objects (to be saved into a file by the caller), and it saves all TensorFlow objects into a checkpoint file. -#### Args: + + +@abc.abstractmethod
+bundle_and_checkpoint( + checkpoint_dir, iteration_number ) -``` +
Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
step
+Records the most recent transition and returns the agent's next action. We store the observation of the last time step since we want to store it with the reward. -#### Args: - -* `reward`: The reward received from the agent's most recent action as - a float. -* `observation`: A dictionary that includes the most recent - observations. - -#### Returns: + + +@abc.abstractmethod
+step( + reward, observation ) -``` +
Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+Restores the agent from a checkpoint. @@ -128,15 +196,48 @@ Restores the agent's Python objects to those specified in bundle_dict, and restores the TensorFlow objects to those specified in the checkpoint_dir. If the checkpoint_dir does not exist, will not reset the agent's state. -#### Args: - -* `checkpoint_dir`: A string that represents the path to the checkpoint - saved by tf.Save. -* `iteration_number`: An integer that represents the checkpoint version - and is used when restoring replay buffer. -* `bundle_dict`: A dict containing additional Python objects owned by - the agent. Each key is an object name and the value is the actual object. + + +@abc.abstractmethod
+unbundle( + checkpoint_dir, iteration_number, bundle_dict ) -``` +
Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.agent_utils.GymSpaceWalker(
+ gym_space, leaf_op
+)
+
+
Gym spaces have nested structure in terms of container spaces (e.g. Dict and
@@ -30,26 +34,18 @@ the proces. E.g., given a gym space of the form Tuple((Box(1), Box(1)) and a
leaf operator f, this class can is used to transform an observation (a, b) to
[f(a), f(b)].
-#### Args:
+
+
+ Args |
---|
__init__
apply_and_flatten
+apply_and_flatten(
+ gym_observations
+)
+
diff --git a/docs/api_docs/python/recsim/agents/agent_utils/epsilon_greedy_exploration.md b/docs/api_docs/python/recsim/agents/agent_utils/epsilon_greedy_exploration.md
index a7ec774..9395afe 100644
--- a/docs/api_docs/python/recsim/agents/agent_utils/epsilon_greedy_exploration.md
+++ b/docs/api_docs/python/recsim/agents/agent_utils/epsilon_greedy_exploration.md
@@ -5,30 +5,27 @@
# recsim.agents.agent_utils.epsilon_greedy_exploration
-
+
+recsim.agents.agent_utils.epsilon_greedy_exploration(
+ state_action_iterator, q_function, epsilon
)
-```
+
Either picks a slate uniformly at random with probability epsilon, or returns a
-slate with maximal Q-value. TODO(mmladenov): more verbose doc. Args:
-state_action_iterator: an iterator over slate, state_action_index tuples.
-q_function: a container holding Q-values of state-action pairs. epsilon:
-probability of random action. Returns: slate: the picked slate. sa_index: the
-index of the picked slate in the Q-value table.
+slate with maximal Q-value. Args: state_action_iterator: an iterator over slate,
+state_action_index tuples. q_function: a container holding Q-values of
+state-action pairs. epsilon: probability of random action. Returns: slate: the
+picked slate. sa_index: the index of the picked slate in the Q-value table.
diff --git a/docs/api_docs/python/recsim/agents/agent_utils/min_count_exploration.md b/docs/api_docs/python/recsim/agents/agent_utils/min_count_exploration.md
index 6967b66..2ab2572 100644
--- a/docs/api_docs/python/recsim/agents/agent_utils/min_count_exploration.md
+++ b/docs/api_docs/python/recsim/agents/agent_utils/min_count_exploration.md
@@ -5,23 +5,22 @@
# recsim.agents.agent_utils.min_count_exploration
-
+
+recsim.agents.agent_utils.min_count_exploration(
+ state_action_iterator, counts_function
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/bandits.md b/docs/api_docs/python/recsim/agents/bandits.md
index 980293a..437f6fc 100644
--- a/docs/api_docs/python/recsim/agents/bandits.md
+++ b/docs/api_docs/python/recsim/agents/bandits.md
@@ -5,7 +5,10 @@
# Module: recsim.agents.bandits
+
+
+recsim.agents.bandits.algorithms.KLUCB(
+ num_arms, params, seed=0
+)
+
+
See "The KL-UCB algorithm for bounded stochastic bandits and beyond" by Garivier
and Cappe.
-__init__
Args | |
---|---|
+`num_arms` + | ++Number of arms. Must be greater than one. + | +
+`params` + | ++A dictionary which includes additional parameters like +optimism_scaling. Default is an empty dictionary. + | +
+`seed` + | ++Random seed for this object. Default is zero. + | +
+get_arm(
+ t
+)
+
get_score
+get_score(
+ t
+)
+
Computes upper confidence bounds of reward / pulls at round t.
@@ -81,31 +98,29 @@ Computes upper confidence bounds of reward / pulls at round t.
View
source
-```python
-@staticmethod
-print()
-```
++@staticmethod
+print() +
set_state
+set_state(
+ pulls, reward
)
-```
+
update
+update(
+ arm, reward
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/bandits/algorithms/MABAlgorithm.md b/docs/api_docs/python/recsim/agents/bandits/algorithms/MABAlgorithm.md
index 4c47765..54cef67 100644
--- a/docs/api_docs/python/recsim/agents/bandits/algorithms/MABAlgorithm.md
+++ b/docs/api_docs/python/recsim/agents/bandits/algorithms/MABAlgorithm.md
@@ -8,55 +8,97 @@
# recsim.agents.bandits.algorithms.MABAlgorithm
-
+
+recsim.agents.bandits.algorithms.MABAlgorithm(
+ num_arms, params, seed=0
+)
+
+
We implement multi-armed bandit algorithms with confidence width tuning proposed
in Hsu et al. https://arxiv.org/abs/1904.02664.
-#### Attributes:
-
-* `pulls`: A numpy array which counts number of pulls of each arm
-* `reward`: A numpy array which sums up reward of each arm
-* `optimism_scaling`: A float specifying the confidence level. Default
- value (1.0) corresponds to the exploration strategy presented in the
- literature. A smaller number means less exploration and more exploitation.
-* `_rng`: An instance of random.RandomState for random number
- generation
-
-__init__
Args | |
---|---|
+`num_arms` + | ++Number of arms. Must be greater than one. + | +
+`params` + | ++A dictionary which includes additional parameters like +optimism_scaling. Default is an empty dictionary. + | +
+`seed` + | ++Random seed for this object. Default is zero. + | +
Attributes | |
---|---|
+`pulls` + | ++A numpy array which counts number of pulls of each arm + | +
+`reward` + | ++A numpy array which sums up reward of each arm + | +
+`optimism_scaling` + | ++A float specifying the confidence level. Default value +(1.0) corresponds to the exploration strategy presented in the literature. +A smaller number means less exploration and more exploitation. + | +
+`_rng` + | ++An instance of random.RandomState for random number generation + | +
+set_state(
+ pulls, reward
)
-```
+
update
+update(
+ arm, reward
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/bandits/algorithms/ThompsonSampling.md b/docs/api_docs/python/recsim/agents/bandits/algorithms/ThompsonSampling.md
index 60c2eb7..9412f90 100644
--- a/docs/api_docs/python/recsim/agents/bandits/algorithms/ThompsonSampling.md
+++ b/docs/api_docs/python/recsim/agents/bandits/algorithms/ThompsonSampling.md
@@ -11,47 +11,60 @@
# recsim.agents.bandits.algorithms.ThompsonSampling
-
+
__init__
+recsim.agents.bandits.algorithms.ThompsonSampling(
+ num_arms, params, seed=0
)
-```
+
-Initializes MABAlgorithm.
+
-#### Args:
+See "Further Optimal Regret Bounds for Thompson Sampling" by Agrawal and Goyal.
-* `num_arms`: Number of arms. Must be greater than one.
-* `params`: A dictionary which includes additional parameters like
- optimism_scaling. Default is an empty dictionary.
-* `seed`: Random seed for this object. Default is zero.
+
+
+ Args | |
---|---|
+`num_arms` + | ++Number of arms. Must be greater than one. + | +
+`params` + | ++A dictionary which includes additional parameters like +optimism_scaling. Default is an empty dictionary. + | +
+`seed` + | ++Random seed for this object. Default is zero. + | +
+get_arm(
+ t
+)
+
get_score
+get_score(
+ t
+)
+
Samples scores from the posterior distribution.
@@ -80,31 +97,29 @@ Samples scores from the posterior distribution.
View
source
-```python
-@staticmethod
-print()
-```
++@staticmethod
+print() +
set_state
+set_state(
+ pulls, reward
)
-```
+
update
+update(
+ arm, reward
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/bandits/algorithms/UCB1.md b/docs/api_docs/python/recsim/agents/bandits/algorithms/UCB1.md
index f3f8e63..663ec53 100644
--- a/docs/api_docs/python/recsim/agents/bandits/algorithms/UCB1.md
+++ b/docs/api_docs/python/recsim/agents/bandits/algorithms/UCB1.md
@@ -11,48 +11,61 @@
# recsim.agents.bandits.algorithms.UCB1
-
+
+recsim.agents.bandits.algorithms.UCB1(
+ num_arms, params, seed=0
+)
+
+
See "Finite-time Analysis of the Multiarmed Bandit Problem" by Auer,
Cesa-Bianchi, and Fischer.
-__init__
Args | |
---|---|
+`num_arms` + | ++Number of arms. Must be greater than one. + | +
+`params` + | ++A dictionary which includes additional parameters like +optimism_scaling. Default is an empty dictionary. + | +
+`seed` + | ++Random seed for this object. Default is zero. + | +
+get_arm(
+ t
+)
+
get_score
+get_score(
+ t
+)
+
Computes upper confidence bounds of reward / pulls at round t.
@@ -81,31 +98,29 @@ Computes upper confidence bounds of reward / pulls at round t.
View
source
-```python
-@staticmethod
-print()
-```
++@staticmethod
+print() +
set_state
+set_state(
+ pulls, reward
)
-```
+
update
+update(
+ arm, reward
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/cluster_bandit_agent.md b/docs/api_docs/python/recsim/agents/cluster_bandit_agent.md
index 99bdb53..023ca7b 100644
--- a/docs/api_docs/python/recsim/agents/cluster_bandit_agent.md
+++ b/docs/api_docs/python/recsim/agents/cluster_bandit_agent.md
@@ -5,7 +5,10 @@
# Module: recsim.agents.cluster_bandit_agent
+
+
+recsim.agents.cluster_bandit_agent.ClusterBanditAgent(
+ observation_space, action_space, alg_ctor=recsim.agents.bandits.algorithms.UCB1,
+ ci_scaling=1.0, random_seed=0, **kwargs
+)
+
+
This agent assumes no knowledge of user's affinity for each topic but receives
@@ -35,35 +39,73 @@ observations of user's past responses for each topic. When creating a slate, it
utilizes a bandit algorithm to pick the best topics. Within the same best topic,
we pick documents with the best document quality scores.
-__init__
Args | |
---|---|
+`observation_space` + | ++Instance of a gym space corresponding to the +observation format. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`alg_ctor` + | ++A class of an MABAlgorithm for exploration, default to UCB1. + | +
+`ci_scaling` + | ++A floating number specifying the scaling of confidence bound. + | +
+`random_seed` + | ++An integer for random seed. + | +
+`**kwargs` + | ++currently unused arguments. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: Unused.
-* `observation`: A dictionary that includes the most recent
- observations and should have the following fields:
- - user: A dictionary representing user's observed state. Assumes
- observation['user']['sufficient_statics'] is a dictionary containing
- base agent impression counts and base agent click counts.
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++Unused. + | +
+`observation` + | ++A dictionary that includes the most recent observations and +should have the following fields: +- user: A dictionary representing user's observed state. Assumes +observation['user']['sufficient_statics'] is a dictionary containing +base agent impression counts and base agent click counts. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agents.cluster_bandit_agent.GreedyClusterAgent(
+ observation_space, action_space, cluster_id, **kwargs
)
-```
-
-Initializes AbstractEpisodicRecommenderAgent.
-
+
-#### Args:
-
-
-* `action_space`: A gym.spaces object that specifies the format of actions.
-* `summary_writer`: A Tensorflow summary writer to pass to the agent
- for in-agent training statistics in Tensorboard.
+
+
+
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`summary_writer` + | ++A Tensorflow summary writer to pass to the agent +for in-agent training statistics in Tensorboard. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
+
-#### Args:
-
-
-* `observation`: numpy array, the environment's initial observation.
-
-
-#### Returns:
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
Returns | |
---|---|
A dictionary containing additional Python objects to be checkpointed by - the experiment. Each key is a string for the object name and the value - is actual object. If the checkpoint directory does not exist, returns - empty dictionary. +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation=None
)
-```
+
Signals the end of the episode to the agent.
-
-#### Args:
-
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of the
- episode.
+
+
+ Args | |
---|---|
+`reward` + | ++An float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it
with the reward.
-#### Args:
-
-
-* `reward`: The reward received from the agent's most recent action as a
- float.
-* `observation`: A dictionary that includes the most recent observations.
-
-
-#### Returns:
-
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agents.dopamine.dqn_agent.DQNAgentRecSim(
+ sess, observation_space, num_actions, stack_size, optimizer_name, eval_mode,
**kwargs
)
-```
-
-Initializes the agent and constructs the components of its graph.
-
-#### Args:
-
-* `sess`: `tf.Session`, for executing ops.
-* `num_actions`: int, number of actions the agent can take at any
- state.
-* `observation_shape`: tuple of ints describing the observation shape.
-* `observation_dtype`: tf.DType, specifies the type of the
- observations. Note that if your inputs are continuous, you should set this
- to tf.float32.
-* `stack_size`: int, number of frames to use in state stack.
-* `network`: tf.Keras.Model, expecting 2 parameters: num_actions,
- network_type. A call to this object will return an instantiation of the
- network provided. The network returned can be run with different inputs to
- create different outputs. See
- dopamine.discrete_domains.atari_lib.NatureDQNNetwork as an example.
-* `gamma`: float, discount factor with the usual RL meaning.
-* `update_horizon`: int, horizon at which updates are performed, the
- 'n' in n-step update.
-* `min_replay_history`: int, number of transitions that should be
- experienced before the agent begins training its value function.
-* `update_period`: int, period between DQN updates.
-* `target_update_period`: int, update period for the target network.
-* `epsilon_fn`: function expecting 4 parameters: (decay_period, step,
- warmup_steps, epsilon). This function should return the epsilon value used
- for exploration during training.
-* `epsilon_train`: float, the value to which the agent's epsilon is
- eventually decayed during training.
-* `epsilon_eval`: float, epsilon used when evaluating the agent.
-* `epsilon_decay_period`: int, length of the epsilon decay schedule.
-* `tf_device`: str, Tensorflow device on which the agent's graph is
- executed.
-* `eval_mode`: bool, True for evaluation and False for training.
-* `use_staging`: bool, when True use a staging area to prefetch the
- next training batch, speeding training up by about 30%.
-* `max_tf_checkpoints_to_keep`: int, the number of TensorFlow
- checkpoints to keep.
-* `optimizer`: `tf.train.Optimizer`, for training the value function.
-* `summary_writer`: SummaryWriter object for outputting training
- statistics. Summary writing disabled if set to None.
-* `summary_writing_frequency`: int, frequency with which summaries will
- be written. Lower values will result in slower training.
-* `allow_partial_reload`: bool, whether we allow reloading a partial
- agent (for instance, only the network parameters).
+
+
+
+
+
+ Args | |
---|---|
+`sess` + | ++`tf.compat.v1.Session`, for executing ops. + | +
+`num_actions` + | ++int, number of actions the agent can take at any state. + | +
+`observation_shape` + | ++tuple of ints describing the observation shape. + | +
+`observation_dtype` + | ++tf.DType, specifies the type of the observations. Note +that if your inputs are continuous, you should set this to tf.float32. + | +
+`stack_size` + | ++int, number of frames to use in state stack. + | +
+`network` + | ++tf.Keras.Model, expecting 2 parameters: num_actions, +network_type. A call to this object will return an instantiation of the +network provided. The network returned can be run with different inputs +to create different outputs. See +dopamine.discrete_domains.atari_lib.NatureDQNNetwork as an example. + | +
+`gamma` + | ++float, discount factor with the usual RL meaning. + | +
+`update_horizon` + | ++int, horizon at which updates are performed, the 'n' in +n-step update. + | +
+`min_replay_history` + | ++int, number of transitions that should be experienced +before the agent begins training its value function. + | +
+`update_period` + | ++int, period between DQN updates. + | +
+`target_update_period` + | ++int, update period for the target network. + | +
+`epsilon_fn` + | ++function expecting 4 parameters: +(decay_period, step, warmup_steps, epsilon). This function should return +the epsilon value used for exploration during training. + | +
+`epsilon_train` + | ++float, the value to which the agent's epsilon is eventually +decayed during training. + | +
+`epsilon_eval` + | ++float, epsilon used when evaluating the agent. + | +
+`epsilon_decay_period` + | ++int, length of the epsilon decay schedule. + | +
+`tf_device` + | ++str, Tensorflow device on which the agent's graph is executed. + | +
+`eval_mode` + | ++bool, True for evaluation and False for training. + | +
+`use_staging` + | ++bool, when True use a staging area to prefetch the next +training batch, speeding training up by about 30%. + | +
+`max_tf_checkpoints_to_keep` + | ++int, the number of TensorFlow checkpoints to +keep. + | +
+`optimizer` + | ++`tf.compat.v1.train.Optimizer`, for training the value +function. + | +
+`summary_writer` + | ++SummaryWriter object for outputting training statistics. +Summary writing disabled if set to None. + | +
+`summary_writing_frequency` + | ++int, frequency with which summaries will be +written. Lower values will result in slower training. + | +
+`allow_partial_reload` + | ++bool, whether we allow reloading a partial agent +(for instance, only the network parameters). + | +
begin_episode
+begin_episode(
+ observation
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
int, the selected action. + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
@@ -122,66 +271,129 @@ This is used for checkpointing. It will return a dictionary containing all
non-TensorFlow objects (to be saved into a file by the caller), and it saves all
TensorFlow objects into a checkpoint file.
-#### Args:
-
-* `checkpoint_dir`: str, directory where TensorFlow objects will be
- saved.
-* `iteration_number`: int, iteration number to use for naming the
- checkpoint file.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, directory where TensorFlow objects will be saved. + | +
+`iteration_number` + | ++int, iteration number to use for naming the checkpoint +file. + | +
Returns | |
---|---|
A dict containing additional Python objects to be checkpointed by the experiment. If the checkpoint directory does not exist, returns None. + | +
end_episode
+end_episode(
+ reward
+)
+
Signals the end of the episode to the agent.
We store the observation of the current time step, which is the last observation
of the episode.
-#### Args:
+
+
+ Args | |
---|---|
+`reward` + | ++float, the last reward from the environment. + | +
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: float, the reward received from the agent's most recent
- action.
-* `observation`: numpy array, the most recent observation.
+
+
+ Args | |
---|---|
+`reward` + | ++float, the reward received from the agent's most recent action. + | +
+`observation` + | ++numpy array, the most recent observation. + | +
Returns | |
---|---|
int, the selected action. + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dictionary
)
-```
+
Restores the agent from a checkpoint.
@@ -189,14 +401,47 @@ Restores the agent's Python objects to those specified in bundle_dictionary, and
restores the TensorFlow objects to those specified in the checkpoint_dir. If the
checkpoint_dir does not exist, will not reset the agent's state.
-#### Args:
-
-* `checkpoint_dir`: str, path to the checkpoint saved by tf.Save.
-* `iteration_number`: int, checkpoint version, used when restoring the
- replay buffer.
-* `bundle_dictionary`: dict, containing additional Python objects owned
- by the agent.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, path to the checkpoint saved by tf.Save. + | +
+`iteration_number` + | ++int, checkpoint version, used when restoring the replay +buffer. + | +
+`bundle_dictionary` + | ++dict, containing additional Python objects owned by +the agent. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__new__
+recsim.agents.dopamine.dqn_agent.DQNNetworkType(
q_values
)
-```
-
-Create new instance of dqn_network(q_values,)
-
-## Properties
-
-q_values
-
-
+
+
+
+ Attributes | |
---|---|
`q_values` | + | +
__init__
+recsim.agents.dopamine.dqn_agent.ObservationAdapter(
+ input_observation_space, stack_size=1
)
-```
+
-Initialize self. See help(type(self)) for accurate signature.
-
-## Properties
+
+
-output_observation_space
Attributes | |
---|---|
+`output_observation_space` + | +The output observation space of the adapter. + | +
+encode(
+ observation
+)
+
Encode user observation and document observations to an image.
diff --git a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/ResponseAdapter.md b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/ResponseAdapter.md
index 3a4c25a..bddbfef 100644
--- a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/ResponseAdapter.md
+++ b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/ResponseAdapter.md
@@ -1,55 +1,65 @@
__init__
+recsim.agents.dopamine.dqn_agent.ResponseAdapter(
+ input_response_space
+)
+
-```python
-__init__(input_response_space)
-```
+
-Init function for ResponseAdapter.
+
+
+ Args | |
---|---|
+`input_response_space` + | ++this is assumed to be an instance of +gym.spaces.Tuple; each element of the tuple is has to be an instance +of gym.spaces.Dict consisting of feature_name: 0-d gym.spaces.Box +(single float) key-value pairs. + | +
Attributes | |
---|---|
`response_dtype` |
-
+ |
`response_names` |
-
+ |
`response_shape` |
-
+ |
+
+encode(
+ responses
+)
+
diff --git a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/recsim_dqn_network.md b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/recsim_dqn_network.md
index 8703e62..00a59e9 100644
--- a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/recsim_dqn_network.md
+++ b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/recsim_dqn_network.md
@@ -5,22 +5,19 @@
# recsim.agents.dopamine.dqn_agent.recsim_dqn_network
-
+
+recsim.agents.dopamine.dqn_agent.recsim_dqn_network(
+ user, doc, scope
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/wrapped_replay_buffer.md b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/wrapped_replay_buffer.md
index b988380..592c297 100644
--- a/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/wrapped_replay_buffer.md
+++ b/docs/api_docs/python/recsim/agents/dopamine/dqn_agent/wrapped_replay_buffer.md
@@ -5,18 +5,19 @@
# recsim.agents.dopamine.dqn_agent.wrapped_replay_buffer
-
+
+recsim.agents.dopamine.dqn_agent.wrapped_replay_buffer(
+ **kwargs
+)
+
diff --git a/docs/api_docs/python/recsim/agents/full_slate_q_agent.md b/docs/api_docs/python/recsim/agents/full_slate_q_agent.md
index ecbfd34..ea7a1f8 100644
--- a/docs/api_docs/python/recsim/agents/full_slate_q_agent.md
+++ b/docs/api_docs/python/recsim/agents/full_slate_q_agent.md
@@ -5,7 +5,10 @@
# Module: recsim.agents.full_slate_q_agent
+
+
__init__
+recsim.agents.full_slate_q_agent.FullSlateQAgent(
+ sess, observation_space, action_space, optimizer_name='', eval_mode=False,
**kwargs
)
-```
+
-Initializes a FullSlateQAgent.
+
-#### Args:
+This is a standard, nondecomposed Q-learning method that treats each slate
+atomically (i.e., holistically) as a single action.
-* `sess`: a Tensorflow session.
-* `observation_space`: A gym.spaces object that specifies the format of
- observations.
-* `action_space`: A gym.spaces object that specifies the format of
- actions.
-* `optimizer_name`: The name of the optimizer.
-* `eval_mode`: A bool for whether the agent is in training or
- evaluation mode.
-* `**kwargs`: Keyword arguments to the DQNAgent.
+
+
+ Args | |
---|---|
+`sess` + | ++a Tensorflow session. + | +
+`observation_space` + | ++A gym.spaces object that specifies the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`optimizer_name` + | ++The name of the optimizer. + | +
+`eval_mode` + | ++A bool for whether the agent is in training or evaluation mode. + | +
+`**kwargs` + | ++Keyword arguments to the DQNAgent. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
+
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+An integer array of size _slate_size, the selected slated, each +element of which is an index in the list of doc_obs. + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
@@ -100,79 +165,147 @@ This is used for checkpointing. It will return a dictionary containing all
non-TensorFlow objects (to be saved into a file by the caller), and it saves all
TensorFlow objects into a checkpoint file.
-#### Args:
-
-* `checkpoint_dir`: str, directory where TensorFlow objects will be
- saved.
-* `iteration_number`: int, iteration number to use for naming the
- checkpoint file.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, directory where TensorFlow objects will be saved. + | +
+`iteration_number` + | ++int, iteration number to use for naming the checkpoint +file. + | +
Returns | |
---|---|
A dict containing additional Python objects to be checkpointed by the experiment. If the checkpoint directory does not exist, returns None. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
Signals the end of the episode to the agent.
We store the observation of the current time step, which is the last observation
of the episode.
-#### Args:
-
-* `reward`: float, the last reward from the environment.
-* `observation`: numpy array, the environment's initial observation.
+
+
+ Args | |
---|---|
+`reward` + | ++float, the last reward from the environment. + | +
+`observation` + | ++numpy array, the environment's initial observation. + | +
step
+step(
+ reward, observation
)
-```
+
Receives observations of environment and returns a slate.
-#### Args:
-
-* `reward`: A double representing the overall reward to the recommended
- slate.
-* `observation`: A dictionary that stores all the observations
- including:
- - user: A list of floats representing the user's observed state
- - doc: A list of observations of document features
- - response: A vector valued response signal that represent user's response
- to each document
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++A double representing the overall reward to the recommended slate. + | +
+`observation` + | ++A dictionary that stores all the observations including: +- user: A list of floats representing the user's observed state +- doc: A list of observations of document features +- response: A vector valued response signal that represent user's +response to each document + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index in the list of document observvations. + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dictionary
)
-```
+
Restores the agent from a checkpoint.
@@ -180,14 +313,47 @@ Restores the agent's Python objects to those specified in bundle_dictionary, and
restores the TensorFlow objects to those specified in the checkpoint_dir. If the
checkpoint_dir does not exist, will not reset the agent's state.
-#### Args:
-
-* `checkpoint_dir`: str, path to the checkpoint saved by tf.Save.
-* `iteration_number`: int, checkpoint version, used when restoring the
- replay buffer.
-* `bundle_dictionary`: dict, containing additional Python objects owned
- by the agent.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, path to the checkpoint saved by tf.Save. + | +
+`iteration_number` + | ++int, checkpoint version, used when restoring the replay +buffer. + | +
+`bundle_dictionary` + | ++dict, containing additional Python objects owned by +the agent. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.greedy_pctr_agent.GreedyPCTRAgent(
+ action_space, belief_state,
+ choice_model=cm.MultinomialLogitChoiceModel({'no_click_mass': 5})
+)
+
+
This agent assumes knowledge of the true underlying choice model. Note that this
@@ -36,36 +40,52 @@ implicitly means it receives observations of the true user and document states.
This agent myopically creates slates with items that have the highest
probability of being clicked under the given choice model.
-__init__
Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions + | +
+`belief_state` + | ++An instantiation of AbstractUserState assumed by the agent + | +
+`choice_model` + | ++An instantiation of AbstractChoiceModel assumed by the agent +Default to a multinomial logit choice model with no_click_mass = 5. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation=None
)
-```
+
Signals the end of the episode to the agent.
-#### Args:
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of
- the episode.
+
+
+ Args | |
---|---|
+`reward` + | ++An float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
findBestDocuments
+findBestDocuments(
+ scores
+)
+
Returns the indices of the highest scores in sorted order.
-#### Args:
+
-* `scores`: A list of floats representing unnormalized document scores
+ Args | |
---|---|
+`scores` + | ++A list of floats representing unnormalized document scores + | +
Returns | |
---|---|
+`sorted_indices` + | ++A list of integers indexing the highest scores, in sorted +order + | +
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: Unused.
-* `observation`: A dictionary that includes the most recent
- observations and should have the following fields:
- - user: A list of floats representing the user's observed state
- - doc: A list of observations of document features
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++Unused. + | +
+`observation` + | ++A dictionary that includes the most recent observations and +should have the following fields: +- user: A list of floats representing the user's observed state +- doc: A list of observations of document features + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- and is used when we save TensorFlow objects by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.layers.abstract_click_bandit.AbstractClickBanditLayer(
+ observation_space, action_space, arm_base_agent_ctors,
+ alg_ctor=recsim.agents.bandits.algorithms.UCB1, ci_scaling=1.0, random_seed=0,
+ **kwargs
+)
+
+
This layer consumes a list of base agents with apriori unknown mean payoffs
@@ -40,36 +45,81 @@ confidence bound as index, the AbstractClickBandit will put the partial slate
of the highest-UCB base agent in first place, then the second, until the slate
is complete.
-__init__
Args | |
---|---|
+`observation_space` + | ++Instance of a gym space corresponding to the +observation format. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`arm_base_agent_ctors` + | ++a list of agent constructors, each agent corresponds +to a bandit arm. + | +
+`alg_ctor` + | ++A class of an MABAlgorithm for exploration, default to UCB1. + | +
+`ci_scaling` + | ++A floating number specifying the scaling of confidence bound. + | +
+`random_seed` + | ++An integer for random seed. + | +
+`**kwargs` + | ++arguments for base agents. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
A dictionary containing additional Python objects to be checkpointed by - the experiment. Each key is a string for the object name and the value - is actual object. If the checkpoint directory does not exist, returns - empty dictionary. +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
-
-
-
+
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it
with the reward.
-#### Args:
-
-
-* `reward`: Unused.
-* `observation`: A dictionary that includes the most recent observations and
- should have the following fields:
- - user: A dictionary representing user's observed state. Assumes
- observation['user']['sufficient_statics'] is a dictionary containing
- base agent impression counts and base agent click counts.
-
-
-#### Returns:
-
+
+
+ Args | |
---|---|
+`reward` + | ++Unused. + | +
+`observation` + | ++A dictionary that includes the most recent observations and +should have the following fields: +- user: A dictionary representing user's observed state. Assumes +observation['user']['sufficient_statics'] is a dictionary containing +base agent impression counts and base agent click counts. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.layers.cluster_click_statistics.ClusterClickStatsLayer(
+ base_agent_ctor, observation_space, action_space, **kwargs
+)
+
+
This module assumes each document belongs to single cluster and we know the
@@ -36,39 +38,56 @@ number of possible clusters. Every time we increase impression count for a
cluster if the agent recommends a document from that cluster. We also increase
click count for a cluster if user responds a click.
-__init__
Args | |
---|---|
+`base_agent_ctor` + | ++a constructor for the base agent. + | +
+`observation_space` + | ++a gym.spaces object specifying the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`**kwargs` + | ++arguments to pass to the downstream agent at construction time. + | +
multi_user
Attributes | |
---|---|
`multi_user` | Returns boolean indicating whether this agent +serves multiple users. |
`observation_space` |
-
+ |
+
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: The reward received from the agent's most recent action as
- a float.
-* `observation`: A dictionary that includes the most recent
- observations.
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.layers.fixed_length_history.FixedLengthHistoryLayer(
+ base_agent_ctor, observation_space, action_space, history_length,
+ remember_user=True, remember_response=True, remember_doc=False, **kwargs
+)
+
+
This module introduces sufficient statistics in the form of a buffer holding the
@@ -39,51 +42,87 @@ are not enough observations to fill the buffer, so they will be filled with
None. Each non-vacuous element of the tuple is an instance of (a subset of)
observation_space.
-__init__
Args | |
---|---|
+`base_agent_ctor` + | ++a constructor for the base agent. + | +
+`observation_space` + | ++a gym.spaces object specifying the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`history_length` + | ++positive integer number of observations to remember. + | +
+`remember_user` + | ++boolean, indicates whether to track +observation_space[\'user\']. + | +
+`remember_response` + | ++boolean, indicates whether to track +observation_space[\'response\']. + | +
+`remember_doc` + | ++boolean, indicates whether to track +observation_space[\'doc\']. + | +
+`**kwargs` + | ++arguments to pass to the downstream agent at construction time. + | +
multi_user
Attributes | |
---|---|
`multi_user` | Returns boolean indicating whether this agent +serves multiple users. |
`observation_space` |
-
+ |
+
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: The reward received from the agent's most recent action as
- a float.
-* `observation`: A dictionary that includes the most recent
- observations.
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.layers.sufficient_statistics.SufficientStatisticsLayer(
+ base_agent_ctor, observation_space, action_space, sufficient_statistics_space,
+ **kwargs
+)
+
+
This module assumes each document belongs to single cluster and we know the
@@ -36,42 +39,64 @@ number of possible clusters. Every time we increase impression count for a
cluster if the agent recommends a document from that cluster. We also increase
click count for a cluster if user responds a click.
-__init__
Args | |
---|---|
+`base_agent_ctor` + | ++a constructor for the base agent. + | +
+`observation_space` + | ++a gym.spaces object specifying the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`sufficient_statistics_space` + | ++a gym.spaces object specifying the format of +the created sufficient statistics. + | +
+`**kwargs` + | ++arguments to pass to the downstream agent at construction time. + | +
multi_user
Attributes | |
---|---|
`multi_user` | Returns boolean indicating whether this agent +serves multiple users. |
`observation_space` |
-
+ |
+
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: The reward received from the agent's most recent action as
- a float.
-* `observation`: A dictionary that includes the most recent
- observations.
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.layers.temporal_aggregation.TemporalAggregationLayer(
+ base_agent_ctor, observation_space, action_space, gamma=0.0,
+ aggregation_period=1, switching_cost=1.0, document_comparison_fcn=None, **kwargs
+)
+
+
A reinforcement learning agent that implements learns a temporally aggregated
@@ -48,49 +52,92 @@ becomes non-Markovian.
The two methods are not mutually exclusive and may be used in conjunction by
specifying a non-unit aggregation_period and a non-zero switching_cost.
-__init__
Args | |
---|---|
+`base_agent_ctor` + | ++a constructor for the base agent. + | +
+`observation_space` + | ++a gym.spaces object specifying the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`gamma` + | ++geometric discounting factor between [0, 1) for the event-level +objective. + | +
+`aggregation_period` + | ++number of time steps to hold an action fixed. + | +
+`switching_cost` + | ++a non-negative penalty for switching an action. + | +
+`document_comparison_fcn` + | ++a function taking two document observations and +returning a Boolean value that indicates if they are considered +equivalent. This is useful for making decisions at a higher abstraction +level (e.g. comparing only document topics). If not provided, this will +default to direct observation equality. + | +
+`**kwargs` + | ++base_agent initialization args. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
step
+step(
+ reward, observation
)
-```
+
Preprocesses the reward and observation and calls base agent.
-#### Args:
-
-* `reward`: The reward received from the agent's most recent action as
- a float.
-* `observation`: A dictionary that includes the most recent
- observations and should have the following fields:
- - user: A NumPy array representing user's observed state. Assumes it is a
- concatenation of topic pull counts and topic click counts.
- - doc: A NumPy array representing observations of document features.
- Assumes it is a concatenation of one-hot encoding of topic_id and
- document quality.
-
-#### Returns:
-
-* `slate`: An integer array of size _slate_size, where each element is
- an index into the list of doc_obs.
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations and +should have the following fields: +- user: A NumPy array representing user's observed state. Assumes it is +a concatenation of topic pull counts and topic click counts. +- doc: A NumPy array representing observations of document features. +Assumes it is a concatenation of one-hot encoding of topic_id and +document quality. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs. + | +
Raises | |
---|---|
+`RuntimeError` + | ++if the agent has to hold a slate with given features fixed +for k steps but the documents needed to reconstruct that slate +become unavailable. + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agents.random_agent.RandomAgent(
+ action_space, random_seed=0
)
-```
+
-Initializes AbstractEpisodicRecommenderAgent.
-
-#### Args:
+
-* `action_space`: A gym.spaces object that specifies the format of
- actions.
-* `summary_writer`: A Tensorflow summary writer to pass to the agent
- for in-agent training statistics in Tensorboard.
+
+
+ Args | |
---|---|
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`summary_writer` + | ++A Tensorflow summary writer to pass to the agent +for in-agent training statistics in Tensorboard. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation=None
)
-```
+
Signals the end of the episode to the agent.
-#### Args:
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of
- the episode.
+
+
+ Args | |
---|---|
+`reward` + | ++An float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: Unused.
-* `observation`: A dictionary that includes the most recent
- observation. Should include 'doc' field that includes observation of all
- candidates.
-
-#### Returns:
+
+
+ Args | |
---|---|
+`reward` + | ++Unused. + | +
+`observation` + | ++A dictionary that includes the most recent observation. +Should include 'doc' field that includes observation of all candidates. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- and is used when we save TensorFlow objects by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint and is +used when we save TensorFlow objects by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
__init__
+recsim.agents.slate_decomp_q_agent.SlateDecompQAgent(
+ sess, observation_space, action_space, optimizer_name='', select_slate_fn=None,
+ compute_target_fn=None, stack_size=1, eval_mode=False, **kwargs
)
-```
-
-Initializes SlateDecompQAgent.
+
-#### Args:
+
-* `sess`: a Tensorflow session.
-* `observation_space`: A gym.spaces object that specifies the format of
- observations.
-* `action_space`: A gym.spaces object that specifies the format of
- actions.
-* `optimizer_name`: The name of the optimizer.
-* `select_slate_fn`: A function that selects the slate.
-* `compute_target_fn`: A function that omputes the target q value.
-* `stack_size`: The stack size for the replay buffer.
-* `eval_mode`: A bool for whether the agent is in training or
- evaluation mode.
-* `**kwargs`: Keyword arguments to the DQNAgent.
+
+
+ Args | |
---|---|
+`sess` + | ++a Tensorflow session. + | +
+`observation_space` + | ++A gym.spaces object that specifies the format of +observations. + | +
+`action_space` + | ++A gym.spaces object that specifies the format of actions. + | +
+`optimizer_name` + | ++The name of the optimizer. + | +
+`select_slate_fn` + | ++A function that selects the slate. + | +
+`compute_target_fn` + | ++A function that omputes the target q value. + | +
+`stack_size` + | ++The stack size for the replay buffer. + | +
+`eval_mode` + | ++A bool for whether the agent is in training or evaluation mode. + | +
+`**kwargs` + | ++Keyword arguments to the DQNAgent. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
+
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+An integer array of size _slate_size, the selected slated, each +element of which is an index in the list of doc_obs. + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
@@ -100,76 +183,143 @@ This is used for checkpointing. It will return a dictionary containing all
non-TensorFlow objects (to be saved into a file by the caller), and it saves all
TensorFlow objects into a checkpoint file.
-#### Args:
-
-* `checkpoint_dir`: str, directory where TensorFlow objects will be
- saved.
-* `iteration_number`: int, iteration number to use for naming the
- checkpoint file.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, directory where TensorFlow objects will be saved. + | +
+`iteration_number` + | ++int, iteration number to use for naming the checkpoint +file. + | +
Returns | |
---|---|
A dict containing additional Python objects to be checkpointed by the experiment. If the checkpoint directory does not exist, returns None. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
Signals the end of the episode to the agent.
We store the observation of the current time step, which is the last observation
of the episode.
-#### Args:
-
-* `reward`: float, the last reward from the environment.
-* `observation`: numpy array, the environment's initial observation.
+
+
+ Args | |
---|---|
+`reward` + | ++float, the last reward from the environment. + | +
+`observation` + | ++numpy array, the environment's initial observation. + | +
step
+step(
+ reward, observation
)
-```
+
Records the transition and returns the agent's next action.
It uses document-level user response instead of overral reward as the reward of
the problem.
-#### Args:
-
-* `reward`: unused.
-* `observation`: a space.Dict that includes observation of the user
- state observation, documents and user responses.
+
+
+ Args | |
---|---|
+`reward` + | ++unused. + | +
+`observation` + | ++a space.Dict that includes observation of the user state +observation, documents and user responses. + | +
Returns | |
---|---|
Array, the selected action. + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dictionary
)
-```
+
Restores the agent from a checkpoint.
@@ -177,14 +327,47 @@ Restores the agent's Python objects to those specified in bundle_dictionary, and
restores the TensorFlow objects to those specified in the checkpoint_dir. If the
checkpoint_dir does not exist, will not reset the agent's state.
-#### Args:
-
-* `checkpoint_dir`: str, path to the checkpoint saved by tf.Save.
-* `iteration_number`: int, checkpoint version, used when restoring the
- replay buffer.
-* `bundle_dictionary`: dict, containing additional Python objects owned
- by the agent.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++str, path to the checkpoint saved by tf.Save. + | +
+`iteration_number` + | ++int, checkpoint version, used when restoring the replay +buffer. + | +
+`bundle_dictionary` + | ++dict, containing additional Python objects owned by +the agent. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
+recsim.agents.slate_decomp_q_agent.compute_probs_tf(
+ slate, scores_tf, score_no_click_tf
)
-```
+
This assumes scores are normalizable, e.g., scores cannot be negative.
-#### Args:
+
+
+ Args | |
---|---|
+`slate` + | ++a list of integers that represents the video slate. + | +
+`scores_tf` + | ++a float tensor that stores the scores of all documents. + | +
+`score_no_click_tf` + | ++a float tensor that represents the score for the action +of picking no document. + | +
Returns | |
---|---|
+A float tensor that represents the probabilities of selecting each document +in the slate. + | +
+recsim.agents.slate_decomp_q_agent.compute_target_greedy_q(
+ reward, gamma, next_actions, next_q_values, next_states, terminals
)
-```
+
This algorithm corresponds to the method "GT" in Ie et al.
https://arxiv.org/abs/1905.12767..
-#### Args:
+
+
+ Args | |
---|---|
+`reward` + | ++[batch_size] tensor, the immediate reward. + | +
+`gamma` + | ++float, discount factor with the usual RL meaning. + | +
+`next_actions` + | ++[batch_size, slate_size] tensor, the next slate. + | +
+`next_q_values` + | ++[batch_size, num_of_documents] tensor, the q values of the +documents in the next step. + | +
+`next_states` + | ++[batch_size, 1 + num_of_documents] tensor, the features for the +user and the docuemnts in the next step. + | +
+`terminals` + | ++[batch_size] tensor, indicating if this is a terminal step. + | +
Returns | |
---|---|
[batch_size] tensor, the target q values. + | +
+recsim.agents.slate_decomp_q_agent.compute_target_optimal_q(
+ reward, gamma, next_actions, next_q_values, next_states, terminals
)
-```
+
This algorithm corresponds to the method "OT" in Ie et al.
https://arxiv.org/abs/1905.12767..
-#### Args:
+
+
+ Args | |
---|---|
+`reward` + | ++[batch_size] tensor, the immediate reward. + | +
+`gamma` + | ++float, discount factor with the usual RL meaning. + | +
+`next_actions` + | ++[batch_size, slate_size] tensor, the next slate. + | +
+`next_q_values` + | ++[batch_size, num_of_documents] tensor, the q values of the +documents in the next step. + | +
+`next_states` + | ++[batch_size, 1 + num_of_documents] tensor, the features for the +user and the docuemnts in the next step. + | +
+`terminals` + | ++[batch_size] tensor, indicating if this is a terminal step. + | +
Returns | |
---|---|
[batch_size] tensor, the target q values. + | +
+recsim.agents.slate_decomp_q_agent.compute_target_sarsa(
+ reward, gamma, next_actions, next_q_values, next_states, terminals
)
-```
+
-#### Args:
+
+
+ Args | |
---|---|
+`reward` + | ++[batch_size] tensor, the immediate reward. + | +
+`gamma` + | ++float, discount factor with the usual RL meaning. + | +
+`next_actions` + | ++[batch_size, slate_size] tensor, the next slate. + | +
+`next_q_values` + | ++[batch_size, num_of_documents] tensor, the q values of the +documents in the next step. + | +
+`next_states` + | ++[batch_size, 1 + num_of_documents] tensor, the features for the +user and the docuemnts in the next step. + | +
+`terminals` + | ++[batch_size] tensor, indicating if this is a terminal step. + | +
Returns | |
---|---|
[batch_size] tensor, the target q values. + | +
+recsim.agents.slate_decomp_q_agent.compute_target_topk_q(
+ reward, gamma, next_actions, next_q_values, next_states, terminals
)
-```
+
This algorithm corresponds to the method "TT" in Ie et al.
https://arxiv.org/abs/1905.12767.
-#### Args:
+
+
+ Args | |
---|---|
+`reward` + | ++[batch_size] tensor, the immediate reward. + | +
+`gamma` + | ++float, discount factor with the usual RL meaning. + | +
+`next_actions` + | ++[batch_size, slate_size] tensor, the next slate. + | +
+`next_q_values` + | ++[batch_size, num_of_documents] tensor, the q values of the +documents in the next step. + | +
+`next_states` + | ++[batch_size, 1 + num_of_documents] tensor, the features for the +user and the docuemnts in the next step. + | +
+`terminals` + | ++[batch_size] tensor, indicating if this is a terminal step. + | +
Returns | |
---|---|
[batch_size] tensor, the target q values. + | +
+recsim.agents.slate_decomp_q_agent.create_agent(
+ agent_name, sess, **kwargs
)
-```
+
diff --git a/docs/api_docs/python/recsim/agents/slate_decomp_q_agent/score_documents.md b/docs/api_docs/python/recsim/agents/slate_decomp_q_agent/score_documents.md
index 919e770..62dc055 100644
--- a/docs/api_docs/python/recsim/agents/slate_decomp_q_agent/score_documents.md
+++ b/docs/api_docs/python/recsim/agents/slate_decomp_q_agent/score_documents.md
@@ -5,44 +5,84 @@
# recsim.agents.slate_decomp_q_agent.score_documents
-
+
+recsim.agents.slate_decomp_q_agent.score_documents(
+ user_obs, doc_obs, no_click_mass=1.0, is_mnl=False, min_normalizer=-1.0
)
-```
+
Similar to score_documents_tf but works on NumPy objects.
-#### Args:
+
+
+ Args | |
---|---|
+`user_obs` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the candidate set. + | +
+`no_click_mass` + | ++a float indicating the mass given to a no click option + | +
+`is_mnl` + | ++whether to use a multinomial logit model instead of a multinomial +proportional model. + | +
+`min_normalizer` + | ++A float (<= 0) used to offset the scores to be positive when +using multinomial proportional model. + | +
Returns | |
---|---|
+A float array that stores unnormalzied scores of documents and a float +number that represents the score for the action of picking no document. + | +
+recsim.agents.slate_decomp_q_agent.score_documents_tf(
+ user_obs, doc_obs, no_click_mass=1.0, is_mnl=False, min_normalizer=-1.0
)
-```
+
@@ -32,19 +28,63 @@ This implements both multinomial proportional model and multinormial logit model
given some parameters. We also assume scores are based on inner products of
user_obs and doc_obs.
-#### Args:
+
+
+ Args | |
---|---|
+`user_obs` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the candidate set. + | +
+`no_click_mass` + | ++a float indicating the mass given to a no click option + | +
+`is_mnl` + | ++whether to use a multinomial logit model instead of a multinomial +proportional model. + | +
+`min_normalizer` + | ++A float (<= 0) used to offset the scores to be positive when +using multinomial proportional model. + | +
Returns | |
---|---|
+A float tensor that stores unnormalzied scores of documents and a float +tensor that represents the score for the action of picking no document. + | +
+recsim.agents.slate_decomp_q_agent.select_slate_greedy(
+ slate_size, s_no_click, s, q
)
-```
+
This algorithm corresponds to the method "GS" in Ie et al.
https://arxiv.org/abs/1905.12767.
-#### Args:
+
+
+ Args | |
---|---|
+`slate_size` + | ++int, the size of the recommendation slate. + | +
+`s_no_click` + | ++float tensor, the score for not clicking any document. + | +
+`s` + | ++[num_of_documents] tensor, the scores for clicking documents. + | +
+`q` + | ++[num_of_documents] tensor, the predicted q values for documents. + | +
Returns | |
---|---|
[slate_size] tensor, the selected slate. + | +
+recsim.agents.slate_decomp_q_agent.select_slate_optimal(
+ slate_size, s_no_click, s, q
)
-```
+
This algorithm corresponds to the method "OS" in Ie et al.
https://arxiv.org/abs/1905.12767.
-#### Args:
+
+
+ Args | |
---|---|
+`slate_size` + | ++int, the size of the recommendation slate. + | +
+`s_no_click` + | ++float tensor, the score for not clicking any document. + | +
+`s` + | ++[num_of_documents] tensor, the scores for clicking documents. + | +
+`q` + | ++[num_of_documents] tensor, the predicted q values for documents. + | +
Returns | |
---|---|
[slate_size] tensor, the selected slate. + | +
+recsim.agents.slate_decomp_q_agent.select_slate_topk(
+ slate_size, s_no_click, s, q
)
-```
+
This algorithm corresponds to the method "TS" in Ie et al.
https://arxiv.org/abs/1905.12767.
-#### Args:
+
+
+ Args | |
---|---|
+`slate_size` + | ++int, the size of the recommendation slate. + | +
+`s_no_click` + | ++float tensor, the score for not clicking any document. + | +
+`s` + | ++[num_of_documents] tensor, the scores for clicking documents. + | +
+`q` + | ++[num_of_documents] tensor, the predicted q values for documents. + | +
Returns | |
---|---|
[slate_size] tensor, the selected slate. + | +
+recsim.agents.tabular_q_agent.TabularQAgent(
+ observation_space, action_space, eval_mode=False, ignore_response=True,
+ discretization_bounds=(0.0, 10.0), number_bins=100,
+ exploration_policy='epsilon_greedy', exploration_temperature=0.99,
+ learning_rate=0.1, gamma=0.99, ordinal_slates=False, **kwargs
+)
+
+
This agent provides a tabular implementation of the Q-learning algorithm. To
@@ -46,63 +52,130 @@ Q-function. Producing ground truth Q-functions is the main intended use of this
agent, since discretization is prohibitively expensive in high-dimensional
environments.
-__init__
Args | |
---|---|
+`observation_space` + | ++a gym.spaces object specifying the format of +observations. + | +
+`action_space` + | ++a gym.spaces object that specifies the format of actions. + | +
+`eval_mode` + | ++Boolean indicating whether the agent is in training or eval +mode. + | +
+`ignore_response` + | ++Boolean indicating whether the agent should ignore the +response part of the observation. + | +
+`discretization_bounds` + | ++pair of real numbers indicating the min and max +value for continuous attributes discretization. Values below the min +will all be grouped in the first bin, while values above the max will +all be grouped in the last bin. See the documentation of numpy.digitize +for further details. + | +
+`number_bins` + | ++positive integer number of bins used to discretize continuous +attributes. + | +
+`exploration_policy` + | ++either one of ['epsilon_greedy', 'min_count'] or a +custom function. +function. + | +
+`exploration_temperature` + | ++a real number passed as parameter to the +exploration policy. + | +
+`learning_rate` + | ++a real number between 0 and 1 indicating how much to update +Q-values, i.e. Q_t+1(s,a) = (1 - learning_rate) * Q_t(s, a) ++ learning_rate * (R(s,a) + ...). + | +
+`gamma` + | ++real value between 0 and 1 indicating the discount factor of the +MDP. + | +
+`ordinal_slates` + | ++boolean indicating whether slate ordering matters, e.g. +whether the slates (1, 2) and (2, 1) should be considered different +actions. Using ordinal slates increases complexity factorially. + | +
+`**kwargs` + | ++additional arguments like eval_mode. + | +
multi_user
Attributes | |
---|---|
+`multi_user` + | +Returns boolean indicating whether this agent serves multiple users. + | +
+begin_episode(
+ observation=None
+)
+
Returns the agent's first action for this episode.
-#### Args:
+
-* `observation`: numpy array, the environment's initial observation.
+ Args | |
---|---|
+`observation` + | ++numpy array, the environment's initial observation. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
bundle_and_checkpoint
+bundle_and_checkpoint(
+ checkpoint_dir, iteration_number
)
-```
+
Returns a self-contained bundle of the agent's state.
-#### Args:
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string for the directory where objects will be saved. + | +
+`iteration_number` + | ++An integer of iteration number to use for naming the +checkpoint file. + | +
Returns | |
---|---|
+A dictionary containing additional Python objects to be checkpointed by +the experiment. Each key is a string for the object name and the value +is actual object. If the checkpoint directory does not exist, returns +empty dictionary. + | +
end_episode
+end_episode(
+ reward, observation
)
-```
+
Signals the end of the episode to the agent.
-#### Args:
-
-* `reward`: An float that is the last reward from the environment.
-* `observation`: numpy array that represents the last observation of
- the episode.
+
+
+ Args | |
---|---|
+`reward` + | ++An float that is the last reward from the environment. + | +
+`observation` + | ++numpy array that represents the last observation of the +episode. + | +
step
+step(
+ reward, observation
)
-```
+
Records the most recent transition and returns the agent's next action.
We store the observation of the last time step since we want to store it with
the reward.
-#### Args:
-
-* `reward`: The reward received from the agent's most recent action as
- a float.
-* `observation`: A dictionary that includes the most recent
- observations and should have the following fields:
- - user: A NumPy array representing user's observed state. Assumes it is a
- concatenation of topic pull counts and topic click counts.
- - doc: A NumPy array representing observations of document features.
- Assumes it is a concatenation of one-hot encoding of topic_id and
- document quality.
+
+
+ Args | |
---|---|
+`reward` + | ++The reward received from the agent's most recent action as a +float. + | +
+`observation` + | ++A dictionary that includes the most recent observations and +should have the following fields: +- user: A NumPy array representing user's observed state. Assumes it is +a concatenation of topic pull counts and topic click counts. +- doc: A NumPy array representing observations of document features. +Assumes it is a concatenation of one-hot encoding of topic_id and +document quality. + | +
Returns | |
---|---|
+`slate` + | ++An integer array of size _slate_size, where each element is an +index into the list of doc_obs + | +
Raises | |
---|---|
+`ValueError` + | ++if reward is not in [0, 1]. + | +
unbundle
+unbundle(
+ checkpoint_dir, iteration_number, bundle_dict
)
-```
+
Restores the agent from a checkpoint.
-#### Args:
-
-* `checkpoint_dir`: A string that represents the path to the checkpoint
- saved by tf.Save.
-* `iteration_number`: An integer that represents the checkpoint version
- and is used when restoring replay buffer.
-* `bundle_dict`: A dict containing additional Python objects owned by
- the agent. Each key is an object name and the value is the actual object.
+
+
+ Args | |
---|---|
+`checkpoint_dir` + | ++A string that represents the path to the checkpoint saved +by tf.Save. + | +
+`iteration_number` + | ++An integer that represents the checkpoint version and is +used when restoring replay buffer. + | +
+`bundle_dict` + | ++A dict containing additional Python objects owned by the +agent. Each key is an object name and the value is the actual object. + | +
Returns | |
---|---|
bool, True if unbundling was successful. + | +
Attributes | |
---|---|
`score_no_click` |
-
+ |
`scores` | + + | +
+Returns selected index of document in the slate. -#### Returns: - -* `selected_index`: a integer indicating which item was chosen, or None - if none were selected. + + +@abc.abstractmethod
+choose_item() +
Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+Computes unnormalized scores of documents in the slate given user state. -#### Args: - -* `user_state`: An instance of AbstractUserState. -* `doc_obs`: A numpy array that represents the observation of all - documents in the slate. - -#### Attributes: + + +@abc.abstractmethod
+score_documents( + user_state, doc_obs ) -``` +
Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.CascadeChoiceModel(
+ choice_features
+)
+
-#### Raises:
+
-* `ValueError`: if either attention_prob or base_attention_prob is
- invalid.
+
-__init__
Raises | |
---|---|
+`ValueError` + | ++if either attention_prob or base_attention_prob is invalid. + | +
Attributes | |
---|---|
`attention_prob` | The probability of examining a document i +given document i - 1 not clicked. |
`score_scaling` | +A multiplicative factor to convert score of document i to the click +probability of examined document i. |
`score_no_click` | +
-
+ |
`scores` |
-
+ |
+
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+Computes unnormalized scores of documents in the slate given user state. -#### Args: - -* `user_state`: An instance of AbstractUserState. -* `doc_obs`: A numpy array that represents the observation of all - documents in the slate. - -#### Attributes: + + +@abc.abstractmethod
+score_documents( + user_state, doc_obs ) -``` +
Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.ExponentialCascadeChoiceModel(
+ choice_features
+)
+
+
Clicks the item at position i according to p(i) = attention_prob * score_scaling
* exp(score(i)) by going through the slate in order, and stopping once an item
has been clicked.
-__init__
Attributes | |
---|---|
`score_no_click` |
-
+ |
`scores` |
-
+ |
+
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+score_documents(
+ user_state, doc_obs
)
-```
+
Computes unnormalized scores of documents in the slate given user state.
-#### Args:
-
-* `user_state`: An instance of AbstractUserState.
-* `doc_obs`: A numpy array that represents the observation of all
- documents in the slate.
-
-#### Attributes:
+
+
+ Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.MultinomialLogitChoiceModel(
+ choice_features
+)
+
+
Samples item x in scores according to p(x) = exp(x) / Sum_{y in scores} exp(y)
-#### Args:
-
-* `choice_features`: a dict that stores the features used in choice
- model: `no_click_mass`: a float indicating the mass given to a no click
- option.
-
-__init__
Args | |
---|---|
+`choice_features` + | ++a dict that stores the features used in choice model: +`no_click_mass`: a float indicating the mass given to a no click option. + | +
Attributes | |
---|---|
`score_no_click` |
-
+ |
`scores` |
-
+ |
+
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+score_documents(
+ user_state, doc_obs
)
-```
+
Computes unnormalized scores of documents in the slate given user state.
-#### Args:
-
-* `user_state`: An instance of AbstractUserState.
-* `doc_obs`: A numpy array that represents the observation of all
- documents in the slate.
-
-#### Attributes:
+
+
+ Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.MultinomialProportionalChoiceModel(
+ choice_features
+)
+
+
Samples item x in scores according to p(x) = x - min_normalizer / sum(x -
min_normalizer)
-#### Attributes:
-
-* `min_normalizer`: A float (<= 0) used to offset the scores to be
- positive. Specifically, if the scores have negative elements, then they do
- not form a valid probability distribution for sampling. Subtracting the
- least expected element is one heuristic for normalization.
-* `no_click_mass`: An optional float indicating the mass given to a no
- click option
-
-__init__
Attributes | |
---|---|
`min_normalizer` | A float (<= 0) used to offset the scores +to be positive. Specifically, if the scores have negative elements, then they do +not form a valid probability distribution for sampling. Subtracting the least +expected element is one heuristic for normalization. |
+`no_click_mass` | An optional float indicating the mass given to a no +click option |
`score_no_click` | -## Properties + |
`scores` |
-
-
- |
+
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+score_documents(
+ user_state, doc_obs
)
-```
+
Computes unnormalized scores of documents in the slate given user state.
-#### Args:
-
-* `user_state`: An instance of AbstractUserState.
-* `doc_obs`: A numpy array that represents the observation of all
- documents in the slate.
-
-#### Attributes:
+
+
+ Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
Attributes | |
---|---|
`score_no_click` |
-
+ |
`scores` | + + | +
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+Computes unnormalized scores of documents in the slate given user state. -#### Args: - -* `user_state`: An instance of AbstractUserState. -* `doc_obs`: A numpy array that represents the observation of all - documents in the slate. - -#### Attributes: + + +@abc.abstractmethod
+score_documents( + user_state, doc_obs ) -``` +
Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.ProportionalCascadeChoiceModel(
+ choice_features
+)
+
+
Clicks the item at position i according to attention_prob * score_scaling *
(score(i) - min_normalizer) by going through the slate in order, and stopping
once an item has been clicked.
-__init__
Attributes | |
---|---|
`score_no_click` |
-
+ |
`scores` |
-
+ |
+
+choose_item()
+
Returns selected index of document in the slate.
-#### Returns:
-
-* `selected_index`: a integer indicating which item was chosen, or None
- if none were selected.
+
+
+ Returns | |
---|---|
+`selected_index` + | ++a integer indicating which item was chosen, or None if +none were selected. + | +
score_documents
+score_documents(
+ user_state, doc_obs
)
-```
+
Computes unnormalized scores of documents in the slate given user state.
-#### Args:
-
-* `user_state`: An instance of AbstractUserState.
-* `doc_obs`: A numpy array that represents the observation of all
- documents in the slate.
-
-#### Attributes:
+
+
+ Args | |
---|---|
+`user_state` + | ++An instance of AbstractUserState. + | +
+`doc_obs` + | ++A numpy array that represents the observation of all documents in +the slate. + | +
Attributes | |
---|---|
+`scores` + | ++A numpy array that stores the scores of all documents. + | +
+`score_no_click` + | ++A float that represents the score for the action of +picking no document. + | +
+recsim.choice_model.softmax(
+ vector
+)
+
diff --git a/docs/api_docs/python/recsim/document.md b/docs/api_docs/python/recsim/document.md
index e5cbf8d..d65332b 100644
--- a/docs/api_docs/python/recsim/document.md
+++ b/docs/api_docs/python/recsim/document.md
@@ -5,7 +5,10 @@
# Module: recsim.document
+
+
__init__
+recsim.document.AbstractDocument(
+ doc_id
+)
+
-View
-source
-
-```python
-__init__(doc_id)
-```
-
-Initialize self. See help(type(self)) for accurate signature.
+
## Methods
@@ -42,9 +36,10 @@ Initialize self. See help(type(self)) for accurate signature.
View
source
-```python
-create_observation()
-```
++Returns observable properties of this document as a float array. @@ -53,9 +48,9 @@ Returns observable properties of this document as a float array. View source -```python -doc_id() -``` +@abc.abstractmethod
+create_observation() +
+doc_id()
+
Returns the document ID.
@@ -64,9 +59,14 @@ Returns the document ID.
View
source
-```python
-@classmethod
-observation_space(cls)
-```
++Gym space that defines how documents are represented. + +## Class Variables + +* `NUM_FEATURES = None` diff --git a/docs/api_docs/python/recsim/document/AbstractDocumentSampler.md b/docs/api_docs/python/recsim/document/AbstractDocumentSampler.md index f9054e4..4a462f0 100644 --- a/docs/api_docs/python/recsim/document/AbstractDocumentSampler.md +++ b/docs/api_docs/python/recsim/document/AbstractDocumentSampler.md @@ -1,7 +1,6 @@@classmethod
+@abc.abstractmethod
+observation_space() +
__init__
+recsim.document.AbstractDocumentSampler(
+ doc_ctor, seed=0
)
-```
-
-Initialize self. See help(type(self)) for accurate signature.
+
-## Properties
+
+
-num_clusters
Attributes | |
---|---|
+`num_clusters` + | +Returns the number of document clusters. Returns 0 if not applicable. + | +
+get_doc_ctor()
+
Returns the constructor/class of the documents that will be sampled.
@@ -64,18 +62,19 @@ Returns the constructor/class of the documents that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_document
+Samples and return an instantiation of AbstractDocument. @@ -84,11 +83,10 @@ Samples and return an instantiation of AbstractDocument. View source -```python -update_state( - documents, - responses +@abc.abstractmethod
+sample_document() +
+update_state(
+ documents, responses
)
-```
+
Update document state (if needed) given user's (or users') responses.
diff --git a/docs/api_docs/python/recsim/document/CandidateSet.md b/docs/api_docs/python/recsim/document/CandidateSet.md
index 1626daa..b50f0d8 100644
--- a/docs/api_docs/python/recsim/document/CandidateSet.md
+++ b/docs/api_docs/python/recsim/document/CandidateSet.md
@@ -13,35 +13,26 @@
# recsim.document.CandidateSet
-
+
+recsim.document.CandidateSet()
+
+
The candidate set is represented as a hashmap (dictionary), with documents
indexed by their document ID.
-__init__
add_document
+add_document(
+ document
+)
+
Adds a document to the candidate set.
@@ -60,9 +53,9 @@ Adds a document to the candidate set.
View
source
-```python
-create_observation()
-```
+
+create_observation()
+
Returns a dictionary of observable features of documents.
@@ -71,9 +64,9 @@ Returns a dictionary of observable features of documents.
View
source
-```python
-get_all_documents()
-```
+
+get_all_documents()
+
Returns all documents.
@@ -82,39 +75,64 @@ Returns all documents.
View
source
-```python
-get_documents(document_ids)
-```
+
+get_documents(
+ document_ids
+)
+
Gets the documents associated with the specified document IDs.
-#### Args:
+
+
+ Args | |
---|---|
+`document_ids` + | ++an array representing indices into the candidate set. +Indices can be integers or string-encoded integers. + | +
Returns | |
---|---|
+(documents) an ordered list of AbstractDocuments associated with the +document ids. + | +
observation_space
+observation_space()
+
remove_document
+remove_document(
+ document
+)
+
Removes a document from the set (to simulate a changing corpus).
@@ -123,8 +141,8 @@ Removes a document from the set (to simulate a changing corpus).
View
source
-```python
-size()
-```
+
+size()
+
Returns an integer, the number of documents in this candidate set.
diff --git a/docs/api_docs/python/recsim/environments.md b/docs/api_docs/python/recsim/environments.md
index 8848e77..d490527 100644
--- a/docs/api_docs/python/recsim/environments.md
+++ b/docs/api_docs/python/recsim/environments.md
@@ -5,7 +5,10 @@
# Module: recsim.environments
+
+
+recsim.environments.interest_evolution.FLAGS(
+ argv, known_only=False
+)
+
+
+
+
+A 'FlagValues' can then scan command line arguments, passing flag arguments
+through to the 'Flag' objects that it owns. It also provides easy access to the
+flag values. Typically only one 'FlagValues' object is needed by an application:
+flags.FLAGS
+
+This class is heavily overloaded:
+
+'Flag' objects are registered via __setitem__: FLAGS['longname'] = x # register
+a new flag
+
+The .value attribute of the registered 'Flag' objects can be accessed as
+attributes of this 'FlagValues' object, through __getattr__. Both the long and
+short name of the original 'Flag' objects can be used to access its value:
+FLAGS.longname # parsed flag value FLAGS.x # parsed flag value (short name)
+
+Command line arguments are scanned and passed to the registered 'Flag' objects
+through the __call__ method. Unparsed arguments, including
+argv[0](e.g. the program name) are returned. argv = FLAGS(sys.argv) # scan
+command line arguments
+
+The original registered Flag objects can be retrieved through the use of the
+dictionary-like operator, __getitem__: x = FLAGS['longname'] # access the
+registered Flag object
+
+The str() operator of a 'FlagValues' object provides help for all of the
+registered 'Flag' objects.
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/IEvResponse.md b/docs/api_docs/python/recsim/environments/interest_evolution/IEvResponse.md
index 7a17de9..8346e36 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/IEvResponse.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/IEvResponse.md
@@ -10,55 +10,114 @@
# recsim.environments.interest_evolution.IEvResponse
-
+
__init__
+recsim.environments.interest_evolution.IEvResponse(
+ clicked=False, watch_time=0.0, liked=False, quality=0.0, cluster_id=0.0
)
-```
+
-Creates a new user response for a video.
+
-#### Args:
+
+
+ Args | |
---|---|
+`clicked` + | ++A boolean indicating whether the video was clicked + | +
+`watch_time` + | ++A float for fraction of the video watched + | +
+`liked` + | ++A boolean indicating whether the video was liked + | +
+`quality` + | ++A float for document quality + | +
+`cluster_id` + | ++a integer for the cluster ID of the document. + | +
Attributes | |
---|---|
+`clicked` + | ++A boolean indicating whether the video was clicked. + | +
+`watch_time` + | ++A float for fraction of the video watched. + | +
+`liked` + | ++A boolean indicating whether the video was liked. + | +
+`quality` + | ++A float indicating the quality of the video. + | +
+`cluster_id` + | ++A integer representing the cluster ID of the video. + | +
+create_observation()
+
Creates a tensor observation of this response.
@@ -78,14 +137,14 @@ Creates a tensor observation of this response.
View
source
-```python
-@classmethod
-response_space(cls)
-```
++ArraySpec that defines how a single response is represented. -## Class Members +## Class Variables * `MAX_QUALITY_SCORE = 100` * `MIN_QUALITY_SCORE = -100` diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserDistributionSampler.md b/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserDistributionSampler.md index 3c24e63..afa2452 100644 --- a/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserDistributionSampler.md +++ b/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserDistributionSampler.md @@ -9,37 +9,27 @@ # recsim.environments.interest_evolution.IEvUserDistributionSampler - +@classmethod
+response_space() +
__init__
+recsim.environments.interest_evolution.IEvUserDistributionSampler(
+ user_ctor=recsim.environments.interest_evolution.IEvUserState, **kwargs
)
-```
+
-Creates a new user state sampler.
+
## Methods
@@ -48,9 +38,9 @@ Creates a new user state sampler.
View
source
-```python
-get_user_ctor()
-```
+
+get_user_ctor()
+
Returns the constructor/class of the user states that will be sampled.
@@ -59,17 +49,17 @@ Returns the constructor/class of the user states that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_user
+sample_user()
+
Samples a new user, with a new set of features.
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserModel.md b/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserModel.md
index 203e0fe..be9a7a3 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserModel.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/IEvUserModel.md
@@ -15,65 +15,117 @@
# recsim.environments.interest_evolution.IEvUserModel
-
+
__init__
+recsim.environments.interest_evolution.IEvUserModel(
+ slate_size, choice_model_ctor=None,
response_model_ctor=recsim.environments.interest_evolution.IEvResponse,
user_state_ctor=recsim.environments.interest_evolution.IEvUserState,
- no_click_mass=1.0,
- seed=0,
- alpha_x_intercept=1.0,
- alpha_y_intercept=0.3
+ no_click_mass=1.0, seed=0, alpha_x_intercept=1.0, alpha_y_intercept=0.3
)
-```
+
+
+
-Initializes a new user model.
+Assumes the user state contains: - user_interests - time_budget - no_click_mass
-#### Args:
+
+
+ Args | |
---|---|
+`slate_size` + | ++An integer representing the size of the slate + | +
+`choice_model_ctor` + | ++A contructor function to create user choice model. + | +
+`response_model_ctor` + | ++A constructor function to create response. The +function should take a string of doc ID as input and returns a +IEvResponse object. + | +
+`user_state_ctor` + | ++A constructor to create user state + | +
+`no_click_mass` + | ++A float that will be passed to compute probability of no +click. + | +
+`seed` + | ++A integer used as the seed of the choice model. + | +
+`alpha_x_intercept` + | ++A float for the x intercept of the line used to compute +interests update factor. + | +
+`alpha_y_intercept` + | ++A float for the y intercept of the line used to compute +interests update factor. + | +
Raises | |
---|---|
+`Exception` + | ++if choice_model_ctor is not specified. + | +
+create_observation()
+
Emits obesrvation about user's state.
@@ -93,9 +145,9 @@ Emits obesrvation about user's state.
View
source
-```python
-get_response_model_ctor()
-```
+
+get_response_model_ctor()
+
Returns a constructor for the type of response this model will create.
@@ -104,9 +156,9 @@ Returns a constructor for the type of response this model will create.
View
source
-```python
-is_terminal()
-```
+
+is_terminal()
+
Returns a boolean indicating if the session is over.
@@ -115,9 +167,9 @@ Returns a boolean indicating if the session is over.
View
source
-```python
-observation_space()
-```
+
+observation_space()
+
A Gym.spaces object that describes possible user observations.
@@ -126,9 +178,9 @@ A Gym.spaces object that describes possible user observations.
View
source
-```python
-reset()
-```
+
+reset()
+
Resets the user.
@@ -137,9 +189,9 @@ Resets the user.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
Resets the sampler.
@@ -148,40 +200,65 @@ Resets the sampler.
View
source
-```python
-response_space()
-```
+
+response_space()
+
simulate_response
+simulate_response(
+ documents
+)
+
Simulates the user's response to a slate of documents with choice model.
-#### Args:
+
-* `documents`: a list of IEvVideo objects
+ Args | |
---|---|
+`documents` + | ++a list of IEvVideo objects + | +
Returns | |
---|---|
+`responses` + | ++a list of IEvResponse objects, one for each document + | +
update_state
+update_state(
+ slate_documents, responses
)
-```
+
Updates the user state based on responses to the slate.
@@ -190,8 +267,26 @@ update the user's interests some small step size alpha based on the user's
interest in that topic. The update is either towards the video's features or
away, and is determined stochastically by the user's interest in that document.
-#### Args:
-
-* `slate_documents`: a list of IEvVideos representing the slate
-* `responses`: a list of IEvResponses representing the user's response
- to each video in the slate.
+
+
+ Args | |
---|---|
+`slate_documents` + | ++a list of IEvVideos representing the slate + | +
+`responses` + | ++a list of IEvResponses representing the user's response to each +video in the slate. + | +
__init__
+recsim.environments.interest_evolution.IEvUserState(
+ user_interests, time_budget=None, score_scaling=None, attention_prob=None,
+ no_click_mass=None, keep_interact_prob=None, min_doc_utility=None,
+ user_update_alpha=None, watched_videos=None, impressed_videos=None,
+ liked_videos=None, step_penalty=None, min_normalizer=None,
+ user_quality_factor=None, document_quality_factor=None
)
-```
+
-Initializes a new user.
+
## Methods
@@ -61,9 +42,9 @@ Initializes a new user.
View
source
-```python
-create_observation()
-```
+
+create_observation()
+
Return an observation of this user's observable state.
@@ -72,10 +53,10 @@ Return an observation of this user's observable state.
View
source
-```python
-@classmethod
-observation_space(cls)
-```
++Gym.spaces object that defines how user states are represented. @@ -84,10 +65,12 @@ Gym.spaces object that defines how user states are represented. View source -```python -score_document(doc_obs) -``` +@classmethod
+observation_space() +
+score_document(
+ doc_obs
+)
+
-## Class Members
+## Class Variables
* `NUM_FEATURES = 20`
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideo.md b/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideo.md
index 4c0f226..54d2e5e 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideo.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideo.md
@@ -11,47 +11,64 @@
# recsim.environments.interest_evolution.IEvVideo
-
+
__init__
+recsim.environments.interest_evolution.IEvVideo(
+ doc_id, features, cluster_id=None, video_length=None, quality=None
)
-```
+
-Generates a random set of features for this interest evolution Video.
+
+
+
+
+ Attributes | |
---|---|
+`features` + | ++A numpy array that stores video features. + | +
+`cluster_id` + | ++An integer that represents. + | +
+`video_length` + | ++A float for video length. + | +
+`quality` + | ++a float the represents document quality. + | +
+create_observation()
+
Returns observable properties of this document as a float array.
@@ -71,9 +88,9 @@ Returns observable properties of this document as a float array.
View
source
-```python
-doc_id()
-```
+
+doc_id()
+
Returns the document ID.
@@ -82,14 +99,14 @@ Returns the document ID.
View
source
-```python
-@classmethod
-observation_space(cls)
-```
++Gym space that defines how documents are represented. -## Class Members +## Class Variables * `MAX_VIDEO_LENGTH = 100.0` * `NUM_FEATURES = 20` diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideoSampler.md b/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideoSampler.md index 9db8eba..9c75df2 100644 --- a/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideoSampler.md +++ b/docs/api_docs/python/recsim/environments/interest_evolution/IEvVideoSampler.md @@ -1,7 +1,6 @@@classmethod
+observation_space() +
__init__
+recsim.environments.interest_evolution.IEvVideoSampler(
doc_ctor=recsim.environments.interest_evolution.IEvVideo,
- min_feature_value=-1.0,
- max_feature_value=1.0,
- video_length_mean=4.3,
- video_length_std=1.0,
- **kwargs
+ min_feature_value=-1.0, max_feature_value=1.0, video_length_mean=4.3,
+ video_length_std=1.0, **kwargs
)
-```
-
-Creates a new interest evolution video sampler.
+
-#### Args:
+
-* `doc_ctor`: A class/constructor for the type of videos that will be
- sampled by this sampler.
-* `min_feature_value`: A float for the min feature value.
-* `max_feature_value`: A float for the max feature value.
-* `video_length_mean`: A float for the mean of the video length.
-* `video_length_std`: A float for the std deviation of video length.
-* `**kwargs`: other keyword parameters for the video sampler.
+
+
+ Args | |
---|---|
+`doc_ctor` + | ++A class/constructor for the type of videos that will be sampled +by this sampler. + | +
+`min_feature_value` + | ++A float for the min feature value. + | +
+`max_feature_value` + | ++A float for the max feature value. + | +
+`video_length_mean` + | ++A float for the mean of the video length. + | +
+`video_length_std` + | ++A float for the std deviation of video length. + | +
+`**kwargs` + | ++other keyword parameters for the video sampler. + | +
num_clusters
Attributes | |
---|---|
+`num_clusters` + | +Returns the number of document clusters. Returns 0 if not applicable. + | +
+get_doc_ctor()
+
Returns the constructor/class of the documents that will be sampled.
@@ -81,18 +120,18 @@ Returns the constructor/class of the documents that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_document
+sample_document()
+
Samples and return an instantiation of AbstractDocument.
@@ -101,11 +140,10 @@ Samples and return an instantiation of AbstractDocument.
View
source
-```python
-update_state(
- documents,
- responses
+
+update_state(
+ documents, responses
)
-```
+
Update document state (if needed) given user's (or users') responses.
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelUserSampler.md b/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelUserSampler.md
index e453d6c..3536933 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelUserSampler.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelUserSampler.md
@@ -9,34 +9,28 @@
# recsim.environments.interest_evolution.UtilityModelUserSampler
-
+
__init__
+recsim.environments.interest_evolution.UtilityModelUserSampler(
+ user_ctor=recsim.environments.interest_evolution.IEvUserState,
+ document_quality_factor=1.0, no_click_mass=1.0, min_normalizer=-1.0, **kwargs
)
-```
+
-Creates a new user state sampler.
+
## Methods
@@ -45,9 +39,9 @@ Creates a new user state sampler.
View
source
-```python
-get_user_ctor()
-```
+
+get_user_ctor()
+
Returns the constructor/class of the user states that will be sampled.
@@ -56,17 +50,17 @@ Returns the constructor/class of the user states that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_user
+sample_user()
+
Creates a new instantiation of this user's hidden state parameters.
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelVideoSampler.md b/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelVideoSampler.md
index c7c8ab7..ed0b748 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelVideoSampler.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/UtilityModelVideoSampler.md
@@ -1,7 +1,6 @@
__init__
+recsim.environments.interest_evolution.UtilityModelVideoSampler(
+ doc_ctor=recsim.environments.interest_evolution.IEvVideo, min_utility=-3.0,
+ max_utility=3.0, video_length=4.0, **kwargs
)
-```
-
-Creates a new utility model video sampler.
+
-#### Args:
+
-* `doc_ctor`: A class/constructor for the type of videos that will be
- sampled by this sampler.
-* `min_utility`: A float for the min utility score.
-* `max_utility`: A float for the max utility score.
-* `video_length`: A float for the video_length in minutes.
-* `**kwargs`: other keyword parameters for the video sampler.
+
+
+ Args | |
---|---|
+`doc_ctor` + | ++A class/constructor for the type of videos that will be sampled +by this sampler. + | +
+`min_utility` + | ++A float for the min utility score. + | +
+`max_utility` + | ++A float for the max utility score. + | +
+`video_length` + | ++A float for the video_length in minutes. + | +
+`**kwargs` + | ++other keyword parameters for the video sampler. + | +
num_clusters
Attributes | |
---|---|
+`num_clusters` + | +Returns the number of document clusters. Returns 0 if not applicable. + | +
+get_doc_ctor()
+
Returns the constructor/class of the documents that will be sampled.
@@ -79,18 +112,18 @@ Returns the constructor/class of the documents that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_document
+sample_document()
+
Samples and return an instantiation of AbstractDocument.
@@ -99,11 +132,10 @@ Samples and return an instantiation of AbstractDocument.
View
source
-```python
-update_state(
- documents,
- responses
+
+update_state(
+ documents, responses
)
-```
+
Update document state (if needed) given user's (or users') responses.
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/clicked_watchtime_reward.md b/docs/api_docs/python/recsim/environments/interest_evolution/clicked_watchtime_reward.md
index 003a81d..685cb22 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/clicked_watchtime_reward.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/clicked_watchtime_reward.md
@@ -5,28 +5,53 @@
# recsim.environments.interest_evolution.clicked_watchtime_reward
-
+
+recsim.environments.interest_evolution.clicked_watchtime_reward(
+ responses
+)
+
-#### Args:
+
-* `responses`: A list of IEvResponse objects
+ Args | |
---|---|
+`responses` + | ++A list of IEvResponse objects + | +
Returns | |
---|---|
+`reward` + | ++A float representing the total watch time from the responses + | +
+recsim.environments.interest_evolution.create_environment(
+ env_config
+)
+
diff --git a/docs/api_docs/python/recsim/environments/interest_evolution/total_clicks_reward.md b/docs/api_docs/python/recsim/environments/interest_evolution/total_clicks_reward.md
index 80b3c5c..a89fb87 100644
--- a/docs/api_docs/python/recsim/environments/interest_evolution/total_clicks_reward.md
+++ b/docs/api_docs/python/recsim/environments/interest_evolution/total_clicks_reward.md
@@ -5,27 +5,53 @@
# recsim.environments.interest_evolution.total_clicks_reward
-
+
+recsim.environments.interest_evolution.total_clicks_reward(
+ responses
+)
+
-#### Args:
+
+
+ Args | |
---|---|
+`responses` + | ++A list of IEvResponse objects + | +
Returns | |
---|---|
+`reward` + | ++A float representing the total clicks from the responses + | +
+recsim.environments.interest_exploration.IEClusterUserSampler(
+ user_type_distribution=(0.3, 0.7), user_document_mean_affinity_matrix=((0.1,
+ 0.7), (0.7, 0.1)), user_document_stddev_affinity_matrix=((0.1, 0.1), (0.1,
+ 0.1)), user_ctor=recsim.environments.interest_exploration.IEUserState, **kwargs
+)
+
+
This sampler consumes a distribution over user types and type-specific
@@ -35,37 +41,71 @@ type-specific parameters. In this case, these are the mean and scale of a
lognormal distribution, i.e. the affinity of user u of type U towards an
document of type D is drawn according to lognormal(mean(U,D), scale(U,D)).
-#### Args:
-
-* `user_type_distribution`: a non-negative array of dimension equal to
- the number of user types, whose entries sum to one.
-* `user_document_mean_affinity_matrix`: a non-negative two-dimensional
- array with dimensions number of user types by number of document topics.
- Represents the mean of the affinity score of a user type to a topic.
-* `user_document_stddev_affinity_matrix`: a non-negative
- two-dimensional array with dimensions number of user types by number of
- document topics. Represents the scale of the affinity score of a user type
- to a topic.
-* `user_ctor`: constructor for a user state.
-
-__init__
Args | |
---|---|
+`user_type_distribution` + | ++a non-negative array of dimension equal to the +number of user types, whose entries sum to one. + | +
+`user_document_mean_affinity_matrix` + | ++a non-negative two-dimensional array +with dimensions number of user types by number of document topics. +Represents the mean of the affinity score of a user type to a topic. + | +
+`user_document_stddev_affinity_matrix` + | ++a non-negative two-dimensional array +with dimensions number of user types by number of document topics. +Represents the scale of the affinity score of a user type to a topic. + | +
+`user_ctor` + | ++constructor for a user state. + | +
Args | |
---|---|
+`user_ctor` + | ++A class/constructor for the type of user states that will be +sampled. + | +
+`seed` + | ++An integer for a random seed. + | +
+avg_affinity_given_topic()
+
get_user_ctor
+get_user_ctor()
+
Returns the constructor/class of the user states that will be sampled.
@@ -94,17 +134,17 @@ Returns the constructor/class of the user states that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_user
+sample_user()
+
Creates a new instantiation of this user's hidden state parameters.
diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/IEDocument.md b/docs/api_docs/python/recsim/environments/interest_exploration/IEDocument.md
index 87b7000..9a11838 100644
--- a/docs/api_docs/python/recsim/environments/interest_exploration/IEDocument.md
+++ b/docs/api_docs/python/recsim/environments/interest_exploration/IEDocument.md
@@ -6,48 +6,55 @@
+
__init__
+recsim.environments.interest_exploration.IEDocument(
+ doc_id, cluster_id, quality
)
-```
+
-Initialize self. See help(type(self)) for accurate signature.
+
+
+
+
+ Attributes | |
---|---|
+`cluster_id` + | ++an integer representing the document cluster. + | +
+`quality` + | ++non-negative real number representing the quality of the document. + | +
+create_observation()
+
Returns observable properties of this document as a float array.
@@ -67,9 +74,9 @@ Returns observable properties of this document as a float array.
View
source
-```python
-doc_id()
-```
+
+doc_id()
+
Returns the document ID.
@@ -78,13 +85,14 @@ Returns the document ID.
View
source
-```python
-@classmethod
-observation_space(cls)
-```
++Gym space that defines how documents are represented. -## Class Members +## Class Variables * `NUM_CLUSTERS = 0` +* `NUM_FEATURES = None` diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/IEResponse.md b/docs/api_docs/python/recsim/environments/interest_exploration/IEResponse.md index 69ce4cf..5f950cc 100644 --- a/docs/api_docs/python/recsim/environments/interest_exploration/IEResponse.md +++ b/docs/api_docs/python/recsim/environments/interest_exploration/IEResponse.md @@ -9,43 +9,56 @@ # recsim.environments.interest_exploration.IEResponse - +@classmethod
+observation_space() +
__init__
+recsim.environments.interest_exploration.IEResponse(
+ clicked=False, quality=0.0, cluster_id=0
)
-```
+
-Initialize self. See help(type(self)) for accurate signature.
+
+
+
+
+ Attributes | |
---|---|
+`clicked` + | ++boolean indicating whether the item was clicked or not. + | +
+`quality` + | ++a float indicating the quality of the document. + | +
+`cluster_id` + | ++an integer representing the topic ID of the document. + | +
+create_observation()
+
Creates a tensor observation of this response.
@@ -65,13 +78,13 @@ Creates a tensor observation of this response.
View
source
-```python
-@classmethod
-response_space(cls)
-```
++ArraySpec that defines how a single response is represented. -## Class Members +## Class Variables * `NUM_CLUSTERS = 0` diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/IETopicDocumentSampler.md b/docs/api_docs/python/recsim/environments/interest_exploration/IETopicDocumentSampler.md index 3c3f183..3cdfc27 100644 --- a/docs/api_docs/python/recsim/environments/interest_exploration/IETopicDocumentSampler.md +++ b/docs/api_docs/python/recsim/environments/interest_exploration/IETopicDocumentSampler.md @@ -1,7 +1,6 @@@classmethod
+response_space() +
+recsim.environments.interest_exploration.IETopicDocumentSampler(
+ topic_distribution=(0.2, 0.8), topic_quality_mean=(0.8, 0.2),
+ topic_quality_stddev=(0.1, 0.1),
+ doc_ctor=recsim.environments.interest_exploration.IEDocument, **kwargs
+)
+
+
Consumes a distribution over document topics and topic-specific parameters for
generating a quality score (according to a lognormal distribution).
-#### Args:
-
-* `topic_distribution`: a non-negative array of dimension equal to the
- number of topics, whose entries sum to one.
-* `topic_quality_mean`: a non-negative array of dimension equal to the
- number of topics, representing the mean of the topic quality score.
-* `topic_quality_stddev`: a non-negative array of dimension equal to
- the number of topics, representing the scale of the topic quality score.
-* `doc_ctor`: A class/constructor for the type of videos that will be
- sampled by this sampler.
-
-__init__
Args | |
---|---|
+`topic_distribution` + | ++a non-negative array of dimension equal to the +number of topics, whose entries sum to one. + | +
+`topic_quality_mean` + | ++a non-negative array of dimension equal to the +number of topics, representing the mean of the topic quality score. + | +
+`topic_quality_stddev` + | ++a non-negative array of dimension equal to the +number of topics, representing the scale of the topic quality score. + | +
+`doc_ctor` + | ++A class/constructor for the type of videos that will be sampled +by this sampler. + | +
num_clusters
Attributes | |
---|---|
+`num_clusters` + | +Returns the number of document clusters. Returns 0 if not applicable. + | +
+get_doc_ctor()
+
Returns the constructor/class of the documents that will be sampled.
@@ -78,18 +112,18 @@ Returns the constructor/class of the documents that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_document
+sample_document()
+
Samples the topic and then samples document features given the topic.
@@ -98,11 +132,10 @@ Samples the topic and then samples document features given the topic.
View
source
-```python
-update_state(
- documents,
- responses
+
+update_state(
+ documents, responses
)
-```
+
Update document state (if needed) given user's (or users') responses.
diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/IEUserModel.md b/docs/api_docs/python/recsim/environments/interest_exploration/IEUserModel.md
index d8eb990..ae88380 100644
--- a/docs/api_docs/python/recsim/environments/interest_exploration/IEUserModel.md
+++ b/docs/api_docs/python/recsim/environments/interest_exploration/IEUserModel.md
@@ -1,7 +1,6 @@
+recsim.environments.interest_exploration.IEUserModel(
+ slate_size, no_click_mass=5,
+ choice_model_ctor=recsim.choice_model.MultinomialLogitChoiceModel,
+ user_state_ctor=None, response_model_ctor=None, seed=0
+)
+
+
The user in this scenario is completely characterized by a vector g of affinity
@@ -42,48 +47,69 @@ on these scores.
The state space consists of a vector of affinity scores which is unique to the
user and static but not observable.
-#### Args:
-
-slate_size: An integer representing the size of the slate. no_click_mass: A
-float indicating the mass given to a no-click option. Must be positive,
-otherwise CTR is always 1. choice_model_ctor: A contructor function to create
-user choice model. user_state_ctor: A constructor to create user state.
-response_model_ctor: A constructor function to create response. The function
-should take a string of doc ID as input and returns a IEResponse object. seed:
-an integer used as the seed in random sampling.
-
-__init__
Args |
---|
Args | |
---|---|
+`response_model_ctor` + | ++A class/constructor representing the type of +responses this model will generate. + | +
+`user_sampler` + | ++An instance of AbstractUserSampler that can generate +initial user states from an initial state distribution. + | +
+`slate_size` + | ++integer number of documents that can be served to the user at +any interaction. + | +
avg_user_state
Attributes | |
---|---|
+`avg_user_state` + | +Returns the prior of user state. + | +
+create_observation()
+
Emits obesrvation about user's state.
@@ -103,9 +129,9 @@ Emits obesrvation about user's state.
View
source
-```python
-get_response_model_ctor()
-```
+
+get_response_model_ctor()
+
Returns a constructor for the type of response this model will create.
@@ -114,9 +140,9 @@ Returns a constructor for the type of response this model will create.
View
source
-```python
-is_terminal()
-```
+
+is_terminal()
+
Returns a boolean indicating if the session is over.
@@ -125,9 +151,9 @@ Returns a boolean indicating if the session is over.
View
source
-```python
-observation_space()
-```
+
+observation_space()
+
A Gym.spaces object that describes possible user observations.
@@ -136,9 +162,9 @@ A Gym.spaces object that describes possible user observations.
View
source
-```python
-reset()
-```
+
+reset()
+
Resets the user.
@@ -147,9 +173,9 @@ Resets the user.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
Resets the sampler.
@@ -158,46 +184,89 @@ Resets the sampler.
View
source
-```python
-response_space()
-```
+
+response_space()
+
simulate_response
+simulate_response(
+ documents
+)
+
Simulates the user's response to a slate of documents with choice model.
-#### Args:
+
+
+ Args | |
---|---|
+`documents` + | ++a list of IEDocument objects in the slate. + | +
Returns | |
---|---|
+`responses` + | ++a list of IEResponse objects, one for each document. + | +
update_state
+update_state(
+ slate_documents, responses
)
-```
+
Updates the user's state based on the slate and document selected.
-#### Args:
+
+
+ Args | |
---|---|
+`slate_documents` + | ++A list of AbstractDocuments for items in the slate. + | +
+`responses` + | ++A list of AbstractResponses for each item in the slate. + | +
+recsim.environments.interest_exploration.IEUserState(
+ topic_affinity
+)
+
-* `topic_affinity`: a nonnegative vector holds document type affinities
- which are not temporal dynamics and hidden.
-
-__init__
Attributes | |
---|---|
+`topic_affinity` + | ++a nonnegative vector holds document type affinities which +are not temporal dynamics and hidden. + | +
+create_observation()
+
User's topic_affinity is not observable.
@@ -60,10 +66,10 @@ User's topic_affinity is not observable.
View
source
-```python
-@staticmethod
-observation_space()
-```
++Gym.spaces object that defines how user states are represented. @@ -72,8 +78,14 @@ Gym.spaces object that defines how user states are represented. View source -```python -score_document(doc_obs) -``` +@staticmethod
+observation_space() +
+score_document(
+ doc_obs
+)
+
Returns user document affinity plus document quality.
+
+## Class Variables
+
+* `NUM_FEATURES = None`
diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/create_environment.md b/docs/api_docs/python/recsim/environments/interest_exploration/create_environment.md
index a555897..f8050b5 100644
--- a/docs/api_docs/python/recsim/environments/interest_exploration/create_environment.md
+++ b/docs/api_docs/python/recsim/environments/interest_exploration/create_environment.md
@@ -5,20 +5,21 @@
# recsim.environments.interest_exploration.create_environment
-
+
+recsim.environments.interest_exploration.create_environment(
+ env_config
+)
+
diff --git a/docs/api_docs/python/recsim/environments/interest_exploration/total_clicks_reward.md b/docs/api_docs/python/recsim/environments/interest_exploration/total_clicks_reward.md
index 39685e8..17599a7 100644
--- a/docs/api_docs/python/recsim/environments/interest_exploration/total_clicks_reward.md
+++ b/docs/api_docs/python/recsim/environments/interest_exploration/total_clicks_reward.md
@@ -5,27 +5,53 @@
# recsim.environments.interest_exploration.total_clicks_reward
-
+
+recsim.environments.interest_exploration.total_clicks_reward(
+ responses
+)
+
-#### Args:
+
+
+ Args | |
---|---|
+`responses` + | ++A list of IEResponse objects + | +
Returns | |
---|---|
+`reward` + | ++A float representing the total clicks from the responses + | +
__init__
+recsim.environments.long_term_satisfaction.LTSDocument(
+ doc_id, clickbait_score
)
-```
+
-Initialize self. See help(type(self)) for accurate signature.
+
+
+
+
+ Attributes | |
---|---|
+`clickbait_score` + | ++real number in [0,1] representing the clickbaitiness of a +document. + | +
+create_observation()
+
Returns observable properties of this document as a float array.
@@ -64,9 +67,9 @@ Returns observable properties of this document as a float array.
View
source
-```python
-doc_id()
-```
+
+doc_id()
+
Returns the document ID.
@@ -75,9 +78,13 @@ Returns the document ID.
View
source
-```python
-@staticmethod
-observation_space()
-```
++Gym space that defines how documents are represented. + +## Class Variables + +* `NUM_FEATURES = None` diff --git a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSDocumentSampler.md b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSDocumentSampler.md index 8e8ef14..a556dfb 100644 --- a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSDocumentSampler.md +++ b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSDocumentSampler.md @@ -1,7 +1,6 @@@staticmethod
+observation_space() +
+recsim.environments.long_term_satisfaction.LTSDocumentSampler(
+ doc_ctor=recsim.environments.long_term_satisfaction.LTSDocument, **kwargs
+)
+
-#### Args:
+
-doc_ctor: A class/constructor for the type of documents that will be sampled by
-this sampler.
+
-__init__
Args |
---|
num_clusters
Attributes | |
---|---|
+`num_clusters` + | +Returns the number of document clusters. Returns 0 if not applicable. + | +
+get_doc_ctor()
+
Returns the constructor/class of the documents that will be sampled.
@@ -72,18 +77,18 @@ Returns the constructor/class of the documents that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_document
+sample_document()
+
Samples and return an instantiation of AbstractDocument.
@@ -92,11 +97,10 @@ Samples and return an instantiation of AbstractDocument.
View
source
-```python
-update_state(
- documents,
- responses
+
+update_state(
+ documents, responses
)
-```
+
Update document state (if needed) given user's (or users') responses.
diff --git a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSResponse.md b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSResponse.md
index 84f8123..ba9b8aa 100644
--- a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSResponse.md
+++ b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSResponse.md
@@ -9,48 +9,74 @@
# recsim.environments.long_term_satisfaction.LTSResponse
-
+
__init__
+recsim.environments.long_term_satisfaction.LTSResponse(
+ clicked=False, engagement=0.0
)
-```
+
-Creates a new user response for a document.
+
-#### Args:
+
+
+ Args | |
---|---|
+`clicked` + | ++boolean indicating whether the item was clicked or not. + | +
+`engagement` + | ++real number representing the degree of engagement with a +document (e.g. watch time). + | +
Attributes | |
---|---|
+`engagement` + | ++real number representing the degree of engagement with a +document (e.g. watch time). + | +
+`clicked` + | ++boolean indicating whether the item was clicked or not. + | +
+create_observation()
+
Creates a tensor observation of this response.
@@ -70,13 +96,13 @@ Creates a tensor observation of this response.
View
source
-```python
-@classmethod
-response_space(cls)
-```
++ArraySpec that defines how a single response is represented. -## Class Members +## Class Variables * `MAX_ENGAGEMENT_MAGNITUDE = 100.0` diff --git a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSStaticUserSampler.md b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSStaticUserSampler.md index 8402641..2c3dd01 100644 --- a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSStaticUserSampler.md +++ b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSStaticUserSampler.md @@ -9,34 +9,29 @@ # recsim.environments.long_term_satisfaction.LTSStaticUserSampler - +@classmethod
+response_space() +
__init__
+recsim.environments.long_term_satisfaction.LTSStaticUserSampler(
+ user_ctor=recsim.environments.long_term_satisfaction.LTSUserState,
+ memory_discount=0.7, sensitivity=0.01, innovation_stddev=0.05, choc_mean=5.0,
+ choc_stddev=1.0, kale_mean=4.0, kale_stddev=1.0, time_budget=60, **kwargs
)
-```
+
-Creates a new user state sampler.
+
## Methods
@@ -45,9 +40,9 @@ Creates a new user state sampler.
View
source
-```python
-get_user_ctor()
-```
+
+get_user_ctor()
+
Returns the constructor/class of the user states that will be sampled.
@@ -56,17 +51,17 @@ Returns the constructor/class of the user states that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_user
+sample_user()
+
Creates a new instantiation of this user's hidden state parameters.
diff --git a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSUserModel.md b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSUserModel.md
index e5077d8..6f9cfc1 100644
--- a/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSUserModel.md
+++ b/docs/api_docs/python/recsim/environments/long_term_satisfaction/LTSUserModel.md
@@ -16,21 +16,25 @@
# recsim.environments.long_term_satisfaction.LTSUserModel
-
+
+recsim.environments.long_term_satisfaction.LTSUserModel(
+ slate_size, user_state_ctor=None, response_model_ctor=None, seed=0
+)
+
+
Implements a controlled continuous Hidden Markov Model of the user having the
@@ -60,30 +64,38 @@ A constructor to create user state. response_model_ctor: A constructor function
to create response. The function should take a string of doc ID as input and
returns a LTSResponse object. seed: an integer as the seed in random sampling.
-__init__
Args | |
---|---|
+`response_model_ctor` + | ++A class/constructor representing the type of +responses this model will generate. + | +
+`user_sampler` + | ++An instance of AbstractUserSampler that can generate +initial user states from an initial state distribution. + | +
+`slate_size` + | ++integer number of documents that can be served to the user at +any interaction. + | +
+create_observation()
+
Emits obesrvation about user's state.
@@ -103,29 +115,48 @@ Emits obesrvation about user's state.
View
source
-```python
-generate_response(
- doc,
- response
+
+generate_response(
+ doc, response
)
-```
+
Generates a response to a clicked document.
-#### Args:
+
+
+ Args | |
---|---|
+`doc` + | ++an LTSDocument object. + | +
+`response` + | ++an LTSResponse for the document. + | +
get_response_model_ctor
+get_response_model_ctor()
+
Returns a constructor for the type of response this model will create.
@@ -134,9 +165,9 @@ Returns a constructor for the type of response this model will create.
View
source
-```python
-is_terminal()
-```
+
+is_terminal()
+
Returns a boolean indicating if the session is over.
@@ -145,9 +176,9 @@ Returns a boolean indicating if the session is over.
View
source
-```python
-observation_space()
-```
+
+observation_space()
+
A Gym.spaces object that describes possible user observations.
@@ -156,9 +187,9 @@ A Gym.spaces object that describes possible user observations.
View
source
-```python
-reset()
-```
+
+reset()
+
Resets the user.
@@ -167,9 +198,9 @@ Resets the user.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
Resets the sampler.
@@ -178,45 +209,88 @@ Resets the sampler.
View
source
-```python
-response_space()
-```
+
+response_space()
+
simulate_response
+simulate_response(
+ documents
+)
+
Simulates the user's response to a slate of documents with choice model.
-#### Args:
+
+
+ Args | |
---|---|
+`documents` + | ++a list of LTSDocument objects. + | +
Returns | |
---|---|
+`responses` + | ++a list of LTSResponse objects, one for each document. + | +
update_state
+update_state(
+ slate_documents, responses
)
-```
+
Updates the user's latent state based on responses to the slate.
-#### Args:
-
-* `slate_documents`: a list of LTSDocuments representing the slate
-* `responses`: a list of LTSResponses representing the user's response
- to each document in the slate.
+
+
+ Args | |
---|---|
+`slate_documents` + | ++a list of LTSDocuments representing the slate + | +
+`responses` + | ++a list of LTSResponses representing the user's response to each +document in the slate. + | +
+recsim.environments.long_term_satisfaction.LTSUserState(
+ memory_discount, sensitivity, innovation_stddev, choc_mean, choc_stddev,
+ kale_mean, kale_stddev, net_positive_exposure, time_budget
+)
+
+
See the LTSUserModel class documentation for precise information about how the
@@ -36,27 +42,6 @@ kale_mean: mean of engagement with non-clickbaity content. kale_stddev: standard
deviation of engagement with non-clickbaity content. net_positive_exposure:
starting value for NPE (NPE_0). time_budget: length of a user session.
-__init__
create_observation
+create_observation()
+
User's state is not observable.
@@ -75,10 +60,10 @@ User's state is not observable.
View
source
-```python
-@staticmethod
-observation_space()
-```
++Gym.spaces object that defines how user states are represented. @@ -87,6 +72,12 @@ Gym.spaces object that defines how user states are represented. View source -```python -score_document(doc_obs) -``` +@staticmethod
+observation_space() +
+score_document(
+ doc_obs
+)
+
+
+## Class Variables
+
+* `NUM_FEATURES = None`
diff --git a/docs/api_docs/python/recsim/environments/long_term_satisfaction/clicked_engagement_reward.md b/docs/api_docs/python/recsim/environments/long_term_satisfaction/clicked_engagement_reward.md
index 726ce10..f99f8cd 100644
--- a/docs/api_docs/python/recsim/environments/long_term_satisfaction/clicked_engagement_reward.md
+++ b/docs/api_docs/python/recsim/environments/long_term_satisfaction/clicked_engagement_reward.md
@@ -5,28 +5,53 @@
# recsim.environments.long_term_satisfaction.clicked_engagement_reward
-
+
+recsim.environments.long_term_satisfaction.clicked_engagement_reward(
+ responses
+)
+
-#### Args:
+
-* `responses`: A list of LTSResponse objects
+ Args | |
---|---|
+`responses` + | ++A list of LTSResponse objects + | +
Returns | |
---|---|
+`reward` + | ++A float representing the total watch time from the responses + | +
+recsim.environments.long_term_satisfaction.create_environment(
+ env_config
+)
+
diff --git a/docs/api_docs/python/recsim/simulator.md b/docs/api_docs/python/recsim/simulator.md
index 199d7f7..44d9547 100644
--- a/docs/api_docs/python/recsim/simulator.md
+++ b/docs/api_docs/python/recsim/simulator.md
@@ -5,7 +5,10 @@
# Module: recsim.simulator
+
+
__init__
+recsim.simulator.environment.AbstractEnvironment(
+ user_model, document_sampler, num_candidates, slate_size,
resample_documents=True
)
-```
-
-Initializes a new simulation environment.
-
-#### Args:
-
-* `user_model`: An instantiation of AbstractUserModel or list of such
- instantiations
-* `document_sampler`: An instantiation of AbstractDocumentSampler
-* `num_candidates`: An integer representing the size of the
- candidate_set
-* `slate_size`: An integer representing the slate size
-* `resample_documents`: A boolean indicating whether to resample the
- candidate set every step
-
-## Properties
+
-candidate_set
num_candidates
slate_size
Args | |
---|---|
+`user_model` + | ++An instantiation of AbstractUserModel or list of such +instantiations + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set + | +
+`slate_size` + | ++An integer representing the slate size + | +
+`resample_documents` + | ++A boolean indicating whether to resample the candidate +set every step + | +
user_model
Attributes | |
---|---|
+`user_model` + | ++An list or single instantiation of AbstractUserModel +representing the user/users. + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler. + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set. + | +
+`slate_size` + | ++An integer representing the slate size. + | +
+`candidate_set` + | ++An instantiation of CandidateSet. + | +
+`num_clusters` + | ++An integer representing the number of document clusters. + | +
+Resets the environment and return the first observation. -#### Returns: - -* `user_obs`: An array of floats representing observations of the - user's current state -* `doc_obs`: An OrderedDict of document observations keyed by document - ids + + +@abc.abstractmethod
+reset() +
Returns | |
---|---|
+`user_obs` + | ++An array of floats representing observations of the user's +current state + | +
+`doc_obs` + | ++An OrderedDict of document observations keyed by document ids + | +
reset_sampler
+Resets the relevant samplers of documents and user/users. @@ -114,22 +181,65 @@ Resets the relevant samplers of documents and user/users. View source -```python -step(slate) -``` +@abc.abstractmethod
+reset_sampler() +
+Executes the action, returns next state observation and reward. -#### Args: - -* `slate`: An integer array of size slate_size (or list of such - arrays), where each element is an index into the set of current_documents - presented. - -#### Returns: + + +@abc.abstractmethod
+step( + slate +) +
Args | |
---|---|
+`slate` + | ++An integer array of size slate_size (or list of such arrays), where +each element is an index into the set of current_documents presented. + | +
Returns | |
---|---|
+`user_obs` + | ++A gym observation representing the user's next state + | +
+`doc_obs` + | ++A list of observations of the documents + | +
+`responses` + | ++A list of AbstractResponse objects for each item in the slate + | +
+`done` + | ++A boolean indicating whether the episode has terminated + | +
__init__
+recsim.simulator.environment.Environment(
+ user_model, document_sampler, num_candidates, slate_size,
resample_documents=True
)
-```
-
-Initializes a new simulation environment.
-
-#### Args:
-
-* `user_model`: An instantiation of AbstractUserModel or list of such
- instantiations
-* `document_sampler`: An instantiation of AbstractDocumentSampler
-* `num_candidates`: An integer representing the size of the
- candidate_set
-* `slate_size`: An integer representing the slate size
-* `resample_documents`: A boolean indicating whether to resample the
- candidate set every step
+
-## Properties
-
-candidate_set
num_candidates
slate_size
Args | |
---|---|
+`user_model` + | ++An instantiation of AbstractUserModel or list of such +instantiations + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set + | +
+`slate_size` + | ++An integer representing the slate size + | +
+`resample_documents` + | ++A boolean indicating whether to resample the candidate +set every step + | +
user_model
Attributes | |
---|---|
+`user_model` + | ++An instantiation of AbstractUserModel that represents a user. + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler. + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set. + | +
+`slate_size` + | ++An integer representing the slate size. + | +
+`candidate_set` + | ++An instantiation of CandidateSet. + | +
+`num_clusters` + | ++An integer representing the number of document clusters. + | +
+reset()
+
Resets the environment and return the first observation.
-#### Returns:
-
-* `user_obs`: An array of floats representing observations of the
- user's current state
-* `doc_obs`: An OrderedDict of document observations keyed by document
- ids
+
+
+ Returns | |
---|---|
+`user_obs` + | ++An array of floats representing observations of the user's +current state + | +
+`doc_obs` + | ++An OrderedDict of document observations keyed by document ids + | +
reset_sampler
+reset_sampler()
+
Resets the relevant samplers of documents and user/users.
@@ -121,21 +189,64 @@ Resets the relevant samplers of documents and user/users.
View
source
-```python
-step(slate)
-```
+
+step(
+ slate
+)
+
Executes the action, returns next state observation and reward.
-#### Args:
-
-* `slate`: An integer array of size slate_size, where each element is
- an index into the set of current_documents presented
-
-#### Returns:
+
+
+ Args | |
---|---|
+`slate` + | ++An integer array of size slate_size, where each element is an index +into the set of current_documents presented + | +
Returns | |
---|---|
+`user_obs` + | ++A gym observation representing the user's next state + | +
+`doc_obs` + | ++A list of observations of the documents + | +
+`responses` + | ++A list of AbstractResponse objects for each item in the slate + | +
+`done` + | ++A boolean indicating whether the episode has terminated + | +
__init__
+recsim.simulator.environment.MultiUserEnvironment(
+ user_model, document_sampler, num_candidates, slate_size,
resample_documents=True
)
-```
-
-Initializes a new simulation environment.
-
-#### Args:
-
-* `user_model`: An instantiation of AbstractUserModel or list of such
- instantiations
-* `document_sampler`: An instantiation of AbstractDocumentSampler
-* `num_candidates`: An integer representing the size of the
- candidate_set
-* `slate_size`: An integer representing the slate size
-* `resample_documents`: A boolean indicating whether to resample the
- candidate set every step
-
-## Properties
+
-candidate_set
num_candidates
num_users
slate_size
Args | |
---|---|
+`user_model` + | ++An instantiation of AbstractUserModel or list of such +instantiations + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set + | +
+`slate_size` + | ++An integer representing the slate size + | +
+`resample_documents` + | ++A boolean indicating whether to resample the candidate +set every step + | +
user_model
Attributes | |
---|---|
+`user_model` + | ++A list of AbstractUserModel instances that represent users. + | +
+`num_users` + | ++An integer representing the number of users. + | +
+`document_sampler` + | ++An instantiation of AbstractDocumentSampler. + | +
+`num_candidates` + | ++An integer representing the size of the candidate_set. + | +
+`slate_size` + | ++An integer representing the slate size. + | +
+`candidate_set` + | ++An instantiation of CandidateSet. + | +
+`num_clusters` + | ++An integer representing the number of document clusters. + | +
+reset()
+
Resets the environment and return the first observation.
-#### Returns:
-
-* `user_obs`: An array of floats representing observations of the
- user's current state
-* `doc_obs`: An OrderedDict of document observations keyed by document
- ids
+
+
+ Returns | |
---|---|
+`user_obs` + | ++An array of floats representing observations of the user's +current state + | +
+`doc_obs` + | ++An OrderedDict of document observations keyed by document ids + | +
reset_sampler
+reset_sampler()
+
Resets the relevant samplers of documents and user/users.
@@ -121,23 +188,65 @@ Resets the relevant samplers of documents and user/users.
View
source
-```python
-step(slates)
-```
+
+step(
+ slates
+)
+
Executes the action, returns next state observation and reward.
-#### Args:
-
-* `slates`: A list of slates, where each slate is an integer array of
- size slate_size, where each element is an index into the set of
- current_documents presented
-
-#### Returns:
+
+
+ Args | |
---|---|
+`slates` + | ++A list of slates, where each slate is an integer array of size +slate_size, where each element is an index into the set of +current_documents presented + | +
Returns | |
---|---|
+`user_obs` + | ++A list of gym observation representing all users' next state + | +
+`doc_obs` + | ++A list of observations of the documents + | +
+`responses` + | ++A list of AbstractResponse objects for each item in the slate + | +
+`done` + | ++A boolean indicating whether the episode has terminated + | +
__init__
+recsim.simulator.recsim_gym.RecSimGymEnv(
+ raw_environment, reward_aggregator,
metrics_aggregator=_dummy_metrics_aggregator,
metrics_writer=_dummy_metrics_writer
)
-```
-
-Initializes a RecSim environment conforming to gym.Env.
-
-#### Args:
-
-* `raw_environment`: A recsim recommender system environment.
-* `reward_aggregator`: A function mapping a list of responses to a
- number.
-* `metrics_aggregator`: A function aggregating metrics over all steps
- given responses and response_names.
-* `metrics_writer`: A function writing final metrics to TensorBoard.
-
-## Properties
+
-action_space
environment
Args | |
---|---|
+`raw_environment` + | ++A recsim recommender system environment. + | +
+`reward_aggregator` + | ++A function mapping a list of responses to a number. + | +
+`metrics_aggregator` + | ++A function aggregating metrics over all steps given +responses and response_names. + | +
+`metrics_writer` + | ++A function writing final metrics to TensorBoard. + | +
Attributes | |
---|---|
+`game_over` + | ++A boolean indicating whether the current game has finished + | +
+`action_space` + | ++A gym.spaces object that specifies the space for possible +actions. + | +
+`observation_space` + | ++A gym.spaces object that specifies the space for possible +observations. + | +
+`environment` + | +
Returns the recsim recommender system environment.
-
-
-
- |
+
+`unwrapped` + | +Completely unwrap this env. - -#### Returns: - -* `gym.Env`: The base non-wrapped gym.Env instance + | +
__enter__
__exit__
close
+close()
+
Override close in your subclass to perform any necessary cleanup.
@@ -138,18 +146,20 @@ when the program exits.
View
source
-```python
-extract_env_info()
-```
+
+extract_env_info()
+
render
+render(
+ mode='human'
+)
+
Renders the environment.
@@ -170,9 +180,18 @@ Make sure that your class's metadata 'render.modes' key includes the list of
supported modes. It's recommended to call super() in implementations to use the
functionality of this method.
-#### Args:
+
+ Args | |
---|---|
mode (str): the mode to render with + | +
+reset()
+
Resets the state of the environment and returns an initial observation.
-#### Returns:
+
+ Returns | |
---|---|
observation (object): the initial observation. + | +
reset_metrics
+reset_metrics()
+
Resets every metric to zero.
@@ -220,18 +248,20 @@ reset() gets called for every episode.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
seed
+seed(
+ seed=None
+)
+
Sets the seed for this env's random number generator(s).
@@ -241,21 +271,33 @@ Some environments use multiple pseudorandom number generators. We want to
capture all such seeds used in order to ensure that there aren't accidental
correlations between multiple generators.
-#### Returns:
+
+
+ Returns | |
---|---|
+list |
+
step
+step(
+ action
+)
+
Runs one timestep of the environment's dynamics.
@@ -263,31 +305,51 @@ When end of episode is reached, you are responsible for calling `reset()` to
reset this environment's state. Accepts an action and returns a tuple
(observation, reward, done, info).
-#### Args:
+
+ Args | |
---|---|
action (object): An action provided by the environment + | +
Returns | |
---|---|
+A four-tuple of (observation, reward, done, info) where: +observation (object): agent's observation that include +1. User's state features +2. Document's observation +3. Observation about user's slate responses. +reward (float) : The amount of reward returned after previous action +done (boolean): Whether the episode has ended, in which case further +step() calls will return undefined results +info (dict): Contains responses for the full slate for debugging/learning. + | +
update_metrics
+update_metrics(
+ responses, info=None
)
-```
+
Updates metrics with one step responses.
@@ -296,13 +358,30 @@ Updates metrics with one step responses.
View
source
-```python
-write_metrics(add_summary_fn)
-```
+
+write_metrics(
+ add_summary_fn
+)
+
Writes metrics to TensorBoard by calling add_summary_fn.
-## Class Members
+__enter__
+__enter__()
+
+
+__exit__
+__exit__(
+ *args
+)
+
+
+## Class Variables
* `metadata`
* `reward_range`
+* `spec = None`
diff --git a/docs/api_docs/python/recsim/simulator/runner_lib.md b/docs/api_docs/python/recsim/simulator/runner_lib.md
index 2e1ea52..affd243 100644
--- a/docs/api_docs/python/recsim/simulator/runner_lib.md
+++ b/docs/api_docs/python/recsim/simulator/runner_lib.md
@@ -1,12 +1,14 @@
__init__
+recsim.simulator.runner_lib.EvalRunner(
+ max_eval_episodes=125000, test_mode=False, min_interval_secs=30,
+ train_base_dir=None, **kwargs
)
-```
+
-Initializes the Runner object in charge of running a full experiment.
+
-#### Args:
+See main.py for a simple example to evaluate an agent.
-* `base_dir`: str, the base directory to host all required
- sub-directories.
-* `create_agent_fn`: A function that takes as args a Tensorflow session
- and an environment, and returns an agent.
-* `env`: A Gym environment for running the experiments.
-* `episode_log_file`: Path to output simulated episodes in
- tf.SequenceExample. Disable logging if episode_log_file is an empty string.
-* `checkpoint_file_prefix`: str, the prefix to use for checkpoint
- files.
-* `max_steps_per_episode`: int, maximum number of steps after which an
- episode terminates.
+
+
+ Args | |
---|---|
+`base_dir` + | ++str, the base directory to host all required sub-directories. + | +
+`create_agent_fn` + | ++A function that takes as args a Tensorflow session and an +environment, and returns an agent. + | +
+`env` + | ++A Gym environment for running the experiments. + | +
+`episode_log_file` + | ++Path to output simulated episodes in tf.SequenceExample. +Disable logging if episode_log_file is an empty string. + | +
+`checkpoint_file_prefix` + | ++str, the prefix to use for checkpoint files. + | +
+`max_steps_per_episode` + | ++int, maximum number of steps after which an episode +terminates. + | +
+run_experiment()
+
Runs a full experiment, spread over multiple iterations.
diff --git a/docs/api_docs/python/recsim/simulator/runner_lib/Runner.md b/docs/api_docs/python/recsim/simulator/runner_lib/Runner.md
index 74dff31..364138a 100644
--- a/docs/api_docs/python/recsim/simulator/runner_lib/Runner.md
+++ b/docs/api_docs/python/recsim/simulator/runner_lib/Runner.md
@@ -6,46 +6,80 @@
# recsim.simulator.runner_lib.Runner
-
+
+recsim.simulator.runner_lib.Runner(
+ base_dir, create_agent_fn, env, episode_log_file='',
+ checkpoint_file_prefix='ckpt', max_steps_per_episode=27000
+)
+
+
Here we use the term 'experiment' to mean simulating interactions between the
agent and the environment and reporting some statistics pertaining to these
interactions.
-__init__
Args | |
---|---|
+`base_dir` + | ++str, the base directory to host all required sub-directories. + | +
+`create_agent_fn` + | ++A function that takes as args a Tensorflow session and an +environment, and returns an agent. + | +
+`env` + | ++A Gym environment for running the experiments. + | +
+`episode_log_file` + | ++Path to output simulated episodes in tf.SequenceExample. +Disable logging if episode_log_file is an empty string. + | +
+`checkpoint_file_prefix` + | ++str, the prefix to use for checkpoint files. + | +
+`max_steps_per_episode` + | ++int, maximum number of steps after which an episode +terminates. + | +
__init__
+recsim.simulator.runner_lib.TrainRunner(
+ max_training_steps=250000, num_iterations=100, checkpoint_frequency=1, **kwargs
)
-```
+
-Initializes the Runner object in charge of running a full experiment.
+
-#### Args:
+See main.py for a simple example to train an agent.
-* `base_dir`: str, the base directory to host all required
- sub-directories.
-* `create_agent_fn`: A function that takes as args a Tensorflow session
- and an environment, and returns an agent.
-* `env`: A Gym environment for running the experiments.
-* `episode_log_file`: Path to output simulated episodes in
- tf.SequenceExample. Disable logging if episode_log_file is an empty string.
-* `checkpoint_file_prefix`: str, the prefix to use for checkpoint
- files.
-* `max_steps_per_episode`: int, maximum number of steps after which an
- episode terminates.
+
+
+ Args | |
---|---|
+`base_dir` + | ++str, the base directory to host all required sub-directories. + | +
+`create_agent_fn` + | ++A function that takes as args a Tensorflow session and an +environment, and returns an agent. + | +
+`env` + | ++A Gym environment for running the experiments. + | +
+`episode_log_file` + | ++Path to output simulated episodes in tf.SequenceExample. +Disable logging if episode_log_file is an empty string. + | +
+`checkpoint_file_prefix` + | ++str, the prefix to use for checkpoint files. + | +
+`max_steps_per_episode` + | ++int, maximum number of steps after which an episode +terminates. + | +
+run_experiment()
+
Runs a full experiment, spread over multiple iterations.
diff --git a/docs/api_docs/python/recsim/simulator/runner_lib/load_gin_configs.md b/docs/api_docs/python/recsim/simulator/runner_lib/load_gin_configs.md
index 1e0ae49..9dc6c62 100644
--- a/docs/api_docs/python/recsim/simulator/runner_lib/load_gin_configs.md
+++ b/docs/api_docs/python/recsim/simulator/runner_lib/load_gin_configs.md
@@ -5,29 +5,46 @@
# recsim.simulator.runner_lib.load_gin_configs
-
+
+recsim.simulator.runner_lib.load_gin_configs(
+ gin_files, gin_bindings
)
-```
+
-#### Args:
-
-* `gin_files`: list, of paths to the gin configuration files for this
- experiment.
-* `gin_bindings`: list, of gin parameter bindings to override the
- values in the config files.
+
+
+ Args | |
---|---|
+`gin_files` + | ++list, of paths to the gin configuration files for this +experiment. + | +
+`gin_bindings` + | ++list, of gin parameter bindings to override the values in the +config files. + | +
+Creates a tensor observation of this response. @@ -40,9 +39,10 @@ Creates a tensor observation of this response. View source -```python -@staticmethod -response_space() -``` +@abc.abstractmethod
+create_observation() +
+ArraySpec that defines how a single response is represented. diff --git a/docs/api_docs/python/recsim/user/AbstractUserModel.md b/docs/api_docs/python/recsim/user/AbstractUserModel.md index feb698f..ecca708 100644 --- a/docs/api_docs/python/recsim/user/AbstractUserModel.md +++ b/docs/api_docs/python/recsim/user/AbstractUserModel.md @@ -15,44 +15,57 @@ # recsim.user.AbstractUserModel - +@staticmethod
+@abc.abstractmethod
+response_space() +
__init__
+recsim.user.AbstractUserModel(
+ response_model_ctor, user_sampler, slate_size
)
-```
-
-Initializes a new user model.
+
-#### Args:
+
-* `response_model_ctor`: A class/constructor representing the type of
- responses this model will generate.
-* `user_sampler`: An instance of AbstractUserSampler that can generate
- initial user states from an inital state distribution.
-* `slate_size`: integer number of documents that can be served to the
- user at any interaction.
+
+
+ Args | |
---|---|
+`response_model_ctor` + | ++A class/constructor representing the type of +responses this model will generate. + | +
+`user_sampler` + | ++An instance of AbstractUserSampler that can generate +initial user states from an inital state distribution. + | +
+`slate_size` + | ++integer number of documents that can be served to the user at +any interaction. + | +
+create_observation()
+
Emits obesrvation about user's state.
@@ -72,9 +85,9 @@ Emits obesrvation about user's state.
View
source
-```python
-get_response_model_ctor()
-```
+
+get_response_model_ctor()
+
Returns a constructor for the type of response this model will create.
@@ -83,9 +96,10 @@ Returns a constructor for the type of response this model will create.
View
source
-```python
-is_terminal()
-```
++Returns a boolean indicating whether this session is over. @@ -94,9 +108,9 @@ Returns a boolean indicating whether this session is over. View source -```python -observation_space() -``` +@abc.abstractmethod
+is_terminal() +
+observation_space()
+
A Gym.spaces object that describes possible user observations.
@@ -105,9 +119,9 @@ A Gym.spaces object that describes possible user observations.
View
source
-```python
-reset()
-```
+
+reset()
+
Resets the user.
@@ -116,9 +130,9 @@ Resets the user.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
Resets the sampler.
@@ -127,49 +141,91 @@ Resets the sampler.
View
source
-```python
-response_space()
-```
+
+response_space()
+
simulate_response
+Simulates the user's response to a slate of documents. This could involve simulating models of attention, as well as random sampling for selection from scored documents. -#### Args: + + +@abc.abstractmethod
+simulate_response( + documents +) +
Args | |
---|---|
+`documents` + | ++a list of AbstractDocuments + | +
Returns | |
---|---|
(response) a list of AbstractResponse objects for each slate item + | +
update_state
+Updates the user's state based on the slate and document selected. -#### Args: + + +@abc.abstractmethod
+update_state( + slate_documents, responses ) -``` +
Args | |
---|---|
+`slate_documents` + | ++A list of AbstractDocuments for items in the slate. + | +
+`responses` + | ++A list of AbstractResponses for each item in the slate. + | +
__init__
+recsim.user.AbstractUserSampler(
+ user_ctor, seed=0
)
-```
-
-Creates a new user state sampler.
-
-User states of the type user_ctor are sampled.
+
-#### Args:
-
-* `user_ctor`: A class/constructor for the type of user states that
- will be sampled.
-* `seed`: An integer for a random seed.
+
+
+
+ Args | |
---|---|
+`user_ctor` + | ++A class/constructor for the type of user states that will be +sampled. + | +
+`seed` + | ++An integer for a random seed. + | +
+get_user_ctor()
+
Returns the constructor/class of the user states that will be sampled.
@@ -64,17 +69,18 @@ Returns the constructor/class of the user states that will be sampled.
View
source
-```python
-reset_sampler()
-```
+
+reset_sampler()
+
sample_user
+Creates a new instantiation of this user's hidden state parameters. diff --git a/docs/api_docs/python/recsim/user/AbstractUserState.md b/docs/api_docs/python/recsim/user/AbstractUserState.md index 0948bbc..cbab8a7 100644 --- a/docs/api_docs/python/recsim/user/AbstractUserState.md +++ b/docs/api_docs/python/recsim/user/AbstractUserState.md @@ -3,21 +3,20 @@ +@abc.abstractmethod
+sample_user() +
+Generates obs of underlying state to simulate partial observability. -#### Returns: + + +@abc.abstractmethod
+create_observation() +
Returns | |
---|---|
+`obs` + | ++A float array of the observed user features. + | +
observation_space
+Gym.spaces object that defines how user states are represented. +## Class Variables - - +* `NUM_FEATURES = None` diff --git a/docs/api_docs/python/recsim/utils.md b/docs/api_docs/python/recsim/utils.md index 7a46883..390a557 100644 --- a/docs/api_docs/python/recsim/utils.md +++ b/docs/api_docs/python/recsim/utils.md @@ -5,7 +5,10 @@ # Module: recsim.utils + +@staticmethod
+@abc.abstractmethod
+observation_space() +
+recsim.utils.aggregate_video_cluster_metrics(
+ responses, metrics, info=None
)
-```
+
-#### Args:
+
+
+ Args | |
---|---|
+`responses` + | ++a dictionary of names, observed responses. + | +
+`metrics` + | ++A dictionary mapping from metric_name to its value in float. + | +
+`info` + | ++Additional info for computing metrics (ignored here) + | +
Returns | |
---|---|
A dictionary storing metrics after aggregation. + | +
+recsim.utils.write_video_cluster_metrics(
+ metrics, add_summary_fn
)
-```
+
diff --git a/recsim/agents/slate_decomp_q_agent.py b/recsim/agents/slate_decomp_q_agent.py
index f289f79..0a6d9cc 100644
--- a/recsim/agents/slate_decomp_q_agent.py
+++ b/recsim/agents/slate_decomp_q_agent.py
@@ -1,5 +1,4 @@
# coding=utf-8
-# coding=utf-8
# Copyright 2019 The RecSim Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -172,7 +171,7 @@ def set_element(v, i, x):
numerator = numerator + tf.gather(s * q, k)
denominator = denominator + tf.gather(s, k)
- output_slate = tf.compat.v1.where(tf.equal(mask, 0))
+ output_slate = tf.where(tf.equal(mask, 0))
return output_slate
@@ -352,13 +351,13 @@ def compute_target_topk_q(reward, gamma, next_actions, next_q_values,
# Get the expected Q-value of the slate containing top-K items.
# [batch_size, slate_size]
- next_q_values_selected = tf.compat.v1.batch_gather(
+ next_q_values_selected = tf.batch_gather(
next_q_values, tf.cast(topk_optimal_slate, dtype=tf.int32))
# Get normalized affinity scores on the slate.
# [batch_size, slate_size]
- scores_selected = tf.compat.v1.batch_gather(
- scores, tf.cast(topk_optimal_slate, dtype=tf.int32))
+ scores_selected = tf.batch_gather(scores,
+ tf.cast(topk_optimal_slate, dtype=tf.int32))
next_q_target_topk = tf.reduce_sum(
input_tensor=next_q_values_selected * scores_selected, axis=1) / (
@@ -475,9 +474,9 @@ def __init__(self,
abstract_agent.AbstractEpisodicRecommenderAgent.__init__(self, action_space)
# The doc score is a [num_candidates] vector.
- self._doc_affinity_scores_ph = tf.compat.v1.placeholder(
+ self._doc_affinity_scores_ph = tf.placeholder(
tf.float32, (self._num_candidates,), name='doc_affinity_scores_ph')
- self._prob_no_click_ph = tf.compat.v1.placeholder(
+ self._prob_no_click_ph = tf.placeholder(
tf.float32, (), name='prob_no_click_ph')
self._select_slate_fn = select_slate_fn
@@ -496,7 +495,7 @@ def __init__(self,
def _network_adapter(self, states, scope):
self._validate_states(states)
- with tf.compat.v1.name_scope('network'):
+ with tf.name_scope('network'):
# Since we decompose the slate optimization into an item-level
# optimization problem, the observation space is the user state
# observation plus all documents' observations. In the Dopamine DQN agent
@@ -513,7 +512,7 @@ def _network_adapter(self, states, scope):
return dqn_agent.DQNNetworkType(q_values)
def _build_networks(self):
- with tf.compat.v1.name_scope('networks'):
+ with tf.name_scope('networks'):
self._replay_net_outputs = self._network_adapter(self._replay.states,
'Online')
self._replay_next_target_net_outputs = self._network_adapter(
@@ -533,7 +532,7 @@ def _build_train_op(self):
# slate_q_values: [B, S]
# replay_click_q: [B]
click_indicator = self._replay.rewards[:, :, self._click_response_index]
- slate_q_values = tf.compat.v1.batch_gather(
+ slate_q_values = tf.batch_gather(
self._replay_net_outputs.q_values,
tf.cast(self._replay.actions, dtype=tf.int32))
# Only get the Q from the clicked document.
@@ -545,8 +544,7 @@ def _build_train_op(self):
target = tf.stop_gradient(self._build_target_q_op())
clicked = tf.reduce_sum(input_tensor=click_indicator, axis=1)
- clicked_indices = tf.squeeze(
- tf.compat.v1.where(tf.equal(clicked, 1)), axis=1)
+ clicked_indices = tf.squeeze(tf.where(tf.equal(clicked, 1)), axis=1)
# clicked_indices is a vector and tf.gather selects the batch dimension.
q_clicked = tf.gather(replay_click_q, clicked_indices)
target_clicked = tf.gather(target, clicked_indices)
@@ -554,8 +552,8 @@ def _build_train_op(self):
def get_train_op():
loss = tf.reduce_mean(input_tensor=tf.square(q_clicked - target_clicked))
if self.summary_writer is not None:
- with tf.compat.v1.variable_scope('Losses'):
- tf.compat.v1.summary.scalar('Loss', loss)
+ with tf.variable_scope('Losses'):
+ tf.summary.scalar('Loss', loss)
return loss
@@ -613,25 +611,24 @@ def _build_select_slate_op(self):
p_no_click = self._prob_no_click_ph
p = self._doc_affinity_scores_ph
q = self._net_outputs.q_values[0]
- with tf.compat.v1.name_scope('select_slate'):
+ with tf.name_scope('select_slate'):
self._output_slate = self._select_slate_fn(self._slate_size, p_no_click,
p, q)
- self._output_slate = tf.compat.v1.Print(
+ self._output_slate = tf.Print(
self._output_slate, [tf.constant('cp 1'), self._output_slate, p, q],
summarize=10000)
self._output_slate = tf.reshape(self._output_slate, (self._slate_size,))
- self._action_counts = tf.compat.v1.get_variable(
+ self._action_counts = tf.get_variable(
'action_counts',
shape=[self._num_candidates],
- initializer=tf.compat.v1.zeros_initializer())
+ initializer=tf.zeros_initializer())
output_slate = tf.reshape(self._output_slate, [-1])
output_one_hot = tf.one_hot(output_slate, self._num_candidates)
update_ops = []
for i in range(self._slate_size):
- update_ops.append(
- tf.compat.v1.assign_add(self._action_counts, output_one_hot[i]))
+ update_ops.append(tf.assign_add(self._action_counts, output_one_hot[i]))
self._select_action_update_op = tf.group(*update_ops)
def _select_action(self):
@@ -660,7 +657,7 @@ def _select_action(self):
observation = self._raw_observation
user_obs = observation['user']
doc_obs = np.array(list(observation['doc'].values()))
- tf.compat.v1.logging.debug('cp 1: %s, %s', doc_obs, observation)
+ tf.logging.debug('cp 1: %s, %s', doc_obs, observation)
# TODO(cwhsu): Use score_documents_tf() and remove score_documents().
scores, score_no_click = score_documents(user_obs, doc_obs)
output_slate, _ = self._sess.run(
@@ -697,8 +694,8 @@ def _build_replay_buffer(self, use_staging):
def _add_summary(self, tag, value):
if self.summary_writer:
- summary = tf.compat.v1.Summary(
- value=[tf.compat.v1.Summary.Value(tag=tag, simple_value=value)])
+ summary = tf.Summary(
+ value=[tf.Summary.Value(tag=tag, simple_value=value)])
self.summary_writer.add_summary(summary, self.training_steps)
def begin_episode(self, observation):
diff --git a/recsim/environments/interest_exploration.py b/recsim/environments/interest_exploration.py
index 9d6885a..c8aeca1 100644
--- a/recsim/environments/interest_exploration.py
+++ b/recsim/environments/interest_exploration.py
@@ -71,7 +71,6 @@ class IEUserModel(user.AbstractUserModel):
Args:
slate_size: An integer representing the size of the slate.
no_click_mass: A float indicating the mass given to a no-click option.
- Must be positive, otherwise CTR is always 1.
choice_model_ctor: A contructor function to create user choice model.
user_state_ctor: A constructor to create user state.
response_model_ctor: A constructor function to create response. The
@@ -87,8 +86,6 @@ def __init__(self,
user_state_ctor=None,
response_model_ctor=None,
seed=0):
- if no_click_mass < 0:
- raise ValueError('no_click_mass must be positive.')
super(IEUserModel, self).__init__(response_model_ctor, IEClusterUserSampler(
user_ctor=user_state_ctor, seed=seed), slate_size)
diff --git a/setup.py b/setup.py
index 2035764..10b974f 100644
--- a/setup.py
+++ b/setup.py
@@ -44,7 +44,7 @@
setup(
name='recsim',
- version='0.2.3',
+ version='0.2.4',
author='The RecSim Team',
author_email='no-reply@google.com',
description=recsim_description,