Correct typos (#614)
* Correct typos

* Add spell check when available

* Update changelog

* Fix space

* Fix HER link
araffin committed Dec 12, 2019
1 parent bef46d3 commit ea93850
Showing 32 changed files with 202 additions and 86 deletions.
4 changes: 2 additions & 2 deletions docs/common/schedules.rst
@@ -3,8 +3,8 @@
Schedules
=========

-Schedules are used as hyperparameter for most of the algortihms,
-in order to change value of a parameter over time (usuallly the learning rate).
+Schedules are used as hyperparameter for most of the algorithms,
+in order to change value of a parameter over time (usually the learning rate).


.. automodule:: stable_baselines.common.schedules
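As an aside (not part of this commit), a minimal sketch of how such a schedule is used, assuming the `LinearSchedule` class from `stable_baselines.common.schedules`:

```python
# Sketch only: linearly anneal a value (e.g. a learning rate or exploration
# fraction) from 1.0 down to 0.1 over 10,000 steps.
from stable_baselines.common.schedules import LinearSchedule

schedule = LinearSchedule(schedule_timesteps=10000, final_p=0.1, initial_p=1.0)

for step in (0, 5000, 10000):
    print(step, schedule.value(step))  # 1.0 at step 0, 0.55 midway, 0.1 at the end
```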
11 changes: 11 additions & 0 deletions docs/conf.py
@@ -16,6 +16,14 @@
import sys
from unittest.mock import MagicMock

+# We CANNOT enable 'sphinxcontrib.spelling' because ReadTheDocs.org does not support
+# PyEnchant.
+try:
+    import sphinxcontrib.spelling
+    enable_spell_check = True
+except ImportError:
+    enable_spell_check = False

# source code directory, relative to this file, for sphinx-autobuild
sys.path.insert(0, os.path.abspath('..'))

Expand Down Expand Up @@ -69,6 +77,9 @@ def __getattr__(cls, name):
'sphinx.ext.viewcode',
]

+if enable_spell_check:
+    extensions.append('sphinxcontrib.spelling')

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

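For anyone wanting to run the new spell check locally, a sketch of the invocation (an assumption, not part of this diff; it requires sphinxcontrib-spelling and PyEnchant to be installed):

```python
# Build the docs with the "spelling" builder; misspelling reports are written
# under the chosen output directory.
from sphinx.cmd.build import build_main

build_main(['-b', 'spelling', 'docs', 'docs/_build/spelling'])
```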
2 changes: 1 addition & 1 deletion docs/guide/install.rst
@@ -169,7 +169,7 @@ Explanation of the docker command:
- ``--ipc=host`` Use the host system’s IPC namespace. IPC (POSIX/SysV IPC) namespace provides
separation of named shared memory segments, semaphores and message
queues.
-- ``--name test`` give explicitely the name ``test`` to the container,
+- ``--name test`` give explicitly the name ``test`` to the container,
otherwise it will be assigned a random name
- ``--mount src=...`` give access of the local directory (``pwd``
command) to the container (it will be map to ``/root/code/stable-baselines``), so
2 changes: 1 addition & 1 deletion docs/guide/pretrain.rst
@@ -80,7 +80,7 @@ The idea is that this callable can be a PID controller, asking a human player, .
return env.action_space.sample()
# Data will be saved in a numpy archive named `expert_cartpole.npz`
# when using something different than an RL expert,
-# you must pass the environment object explicitely
+# you must pass the environment object explicitly
generate_expert_traj(dummy_expert, 'dummy_expert_cartpole', env, n_episodes=10)
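For context, a fuller version of the snippet being edited above (a sketch; the environment name and the `generate_expert_traj` import path are assumptions, not part of this diff):

```python
import gym
from stable_baselines.gail import generate_expert_traj

env = gym.make('CartPole-v1')

def dummy_expert(_obs):
    # A non-RL "expert": here it just samples random actions.
    return env.action_space.sample()

# Data is saved in a numpy archive named `dummy_expert_cartpole.npz`.
# With a callable expert (instead of an RL model), the env must be passed explicitly.
generate_expert_traj(dummy_expert, 'dummy_expert_cartpole', env, n_episodes=10)
```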
6 changes: 3 additions & 3 deletions docs/guide/rl_tips.rst
@@ -33,7 +33,7 @@ bad trajectories.
This factor, among others, explains that results in RL may vary from one run to another (i.e., when only the seed of the pseudo-random generator changes).
For this reason, you should always do several runs to have quantitative results.

-Good results in RL are generally dependent on finding appropriate hyperparameters. Recent alogrithms (PPO, SAC, TD3) normally require little hyperparameter tuning,
+Good results in RL are generally dependent on finding appropriate hyperparameters. Recent algorithms (PPO, SAC, TD3) normally require little hyperparameter tuning,
however, *don't expect the default ones to work* on any environment.

Therefore, we *highly recommend you* to take a look at the `RL zoo <https://github.com/araffin/rl-baselines-zoo>`_ (or the original papers) for tuned hyperparameters.
@@ -93,7 +93,7 @@ or continuous actions (ex: go to a certain speed)?
Some algorithms are only tailored for one or the other domain: `DQN` only supports discrete actions, where `SAC` is restricted to continuous actions.

The second difference that will help you choose is whether you can parallelize your training or not, and how you can do it (with or without MPI?).
-If what matters is the wall clock training time, then you should lean towards `À2C` and its derivates (PPO, ACER, ACKTR, ...).
+If what matters is the wall clock training time, then you should lean towards `A2C` and its derivatives (PPO, ACER, ACKTR, ...).
Take a look at the `Vectorized Environments <vec_envs.html>`_ to learn more about training with multiple workers.
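As an illustration of that advice (a sketch, not part of this diff; it uses the `make_vec_env` helper whose docstring is touched later in this commit):

```python
# Train A2C on several environment copies in parallel to cut wall-clock time.
from stable_baselines import A2C
from stable_baselines.common.cmd_util import make_vec_env

env = make_vec_env('CartPole-v1', n_envs=4, seed=0)  # 4 workers
model = A2C('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=25000)
```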

To sum it up:
@@ -146,7 +146,7 @@ If you can use MPI, then you can choose between PPO1, TRPO and DDPG.
Goal Environment
-----------------

-If your environment follows the `GoalEnv` interface (cf `HER <her.html>`_), then you should use
+If your environment follows the `GoalEnv` interface (cf `HER <../modules/her.html>`_), then you should use
HER + (SAC/TD3/DDPG/DQN) depending on the action space.
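A sketch of that combination (not part of this diff; the environment and the `HER` constructor arguments shown are assumptions based on the stable-baselines HER wrapper):

```python
import gym
from stable_baselines import HER, SAC

# Any GoalEnv works here; FetchReach requires mujoco-py to be installed.
env = gym.make('FetchReach-v1')

# HER wraps an off-policy algorithm (SAC here) and relabels goals during replay.
model = HER('MlpPolicy', env, SAC, n_sampled_goal=4,
            goal_selection_strategy='future', verbose=1)
model.learn(total_timesteps=10000)
```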


4 changes: 3 additions & 1 deletion docs/misc/changelog.rst
@@ -73,6 +73,8 @@ Documentation:
- Update custom env documentation to reflect new gym API for the `close()` method (@justinkterry)
- Update custom env documentation to clarify what step and reset return (@justinkterry)
- Add RL tips and tricks for doing RL experiments
+- Corrected lots of typos
+- Add spell check to documentation if available


Release 2.8.0 (2019-09-29)
@@ -388,7 +390,7 @@ Release 2.1.1 (2018-10-20)
--------------------------

- fixed MpiAdam synchronization issue in PPO1 (thanks to @brendenpetersen) issue #50
-- fixed dependency issues (new mujoco-py requires a mujoco licence + gym broke MultiDiscrete space shape)
+- fixed dependency issues (new mujoco-py requires a mujoco license + gym broke MultiDiscrete space shape)


Release 2.1.0 (2018-10-2)
2 changes: 1 addition & 1 deletion docs/modules/her.rst
@@ -93,7 +93,7 @@ Goal Selection Strategies
:undoc-members:


-Gaol Env Wrapper
+Goal Env Wrapper
----------------

.. autoclass:: HERGoalEnvWrapper
103 changes: 103 additions & 0 deletions docs/spelling_wordlist.txt
@@ -0,0 +1,103 @@
py
env
atari
argparse
Argparse
TensorFlow
feedforward
envs
VecEnv
pretrain
petrained
tf
np
mujoco
cpu
ndarray
ndarrays
timestep
timesteps
stepsize
dataset
adam
fn
normalisation
Kullback
Leibler
boolean
deserialized
pretrained
minibatch
subprocesses
ArgumentParser
Tensorflow
Gaussian
approximator
minibatches
hyperparameters
hyperparameter
vectorized
rl
colab
dataloader
npz
datasets
vf
logits
num
Utils
backpropagate
prepend
NaN
preprocessing
Cloudpickle
async
multiprocess
tensorflow
mlp
cnn
neglogp
tanh
coef
repo
Huber
params
ppo
arxiv
Arxiv
func
DQN
Uhlenbeck
Ornstein
multithread
cancelled
Tensorboard
parallelize
customising
serializable
Multiprocessed
cartpole
toolset
lstm
rescale
ffmpeg
avconv
unnormalized
Github
pre
preprocess
backend
attr
preprocess
Antonin
Raffin
araffin
Homebrew
Numpy
Theano
rollout
kfac
Piecewise
csv
nvidia
visdom
6 changes: 3 additions & 3 deletions stable_baselines/acer/acer_simple.py
@@ -75,7 +75,7 @@ class ACER(ActorCriticRLModel):
Use `n_cpu_tf_sess` instead.
:param q_coef: (float) The weight for the loss on the Q value
-:param ent_coef: (float) The weight for the entropic loss
+:param ent_coef: (float) The weight for the entropy loss
:param max_grad_norm: (float) The clipping value for the maximum gradient
:param learning_rate: (float) The initial learning rate for the RMS prop optimizer
:param lr_schedule: (str) The type of scheduler for the learning rate update ('linear', 'constant',
@@ -390,13 +390,13 @@ def custom_getter(getter, name, *args, **kwargs):
tf.summary.scalar('rewards', tf.reduce_mean(self.reward_ph))
tf.summary.scalar('learning_rate', tf.reduce_mean(self.learning_rate))
tf.summary.scalar('advantage', tf.reduce_mean(adv))
-tf.summary.scalar('action_probabilty', tf.reduce_mean(self.mu_ph))
+tf.summary.scalar('action_probability', tf.reduce_mean(self.mu_ph))

if self.full_tensorboard_log:
tf.summary.histogram('rewards', self.reward_ph)
tf.summary.histogram('learning_rate', self.learning_rate)
tf.summary.histogram('advantage', adv)
-tf.summary.histogram('action_probabilty', self.mu_ph)
+tf.summary.histogram('action_probability', self.mu_ph)
if tf_util.is_image(self.observation_space):
tf.summary.image('observation', train_model.obs_ph)
else:
2 changes: 1 addition & 1 deletion stable_baselines/acktr/acktr.py
@@ -30,7 +30,7 @@ class ACKTR(ActorCriticRLModel):
Use `n_cpu_tf_sess` instead.
:param n_steps: (int) The number of steps to run for each environment
-:param ent_coef: (float) The weight for the entropic loss
+:param ent_coef: (float) The weight for the entropy loss
:param vf_coef: (float) The weight for the loss on the value function
:param vf_fisher_coef: (float) The weight for the fisher loss on the value function
:param learning_rate: (float) The initial learning rate for the RMS prop optimizer
2 changes: 1 addition & 1 deletion stable_baselines/acktr/kfac.py
@@ -25,7 +25,7 @@ def __init__(self, learning_rate=0.01, momentum=0.9, clip_kl=0.01, kfac_update=2
:param clip_kl: (float) gradient clipping for Kullback-Leibler
:param kfac_update: (int) update kfac after kfac_update steps
:param stats_accum_iter: (int) how may steps to accumulate stats
-:param full_stats_init: (bool) whether or not to fully initalize stats
+:param full_stats_init: (bool) whether or not to fully initialize stats
:param cold_iter: (int) Cold start learning rate for how many steps
:param cold_lr: (float) Cold start learning rate
:param async_eigen_decomp: (bool) Use async eigen decomposition
2 changes: 1 addition & 1 deletion stable_baselines/common/atari_wrappers.py
@@ -276,7 +276,7 @@ def __getitem__(self, i):

def make_atari(env_id):
"""
-Create a wrapped atari envrionment
+Create a wrapped atari Environment
:param env_id: (str) the environment ID
:return: (Gym Environment) the wrapped atari environment
10 changes: 5 additions & 5 deletions stable_baselines/common/base_class.py
@@ -238,9 +238,9 @@ def _get_pretrain_placeholders(self):
"""
Return the placeholders needed for the pretraining:
- obs_ph: observation placeholder
-- actions_ph will be population with an action from the environement
+- actions_ph will be population with an action from the environment
(from the expert dataset)
-- deterministic_actions_ph: e.g., in the case of a gaussian policy,
+- deterministic_actions_ph: e.g., in the case of a Gaussian policy,
the mean.
:return: ((tf.placeholder)) (obs_ph, actions_ph, deterministic_actions_ph)
@@ -474,7 +474,7 @@ def load(cls, load_path, env=None, custom_objects=None, **kwargs):
Load the model from file
:param load_path: (str or file-like) the saved parameter location
-:param env: (Gym Envrionment) the new environment to run the loaded model on
+:param env: (Gym Environment) the new environment to run the loaded model on
(can be None if you only need prediction from a trained model)
:param custom_objects: (dict) Dictionary of objects to replace
upon loading. If a variable is present in this dictionary as a
@@ -862,7 +862,7 @@ def load(cls, load_path, env=None, custom_objects=None, **kwargs):
Load the model from file
:param load_path: (str or file-like) the saved parameter location
-:param env: (Gym Envrionment) the new environment to run the loaded model on
+:param env: (Gym Environment) the new environment to run the loaded model on
(can be None if you only need prediction from a trained model)
:param custom_objects: (dict) Dictionary of objects to replace
upon loading. If a variable is present in this dictionary as a
@@ -945,7 +945,7 @@ def load(cls, load_path, env=None, custom_objects=None, **kwargs):
Load the model from file
:param load_path: (str or file-like) the saved parameter location
-:param env: (Gym Envrionment) the new environment to run the loaded model on
+:param env: (Gym Environment) the new environment to run the loaded model on
(can be None if you only need prediction from a trained model)
:param custom_objects: (dict) Dictionary of objects to replace
upon loading. If a variable is present in this dictionary as a
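A usage sketch matching the `load` docstring above (the algorithm, saved-model path and environment are illustrative assumptions, not from this diff):

```python
import gym
from stable_baselines import PPO2

env = gym.make('CartPole-v1')

# 'ppo2_cartpole' is a placeholder path to a previously saved model;
# env may be None if the loaded model is only used for prediction.
model = PPO2.load('ppo2_cartpole', env=env)

obs = env.reset()
action, _states = model.predict(obs)
```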
8 changes: 4 additions & 4 deletions stable_baselines/common/cmd_util.py
@@ -25,7 +25,7 @@ def make_vec_env(env_id, n_envs=1, seed=None, start_index=0,
:param env_id: (str or Type[gym.Env]) the environment ID or the environment class
:param n_envs: (int) the number of environments you wish to have in parallel
-:param seed: (int) the inital seed for the random number generator
+:param seed: (int) the initial seed for the random number generator
:param start_index: (int) start rank index
:param monitor_dir: (str) Path to a folder where the monitor files will be saved.
If None, no file will be written, however, the env will still be wrapped
@@ -80,7 +80,7 @@ def make_atari_env(env_id, num_env, seed, wrapper_kwargs=None,
:param env_id: (str) the environment ID
:param num_env: (int) the number of environment you wish to have in subprocesses
-:param seed: (int) the inital seed for RNG
+:param seed: (int) the initial seed for RNG
:param wrapper_kwargs: (dict) the parameters for wrap_deepmind function
:param start_index: (int) start rank index
:param allow_early_resets: (bool) allows early reset of the environment
@@ -116,7 +116,7 @@ def make_mujoco_env(env_id, seed, allow_early_resets=True):
Create a wrapped, monitored gym.Env for MuJoCo.
:param env_id: (str) the environment ID
-:param seed: (int) the inital seed for RNG
+:param seed: (int) the initial seed for RNG
:param allow_early_resets: (bool) allows early reset of the environment
:return: (Gym Environment) The mujoco environment
"""
@@ -132,7 +132,7 @@ def make_robotics_env(env_id, seed, rank=0, allow_early_resets=True):
Create a wrapped, monitored gym.Env for MuJoCo.
:param env_id: (str) the environment ID
-:param seed: (int) the inital seed for RNG
+:param seed: (int) the initial seed for RNG
:param rank: (int) the rank of the environment (for logging)
:param allow_early_resets: (bool) allows early reset of the environment
:return: (Gym Environment) The robotic environment
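A brief usage sketch for one of the helpers whose docstrings are fixed above (an assumption, not from this diff):

```python
from stable_baselines.common.cmd_util import make_atari_env

# Wrapped, monitored Atari envs running in subprocesses, seeded for reproducibility.
env = make_atari_env('BreakoutNoFrameskip-v4', num_env=4, seed=0)
```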
