Spinning Up ships with basic logging tools, implemented in the classes ``Logger`` and ``EpochLogger``. The ``Logger`` class contains most of the basic functionality for saving diagnostics, hyperparameter configurations, the state of a training run, and the trained model. The ``EpochLogger`` class adds a thin layer on top of that to make it easy to track the average, standard deviation, min, and max value of a diagnostic over each epoch and across MPI workers.
.. admonition:: You Should Know

    All Spinning Up algorithm implementations use an ``EpochLogger``.
First, let's look at a simple example of how an ``EpochLogger`` keeps track of a diagnostic value:

.. code-block:: python

    >>> from spinup.utils.logx import EpochLogger
    >>> epoch_logger = EpochLogger()
    >>> for i in range(10):
    ...     epoch_logger.store(Test=i)
    >>> epoch_logger.log_tabular('Test', with_min_and_max=True)
    >>> epoch_logger.dump_tabular()
    -------------------------------------
    |     AverageTest |             4.5 |
    |         StdTest |            2.87 |
    |         MaxTest |               9 |
    |         MinTest |               0 |
    -------------------------------------
The ``store`` method is used to save all values of ``Test`` to the ``epoch_logger``'s internal state. Then, when ``log_tabular`` is called, it computes the average, standard deviation, min, and max of ``Test`` over all of the values in the internal state. The internal state is wiped clean after the call to ``log_tabular`` (to prevent leakage into the statistics at the next epoch). Finally, ``dump_tabular`` is called to write the diagnostics to file and to stdout.
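As a quick self-contained check of the numbers in the example above, the statistics that ``log_tabular`` reports for ``Test`` can be reproduced with just the standard library (note that the reported standard deviation is the population form):

.. code-block:: python

    import math

    # The ten values stored via epoch_logger.store(Test=i).
    values = list(range(10))

    mean = sum(values) / len(values)                        # AverageTest
    std = math.sqrt(sum((v - mean) ** 2 for v in values)
                    / len(values))                          # StdTest (population form)

    print(mean, round(std, 2), max(values), min(values))    # 4.5 2.87 9 0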
Next, let's look at a full training procedure with the logger embedded, to highlight configuration and model saving as well as diagnostic logging:
In this example, observe that:

- On line 19, ``logger.save_config`` is used to save the hyperparameter configuration to a JSON file.
- On lines 42 and 43, ``logger.setup_tf_saver`` is used to prepare the logger to save the key elements of the computation graph.
- On line 54, diagnostics are saved to the logger's internal state via ``logger.store``.
- On line 58, the computation graph is saved once per epoch via ``logger.save_state``.
- On lines 61-66, ``logger.log_tabular`` and ``logger.dump_tabular`` are used to write the epoch diagnostics to file. Note that the keys passed into ``logger.log_tabular`` are the same as the keys passed into ``logger.store``.
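The call pattern those steps describe can be sketched as follows. This is an illustrative skeleton, not the listing the line numbers above refer to: ``compute_loss``, the scalar "parameter", and the hyperparameter values are hypothetical stand-ins so the sketch stays self-contained, the framework-specific saver setup is elided, and a minimal stub fills in for ``EpochLogger`` when Spinning Up is not installed.

.. code-block:: python

    try:
        from spinup.utils.logx import EpochLogger
    except ImportError:
        # Stand-in stub so the sketch runs without Spinning Up installed;
        # it only mirrors the method names used below.
        class EpochLogger:
            def __init__(self, **kwargs):
                self._buf = {}
            def save_config(self, config):
                self._config = config
            def store(self, **kwargs):
                for key, val in kwargs.items():
                    self._buf.setdefault(key, []).append(val)
            def log_tabular(self, key, val=None, with_min_and_max=False):
                pass
            def dump_tabular(self):
                self._buf.clear()

    def compute_loss(theta):
        # Hypothetical diagnostic: squared distance of a scalar
        # parameter from a target value of 3.0.
        return (theta - 3.0) ** 2

    logger = EpochLogger()
    logger.save_config({'lr': 0.1, 'epochs': 3})    # hyperparameters -> JSON

    theta, lr = 0.0, 0.1
    for epoch in range(3):
        for step in range(50):
            grad = 2 * (theta - 3.0)                # gradient of the toy loss
            theta -= lr * grad
            logger.store(Loss=compute_loss(theta))  # accumulate per-step values
        # (a real script would also call logger.save_state(...) once per epoch)
        logger.log_tabular('Epoch', epoch)
        logger.log_tabular('Loss', with_min_and_max=True)
        logger.dump_tabular()                       # write epoch stats, clear buffer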
The preceding example was given in Tensorflow. For PyTorch, everything is the same except for lines 42-43: instead of ``logger.setup_tf_saver``, you would use ``logger.setup_pytorch_saver``, and you would pass it a PyTorch module (the network you are training) as an argument. The behavior of ``logger.save_state`` is the same as in the Tensorflow case: each time it is called, it saves the latest version of the PyTorch module.
.. admonition:: You Should Know

    Several algorithms in RL are easily parallelized by using MPI to average gradients and/or other key quantities. The Spinning Up loggers are designed to be well-behaved when using MPI: things will only get written to stdout and to file from the process with rank 0. But information from other processes isn't lost if you use the ``EpochLogger``: everything which is passed into ``EpochLogger`` via ``store``, regardless of which process it's stored in, gets used to compute average/std/min/max values for a diagnostic.
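The pooling behavior can be illustrated without running MPI: the statistics reported for a diagnostic are equivalent to what you would get by concatenating every worker's stored values into one list. A self-contained sketch, with the per-worker value lists invented for illustration:

.. code-block:: python

    import math

    # Hypothetical values stored via epoch_logger.store(...) on two workers.
    worker_values = {
        0: [1.0, 2.0, 3.0],   # rank 0
        1: [4.0, 5.0, 6.0],   # rank 1
    }

    # The reported statistics behave as if all values were pooled together.
    pooled = [v for vals in worker_values.values() for v in vals]

    mean = sum(pooled) / len(pooled)
    std = math.sqrt(sum((v - mean) ** 2 for v in pooled) / len(pooled))
    print(mean, round(std, 2), min(pooled), max(pooled))   # 3.5 1.71 1.0 6.0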
.. autoclass:: spinup.utils.logx.Logger
    :members:

    .. automethod:: spinup.utils.logx.Logger.__init__
.. autoclass:: spinup.utils.logx.EpochLogger
    :show-inheritance:
    :members:
To load an actor-critic model saved by a PyTorch Spinning Up implementation, run:
.. code-block:: python

    ac = torch.load('path/to/model.pt')
When you use this method to load an actor-critic model, you can minimally expect it to have an ``act`` method that allows you to sample actions from the policy, given observations:
.. code-block:: python

    actions = ac.act(torch.as_tensor(obs, dtype=torch.float32))
.. autofunction:: spinup.utils.logx.restore_tf_graph
When you use this method to restore a graph saved by a Tensorflow Spinning Up implementation, you can minimally expect it to include the following:
=======  ==================================================================
Key      Value
=======  ==================================================================
``x``    Tensorflow placeholder for state input.
``pi``   Samples an action from the agent, conditioned on states in ``x``.
=======  ==================================================================
The relevant value functions for an algorithm are also typically stored. For details of what else gets saved by a given algorithm, see its documentation page.