Commit

update docs

ppwwyyxx committed Oct 16, 2018
1 parent 1871b61 commit e9363df
Showing 4 changed files with 44 additions and 6 deletions.
16 changes: 13 additions & 3 deletions docs/tutorial/extend/callback.md
@@ -79,11 +79,15 @@ You can overwrite any of the following methods in the new callback:
return tf.train.SessionRunArgs(fetches=my_op)
```

- The training loops would become `sess.run([training_op, my_op])`.
+ The training loops would become equivalent to `sess.run([training_op, my_op])`.

However, if you write `my_op.run()` in `_trigger_step`, the training loop would become
`sess.run(training_op); sess.run(my_op);`.
The difference usually matters; choose carefully.

+ If you want to run ops that depend on your inputs, it's better to run them
+ __along with__ the training iteration, to avoid wasting a datapoint and
+ messing up the hooks of the `InputSource`.

* `_trigger_step(self)`

@@ -107,7 +111,13 @@ You can overwrite any of the following methods in the new callback:
* Access tensors / ops (details mentioned above):
* For existing tensors/ops created in the tower, access them through [self.trainer.towers](../../modules/train.html#tensorpack.train.TowerTrainer.towers).
* Extra tensors/ops have to be created in the `_setup_graph` callback method.
- * Access the current graph and session by `self.trainer.graph` and `self.trainer.sess`.
+ * Access the current graph and session by `self.trainer.graph` and
+   `self.trainer.sess`, `self.trainer.hooked_sess`.
+   Note that calling `(hooked_)sess.run` to evaluate tensors may have unexpected
+   effects in certain scenarios.
+   In general, use `sess.run` to evaluate tensors that do not depend on the inputs,
+   and use `_{before,after}_run` to evaluate input-dependent tensors together with
+   the training iteration.
* Write stuff to the monitor backend, by `self.trainer.monitors.put_xxx`.
The monitors might direct your events to a TensorFlow events file, a JSON file, stdout, etc.
You can access history monitor data as well. See the docs for [Monitors](../../modules/callbacks.html#tensorpack.callbacks.Monitors)
@@ -118,7 +128,7 @@ You can overwrite any of the following methods in the new callback:
### Typical Steps about Writing/Using a Callback

* Define the callback in `__init__`, prepare for it in `_setup_graph, _before_train`.
- * Know whether you want to do something __along with__ the session run or not.
+ * Know whether you want to do something __along with__ the training iterations or not.
If yes, implement the logic with `_{before,after}_run`.
Otherwise, implement in `_trigger`, or `_trigger_step`.
* You can choose to only implement "what to do", and leave "when to do" to
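To make the fused-versus-separate distinction in the callback.md hunk above concrete, here is a minimal sketch against tensorpack's `Callback` API; the counter op standing in for `my_op` is hypothetical:

```python
import tensorflow as tf
from tensorpack.callbacks import Callback

class FusedOp(Callback):
    """Runs my_op inside the same sess.run as the training op."""
    def _setup_graph(self):
        # Extra ops must be created here, while the graph is still open.
        counter = tf.get_variable('my_counter', shape=(), dtype=tf.int64,
                                  initializer=tf.zeros_initializer(), trainable=False)
        self.my_op = tf.assign_add(counter, 1, name='my_op')

    def _before_run(self, ctx):
        # Fused with the iteration: sess.run([training_op, my_op])
        return tf.train.SessionRunArgs(fetches=self.my_op)

class SeparateOp(Callback):
    """Runs my_op in its own sess.run after each training step."""
    def _setup_graph(self):
        counter = tf.get_variable('my_counter2', shape=(), dtype=tf.int64,
                                  initializer=tf.zeros_initializer(), trainable=False)
        self.my_op = tf.assign_add(counter, 1, name='my_op2')

    def _trigger_step(self):
        # Separate call: sess.run(training_op); sess.run(my_op)
        self.my_op.run()
```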
6 changes: 5 additions & 1 deletion docs/tutorial/inference.md
@@ -51,12 +51,16 @@ with TowerContext('', is_training=False):
training settings (queues, iterators, summaries, evaluations, multi-gpu replications).
Therefore it is usually wrong to import a training metagraph for inference.
It's especially error-prone to load a metagraph on top of a non-empty graph.
+ The potential name conflicts between the current graph and the nodes in the
+ metagraph can lead to esoteric bugs or sometimes completely ruin the model.
It's also very common to change the graph for inference.
For example, you may need a different data layout for CPU inference,
or you may need placeholders in the inference graph (which may not even exist in
the training graph). However metagraph is not designed to be easily modified at all.
- To do inference, it's best to recreate a clean graph (and save it if needed) by yourself.
+ Due to the above reasons, to do inference, it's best to recreate a clean graph (and save it if needed) by yourself.
```

### Step 2: load the checkpoint
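A sketch of what "recreate a clean graph" can look like in TF 1.x, following the pattern in inference.md; the `build_prediction` function, input shape, and checkpoint path are all hypothetical:

```python
import numpy as np
import tensorflow as tf
from tensorpack.tfutils.tower import TowerContext

def build_prediction(image):
    # Hypothetical stand-in for the model's forward pass: rebuild only what
    # inference needs, with no queues, summaries, or optimizer.
    return tf.layers.dense(tf.layers.flatten(image), 10, name='logits')

graph = tf.Graph()
with graph.as_default():
    # An inference-only placeholder, which may not exist in the training graph.
    image = tf.placeholder(tf.float32, [None, 224, 224, 3], name='input')
    with TowerContext('', is_training=False):
        logits = build_prediction(image)
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    saver.restore(sess, '/path/to/checkpoint')  # hypothetical checkpoint path
    preds = sess.run(logits, feed_dict={image: np.zeros((1, 224, 224, 3))})
```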
4 changes: 3 additions & 1 deletion tensorpack/callbacks/inference_runner.py
@@ -126,7 +126,9 @@ def __init__(self, input, infs, tower_name='InferenceTower', tower_func=None, de
device (int): the device to use
"""
if isinstance(input, DataFlow):
- input = FeedInput(input, infinite=True)  # TODO a better way to handle inference size
+ # use infinite=False so that a dataflow without size will stop normally
+ # TODO a better way to handle inference size
+ input = FeedInput(input, infinite=False)
assert isinstance(input, InputSource), input
assert not isinstance(input, StagingInput), input
self._tower_name = tower_name
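For context on the `infinite=False` change, a usage sketch: a plain `DataFlow` passed to `InferenceRunner` now gets wrapped so that iteration stops once the dataflow is exhausted. The dataflow contents and the `'cost'` tensor name are hypothetical:

```python
from tensorpack.callbacks import InferenceRunner, ScalarStats
from tensorpack.dataflow import DataFromList

# A hypothetical tiny validation dataflow.
val_df = DataFromList([[0.1], [0.2], [0.3]])

# InferenceRunner wraps val_df in FeedInput(val_df, infinite=False), so a
# dataflow without a known size is iterated until exhaustion and then stops
# normally, instead of looping forever.
cb = InferenceRunner(val_df, [ScalarStats('cost')])
```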
24 changes: 23 additions & 1 deletion tensorpack/train/base.py
@@ -104,13 +104,35 @@ class Trainer(object):
"""
The ``tf.Session`` object the trainer is using.
Available after :meth:`initialize()`.
+ Using ``trainer.sess.run`` to evaluate tensors that depend on the inputs
+ can lead to unexpected effects:
+ for example, if you use ``trainer.sess.run`` to evaluate a tensor that depends on
+ inputs coming from a ``StagingArea``,
+ the call will take a datapoint from the ``StagingArea``, leaving it empty
+ and making the training hang.
"""

hooked_sess = None
"""
The ``tf.train.MonitoredSession`` object the trainer is using.
- It contains all the hooks the callbacks have registered.
+ It contains all the ``before_run/after_run`` hooks the callbacks have registered.
It is used for running the training iterations.
Available after :meth:`initialize()`.
+ Note that using ``hooked_sess.run`` will evaluate all the hooks,
+ just like running a training iteration. It may do the following:
+ 1. Take a datapoint from the InputSource
+ 2. Increase the global_step
+ 3. Evaluate some summaries
+ Typically you do not want to use ``hooked_sess.run`` in callbacks,
+ because it is meant for the "training iteration". If you just want to evaluate
+ some tensors, use ``sess.run`` if the tensors do not depend on the inputs,
+ or more generally, use ``before_run/after_run`` to evaluate the tensors **along with**
+ the training iterations.
"""

def __init__(self):
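A minimal sketch of the rule of thumb stated in these docstrings, inside a callback; the tensor names are hypothetical:

```python
import tensorflow as tf
from tensorpack.callbacks import Callback

class EvalSomeTensors(Callback):
    def _setup_graph(self):
        # Hypothetical tensors, looked up by name in the existing graph.
        self.lr = self.trainer.graph.get_tensor_by_name('learning_rate:0')  # input-independent
        self.loss = self.trainer.graph.get_tensor_by_name('tower0/loss:0')  # input-dependent

    def _before_run(self, ctx):
        # Input-dependent tensor: fetch it along with the training iteration,
        # so no extra datapoint is consumed and no hooks run twice.
        return tf.train.SessionRunArgs(fetches=self.loss)

    def _after_run(self, ctx, run_values):
        self.last_loss = run_values.results

    def _trigger_step(self):
        # Input-independent tensor: a plain sess.run is safe here, while
        # hooked_sess.run would act like one extra training iteration.
        lr = self.trainer.sess.run(self.lr)
```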
