Add more details for CTC layer, fix CTC evalutor and add their interface test (#74)

* Add some comments for CTC layer and fix CTC evalutors, also add interface test
qingqing01 authored and reyoung committed Sep 14, 2016
1 parent 9300678 commit 9f3cbed
Showing 4 changed files with 31 additions and 7 deletions.
4 changes: 2 additions & 2 deletions doc/build/contribute_to_paddle.md
@@ -25,7 +25,7 @@ repo or just head straight to the command line:

```shell
# Clone your fork to your local machine
-git clone git@github.com:USERNAME/paddle.git
+git clone git@github.com:USERNAME/Paddle.git
```
Then you can start to develop.

@@ -52,7 +52,7 @@ To do this, you'll need to add a remote at first:
# see the current configured remote repository
git remote -v
# add upstream repository
-git remote add upstream https://github.com/paddle/paddle.git
+git remote add upstream https://github.com/baidu/Paddle.git
# verify the new upstream
git remote -v
```
10 changes: 7 additions & 3 deletions python/paddle/trainer_config_helpers/evaluators.py
@@ -94,7 +94,7 @@ def evaluator_base(
Batch=200 samples=20000 AvgCost=0.679655 CurrentCost=0.662179 Eval:
classification_error_evaluator=0.4486
CurrentEval: ErrorRate=0.3964
:param input: Input layers, an object of LayerOutput or a list of
LayerOutput.
:type input: list|LayerOutput
@@ -296,6 +296,7 @@ def precision_recall_evaluator(
@wrap_name_default()
def ctc_error_evaluator(
input,
+label,
name=None,
):
"""
@@ -305,16 +306,19 @@ def ctc_error_evaluator(
.. code-block:: python
-eval = ctc_error_evaluator(input)
+eval = ctc_error_evaluator(input=input, label=lbl)
:param name: Evaluator name.
:type name: None|basestring
:param input: Input Layer.
:type input: LayerOutput
+:param label: input label, which is a data_layer.
+:type label: LayerOutput
"""
evaluator_base(name=name,
type="ctc_edit_distance",
-input=input)
+input=input,
+label=label)

@evaluator(EvaluatorAttribute.FOR_CLASSIFICATION)
@wrap_name_default()
15 changes: 13 additions & 2 deletions python/paddle/trainer_config_helpers/layers.py
@@ -2944,7 +2944,7 @@ def linear_comb_layer(weights, vectors, size, name=None):
.. math::
-z = x^T Y
+z = x^\mathrm{T} Y
In this formula:
- :math:`x`: weights
@@ -3064,6 +3064,17 @@ def ctc_layer(input, label, size, name=None, norm_by_times=False):
classification task. That is, for sequence labeling problems where the
alignment between the inputs and the target labels is unknown.
+More details can be found by referring to `Connectionist Temporal
+Classification: Labelling Unsegmented Sequence Data with Recurrent
+Neural Networks <http://machinelearning.wustl.edu/mlpapers/paper_files/icml2006_GravesFGS06.pdf>`_
+Note:
+Considering the 'blank' label needed by CTC, you need to use
+(num_classes + 1) as the input size. num_classes is the category number.
+And the 'blank' is the last category index. So the size of 'input' layer, such as
+fc_layer with softmax activation, should be num_classes + 1. The size of ctc_layer
+should also be num_classes + 1.
The simple usage:
.. code-block:: python
@@ -3077,7 +3088,7 @@ def ctc_layer(input, label, size, name=None, norm_by_times=False):
:type input: LayerOutput
:param label: The data layer of label with variable length.
:type label: LayerOutput
-:param size: category numbers.
+:param size: category numbers + 1.
:type size: int
:param name: The name of this layer. It is optional.
:type name: string|None
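
The note added to the ctc_layer docstring sets the layer size to num_classes + 1 and reserves the last index for 'blank'. A minimal sketch of that convention in plain Python follows. It is not PaddlePaddle code: `best_path_decode` is a hypothetical helper, and greedy best-path decoding is a standard CTC technique used here only for illustration, not something this commit implements.

```python
def best_path_decode(frame_probs, num_classes):
    """Greedy CTC decoding, assuming 'blank' is the last index (num_classes).

    frame_probs: list of per-frame probability lists, each of length
    num_classes + 1 (real categories 0..num_classes-1, then 'blank').
    """
    blank = num_classes
    decoded = []
    prev = None
    for probs in frame_probs:
        if len(probs) != num_classes + 1:
            raise ValueError("each frame needs num_classes + 1 scores")
        # Pick the most probable index for this frame.
        best = max(range(len(probs)), key=probs.__getitem__)
        # Collapse consecutive repeats, then drop blanks.
        if best != prev and best != blank:
            decoded.append(best)
        prev = best
    return decoded


if __name__ == "__main__":
    frames = [
        [0.1, 0.7, 0.1, 0.1],  # class 1
        [0.1, 0.7, 0.1, 0.1],  # class 1 again (repeat, collapsed)
        [0.0, 0.0, 0.0, 1.0],  # blank (index 3 == num_classes)
        [0.1, 0.7, 0.1, 0.1],  # class 1 after a blank (kept)
        [0.8, 0.1, 0.1, 0.0],  # class 0
    ]
    print(best_path_decode(frames, num_classes=3))
```

The blank frame is what lets the same class appear twice in a row in the decoded output, which is why the extra index has to exist in both the softmax layer and the CTC layer.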
@@ -34,6 +34,15 @@

outputs(classification_cost(out, data_layer(name="label", size=num_classes)))

+# for ctc
+tmp = fc_layer(input=x1,
+size=num_classes + 1,
+act=SoftmaxActivation())
+ctc = ctc_layer(input=tmp,
+label=y,
+size=num_classes + 1)
+ctc_eval = ctc_error_evaluator(input=ctc, label=y)

emailweixu (Collaborator) commented on Sep 14, 2016:

I think the input should be tmp for ctc_error_evaluator. Please double check.
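
For context on the comment above: the evaluator is registered with type "ctc_edit_distance", which suggests it scores a decoded label sequence against the reference by edit distance. That would need the network's per-frame predictions, which may be why the softmax output tmp is suggested instead of the cost layer ctc. A rough sketch of such a metric in plain Python, assuming Levenshtein distance normalized by reference length; `edit_distance` and `sequence_error_rate` are hypothetical names, and the evaluator's actual normalization is not shown in this diff.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two label sequences."""
    # prev_row[j] is the distance between the first i-1 items of hyp
    # and the first j items of ref.
    prev_row = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        row = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            row.append(min(prev_row[j] + 1,          # delete h from hyp
                           row[j - 1] + 1,           # insert r into hyp
                           prev_row[j - 1] + cost))  # match or substitute
        prev_row = row
    return prev_row[-1]


def sequence_error_rate(hyp, ref):
    """Edit distance divided by reference length (an assumed normalization)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(hyp, ref) / float(len(ref))


if __name__ == "__main__":
    decoded = [1, 1, 0]   # e.g. a decoded prediction
    label = [1, 0]        # ground-truth label sequence
    print(edit_distance(decoded, label), sequence_error_rate(decoded, label))
```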


settings(
batch_size=10,
learning_rate=2e-3,