# Training an Acoustic Model with the Connectionist Temporal Classification (CTC) Criterion
The CNTK implementation of CTC is *parallel* and follows the paper by A. Graves et al., *"Connectionist temporal classification: labeling unsegmented sequence data with recurrent neural networks"*. Readers are expected to be familiar with the content and notation of that paper.
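
As a reminder of the notation, CTC maps a frame-level path to a labelling by first merging repeated symbols and then removing blanks. The following is a minimal sketch of that mapping in plain Python. The function name `ctc_collapse` and the `-` blank symbol are assumptions made for illustration only and are not part of CNTK.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol, used only in this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level path to a labelling: merge repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if symbol != blank]


# The blank between the two runs of 'a' keeps them as separate labels.
print(ctc_collapse(list("-aa--ab-")))  # ['a', 'a', 'b']
```

Many different paths collapse to the same labelling, which is why the criterion sums over all of them rather than requiring a frame-level alignment.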

## Data Preparation
CNTK consumes Acoustic Model (AM) training data in HTK/MLF format and typically expects three input files:
* [SCP file with features](https://github.com/Microsoft/CNTK/blob/master/Tests/EndToEndTests/Speech/Data/glob_0000.scp)
* [MLF file with labels](https://github.com/Microsoft/CNTK/blob/master/Tests/EndToEndTests/Speech/Data/glob_0000.mlf)
* [States list file](https://github.com/Microsoft/CNTK/blob/master/Tests/EndToEndTests/Speech/Data/state_ctc.list)
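
To get a feel for what the first two files contain, the sketch below parses one SCP entry and a small MLF snippet. It assumes that an SCP entry has the form `key=archive[first,last]` with an inclusive frame range, and that the MLF follows the usual HTK layout (a `#!MLF!#` header, a quoted utterance name, `start end label` lines, and a closing `.`). The sample strings, label names, and helper names are hypothetical and are not taken from the linked files.

```python
import re

# Assumed SCP entry shape: <utterance key>=<archive path>[<first frame>,<last frame>]
SCP_ENTRY = re.compile(r"^(?P<key>[^=]+)=(?P<path>.+)\[(?P<first>\d+),(?P<last>\d+)\]$")


def parse_scp_line(line):
    """Split one SCP entry into key, archive path, and frame count."""
    match = SCP_ENTRY.match(line.strip())
    if match is None:
        raise ValueError("unexpected SCP entry: " + line)
    first, last = int(match.group("first")), int(match.group("last"))
    # Assumption: the frame range is inclusive on both ends.
    return match.group("key"), match.group("path"), last - first + 1


def parse_mlf(text):
    """Return {utterance name: [(start, end, label), ...]} from MLF text."""
    utterances, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):
            # A quoted line opens a new utterance block.
            current = utterances.setdefault(line.strip('"'), [])
        elif line == ".":
            # A single dot closes the current utterance block.
            current = None
        elif current is not None:
            start, end, label = line.split()[:3]
            current.append((int(start), int(end), label))
    return utterances


# Hypothetical sample data, not copied from glob_0000.scp or glob_0000.mlf.
sample_scp = "utt1.mfc=features/archive.chunk[0,99]"
sample_mlf = '#!MLF!#\n"utt1.lab"\n0 300000 s_a\n300000 1000000 s_b\n.\n'

print(parse_scp_line(sample_scp))
print(parse_mlf(sample_mlf))
```

In the tutorial itself this parsing is done by the CNTK deserializers, so the sketch is only meant as a way to inspect the files by hand.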

The example state list file contains the CTC blank label "s_blank" as the last entry, i.e. at index 132.
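
The relationship between the list length and the blank index can be checked with a few lines of Python. This is a sketch under the assumption that the state list holds one label name per line. The `state_{i}` names stand in for the real entries, and `blank_index` is a hypothetical helper.

```python
def blank_index(state_names, blank_name="s_blank"):
    """Return the index of the CTC blank, checking that it is the last entry."""
    index = state_names.index(blank_name)
    if index != len(state_names) - 1:
        raise ValueError("blank label is expected to be the last entry")
    return index


# Stand-in for the lines of the state list: 132 hypothetical names plus the blank.
state_names = ["state_{0}".format(i) for i in range(132)] + ["s_blank"]

print(len(state_names), blank_index(state_names))  # 133 132
```

With the blank at index 132, the list has 133 entries, which matches the `label_dimension` used in the next section.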

## Feature Input Definition


In [2]:
import os
import cntk as C
import numpy as np

# Select the right target device
if 'TEST_DEVICE' in os.environ:
    if os.environ['TEST_DEVICE'] == 'cpu':
        C.device.try_set_default_device(C.device.cpu())
    else:
        C.device.try_set_default_device(C.device.gpu(0))

data_dir = os.path.join("..", "Tests", "EndToEndTests", "Speech", "Data")
print("Current directory {0}".format(os.getcwd()))
if os.path.realpath(data_dir) != os.path.realpath(os.getcwd()):
    print("Changing to data directory {0}".format(data_dir))
    os.chdir(data_dir)
       
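feature_dimension = 33
# Sequence inputs carry both a batch axis and a sequence axis, which Recurrence requires
# (newer CNTK releases spell this C.sequence.input_variable)
feature = C.sequence.input((feature_dimension))

# 133 = number of entries in the state list, including the blank at index 132
label_dimension = 133
label = C.sequence.input((label_dimension))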

train_feature_filepath = "glob_0000.scp"
train_label_filepath = "glob_0000.mlf"
mapping_filepath = "state_ctc.list"

# One deserializer per stream: HTK features from the SCP file, labels from the MLF file
train_feature_stream = C.io.HTKFeatureDeserializer(
    C.io.StreamDefs(amazing_feature = C.io.StreamDef(shape = feature_dimension, scp = train_feature_filepath)))
train_label_stream = C.io.HTKMLFDeserializer(
    mapping_filepath,
    C.io.StreamDefs(awesome_label = C.io.StreamDef(shape = label_dimension, mlf = train_label_filepath)))

# frame_mode = False makes the reader return whole sequences rather than individual frames
train_data_reader = C.io.MinibatchSource([train_feature_stream, train_label_stream], frame_mode = False)
train_input_map = {feature: train_data_reader.streams.amazing_feature, label: train_data_reader.streams.awesome_label}


Current directory D:\CNTK\CNTK\Tutorials
Changing to data directory ..\Tests\EndToEndTests\Speech\Data


## Normalize Features and Define a Network with LSTM Layers

In [7]:
feature_mean = np.fromfile(os.path.join("GlobalStats", "mean.363"), dtype=float, count=feature_dimension)
feature_inverse_stddev = np.fromfile(os.path.join("GlobalStats", "var.363"), dtype=float, count=feature_dimension)
feature_normalized = (feature - feature_mean) * feature_inverse_stddev

with C.default_options(enable_self_stabilization=False, activation=C.sigmoid):
    model = C.layers.Sequential([
        C.layers.For(range(3), lambda: C.layers.Recurrence(C.layers.LSTM(1024))),
        C.layers.Dense(label.shape[0])
    ])(feature_normalized)

This cell only builds if `feature` is a sequence input. With a plain `C.input`, which carries a batch axis but no sequence axis, `Recurrence` fails with:

RuntimeError: PastValue/FutureValue Function 'PastValue: Output('Block5879_Output_0', [#], [1024]) -> Output('PastValue5647_Output_0', [???], [???])': Input operand 'Output('Block5879_Output_0', [#], [1024])' with #dynamic axes != 2 (1 sequence axis and 1 batch axis) is not supported.

The `[#]` in the message shows that the operand has only the batch axis. Declaring both inputs with `C.sequence.input` gives them the sequence axis that the recurrence steps along.
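
The feature normalization used in this section, `(feature - feature_mean) * feature_inverse_stddev`, is applied independently to each feature dimension. The sketch below shows the same arithmetic on a single frame in plain Python. The three-dimensional frame and the statistics are made-up values, and `normalize_frame` is a hypothetical helper rather than a CNTK function.

```python
def normalize_frame(frame, mean, inverse_stddev):
    """Apply (x - mean) * inverse_stddev to each feature dimension."""
    if not (len(frame) == len(mean) == len(inverse_stddev)):
        raise ValueError("frame and statistics must have the same dimension")
    return [(x - m) * s for x, m, s in zip(frame, mean, inverse_stddev)]


# Made-up values for a 3-dimensional frame; the tutorial uses 33 dimensions.
frame = [2.0, 4.0, 6.0]
mean = [1.0, 4.0, 8.0]
inverse_stddev = [0.5, 2.0, 0.25]

print(normalize_frame(frame, mean, inverse_stddev))  # [0.5, 0.0, -0.5]
```

Storing the inverse standard deviation lets the network multiply instead of divide, and the same per-dimension statistics are reused for every frame of every utterance.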

