# Tutorial

In [1]:
# Loading all helper functions
import sys
sys.path.insert(0, '..')
from src.models.data_utils import *
from src.models.model_utils import *
from src.models.train_model import *
from tqdm import tqdm

## General info:

### Code structure
- Helper functions are stored in src/models
- Data is stored in data/processed

### Corpora and annotation
#### Dicta-Sign
Annotations include (see more detail in the ortolang repo):
* **fls**: fully-lexical signs, encoded as categorical values (gloss indices)
* **PT**:    pointing signs, encoded as binary
* **PT_PRO1**, **PT_PRO2**, **PT_PRO3**, **PT_LOC**, **PT_DET**, **PT_LBUOY**, **PT_BUOY**: sub-categories for pointing signs, encoded as binary
* **DS**:    depicting signs, encoded as binary
* **DSA**, **DSG**, **DSL**, **DSM**, **DSS**, **DST**, **DSX**: sub-categories for depicting signs, encoded as binary
* **FBUOY**: fragment buoys, encoded as binary
* **N**:     numbering signs, encoded as binary
* **FS**:    fingerspelling signs, encoded as binary

#### NCSLGR
Annotations include (all data is binary):
* **lexical_with_ns_not_fs**: lexical signs, including numbering signs but excluding fingerspelling signs
* **fingerspelling**, **fingerspelled_loan_signs**: fingerspelling signs, finglerspelling loan signs
* **IX_1p**, **IX_2p**, **IX_3p**, **IX_loc**: sub-categories for pointing signs
* **POSS**, **SELF**: possessive pronouns
* **DCL**, **LCL**, **SCL**, **BCL**, **ICL**, **BPCL**, **PCL**: sub-categories for classifier signs (i.e. depicting signs)
* **gesture**: culturally shared gestures
* **part_indef**
* **other**

### Type of input features
Originally, this code was designed around preprocessed features for each frame. Possible features types are : 
- **bodyFace_2D_raw_hands_OP**
- **bodyFace_2D_raw_hands_OP_HS**
- **bodyFace_2D_raw_hands_HS**
- **bodyFace_2D_raw_hands_None**
- **bodyFace_2D_features_hands_OP**
- **bodyFace_2D_features_hands_OP_HS**
- **bodyFace_2D_features_hands_HS**
- **bodyFace_2D_features_hands_None**
- **bodyFace_3D_raw_hands_OP**
- **bodyFace_3D_raw_hands_OP_HS**
- **bodyFace_3D_raw_hands_HS**
- **bodyFace_3D_raw_hands_None**
- **bodyFace_3D_features_hands_OP**
- **bodyFace_3D_features_hands_OP_HS**
- **bodyFace_3D_features_hands_HS**
- **bodyFace_3D_features_hands_None**

which correspond to 2D or 3D data, raw OpenPose or preprocessed body and face data, including or excluding hand shape estimates, including or excluding OpenPose hand data.

The main data function `get_data_concatenated` requires a features dictionary, which can be obtained with the `getFeaturesDict` function.

Recently, we also added direct image input to the model, but it has not been tested thoroughly.

### Model outputs
<a id='list_examples'></a>

#### A unique output (`output_form='sign_types'`)
This is the setting that was tested thoroughly for the thesis manuscript. This setting can be used when:
- one wants to recognize a binary linguistic descriptor
    - **[Example 1](#ex1)**: Y1 = [other, depicting signs]
    - **[Example 1 bis](#ex1bis)**: Y1bis = [other, depicting signs of type A and G assembled into one category]
    - **[Example 2](#ex2)**: Y2 = [other, lexical signs (all)]
- one wants to recognize a certain number of lexical signs
    - **[Example 2 bis](#ex2bis)**: Y2bis = [other, lexical signs (indices 43015, 43038, 42318, 43357, 43116, 42719)] (binary output)
    - **[Example 2 ter](#ex2ter)**: Y2ter = [other, lexical sign 43015, lexical sign 43038, lexical sign 42318, lexical sign 43357, lexical sign 43116, lexical sign 42719] (categorical output)
- one wants to recognize sign types as per Yanovich et. al. (the most probable sign type for each frame)
    - **[Example 3](#ex3)**: Y3 = [other, lexical signs,  pointing signs, depicting signs, fragment buoys]. 

#### Multiple mixed output (`output_form='mixed'`)
This setting was not tested thoroughly. This setting can be used when:
- one wants to recognize several (N) mixed linguistic descriptors in parallel, possibly simultaneously true. Each descriptor includes a 'garbage/other' class.
    - **[Example 4](#ex4)** (N = 4) : Y4 = [Y4_1 : lexical signs (categorical with 6 different signs), Y4_2 : pointing signs, Y4_3 : depicting signs, Y4_4 : fragment buoys]
    - **[Example 5](#ex5)** (N = 2) : Y5 = [Y5_1 : pointing signs to PRO1/2/3, Y5_2 : lexical signs (all)]

## Metrics

### Generally used during training
- Frame-wise accuracy, precision, recall, F1

### Generally used to evaluate the quality of predictions 
- Frame-wise accuracy, precision, recall, F1
- Unit-wise P*, R*, F1* with different margins (see [thesis](https://tel.archives-ouvertes.fr/tel-03082011/document) for detail)
- Integral values Ip, Ir, Ipr (see [thesis](https://tel.archives-ouvertes.fr/tel-03082011/document) for detail)

## Getting help on a function

In [2]:
# Use help(function_name)
# For instance:

help(get_raw_annotation_from_file)

Help on function get_raw_annotation_from_file in module src.models.data_utils:

get_raw_annotation_from_file(corpus, from_notebook=False)
    Gets raw annotation from data file
    
    Inputs:
        corpus: 'DictaSign' or 'NCSLGR'
        from_notebook: True if used in Jupyter notebook
    
    Outputs:
        Annotation data



## Main helper function for data handling (`data_utils.py`): `get_data_concatenated`

This is the main function, which enables to extract data in usable format for training.

`get_data_concatenated` returns [X_features, X_frames], Y, idx_trueData or [X_features, X_frames], Y (depending on return_idx_trueData)

#### Outputs
- **X_features** is a numpy array of size [1, total_time_steps, features_number] containing all retained preprocessed features for all retained frames
- **X_frames** is simply a list of paths for all retained frames (frames cannot be stored in memory directly, and will have to be read during training thanks to frames paths)
- **Y** is the annotation data (i.e. ground truth data) in the desired format

#### Inputs
- **corpus** (string)
- **output_form**:
    - 'mixed' if different and separated Outputs
    - 'sign_types' if annotation is only a binary matrix of sign types
- **types**: a list of lists of original names that are used to compose final outputs
- **nonZero**: a list of lists of nonzero values to consider. If 4 outputs with all nonZero values should be considered, nonZero=[[],[],[],[]]
- **binary**: only considered when output_form=mixed. It's a list (True/False) indicating whether the values should be categorical or binary
- **features_dict**: a dictionary indication which features to keep ; e.g.: {'features_HS':np.arange(0, 420), 'features_HS_norm':np.array([]), 'raw':np.array([]), 'raw_norm':np.array([])}
- **preloaded_features**: if features are already loaded, in the format of a list (features for each video)
- **provided_annotation**: raw annotation data (not needed)
- **video_indices**: numpy array for a list of videos
- **separation**: in order to separate consecutive videos
- **from_notebook**: if notebook script, data is in parent folder
- **return_idx_trueData**: if True, returns a binary vector with 0 where separations are
- **features_type**: 'features', 'frames', 'both'            
- **frames_path_before_video**: video frames are supposed to be in folders, like '/localHD/DictaSign/convert/img/DictaSign_lsf_S7_T2_A10',
- **empty_image_path**: path of a white frame


## Main helper function for model handling (`model_utils.py`): `get_model`

This is the main function, which enables to obtain the Keras model.

`get_model` returns a Keras model

#### Outputs
- a Keras model

#### Inputs
- **output_names**: list of outputs (strings)
- **output_classes**: list of number of classes of each output type
- **output_weights**: list of weights for each_output
- **conv** (bool): if True, applies convolution on input
- **conv_filt**: number of convolution filters
- **conv_ker**: size of convolution kernel
- **conv_strides**: size of convolution strides
- **rnn_number**: number of recurrent layers
- **rnn_type**: type of recurrent layers (string)
- **rnn_hidden_units**: number of hidden units
- **dropout**: how much dropout (0 to 1)
- **att_in_rnn**: if True, applies attention layer before recurrent layers
- **att_in_rnn_single**: single (shared) attention layer or not
- **att_in_rnn_type** (string): timewise or featurewise attention layer
- **att_out_rnn**: if True, applies attention layer after recurrent layers
- **att_out_rnn_single**: single (shared) attention layer or not
- **att_out_rnn_type** (string): timewise or featurewise attention layer
- **rnn_return_sequences**: if False, only last timestep of recurrent layers is returned
- **classif_local** (bool): whether classification is for each timestep (local) of globally for the sequence
- **mlp_layers_number**: number of additional dense layers
- **mlp_layers_size**: size of additional dense layers
- **optimizer**: gradient optimizer type (string)
- **learning_rate**: learning rate (float)
- **time_steps**: length of sequences (int)
- **features_number**: number of features (int)
- **features_type**: 'features' (1D vector of features), 'frames' (for a CNN processing) or 'both'
- **img_height** and **img_width**: size of CNN input
- **cnnType**: 'resnet', 'vgg' or 'mobilenet'
- **cnnFirstTrainedLayer**: index of first trainable layer in CNN (int)
- **cnnReduceDim**: if greater than 0, size of CNN flattened output is reduced to cnnReduceDim
- **print_summary** (bool)

## Shortcut: script to recognize a unique output on DictaSign

Just use `python src/recognitionUniqueDictaSign.py`

Provided help:

In [3]:
!python ../src/recognitionUniqueDictaSign.py -h

usage: recognitionUniqueDictaSign.py [-h] [--outputName OUTPUTNAME]
                                     [--flsBinary {0,1}]
                                     [--flsKeep [FLSKEEP [FLSKEEP ...]]]
                                     [--comment COMMENT]
                                     [--videoSplitMode {manual,auto}]
                                     [--fractionValid FRACTIONVALID]
                                     [--fractionTest FRACTIONTEST]
                                     [--signerIndependent {0,1}]
                                     [--taskIndependent {0,1}]
                                     [--excludeTask9 {0,1}]
                                     [--tasksTrain [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksValid [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksTest [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [-

See examples below

## Building data and model together, manually, then training

In [4]:
# let us split train/valid/test videos
# In this case we split by signers in a manual fashion

idxTrain, idxValid, idxTest = getVideoIndicesSplitDictaSign(tasksTrain=[],
                                                            tasksValid=[],
                                                            tasksTest=[],
                                                            signersTrain=[0,1,2,3,4,5,6,7,8,9],
                                                            signersValid=[10,11,12],
                                                            signersTest=[13,14,15],
                                                            excludeTask9=False,
                                                            videoSplitMode='manual',
                                                            checkSplits=True,
                                                            checkSets=True,
                                                            from_notebook=True)
print(idxTrain)

Number of videos:
Train: 66
Valid: 10
Test: 18
Total: 94
[47 24 22  6  2 16 48 60  4  7  8 41 63  0 43 19 64 12 65 28 51 13 50 55
 56 36 59 32 33 29 27 14 18 44 38 53 46  5 23 15 45 20 37 34 42 17 21 40
 54  1 25 52 61 62 31 30 58 39 35 49 26  3 11 57  9 10]


In [5]:
# for the below examples to run faster, let us select only a part of all signers
idxTrain, idxValid, idxTest = getVideoIndicesSplitDictaSign(tasksTrain=[],
                                                            tasksValid=[],
                                                            tasksTest=[],
                                                            signersTrain=[0,1,2,3],
                                                            signersValid=[10],
                                                            signersTest=[13],
                                                            excludeTask9=False,
                                                            videoSplitMode='manual',
                                                            checkSplits=True,
                                                            checkSets=True,
                                                            from_notebook=True)
print(idxTrain)

Number of videos:
Train: 32
Valid: 3
Test: 2
Total: 37
[29 21  4 20 26 30  1 23  2 24  8 15 27  9  7  3 14 28 18 17 22  0 19 12
 10 25  5 13 11  6 31 16]


In [6]:
# Getting a dictionary for desired preprocessed features
# In this case we ask for normalized 3Dfeatures_HS (this correspond to a total number of 420 features):

features_dict, features_number = getFeaturesDict(inputType='3Dfeatures_HS', inputNormed=True)

print(features_dict)
print(features_number)

{'features_HS': array([], dtype=float64), 'features_HS_norm': array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,
       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,
       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,
       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,
  

### First category of examples: `output_form = 'sign_types'` (examples 1 to 3)

<a id='ex1'></a>
#### ex. 1 ([back to the list of examples](#list_examples)):
##### Data

In [7]:
[X_feat_train_1, X_frames_train_1], Y_train_1 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['DS']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_1, X_frames_valid_1], Y_valid_1 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['DS']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [8]:
print(X_feat_train_1.shape)
print(X_frames_train_1.shape)
print(Y_train_1.shape)

(1, 371696, 420)
(371696,)
(1, 371696, 2)


As can be seen above, X_feat is a big matrix storing all 420 preprocessed features for each frame. We can print the first 15 features for frame number 192:

In [9]:
print(X_feat_train_1[0,192,0:15])

[0.25372416 0.00730157 0.00393365 0.00697261 0.00035279 0.0107119
 0.15444274 0.01599335 0.0104205  0.01571699 0.00672656 0.27560651
 0.00168156 0.0216126  0.00268968]


In [10]:
X_frames_train_1[192]

'/localHD/DictaSign/convert/img/DictaSign_lsf_S3_T8_B0_front/00193.jpg'

In [11]:
# First depicting frame:
i_DS_one = np.where(Y_train_1[0,:,1]==1)[0][0]
print(i_DS_one)
print(Y_train_1[0,i_DS_one,:])

2367
[0. 1.]


##### Metrics

In [12]:
metrics_notebook = ['acc',  f1K,   precisionK,   recallK]

##### Model

In [13]:
# using only preprocessed features as input:
model_1_features = get_model(output_names=['DS'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')


Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d (Conv1D)              (None, 100, 200)          252200    
_________________________________________________________________
bidirectional (Bidirectional (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_1 (Bidirection (None, 100, 110)          73040     
_________________________________________________________________
time_distributed (TimeDistri (None, 100, 2)            222       
Total params: 438,102
Trainable params: 438,102
Non-trainable params: 0
_________________________________________________________________


In [14]:
# using only frames as input:
model_1_frames = get_model(output_names=['DS'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='frames')


Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 100, 224, 224, 3) 0         
_________________________________________________________________
time_distributed_1 (TimeDist (None, 100, 2048)         23587712  
_________________________________________________________________
bidirectional_2 (Bidirection (None, 100, 110)          925760    
_________________________________________________________________
bidirectional_3 (Bidirection (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_2 (TimeDist (None, 100, 2)            222       
Total params: 24,586,734
Trainable params: 5,464,686
Non-trainable params: 19,122,048
_________________________________________________________________


In [15]:
# using both as input:
model_1_both = get_model(output_names=['DS'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='both')

Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_4 (InputLayer)            [(None, 100, 420)]   0                                            
__________________________________________________________________________________________________
input_5 (InputLayer)            [(None, 100, 224, 22 0                                            
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, 100, 200)     252200      input_4[0][0]                    
__________________________________________________________________________________________________
time_distributed_3 (TimeDistrib (None, 100, 2048)    23587712    input_5[0][0]                    
____________________________________________________________________________________________

##### Training

In [16]:
history = train_model(model_1_features,
                      [X_feat_train_1, X_frames_train_1],
                      Y_train_1,
                      [X_feat_valid_1, X_frames_valid_1],
                      Y_valid_1,
                      batch_size=200,
                      epochs=5,
                      seq_length=100)

Instructions for updating:
Please use Model.fit, which supports generators.
  ...
    to  
  ['...']
  ...
    to  
  ['...']
Train for 19.0 steps, validate for 1 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


#### Shortcut to do it all with a script
(the last three lines are needed when the script is called from the notebook)

In [17]:
!python ../src/recognitionUniqueDictaSign.py --outputName DS \
                                             --epochs 5 \
                                             --batchSize 200 \
                                             --videoSplitMode manual \
                                             --signersTrain 0 1 2 3 \
                                             --signersValid 10 \
                                             --signersTest 13 \
                                             --fromNotebook 1 \
                                             --saveGlobalresults ../reports/corpora/DictaSign/recognitionUnique/global/globalUnique.dat \
                                             --savePredictions ../reports/corpora/DictaSign/recognitionUnique/predictions/ \
                                             --saveModels ../models/corpora/DictaSign/recognitionUnique/

Number of videos:
Train: 32
Valid: 3
Test: 2
Total: 37
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d (Conv1D)              (None, 100, 200)          252200    
_________________________________________________________________
bidirectional (Bidirectional (None, 100, 100)          100400    
_________________________________________________________________
time_distributed (TimeDistri (None, 100, 2)            202       
Total params: 352,802
Trainable params: 352,802
Non-trainable params: 0
_________________________________________________________________
Instructions for updating:
Please use Model.fit, which supports generators.
  ...
    to  
  ['...']
  ...
    to  
  ['...']
Train for 19.0 steps, validate for 1 steps
Epoch 1/5
Epoch 00001: val_f1

<a id='ex1bis'></a>
#### ex. 1 bis ([back to the list of examples](#list_examples)):
##### Data

In [18]:
[X_feat_train_1bis, X_frames_train_1bis], Y_train_1bis =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['DSA', 'DSG']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_1bis, X_frames_valid_1bis], Y_valid_1bis =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['DSA', 'DSG']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [19]:
print(X_feat_train_1bis.shape)
print(X_frames_train_1bis.shape)
print(Y_train_1bis.shape)

(1, 371696, 420)
(371696,)
(1, 371696, 2)


##### Model

In [20]:
# using only preprocessed features as input:
model_1bis_features = get_model(output_names=['DSA-DSG'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_7 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 100, 200)          252200    
_________________________________________________________________
bidirectional_6 (Bidirection (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_7 (Bidirection (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_5 (TimeDist (None, 100, 2)            222       
Total params: 438,102
Trainable params: 438,102
Non-trainable params: 0
_________________________________________________________________


<a id='ex2'></a>
#### ex. 2 ([back to the list of examples](#list_examples)):
##### Data

In [21]:
[X_feat_train_2, X_frames_train_2], Y_train_2 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_2, X_frames_valid_2], Y_valid_2 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls']],
                            nonZero=[[]],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

##### Model

In [22]:
# using only preprocessed features as input:
model_2_features = get_model(output_names=['fls'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_8 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 100, 200)          252200    
_________________________________________________________________
bidirectional_8 (Bidirection (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_9 (Bidirection (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_6 (TimeDist (None, 100, 2)            222       
Total params: 438,102
Trainable params: 438,102
Non-trainable params: 0
_________________________________________________________________


#### Shortcut to do it all with a script
(the last three lines are needed when the script is called from the notebook)

In [23]:
!python ../src/recognitionUniqueDictaSign.py --outputName fls \
                                             --epochs 5 \
                                             --batchSize 200 \
                                             --videoSplitMode manual \
                                             --signersTrain 0 1 2 3 \
                                             --signersValid 10 \
                                             --signersTest 13 \
                                             --fromNotebook 1 \
                                             --saveGlobalresults ../reports/corpora/DictaSign/recognitionUnique/global/globalUnique.dat \
                                             --savePredictions ../reports/corpora/DictaSign/recognitionUnique/predictions/ \
                                             --saveModels ../models/corpora/DictaSign/recognitionUnique/

Number of videos:
Train: 32
Valid: 3
Test: 2
Total: 37
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d (Conv1D)              (None, 100, 200)          252200    
_________________________________________________________________
bidirectional (Bidirectional (None, 100, 100)          100400    
_________________________________________________________________
time_distributed (TimeDistri (None, 100, 2)            202       
Total params: 352,802
Trainable params: 352,802
Non-trainable params: 0
_________________________________________________________________
Instructions for updating:
Please use Model.fit, which supports generators.
  ...
    to  
  ['...']
  ...
    to  
  ['...']
Train for 19.0 steps, validate for 1 steps
Epoch 1/5
Epoch 00001: val_f1

<a id='ex2bis'></a>
#### ex. 2 bis ([back to the list of examples](#list_examples)):
##### Data

In [24]:
flsKept=[43015, 43038, 42318, 43357, 43116, 42719]

In [25]:
import csv

idGloss = {}

with open('Dicta-Sign-LSF_ID.csv', newline='') as csvfile:
    glossreader = csv.reader(csvfile, delimiter=';', quotechar='|')
    for row in glossreader:
        idGloss[row[0]] = row[1]

        N_fls = len(flsKept)
# that correspond to glosses:
for i in flsKept:
    print(idGloss[str(i)])

OUI:VAR
NON1:VAR
COMME/MEME/AUSSI
SUPER/BIEN
CA VEUT DIRE1:VAR
JUSTE1/PRECIS


In [26]:
[X_feat_train_2bis, X_frames_train_2bis], Y_train_2bis =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls']],
                            nonZero=[flsKept],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_2bis, X_frames_valid_2bis], Y_valid_2bis =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls']],
                            nonZero=[flsKept],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [27]:
print(X_feat_train_2bis.shape)
print(X_frames_train_2bis.shape)
print(Y_train_2bis.shape)

(1, 371696, 420)
(371696,)
(1, 371696, 2)


##### Model

In [28]:
# using only preprocessed features as input:
model_2bis_features = get_model(output_names=['fls-binary-43015_43038_42318_43357_43116_42719'],
                    output_classes=[2],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_9 (InputLayer)         [(None, 100, 420)]        0         
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 100, 200)          252200    
_________________________________________________________________
bidirectional_10 (Bidirectio (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_11 (Bidirectio (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_7 (TimeDist (None, 100, 2)            222       
Total params: 438,102
Trainable params: 438,102
Non-trainable params: 0
_________________________________________________________________


#### Shortcut to do it all with a script
(the last three lines are needed when the script is called from the notebook)

In [29]:
!python ../src/recognitionUniqueDictaSign.py --outputName fls \
                                             --flsKept 43015 43038 42318 43357 43116 42719 \
                                             --flsBinary 1 \
                                             --epochs 5 \
                                             --batchSize 200 \
                                             --videoSplitMode manual \
                                             --signersTrain 0 1 2 3 \
                                             --signersValid 10 \
                                             --signersTest 13 \
                                             --fromNotebook 1 \
                                             --saveGlobalresults ../reports/corpora/DictaSign/recognitionUnique/global/globalUnique.dat \
                                             --savePredictions ../reports/corpora/DictaSign/recognitionUnique/predictions/ \
                                             --saveModels ../models/corpora/DictaSign/recognitionUnique/

usage: recognitionUniqueDictaSign.py [-h] [--outputName OUTPUTNAME]
                                     [--flsBinary {0,1}]
                                     [--flsKeep [FLSKEEP [FLSKEEP ...]]]
                                     [--comment COMMENT]
                                     [--videoSplitMode {manual,auto}]
                                     [--fractionValid FRACTIONVALID]
                                     [--fractionTest FRACTIONTEST]
                                     [--signerIndependent {0,1}]
                                     [--taskIndependent {0,1}]
                                     [--excludeTask9 {0,1}]
                                     [--tasksTrain [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksValid [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksTest [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [-

<a id='ex2ter'></a>
#### ex. 2 ter ([back to the list of examples](#list_examples)):
##### Data

In [30]:
[X_feat_train_2ter, X_frames_train_2ter], Y_train_2ter =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls'],['fls'],['fls'],['fls'],['fls'],['fls']],
                            nonZero=[[flsKept[0]],[flsKept[1]],[flsKept[2]],[flsKept[3]],[flsKept[4]],[flsKept[5]]],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_2ter, X_frames_valid_2ter], Y_valid_2ter =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls'],['fls'],['fls'],['fls'],['fls'],['fls']],
                            nonZero=[[flsKept[0]],[flsKept[1]],[flsKept[2]],[flsKept[3]],[flsKept[4]],[flsKept[5]]],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [31]:
print(X_feat_train_2ter.shape)
print(X_frames_train_2ter.shape)
print(Y_train_2ter.shape)

(1, 371696, 420)
(371696,)
(1, 371696, 7)


##### Model

In [32]:
# using only preprocessed features as input:
model_2ter_features = get_model(output_names=['fls-categ-43015_43038_42318_43357_43116_42719'],
                    output_classes=[7],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_10 (InputLayer)        [(None, 100, 420)]        0         
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 100, 200)          252200    
_________________________________________________________________
bidirectional_12 (Bidirectio (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_13 (Bidirectio (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_8 (TimeDist (None, 100, 7)            777       
Total params: 438,657
Trainable params: 438,657
Non-trainable params: 0
_________________________________________________________________


#### Shortcut to do it all with a script
(the last three lines are needed when the script is called from the notebook)

In [33]:
!python ../src/recognitionUniqueDictaSign.py --outputName fls \
                                             --flsKept 43015 43038 42318 43357 43116 42719 \
                                             --flsBinary 0 \
                                             --epochs 5 \
                                             --batchSize 200 \
                                             --videoSplitMode manual \
                                             --signersTrain 0 1 2 3 \
                                             --signersValid 10 \
                                             --signersTest 13 \
                                             --fromNotebook 1 \
                                             --saveGlobalresults ../reports/corpora/DictaSign/recognitionUnique/global/globalUnique.dat \
                                             --savePredictions ../reports/corpora/DictaSign/recognitionUnique/predictions/ \
                                             --saveModels ../models/corpora/DictaSign/recognitionUnique/

usage: recognitionUniqueDictaSign.py [-h] [--outputName OUTPUTNAME]
                                     [--flsBinary {0,1}]
                                     [--flsKeep [FLSKEEP [FLSKEEP ...]]]
                                     [--comment COMMENT]
                                     [--videoSplitMode {manual,auto}]
                                     [--fractionValid FRACTIONVALID]
                                     [--fractionTest FRACTIONTEST]
                                     [--signerIndependent {0,1}]
                                     [--taskIndependent {0,1}]
                                     [--excludeTask9 {0,1}]
                                     [--tasksTrain [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksValid [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [--tasksTest [{1,2,3,4,5,6,7,8,9} [{1,2,3,4,5,6,7,8,9} ...]]]
                                     [-

<a id='ex3'></a>
#### ex. 3 ([back to the list of examples](#list_examples)):
##### Data

In [34]:
[X_feat_train_3, X_frames_train_3], Y_train_3 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls'],['DS'],['PT'],['FBUOY']],
                            nonZero=[[],[],[],[]],
                            binary=[],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_3, X_frames_valid_3], Y_valid_3 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='sign_types',
                            types=[['fls'],['DS'],['PT'],['FBUOY']],
                            nonZero=[[],[],[],[]],
                            binary=[],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [35]:
print(X_feat_train_3.shape)
print(X_frames_train_3.shape)
print(Y_train_3.shape)

(1, 371696, 420)
(371696,)
(1, 371696, 5)


##### Model

In [36]:
# using only preprocessed features as input:
model_3_features = get_model(output_names=['fls-DS-PT-FBUOY'],
                    output_classes=[5],
                    output_weights=[1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_11 (InputLayer)        [(None, 100, 420)]        0         
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 100, 200)          252200    
_________________________________________________________________
bidirectional_14 (Bidirectio (None, 100, 110)          112640    
_________________________________________________________________
bidirectional_15 (Bidirectio (None, 100, 110)          73040     
_________________________________________________________________
time_distributed_9 (TimeDist (None, 100, 5)            555       
Total params: 438,435
Trainable params: 438,435
Non-trainable params: 0
_________________________________________________________________


### Second category of examples: `output_form = 'mixed'` (examples 4 to 6)

<a id='ex4'></a>
#### ex. 4 ([back to the list of examples](#list_examples)):
##### Data

In [37]:
[X_feat_train_4, X_frames_train_4], Y_train_4 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='mixed',
                            types=[['fls'], ['DS'], ['PT'], ['FBUOY']],
                            nonZero=[flsKept, [], [], []],
                            binary=[False, True, True, True],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)
[X_feat_valid_4, X_frames_valid_4], Y_valid_4 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='mixed',
                            types=[['fls'], ['DS'], ['PT'], ['FBUOY']],
                            nonZero=[flsKept, [], [], []],
                            binary=[False, True, True, True],
                            video_indices=idxValid,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

In [38]:
print(X_feat_train_4.shape)
print(X_frames_train_4.shape)
print(len(Y_train_4))
print(Y_train_4[0].shape)
print(Y_train_4[1].shape)

(1, 371696, 420)
(371696,)
4
(1, 371696, 7)
(1, 371696, 2)


##### Model

In [39]:
model_4_features = get_model(output_names=['fls', 'DS', 'PT', 'FBUOY'],
                    output_classes=[N_fls+1,2,2,2],
                    output_weights=[1,1,1,1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_8"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_12 (InputLayer)           [(None, 100, 420)]   0                                            
__________________________________________________________________________________________________
conv1d_7 (Conv1D)               (None, 100, 200)     252200      input_12[0][0]                   
__________________________________________________________________________________________________
bidirectional_16 (Bidirectional (None, 100, 110)     112640      conv1d_7[0][0]                   
__________________________________________________________________________________________________
bidirectional_17 (Bidirectional (None, 100, 110)     73040       bidirectional_16[0][0]           
____________________________________________________________________________________________

<a id='ex5'></a>
#### ex. 5 ([back to the list of examples](#list_examples)):
##### Data

In [40]:
[X_feat_train_5, X_frames_train_5], Y_train_5 =\
      get_data_concatenated(corpus='DictaSign',
                            output_form='mixed',
                            types=[['PT_PRO1','PT_PRO2', 'PT_PRO3'], ['fls']],
                            nonZero=[[],[]],
                            binary=[True,True],
                            video_indices=idxTrain,
                            features_dict=features_dict,
                            features_type='both',
                            from_notebook=True)

##### Model

In [41]:
model_5_features = get_model(output_names=['PT_PRO123', 'fls_all'],
                    output_classes=[2,2],
                    output_weights=[1,1],
                    metrics=metrics_notebook,
                    features_number=features_number,
                    features_type='features')

Model: "model_9"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_13 (InputLayer)           [(None, 100, 420)]   0                                            
__________________________________________________________________________________________________
conv1d_8 (Conv1D)               (None, 100, 200)     252200      input_13[0][0]                   
__________________________________________________________________________________________________
bidirectional_18 (Bidirectional (None, 100, 110)     112640      conv1d_8[0][0]                   
__________________________________________________________________________________________________
bidirectional_19 (Bidirectional (None, 100, 110)     73040       bidirectional_18[0][0]           
____________________________________________________________________________________________