# Inference

## Inference Run Options   
`inference_custom.py` runs inference on user-specified data, whereas `inference_commonVoice.py` runs on the test split of the Common Voice dataset. Both scripts accept the input arguments below; options marked "inference_custom.py only" are not supported by `inference_commonVoice.py`:   
-p : path to the input data  
-d : duration of each wave sample in seconds, default is 3  
-s : number of samples drawn from each WAV file, default is 100   
--vad : enable VAD model to detect active speech (inference_custom.py only)  
--ipex : run inference with optimizations from Intel® Extension for PyTorch (IPEX)  
--bf16 : run inference with auto-mixed precision featuring Bfloat16  
--int8_model : run inference with the INT8 model generated by Intel® Neural Compressor (INC)  
--ground_truth_compare : enable comparison of prediction labels to ground truth values (inference_custom.py only)  
--verbose : prints additional debug information, such as latency  
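
The options above map naturally onto an `argparse` parser. The following is a minimal sketch, not the scripts' actual code: the flag names and the defaults for `-d` and `-s` come from the list above, while the destination names, types, help strings, and the choice to make `-p` required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented inference options."""
    parser = argparse.ArgumentParser(description="Language identification inference (sketch)")
    # Assumption: -p is mandatory; the original scripts may define a default instead.
    parser.add_argument("-p", dest="data_path", required=True, help="path to the input data")
    parser.add_argument("-d", dest="duration", type=int, default=3, help="duration of each wave sample")
    parser.add_argument("-s", dest="size", type=int, default=100, help="size of sample waves")
    parser.add_argument("--vad", action="store_true", help="enable VAD to detect active speech")
    parser.add_argument("--ipex", action="store_true", help="use Intel Extension for PyTorch optimizations")
    parser.add_argument("--bf16", action="store_true", help="use auto-mixed precision with Bfloat16")
    parser.add_argument("--int8_model", action="store_true", help="use the INT8 model produced by INC")
    parser.add_argument("--ground_truth_compare", action="store_true", help="compare predictions to ground truth")
    parser.add_argument("--verbose", action="store_true", help="print debug information such as latency")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-p", "data_custom", "-d", "3", "-s", "50", "--vad"])
print(args.data_path, args.duration, args.size, args.vad, args.ipex)
```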

The VAD option will identify speech segments in the audio file and construct a new WAV file containing only the speech segments. This improves the quality of speech data used as input into the language identification model.  
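
The idea of keeping only the speech segments can be illustrated without any model. In the sketch below the segment boundaries are supplied by hand, whereas in the real script they would come from the VAD model. The function name `keep_speech_only` and the plain list-of-samples representation are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def keep_speech_only(
    samples: Sequence[float],
    segments_sec: Sequence[Tuple[float, float]],
    sample_rate: int = 16000,
) -> List[float]:
    """Concatenate the (start, end) speech segments, given in seconds."""
    speech: List[float] = []
    for start_sec, end_sec in segments_sec:
        # Convert seconds to sample indices and clamp them to the signal length.
        start = max(0, int(round(start_sec * sample_rate)))
        end = min(len(samples), int(round(end_sec * sample_rate)))
        if end > start:
            speech.extend(samples[start:end])
    return speech


# Two seconds of dummy audio at a toy rate of 10 samples per second.
audio = [float(i) for i in range(20)]
trimmed = keep_speech_only(audio, [(0.2, 0.5), (1.0, 1.4)], sample_rate=10)
print(len(trimmed))
```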

The `--ipex` option applies Intel® Extension for PyTorch (IPEX) optimizations to the pretrained model, which should reduce inference latency.  

## inference_commonVoice.py for CommonVoice Test Data
This runs inference with the trained model to measure how well it performs on the Common Voice test data generated by the preprocessing scripts.  

`python inference_commonVoice.py -p DATAPATH`  

The output file `test_data_accuracy.csv` summarizes the results.  

In [None]:
!python inference_commonVoice.py -p ${COMMON_VOICE_PATH}/processed_data/test

## inference_custom.py for Custom Data  
To run inference on custom data, you must specify a folder containing .wav files and pass its path as an argument. For example, create a folder named `data_custom` and copy one or two .wav files from your test dataset into it. .mp3 files will NOT work. 
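
Setting up the folder can also be scripted. The helper below is a sketch only: `prepare_custom_folder` is a hypothetical name, and the source directory you pass in is whatever location holds your test .wav files. It copies up to two .wav files and skips every other file type, including .mp3.

```python
import shutil
from pathlib import Path
from typing import List


def prepare_custom_folder(source_dir: str, target_dir: str = "data_custom", max_files: int = 2) -> List[Path]:
    """Copy up to max_files .wav files from source_dir into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Only .wav files are accepted; anything else is ignored.
    wav_files = sorted(p for p in Path(source_dir).iterdir() if p.suffix.lower() == ".wav")
    copied = []
    for wav in wav_files[:max_files]:
        copied.append(Path(shutil.copy2(wav, target / wav.name)))
    return copied
```

Calling `prepare_custom_folder("path/to/your/test_wavs")` would then leave a `data_custom` folder ready to pass to `-p`.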

### Randomly select audio clips from audio files for prediction
`python inference_custom.py -p DATAPATH -d DURATION -s SIZE`

An output file `output_summary.csv` will give the summary of the results.
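
Random clip selection can be sketched as drawing random start offsets and cutting fixed-length windows. This is an illustration under assumptions, not the script's implementation: the function name is hypothetical, and whether the real script allows clips to overlap is not stated here, so this version simply allows it.

```python
import random
from typing import List, Optional, Tuple


def pick_random_clips(
    total_samples: int,
    sample_rate: int,
    duration_sec: int = 3,
    size: int = 100,
    seed: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Return `size` (start, end) sample-index pairs, each spanning duration_sec seconds."""
    clip_len = duration_sec * sample_rate
    if total_samples < clip_len:
        raise ValueError("audio is shorter than the requested clip duration")
    rng = random.Random(seed)
    clips = []
    for _ in range(size):
        # Any start offset that leaves room for a full clip is valid.
        start = rng.randint(0, total_samples - clip_len)
        clips.append((start, start + clip_len))
    return clips


# A 10-second file at 16 kHz, mirroring `-d 3 -s 50`.
clips = pick_random_clips(total_samples=160000, sample_rate=16000, duration_sec=3, size=50, seed=0)
print(len(clips), clips[0][1] - clips[0][0])
```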

In [None]:
# Pick 50 3sec samples from each WAV file under the folder "data_custom"
!python inference_custom.py -p data_custom -d 3 -s 50

### Randomly select audio clips from audio files after applying Voice Activity Detection (VAD)  
`python inference_custom.py -p DATAPATH -d DURATION -s SIZE --vad`  

The output file `output_summary.csv` summarizes the results. Note that audio input to the VAD model must be sampled at 16 kHz; the code already performs this conversion for you.  
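
If you are curious whether a given file is already at 16 kHz, the standard-library `wave` module can read the rate from the file header. This sketch only inspects the file and does not resample it; the helper names are hypothetical, and `wave` handles uncompressed PCM WAV files only.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, target_rate: int = 16000) -> bool:
    """True when the file's sample rate differs from target_rate."""
    return wav_sample_rate(path) != target_rate
```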

In [None]:
!python inference_custom.py -p data_custom -d 3 -s 50 --vad

### Optimizations with Intel® Extension for PyTorch (IPEX) 
`python inference_custom.py -p data_custom -d 3 -s 50 --vad --ipex --verbose`  

This applies `ipex.optimize` and TorchScript to the model(s). You can also add the `--bf16` option along with `--ipex` to run in the BF16 data type, which is supported on 4th Gen Intel® Xeon® Scalable processors and newer.
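
For a generic PyTorch module, the combination of `ipex.optimize`, TorchScript, and optional BF16 typically looks like the sketch below. It is not taken from `inference_custom.py`: the function name and arguments are hypothetical, and which submodules the script actually optimizes is not shown here. Running it requires `torch` and `intel_extension_for_pytorch` to be installed.

```python
def optimize_for_inference(model, example_input, use_bf16=False):
    """Apply ipex.optimize and TorchScript tracing to a PyTorch module (sketch)."""
    # Imported lazily so the function can be defined without the packages installed.
    import torch
    import intel_extension_for_pytorch as ipex

    model = model.eval()
    if use_bf16:
        # BF16 needs hardware support, e.g. 4th Gen Intel Xeon Scalable processors.
        model = ipex.optimize(model, dtype=torch.bfloat16)
    else:
        model = ipex.optimize(model)

    # Trace and freeze under autocast so the TorchScript graph matches the chosen precision.
    with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16, enabled=use_bf16):
        traced = torch.jit.trace(model, example_input)
        traced = torch.jit.freeze(traced)
    return traced
```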

Note that the *--verbose* option is required to view the latency measurements.   
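
To compare configurations yourself, a simple wall-clock timer with a few warm-up calls is usually enough. This is a generic sketch and not how the script measures latency; `measure_latency` is a hypothetical helper, and the lambda is a stand-in workload to be replaced by a call to the model on one audio sample.

```python
import time
from statistics import mean
from typing import Callable, List


def measure_latency(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> List[float]:
    """Return per-call wall-clock latencies in seconds, measured after warm-up calls."""
    for _ in range(warmup):
        fn()  # Warm-up calls are executed but not timed.
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


# Stand-in workload; replace with the model call you want to time.
latencies = measure_latency(lambda: sum(i * i for i in range(10000)))
print(f"mean latency: {mean(latencies) * 1000:.3f} ms over {len(latencies)} runs")
```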

In [None]:
!python inference_custom.py -p data_custom -d 3 -s 50 --vad --ipex --verbose

## Quantization with Intel® Neural Compressor (INC)
To improve inference latency, Intel® Neural Compressor (INC) can quantize the trained model from FP32 to INT8; to do so, run `quantize_model.py`. The *-datapath* argument specifies a custom evaluation dataset. By default, it is set to `$COMMON_VOICE_PATH/processed_data/dev`, which was generated by the data preprocessing scripts in the `Training` folder.  

In [None]:
!python quantize_model.py -p ./lang_id_commonvoice_model -datapath $COMMON_VOICE_PATH/processed_data/dev

After quantization, the model is stored in `lang_id_commonvoice_model_INT8`. To load the quantized model for inference, use `neural_compressor.utils.pytorch.load`. If `self.language_id` is the original model and `data_path` is the path to the audio file:

```
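from neural_compressor.utils.pytorch import load

# Load the INT8 model, passing the original FP32 model alongside the saved directory
model_int8 = load("./lang_id_commonvoice_model_INT8", self.language_id)
# Read the audio file and run the quantized model on it
signal = self.language_id.load_audio(data_path)
prediction = model_int8(signal)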
```

The code above is integrated into inference_custom.py. You can now run inference on your data using this INT8 model:

In [None]:
!python inference_custom.py -p data_custom -d 3 -s 50 --vad --int8_model --verbose

### (Optional) Comparing Predictions with Ground Truth

You can edit `audio_ground_truth_labels.csv` to include the name of each audio file and its expected label (for example, `en` for English), then run `inference_custom.py` with the `--ground_truth_compare` option. This comparison is disabled by default.
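
Conceptually, the comparison reduces to matching predicted labels against expected labels per file. The sketch below assumes a headerless two-column layout (`filename,label`), which may differ from the actual format of `audio_ground_truth_labels.csv`; the function names and the sample file names are hypothetical.

```python
import csv
import io
from typing import Dict


def load_labels(csv_text: str) -> Dict[str, str]:
    """Parse 'filename,label' rows into a dict (column layout is an assumption)."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def accuracy(predictions: Dict[str, str], ground_truth: Dict[str, str]) -> float:
    """Fraction of files present in both dicts whose predicted label matches."""
    shared = [name for name in predictions if name in ground_truth]
    if not shared:
        return 0.0
    correct = sum(predictions[name] == ground_truth[name] for name in shared)
    return correct / len(shared)


# Hypothetical labels and predictions for two clips.
truth = load_labels("clip_a.wav,en\nclip_b.wav,es\n")
preds = {"clip_a.wav": "en", "clip_b.wav": "en"}
print(accuracy(preds, truth))
```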

## Troubleshooting
If the model appears to give the same output regardless of input, try running `clean.sh` to remove the `RIR_NOISES` and `speechbrain` folders so they can be re-pulled.  

In [None]:
print("[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]")