update doc in terms of get_label for lang id model #5366

Merged (4 commits, Nov 16, 2022)
11 changes: 9 additions & 2 deletions docs/source/asr/speaker_recognition/results.rst
@@ -61,7 +61,7 @@ For extracting embeddings from a single file:
.. code-block:: python

speaker_model = EncDecSpeakerLabelModel.from_pretrained(model_name="<pretrained_model_name or path/to/nemo/file>")
-embs = speaker_model.get_embedding('audio_path')
+embs = speaker_model.get_embedding('<audio_path>')

For extracting embeddings from a bunch of files:

@@ -78,7 +78,14 @@ This Python call will download the best pretrained model from NGC and write embeddings
.. code-block:: bash

python examples/speaker_tasks/recognition/extract_speaker_embeddings.py --manifest=manifest.json


or you can run `batch_inference()` to perform inference on the manifest with a selected batch size to get embeddings:

.. code-block:: python

speaker_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="<pretrained_model_name or path/to/nemo/file>")
embs, logits, gt_labels, mapped_labels = speaker_model.batch_inference(manifest, batch_size=32)
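
Both the script and `batch_inference()` read a NeMo-style JSON-lines manifest. A minimal sketch of writing one follows; the `audio_filepath`, `offset`, `duration`, and `label` field names follow common NeMo manifest conventions and should be adjusted to your setup:

.. code-block:: python

    import json

    def write_manifest(audio_paths, manifest_path="manifest.json"):
        """Write one JSON object per line, as NeMo manifest readers expect."""
        with open(manifest_path, "w") as f:
            for path in audio_paths:
                entry = {
                    "audio_filepath": path,
                    "offset": 0,
                    "duration": None,  # None lets the loader use the full file
                    "label": "infer",  # placeholder label for inference
                }
                f.write(json.dumps(entry) + "\n")

    write_manifest(["audio1.wav", "audio2.wav"])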

Speaker Verification Inference
------------------------------
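
Verification typically scores a pair of embeddings with cosine similarity and compares the score to a decision threshold. A minimal pure-Python sketch follows; the function names and the 0.7 threshold are illustrative assumptions, not the NeMo implementation or its tuned value:

.. code-block:: python

    import math

    def cosine_similarity(a, b):
        """Cosine similarity between two embedding vectors."""
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm

    def same_speaker(emb1, emb2, threshold=0.7):
        """Decide whether two embeddings belong to the same speaker."""
        return cosine_similarity(emb1, emb2) >= threshold

    print(same_speaker([1.0, 0.0], [1.0, 0.0]))  # identical embeddings -> True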

4 changes: 2 additions & 2 deletions docs/source/asr/speech_classification/models.rst
@@ -68,7 +68,7 @@ MarbleNet models can be instantiated using the :class:`~nemo.collections.asr.mod
AmberNet (Lang ID)
------------------

-AmberNet is an end-to-end neural network for language identification moden based on `TitanNet <../speaker_recognition/models.html#titanet>`__.
+AmberNet is an end-to-end neural network for language identification based on `TitaNet <../speaker_recognition/models.html#titanet>`__.

It can reach state-of-the art performance on the `Voxlingua107 dataset <http://bark.phon.ioc.ee/voxlingua107/>`_ while having significantly fewer parameters than similar models.
AmberNet models can be instantiated using the :class:`~nemo.collections.asr.models.EncDecSpeakerLabelModel` class.
@@ -81,4 +81,4 @@ References
.. bibliography:: ../asr_all.bib
:style: plain
:labelprefix: SC-MODELS
-:keyprefix: sc-models-
+:keyprefix: sc-models-
21 changes: 19 additions & 2 deletions docs/source/asr/speech_classification/results.rst
@@ -33,7 +33,7 @@ Transcribing/Inference

The audio files should be 16 kHz mono-channel WAV files.

-**Transcribe speech command segment:**
+`Transcribe speech command segment:`

You may perform inference and transcribe a sample of speech after loading the model by using its ``transcribe()`` method:

@@ -47,7 +47,7 @@ Setting argument ``logprobs`` to True would return the log probabilities instead
Learn how to fine tune on your own data or on subset classes in ``<NeMo_git_root>/tutorials/asr/Speech_Commands.ipynb``


-**Run VAD inference:**
+`Run VAD inference:`

.. code-block:: bash

@@ -72,6 +72,23 @@ Filtering:
- ``filter_speech_first`` to control whether to perform short speech segment deletion first.
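
The post-processing parameters above operate on (start, end) speech segments. For illustration, deleting short speech segments can be sketched as follows; the function name and segment representation are assumptions for this sketch, not the NeMo implementation:

.. code-block:: python

    def filter_short_speech(segments, min_duration=0.2):
        """Drop speech segments shorter than min_duration seconds.

        segments: list of (start, end) tuples in seconds.
        """
        return [(s, e) for s, e in segments if e - s >= min_duration]

    segs = [(0.0, 0.1), (0.5, 1.5), (2.0, 2.05)]
    print(filter_short_speech(segs))  # [(0.5, 1.5)]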


`Identify language of an utterance:`

You may load the model and identify the language of an audio file by using the `get_label()` method:
(Collaborator review comment: ``You may`` -> ``One can``)

.. code-block:: python

langid_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="<MODEL_NAME>")
lang = langid_model.get_label('<audio_path>')

or you can run `batch_inference()` to perform inference on a manifest with a selected batch size to get ``mapped_labels``:

.. code-block:: python

langid_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(model_name="<MODEL_NAME>")
lang_embs, logits, gt_labels, mapped_labels = langid_model.batch_inference(manifest_filepath, batch_size=32)
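
The ``mapped_labels`` returned above plausibly correspond to taking an argmax over each row of ``logits`` and mapping the winning index through the model's label list; an illustrative pure-Python sketch with toy values (not the NeMo internals):

.. code-block:: python

    def predict_labels(logits, labels):
        """Pick the highest-scoring class index per utterance, map it to a label."""
        preds = []
        for row in logits:
            idx = max(range(len(row)), key=row.__getitem__)
            preds.append(labels[idx])
        return preds

    labels = ["en", "de", "fr"]
    toy_logits = [[2.1, 0.3, -1.0], [0.2, 0.1, 3.3]]
    print(predict_labels(toy_logits, labels))  # ['en', 'fr']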


NGC Pretrained Checkpoints
--------------------------
