Merge r1.10.0 main (NVIDIA#4486)
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* Fix ASR Typos in tutorials (NVIDIA#4384)

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Quick wav2vec fix. In-place operation adding convolutional positions to encoder was overwriting leaf history. Wasn't caught on previous torch versions. (NVIDIA#4383)

Signed-off-by: tbartley94 <tbartley@nvidia.com>

Co-authored-by: tbartley94 <tbartley@nvidia.com>
(cherry picked from commit 0322b15)

Co-authored-by: Travis Bartley <Travismbartley@gmail.com>

* Fix tutorial typos and docs (NVIDIA#4415)

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Fix typos

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add ASR Scores to Docs (NVIDIA#4412)

* Fix link

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Correct model card

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add ASR Results to Docs

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Update info

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Update info

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* docs: add table overflow handling for nested sections (NVIDIA#4441)

Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>

* Docs: Decrease Font Size on Tables  (NVIDIA#4444)

* docs: add table overflow handling for nested sections

* docs: set table font-size to small

Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>

* Updated notebook to fix batch configuration and precision bugs (NVIDIA#4447)

* Updated notebook to fix batch configuration and precision bugs

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Deleted cell outputs

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Set datasets back to full dataset

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* fix branch in link (NVIDIA#4454)

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* [TTS] [bugfix] German FastPitch HiFi-GAN tutorial and lr (NVIDIA#4459)

* [TN] Bug fix: expand serial coverage of unknown symbol, remove constraints from word graph (NVIDIA#4463)

* remove constraints from word graph det

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* add measure units to serial

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert serial changes, update jenkins path

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix test case

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* update indentation (NVIDIA#4468)

Signed-off-by: Akshit Arora <akshit.arora@colorado.edu>

* t5-rpe-fix targeting r1.10.0; raise exception for PP>2. (NVIDIA#4469)

Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>

Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>

* Fix some 's' cases for IPA G2P (NVIDIA#4460)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Co-authored-by: Eric Harper <complex451@gmail.com>

* Refactor bias act fusion (NVIDIA#4376)

* Refactor bias act fusion

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update NMT config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update ci tests

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Empty

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add kwargs to exact string match (NVIDIA#4479)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Try fix (NVIDIA#4484)

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* update branch

Signed-off-by: ericharper <complex451@gmail.com>

Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: Travis Bartley <Travismbartley@gmail.com>
Co-authored-by: Nick Goncharenko <8766167+nickolyamba@users.noreply.github.com>
Co-authored-by: Nick Goncharenko <ngoncharenko@nvidia.com>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Akshit Arora <akshit.arora@colorado.edu>
Co-authored-by: khcs <khcs@users.noreply.github.com>
Co-authored-by: Hoo Chang Shin <hshin@nvidia.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
12 people authored and Hainan Xu committed Nov 29, 2022
1 parent a24e660 commit 2e04a40
Showing 21 changed files with 484 additions and 482 deletions.
38 changes: 19 additions & 19 deletions Jenkinsfile
@@ -137,18 +137,18 @@ pipeline {
parallel {
stage('En TN grammars') {
steps {
-sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize.py --text="1" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
+sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize.py --text="1" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
}
}
stage('En ITN grammars') {
steps {
-sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/inverse_text_normalization/inverse_normalize.py --language en --text="twenty" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
+sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/inverse_text_normalization/inverse_normalize.py --language en --text="twenty" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
}
}
stage('Test En non-deterministic TN & Run all En TN/ITN tests (restore grammars from cache)') {
steps {
-sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize_with_audio.py --text "\$.01" --n_tagged 2 --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
-sh 'CUDA_VISIBLE_DEVICES="" pytest tests/nemo_text_processing/en/ -m "not pleasefixme" --cpu --tn_cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
+sh 'CUDA_VISIBLE_DEVICES="" python nemo_text_processing/text_normalization/normalize_with_audio.py --text "\$.01" --n_tagged 2 --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
+sh 'CUDA_VISIBLE_DEVICES="" pytest tests/nemo_text_processing/en/ -m "not pleasefixme" --cpu --tn_cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
}
}
}
@@ -165,7 +165,7 @@ pipeline {
parallel {
stage('L2: Eng TN') {
steps {
-sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_norm/output/ --grammars=tn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22 --language=en && ls -R /home/TestData/nlp/text_norm/output/ && echo ".far files created "|| exit 1'
+sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_norm/output/ --grammars=tn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22 --language=en && ls -R /home/TestData/nlp/text_norm/output/ && echo ".far files created "|| exit 1'
sh 'cd nemo_text_processing/text_normalization/ && python normalize.py --input_file=/home/TestData/nlp/text_norm/ci/test.txt --input_case="lower_cased" --language=en --output_file=/home/TestData/nlp/text_norm/output/test.pynini.txt --verbose'
sh 'cat /home/TestData/nlp/text_norm/output/test.pynini.txt'
sh 'cmp --silent /home/TestData/nlp/text_norm/output/test.pynini.txt /home/TestData/nlp/text_norm/ci/test_goal_py_05-25.txt || exit 1'
@@ -175,7 +175,7 @@ pipeline {

stage('L2: Eng ITN export') {
steps {
-sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_denorm/output/ --grammars=itn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22 --language=en && ls -R /home/TestData/nlp/text_denorm/output/ && echo ".far files created "|| exit 1'
+sh 'cd tools/text_processing_deployment && python pynini_export.py --output=/home/TestData/nlp/text_denorm/output/ --grammars=itn_grammars --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22 --language=en && ls -R /home/TestData/nlp/text_denorm/output/ && echo ".far files created "|| exit 1'
sh 'cd nemo_text_processing/inverse_text_normalization/ && python inverse_normalize.py --input_file=/home/TestData/nlp/text_denorm/ci/test.txt --language=en --output_file=/home/TestData/nlp/text_denorm/output/test.pynini.txt --verbose'
sh 'cmp --silent /home/TestData/nlp/text_denorm/output/test.pynini.txt /home/TestData/nlp/text_denorm/ci/test_goal_py.txt || exit 1'
sh 'rm -rf /home/TestData/nlp/text_denorm/output/*'
@@ -184,23 +184,23 @@ pipeline {
stage('L2: TN with Audio (audio and raw text)') {
steps {
sh 'cd nemo_text_processing/text_normalization && \
-python normalize_with_audio.py --language=en --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22 --text "The total amounts to \\$4.76." \
+python normalize_with_audio.py --language=en --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22 --text "The total amounts to \\$4.76." \
--audio_data /home/TestData/nlp/text_norm/audio_based/audio.wav | tail -n2 | head -n1 > /tmp/out_raw.txt 2>&1 && \
cmp --silent /tmp/out_raw.txt /home/TestData/nlp/text_norm/audio_based/result.txt || exit 1'
}
}
stage('L2: TN with Audio (audio and text file)') {
steps {
sh 'cd nemo_text_processing/text_normalization && \
-python normalize_with_audio.py --language=en --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22 --text /home/TestData/nlp/text_norm/audio_based/text.txt \
+python normalize_with_audio.py --language=en --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22 --text /home/TestData/nlp/text_norm/audio_based/text.txt \
--audio_data /home/TestData/nlp/text_norm/audio_based/audio.wav | tail -n2 | head -n1 > /tmp/out_file.txt 2>&1 && \
cmp --silent /tmp/out_file.txt /home/TestData/nlp/text_norm/audio_based/result.txt || exit 1'
}
}
stage('L2: TN with Audio (manifest)') {
steps {
sh 'cd nemo_text_processing/text_normalization && \
-python normalize_with_audio.py --language=en --audio_data /home/TestData/nlp/text_norm/audio_based/manifest.json --n_tagged=120 --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
+python normalize_with_audio.py --language=en --audio_data /home/TestData/nlp/text_norm/audio_based/manifest.json --n_tagged=120 --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
}
}
}
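
The hunks above move every text-normalization stage from the `6-14-22` grammar cache to `6-28-22`. Because the path is repeated across many `sh` steps, a stale reference is easy to miss. The following is a minimal sketch, not part of this commit, of how such references could be audited. The names `cache_dirs` and `stale_cache_dirs` are hypothetical, and the sketch assumes the path follows `--cache_dir` or `--tn_cache_dir`, separated by a space or `=`.

```python
import re
from collections import Counter
from typing import List

# Hypothetical helper, not part of NeMo: capture the path that follows
# --cache_dir or --tn_cache_dir (separated by a space or '=').
CACHE_DIR_RE = re.compile(r"""--(?:tn_)?cache_dir[ =]([^\s'"]+)""")


def cache_dirs(jenkinsfile_text: str) -> Counter:
    """Count each distinct grammar cache directory referenced in the text."""
    return Counter(CACHE_DIR_RE.findall(jenkinsfile_text))


def stale_cache_dirs(jenkinsfile_text: str, expected_suffix: str) -> List[str]:
    """Return the referenced cache directories that do not end with expected_suffix."""
    return sorted(
        path
        for path in cache_dirs(jenkinsfile_text)
        if not path.endswith(expected_suffix)
    )


# Illustrative input modeled on the steps in this diff (shortened commands).
sample = """
sh 'python normalize.py --text="1" --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
sh 'pytest tests/ --cpu --tn_cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-28-22'
sh 'python normalize_with_audio.py --cache_dir /home/TestData/nlp/text_norm/ci/grammars/6-14-22'
"""

if __name__ == "__main__":
    # Lists any directory that was not bumped to the expected cache version.
    print(stale_cache_dirs(sample, "6-28-22"))
```

Run against the full Jenkinsfile text, an empty result would suggest that every stage points at the same cache version.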
@@ -2129,7 +2129,7 @@ pipeline {
model.num_attention_heads=8 \
model.activation='swiglu' \
model.masked_softmax_fusion=False \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.micro_batch_size=2 \
@@ -2161,7 +2161,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='swiglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.masked_softmax_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
@@ -2893,7 +2893,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='swiglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.transformer_block_type='pre_ln' \
@@ -2918,7 +2918,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='swiglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.transformer_block_type='pre_ln' \
@@ -3015,7 +3015,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='swiglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.transformer_block_type='normformer' \
@@ -3040,7 +3040,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='swiglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.transformer_block_type='normformer' \
@@ -3094,7 +3094,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='reglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.data.data_prefix=[.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document,.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document]"
@@ -3116,7 +3116,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='reglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.data.data_prefix=[.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document,.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document]"
@@ -3150,7 +3150,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='geglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.data.data_prefix=[.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document,.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document]"
@@ -3173,7 +3173,7 @@ pipeline {
model.hidden_size=64 \
model.num_attention_heads=8 \
model.activation='geglu' \
-model.bias_gelu_fusion=False \
+model.bias_activation_fusion=False \
model.activations_checkpoint_method='block' \
model.activations_checkpoint_num_layers=1 \
model.data.data_prefix=[.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document,.5,/home/TestData/nlp/megatron_t5/data/pile_val_small_bert_tokenizer_text_document]"
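
The Megatron hunks in this Jenkinsfile replace the `model.bias_gelu_fusion` override with `model.bias_activation_fusion` in every affected CI command. Launch scripts that still pass the old key would need the same substitution. Below is a minimal sketch of that rewrite for a list of `key=value` override strings. The function `migrate_overrides` is hypothetical and not part of NeMo, and it assumes the change is a pure key rename with the value carried over unchanged (the diff only shows `False` on both sides).

```python
from typing import List

# Key names taken from the diff; the helper itself is an illustrative assumption.
OLD_KEY = "model.bias_gelu_fusion"
NEW_KEY = "model.bias_activation_fusion"


def migrate_overrides(overrides: List[str]) -> List[str]:
    """Rename the old fusion key in a list of `key=value` override strings.

    A leading '+' or '++' (Hydra's add / force-add prefix) is preserved.
    Entries with other keys are returned unchanged.
    """
    migrated = []
    for item in overrides:
        key, sep, value = item.partition("=")
        bare = key.lstrip("+")
        prefix = key[: len(key) - len(bare)]
        if bare == OLD_KEY:
            key = prefix + NEW_KEY
        migrated.append(f"{key}{sep}{value}")
    return migrated


if __name__ == "__main__":
    old_args = [
        "model.activation='swiglu'",
        "model.bias_gelu_fusion=False",
        "model.activations_checkpoint_method='block'",
    ]
    for arg in migrate_overrides(old_args):
        print(arg)
```

Matching on the whole key rather than a substring keeps unrelated options such as `model.masked_softmax_fusion` untouched.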