update speechllm #8486

Merged Feb 22, 2024

Changes from all commits

562 commits
e7d94dc
fix(clustering_diarizer.py): fix typo (#7772)
jqueguiner Oct 23, 2023
85a5e7e
fix(diarization-README): typo (#7771)
jqueguiner Oct 23, 2023
46d2b9a
Fix bug wrt change decoding strategy for bpe models (#7762) (#7764)
github-actions[bot] Oct 23, 2023
4523562
Remove incorrect extra argument for load_from_checkpoint_dir() (#7500)
RobinDong Oct 23, 2023
fb6fb2f
Add nemo to mcore GPT conversion script (#7730)
cuichenx Oct 23, 2023
ece7634
Fix bug in ConditionalInput: cat along the feature dim, not the batch…
anferico Oct 24, 2023
1d4d397
Add some docs and update scripts for ASR (#7790)
titu1994 Oct 24, 2023
ddd052f
set context for text memmap to fork (#7784)
arendu Oct 24, 2023
3d77b5b
add training with multiple audios
stevehuang52 Oct 24, 2023
cb2e796
Support flash decoding (#7744)
hsiehjackson Oct 24, 2023
7bb4049
Change accelerator to 'auto' in nlp_checkpoint_port.py (#7761)
github-actions[bot] Oct 24, 2023
609af14
Add selection criteria for reference audios in the `GlobalStyleToken`…
anferico Oct 24, 2023
265a0a6
update text server to support compute logprobs (#7733)
Zhilin123 Oct 24, 2023
dc62add
add multi-layer feat extract and fix random question insertion
stevehuang52 Oct 25, 2023
5cda3d1
Configure MCore logger (#7781)
mikolajblaz Oct 25, 2023
19133b5
Revert "PEFT eval fix (#7626) (#7638)" (#7693)
ericharper Oct 26, 2023
5f8e06d
remove TN from ctc_segm tut (#7807)
ekmb Oct 26, 2023
2c28582
[TTS] Support audio offsets in TTS data loaders (#7156)
rlangman Oct 26, 2023
177a67f
Update Apex install command in Dockerfile (#7794) (#7804)
github-actions[bot] Oct 27, 2023
86632bc
fix typo
stevehuang52 Oct 27, 2023
c5a9d45
Nemo to HF converter for LLaMA model (#7770)
uppalutkarsh Oct 30, 2023
8c061de
Save best NeMo model only when necessary (#7836)
anteju Nov 1, 2023
05ecfe4
add guard if its a distributed checkpoint (#7845)
gshennvm Nov 2, 2023
b112ce9
Fix tn duplex (#7808)
ekmb Nov 3, 2023
0d37b3b
Update transformers cache on Jenkins (#7854)
ericharper Nov 3, 2023
269a100
Update README.rst for container update (#7844)
fayejf Nov 3, 2023
c7948b2
Add support for finetuning with huggingface datasets (#7834)
stevehuang52 Nov 3, 2023
0e3e108
Multimodal merge (#7728)
yaoyu-33 Nov 3, 2023
286e84e
LID: several random samples for long file (#6853)
karpnv Nov 6, 2023
86b198f
Fix flash decoding precision (#7852)
hsiehjackson Nov 6, 2023
b1bd2db
small fix to remove duplicate megatron-lm installation (#7864)
Davood-M Nov 7, 2023
df9f0d1
Adding long-form audio speaker diarization (clustering) class and fun…
tango4j Nov 7, 2023
78e86e9
fix random question
stevehuang52 Nov 7, 2023
34bae82
Fix mcore conversion bug (#7846)
cuichenx Nov 7, 2023
c7c59b4
fix random context
stevehuang52 Nov 7, 2023
7a69ab1
fix multi-layer feat
stevehuang52 Nov 7, 2023
d49b73c
adding special_tokens from tokenizer config for transformer-lm model …
Nov 8, 2023
77d1386
Add Adapter and IA3 support for MCore models (#7750)
cuichenx Nov 9, 2023
eebc4b6
update dataset
stevehuang52 Nov 9, 2023
2fd4d55
add comment on script and fix target check (#7881)
gshennvm Nov 14, 2023
e4682c2
update
stevehuang52 Nov 14, 2023
98848b0
Fix k2 installation: update for latest PyTorch, move script to `insta…
artbataev Nov 14, 2023
8cd5f1c
Add back import guard (#7882)
cuichenx Nov 15, 2023
37d907a
add ugly fix for incorrect max_steps with iterable dataset
stevehuang52 Nov 15, 2023
0d3d8fa
[ASR] GSS-based mask estimator (#7849)
anteju Nov 16, 2023
b4438a3
[Codec] Update codec checkpoint config (#7835)
anteju Nov 16, 2023
b97ee35
change fp8 defaults (#7894)
cuichenx Nov 16, 2023
991a513
Added knob for ub_tp_comm_overlap for the MCORE pass (#7902)
sanandaraj5597 Nov 17, 2023
ce6afd7
Upgrade NeMo to latest mcore and TE (#7862)
dimapihtar Nov 17, 2023
eba699d
Pad sequences to multiples of 16 for GPTSFTDataset (#7904)
vysarge Nov 17, 2023
76e5bdf
[Codec] Finite scalar quantizer (#7886)
anteju Nov 17, 2023
c3e628a
upgrade to latest mcore and TE (#7908)
dimapihtar Nov 17, 2023
d81beac
Tar codec (#7867)
nithinraok Nov 18, 2023
08937c8
added missing torch import (#7913)
Davood-M Nov 20, 2023
c5d696f
add multi-encoder and titanet support, update misc
stevehuang52 Nov 21, 2023
ef19e02
add cpu init check (#7889)
cuichenx Nov 21, 2023
29a90a3
Fix pinned triton version (#7925)
hsiehjackson Nov 22, 2023
9c7926d
fix tp_overlap config var name (#7928)
xrennvidia Nov 22, 2023
41efa55
update cfg
stevehuang52 Nov 22, 2023
973c65a
update cfg
stevehuang52 Nov 22, 2023
521cfb4
add Dutch P&C FC model info (#7892)
zhehuaichen Nov 23, 2023
79bc929
fix issues with convert_nemo_llama_to_hf.py (#7922)
Zhilin123 Nov 25, 2023
e54641f
update for speaker counting and misc
stevehuang52 Nov 27, 2023
d90dd18
add checks (#7943)
ericharper Nov 28, 2023
8b27f3a
instructions for running ci on pr template (#7944)
ericharper Nov 28, 2023
5c811b4
fix and update infer decoding, add clap encoder (WIP)
stevehuang52 Nov 30, 2023
df325e7
update cfg
stevehuang52 Dec 1, 2023
a7f0bc1
only enable query key scaling during fp16 (#7946)
gshennvm Dec 1, 2023
d102118
added bf16 support (#7888)
yidong72 Dec 1, 2023
acf1d9b
Proposed WAR for gpt3 eval hang with PP (#7927)
yaoyu-33 Dec 1, 2023
ae5d7e8
Pass in rotary_base to mcore and from HF (#7933)
Kipok Dec 3, 2023
110c9d7
Add interface to set NCCL options of each process group (#7923)
erhoo82 Dec 4, 2023
52d50e9
Support O2 training of PEFT and SFT (#7971)
cuichenx Dec 5, 2023
f733f54
Add news section to README (#7984)
ericharper Dec 7, 2023
bbadcf7
[NLP] Access scaler only in FP16 case (#7916)
janekl Dec 7, 2023
c822d5c
fix librosa display issue (#7991) (#7993)
github-actions[bot] Dec 7, 2023
25f066f
Fix librosa issue (#7994) (#7995)
github-actions[bot] Dec 7, 2023
88d3a4d
Minor fixes (#7978)
janekl Dec 7, 2023
5103a9a
Resolve dtype with utils_funcs.py (#7979)
janekl Dec 7, 2023
663bd0a
Remove replace_sampler_ddp (deprecated in Trainer) (#7981)
janekl Dec 7, 2023
c6cf276
Fixing conversion script to work for code llama (#7997)
shanmugamr1992 Dec 8, 2023
fa8d416
Reworked MegatronPretrainingRandomBatchSampler to correctly handle ep…
trias702 Dec 10, 2023
0d88e40
Fix Tokenizer argparse (#8012) (#8013)
github-actions[bot] Dec 12, 2023
8eaa504
Fix crash when converting to mcore a model using rotary embeddings (#…
odelalleau Dec 12, 2023
47f6773
Update link to yaml file in ASR_with_Transducers.ipynb (#8014)
Faith-Nchifor Dec 12, 2023
4dd57c2
use convert_hf_dataset_to_nemo (#8017)
karpnv Dec 12, 2023
ac376ed
Check `torchaudio` dependencies in installation script (#8019) (#8021)
github-actions[bot] Dec 12, 2023
bdaf650
Update asr_language_modeling.rst: Add a missing word (#8007)
martin0258 Dec 12, 2023
0e891dd
Added a procedure for Windows users, README (#7942)
Jorjeous Dec 12, 2023
783f6ab
spelling mistake (#7903)
orena1 Dec 12, 2023
a19a073
remove depricated arguments (#7917)
jbaczek Dec 12, 2023
af8daed
Add All Multimodal Source Code (#7791)
yaoyu-33 Dec 13, 2023
8216178
[TTS] Scale sampler steps by number of devices (#7947)
rlangman Dec 13, 2023
0c95bde
Update manifest.py to speedup loading tarred datasets (#7900)
stevehuang52 Dec 14, 2023
1270609
migrate to PTL2.0
stevehuang52 Dec 14, 2023
6df13f1
clean up
stevehuang52 Dec 14, 2023
fa0493a
update manifest util
stevehuang52 Dec 14, 2023
58a277a
First draft of mcore bert model in NeMo (#7814)
shanmugamr1992 Dec 15, 2023
8523384
Support Falcon Variants (7B/40B/180B) in Mcore NeMo (#7666)
xuanzic Dec 15, 2023
2eb320a
migrate to ptl2.1 to support multiple dataloaders
stevehuang52 Dec 15, 2023
10edd11
FSDP + Tensor Parallelism (#7897)
erhoo82 Dec 16, 2023
ed0f681
Packed Sequence (#7945)
cuichenx Dec 16, 2023
8903fd9
Adding method back that was removed accidentally (#8038)
ericharper Dec 16, 2023
6b40e62
[NLP] ArtifactItem with init=True to make it debuggable (#7980)
janekl Dec 18, 2023
e482965
SFT patch: (1) enable sequence parallelism and (2) enable profile (#7…
erhoo82 Dec 18, 2023
c81903c
update asr eval (#8045)
stevehuang52 Dec 18, 2023
3fc0db6
update misc
stevehuang52 Dec 19, 2023
3e940d1
[Fix] Fixed name of a test (#7986)
anteju Dec 19, 2023
bde6b92
Use GPU for inference, if available (#8048) (#8053)
github-actions[bot] Dec 19, 2023
39d883f
fix noise aug (#8057)
stevehuang52 Dec 19, 2023
4f947ce
fix eval and clean up
stevehuang52 Dec 20, 2023
7ba79a7
Various fixes for typos and urls (#8066)
titu1994 Dec 20, 2023
c0ab6be
[Fix] Increase length check tolerance to prevent test failing (#8067)
anteju Dec 20, 2023
0df7bd3
Use NLPDDPStrategyNotebook in Multitask_Prompt_and_PTuning.ipynb (#80…
github-actions[bot] Dec 21, 2023
f97c901
debug
stevehuang52 Dec 22, 2023
937df1a
update broken links (#8079) (#8080)
github-actions[bot] Dec 22, 2023
c672435
update reqs (#8072) (#8073)
github-actions[bot] Dec 23, 2023
274a21b
run with non-dev option (#8077) (#8078)
github-actions[bot] Dec 23, 2023
fcc0f9f
fix ptl issue #18803
stevehuang52 Dec 27, 2023
4fd1c74
migration to PTL 2.0 for spellmapper model (#7924)
bene-ges Dec 28, 2023
cfee7a8
update for concat dataset
stevehuang52 Dec 28, 2023
13c4d5e
Add text metrics to asr eval (#8087)
stevehuang52 Dec 29, 2023
7faeee8
Change the megatron config lr scheduler default and fix to change par…
shan18 Dec 29, 2023
c9c033d
fix device setting to allow using accelerator cpu (#8084)
orena1 Dec 31, 2023
7e1bf36
(1) Add SHARP interface to M-CORE, (2) use send/recv to send train lo…
erhoo82 Jan 2, 2024
d3c5896
Reconfigure limit_val_batches only for int (#8099)
athitten Jan 2, 2024
8a8258d
fix lora merge script (#8113)
cuichenx Jan 3, 2024
ae95cda
Support transcoding audio formats when saving tarred datasets (FLAC, …
pzelasko Jan 3, 2024
86f1b7d
Fixing wrapper and moving it to base class (#8055)
shanmugamr1992 Jan 4, 2024
d62f6ff
fix gated_linear_unit bug (#8042)
Agoniii Jan 4, 2024
dae28da
README edit to change Apple Silicon install instructions (to fix a br…
stephenmcconnachie Jan 4, 2024
e133264
Fix Adapter for MCore models (#8124)
cuichenx Jan 5, 2024
0f7528d
fix concat data probs
stevehuang52 Jan 5, 2024
a8ff106
Wer fix (#8047)
tbartley94 Jan 5, 2024
d81ea6a
add war fix for sync issues (#8130)
gshennvm Jan 5, 2024
ef6ed61
.ctm in data simulator annotator compliant with RT-09 specification (…
popcornell Jan 8, 2024
c6c2643
Improve PEFT UX (#8131)
cuichenx Jan 8, 2024
20adcc3
Fixes NVIDIA/apex installation to not erroneously install the `instal…
terrykong Jan 8, 2024
76a712a
Enhance flexibility by passing callbacks as method argument (#8015)
michal2409 Jan 9, 2024
58d6bce
context parallelism (#7739)
xrennvidia Jan 10, 2024
8bdcf37
Make pipelined TP comm overlap available with mcore (#8005)
erhoo82 Jan 10, 2024
3d4e290
remove deprecated scripts (#8138)
arendu Jan 10, 2024
bd47c5c
Fix AST eval (#8112)
stevehuang52 Jan 10, 2024
0a1a5b1
Graphviz fix (#7843)
GNroy Jan 10, 2024
8d4218e
Add All Multimodal Source Code Part 2: Text to image, x to nerf (#7970)
yaoyu-33 Jan 11, 2024
6c006df
adding OnlineSampleMapping (#8137)
arendu Jan 11, 2024
24d4344
Update README.rst (#8154)
fayejf Jan 11, 2024
03e7cf1
fix: numba.*_num_threads resets torch num_threads #8141 (#8145)
itzsimpl Jan 11, 2024
e46f410
fix TP>1 issue for conversion script (#8144)
cuichenx Jan 11, 2024
6082d76
Add distopt support for FP8 params and BF16 optimizer state (#7909)
timmoon10 Jan 12, 2024
90600f1
Support torch jit script (#8027)
artbataev Jan 12, 2024
0e7b388
refactor
stevehuang52 Jan 12, 2024
62b27b7
Update dependencies (#8156)
titu1994 Jan 12, 2024
199a8ba
NeMo + Lhotse integration (#7880)
pzelasko Jan 13, 2024
c30536d
Revert "adding OnlineSampleMapping" (#8164)
pablo-garay Jan 13, 2024
1fede57
Add token count and sequence length logging for MegatronGPTSFTModel a…
vysarge Jan 14, 2024
733f530
undo lora weight
stevehuang52 Jan 15, 2024
c2aa737
Use latest apex internal API (#8129)
jbaczek Jan 16, 2024
b404b84
tune specific params in the base model (#7745)
arendu Jan 16, 2024
410f092
Speedup RNN-T greedy decoding (#7926)
artbataev Jan 16, 2024
fe358f4
NeMo Multimodal Docs and Tests Initial PR (#8028)
yaoyu-33 Jan 16, 2024
8811946
Virtual pipeline parallel support for MegatronGPTSFTModel (#7964)
vysarge Jan 17, 2024
b144bb9
removed pdeprecated eft model (#8183)
arendu Jan 17, 2024
48f3514
remove more deprecated files (#8169)
arendu Jan 18, 2024
7449c67
Fix learning rate schedule in Megatron models when `max_steps` is not…
odelalleau Jan 18, 2024
2c2c3cc
Remove left-over prints in NeMo+Lhotse code (#8180)
pzelasko Jan 18, 2024
92b098a
Upgrade to DLFW PyTorch 23.12 (#8163)
ericharper Jan 18, 2024
6c40209
pre-generate cu_seqlens argmin and max_seqlen to remove host-to-devic…
erhoo82 Jan 18, 2024
dd69c7a
Add the interface to use SHARP to FSDP strategy (#8202)
erhoo82 Jan 19, 2024
dab6a04
Multimodal required NLP base model changes (#8188)
yaoyu-33 Jan 19, 2024
e329575
[NLP] Improve and unify loading state_dict for community models (#7977)
janekl Jan 19, 2024
46f6465
[TTS] Add period discriminator and feature matching loss to codec rec…
rlangman Jan 19, 2024
3579919
Add Lhotse support for `offset` key in NeMo manifests (#8197)
pzelasko Jan 19, 2024
c56ef6c
Fix CPU Initialization and TP>1 for LoRA Merge Script (#8199)
cuichenx Jan 19, 2024
e7e007b
[docker] Install k2 before NeMo for faster image rebuilding (#8204)
pzelasko Jan 19, 2024
d656f22
Rename Finetuning Scripts (#8201)
cuichenx Jan 20, 2024
bb575b7
Final multimodal PR with our recent developments on MM side (#8127)
yaoyu-33 Jan 20, 2024
dfaf500
Added VectorQuantizer base class (#8011)
anteju Jan 20, 2024
7d3d9ac
Add include_text parameter to SFT dataloaders (#8198)
Kipok Jan 21, 2024
d8b2ffc
change end string
stevehuang52 Jan 22, 2024
b84c231
Add random_seed argument to generate (#8162)
Kipok Jan 22, 2024
8abdb25
Added support for neptune logger (#8210)
harishankar-gopalan Jan 23, 2024
40fb2ce
Pre-compute max_seqlen and cu_seqlens_argmin in all model-parallel ca…
erhoo82 Jan 23, 2024
b5265cb
Add --force_codec to tarred dataset creation examples (#8227)
pzelasko Jan 23, 2024
0f239ca
Temporarily use the previous RNN-T decoding algorithm as default (#8226)
artbataev Jan 23, 2024
a39f526
Use PackedSeqParams in accordance with changes in Megatron-LM (#8205)
cuichenx Jan 23, 2024
a44b75d
Move check to prevent running peft with VP to a more correct phase of…
vysarge Jan 23, 2024
c09b114
Fixed the tp overlap switch (#8195)
sanandaraj5597 Jan 23, 2024
d275d68
add knobs for rope/swiglu fusion (#8184)
lhb8125 Jan 24, 2024
0773702
Added sample cpu_offloading switch to YAML (#8148)
sanandaraj5597 Jan 24, 2024
aeb9799
Add support in Neural Typecheck to disable semantic checks (#8212)
titu1994 Jan 24, 2024
f10d694
Make TDT inference not require duration params (#8207)
hainan-xv Jan 24, 2024
6143f6b
RSyncing random seed between ranks (#8230)
Kipok Jan 25, 2024
f25be00
add first_val_step to mcore scheduler (#8150)
JimmyZhang12 Jan 25, 2024
618ff06
Correct padding for SFT input data to account for sequence parallel +…
vysarge Jan 25, 2024
9944304
Mistral 7b conversion script (#8052)
akoumpa Jan 25, 2024
a93589d
switch to mcore dataset [with FIM support] (#8149)
dimapihtar Jan 25, 2024
fdc4b13
mcore ds fix (#8253)
dimapihtar Jan 26, 2024
13c1db4
Mixtral to NeMo conversion script. (#8155)
akoumpa Jan 27, 2024
19ce912
fixes to accomendate mcore changes (#8261)
HuiyingLi Jan 27, 2024
898cb99
Allow MegatronPretrainingRandomSampler to do multi-epoch training (#8…
trias702 Jan 29, 2024
37ac5a3
[tutorial] fixed missing RIR scripts file. (#8257)
XuesongYang Jan 29, 2024
7b2415a
add values to en tts dict (#7879)
mgrafu Jan 30, 2024
85d8756
Add Bert HF checkpoint converter (#8088)
yaoyu-33 Jan 31, 2024
7f7a487
Merge remote-tracking branch 'origin/main' into heh/modular_speechlm_…
stevehuang52 Jan 31, 2024
f6e6485
Pin lhotse version to 1.19.2 (#8291)
pzelasko Jan 31, 2024
a4f1f1c
Fix documentation build (#8308)
artbataev Feb 1, 2024
5fdd12e
Cache Aware Streaming tutorial notebook (#8296) (#8311)
github-actions[bot] Feb 2, 2024
d10726d
Attention encoder-decoder models for multiple speech-to-text tasks …
pzelasko Feb 3, 2024
a5448f3
"Loop labels" greedy decoding: faster implementation (#8286)
artbataev Feb 3, 2024
5e22ff4
updated online sample mapping (#8181)
arendu Feb 5, 2024
dced14d
Fix memory leak caused by context parallelism hanging references by o…
github-actions[bot] Feb 5, 2024
d95624c
Fixing bug in tutorials. (#8335)
tbartley94 Feb 5, 2024
c2ea202
Support uploading NeMo models to HF via `push_to_hf_hub()` (#8263)
titu1994 Feb 6, 2024
9940ec6
add check for distributed optimizer which is unsupported for PEFT (#8…
cuichenx Feb 6, 2024
b0ddfa0
Remove asr webapp (#8347) (#8348)
github-actions[bot] Feb 6, 2024
4afc277
ASR Transcription Refactor (#8167)
titu1994 Feb 6, 2024
0fb851c
remove _target_ at model level in aed config (#8351) (#8352)
github-actions[bot] Feb 6, 2024
d3237d5
Update HF hub (#8349)
titu1994 Feb 7, 2024
2bc2e97
Change default (#8371) (#8372)
github-actions[bot] Feb 8, 2024
5a65505
Unfinished checkpoints handling (#7952)
jbieniusiewi Feb 8, 2024
c84121a
Improve communication overlapping in FP8 distributed optimizer (#8221)
timmoon10 Feb 8, 2024
b100cd1
Add AudioCodecModel to documentation (#8376)
anteju Feb 8, 2024
0bb9e66
Add longform infer for MultitaskAED models (#8355)
stevehuang52 Feb 9, 2024
6865c39
bug fix in fast-conformer-aed.yaml and adding jenkins test for speech…
github-actions[bot] Feb 9, 2024
8a08b00
Reintroduce dictionaries for data prefixes in GPT (#8362)
jbaczek Feb 12, 2024
fe0fe23
Add Finetuning tutorial with HF Datasets (#8356) (#8393)
github-actions[bot] Feb 12, 2024
8349d63
Fixes for MoE parameter passing & use of AutoTokenizer/Model for mist…
github-actions[bot] Feb 12, 2024
0bfac69
Context-biasing by CTC-based Word Spotter (CTC-WS) (#8223)
andrusenkoau Feb 13, 2024
4e7293a
Fix Canary chunked infer on short audios (#8382)
stevehuang52 Feb 13, 2024
3d0e5ca
revert changes (#8410) (#8411)
github-actions[bot] Feb 13, 2024
1f519a9
Update NFA video download link (#8406) (#8408)
github-actions[bot] Feb 13, 2024
05c051b
updated link to pubmed (#8402) (#8407)
github-actions[bot] Feb 13, 2024
3a76b9d
Mcore customization doc (#8298) (#8405)
github-actions[bot] Feb 13, 2024
03a7e4f
Script for estimating Lhotse dynamic duration buckets (#8237)
pzelasko Feb 13, 2024
a06835f
Add Canary support for decoding with return_hypotheses=True (#8338)
stevehuang52 Feb 13, 2024
6e96e9c
[TTS] Add modules for mel spectrogram codec (#8238)
rlangman Feb 14, 2024
478ec6b
coldfix (#8412)
Jorjeous Feb 14, 2024
e534beb
Merge remote-tracking branch 'origin/main' into heh/modular_speechlm_…
stevehuang52 Feb 14, 2024
d6a9ef4
fix issues from previous merge, clean up
stevehuang52 Feb 15, 2024
68b3b2e
update config
stevehuang52 Feb 15, 2024
06075d9
fix ptl bug of keeping multiple -last.ckpt
stevehuang52 Feb 19, 2024
8ae3094
fix ptl bug of keeping multiple -last.ckpt
stevehuang52 Feb 19, 2024
7302e5b
add support for non-peft tuning
stevehuang52 Feb 20, 2024
16d134b
update and add docs
stevehuang52 Feb 21, 2024
bea4256
update doc
stevehuang52 Feb 21, 2024
751404b
Merge branch 'modular_speechllm' into heh/modular_speechlm_nightly
stevehuang52 Feb 21, 2024
77405cd
update for infer
stevehuang52 Feb 22, 2024
4af9f38
Merge branch 'heh/modular_speechlm_nightly' of https://github.com/NVI…
stevehuang52 Feb 22, 2024
f569a6f
clean up
stevehuang52 Feb 22, 2024
2 changes: 2 additions & 0 deletions .dockerignore
@@ -17,3 +17,5 @@ coverage.xml
.git
**/*.nemo
**/*.ckpt
workspace
nemo_experiments
3 changes: 3 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -14,6 +14,9 @@ Add a one line overview of what this PR aims to accomplish.
# Add a code snippet demonstrating how to use this
```

# Jenkins CI
To run Jenkins, a NeMo User with write access must comment `jenkins` on the PR.

# Before your PR is "Ready for review"
**Pre checks**:
- [ ] Make sure you read and followed [Contributor guidelines](https://github.com/NVIDIA/NeMo/blob/main/CONTRIBUTING.md)
8 changes: 8 additions & 0 deletions .github/labeler.yml
@@ -3,25 +3,33 @@ ASR:
- examples/asr/**/*
- tutorials/asr/**/*
- docs/source/asr/**/*
- tests/collections/asr/**

NLP:
- nemo/collections/nlp/**/*
- examples/nlp/**/*
- tutorials/nlp/**/*
- docs/source/nlp/**/*
- tests/collections/nlp/**

Speaker Tasks:
- examples/speaker_tasks/**/*
- tutorials/speaker_tasks/**/*

TTS:
- nemo/collections/tts/**/*
- nemo/collections/common/tokenizers/text_to_speech/**
- examples/tts/**/*
- tutorials/tts/**/*
- docs/source/tts/**/*
- scripts/dataset_processing/tts/**
- scripts/tts_dataset_files/**
- tests/collections/tts/**
- tests/collections/common/tokenizers/text_to_speech/**

core:
- nemo/core/**/*
- tests/core/**

common:
- nemo/collections/common/**/*
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -28,7 +28,7 @@ repos:
- id: check-case-conflict
- id: detect-private-key
- id: check-added-large-files
args: ['--maxkb=1000']
args: ['--maxkb=5000']
- id: requirements-txt-fixer

- repo: https://github.com/PyCQA/isort
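The only change here raises the `check-added-large-files` limit from 1000 kB to 5000 kB. A minimal local check of the new limit, assuming a standard `pre-commit` setup (commands below are illustrative, not part of the PR), could look like:

```bash
# One-time setup: install the git hooks defined in .pre-commit-config.yaml
pip install pre-commit
pre-commit install

# Run only the large-file hook over the working tree; files above the
# new 5000 kB threshold are now the only ones that should be flagged.
pre-commit run check-added-large-files --all-files
```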
6 changes: 5 additions & 1 deletion .readthedocs.yml
@@ -20,12 +20,16 @@
# Required field.
version: 2

build:
os: ubuntu-22.04
tools:
python: "3.10"

# Build documentation in the docs/ directory with Sphinx.
sphinx:
configuration: docs/source/conf.py

# Set the version of Python and requirements required to build your docs
python:
version: 3.8
install:
- requirements: requirements/requirements_docs.txt
85 changes: 65 additions & 20 deletions Dockerfile
@@ -14,7 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:23.06-py3
ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:23.12-py3

# build an image that includes only the nemo dependencies, ensures that dependencies
# are included first for optimal caching, and useful for building a development
@@ -31,7 +31,7 @@ ARG REQUIRE_AIS_CLI=false

# Ensure apt-get won't prompt for selecting options
ENV DEBIAN_FRONTEND=noninteractive
# libavdevice-dev rerquired for latest torchaudio
# libavdevice-dev required for latest torchaudio
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y \
@@ -42,15 +42,48 @@ RUN apt-get update && \
libavdevice-dev && \
rm -rf /var/lib/apt/lists/*

WORKDIR /workspace/
# libtool, ... , libgts-dev are required for graphviz
# graphviz is required for k2 and pynini visualization
RUN apt-get update && \
apt-get install -y \
libtool \
libltdl-dev \
automake \
autoconf \
bison \
flex \
tcl \
ghostscript \
libgd-dev \
fontconfig \
libcairo2-dev \
libpango1.0-dev \
libgts-dev && \
rm -rf /var/lib/apt/lists/*

WORKDIR /tmp/
# TODO: Remove once this Apex commit (5/12/23) is included in PyTorch
# container
WORKDIR /workspace/
# Install megatron core, this can be removed once 0.3 pip package is released
# We leave it here in case we need to work off of a specific commit in main
RUN git clone https://github.com/NVIDIA/Megatron-LM.git && \
cd Megatron-LM && \
git checkout 27cbe46714a50c43ed290f1b1472db8d2780c55c && \
pip install .

# Performance optimizations for distributed optimizer: https://github.com/NVIDIA/apex/pull/1771
RUN git clone https://github.com/NVIDIA/apex.git && \
cd apex && \
git checkout 8b7a1ff183741dd8f9b87e7bafd04cfde99cea28 && \
pip3 install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_layer_norm" --global-option="--distributed_adam" --global-option="--deprecated_fused_adam" ./
git checkout b496d85fb88a801d8e680872a12822de310951fd && \
pip install -v --no-build-isolation --disable-pip-version-check --no-cache-dir --config-settings "--build-option=--cpp_ext --cuda_ext --fast_layer_norm --distributed_adam --deprecated_fused_adam" ./

# Transformer Engine 1.2.0
RUN git clone https://github.com/NVIDIA/TransformerEngine.git && \
cd TransformerEngine && \
git fetch origin 4f9662fbe621671f5f905e772fc1138953af77f6 && \
git checkout FETCH_HEAD && \
git submodule init && git submodule update && \
NVTE_FRAMEWORK=pytorch NVTE_WITH_USERBUFFERS=1 MPI_HOME=/usr/local/mpi pip install .

WORKDIR /tmp/

# uninstall stuff from base container
RUN pip3 uninstall -y sacrebleu torchtext
@@ -67,19 +100,20 @@ RUN INSTALL_MSG=$(/bin/bash /tmp/torchaudio_build/scripts/installers/install_tor
else echo "Skipping failed torchaudio installation"; fi \
else echo "torchaudio installed successfully"; fi

# install nemo dependencies
WORKDIR /tmp/nemo
COPY requirements .
RUN for f in $(ls requirements*.txt); do pip3 install --disable-pip-version-check --no-cache-dir -r $f; done

# install flash attention dependencies
RUN pip install flash-attn
# pinned triton version for flash-attention https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attn_triton.py#L3
RUN pip install triton==2.0.0.dev20221202
COPY scripts /tmp/nemo/scripts/
# install correct graphviz version (k2 and pynini visualization tool), skip if installation fails
RUN INSTALL_MSG=$(/bin/bash /tmp/nemo/scripts/installers/install_graphviz.sh --docker); INSTALL_CODE=$?; \
echo ${INSTALL_MSG}; \
if [ ${INSTALL_CODE} -ne 0 ]; then \
echo "graphviz installation failed"; \
if [ "${REQUIRE_K2}" = true ]; then \
exit ${INSTALL_CODE}; \
else echo "Skipping failed graphviz installation"; fi \
else echo "graphviz installed successfully"; fi

# install k2, skip if installation fails
COPY scripts /tmp/nemo/scripts/
RUN INSTALL_MSG=$(/bin/bash /tmp/nemo/scripts/speech_recognition/k2/setup.sh); INSTALL_CODE=$?; \
RUN INSTALL_MSG=$(/bin/bash /tmp/nemo/scripts/installers/install_k2.sh); INSTALL_CODE=$?; \
echo ${INSTALL_MSG}; \
if [ ${INSTALL_CODE} -ne 0 ]; then \
echo "k2 installation failed"; \
@@ -88,13 +122,24 @@ RUN INSTALL_MSG=$(/bin/bash /tmp/nemo/scripts/speech_recognition/k2/setup.sh); I
else echo "Skipping failed k2 installation"; fi \
else echo "k2 installed successfully"; fi

# install nemo dependencies
WORKDIR /tmp/nemo
ENV LHOTSE_REQUIRE_TORCHAUDIO=0
COPY requirements .
RUN for f in $(ls requirements*.txt); do pip3 install --disable-pip-version-check --no-cache-dir -r $f; done

# install flash attention
RUN pip install flash-attn
# install numba for latest containers
RUN pip install numba>=0.57.1

# copy nemo source into a scratch image
FROM scratch as nemo-src
COPY . .

# start building the final container
FROM nemo-deps as nemo
ARG NEMO_VERSION=1.20.0
ARG NEMO_VERSION=1.23.0

# Check that NEMO_VERSION is set. Build will fail without this. Expose NEMO and base container
# version information as runtime environment variable for introspection purposes
@@ -103,7 +148,7 @@ RUN /usr/bin/test -n "$NEMO_VERSION" && \
/bin/echo "export BASE_IMAGE=${BASE_IMAGE}" >> /root/.bashrc

# Install NeMo
RUN --mount=from=nemo-src,target=/tmp/nemo cd /tmp/nemo && pip install ".[all]"
RUN --mount=from=nemo-src,target=/tmp/nemo,rw cd /tmp/nemo && pip install ".[all]"

# Check install
RUN python -c "import nemo.collections.nlp as nemo_nlp" && \
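For reference, a sketch of how the build arguments visible in this diff (BASE_IMAGE, REQUIRE_AIS_CLI, NEMO_VERSION, plus the REQUIRE_K2 flag consulted by the k2 and graphviz install guards) might be exercised locally. The `nemo:dev` tag is illustrative, and this assumes REQUIRE_K2 is declared as a build ARG elsewhere in the Dockerfile:

```bash
# Hypothetical local build using the defaults shown in the diff above.
docker build \
  --build-arg BASE_IMAGE=nvcr.io/nvidia/pytorch:23.12-py3 \
  --build-arg REQUIRE_K2=true \
  --build-arg REQUIRE_AIS_CLI=false \
  --build-arg NEMO_VERSION=1.23.0 \
  -t nemo:dev .

# Re-run the Dockerfile's own import check against the finished image.
docker run --rm nemo:dev python -c "import nemo.collections.nlp as nemo_nlp"
```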