NVIDIA / NeMo Public

Issues: NVIDIA/NeMo

56 Open 2,399 Closed

Issues list

Error in saving nemo checkpoint with Llama-3.1-70B SFT. /opt/NeMo/nemo/utils/callbacks/nemo_model_checkpoint.py bug stale

#12157 opened Feb 12, 2025 by songwang41

Possible bug in ASRDecoderTimeStamps - math.ceil on fractional tokens_per_chunk leads to timestamps displacements on long files bug

#11604 opened Dec 15, 2024 by bene-ges

when i use container to do sft for any model, it has context not found error bug

#11825 opened Jan 11, 2025 by munger1985

Annoyed NeMo User Rant

#11835 opened Jan 13, 2025 by yonas-g

Broken offline mode of NeMo bug

#11899 opened Jan 20, 2025 by maxstrobel

Hybrid Sharding support with FSDP stale

#11946 opened Jan 23, 2025 by Teng-xu

Pickling error when trying to save checkpoints with custom checkpointIO bug

#11955 opened Jan 24, 2025 by jdnurme

Add option for prefetch factor of data loader to config stale

#11977 opened Jan 28, 2025 by shengshiqi-google

ASR: How to convert .ckpt to nemo correctly? ASR bug

#12003 opened Jan 31, 2025 by ican24

llava-like dataset implementation "LazySupervisedDataset" likely fails to handle large dataset

#12034 opened Feb 3, 2025 by bernardhan33

AttributeError: 'HFDatasetDataModule' object has no attribute 'tokenizer' bug

#12080 opened Feb 6, 2025 by j40903272

[QST] How to set MoE-specific TP size in recipe? stale

#12103 opened Feb 8, 2025 by umiswing

Bug when generating confidence scores with timestamps for a buffered rnnt model ASR bug

#11456 opened Dec 3, 2024 by aanchan

Fail to convert trained checkpoint to HF format bug stale

#12124 opened Feb 10, 2025 by Zhihan1996

NeMo is not friendly to HF compatibility.

#12166 opened Feb 13, 2025 by dyang67

I am trying to train the FastConformer 120M model from scratch, but it is not converging? ASR help wanted

#12167 opened Feb 13, 2025 by PhamDangNguyen

Update TE version for support of pad_between_seqs=True bug

#12174 opened Feb 13, 2025 by cyanguwa

HiFiGAN Finetune "Cannot re-initialize CUDA in forked subprocess." bug

#12178 opened Feb 13, 2025 by Fournogo

Bloated pre-requirements

#12188 opened Feb 14, 2025 by ceyxasm

Support configuration of num_workers and max_samples_per_sequence in llava_next_pretrain

#12195 opened Feb 14, 2025 by bernardhan33

Checkpointing randomly fails bug

#12203 opened Feb 15, 2025 by aflah02

Pre-Training Neva under pipeline parallel set to 2. bug

#12205 opened Feb 16, 2025 by takuya576

loss divergence when CP>1 and MBS>1 bug

#12210 opened Feb 17, 2025 by hawkoli1987

There exists a 2.4B training parameter during fine-tuned training of a 70B model, where did this parameter come from?

#12213 opened Feb 17, 2025 by echo-valor

HUGE Inconsistency between logged tokens_per_second_per_GPU and actual wall time and Global Step is Not Monotonically Increasing bug

#12727 opened Mar 21, 2025 by aflah02
