
GitHub Actions: Add style check #1

Closed

ayushdg wants to merge 2 commits into NVIDIA-NeMo:main from ayushdg:ci/style-check

Conversation

Contributor

@ayushdg ayushdg commented Mar 18, 2024

No description provided.

@ayushdg ayushdg marked this pull request as draft March 18, 2024 18:43
@ayushdg ayushdg marked this pull request as ready for review March 18, 2024 19:24
@ryantwolf ryantwolf self-requested a review March 18, 2024 20:56
Contributor

@ryantwolf ryantwolf left a comment

testing github action

Contributor Author

ayushdg commented Mar 18, 2024

Closing in favor of #3

@ayushdg ayushdg closed this Mar 18, 2024
@ayushdg ayushdg deleted the ci/style-check branch April 3, 2024 00:14
copy-pr-bot Bot pushed a commit that referenced this pull request Nov 21, 2025
copy-pr-bot Bot pushed a commit that referenced this pull request Apr 1, 2026

Fix Cosmos-Embed1 compatibility with transformers 4.56+ / 5.x
Jorjeous added a commit that referenced this pull request Apr 27, 2026
…iption prompt for Qwen3-Omni

Adds a language-agnostic single-turn ASR pseudo-labeling prompt for non-English
audio. Unlike the English two-turn flow (transcription + disfluency follow-up),
this prompt combines transcription and verbatim fidelity into one instruction,
making the follow-up turn unnecessary for ML languages.

- examples/audio/qwen_omni_inprocess/prompts/ml_qwen3_omni_disfluency_asr.md
  (uses {language} placeholder)
- nemo_curator/models/qwen_omni.py: _resolve_prompt() helper + thread
  language through _build_messages, _build_turn2_messages, _prepare_single,
  _prepare_batch, _prepare_turn2_single, _prepare_turn2_batch, generate()
- nemo_curator/stages/audio/inference/qwen_omni.py: source_lang_key field
  pulls per-sample language from manifest and passes to model.generate()
- examples/audio/qwen_omni_inprocess/run_pipeline.py: --source_lang_key CLI

Surgical squash cherry-pick of #1839 (additive bits only).
Skipped FastTextLIDStage source_lang_key (would conflict with PR #1's
source-tracking refactor) and initialize_fields drop (already handled). #NO_PR

Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Jorjeous added a commit that referenced this pull request Apr 27, 2026
…st, validated improvements on top of the 4 PRs

Squash cherry-pick of integration-test's unique commits on top of #1853 + #1 + #3 + #1839:

- 633acc7 FastText and Hallucination update
  → SelectBestPredictionStage: cross-model WER agreement. If both omni and
    ASR are flagged hallucinated but agree (WER ≤ 100 - min_agreement_pct,
    default 80%), keep omni and mark recovered — two independent models
    producing near-identical text is strong evidence the text is correct.
  → FastTextLIDStage: HuggingFace-format model loader, proper _predict()
    abstraction, source-tracked _skip_me ("Wrong language:{name}").

- 5fdfa0a additional notes key + skip writing keys after skip_me + pnc prompt + prefill caching
  → Models (qwen_omni, qwen_asr, qwen_text_llm): notes_key field for
    diagnostic info, vLLM enable_prefix_caching=True with xxhash.
  → text_filtering stages: skip writing output keys when skip_me is set.
  → New file: prompts/pnc_prompt.md.

- 15424e3 updated prompt for ITN
  → Sharper ITN prompt (handles more conversion edge cases).

- 0cf8e6c match max model len for ITN and PnC
  → Aligned ITN/PnC max_model_len (4096), max_num_seqs (16),
    gpu_memory_utilization (0.95). Wired ITN args through run_pipeline.

- 7e32df1 add Qwen3ASR for all
  → Apply QwenASR recovery to all hallucination flags, not just specific
    patterns. WhisperHallucinationStage tweaks.

- caccd37 Add min word count for FastText
  → Re-adds min_word_count=2 (FastText is unreliable on single-word inputs).
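The cross-model agreement rule described for `SelectBestPredictionStage` can be sketched as follows. This is a minimal illustration, not the stage's actual code: the WER implementation and the `recover_if_agreeing` helper name are assumptions; only the threshold formula (WER ≤ 100 − min_agreement_pct, default 80%) comes from the commit message:

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate between two transcripts, as a percentage.

    Computed as word-level Levenshtein distance over the reference length.
    """
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return 100.0 * d[len(r)][len(h)] / max(len(r), 1)


def recover_if_agreeing(omni_text: str, asr_text: str,
                        min_agreement_pct: float = 80.0) -> bool:
    """Keep the omni transcript when both flagged models closely agree."""
    return wer(omni_text, asr_text) <= 100.0 - min_agreement_pct
```

The intuition, as the message puts it, is that two independent models producing near-identical text is strong evidence the text is correct, even when each model's own hallucination heuristic fired.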

Conflict resolution:
- run_pipeline.py: kept the multi-line argparse style (ours), kept --source_lang_key,
  adopted their ITN stage construction (with the new max_model_len/num_seqs/gpu_mem args).
- fasttext_lid.py: took their richer process logic (min_word_count check,
  per-sample expected language via source_lang_key, source-tracked _skip_me values). #NO_PR
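The ITN/PnC alignment above amounts to sharing one set of engine arguments between the two stages. A minimal sketch, assuming the parameter names follow vLLM's `LLM` constructor (the dict-sharing pattern and `build_stage_kwargs` helper are illustrative; the real wiring goes through run_pipeline's CLI args):

```python
# Shared vLLM engine arguments matching the values in the commit message.
SHARED_VLLM_ARGS = {
    "max_model_len": 4096,           # same context budget for ITN and PnC
    "max_num_seqs": 16,              # batch width
    "gpu_memory_utilization": 0.95,  # leave ~5% GPU memory headroom
}


def build_stage_kwargs(model_name: str, **overrides) -> dict:
    """Combine the shared engine args with per-stage overrides (illustrative)."""
    return {"model": model_name, **SHARED_VLLM_ARGS, **overrides}
```

Keeping both stages on one config avoids the mismatch the commit fixes, where ITN and PnC could end up with different context or memory budgets.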

Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
Jorjeous added a commit that referenced this pull request Apr 27, 2026
Merge origin/main into dev to pick up upstream changes (492 files, +57k/-6k):
- 26.04 staging release
- Generic ASR/TTS audio processing pipeline (#1679)
- Dynamo disaggregated serving + validators (#1813, #1820, #1833, #1834, #1861)
- ReadSpeech audio curation benchmark + tutorials (#1841, #1851, #1870)
- VideoReader path validation, audio waveform leak fixes (#1845, #1765)
- Sortformer tutorial fixes + benchmarks (#1764)
- Generic audio pipeline + qwen3 support (#1827)
- Fern docs (audio + curate-audio sections)

Conflict resolution:
- nemo_curator/stages/audio/__init__.py: kept dev's lazy __getattr__ registry,
  added main's new ManifestReader and ManifestWriterStage to both __all__ and
  _LAZY_IMPORTS (now lazy-loaded from nemo_curator.stages.audio.common).
- uv.lock: took main's version (latest dependency resolutions).

Removals propagated from main (pre-merge-base files we no longer need):
- nemo_curator/stages/audio/alm/alm_manifest_writer.py (replaced by ShardedManifestWriterStage)
- nemo_curator/stages/audio/alm/alm_manifest_reader.py
- nemo_curator/backends/experimental/* (refactored away)
- nemo_curator/core/serve.py (replaced by typed serve config)

Verified intact:
- SCOTCH pipeline: speaker_id/, hifi_pipeline/slurm_e2e/ (dev-only additions, untouched).
- Cherry-picked audio PRs (#1853, #3, #1, #1839, integration-test) all present.

Signed-off-by: George Zelenfroynd <gzelenfroind@nvidia.com>
