Readspeech tutorial: README, tests, and lower audio data filter GPU defaults#1841

Merged
sarahyurick merged 21 commits into NVIDIA-NeMo:main from shubhamNvidia:pr/documentation_revised
Apr 24, 2026

@shubhamNvidia
Contributor

Summary

Documentation and testing for the readspeech audio tutorial, plus notebook hygiene and lighter default GPU fractions for the audio data filter pipeline.

Changes

Documentation & tutorial

  • tutorials/audio/readspeech/README.md — Expanded guide, including A100 benchmark notes and workflow details.
  • tutorials/audio/readspeech/readspeech_tutorial.ipynb — Added tutorial notebook for readspeech.

Tests

  • tests/stages/audio/test_readspeech_create_initial_manifest.py — New tests for readspeech initial manifest creation.
  • tests/stages/audio/test_common.py — Updates to shared audio stage tests.

Repo hygiene

  • .gitattributes: *.ipynb uses nbstripout on commit to avoid large accidental notebook output diffs; notebook language hint preserved.

Default config

  • nemo_curator/stages/audio/advanced_pipelines/audio_data_filter/default_config.yaml — Reduced default gpus for VAD, UTMOS, SigMOS, and speaker separation so defaults are less GPU-heavy out of the box (e.g. VAD 0.3 → 0.1, UTMOS/SigMOS 0.5 → 0.2, speaker separation 1.0 → 0.4).

Commits (vs main)

  • Merge latest main into this branch.
  • Docs: README benchmarks, notebook, tests, nbstripout.
  • Lower default GPU fractions in audio data filter config.

Testing

  • CI / existing audio stage tests
  • Manual: follow README + notebook on target hardware if applicable

@shubhamNvidia shubhamNvidia requested a review from a team as a code owner April 21, 2026 14:44
@shubhamNvidia shubhamNvidia requested review from sarahyurick and removed request for a team April 21, 2026 14:44
@copy-pr-bot

copy-pr-bot Bot commented Apr 21, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR adds a tutorial notebook and expanded README for the DNS Challenge ReadSpeech pipeline, new unit tests for manifest creation and audio common utilities, per-sample band-filter rejection logging, timing instrumentation in pipeline.py/run.py, and reduced default GPU fractions in default_config.yaml to lower out-of-the-box resource requirements. All changes are additive and self-contained with no breaking API modifications.

Confidence Score: 5/5

Safe to merge — only P2 style findings, no logic or correctness issues.

All findings are P2 (log level preference, minor log duplication). No correctness bugs, security issues, or data-integrity concerns were identified across any of the changed files.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemo_curator/stages/audio/advanced_pipelines/audio_data_filter/default_config.yaml | Reduced default GPU fractions for UTMOS (0.2→0.1), SIGMOS (0.2→0.1), and speaker_separation (0.4→0.3) to lower out-of-the-box resource requirements. |
| nemo_curator/stages/audio/filtering/band.py | Added a logging call on band-filter rejection, but log level is info rather than debug, which will produce verbose output at scale. |
| tests/stages/audio/datasets/test_readspeech_create_initial_manifest.py | New test file covering manifest creation, filename parsing, sample limiting, end-to-end processing, and auto-download mocking — good coverage. |
| tests/stages/audio/test_common.py | Adds unit tests for audio utility helpers (ensure_mono, ensure_waveform_2d, load_audio_file, resolve_waveform_from_item, resolve_model_path) with appropriate mocking. |
| tutorials/audio/readspeech/README.md | Significantly expanded documentation: GPU memory requirements, performance benchmarks, parameter tuning guides, SIGMOS/UTMOS/VAD threshold explanations, composability notes, and troubleshooting tips. |
| tutorials/audio/readspeech/pipeline.py | Adds timing instrumentation and an extra info log; minor style issue with logger.exception duplicating {e} in the format string. |
| tutorials/audio/readspeech/run.py | Adds wall-clock timing log after pipeline.run(), consistent with the pipeline.py change. |
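
The `logger.exception` remark on pipeline.py can be illustrated with a minimal sketch. It uses the standard-library `logging` module as a stand-in (an assumption; the tutorial may use a different logger), and `run_pipeline` is a hypothetical placeholder, not the tutorial's code. The idea is that `exception()` already appends the traceback, whose last line carries the exception message, so interpolating `{e}` into the message repeats it once more.

```python
import io
import logging


def run_pipeline() -> None:
    """Hypothetical stand-in for the real pipeline call; it always fails."""
    raise RuntimeError("GPU out of memory")


def capture_log(include_exc_in_message: bool) -> str:
    """Log a failure via logger.exception and return the emitted text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger(f"logging_sketch.{include_exc_in_message}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        run_pipeline()
    except RuntimeError as e:
        if include_exc_in_message:
            # The message repeats what the traceback already shows.
            logger.exception(f"Pipeline failed: {e}")
        else:
            # The traceback alone carries the exception text.
            logger.exception("Pipeline failed")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


duplicated = capture_log(include_exc_in_message=True)
concise = capture_log(include_exc_in_message=False)
extra = duplicated.count("GPU out of memory") - concise.count("GPU out of memory")
print(f"extra occurrences of the error text: {extra}")
```

Both variants keep the full traceback; the second one simply avoids printing the same error text twice.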

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CreateInitialManifestReadSpeechStage] -->|AudioTask per WAV| B[MonoConversionStage]
    B --> C[VAD Segmentation]
    C --> D[BandFilterStage]
    D --> E[UTMOSFilterStage]
    E --> F[SIGMOSFilterStage]
    F --> G[SpeakerSeparationStage]
    G --> H[AudioToDocumentStage]
    H --> I[JsonlWriter]

    subgraph GPU Defaults Reduced
        E
        F
        G
    end

Reviews (24): Last reviewed commit: "chore: revert secrets baseline to upstre..." | Re-trigger Greptile

Comment thread .gitattributes Outdated
*.ipynb linguist-language=Python
*.tar.gz filter=lfs diff=lfs merge=lfs -text
# Strip notebook outputs on commit to prevent accidental large diffs
*.ipynb filter=nbstripout
Contributor

P2 Duplicate *.ipynb pattern — consider merging into one line

There are now two separate *.ipynb attribute lines. Git applies both independently (different attributes don't collide), so it works correctly, but the convention is to keep all attributes for a single glob on one line to avoid confusion:

Suggested change
- *.ipynb filter=nbstripout
+ *.ipynb linguist-language=Python filter=nbstripout

You can also remove the standalone comment line above if you merge them, or keep it as a line-comment above the combined entry.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
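
The claim that two `*.ipynb` lines and one merged line are equivalent can be checked with a rough sketch. The resolver below is a deliberately simplified model of attribute lookup, not Git's implementation: it assumes plain glob patterns matched with `fnmatch`, treats every token after the pattern as `name=value`, and lets later lines override earlier ones per attribute name. `resolve_attributes` is a hypothetical helper introduced only for illustration.

```python
from fnmatch import fnmatch


def resolve_attributes(gitattributes: str, path: str) -> dict[str, str]:
    """Collect attributes for `path`; later lines win per attribute name."""
    resolved: dict[str, str] = {}
    for raw in gitattributes.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if not fnmatch(path, pattern):
            continue
        for attr in attrs:
            name, _, value = attr.partition("=")
            resolved[name] = value or "set"  # simplified: bare flags become "set"
    return resolved


two_lines = """\
*.ipynb linguist-language=Python
# Strip notebook outputs on commit to prevent accidental large diffs
*.ipynb filter=nbstripout
"""
one_line = "*.ipynb linguist-language=Python filter=nbstripout\n"

split_result = resolve_attributes(two_lines, "tutorials/audio/readspeech/readspeech_tutorial.ipynb")
merged_result = resolve_attributes(one_line, "tutorials/audio/readspeech/readspeech_tutorial.ipynb")
print(split_result == merged_result)
```

On a real checkout, `git check-attr -a -- <path>` reports the attributes Git actually resolves, which is the authoritative check.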

@shubhamNvidia
Contributor Author

/ok to test 3122770

Reduce per-stage gpus for vad, utmos, sigmos, and speaker_separation
in default_config.yaml to lighter defaults.
Contributor

@sarahyurick sarahyurick left a comment

Very nice, thanks! Left 2 open questions. I don't think they should block the PR, just want to make sure I understand.

"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Band Classification Breakdown"
Contributor

Just to double check, should we expect it to be 100% passing samples?

Contributor Author

Yes, 100% is expected. This chart shows the share of full-band segments in the output, and it will always be 100% because we filter out the narrow-band segments. I'm guessing you are confusing this with the overall band estimation.

"outputs": [
{
"data": {
"image/png": "[base64-encoded PNG chart output omitted]"
Contributor

Same here, should this be expected?

Comment thread tutorials/audio/readspeech/readspeech_tutorial.ipynb Outdated
Comment thread tutorials/audio/readspeech/readspeech_tutorial.ipynb Outdated
Comment thread tutorials/audio/readspeech/readspeech_tutorial.ipynb Outdated
Contributor

Suggested fix: Add a markdown cell between cells 8 and 9 explaining the discrepancy:

Note: With only 10 samples from a clean read-speech corpus, all samples pass the default thresholds.
On the full 14,279-sample dataset, the combined pass rate is ~23% — dominated by SIGMOS filtering
(OVRL ≥ 3.5, NOISE ≥ 4.0). To see realistic filtering behavior, increase MAX_SAMPLES to 500+.

Contributor Author

Added. Also corrected the assumption in the suggested text — with all 5 filters enabled (SIGMOS OVRL ≥ 3.5, NOISE ≥ 4.0), even 10 clean read-speech samples see a 30–70% rejection rate, so "near 100% pass rate" is not accurate. The note reads: "With MAX_SAMPLES = 10 the pass rate is highly variable. On the full 14,279-sample dataset the combined pass rate is ~23%, dominated by SIGMOS filtering. Increase MAX_SAMPLES to 500+ to see a more representative pass rate."
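
Why the pass rate is "highly variable" at MAX_SAMPLES = 10 can be sketched with a back-of-the-envelope calculation. This assumes a simple binomial model in which each sample passes independently with the full-dataset rate of about 23% quoted above; that independence is an assumption made for illustration, and the first few files of the corpus may not be representative.

```python
import math


def pass_rate_std_error(true_rate: float, n_samples: int) -> float:
    """Standard error of an observed pass rate under a binomial model."""
    return math.sqrt(true_rate * (1.0 - true_rate) / n_samples)


FULL_DATASET_PASS_RATE = 0.23  # approximate figure quoted in the thread

se_small = pass_rate_std_error(FULL_DATASET_PASS_RATE, 10)
se_large = pass_rate_std_error(FULL_DATASET_PASS_RATE, 500)

for n, se in ((10, se_small), (500, se_large)):
    print(f"MAX_SAMPLES={n}: standard error of the pass rate is about {se:.3f}")
```

Under these assumptions the spread at 10 samples is roughly 13 percentage points, against roughly 2 at 500 samples, which is consistent with the advice to raise MAX_SAMPLES before reading much into the pass rate.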

Comment thread tutorials/audio/readspeech/README.md
Contributor

The README states:

Default GPU resource allocations total 0.9 GPU (0.1 + 0.2 + 0.2 + 0.4), allowing all stages to run concurrently on a single GPU.

But when speaker separation is enabled, AudioDataFilterStage.decompose() creates duplicate filter stages for the speaker-separated branch. The notebook's own execution plan confirms 15 stages including:

  • Main branch: VAD(0.1) + UTMOS(0.2) + SIGMOS(0.2) + SpeakerSep(0.4) = 0.9
  • Speaker branch: VAD_Speaker(0.1) + BandFilter_Speaker(0.0) + UTMOS_Speaker(0.2) + SIGMOS_Speaker(0.2) = 0.5
  • Total peak: 1.4 GPU

Ray's streaming executor won't fully schedule all stages concurrently, but the README's "0.9 GPU" claim is incomplete. Users on a single GPU with limited VRAM could hit OOM if Ray overlaps stages from both branches.

Suggested fix: Update to:

Default GPU allocations total 0.9 GPU for the main branch. When speaker separation is enabled, the duplicated filter stages for the speaker branch add another 0.5 GPU fraction (1.4 total). Ray's streaming executor manages scheduling to fit available resources, but users with limited GPU memory may need to reduce fractional allocations or disable speaker separation.

Contributor Author

Fixed in two places: (1) default_config.yaml GPU fractions reduced — UTMOS 0.2→0.1, SIGMOS 0.2→0.1, SpeakerSep 0.4→0.3 — so peak across both filter branches with speaker sep enabled is now 0.9 GPU (was 1.4). (2) README note updated to document that AudioDataFilterStage.decompose() instantiates a duplicated post-speaker filter branch, explain the GPU arithmetic, and link to default_config.yaml for tuning.
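
The GPU arithmetic in this thread can be reproduced with a short sketch. The dictionary keys are hypothetical labels, not the actual keys in default_config.yaml, and the VAD fraction is assumed to stay at 0.1 since the thread only lists changes to UTMOS, SIGMOS, and speaker separation.

```python
def branch_totals(fractions: dict[str, float]) -> tuple[float, float, float]:
    """Return (main branch, speaker branch, combined) GPU fractions."""
    main = (
        fractions["vad"]
        + fractions["utmos"]
        + fractions["sigmos"]
        + fractions["speaker_separation"]
    )
    # The duplicated post-speaker branch repeats VAD, UTMOS and SIGMOS;
    # its band filter is counted as 0.0 GPU, as in the review comment.
    speaker = fractions["vad"] + 0.0 + fractions["utmos"] + fractions["sigmos"]
    # Round to two decimals to hide floating-point noise in the sums.
    return round(main, 2), round(speaker, 2), round(main + speaker, 2)


# Hypothetical labels for the per-stage fractions discussed in the thread.
before = {"vad": 0.1, "utmos": 0.2, "sigmos": 0.2, "speaker_separation": 0.4}
after = {"vad": 0.1, "utmos": 0.1, "sigmos": 0.1, "speaker_separation": 0.3}

print("before:", branch_totals(before))  # (0.9, 0.5, 1.4)
print("after: ", branch_totals(after))   # (0.6, 0.3, 0.9)
```

With the reduced fractions, the combined figure for both branches comes to 0.9, matching the peak quoted in the reply.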

Comment thread tutorials/audio/readspeech/readspeech_tutorial.ipynb Outdated
@shubhamNvidia
Contributor Author

/ok to test 8ffc99d

- notebook: replace private _EmptyTask with public EmptyTask singleton
- notebook: drop RayDataExecutor, use pipeline.run() to match CLI default (Xenna)
- notebook: silence loguru/NeMo/Ray warnings before imports; add ray.shutdown()
- notebook: fix pass-rate note — remove incorrect 'near 100%' claim, clarify
  pass rate is variable at MAX_SAMPLES=10 vs ~23% on full dataset
- notebook: commit clean chart outputs; strip cells with library-emitted paths
- README: restore CC BY 4.0 hyperlinks in Dataset Overview and License sections
- README: update GPU table and note to reflect reduced fractions and document
  duplicated post-speaker filter branch (peak 0.9 GPU with speaker sep enabled)
- default_config.yaml: UTMOS 0.2->0.1, SIGMOS 0.2->0.1, SpeakerSep 0.4->0.3
- band.py: add BAND FILTER FAILED rejection log matching UTMOS/SIGMOS pattern
@shubhamNvidia
Contributor Author

/ok to test d496e74

Contributor

@sarahyurick sarahyurick left a comment

LGTM thanks!

Comment thread nemo_curator/stages/audio/filtering/band.py
Resolve E402 (imports not at top of cell), I001 (unsorted imports), and
S607 (partial executable path) in readspeech_tutorial.ipynb.
Wrap pipeline.run(executor) in pipeline.py and run.py with time.monotonic()
and log the elapsed wall time alongside the input dataset directory so users
can quickly see end-to-end runtime for their input data.
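
The timing change described in this commit message might look roughly like the sketch below. It is not the code from pipeline.py or run.py: `run_with_timing` is a hypothetical helper, the standard-library `logging` module stands in for whatever logger the tutorial uses, and a trivial lambda stands in for `pipeline.run(executor)`.

```python
import logging
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")
logger = logging.getLogger("readspeech_timing_sketch")


def run_with_timing(run: Callable[[], T], input_dir: str) -> tuple[T, float]:
    """Call `run`, log elapsed wall time with the input directory, return both."""
    start = time.monotonic()  # monotonic clock: unaffected by system clock changes
    result = run()
    elapsed = time.monotonic() - start
    logger.info("Pipeline finished in %.2f s (input: %s)", elapsed, input_dir)
    return result, elapsed


# Placeholder workload and path; the real call would be pipeline.run(executor).
result, elapsed = run_with_timing(lambda: sum(range(1000)), "/path/to/readspeech")
print(f"result={result}, elapsed={elapsed:.4f}s")
```

`time.monotonic()` is preferred over `time.time()` for measuring durations because it never goes backwards if the system clock is adjusted mid-run.
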
@shubhamNvidia
Contributor Author

/ok to test 9b413f8

Comment on lines +145 to +148
"# Shut down any stale Ray instance and kill leftover Ray OS processes so\n",
"# Xenna's ray.init() finds exactly one cluster.\n",
"ray.shutdown()\n",
"subprocess.run([\"ray\", \"stop\", \"--force\"], capture_output=True, check=False) # noqa: S607\n",
Contributor

Instead of using Ray directly, the tutorial can start with:

ray_client = RayClient()
ray_client.start()

and end with ray_client.stop().
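
One way to make sure the stop call always runs, even when a notebook cell fails, is to wrap the start/stop pair in a context manager. The sketch below assumes only what the comment above states, namely that the client exposes `start()` and `stop()`. `FakeClient` is a hypothetical stand-in so the example runs without Ray or NeMo Curator installed; it is not the real `RayClient`.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class StartStop(Protocol):
    """Anything with start() and stop(), as the review comment describes."""

    def start(self) -> None: ...
    def stop(self) -> None: ...


@contextmanager
def running(client: StartStop) -> Iterator[StartStop]:
    """Start the client and always stop it, even if the body raises."""
    client.start()
    try:
        yield client
    finally:
        client.stop()


class FakeClient:
    """Hypothetical stand-in for RayClient, used only to make the sketch runnable."""

    def __init__(self) -> None:
        self.events: list[str] = []

    def start(self) -> None:
        self.events.append("start")

    def stop(self) -> None:
        self.events.append("stop")


fake = FakeClient()
try:
    with running(fake):
        raise RuntimeError("pipeline failed")  # simulate a failing cell
except RuntimeError:
    pass
print(fake.events)  # ['start', 'stop']
```

In a notebook, the same effect can be had by calling `start()` in the first cell and `stop()` in the last, at the cost of leaving the cluster running if an intermediate cell fails.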

Contributor Author

noted

@shubhamNvidia
Contributor Author

/ok to test fb89584

@shubhamNvidia
Contributor Author

/ok to test 2c2e97c

@shubhamNvidia
Contributor Author

/ok to test fc7c522

@copy-pr-bot

copy-pr-bot Bot commented Apr 24, 2026

/ok to test fc7c522

@shubhamNvidia, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@shubhamNvidia
Contributor Author

/ok to test d9851b1

Add required metadata field to display_data outputs and name field
to stream outputs to fix nbformat validation errors.
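
The kind of fix this commit message describes can be sketched on the raw notebook JSON. This is an illustration, not the script used in the PR: `add_required_output_fields` is a hypothetical helper, it works on a plain dict rather than through the nbformat library, and defaulting a missing stream name to "stdout" is an assumption.

```python
def add_required_output_fields(notebook: dict) -> int:
    """Fill in missing output fields in place; return how many outputs changed."""
    changed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # only code cells carry outputs
        for output in cell.get("outputs", []):
            kind = output.get("output_type")
            if kind == "display_data" and "metadata" not in output:
                output["metadata"] = {}
                changed += 1
            elif kind == "stream" and "name" not in output:
                output["name"] = "stdout"  # assumption: the stream was stdout
                changed += 1
    return changed


# Tiny in-memory notebook with one output of each problematic kind.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["## 6. Band Classification Breakdown"]},
        {
            "cell_type": "code",
            "outputs": [
                {"output_type": "display_data", "data": {"text/plain": ["<Figure>"]}},
                {"output_type": "stream", "text": ["done\n"]},
            ],
        },
    ]
}

changed = add_required_output_fields(notebook)
print(changed)  # 2
```

After such a patch, running the notebook through `nbformat.validate` is a reasonable way to confirm that no schema errors remain.
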
@shubhamNvidia
Contributor Author

/ok to test 7b2fa1a

@shubhamNvidia
Contributor Author

/ok to test 67e23aa

Labels

r1.2.0 Pick this label for auto cherry-picking into r1.2.0

3 participants