# Speech enhancement

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sensein/senselab/blob/main/tutorials/audio/speech_enhancement.ipynb)

This tutorial demonstrates how to use the `enhance_audios` function to enhance speech signals.

We will use the [SepFormer model (speechbrain/sepformer-wham16k-enhancement)](https://huggingface.co/speechbrain/sepformer-wham16k-enhancement).

In [1]:
%pip install senselab

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import the necessary modules from the Senselab package for audio processing
import os

from senselab.audio.data_structures import Audio
from senselab.audio.tasks.plotting.plotting import play_audio
from senselab.audio.tasks.preprocessing import resample_audios
from senselab.audio.tasks.speech_enhancement import enhance_audios
from senselab.utils.data_structures import DeviceType, SpeechBrainModel




In [3]:
!mkdir -p tutorial_audio_files
!wget -O tutorial_audio_files/audio_48khz_mono_16bits.wav https://github.com/sensein/senselab/raw/main/src/tests/data_for_testing/audio_48khz_mono_16bits.wav

# Load an audio file from the specified file path
audio = Audio(filepath=os.path.abspath("tutorial_audio_files/audio_48khz_mono_16bits.wav"))

# Resample the audio from 48 kHz to 16 kHz, the sampling rate the model expects
audio = resample_audios([audio], 16000)[0]

# Play the resampled audio to verify the preprocessing step
play_audio(audio)

--2025-09-15 19:08:20--  https://github.com/sensein/senselab/raw/main/src/tests/data_for_testing/audio_48khz_mono_16bits.wav
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/sensein/senselab/main/src/tests/data_for_testing/audio_48khz_mono_16bits.wav [following]
--2025-09-15 19:08:20--  https://raw.githubusercontent.com/sensein/senselab/main/src/tests/data_for_testing/audio_48khz_mono_16bits.wav
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 472488 (461K) [audio/wav]
Saving to: ‘tutorial_audio_files/audio_48khz_mono_16bits.wav’


2025-09-15 19:08:20 (5.71 MB/s) - ‘tutorial_audio_files/audio_48kh
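
Before resampling, it can help to confirm the native sampling rate of the file rather than relying on its name. The following is a minimal sketch that uses only Python's standard `wave` module and assumes an uncompressed PCM WAV file. The helper names `wav_info` and `needs_resampling` are hypothetical and are not part of senselab.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, n_channels, bits_per_sample, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        bits = wf.getsampwidth() * 8          # sample width is reported in bytes
        duration = wf.getnframes() / rate     # frames divided by frames per second
    return rate, channels, bits, duration


def needs_resampling(path, target_rate=16000):
    """True when the file's sampling rate differs from the target rate."""
    return wav_info(path)[0] != target_rate
```

Calling `wav_info("tutorial_audio_files/audio_48khz_mono_16bits.wav")` on the downloaded file should, judging by its name, report a 48 kHz mono 16-bit recording, in which case `needs_resampling` returns `True` for the 16 kHz target used here.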



In [4]:
# Specify the pre-trained SpeechBrain speech enhancement model to use (sepformer-wham16k-enhancement)
model = SpeechBrainModel(path_or_uri="speechbrain/sepformer-wham16k-enhancement", revision="main")

# Initialize the device for running the model
device = DeviceType.CPU
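
The tutorial runs on the CPU. If a GPU is available, the device could be chosen at runtime instead of being hard-coded. The sketch below is only an illustration: `pick_device_name` is a hypothetical helper, and mapping its result to a `DeviceType.CUDA` member, as well as GPU support in `enhance_audios`, are assumptions that should be checked against the senselab documentation.

```python
def pick_device_name():
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU path, so fall back to the CPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device_name = pick_device_name()

# Assumption: DeviceType exposes a CUDA member alongside CPU.
# device = DeviceType.CUDA if device_name == "cuda" else DeviceType.CPU
```

Falling back to `"cpu"` keeps the notebook runnable on machines without a GPU, which matches the default used in this tutorial.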

In [5]:
# Enhance the audio using the loaded model, running the process on the specified device
enhanced_audio = enhance_audios(
    audios=[audio],
    model=model,
    device=device,
)[0]



Failed to load SpeechBrain model as a SpectralMaskEnhancement model: Need hparams['compute_stft']
Trying to load as a SepformerSeparation model...


2025-09-15 19:08:26,091 - senselab - INFO - Time taken to initialize the speechbrain model: 1.14 seconds
2025-09-15 19:08:30,041 - senselab - INFO - Time taken for enhancing the audios: 3.95 seconds


In [6]:
# Play the enhanced audio to hear the result after speech enhancement
play_audio(enhanced_audio)