# Feature extraction tutorial

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sensein/senselab/blob/main/tutorials/audio/features_extraction.ipynb)


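In this tutorial, we use the `senselab` package to extract audio descriptors, including acoustic and quality measures. The descriptors are computed by different underlying libraries; the extraction call further down enables the `opensmile`, `parselmouth`, `torchaudio`, and `torchaudio_squim` extractors.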

In [1]:
%pip install 'senselab[audio]'

Note: you may need to restart the kernel to use updated packages.


In [2]:
!mkdir -p tutorial_audio_files
!wget -O tutorial_audio_files/audio_48khz_stereo_16bits.wav https://github.com/sensein/senselab/raw/main/src/tests/data_for_testing/audio_48khz_stereo_16bits.wav

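The download cell relies on the `mkdir` and `wget` shell commands, which may be unavailable on some systems (for example, a local Windows setup). As a hedged alternative, the sketch below does the same job with the Python standard library only. The helper name `download_file` is hypothetical and not part of `senselab`.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def download_file(url, destination):
    """Download `url` to `destination`, creating parent folders as needed."""
    target = Path(destination)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of loading it all into memory.
    with urlopen(url) as response, open(target, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return target


# Example call (requires network access), using the URL from the cell above:
# download_file(
#     "https://github.com/sensein/senselab/raw/main/src/tests/data_for_testing/audio_48khz_stereo_16bits.wav",
#     "tutorial_audio_files/audio_48khz_stereo_16bits.wav",
# )
```

The function returns the path of the written file, so the result can be passed on to later cells.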

In [3]:
# Imports
import os

from senselab.audio.data_structures import Audio
from senselab.audio.tasks.features_extraction.api import extract_features_from_audios
from senselab.audio.tasks.preprocessing import downmix_audios_to_mono, resample_audios



In [4]:
# Load the audio file
audio = Audio(filepath=os.path.abspath("tutorial_audio_files/audio_48khz_stereo_16bits.wav"))

# Downmix to mono
audio = downmix_audios_to_mono([audio])[0]

# Resample to 16 kHz
audios = resample_audios([audio], 16000)

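To make the two preprocessing steps more concrete, here is a plain-Python sketch of the underlying idea. It assumes that downmixing means averaging the channels, and it reduces the sample rate with a simple block average. This is not how `senselab` implements these functions: library resamplers typically use a much better low-pass filter. The function names are hypothetical.

```python
def downmix_to_mono(channels):
    """Average equally long channels sample by sample into one mono channel."""
    num_channels = len(channels)
    return [sum(frame) / num_channels for frame in zip(*channels)]


def decimate_by_block_average(samples, factor):
    """Reduce the sample rate by an integer factor.

    Each output sample is the mean of `factor` consecutive input samples.
    The averaging acts as a crude low-pass filter before the rate reduction;
    trailing samples that do not fill a whole block are dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[start:start + factor]) / factor
        for start in range(0, usable, factor)
    ]


# Toy stereo signal: two channels with six samples each.
stereo = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
]

mono = downmix_to_mono(stereo)  # one channel, six samples
# Going from 48 kHz to 16 kHz is an integer reduction by a factor of 3.
low_rate = decimate_by_block_average(mono, 48000 // 16000)
```

The point to take away is the change in shape: two channels become one, and the number of samples shrinks by the ratio of the two sample rates.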


In [5]:
extract_features_from_audios(
    audios=audios,
    opensmile=True,
    parselmouth=True,
    torchaudio=True,
    torchaudio_squim=True,
)

[{'opensmile': {'F0semitoneFrom27.5Hz_sma3nz_amean': 25.710796356201172,
   'F0semitoneFrom27.5Hz_sma3nz_stddevNorm': 0.1605353206396103,
   'F0semitoneFrom27.5Hz_sma3nz_percentile20.0': 21.095951080322266,
   'F0semitoneFrom27.5Hz_sma3nz_percentile50.0': 25.9762020111084,
   'F0semitoneFrom27.5Hz_sma3nz_percentile80.0': 29.512413024902344,
   'F0semitoneFrom27.5Hz_sma3nz_pctlrange0-2': 8.416461944580078,
   'F0semitoneFrom27.5Hz_sma3nz_meanRisingSlope': 82.34796905517578,
   'F0semitoneFrom27.5Hz_sma3nz_stddevRisingSlope': 99.20043182373047,
   'F0semitoneFrom27.5Hz_sma3nz_meanFallingSlope': 22.002275466918945,
   'F0semitoneFrom27.5Hz_sma3nz_stddevFallingSlope': 9.043970108032227,
   'loudness_sma3_amean': 0.86087566614151,
   'loudness_sma3_stddevNorm': 0.43875235319137573,
   'loudness_sma3_percentile20.0': 0.5877408981323242,
   'loudness_sma3_percentile50.0': 0.8352401852607727,
   'loudness_sma3_percentile80.0': 1.1747918128967285,
   'loudness_sma3_pctlrange0-2': 0.587050914764
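The output above is a list with one nested dictionary per audio, keyed by extractor name (only the `opensmile` key is visible here). For tabular analysis it can help to flatten each entry into a single level. The sketch below is one possible way to do that; `flatten_features` is a hypothetical helper, not part of `senselab`, and the example values are copied from the output above.

```python
def flatten_features(features, prefix="", sep="/"):
    """Flatten nested dictionaries into one level with joined key names."""
    flat = {}
    for key, value in features.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key path along.
            flat.update(flatten_features(value, name, sep))
        else:
            flat[name] = value
    return flat


# Two values taken from the opensmile output shown above.
example = {
    "opensmile": {
        "loudness_sma3_amean": 0.86087566614151,
        "loudness_sma3_stddevNorm": 0.43875235319137573,
    }
}

row = flatten_features(example)
# Keys now look like "opensmile/loudness_sma3_amean".
```

The separator defaults to `/` rather than `.` because several feature names already contain dots (for example `percentile20.0`), which would make dotted paths ambiguous.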

## Extracting health measurements from audio files

As part of our ongoing work on `senselab`, we curate and maintain a selection of metrics that show promise for health assessment and monitoring. Please refer to the documentation for further details. The cell below extracts these metrics.

In [6]:
from senselab.audio.workflows.health_measurements.extract_health_measurements import extract_health_measurements

extract_health_measurements(audios=audios)

[{'speaking_rate': 3.874983349680919,
  'articulation_rate': 3.874983349680919,
  'phonation_ratio': 1.0,
  'pause_rate': 0.0,
  'mean_pause_duration': 0.0,
  'mean_f0_hertz': 118.59917806814313,
  'std_f0_hertz': 30.232960797931817,
  'mean_intensity_db': 69.76277128148347,
  'std_intensity_db': 58.54414165935646,
  'range_ratio_intensity_db': -0.25736445047981316,
  'mean_hnr_db': 3.3285614070654375,
  'std_hnr_db': 3.36490968797237,
  'spectral_slope': -13.982306776816046,
  'spectral_tilt': -0.004414961849917737,
  'cepstral_peak_prominence_mean': 7.0388038514346825,
  'cepstral_peak_prominence_std': 1.5672438573255245,
  'mean_f1_loc': 613.4664268420964,
  'std_f1_loc': 303.98235579059883,
  'mean_b1_loc': 401.96960219300837,
  'std_b1_loc': 400.9001719378358,
  'mean_f2_loc': 1701.7755281579418,
  'std_f2_loc': 325.4405394017738,
  'mean_b2_loc': 434.542188503193,
  'std_b2_loc': 380.8914612651878,
  'spectral_gravity': 579.587511962247,
  'spectral_std_dev': 651.3025011919739,
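Since the result is a list with one dictionary of measurements per audio, it can be stored as a table with one row per audio. The sketch below uses only the standard library and assumes every value is a scalar, as in the output shown above. `write_measurements_csv` is a hypothetical helper, not part of `senselab`.

```python
import csv


def write_measurements_csv(measurements, path):
    """Write a list of measurement dictionaries to a CSV file, one row each."""
    # Use the union of all keys so that rows with missing metrics still fit.
    fieldnames = sorted({key for row in measurements for key in row})
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        # Metrics missing from a row are written as empty cells.
        writer.writerows(measurements)


# Example call, assuming `results` holds the list returned by
# extract_health_measurements(audios=audios):
# write_measurements_csv(results, "health_measurements.csv")
```

Sorting the column names keeps the file layout stable across runs, which makes later comparisons between recordings easier.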
 