Applications

  • Noise reduction, for example in VoIP and cellular calls.
  • Text to speech.
  • Speech to text.

Noise reduction

Motivation

  • Improving intelligibility
  • Reducing listener fatigue

Can be formulated in different ways

  • Speech enhancement
  • Noise reduction
  • Source separation

Stationary noise sources are well served by traditional DSP methods. However, complex non-stationary noise sources are not. Machine learning, and especially deep learning, has shown great promise here.

This is usually done by computing a time-frequency mask and applying it to the spectrogram of the noisy signal, in the spirit of spectral subtraction.
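
As a rough illustration of the masking idea, here is a minimal NumPy sketch of classical magnitude spectral subtraction expressed as a gain mask. It is not taken from any particular library: the function names (`stft`, `istft`, `spectral_subtraction_mask`, `denoise`), the frame and hop sizes, and the gain floor are all assumptions made for this example. It also assumes a separate noise-only recording is available for estimating the noise spectrum, which only makes sense for roughly stationary noise. A learned model would replace the hand-written mask with a predicted one.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Short-time Fourier transform using a periodic Hann window."""
    # Periodic Hann: windows shifted by frame_len / 2 sum to exactly 1.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)


def istft(spec, frame_len=512, hop=256):
    """Inverse STFT by plain overlap-add (no synthesis window)."""
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out


def spectral_subtraction_mask(noisy_spec, noise_mag, floor=0.05):
    """Per-bin gain in [floor, 1] left after subtracting the noise magnitude."""
    mag = np.abs(noisy_spec)
    gain = (mag - noise_mag) / np.maximum(mag, 1e-12)
    # The floor limits "musical noise" caused by bins being zeroed at random.
    return np.clip(gain, floor, 1.0)


def denoise(noisy, noise_only, frame_len=512, hop=256):
    """Estimate the noise spectrum, build a mask, and apply it to the noisy STFT."""
    # Average noise magnitude per frequency bin (stationarity assumption).
    noise_mag = np.abs(stft(noise_only, frame_len, hop)).mean(axis=0)
    spec = stft(noisy, frame_len, hop)
    mask = spectral_subtraction_mask(spec, noise_mag, floor=0.05)
    # The mask scales magnitudes only; the noisy phase is reused as-is.
    return istft(mask * spec, frame_len, hop)


# Toy usage: a 440 Hz tone buried in white noise (synthetic data).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * rng.standard_normal(fs)
noise_only = 0.1 * rng.standard_normal(fs)
enhanced = denoise(noisy, noise_only)
```

With this framing the output is slightly shorter than the input (trailing samples that do not fill a whole frame are dropped), and the first and last half-frames are not fully reconstructed, so any comparison against the clean signal should skip `hop` samples at each end.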