This repo is a collection of software for a hearing aid that uses deep learning to perform de-noising. The software will be deployed on a Jetson Nano 2GB, which in turn will be connected to an earpiece via headphone and microphone jacks.
The processing pipeline consists of a Short-Time Fourier Transform (STFT), an FCNN model, and an inverse STFT (ISTFT). The sampling rate of the input/output audio is 22.05 kHz. The FCNN is based on https://arxiv.org/pdf/1609.07132.pdf and https://sthalles.github.io/practical-deep-learning-audio-denoising/. Its main features are skip connections and kernels with width 1. The pipeline accepts a 16 * 1024 sample input (0.743s of audio) and outputs a 1024 sample block (0.046s of audio). Because no pipelining occurs (in the hardware sense of the word: processing multiple components of the pipeline at once, i.e. performing the STFT, FCNN, and ISTFT at once), each call must finish before the next 1024 sample block is due, so the pipeline latency must be less than 0.046s. When tested on the Jetson Nano 2GB, the pipeline takes less than 0.03s.
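
The timing figures above follow directly from the sample rate and block sizes. The sketch below reproduces that arithmetic and shows one way the 16-block input could be assembled. It is an illustration, not the code in this repo: the names `ContextBuffer`, `meets_deadline`, and the constants are hypothetical, and it assumes the 16 * 1024 sample input is a sliding window over the most recent blocks, which the description does not state explicitly.

```python
from collections import deque

SAMPLE_RATE_HZ = 22_050   # 22.05 kHz, from the description above
BLOCK_SAMPLES = 1024      # samples produced per pipeline call
CONTEXT_BLOCKS = 16       # blocks of input handed to the pipeline per call


def seconds(n_samples, rate_hz=SAMPLE_RATE_HZ):
    """Duration in seconds of n_samples at the given sample rate."""
    return n_samples / rate_hz


# 16 * 1024 samples of input context, and one 1024-sample output block.
input_window_s = seconds(CONTEXT_BLOCKS * BLOCK_SAMPLES)
block_budget_s = seconds(BLOCK_SAMPLES)


def meets_deadline(measured_latency_s, budget_s=block_budget_s):
    """True if one STFT -> FCNN -> ISTFT call finishes before the next block is due."""
    return measured_latency_s < budget_s


class ContextBuffer:
    """Hypothetical sliding window holding the most recent CONTEXT_BLOCKS blocks.

    Assumption: the pipeline input is the latest 16 blocks, with silence
    (zeros) used until enough audio has arrived.
    """

    def __init__(self, n_blocks=CONTEXT_BLOCKS, block_samples=BLOCK_SAMPLES):
        self.block_samples = block_samples
        self.blocks = deque(
            ([0.0] * block_samples for _ in range(n_blocks)), maxlen=n_blocks
        )

    def push(self, block):
        """Append a new block, drop the oldest, and return the flattened window."""
        if len(block) != self.block_samples:
            raise ValueError("expected a block of %d samples" % self.block_samples)
        self.blocks.append(list(block))
        return [sample for b in self.blocks for sample in b]


if __name__ == "__main__":
    print("input window: %.3f s" % input_window_s)
    print("per-block budget: %.3f s" % block_budget_s)
    print("0.03 s latency meets deadline:", meets_deadline(0.03))
```

With these numbers, a measured latency of 0.03s leaves roughly 0.016s of headroom per block before audio would start to fall behind.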