upscalemp3

Converts a lossy mp3 file back into an approximation of its uncompressed wav counterpart using a generative AI model built with TensorFlow.

Control Flow Outline:

mp3 and corresponding wav audio files are broken into overlapping 1-second segments for training.
These audio segments are transformed into their spectral composition using the (librosa) STFT and interleaved to support stereo.
A polynomial regression is run on the magnitudes at every time step to roughly recover the higher frequencies lost to compression.
These augmented mp3 spectrograms are passed along with their corresponding wav spectrograms into a UNet-style neural network with residual encoder blocks (ResUNet) to clean up the polynomial regression and add more precision to magnitudes. Note: Phases are omitted from the model, as the Griffin-Lim algorithm does a better job (and because it cuts the size of the model in half).
After training, the model predicts the missing spectrogram data from input mp3 audio segments, and the ISTFT of the predicted spectrogram is encoded as a wav file at 44.1 kHz.
The overlapping interleaved segments are combined using overlap-add (OLA) with a Hann window. This seems to produce some artifacts caused by the hard 1 s splits in the original mp3, so I might experiment with cutting at zero crossings to minimize weirdness in the spectrograms.
Finally, each channel is run through a slightly modified Griffin-Lim algorithm to rebuild the correct phases. The algorithm runs for 200 iterations by default, which can take a long time because it runs once per channel. That said, anything under 100 iterations sounds noticeably worse.
The L and R channels are combined, and the output is encoded as a 24-bit PCM .wav file that gets written to the designated output filepath as "upscaled_mp3.wav".
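
The segmentation and overlap-add steps in the outline can be sketched with NumPy as follows. This is a minimal illustration, not the repository's code: the 50% hop, the function names `split_segments` and `overlap_add`, and the normalisation by the summed window are all assumptions, since the README does not state the overlap amount. In the real pipeline, the model's prediction for each segment would sit between the two calls.

```python
import numpy as np

SR = 44100          # sample rate mentioned in the outline
SEG_LEN = SR        # 1-second segments
HOP = SEG_LEN // 2  # assumption: 50% overlap; the README does not state the hop


def split_segments(audio, seg_len=SEG_LEN, hop=HOP):
    """Split a 1-D signal into overlapping segments, zero-padding the tail."""
    n_segs = max(1, int(np.ceil((len(audio) - seg_len) / hop)) + 1)
    padded = np.zeros((n_segs - 1) * hop + seg_len, dtype=np.float64)
    padded[:len(audio)] = audio
    return np.stack([padded[i * hop:i * hop + seg_len] for i in range(n_segs)])


def overlap_add(segments, hop=HOP):
    """Recombine segments with a Hann window, normalising by the window sum."""
    n_segs, seg_len = segments.shape
    # Periodic Hann window: copies shifted by seg_len / 2 sum to exactly 1.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(seg_len) / seg_len)
    out = np.zeros((n_segs - 1) * hop + seg_len)
    norm = np.zeros_like(out)
    for i, seg in enumerate(segments):
        start = i * hop
        out[start:start + seg_len] += seg * window
        norm[start:start + seg_len] += window
    # Guard against dividing by ~0 at the outer edges, where the window
    # tapers to zero and no neighbouring segment overlaps.
    return out / np.maximum(norm, 1e-8)
```

With unmodified segments, `overlap_add(split_segments(x))` returns `x` apart from the very first samples, where the window is (nearly) zero. Any seams heard in practice would therefore come from differences between neighbouring predicted segments rather than from the windowing itself.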
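
One possible reading of the polynomial regression step is sketched below: for each time frame, a low-degree polynomial is fitted to the log-magnitudes just below the mp3 cutoff and extrapolated into the empty bins above it. The function name `extrapolate_highs`, the parameters `cutoff_bin`, `degree`, and `fit_bins`, and the choice to fit in the log domain are all assumptions made for illustration; the repository may do this differently.

```python
import numpy as np


def extrapolate_highs(mag, cutoff_bin, degree=2, fit_bins=64):
    """Fill bins >= cutoff_bin of each frame from a polynomial fit to the bins below.

    mag: array of shape (n_freq_bins, n_frames) holding non-negative magnitudes.
    """
    n_bins, _ = mag.shape
    lo = max(0, cutoff_bin - fit_bins)
    x_fit = np.arange(lo, cutoff_bin)       # bins used for the fit
    x_new = np.arange(cutoff_bin, n_bins)   # bins to be filled in
    # Fit in log-magnitude so the extrapolated values stay positive after exp().
    log_mag = np.log(mag[lo:cutoff_bin] + 1e-8)
    # np.polyfit accepts a 2-D y and fits one polynomial per column (per frame).
    coeffs = np.polyfit(x_fit, log_mag, degree)   # shape (degree + 1, n_frames)
    powers = np.vander(x_new, degree + 1)         # shape (len(x_new), degree + 1)
    out = mag.copy()
    out[cutoff_bin:] = np.exp(powers @ coeffs)
    return out
```

An extrapolation like this only supplies a rough spectral envelope, which is consistent with the outline's description of the network as cleaning up the regression afterwards.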
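
The final write step can be illustrated with Python's standard `wave` module alone. This is a sketch under assumptions, not the project's implementation: `write_wav_24bit` is a hypothetical helper, and the channels are assumed to be equal-length sequences of floats in [-1, 1].

```python
import wave


def write_wav_24bit(path, left, right, sample_rate=44100):
    """Write two equal-length float channels in [-1, 1] as interleaved 24-bit PCM."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    max_int = 2 ** 23 - 1
    frames = bytearray()
    for l, r in zip(left, right):
        for sample in (l, r):  # interleave L, R, L, R, ...
            clipped = max(-1.0, min(1.0, float(sample)))
            frames += int(round(clipped * max_int)).to_bytes(3, "little", signed=True)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(2)
        wf.setsampwidth(3)  # 3 bytes per sample = 24-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
```

For example, `write_wav_24bit("upscaled_mp3.wav", left, right)` would produce a stereo file at 44.1 kHz. The per-sample Python loop is slow for full-length tracks, so a vectorised writer would be preferable outside of a sketch.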

Repository files:

LICENSE
LSTM_ResUNet_small.ipynb
MODEL_DOWNLOAD_LINKS.md
README.md
main.py
model_uresnet.py
normalize_.py
postprocessing.py
predict.py
preprocessing.py

matthewmcq/upscalemp3
