ADE: Audio Disambiguation Engine

Restore degraded (LAME-encoded) MP3s with modern machine learning.

ADE is a lightweight, experimental tool for recovering perceptual audio quality from lossy-compressed MP3 files. It specifically targets artifacts introduced by the LAME encoder and approaches restoration as a disambiguation problem: given information lost through perceptual coding, the model selects a plausible original consistent with learned musical structure and codec behavior. In internal evaluation, ADE achieves up to an 80% reduction in NMSE relative to the baseline LAME 3.100 decoder on held-out material.

Unlike conventional denoising, ADE explicitly targets the irreversible distortions introduced by perceptual compression, providing a post-hoc restoration path when the original master is unavailable.


Quick Facts

  • Not generative. ADE is regularized to select a single plausible reconstruction, rather than hallucinating new content.
  • Not denoising. MP3 distortion is not additive noise; it is the result of a many-to-one compression process.
  • Psychoacoustically regularized. The training objective combines mode-locking with complementary geometric regularizers to encourage musically coherent outputs.
  • Evaluated on ~400k unseen music blocks. Peak relative NMSE reduction is approximately 80%, roughly equivalent to an effective bitrate gain of 160 kb/s.
  • CBR MP3 only. VBR and ABR files are rejected.
  • Demo limit: 5 MB per file.
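As a rough illustration of how a relative NMSE reduction like the one quoted above could be computed, here is a minimal sketch. It assumes NMSE is the error energy divided by the reference energy; the exact normalization used in ADE's evaluation is not stated here, and the function names and sample values are made up for the example.

```python
def nmse(reference, estimate):
    """Normalized MSE: sum((ref - est)^2) / sum(ref^2). Assumed definition."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    if signal_energy == 0:
        raise ValueError("reference signal has zero energy")
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return error_energy / signal_energy


def relative_nmse_reduction(reference, baseline, restored):
    """Fraction by which `restored` lowers NMSE compared with `baseline`."""
    base = nmse(reference, baseline)
    if base == 0:
        raise ValueError("baseline already matches the reference")
    return (base - nmse(reference, restored)) / base


# Toy numbers for illustration only; these are not ADE results.
reference = [1.0, -2.0, 3.0, -4.0]
baseline = [1.5, -2.5, 3.5, -4.5]   # plain decoder output, error 0.5 per sample
restored = [1.1, -2.1, 3.1, -4.1]   # restored output, error 0.1 per sample
reduction = relative_nmse_reduction(reference, baseline, restored)
print(f"{reduction:.0%}")  # 96%
```

Here `reference` stands for the original signal, `baseline` for the output of a standard decoder, and `restored` for the output after restoration.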

The Problem: The Lost Master

In modern audio workflows, compression is often irreversible in practice. A good codec aims to maximize perceptual quality while minimizing bitrate.

There are two broad directions:

  • Pre-hoc methods: improving the encoder itself through better psychoacoustic models and coding schemes, as in AAC and Opus.
  • Post-hoc methods: attempting to improve the output of an already-compressed file without access to the original source.

The second problem remains much less explored. In real life, source material is often unavailable: the original master may be lost, archived poorly, or simply inaccessible. That leaves only the compressed file, and the question becomes whether useful structure can still be recovered.

ADE is built for that setting.


Why This Is Different

MP3 compression is non-injective: multiple different originals can map to the same compressed representation. In geometric terms, each frame corresponds to a family of plausible preimages — a fiber of possible originals.

These fibers are not necessarily smooth or convex. They can contain gaps, discontinuities, and highly structured constraints. That is why naive denoising or regression-to-the-mean tends to fail: averaging over all plausible solutions often produces blurred, weak, or unnatural audio.

ADE treats restoration as a regularized disambiguation problem. It learns musical structure, evaluates plausible candidates under codec-aware priors, and locks onto a single high-probability mode per frame. The goal is not to invent new content, but to recover the most plausible lost detail.
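The difference between averaging and mode selection can be seen in a toy setting. The sketch below is not ADE's model: it uses a made-up one-dimensional "codec" that discards the sign of a value, so the fiber of the code 1.0 is the disconnected set {-1.0, +1.0}. The prior weights are hypothetical.

```python
def toy_encode(x):
    """Toy many-to-one 'codec': keeps the magnitude, discards the sign."""
    return abs(x)


# Two different originals collapse to the same code.
candidates = [-1.0, 1.0]
code = toy_encode(candidates[0])
assert all(toy_encode(c) == code for c in candidates)

# Regression-to-the-mean: average over all plausible originals.
mean_estimate = sum(candidates) / len(candidates)

# Mode selection: commit to one candidate under a prior (hypothetical weights).
prior = {-1.0: 0.3, 1.0: 0.7}
mode_estimate = max(candidates, key=lambda c: prior[c])

print(mean_estimate, toy_encode(mean_estimate) == code)  # 0.0 False
print(mode_estimate, toy_encode(mode_estimate) == code)  # 1.0 True
```

The average lands in the gap between the two candidates and is no longer consistent with the code, while the selected mode stays on the fiber.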


Getting Started

1. Try the browser demo

Mirror: https://audiode.theivanr.duckdns.org/
Notes: Mini Demo
Status: https://stats.uptimerobot.com/iYI12tPzn7
  • Upload a CBR MP3 with a 44.1 kHz sample rate
  • File size limit: 5 MB
  • ADE detects the bitrate and selects the closest model automatically
  • Download the restored WAV
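For readers curious how bitrate detection and model selection might look, here is a minimal sketch. It is not ADE's actual code: it reads the bitrate and sample-rate fields from a single MPEG-1 Layer III frame header (the field layout and tables follow the MPEG audio frame format), and the helper names and the list of model bitrates are hypothetical. A real file also needs ID3 tags skipped and every frame scanned before it can be called CBR.

```python
# MPEG-1 Layer III bitrate table in kb/s, indexed by the 4-bit bitrate field.
# Index 0 is "free format" and index 15 is invalid; both map to None here.
MPEG1_L3_BITRATES = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128,
                     160, 192, 224, 256, 320, None]
MPEG1_SAMPLE_RATES = [44100, 48000, 32000, None]  # index 3 is reserved


def parse_frame_header(header):
    """Return (bitrate_kbps, sample_rate_hz) for an MPEG-1 Layer III header."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        raise ValueError("missing frame sync")
    version_bits = (b1 >> 3) & 0x03   # 0b11 means MPEG-1
    layer_bits = (b1 >> 1) & 0x03     # 0b01 means Layer III
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    bitrate = MPEG1_L3_BITRATES[b2 >> 4]
    sample_rate = MPEG1_SAMPLE_RATES[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sample-rate index")
    return bitrate, sample_rate


def is_cbr(frame_bitrates):
    """Treat a stream as CBR when every frame reports the same bitrate."""
    return len(set(frame_bitrates)) == 1


def closest_model(detected_kbps, model_bitrates):
    """Pick the model bitrate nearest to the detected one (hypothetical helper)."""
    return min(model_bitrates, key=lambda kbps: abs(kbps - detected_kbps))


bitrate, sample_rate = parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(bitrate, sample_rate)                          # 128 44100
print(closest_model(bitrate, [64, 128, 192, 320]))   # 128 (model list is made up)
```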

NOTE: The demo is intentionally compute-heavy and is not yet optimized for large-scale hosting. It is meant for exploration; for real workloads or larger files, local execution is strongly recommended.

2. Run locally

This is the recommended route for larger files and local experimentation.

Prerequisites

  • Python 3.8+
  • Anaconda is recommended, but a standard Python environment also works

Installation

git clone <ADE-MP3>
cd ADE-MP3
pip install streamlit onnxruntime soundfile librosa numpy matplotlib scipy
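If the app fails to start, a quick check of which dependencies are importable can help. This is an optional sketch, not part of the repository; it assumes each package's import name matches its pip name, which holds for the seven packages listed above.

```python
import importlib.util

REQUIRED = ["streamlit", "onnxruntime", "soundfile", "librosa",
            "numpy", "matplotlib", "scipy"]


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All dependencies found.")
```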

Launch the app (adjust --server.maxUploadSize, given in MB, as needed)

streamlit run web_frontend.py --server.maxUploadSize=100

How It Works

A detailed white paper is currently under peer review. The core ideas are:

  • Geometric framing: MP3 inversion is modeled as selecting a single point from a fiber of plausible originals.

  • Mode-locking regularization: The network is trained to prefer one stable solution rather than a diffuse average.

  • Psychoacoustic loss: The objective is aligned with perceptual relevance, not just numerical reconstruction error.

  • Small footprint: The final model is about 1 MB and runs efficiently via ONNX Runtime on CPU or GPU.
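One possible way to write down the geometric framing and mode-locking ideas is sketched below. The notation (encoder map E, code c, conditional density p) is an assumption made for illustration; the white paper may define these objects differently.

```latex
% E: encoder map from signals x to compressed codes c (notation assumed)
E^{-1}(c) = \{\, x : E(x) = c \,\}
\qquad \text{(fiber of plausible originals)}

\hat{x}_{\text{mean}} = \mathbb{E}[\, x \mid c \,]
\qquad \text{(regression target; need not lie in } E^{-1}(c)\text{)}

\hat{x}_{\text{mode}} = \arg\max_{x \in E^{-1}(c)} p(x \mid c)
\qquad \text{(a single high-probability point of the fiber)}
```

Under this reading, a mean-squared-error regressor approximates the conditional mean, whereas a mode-locked model aims for a single point that remains consistent with the compressed code.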

Contributing and Contact

  • Contributors are welcome: open an issue on GitHub or write to me.

License Notice (AGPL v3)

Audio Disambiguation Engine (ADE) is released under the GNU Affero General Public License v3.0 (AGPLv3).

  • You may use ADE freely for personal, research, or commercial purposes.
  • If you run ADE as a service (including commercial offerings), you are required to provide full source code and any modifications under the same AGPL license.
  • You may redistribute or adapt ADE, but all derivative works must remain open source under AGPL.
  • For details, see the included LICENSE file.

About

A learned inverse model for reconstructing plausible audio from lossy LAME MP3 encodings.
