Official repository for the paper
"Neural Proxies for Sound Synthesizers: Learning Perceptually Informed Preset Representations"
published in the Journal of the Audio Engineering Society (JAES).
This repository provides:
- Dataset generation for synthesizer presets
- Training of neural proxies (preset encoders)
- Evaluation on a sound-matching downstream task
→ Audio examples are available on the project website.
→ The repository for the audio model evaluation can be found here.
→ The published version of the paper is available on JAES's website here, while the Author's Accepted Manuscript (AAM) is available on arXiv.
Main dependencies:
- PyTorch + Lightning
- DawDreamer (VST rendering)
- WandB (logging)
- Optuna (HPO)
- Hydra (config management)
See requirements.txt for the full list.
Clone the repo and install via pip or Docker.
→ See Installation & environment setup for details.
Currently, the following synthesizers are supported:
→ See Adding synthesizers for instructions on integrating new ones.
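For context, rendering a preset to audio with DawDreamer (used here for dataset generation) generally follows the pattern sketched below; the plugin path, MIDI note, and durations are placeholders rather than the repository's actual rendering code.

```python
# Minimal DawDreamer rendering sketch; plugin path and parameters are placeholders.
import dawdreamer as daw

SAMPLE_RATE, BLOCK_SIZE = 44_100, 512
engine = daw.RenderEngine(SAMPLE_RATE, BLOCK_SIZE)
synth = engine.make_plugin_processor("synth", "/path/to/synth.vst3")  # placeholder path

synth.add_midi_note(60, 100, 0.0, 2.0)   # note, velocity, start (s), duration (s)
engine.load_graph([(synth, [])])
engine.render(4.0)                        # render 4 s to leave room for the release tail
audio = engine.get_audio()                # numpy array, shape (channels, samples)
```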
Wrappers for the following audio models are available in the src/models/audio/ directory:
- EfficientAT (used in the paper)
- Torchopenl3
- PaSST
- Audio-MAE
- Mel features
→ See Adding audio models for integration instructions.
→ The code for the audio model evaluation can be found in its corresponding repository.
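As a rough illustration of how these wrappers are typically used, the sketch below computes embeddings for a batch of audio; the class name `EfficientATWrapper`, its constructor, and its call signature are assumptions and may differ from the actual implementation.

```python
# Hypothetical usage sketch for one of the audio model wrappers; the class name
# `EfficientATWrapper` and its call signature are assumptions.
import torch

from src.models.audio import EfficientATWrapper  # hypothetical import

model = EfficientATWrapper()
model.eval()

audio = torch.randn(4, 32_000)  # batch of four 1-second mono clips at 32 kHz
with torch.no_grad():
    embeddings = model(audio)   # expected shape: (batch, embedding_dim)
print(embeddings.shape)
```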
An overview of the implemented neural proxies can be found in src/models/preset/model_zoo.py.
Download pretrained checkpoints here and place them in checkpoints/.
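As an illustration, instantiating a preset encoder from the model zoo and loading a pretrained checkpoint might look like the following; the factory function `get_model`, its argument, and the checkpoint filename are assumptions rather than the repository's actual API.

```python
# Hypothetical sketch: building a neural proxy from the model zoo and loading a
# pretrained checkpoint. The factory function, its argument, and the checkpoint
# filename are assumptions, not the repository's actual API.
import torch

from src.models.preset import model_zoo

proxy = model_zoo.get_model("tfm")  # hypothetical factory; see model_zoo.py for actual names
ckpt = torch.load("checkpoints/tfm.ckpt", map_location="cpu")  # hypothetical filename
proxy.load_state_dict(ckpt["state_dict"])  # Lightning checkpoints store weights under "state_dict"
proxy.eval()
```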
See Datasets for download links and generation instructions for the synthetic and hand-crafted preset datasets.
This repository provides the following experiments:
- Training and evaluation of synthesizer proxies.
- Hyperparameter optimization (HPO) with Optuna.
- Sound-matching downstream tasks (finetuning + estimator network).
→ See Experiments for scripts, configs, and usage examples.
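Because the experiments are configured with Hydra, a configuration can also be composed programmatically for quick inspection. The sketch below uses Hydra's compose API; the config directory, config name, and override groups are assumptions about how the configs are organized.

```python
# Sketch: composing an experiment config with Hydra's compose API.
# The config path, config name, and override groups below are assumptions.
from hydra import compose, initialize
from omegaconf import OmegaConf

with initialize(version_base=None, config_path="configs"):
    cfg = compose(config_name="train", overrides=["model=tfm", "task=sound_matching"])

print(OmegaConf.to_yaml(cfg))
```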
Detailed step-by-step instructions for replicating the results from the paper, including model evaluation and visualization scripts, can be found in Reproducibility.
@article{combes2025neural,
  author  = {Combes, Paolo and Weinzierl, Stefan and Obermayer, Klaus},
  journal = {Journal of the Audio Engineering Society},
  title   = {Neural Proxies for Sound Synthesizers: Learning Perceptually Informed Preset Representations},
  year    = {2025},
  volume  = {73},
  number  = {9},
  pages   = {561--577},
  month   = {September},
}

Special shout-out to Joseph Turian for his initial guidance on the topic and overall methodology, and to Gwendal Le Vaillant for the useful discussion on SPINVAE, from which the transformer-based preset encoder is inspired.