Physics-Informed Neural Engine Sound Synthesis
This repository contains the implementation and trained models for the Pulse-Train-Resonator (PTR), a deep learning architecture for engine sound synthesis that directly models combustion pressure pulses and exhaust propagation through differentiable synthesis components.
Engine sounds originate from sequential combustion pressure pulses rather than sustained harmonic oscillations. While existing neural synthesis methods model the resulting spectral characteristics, PTR directly models the underlying pulse structure: parameterized pressure waves aligned to engine firing patterns, propagated through differentiable Karplus-Strong resonators simulating exhaust acoustics.
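To illustrate the core idea (this is a minimal standalone sketch, not the repository's `ptr_synth.py`), a Karplus-Strong loop is a delay line with lowpass-filtered feedback; exciting it with a pulse train instead of noise yields a crude stand-in for pressure pulses ringing through an exhaust resonator. All names and parameter values below are illustrative assumptions:

```python
import numpy as np

def karplus_strong(excitation, delay_samples, feedback=0.98, damping=0.5):
    """Feed an excitation through a Karplus-Strong comb loop:
    y[n] = x[n] + feedback * lowpass(y[n - delay])."""
    y = np.zeros_like(excitation)
    prev = 0.0  # one-pole lowpass state inside the feedback loop
    for n in range(len(excitation)):
        delayed = y[n - delay_samples] if n >= delay_samples else 0.0
        prev = (1 - damping) * delayed + damping * prev  # damping filter
        y[n] = excitation[n] + feedback * prev
    return y

sr = 16000
# Pulse train at ~30 Hz standing in for combustion pressure pulses
pulses = np.zeros(sr)
pulses[::sr // 30] = 1.0
# Resonator tuned to sr / delay ≈ 160 Hz as a stand-in for an exhaust pipe mode
out = karplus_strong(pulses, delay_samples=100)
```

The actual model parameterizes the pulse shapes and resonator coefficients with a neural network and keeps the whole chain differentiable.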
The model is trained on the Procedural Engine Sounds Dataset and generalizes consistently across diverse engine configurations, achieving improved audio reconstruction over harmonic-plus-noise baselines with an identical network architecture.
- Physics-informed pulse synthesis incorporating thermodynamic pitch modulation and valve-dynamics envelopes
- Differentiable Karplus-Strong resonators for exhaust system acoustic modeling
- Firing-order sequencing with per-cylinder parameterization
- Generalization across engine types despite fixed architectural priors
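As a rough sketch of what firing-order sequencing means (function name, amplitudes, and layout are hypothetical, not the repository's API): for a four-stroke engine each cylinder fires once per two crankshaft revolutions, and the firing order fixes the time slot of each cylinder within that cycle.

```python
import numpy as np

def firing_pulse_train(rpm, firing_order, sr=16000, duration=1.0, num_strokes=4):
    """Place one impulse per cylinder firing event for a four-stroke engine.
    Each cylinder fires once every two crankshaft revolutions."""
    cycle_sec = (60.0 / rpm) * (num_strokes / 2)  # seconds per full engine cycle
    n_cyl = len(firing_order)
    n = int(sr * duration)
    train = np.zeros(n)
    amps = np.linspace(1.0, 0.8, n_cyl)           # per-cylinder amplitude (illustrative)
    for slot, cyl in enumerate(firing_order):
        offset = slot / n_cyl * cycle_sec         # evenly spaced firing intervals
        times = np.arange(offset, duration, cycle_sec)
        idx = (times * sr).astype(int)
        train[idx[idx < n]] += amps[cyl - 1]
    return train

# Inline-four firing order 1-3-4-2 at 1800 RPM: 60 firing events per second
train = firing_pulse_train(1800, [1, 3, 4, 2])
```

In the model, per-cylinder parameters (amplitude, pulse shape) are predicted rather than fixed, which is what "per-cylinder parameterization" refers to.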
```
ptr-model/
├── checkpoints/                 # Pre-trained model weights
│   └── 2025-08-31_models_and_weights.zip
├── configs/                     # Base configuration files
├── scripts/                     # Training and inference scripts
│   ├── train.py                 # Training pipeline with CLI
│   └── inference.py             # Model inference with CLI
├── src/                         # Source code
│   ├── audio/                   # Audio processing utilities
│   ├── data/                    # Data loading and processing
│   ├── models/                  # Model implementations
│   │   ├── hpn_model.py         # Harmonic-Plus-Noise baseline
│   │   ├── hpn_synth.py         # HPN synthesis modules
│   │   ├── ptr_model.py         # Pulse-Train-Resonator model
│   │   ├── ptr_synth.py         # PTR synthesis modules
│   │   └── model.py             # Base model architecture
│   ├── training/                # Training utilities
│   └── utils/                   # General utilities
├── pyproject.toml               # Python dependencies and metadata
├── README.md                    # This file
└── LICENSE                      # License file
```
- Clone the repository:

  ```shell
  git clone https://github.com/rdoerfler/ptr-model.git
  cd ptr-model
  ```

- Install dependencies:

  ```shell
  pip install .
  ```

- Extract pre-trained models:

  ```shell
  cd checkpoints
  unzip 2025-08-31_models_and_weights.zip
  ```

Download the Procedural Engine Sounds Dataset and place it next to this repository:
```
├── ptr-model/                   # This repository
├── dataset/                     # Procedural Engine Sounds Dataset
│   ├── A_full_set               # Data subsets
│   ├── B_full_set
│   ├── C_full_set
│   └── ...
```
Train the HPN variant:

```shell
python scripts/train.py --model_type hpn --dataset C_full_set
```

Train the PTR variant:

```shell
python scripts/train.py --model_type ptr --dataset C_full_set
```

Customize the model architecture:

```shell
python scripts/train.py --model_type ptr --num_harmonics 128 --hidden_size 512 --gru_size 1024
```

Generate engine sounds using trained models:

```shell
python scripts/inference.py
```

Note: At present, the checkpoint folder used for inference must be set directly within `inference.py`; CLI support will be implemented soon.
The training and inference scripts use a unified CLI with the following key parameters:
- `--model_type`: Choose between `hpn` (Harmonic-Plus-Noise) and `ptr` (Pulse-Train-Resonator)
- `--dataset`: Dataset name (default: `C_full_set`)
- `--num_harmonics`: Number of harmonics (default: 100)
- `--num_noisebands`: Number of noise bands (default: 256)
- `--hidden_size`: Hidden layer size (default: 256)
- `--gru_size`: GRU layer size (default: 512)
Configuration is managed through a base config system combined with CLI parameter overrides.
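The repository's actual config code lives in `configs/` and is not reproduced here; the base-config-plus-override pattern can be sketched as follows (keys and defaults taken from the CLI list above, everything else hypothetical):

```python
import argparse

# Base configuration mirroring the documented CLI defaults
BASE_CONFIG = {
    "model_type": "ptr",
    "dataset": "C_full_set",
    "num_harmonics": 100,
    "num_noisebands": 256,
    "hidden_size": 256,
    "gru_size": 512,
}

def parse_config(argv=None):
    """Start from the base config; CLI flags override individual keys."""
    parser = argparse.ArgumentParser()
    for key, default in BASE_CONFIG.items():
        parser.add_argument(f"--{key}", type=type(default), default=None)
    args = vars(parser.parse_args(argv))
    # Only keys the user actually passed override the base values
    return {**BASE_CONFIG, **{k: v for k, v in args.items() if v is not None}}

cfg = parse_config(["--model_type", "hpn", "--hidden_size", "512"])
```

Unspecified keys keep their base values, so experiments only need to name the parameters they change.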
A Harmonic-Plus-Noise (HPN) baseline is included for comparative evaluation. HPN performs direct harmonic synthesis with systematic inharmonicity modeling and temporally structured noise components, using the same encoder-decoder architecture as PTR.
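In its textbook form (this sketch is generic harmonic-plus-noise synthesis, not the repository's `hpn_synth.py`, and omits the inharmonicity and noise-band modeling), HPN sums sinusoids at integer multiples of a fundamental and adds a noise component:

```python
import numpy as np

def harmonic_plus_noise(f0, harm_amps, noise_gain, sr=16000, duration=1.0):
    """Sum of sinusoids at integer multiples of f0 plus broadband noise."""
    t = np.arange(int(sr * duration)) / sr
    harmonics = sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t)
                    for k, a in enumerate(harm_amps))
    noise = noise_gain * np.random.default_rng(0).standard_normal(t.shape)
    return harmonics + noise

# 60 Hz fundamental with decaying harmonic amplitudes and a quiet noise floor
audio = harmonic_plus_noise(60.0, harm_amps=[1 / (k + 1) for k in range(8)],
                            noise_gain=0.01)
```

The baseline's network predicts the per-frame harmonic amplitudes and noise parameters instead of using fixed values as here.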
PTR consistently outperforms the HPN baseline across three engine configurations, achieving 5.7% improvement in total validation loss and 21% improvement in audio reconstruction on unseen data.
| Dataset | HPN Harm. | HPN STFT | HPN Total | PTR Harm. | PTR STFT | PTR Total |
|---|---|---|---|---|---|---|
| A | 0.107 | 1.781 | 0.944 | 0.090 | 1.649 | 0.872 |
| B | 0.059 | 1.824 | 0.943 | 0.055 | 1.754 | 0.907 |
| C | 0.166 | 2.093 | 1.132 | 0.117 | 2.017 | 1.069 |
| Mean | 0.111 | 1.899 | 1.006 | 0.088 | 1.807 | 0.949 |
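The headline figures follow from the mean row of the table, assuming the 21% audio-reconstruction figure refers to the harmonic loss term:

```python
# Relative improvements computed from the mean row of the results table
hpn_total, ptr_total = 1.006, 0.949
hpn_harm, ptr_harm = 0.111, 0.088

total_gain = (hpn_total - ptr_total) / hpn_total * 100  # ≈ 5.7 %
harm_gain = (hpn_harm - ptr_harm) / hpn_harm * 100      # ≈ 20.7 %, reported as 21 %
```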
This work utilizes the Procedural Engine Sounds Dataset, a comprehensive collection of procedurally generated engine audio with time-aligned control annotations.
- 19 hours of engine audio across varied operating conditions
- Time-aligned RPM, torque, throttle, and DFCO annotations
- Multiple engine configurations and acoustic scenarios
- Systematic coverage of engine operating parameters
Dataset Availability:
- Zenodo: https://doi.org/10.5281/zenodo.16883336
- Hugging Face Datasets: https://huggingface.co/datasets/rdoerfler/procedural-engine-sounds
Evaluation reveals complementary strengths between synthesis approaches:
- PTR: 5.7% lower total validation loss and consistent training-to-validation transfer
- HPN: Greater flexibility across engine configurations, robust to harmonic irregularities
- Both variants successfully capture authentic engine acoustic behaviors with distinct signatures
If you use the Procedural Engine Sounds Dataset in your research, please cite:
```bibtex
@dataset{doerfler_2025_procedural_engine_sounds,
  author    = {Doerfler, Robin},
  title     = {Procedural Engine Sounds Dataset},
  month     = {August},
  year      = 2025,
  publisher = {Zenodo},
  version   = {1.0},
  doi       = {10.5281/zenodo.16883336},
  url       = {https://doi.org/10.5281/zenodo.16883336}
}
```

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). See the LICENSE file for details.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit and indicate if changes were made
- NonCommercial — You may not use the material for commercial purposes
For commercial use, please contact the author.
- Audio Examples: Supplementary audio examples demonstrating model outputs are available at: https://rdoerfler.github.io/ptr-examples/
- Dataset: Procedural Engine Sounds Dataset on Zenodo and Hugging Face
This research demonstrates systematic integration of physics-informed inductive biases into differentiable synthesis architectures, providing a methodological framework applicable to physically constrained audio generation beyond automotive contexts.
For questions or collaboration opportunities, please contact doerflerrobin@gmail.com or open an issue on this repository.
Keywords: Engine Sound Synthesis, Differentiable Signal Processing, Physics-Informed Neural Networks, Inductive Biases, Neural Audio Synthesis