Simulated string with bowing excitation, exhibiting a Helmholtz motion.
This repo contains two PyTorch-based string simulators, StringFDTD-Torch [1] and DMSP [2].
-
- StringFDTD-Torch is a planar string simulation engine for musical instrument sound synthesis research. It simulates a string vibration from a given set of mechanical properties and excitation conditions based on a finite difference scheme (i.e., finite difference time domain) and outputs the resulting string sound.
-
- Differentiable Modal Synthesis for Physical Modeling (DMSP) is a neural network trained to approximate the string motion simulated using StringFDTD-Torch but in an efficient manner augmenting the modal synthesis method in a similar manner to the DDSP approach.
Here is a minimal step-by-step guide to run the DMSP code:
Install the dependencies (section 1.1)
xargs apt-get install -y < apt-packages.txt
pip install -r requirements.txtSynthesize FDTD data (section 1.2)
Synthesize 100 different FDTD data with the default configuration.
python -m run experiment=nsynth-like task.num_samples=100 task.result_dir=my_fdtd_simulationThe results are saved under ./results/my_fdtd_simulation.
Preprocess the data (section 2.1)
Preprocess the synthesized FDTD data and save them under my_dmsp_data.
Use 60/20/20 split for train/valid/test data.
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/test task.data_split=5 task.split_n=0 > log_test &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/valid task.data_split=5 task.split_n=1 > log_valid &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=2 > log_train1 &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=3 > log_train2 &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=4 > log_train3 &The results are saved under ./results/my_dmsp_data.
Train DMSP model (section 2.2)
python -m run proc.gpus=[0] experiment=synth-dmsp task.load_name=my_dmsp_data task.run=my_dmsp_modelThe results are saved under ./results/synthesize-supervised-dmsp-my_dmsp_model-YYYYMMDD-HHMMSS-000000.
You can also find the pretrained DMSP model from Hugging Face.
Run the inference (section 2.3)
python -m run proc.gpus=[0] experiment=synth-dmsp
task.load_name=my_dmsp_data \
hydra.run.dir=/workspace/torch-fdtd-string/results/synthesize-supervised-dmsp-my_dmsp_model-YYYYMMDD-HHMMSS-000000 \
proc.train=false proc.test=true \
task.plot_test_video=true \
task.save_test_score=trueThe results are saved under ./results/synthesize-supervised-dmsp-my_dmsp_model-YYYYMMDD-HHMMSS-000000/test/my_dmsp_data.
StringFDTD-Torch can simulate under your own predefined configurations or can also be used as a dataset generator by randomly augmenting within a set of bounds for the mechanical conditions. It can also simulate on the CPU or GPU, depending on the state of the device you want to simulate. You may want to pick CPU if you are simulating your own configuration in a small batch size. However, it is recommended to use GPU for simulating as a dataset generator with large batch size, to benefit from the GPU parallelism.
Install python and system dependencies
xargs apt-get install -y < apt-packages.txt
pip install -r requirements.txtStringFDTD-Torch can simulate the string under various conditions.
The configuration can be specified by passing arguments or
by defining them in a .yaml file (or using both of them.)
For example, we have prepared a typical example configuration as
src/configs/experiment/nsynth-like.yaml.
The nsynth-like.yaml predefines the configurations for
generating sound samples akin to NSynth [4].
You can run the simulation with the following command.
python -m run \
experiment=nsynth-like \
task.result_dir=my_fdtd_simulationThis will first load the base config file at src/configs/config.yaml,
then override it with the preset config file at src/configs/experiment/nsynth-like.json,
dump its task.result_dir argument to my_fdtd_simulation as specified by the user,
and finally run the run.py file.
Everything the StringFDTD-Torch simulates will be saved under {{ task.root_dir }}/{{ task.result_dir }} directory.
By default, the task.root_dir is set to ./results and if you run the above command,
the results will be saved under ./results/my_fdtd_simulation directory.
The saved files will be organized as follows:
my_fdtd_simulation/
├── codes/ # source code backup for the current simulation
├── config_tree.txt # final configuration tree (saves all the passed arguments as a final config)
├── cpu_time.txt # measured time for simulation
├── run.log # log file
├── {batch_id}-{batch_number}/ # simulation results (batch_id is a random 8-character string)
│ ├── bow_params.npz # properties related to bowing excitation
│ ├── hammer_params.npz # properties related to hammering excitation
│ ├── output-u.wav # pickup signal for the lateral motion
│ ├── output-z.wav # pickup signal for the longitudinal motion
│ ├── output.wav # mixed signal of output-u and output-z
│ ├── simulation.npz # simulated string states for whole space
│ ├── simulation_config.yaml # simulation parameters summary
│ └── string_params.npz # properties related to strings
└── ...In order to provide specific conditions
(such as predefined F0, bowing force, hammering timing, etc.)
save them as /path/to/preset/*.npy file.
Passing task.load_config=/path/to/preset will load
the preset parameters for simulation.
The other parameters, those are not defined in
/path/to/preset, will be randomly augmented.
You can also specify the augmentation range
using the experiment argument as follows.
python -m run experiment=nsynth-like task.excitation=hammer task.load_config=data/trumpetThe simulation is based on the finite difference time domain (FDTD) method. The FDTD method is a numerical approach to solve the wave equation for the string motion. The wave equation is discretized in both time and space, and the string motion is computed iteratively over time steps. Here are some important notes regarding the FDTD simulation:
StringFDTD-Torch uses GPUs by default.
In order to simulate using CPUs, set proc.cpu=true.
You might need to export CUDA_VISIBLE_DEVICES="" in your bash script.
A typical usage example, along with specifying the processors via taskset,
would appear like this.
taskset -c 16-31 python -m run proc.cpu=true ...In the string instrument simulation, the pitch (or the fundamental frequency) of the sound can vary depending on the stiffness of the string. While pitch changes with stiffness have been and continue to be studied for a long time, we have found that the amount of detune in the simulated sounds closely matches the theoretical model proposed by Fletcher [3]. And even if detune is a natural phenomenon, we provide the ability to pre-correct the pitch based on the theoretical model to achieve the sound at the intended pitch.
As the spectrograms above show, without the precorrection, there is a bit of detune in the intended fundamental frequency (white dashed lines) and the estimated one. However, the detune is resolved by pre-correcting the input fundamental frequency. Utilizing the pre-correction feature significantly reduces the detune phenomenon, but at the cost of a relatively large increase in computation and speed. So, if you think this pre-correction is not necessary, you can turn off pre-correction by passing task.precorrect=false argument. Please note that this argument is set true by default.
python -m run ... task.precorrect=falseYou can run the simulation in a debug mode by passing task.result_dir=debug.
python -m run experiment=my_experiment task.result_dir=debugYou can evaluate or plot the simulation results by passing experiment=evaluate.
Please specify the path to the directory that contains the simulation results.
ls /path/to/simulated/directory # 00000000-0 00000000-1 ... run.log
python -m run experiment=evaluate task.load_dir='./my_fdtd_simulation' The DMSP model [2] can be trained using the data simulated as above. The pretrained model can be downloaded from Hugging Face.
In order to train the DMSP model, first preprocess the simulated FDTD data
following the configs/experiment/process_training_data.yaml.
This incorporates spatially upsampling the FDTD results and saving the modal solutions.
2.1.1. Processing the whole FDTD data
python -m run experiment=process_training_data \
task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_dataThe processed data will be saved under {{ task.root_dir }}/my_dmsp_data.
The saved files will be organized as follows:
my_dmsp_data/
├── {batch_id}-{batch_number}/ # same as the simulation results
│ ├── parameters.npz # parameters for the simulation
│ ├── ua-{000}.wav # 1D modal solution of the string at x={000}
│ ├── ...
│ ├── ut-{000}.wav # Lateral FDTD solution of the string at x={000}
│ ├── ...
│ └── vt.wav # Velocity of the lateral solution summed up over x.
└── ...2.1.2. Processing with train/validation/test split
The process_training_data.yaml config file also supports splitting the data into
training, validation, and test sets. You can specify the split ratio using the task.data_split and the task.split_n arguments.
data_split: splits the list of whole data intodata_splitparts (set 0 to disable)split_n: only processes thesplit_n-th part of the splitted data sublist For example, if you want to split the data into 5 parts and process the first part as test data, the second part as validation data, and the remaining three parts as training data, you can run the following commands:
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/test task.data_split=5 task.split_n=0 > log_test &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/valid task.data_split=5 task.split_n=1 > log_valid &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=2 > log_train1 &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=3 > log_train2 &
nohup python -m run proc.gpus=[0] experiment=process_training_data task.result_dir=my_fdtd_simulation \
task.save_dir=my_dmsp_data/train task.data_split=5 task.split_n=4 > log_train3 &The processed data will be saved under {{ task.root_dir }}/my_dmsp_data/{{ split }}
where {{ split }} is the name of the split (e.g., train, valid, test).
You can run the above command in parallel to speed up the processing, but make sure to
secure enough memory for each process. The saved files will be organized as follows:
my_dmsp_data/{{ split }}/
├── {batch_id}-{batch_number}/ # same as the simulation results
│ ├── parameters.npz # parameters for the simulation
│ ├── ua-{000}.wav # 1D modal solution of the string at x={000}
│ ├── ...
│ ├── ut-{000}.wav # Lateral FDTD solution of the string at x={000}
│ ├── ...
│ └── vt.wav # Velocity of the lateral solution summed up over x.
└── ...Now train the DMSP model using the preset saved under configs/experiment/synth-dmsp.yaml.
python -m run proc.gpus=[0] experiment=synth-dmsp \
task.load_name=my_dmsp_dataFor more details, see src/configs/experiment/synth-dmsp.yaml.
The train/valid/test results can be found under {{ task.root_dir }}/{{ task.result_dir }}.
The saved files will be organized as follows:
{{ task.root_dir }}/{{ task.result_dir }}/
├── codes/ # source code backup for the current training
├── config_tree.txt # the final configuration tree
├── run.log # running log
├── string/ # checkpoints
│ └── {run_id}/checkpoints/'epoch={epoch}-step={step}.ckpt'
├── test/ # test results
└── valid/ # validation results
├── plot/
│ ├── {epoch}-{iteration}-{batch_number}.png
│ └── ...
└── wave/
├── {epoch}-{iteration}-{batch_number}.wav
└── ...To run inference using the trained DMSP model,
you can use the same synth-dmsp config file as a base.
python -m run proc.gpus=[0] experiment=synth-dmsp \
task.load_name=my_dmsp_data \
hydra.run.dir=/full/path/to/your/trained/{{ task.result_dir }} \
proc.train=false proc.test=true \
task.plot_test_video=true \
task.save_test_score=trueProvide the hydra.run.dir argument to specify the full path to the directory
where the trained model is saved.
When you run the above command, here are what happens:
- Configuration files are loaded in the following order:
- Load the base configuration file at
src/configs/config.yaml - Override it with the preset config file at
src/configs/experiment/synth-dmsp.yaml - Override it with the in-line arguments (e.g.,
task.load_dir,task.load_name,hydra.run.dir, etc.)
- Load the base configuration file at
- The network is defined and loaded in the following order:
- Source code is loaded from the
{{ hydra.run.dir }}/codesdirectory - Model checkpoint is loaded from
{{ hydra.run.dir }}/string/{run_id}/checkpoints/*.ckpt.
- Source code is loaded from the
- The test step is run as follows:
- The test data is loaded from
{{ task.load_dir }}/{{ task.load_name }}/directory. - The inferenced outputs are saved under
{{ hydra.run.dir }}/test/directory. Note that the inference always use the (backup) code saved under{{ hydra.run.dir }}/codesdirectory. This ensures that the model is evaluated on the exact same version of the source code as the code it was trained on.
- The test data is loaded from
The saved files will be organized as follows:
{{ hydra.run.dir }}/
├── codes/ # source code backup for the current training
├── config_tree.txt # the final configuration tree
├── run.log # running log
├── string/ # checkpoints
├── test/ # test results
│ ├── {{ task.load_name }}/
│ │ ├── score/ # test scores (when provided `task.save_test_score=true`)
│ │ │ ├── modal.txt # Modal synthesis's score
│ │ │ └── output.txt # Current model's output score
│ │ ├── state/ # test output plot (when provided `task.plot_test_video=true`)
│ │ │ ├── {epoch}-{batch}.npz # string displacement raw file
│ │ │ ├── {epoch}-{batch}.pdf # string displacement plot
│ │ │ └── ...
│ │ └── video/ # test output video (when provided `task.plot_test_video=true`)
│ │ ├── {epoch}-{batch}-{method}-silent_video.mp4 # string motion video without sound
│ │ ├── {epoch}-{batch}-{method}.mp4 # string motion video with sound
│ │ ├── {epoch}-{batch}-{method}.pdf # string motion sample plot
│ │ └── ...
│ ├── plot/
│ └── wave/
└── valid/ # validation resultsNote that the results from the same model for different {{ task.load_name }}s
will be distinguished by the different {{ task.load_name }} directory names.
The DMSP synthesizer is a neural network model that approximates the string motion simulated using StringFDTD-Torch. It is trained to learn the mapping between the input parameters and the output string motion. The DMSP model is based on the modal synthesis method, which decomposes the string motion into a sum of modal vibrations.
In acquiring the eigenvalues and eigenvectors of the string motion,
the model can either leverage the Newton method or a NN trained to approximate the Newton method.
We call the latter as DMSP and the former as DMSP-Hybrid in the paper [2].
Both share the same architecture and training process,
but the model can be inferred using the Newton method by passing
model.use_precomputed_mode=true argument, making the result as the DMSP-Hybrid model's output.
python -m run proc.gpus=[0] experiment=synth-dmsp \
task.load_dir=./results task.load_name=my_dmsp_data proc.train=false proc.test=true \
model.use_precomputed_mode=trueYou can run the training in a debug mode by passing task.result_dir=debug.
python -m run experiment=my_experiment task.result_dir=debugThe code is developed and tested on the following environment:
python 3.9.21
gcc 12.2.0If you use this code in your research, please cite the following paper.
@inproceedings{leedifferentiable,
title = {Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation},
author = {Lee, Jin Woo and Park, Jaehyun and Choi, Min Jun and Lee, Kyogu},
booktitle = {The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
year = {2024}
}
@inproceedings{lee2024string,
title = {String Sound Synthesize on GPU-accelerated Finite Difference Scheme},
author = {Lee, Jin Woo and Choi, Min Jun and Lee, Kyogu},
booktitle = {ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages = {1--5},
year = {2024},
organization = {IEEE}
}[1] Lee, J. W., Choi, M. J., & Lee, K. (2024, April). String Sound Synthesizer On Gpu-Accelerated Finite Difference Scheme. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1491-1495). IEEE.
[2] Lee, J. W., Park, J., Choi, M. J., & Lee, K. (2024). Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation. In The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS).
[3] Fletcher, H. (1964). Normal vibration frequencies of a stiff piano string. The Journal of the Acoustical Society of America, 36(1), 203-209.
[4] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., & Simonyan, K. (2017, July). Neural audio synthesis of musical notes with wavenet autoencoders. In International Conference on Machine Learning (pp. 1068-1077). PMLR.
