Supervisors: George Tzanetakis and Kirk McNally
The sound synthesizer is an electronic musical instrument that has become commonplace in audio production for music, film, television and video games. Despite the synthesizer's widespread use, creating new sounds on one, a process referred to as synthesizer programming, is a complex task that can impede the creative process. The primary aim of this thesis is to support the development of techniques that help synthesizer users achieve their creative goals more easily. One of the main focuses is the development and evaluation of algorithms for inverse synthesis, a technique that involves predicting synthesizer parameters to match a target sound. Deep learning and evolutionary programming techniques are compared on a baseline FM synthesis problem, and a novel hybrid approach is presented that produces high-quality results in less than half the computation time of a state-of-the-art genetic algorithm. Another focus is the development of intuitive user interfaces that encourage novice users to engage with synthesizers and learn the relationship between synthesizer parameters and the associated auditory result. To this end, a novel interface (Synth Explorer) is introduced that uses a visual representation of synthesizer sounds on a two-dimensional layout. An additional focus of this thesis is to support further research in automatic synthesizer programming. An open-source library (SpiegeLib) has been developed to support reproducibility, sharing, and evaluation of techniques for inverse synthesis. Additionally, a large-scale dataset of one billion sounds paired with synthesizer parameters (synth1B1) and a GPU-enabled modular synthesizer (torchsynth) are introduced to support further exploration of the complex relationship between synthesizer parameters and auditory results.