
Audio inpainting with a context encoder

This project accompanies research on the audio inpainting of small gaps, carried out at the Acoustics Research Institute in Vienna in collaboration with the Swiss Data Science Center. A preprint of the paper is available at https://www.researchgate.net/publication/328600925_A_context_encoder_for_audio_inpainting.

Installation

Install the requirements with pip install -r requirements.txt. On Windows, the numpy version should be 1.14.0+mkl. For the FMA dataset, librosa requires ffmpeg as an mp3 backend.
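
Since a missing ffmpeg is easy to overlook, here is a small sanity check (a sketch, not part of this repository) that verifies ffmpeg is on the PATH before you try to decode mp3 files with librosa:

```python
# Hypothetical helper: confirm ffmpeg is available so librosa can decode
# the mp3 files of the FMA dataset.
import shutil

ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path is None:
    print("ffmpeg not found on PATH; decoding FMA mp3 files will fail.")
else:
    print("ffmpeg found at:", ffmpeg_path)
```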

Instructions

The paper uses both Google's NSynth dataset and the FMA dataset. To recreate the datasets used in the paper, run either python make_nsynthdataset.py or python make_fmadataset.py from the parent folder. Each script outputs three tfrecord files: one each for training, validating, and testing the model.
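
As a quick check that dataset generation succeeded, a sketch like the following counts the records in each split (assuming TensorFlow 1.x, which matches the era of this repository; the filenames are placeholders, adjust them to whatever the scripts actually produce):

```python
# Sketch: count records per split. The tfrecord filenames below are
# assumptions, not the scripts' guaranteed output names.
import os
import tensorflow as tf

for split in ("train", "valid", "test"):
    path = "nsynth_{}.tfrecord".format(split)  # hypothetical filename
    if not os.path.exists(path):
        print("{}: not found".format(path))
        continue
    count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print("{}: {} records".format(path, count))
```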

The default parameters for the network come pickled in the file Papers_Context_Encoder_parameters.pkl. To define other architectures, use saveParameters.py.
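
The pickle can be inspected directly to see which parameters are stored; the snippet below is a sketch that assumes the file unpickles to a plain Python dict:

```python
# Sketch: inspect the pickled default parameters. Treating the contents
# as a dict is an assumption about the file's structure.
import pickle

with open("Papers_Context_Encoder_parameters.pkl", "rb") as f:
    params = pickle.load(f)

if isinstance(params, dict):
    for name, value in sorted(params.items()):
        print(name, "=", value)
else:
    print(type(params).__name__, params)
```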

To train the network, run python paperArchitecture.py from the parent folder. This trains the network for 600k steps with a learning rate of 1e-3. You can select which tfrecords to train on; by default, the script assumes you have created the NSynth dataset.
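
For orientation only, the schedule above corresponds to a loop like the following generic TensorFlow 1.x skeleton. This is not the repository's training code: the use of Adam is an assumption, and model_loss is a hypothetical placeholder.

```python
# Generic sketch of a 600k-step training loop at learning rate 1e-3.
# Not the repository's actual code; model_loss stands in for the
# context encoder's real loss.
import tensorflow as tf

def model_loss():
    w = tf.get_variable("w", shape=[], initializer=tf.ones_initializer())
    return tf.square(w)  # placeholder loss

loss = model_loss()
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(600000):
        sess.run(train_op)
        if step % 100000 == 0:
            print("step", step, "loss", sess.run(loss))
```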

Sound examples