<a href="https://colab.research.google.com/github/eloimoliner/gramophone_noise_synth/blob/main/colab/demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Realistic Gramophone Noise Synthesis using a Diffusion Model

This notebook is a demo of the historical music denoising method proposed in:

> E. Moliner and V. Välimäki, "A two-stage U-Net for high-fidelity denoising of historical recordings", submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, May 2022

<p align="center">
<img src="https://user-images.githubusercontent.com/64018465/131505025-e4530f55-fe5d-4bf4-ae64-cc9a502e5874.png" alt="Schema representation"
width="400px"></p>

Listen to our [audio samples](http://research.spa.aalto.fi/publications/papers/icassp22-denoising/)

You can freely use it to denoise your own historical recordings.

### Instructions for running:

* Make sure to use a GPU runtime: click __Runtime >> Change Runtime Type >> GPU__
* Run a cell: press ▶️ on the left of the cell
* View the code: double-click any of the cells
* Hide the code: double-click the right side of the cell
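
Before running the cells, it can help to confirm that the runtime actually exposes a GPU. The following is a minimal sketch using only the standard library. It assumes the NVIDIA driver tool `nvidia-smi` is on the PATH whenever a GPU runtime is active; the helper name `gpu_runtime_available` is hypothetical and not part of the repository.

```python
import shutil
import subprocess

def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi -L` runs and lists at least one GPU (hypothetical helper)."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No driver tools on the PATH: most likely a CPU-only runtime
        return False
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print("GPU runtime detected:", gpu_runtime_available())
```

If this prints `False`, switch the runtime type as described above and restart the notebook.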


In [None]:
#@title #Setup environment

#@markdown Execute this cell to download the code and weights 
! git clone https://github.com/eloimoliner/gramophone_noise_synth.git
%cd gramophone_noise_synth
! wget https://github.com/eloimoliner/gramophone_noise_synth/releases/download/gramophonediff/weights-750000.pt
! mkdir experiments
! mkdir experiments/trained_model
! mv weights-750000.pt experiments/trained_model/

!pip install omegaconf
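
After the setup cell has run, the checkpoint should sit at `experiments/trained_model/weights-750000.pt`, which is the path the later cells load from. A small sketch to confirm this before loading the model; the helper name `check_weights` is an assumption made for illustration.

```python
from pathlib import Path

def check_weights(path: str = "experiments/trained_model/weights-750000.pt") -> bool:
    """Return True if the checkpoint file exists and is non-empty (hypothetical helper)."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0

if not check_weights():
    print("Checkpoint not found; re-run the setup cell.")
```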


In [4]:
#@title #Imports and others

#@markdown



import torch
import torch.nn as nn
import numpy as np
import torchaudio
import yaml
import os
from pathlib import Path
import librosa
import librosa.display
import IPython.display as ipd
import matplotlib.pyplot as plt
import time
import math
from scipy.fft import fft, ifft

from omegaconf import OmegaConf
from omegaconf.omegaconf import open_dict
from torch.utils.data import DataLoader

#from learner import Learner
from model import UNet

from getters import get_sde


import soundfile as sf
#from sashimi.sashimi import Sashimi

from inference import GramophoneSampler 

from guide_synthesis import noise_presynthesis

args = yaml.safe_load(Path('conf/conf.yaml').read_text())
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
args=dotdict(args)
args.unet=dotdict(args.unet)

device=torch.device("cuda" if torch.cuda.is_available() else "cpu")

dirname = os.getcwd()
model_dir="experiments/trained_model/weights-750000.pt"



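if args.architecture=="unet":
    model = UNet(args).to(device)
else:
    # Fail early with a clear message instead of a later NameError on `model`
    raise ValueError(f"Unsupported architecture: {args.architecture}")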



state_dict= torch.load(model_dir, map_location=device)

if hasattr(model, 'module') and isinstance(model.module, nn.Module):
    model.module.load_state_dict(state_dict['model'])
else:
    model.load_state_dict(state_dict['model'])

torch.backends.cudnn.benchmark = True # lets cuDNN autotune convolution algorithms; mainly helps when input shapes stay fixed


sde = get_sde(args.sde_type, args.sde_kwargs)


def plot_spec(x,ax, refr=None):
    """Plot a log-frequency spectrogram of x on ax and return the reference magnitude used."""
    D = librosa.stft(x, hop_length=128, n_fft=2048)
    if refr is None:
        # Default reference: the peak magnitude of this signal
        refr=np.max(np.abs(D))
    S_db = 10*np.log10(np.abs(D)/refr)
    #D = librosa.amplitude_to_db(np.abs(librosa.stft(x, n_fft=1024,hop_length=128)))
    #librosa.display.specshow(D,ax=ax)
    librosa.display.specshow(S_db, cmap="inferno",vmax=0,vmin=-50,x_axis='time', y_axis='log', sr=44100,hop_length=128, ax=ax)
    return refr
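
The `dotdict` helper in the cell above gives attribute-style access to the YAML configuration. Two properties are easy to overlook: missing keys return `None` instead of raising, and nested dictionaries are not converted automatically, which is presumably why `args.unet` is wrapped separately. The sketch below illustrates both; the configuration values (`depth`, `missing_key`) are made up for the example and do not come from `conf/conf.yaml`.

```python
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

# Hypothetical configuration values, for illustration only
cfg = dotdict({"architecture": "unet", "unet": {"depth": 6}})

print(cfg.architecture)          # attribute access maps to cfg["architecture"]
print(cfg.missing_key)           # a missing key yields None rather than an error
print(type(cfg.unet).__name__)   # nested dicts stay plain dicts ...
cfg.unet = dotdict(cfg.unet)     # ... so they have to be wrapped explicitly
print(cfg.unet.depth)
```

Because a misspelled key silently returns `None`, a typo in a configuration name tends to surface later as an unrelated error, so it is worth checking key names against the YAML file.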


# Unconditional Sampling

In [5]:
sampler=GramophoneSampler(model,sde)
steps=100
t_period=1/3

noise = sampler.predict_unconditional(steps,5, t_period)



period split at step  33
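
The printed split step is consistent with truncating `steps * t_period` to an integer. The sketch below reproduces that arithmetic for the values used in this cell. It is an assumption about how the number is derived, not the sampler's confirmed implementation, and `split_step` is a hypothetical name.

```python
import math

def split_step(steps: int, t_period: float) -> int:
    """Index of the step at which the schedule is split (assumed: floor of steps * t_period)."""
    return math.floor(steps * t_period)

# Values from the sampling cell: steps=100, t_period=1/3
print("period split at step ", split_step(100, 1/3))
```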


NameError: ignored

In [None]:
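#@title #Download

#@markdown Execute this cell to download the generated noise
from google.colab import files

# wav_output_name must name a WAV file that has already been written to disk;
# it is not defined in the cells above.
files.download(wav_output_name)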
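
For `files.download` to work, the generated audio first has to be written to disk. The following is a minimal sketch using only the standard library. It assumes the samples are available as a sequence of floats in [-1, 1] at 44.1 kHz (the rate used in `plot_spec`). The function name `write_wav_mono` and the file name are hypothetical, and converting the sampler output into such a sequence is left open because its type is not shown in this notebook.

```python
import struct
import wave

def write_wav_mono(path, samples, sample_rate=44100):
    """Write floats in [-1, 1] as 16-bit mono PCM (hypothetical helper)."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))            # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)         # mono
        wf.setsampwidth(2)         # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))

# Placeholder samples; replace with the generated noise
wav_output_name = "generated_noise.wav"
write_wav_mono(wav_output_name, [0.0, 0.5, -0.5, 1.0])
```

The notebook already imports `soundfile`, which could write the file in one call; the standard-library version is used here only to keep the sketch free of extra dependencies.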
