# FogConvolver

FogConvolver seems like a great tool for applying convolution effects to audio waveforms:
https://www.audiothing.net/effects/fog-convolver/

Most importantly, there are many free impulse responses (referred to below as RIRs) to add to your music, for example:
https://www.pluginboutique.com/products/2458-Old-Times-Impulse-Responses-Expansion-for-Fog-Convolver-

These are cool because they mimic all kinds of old recorder sounds. So, we would like to be able to take the free RIRs and apply them to our music. The only problem is that each FogConvolver preset is configured by an ATP file, which turns out to be XML:

In [3]:
print(open('../amt/assets/impulse/fogconvolver/old_times/Radar Click.atp').read())

<?xml version="1.0" encoding="UTF-8"?>

<FogConvolver_SETTINGS>
  <STATE Stretch="0.5" PreDelay="0" Start="0" End="1" FadeIn="0" FadeOut="0"
         FadeInCurve="0.5" FadeOutCurve="0.5" Reverse="0" PreGain="0.5"
         LowPass="1" HighPass="0" DryMix="0" WetMix="0.70999997854232788086"
         Bypass="0" StartInSamples="-1" EndInSamples="-1" WaveFile="Samples/NASA/Radar Click.wav"/>
  <ATTRIBUTES Category="NASA" Tags="speaker, static" Rating="" Author="NASA"/>
</FogConvolver_SETTINGS>



It seems pretty clear that the ATP file holds a set of settings that apply to the RIR wav, to the audio we want to modify, or to both, and that the FogConvolver application knows how to interpret.
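
Since the ATP file is plain XML, its settings can be read with the standard library alone. The sketch below parses a trimmed copy of the preset shown above; `parse_atp` is a hypothetical helper name, and treating every numeric-looking attribute as a float is an assumption about the format.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "Radar Click" preset shown above.
ATP_TEXT = """<FogConvolver_SETTINGS>
  <STATE Stretch="0.5" PreDelay="0" Start="0" End="1" Reverse="0" PreGain="0.5"
         DryMix="0" WetMix="0.70999997854232788086"
         WaveFile="Samples/NASA/Radar Click.wav"/>
  <ATTRIBUTES Category="NASA" Tags="speaker, static" Rating="" Author="NASA"/>
</FogConvolver_SETTINGS>
"""


def parse_atp(xml_text):
    """Return (state, attributes) dicts; numeric STATE values become floats."""
    root = ET.fromstring(xml_text)
    state = {}
    for key, value in root.find("STATE").attrib.items():
        try:
            state[key] = float(value)
        except ValueError:
            state[key] = value  # non-numeric fields such as WaveFile stay strings
    attributes = dict(root.find("ATTRIBUTES").attrib)
    return state, attributes


state, attributes = parse_atp(ATP_TEXT)
print(state["WetMix"], state["WaveFile"], attributes["Category"])
```

For a file on disk, `ET.parse(path).getroot()` could replace the `ET.fromstring` call.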

Ideally, we'd like to be able to translate these configs into `ffmpeg` filters, because then we can run them through `torchaudio` for our data augmentation process. The only problem is that there's no direct one-to-one mapping between these config keys and `ffmpeg` options. Some *look* like they would be the same: for example, I found the options `wet_gain` and `dry_gain` under the `aiir` filter, which probably correspond to `WetMix` and `DryMix`. After reading a bunch, I think I understand that `FadeInCurve=.5` is equivalent to `"afade=t=in"` and `FadeOutCurve=.5` is equivalent to `"afade=t=out"`.

However, I really don't know anything else about `ffmpeg`, and it would take a long time to really understand how to use it.
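
A minimal sketch of how the fade-related fields might be turned into a filter string for `AudioEffector`. It assumes `FadeIn` and `FadeOut` are fractions of the impulse length, which is a guess, since the ATP fields are not documented here; `fade_effects` and `ir_seconds` are hypothetical names. One thing that is documented on the `ffmpeg` side: `afade` without `st`/`d` starts at time 0 and lasts 44100 samples, so a fade-out usually needs an explicit start time. The `FadeInCurve`/`FadeOutCurve` values are left unmapped because their meaning is unknown.

```python
def fade_effects(state, ir_seconds):
    """Build an ffmpeg filter chain from the Reverse/FadeIn/FadeOut fields.

    ASSUMPTION: FadeIn and FadeOut are fractions (0..1) of the impulse length.
    This is a guess about the ATP format, not documented behavior.
    """
    filters = []
    if float(state.get("Reverse", 0)) >= 0.5:
        filters.append("areverse")
    fade_in = float(state.get("FadeIn", 0)) * ir_seconds
    fade_out = float(state.get("FadeOut", 0)) * ir_seconds
    if fade_in > 0:
        filters.append(f"afade=t=in:st=0:d={fade_in:.4f}")
    if fade_out > 0:
        # Start the fade-out so that it ends exactly at the end of the impulse.
        start = ir_seconds - fade_out
        filters.append(f"afade=t=out:st={start:.4f}:d={fade_out:.4f}")
    return ",".join(filters) or None


# Values as strings, the way xmltodict returns them (without the "@" prefix).
print(fade_effects({"Reverse": "0", "FadeIn": "0", "FadeOut": "0.799"}, 2.0))
```

The returned string can be passed as `effect` to `torchaudio.io.AudioEffector`; `None` means no filtering is applied.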

Also, many of these fields don't seem to matter *too* much, because they take the same value in every preset:

In [4]:
import glob

import pandas as pd
import xmltodict

# Collect the STATE attributes of every preset in the pack.
files = glob.glob('../amt/assets/impulse/fogconvolver/old_times/*.atp')
data = []
for f in files:
    with open(f) as fh:
        d = xmltodict.parse(fh.read())
    # xmltodict prefixes XML attributes with "@", hence the column names below.
    data.append(d['FogConvolver_SETTINGS']['STATE'])

In [9]:
config_df = pd.DataFrame(data)
config_df.head(10)

Unnamed: 0,@Stretch,@PreDelay,@Start,@End,@FadeIn,@FadeOut,@FadeInCurve,@FadeOutCurve,@Reverse,@PreGain,@LowPass,@HighPass,@DryMix,@WetMix,@Bypass,@StartInSamples,@EndInSamples,@WaveFile
0,0.5,0,0.0,1,0,0.0,0.5,0.5,0,0.6000000238418579,1,0,0.0,0.7099999785423278,0,-1,-1,Samples/NASA/Apollo 11 Click.wav
1,0.5,0,0.0,1,0,1.0,0.5,0.5,0,0.7099999785423278,1,0,0.7099999785423278,0.7099999785423278,0,-1,-1,Samples/NASA/Gemini IV Static.wav
2,0.5,0,0.0299999993294477,1,0,0.7990000247955322,0.5,0.5,0,0.5,1,0,0.0,0.7099999785423278,0,-1,-1,Samples/Radio/Plans.wav
3,0.5,0,0.0,1,0,0.7020000219345092,0.5,0.5,0,0.7099999785423278,1,0,0.7099999785423278,0.7099999785423278,0,-1,-1,Samples/NASA/Gemini IV Voice.wav
4,0.5,0,0.0,1,0,1.0,0.5,0.5,0,0.7099999785423278,1,0,0.7099999785423278,0.7099999785423278,0,-1,-1,Samples/NASA/Apollo 11 Noise.wav
5,0.5,0,0.0,1,0,1.0,0.5,0.5,0,0.7099999785423278,1,0,0.7099999785423278,0.7099999785423278,0,-1,-1,Samples/Wax Cylinder/Wax Noise 1.wav
6,0.5,0,0.0,1,0,0.0,0.5,0.5,0,0.7099999785423278,1,0,0.5,0.7099999785423278,0,0,9830,Samples/78 RPM/Charleston - Vibrato Fill.wav
7,0.5,0,0.0,1,0,1.0,0.5,0.5,0,0.7099999785423278,1,0,0.7099999785423278,0.7099999785423278,0,-1,-1,Samples/Wax Cylinder/Wax Noise 2.wav
8,0.5,0,0.0,1,0,0.0,0.5,0.5,0,0.6000000238418579,1,0,0.0,0.7099999785423278,0,-1,-1,Samples/Radio/Brass 1.wav
9,0.5,0,0.0,1,0,0.7990000247955322,0.5,0.75,0,0.7099999785423278,1,0,0.0,0.7099999785423278,0,0,85722,Samples/Wax Cylinder/Wax Choir.wav
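
To check which fields actually differ between presets, one option is to count distinct values per column. The sketch below uses three rows copied from the table above, limited to a few columns; in the notebook the same two lines would run on the full `config_df`.

```python
import pandas as pd

# Three rows from the table above (Apollo 11 Click, Gemini IV Static, Plans).
config_df = pd.DataFrame(
    [
        {"@Stretch": "0.5", "@FadeOut": "0.0", "@PreGain": "0.6000000238418579",
         "@DryMix": "0.0", "@WetMix": "0.7099999785423278"},
        {"@Stretch": "0.5", "@FadeOut": "1.0", "@PreGain": "0.7099999785423278",
         "@DryMix": "0.7099999785423278", "@WetMix": "0.7099999785423278"},
        {"@Stretch": "0.5", "@FadeOut": "0.7990000247955322", "@PreGain": "0.5",
         "@DryMix": "0.0", "@WetMix": "0.7099999785423278"},
    ]
)

# Number of distinct values per column; 1 means the field never changes.
n_unique = config_df.nunique()
constant = n_unique[n_unique == 1].index.tolist()
varying = n_unique[n_unique > 1].index.tolist()
print("constant:", constant)
print("varying:", varying)
```

Only the varying columns need a translation to get the presets to sound different from one another, although the constant ones may still shape the overall sound.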


I can do a quick-and-dirty convolution, not knowing too much about `ffmpeg` commands, and it doesn't sound that bad:

In [22]:
from IPython.display import Audio
import torch
import torchaudio
from torchaudio.utils import download_asset
import torchaudio.functional as F
# Apply effects
def apply_effect(waveform, sample_rate, effect):
    effector = torchaudio.io.AudioEffector(effect=effect)
    return effector.apply(waveform, sample_rate)

SAMPLE_SPEECH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042-8000hz.wav")
speech, audio_sample_rate = torchaudio.load(SAMPLE_SPEECH)

# Define effects
effect = ",".join(
    [
        "afade=t=in",
        "afade=t=out",
    ],
)

In [23]:
old_times_path = '../amt/assets/impulse/fogconvolver/old_times/'
f = old_times_path + 'Samples/Wax Cylinder/Wax Choir.wav'
rir_raw, impulse_sample_rate = torchaudio.load(f)

In [24]:
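# AudioEffector expects (time, channel), so transpose in and back out.
rir = rir_raw.T
rir = apply_effect(rir, impulse_sample_rate, effect).T
# Resample the impulse to the 8 kHz rate of the speech sample.
downsample_rate = 8000
rir = F.resample(rir, impulse_sample_rate, downsample_rate)
# Mix down to mono and keep only the first 0.2 s of the impulse.
rir = rir.mean(0, keepdim=True)[:, : int(.2 * downsample_rate)]
# rir = rir[:, int(downsample_rate * .1) : int(downsample_rate * .5)]
# Normalize the impulse to unit energy.
rir = rir / torch.linalg.vector_norm(rir, ord=2)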

In [20]:
Audio(speech, rate=audio_sample_rate)

In [21]:
augmented = F.fftconvolve(speech, rir)
Audio(augmented, rate=audio_sample_rate)

This doesn't sound *too* bad, but it's hard for me to know whether it matches the intended effect.

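Also, I had to do some things which I don't think were specified in the original config, like downsampling the RIR and keeping only its first 0.2 seconds (the slice `int(.2 * downsample_rate)` counts samples, so it is 0.2 s rather than 20% of the RIR); otherwise the output gets totally swamped.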

So I don't really know if any of that is right, and small tweaks have a huge effect. I would really like to be able to follow the presets more closely, so that I know I'm at least using something that a professional developed, and not just what I hacked together.
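
One way to follow a preset more closely is to apply its fields to the impulse before convolving. The sketch below does this for `Start`, `End`, `Reverse`, `DryMix` and `WetMix`. Every interpretation in it is an assumption, not FogConvolver's documented behavior; `apply_preset` is a hypothetical helper, and NumPy is used in place of `torch` only to keep the example short (`np.convolve` should match `F.fftconvolve` up to floating-point error).

```python
import numpy as np


def apply_preset(dry, ir, state):
    """Convolve `dry` with `ir` using a few ATP fields.

    ASSUMPTIONS (guesses, not documented FogConvolver behavior):
      - Start/End are fractions of the impulse length to keep.
      - Reverse >= 0.5 plays the impulse backwards.
      - DryMix/WetMix are linear gains on the dry and convolved signals.
    """
    n = len(ir)
    start = int(float(state.get("Start", 0)) * n)
    end = int(float(state.get("End", 1)) * n)
    ir = ir[start:end]
    if float(state.get("Reverse", 0)) >= 0.5:
        ir = ir[::-1]
    ir = ir / np.linalg.norm(ir)            # unit energy, as in the notebook
    wet = np.convolve(dry, ir)[: len(dry)]  # drop the tail so lengths match
    dry_gain = float(state.get("DryMix", 0))
    wet_gain = float(state.get("WetMix", 1))
    return dry_gain * dry + wet_gain * wet


# Toy check: a unit impulse leaves the signal unchanged.
dry = np.array([1.0, 2.0, 3.0])
ir = np.array([1.0, 0.0, 0.0, 0.0])
out = apply_preset(dry, ir, {"Start": "0", "End": "1", "DryMix": "0.5", "WetMix": "0.5"})
print(out)
```

`Stretch`, `PreDelay`, `PreGain`, `LowPass` and `HighPass` are not handled here because their units are unclear from the files alone.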

Also, while you're here, can you think of a way to simulate an effect as intense as this one?

https://www.youtube.com/watch?v=Lv7i-gkSWn0&t=13s&ab_channel=d60944