A way to transform the inputs of a combiner without writing to disk? #100

Closed
USER-5 opened this issue Apr 6, 2020 · 3 comments

@USER-5

USER-5 commented Apr 6, 2020

I need to mix audio files at specified SNR levels. I've found that I can normalise the audio to 0 dB and set the gain etc. using the Transformer class and build, then mix them using the Combiner class and build a second time.

Is there any way I could do this without the intermediate step of writing the files to disk?
I need to automate this process for thousands of files & need all the efficiency I can get...

@lostanlen
Member

Hello, can you describe your exact pipeline at the moment?
If I/O is an issue, I recommend using librosa for normalizing to 0 dB and doing the mixing on numpy arrays.
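As a rough illustration of mixing in memory, here is a minimal numpy sketch. It assumes "normalizing to 0 dB" means peak normalization (largest absolute sample scaled to 1.0) and uses synthetic signals in place of decoded audio files. The helper names `peak_normalize` and `mix` are hypothetical and not part of pysox or librosa, and truncating to the shorter input is an assumption of this sketch.

```python
import numpy as np


def peak_normalize(y, eps=1e-12):
    """Scale a float signal so its largest absolute sample is 1.0 (0 dBFS)."""
    peak = np.max(np.abs(y))
    # Leave an all-zero (silent) signal untouched instead of dividing by zero.
    return y if peak < eps else y / peak


def mix(y1, y2, volume1=0.5, volume2=0.5):
    """Weighted sum of two signals, truncated to the shorter one (assumption)."""
    n = min(len(y1), len(y2))
    return volume1 * y1[:n] + volume2 * y2[:n]


# Synthetic stand-ins for two decoded audio files (1 s at 16 kHz).
sr = 16000
t = np.arange(sr) / sr
tone = 0.3 * np.sin(2 * np.pi * 440 * t)
noise = 0.05 * np.random.default_rng(0).standard_normal(sr)

# Nothing is written to disk between normalization and mixing.
y_mix = mix(peak_normalize(tone), peak_normalize(noise))
```

With real files, the arrays would come from a loader such as librosa.load, and only the final mix would need to be written out.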

@USER-5
Author

USER-5 commented Apr 7, 2020

Currently I'm playing with the following setup:

import sox

# Normalise each input to 0 dB and convert it to 16 kHz, mono, 32-bit.
tfm = sox.Transformer()
tfm.norm(db_level=0.0)
tfm.convert(samplerate=16000, n_channels=1, bitdepth=32)
tfm.build('Input1.wav', 'Intermediate1.flac')
tfm.build('Input2.flac', 'Intermediate2.flac')

volume1 = 0.5  # example
volume2 = 0.5

# Mix the two intermediate files with the given per-input volumes.
cbn = sox.Combiner()
cbn.build(['Intermediate1.flac', 'Intermediate2.flac'],
          'output.flac',
          'mix',
          input_volumes=[volume1, volume2])

I was just hoping there would be a way to avoid writing (and later deleting) these intermediate files.
Otherwise, my best approach is probably to make one pass over all the input files, normalise them to 0 dB, and remove the originals, then combine them in a separate pass.

@lostanlen
Member

Thank you for clarifying. I think that sox does not support this. Therefore, it is out of scope of pysox.

Coincidentally, the pipeline you describe is comparable to what scaper does. Please see:
https://github.com/justinsalamon/scaper/blob/master/scaper/core.py#L1640

Note that pysox uses temporary files to do the transformations before the combination.

Here's a librosa script which does what you want:

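import librosa
import numpy as np

# Load as 16 kHz mono, then normalise by the maximum frame-wise RMS.
y1, _ = librosa.load("Input1.wav", sr=16000, mono=True)
y1_normalized = y1 / np.max(librosa.feature.rms(y=y1))
volume1 = 0.5

y2, _ = librosa.load("Input2.wav", sr=16000, mono=True)
y2_normalized = y2 / np.max(librosa.feature.rms(y=y2))
volume2 = 0.5

# Assumes both signals have the same length; trim or pad beforehand otherwise.
y_mix = volume1 * y1_normalized + volume2 * y2_normalized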

Then use soundfile.write to export.
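Since the original question asks for specific SNR levels, the fixed volume1 and volume2 weights could be replaced by a gain derived from the target SNR. The sketch below is one way to do that, assuming SNR is defined as the ratio of whole-signal RMS levels in dB and that the noise is not silent. The names `rms` and `mix_at_snr` are hypothetical helpers, and the signals are synthetic stand-ins for loaded audio.

```python
import numpy as np


def rms(y):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(np.square(y)))


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the signal-to-noise RMS ratio equals `snr_db`, then add."""
    n = min(len(signal), len(noise))
    signal, noise = signal[:n], noise[:n]
    # SNR_dB = 20 * log10(rms(signal) / rms(gain * noise)), solved for gain.
    gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20))
    return signal + gain * noise, gain


# Synthetic stand-ins for a clean recording and a noise recording.
sr = 16000
t = np.arange(sr) / sr
speech_like = np.sin(2 * np.pi * 220 * t)
noise = np.random.default_rng(0).standard_normal(sr)

y_mix, gain = mix_at_snr(speech_like, noise, snr_db=10.0)

# Rescale only if the mix exceeds full scale, to avoid clipping on export.
peak = np.max(np.abs(y_mix))
if peak > 1.0:
    y_mix = y_mix / peak
```

Rescaling the whole mix afterwards changes the absolute level but not the SNR, because signal and noise are scaled by the same factor.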
I hope this helps!

@lostanlen lostanlen self-assigned this Apr 7, 2020
@USER-5 USER-5 closed this as completed Apr 8, 2020