splitting a file with silence #117

shakfu · 2020-09-19T12:37:18Z

One typical use case for me is to split a file using the silence effect as outlined in this excellent sox tutorial https://madskjeldgaard.dk/posts/sox-tutorial-split-by-silence/

sox input.wav clip.wav silence 1 0.1 1% 1 0.1 1% : newfile : restart

gives "clip001.wav", "clip002.wav" etc. with only the audio in clip and without the silence in between.

Is it possible to do this in pysox (and apply a fade-in and fade-out to each resulting clip)?


lostanlen · 2020-09-21T02:27:53Z

Yes. See: https://github.com/rabitt/pysox/blob/master/sox/transform.py#L2769

shakfu · 2020-09-21T02:55:46Z

I should clarify my question: I'm aware that the 'silence' effect is implemented. My question was about how I could translate the ": newfile : restart" idiom to pysox.

lostanlen · 2020-09-21T03:02:10Z

I'm not sure if this is featured in pysox. It might be good to bring this up with @rabitt.

shakfu · 2020-09-21T03:11:18Z

Thanks for your response @lostanlen . I will post a more clearly specified feature request then.

shakfu · 2020-09-21T07:46:03Z

I have a small bash script that uses sox to split an input file into a number of clips based on a silence threshold and then applies a fade (in/out) to the resulting output files.

# splits input file into clip files based on a silence threshold
sox --show-progress "$1" clip.wav silence 1 0.1 1% 1 0.1 1% : newfile : restart

# applies fade-in fade-out to each output file
# (variables are quoted so that paths containing spaces still work)
for f in clip*
do
    name=$(basename -s .wav "$f")
    newname="$name-f.wav"
    sox "$f" "$newname" fade 0.1 0
done

My question to @rabitt is whether it is possible to translate this script's functionality (in particular the ": newfile : restart" idiom) into a pure-python pysox solution.
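For reference, the script's behaviour can be approximated from Python by shelling out to the sox command line for the split and using pysox only for the fades. This is a minimal sketch, not a pysox feature: the helper names (`build_split_command`, `faded_name`, `split_and_fade`) are hypothetical, sox is assumed to be on the PATH, and the fade shape is left at pysox's default, which may not match the command-line default.

```python
import glob
import os
import subprocess


def build_split_command(input_path, output_path="clip.wav",
                        duration=0.1, threshold="1%"):
    """Return the sox argument list used by the bash script above."""
    d = str(duration)
    return [
        "sox", input_path, output_path,
        # trim leading silence, then stop at the next silent stretch
        "silence", "1", d, threshold, "1", d, threshold,
        # start a new output file and run the effects chain again
        ":", "newfile", ":", "restart",
    ]


def faded_name(clip_path):
    """clip001.wav -> clip001-f.wav, mirroring the script's naming."""
    root, ext = os.path.splitext(clip_path)
    return f"{root}-f{ext}"


def split_and_fade(input_path, fade_len=0.1):
    """Split on silence with the sox CLI, then fade each clip with pysox."""
    import sox  # pysox; imported here so the helpers above work without it

    subprocess.run(build_split_command(input_path), check=True)
    faded = []
    for clip in sorted(glob.glob("clip[0-9]*.wav")):
        if clip.endswith("-f.wav"):
            continue  # skip faded outputs left over from an earlier run
        tfm = sox.Transformer()
        tfm.fade(fade_in_len=fade_len, fade_out_len=fade_len)
        tfm.build(clip, faded_name(clip))
        faded.append(faded_name(clip))
    return faded
```

Calling `split_and_fade("input.wav")` would write the clips into the current directory and return the names of the faded copies.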

shakfu · 2020-09-24T09:22:19Z

Trying to convert the splitting part of the above sox call into current pysox, I got as far as the following:

import sox
t = sox.Transformer()
t.silence(1, 1.0, 0.1)
t.silence(-1, 1.0, 0.1)
t.build('input.wav', 'clip.wav', extra_args=[':', 'newfile', ':', 'restart'])

# pysox converts this into the following sox args:

args = ['sox', '-D', '-V2', '-c', '1', 's104.wav', 'clip.wav', 'silence', '1', '0.100000', '1.000000%', 'reverse', 'silence', '1', '0.100000', '1.000000%', 'reverse', ':', 'newfile', ':', 'restart']

The problem is that this doesn't split the input file. I presume the culprit is the default translation to 'reverse', which always outputs one file. Also, the argument order for pysox's silence doesn't match the command-line silence arguments.

rabitt · 2021-02-18T18:11:50Z

Hey @shakfu

> My question to @rabitt is whether it is possible to translate this script's functionality (in particular the ": newfile : restart" idiom) into a pure-python pysox solution.

The short answer is no: pysox doesn't currently support the ": newfile : restart" idiom, but we could extend the API to support it.

> Also the arg order for pysox silence didn't match with the command line silence args.

Yes, this is the case for several of the transforms - the documentation should describe what each argument is doing, but it's true that it may not exactly match the command line tool's ordering. Note that for the silence command in particular, the "location" argument in pysox is there to support removing silence from the end of the file, hence the reverse.
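To make the "location" behaviour concrete, here is a small model of how the two calls posted earlier in the thread expand into sox effect arguments. It is inferred only from the argument list shown above and is not pysox's actual code; the function name `silence_effect_args` is hypothetical, and only the two location values seen in the thread (1 and -1) are modelled.

```python
def silence_effect_args(location, silence_threshold=0.1,
                        min_silence_duration=0.1):
    """Model how a pysox-style `location` expands into sox effect args.

    Inferred from the argument list posted in this thread; a sketch only.
    """
    if location not in (1, -1):
        raise ValueError("this sketch only models location 1 and -1")
    # Note the order: the Python call takes (threshold, duration),
    # while the command line expects duration before threshold.
    core = ["silence", "1",
            f"{min_silence_duration:f}", f"{silence_threshold:f}%"]
    if location == 1:
        return core  # trim silence at the start of the file
    # location == -1: reverse, trim the (now leading) silence, reverse back
    return ["reverse"] + core + ["reverse"]


# The two calls from the thread: t.silence(1, 1.0, 0.1); t.silence(-1, 1.0, 0.1)
effects = silence_effect_args(1, 1.0, 0.1) + silence_effect_args(-1, 1.0, 0.1)
print(" ".join(effects))
```

Neither expansion contains the second "below-period" triple (`1 0.1 1%` after the first one) that the command-line version relies on to stop at each silence, which is consistent with the file not being split.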

StuartIanNaylor · 2021-02-22T22:44:35Z

@rabitt I do love pysox, it rocks, guys, and thanks.

I am going to steal @shakfu's bash script, but I would also love to do this in pysox (creating model datasets for KWS).
VAD too, but since VAD results can sometimes be confusing (the spectral representation doesn't always work out well), I often use silence because the result is more logical, rather than occasionally wondering about a curious VAD result.

Thanks @shakfu for the script, as I was just about to ask about silence splitting and saw your post.

PS: if it could also take no action and just output the split points to a text file, that would be useful, perhaps together with an ASR aligner, as I still have to find one that extracts words satisfactorily.
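The "split points only" idea can be sketched without sox at all, given the samples as a list of floats. This is a simple amplitude-threshold approach and not the algorithm sox's silence effect uses; the function names and default values are assumptions chosen for illustration.

```python
def find_split_points(samples, sample_rate, threshold=0.01, min_silence=0.1):
    """Return (start_sec, end_sec) pairs for the non-silent stretches.

    A sample counts as silent when abs(value) <= threshold. A silent run
    only separates two clips when it lasts at least `min_silence` seconds.
    """
    min_gap = round(min_silence * sample_rate)  # silence length in samples
    segments = []
    start = None      # index where the current clip began
    last_loud = None  # index of the most recent non-silent sample
    for i, value in enumerate(samples):
        if abs(value) <= threshold:
            continue
        if start is None:
            start = i
        elif i - last_loud - 1 >= min_gap:
            # the silent run before this sample was long enough: close the clip
            segments.append((start / sample_rate, (last_loud + 1) / sample_rate))
            start = i
        last_loud = i
    if start is not None:
        segments.append((start / sample_rate, (last_loud + 1) / sample_rate))
    return segments


def format_split_points(segments):
    """Render one 'start<TAB>end' line per clip, ready to write to a .txt file."""
    return "".join(f"{start:.3f}\t{end:.3f}\n" for start, end in segments)
```

The text returned by `format_split_points` can be written to a file with a plain `open(path, "w")`, leaving the audio itself untouched.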

shakfu · 2021-02-22T23:46:53Z

@rabitt Thanks for your response. It would be great if the pysox API could be extended to accommodate this use-case. Naturally, it would be great to accomplish this in python (-:

@StuartIanNaylor Thanks, glad that my little script can be of use. Incidentally, I was curious about the answer to your last question and found some possible solutions in this Stack Overflow exchange.

StuartIanNaylor · 2021-05-12T21:00:43Z

@shakfu

import tempfile

import sox


def get_voice_params(file, silence_maximum_amplitude, file_min_silence_duration=0.2):
  # Peak amplitude and total length of the original file
  stat = sox.file_info.stat(file)
  file_maximum_amplitude = stat['Maximum amplitude']
  file_duration = stat['Length (seconds)']

  # pysox expects the silence threshold as a percentage of full scale
  percent_silence_threshold = (silence_maximum_amplitude / file_maximum_amplitude) * 100

  # Temporary files, so intermediate results are not left on disk
  tmp1 = tempfile.NamedTemporaryFile(suffix='.wav')
  tmp2 = tempfile.NamedTemporaryFile(suffix='.wav')

  # Pass 1: trim trailing silence; the remaining length marks where the voice ends
  tfm1 = sox.Transformer()
  tfm1.silence(location=-1, min_silence_duration=file_min_silence_duration, silence_threshold=percent_silence_threshold, buffer_around_silence=True)
  tfm1.build(file, tmp1.name)  # was kw_files[0], which is undefined in this function
  tfm1.clear_effects()

  stat = sox.file_info.stat(tmp1.name)
  voice_end = stat['Length (seconds)']

  # Pass 2: trim leading silence from the result of pass 1
  tfm1.silence(location=1, min_silence_duration=file_min_silence_duration, silence_threshold=percent_silence_threshold, buffer_around_silence=True)
  tfm1.build(tmp1.name, tmp2.name)
  tfm1.clear_effects()

  stat = sox.file_info.stat(tmp2.name)
  print(stat)
  # What remains is the voiced part, so the start is the end minus its length
  voice_start = voice_end - stat['Length (seconds)']
  voice_duration = voice_end - voice_start
  return file_maximum_amplitude, file_duration, voice_start, voice_end, voice_duration


# kw_file (path to the input wav) and silence_maximum_amplitude (the absolute
# amplitude to treat as silence) are expected to be defined by the caller
file_maximum_amplitude, file_duration, voice_start, voice_end, voice_duration = get_voice_params(kw_file, silence_maximum_amplitude)

print(file_maximum_amplitude, file_duration, voice_start, voice_end, voice_duration)

It suddenly clicked that I didn't like writing out to the hard drive all the time, so this uses temporary files.
I still need to change the Python logging to stop the warnings, but that is an easy addition.
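The arithmetic in the function above can be checked in isolation, without any audio files. The sketch below pulls out the two calculations: converting an absolute amplitude into a percentage threshold, and deriving the voice boundaries from the two trimmed lengths. The helper names are hypothetical, and capping the percentage just below 100 is an assumption made here on the basis that the threshold is meant to be a percentage; it is not something the thread specifies.

```python
def percent_silence_threshold(silence_max_amplitude, file_max_amplitude):
    """Express an absolute amplitude as a percentage of the file's peak."""
    if file_max_amplitude <= 0:
        # an all-zero file would otherwise cause a division by zero
        raise ValueError("file has no signal; cannot derive a threshold")
    percent = (silence_max_amplitude / file_max_amplitude) * 100
    # Assumption: keep the value inside the range a percentage can take
    return min(max(percent, 0.0), 99.9)


def voice_bounds(length_after_end_trim, length_after_both_trims):
    """Derive (voice_start, voice_end, voice_duration) in seconds.

    length_after_end_trim:   file length once trailing silence is removed
    length_after_both_trims: length once leading silence is removed as well
    """
    voice_end = length_after_end_trim
    voice_start = voice_end - length_after_both_trims
    return voice_start, voice_end, voice_end - voice_start


print(percent_silence_threshold(0.01, 0.5))
print(voice_bounds(3.0, 2.5))
```

For example, a file whose trailing-silence trim leaves 3.0 seconds and whose second trim leaves 2.5 seconds has its voiced part starting at 0.5 seconds.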

lostanlen closed this as completed Sep 21, 2020

lostanlen reopened this Sep 21, 2020
