Expanding split_on_silence with minimum length / file size? #143

Closed
hclivess opened this issue Jun 8, 2016 · 3 comments

Comments

@hclivess

hclivess commented Jun 8, 2016

Would it be easy/possible to combine minimum chunk length or minimum file size with splitting on silence? I would like to split large files in the silence gaps, but I want the files to be no less than x minutes.

@jiaaro
Owner

jiaaro commented Jun 11, 2016

@hclivess My advice is to use pydub.silence.split_on_silence() and then recombine the segments as needed so that you have files that are roughly the size you're targeting.

Something like:

from pydub import AudioSegment
from pydub.silence import split_on_silence

sound = AudioSegment.from_file("/path/to/file.mp3", format="mp3")
chunks = split_on_silence(
    sound,

    # split on silences longer than 1000ms (1 sec)
    min_silence_len=1000,

    # anything under -16 dBFS is considered silence
    silence_thresh=-16, 

    # keep 200 ms of leading/trailing silence
    keep_silence=200
)

# now recombine the chunks so that the parts are at least 90 sec long
target_length = 90 * 1000
output_chunks = chunks[:1]  # stays empty if no non-silent audio was found
for chunk in chunks[1:]:
    if len(output_chunks[-1]) < target_length:
        output_chunks[-1] += chunk
    else:
        # if the last output chunk has reached the target length,
        # we can start a new one
        output_chunks.append(chunk)

# now you have chunks that are at least 90 seconds long (except possibly the last one)
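
The loop above can leave a final chunk that is shorter than the target. Here is a minimal sketch of one way to deal with that, assuming it is acceptable to fold a short tail into the chunk before it. `merge_min_length` is a hypothetical helper name, not part of pydub; it only relies on `len()` and `+`, which `AudioSegment` supports (with `len()` in milliseconds).

```python
def merge_min_length(chunks, target_length):
    """Greedily merge consecutive chunks until each reaches target_length.

    Hypothetical helper (not a pydub function). Works on anything that
    supports len() and +, e.g. pydub AudioSegment objects.
    """
    merged = []
    for chunk in chunks:
        if merged and len(merged[-1]) < target_length:
            # the current output chunk is still too short, keep growing it
            merged[-1] = merged[-1] + chunk
        else:
            merged.append(chunk)

    # if the tail ended up too short, fold it into the previous chunk
    if len(merged) > 1 and len(merged[-1]) < target_length:
        tail = merged.pop()
        merged[-1] = merged[-1] + tail
    return merged


# Possible usage with the `chunks` list from split_on_silence():
#
#   for i, part in enumerate(merge_min_length(chunks, 90 * 1000)):
#       part.export("part{}.mp3".format(i), format="mp3")
```

With this variant every output chunk reaches the target length, unless the whole input is shorter than the target.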

Alternatively, you can use pydub.silence.detect_nonsilent() to find the non-silent ranges and make your own decisions about where to slice the original audio.
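
A rough sketch of that second approach, under a few assumptions: cuts are placed at the midpoint of each silent gap, a cut is only accepted once the current piece has reached the minimum length, and the silence itself is kept rather than trimmed. `plan_cuts` and `split_file` are hypothetical names for illustration; `detect_nonsilent()` returns a list of `[start_ms, end_ms]` ranges.

```python
def plan_cuts(nonsilent_ranges, total_len_ms, min_len_ms):
    """Return (start_ms, end_ms) slices that only cut inside silent gaps.

    Hypothetical helper. `nonsilent_ranges` is a list of [start, end]
    pairs in milliseconds, as returned by pydub's detect_nonsilent().
    """
    cuts = [0]
    # each silent gap lies between the end of one range and the start of the next
    for (_, prev_end), (next_start, _) in zip(nonsilent_ranges, nonsilent_ranges[1:]):
        midpoint = (prev_end + next_start) // 2
        if midpoint - cuts[-1] >= min_len_ms:
            cuts.append(midpoint)

    # drop the last cut if it would leave a tail shorter than the minimum
    if len(cuts) > 1 and total_len_ms - cuts[-1] < min_len_ms:
        cuts.pop()
    cuts.append(total_len_ms)
    return list(zip(cuts, cuts[1:]))


def split_file(path, min_len_ms=90 * 1000):
    """Slice the audio at `path` into pieces of at least min_len_ms."""
    # imported here so plan_cuts() can be used without pydub installed
    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    sound = AudioSegment.from_file(path)
    ranges = detect_nonsilent(sound, min_silence_len=1000, silence_thresh=-16)
    return [sound[start:end] for start, end in plan_cuts(ranges, len(sound), min_len_ms)]
```

Unlike split_on_silence(), this keeps every millisecond of the original audio, so the pieces can be concatenated back into the source file.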

@hclivess
Author

Awesome, thank you very much. I will repost this code to StackExchange if you don't mind.

@jiaaro
Owner

jiaaro commented Jun 13, 2016

@hclivess oh, I see the question - will post it there too :)
