autoeditor

This script automatically cuts out portions of dead video from videos. I used it to post-parse OBS recordings of Twitch streams so that I could upload them to YouTube.

How does it work?

It uses ffmpeg to query the input video and generates a complex filter to crossfade between video sections, ensuring that no silence is longer than 3 seconds.

How do I use it?

Ensure ffmpeg is in your path, and then

python3 stage1_silencecutterfade.py --input <input directory> --output <output directory>

You can also pass the --audio-stream-id parameter if you use multi-track recordings.

How does it work - more details?

First off, the script uses ffmpeg to generate a "silences" file. It gets the output from ffmpeg -af silencedetect=noise=-30dB:d=# -f null where # is the duration in seconds, to detect all the places in the video where the audio is quieter than -30db for that amount of time or more. This command generates output like:

[silencedetect @ 0000021a5e45c540] silence_start: 11.0451

[silencedetect @ 0000021a5e45c540] silence_end: 16.756 | silence_duration: 5.71088

This output is parsed to get a set of silence starts, ends, and durations.

Next, the set of silences is iterated over to work out where to cross-fade. Because ffmpeg doesn't support sensible cross-fading, a "complex filter script" is created to do the crossfading. Example:

# Pre-fade
[0:v:0]trim=start=0:end=12.7951,setpts=PTS-STARTPTS[v1];
[0:a:0]atrim=start=0:end=12.7951,asetpts=PTS-STARTPTS[a1];

# 0.5 second crossfade section from pre-cut
[0:v:0]trim=start=12.7951:end=13.2951,setpts=PTS-STARTPTS[v2];
[0:a:0]atrim=start=12.7951:end=13.2951,asetpts=PTS-STARTPTS[a2];

# 0.5 second crossfade section from post-cut
[0:v:0]trim=start=16.006:end=16.506,setpts=PTS-STARTPTS[v3];
[0:a:0]atrim=start=16.006:end=16.506,asetpts=PTS-STARTPTS[a3];

# 0.5 second fade to alpha from pre-cut
[v2]format=pix_fmts=yuva420p,fade=t=out:st=0:d=0.5:alpha=1[z4];
[z4]fifo[y4];

# 0.5 second fade from alpha from post-cut
[v3]format=pix_fmts=yuva420p,fade=t=in:st=0:d=0.5:alpha=1[b4];
[b4]fifo[c4];

# Overlaying video alphas so we get a continuous video 
[y4][c4]overlay[v4];

# Crossfading video audio
[a2][a3]acrossfade=d=0.5[a4];

# The rest of the video
[0:v:0]trim=start=16.506:end=32.016,setpts=PTS-STARTPTS[v5];
[0:a:0]atrim=start=16.506:end=32.016,asetpts=PTS-STARTPTS[a5];

# Concatenating all the video segments together.
v1][a1][v4][a4][v5][a5]concat=n=3:v=1:a=1[v6][a6]

ffmpeg crunches away on this and spits out a video at the end of it.

Does it work properly?

Well, looking over the previous timestamps I think I'm a little out, so TODO? It looks like I'm actually generating silences of 1.75s + 0.5s + 0.25s == 2.5s rather than 3s. Oh well :D

Also if your video is very long or has a lot of silences then ffmpeg will probably run out of memory and crash.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
tests/input		tests/input
.gitignore		.gitignore
README.md		README.md
aeutils.py		aeutils.py
requirements.txt		requirements.txt
silencedetector.py		silencedetector.py
stage1_silencecutterfade.py		stage1_silencecutterfade.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

autoeditor

How does it work?

How do I use it?

How does it work - more details?

Does it work properly?

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

autoeditor

How does it work?

How do I use it?

How does it work - more details?

Does it work properly?

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages