Audio Support #26

Closed
RGuilfoyle opened this issue Jul 26, 2017 · 19 comments

@RGuilfoyle

RGuilfoyle commented Jul 26, 2017

I have used filter_complex on a video as below, but the audio stream is lost.
I would like to be able to adjust the audio and video playback rate, suggestions?

import ffmpeg
stream = ffmpeg.input('input.mp4')
stream = ffmpeg.filter_(stream, 'fps', fps=20)
stream = ffmpeg.filter_(stream, 'crop', 'iw*.63:ih*.63:iw-(iw*.64):ih-(ih*.833)')
stream = ffmpeg.filter_(stream, 'setpts', '0.8333334*PTS')
stream = ffmpeg.output(stream, 'output.mp4')
ffmpeg.run(stream)

With ffmpeg I can mux the audio back in (also with a speedup):
-filter_complex 'crop=iw*.63:ih*.63:iw-(iw*.64):ih-(ih*.833)[vid];[vid]fps=20[vid2];[vid2]setpts=0.8333334*PTS[v];[0:a]atempo=1.2[a]' -map '[v]' -map '[a]'
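
The two speed factors in that command are related: `setpts` scales timestamps by 1/speed, while `atempo` takes the speed factor directly. A minimal standard-library sketch of that relationship follows. The helper name `speed_change_args` is hypothetical, and the crop and fps filters are left out for brevity.

```python
def speed_change_args(src, dst, speed):
    """Build an ffmpeg command line that speeds up video and audio together."""
    # Video: presentation timestamps are multiplied by 1/speed.
    video_chain = "[0:v]setpts={:.7f}*PTS[v]".format(1.0 / speed)
    # Audio: atempo takes the speed factor itself.
    audio_chain = "[0:a]atempo={}[a]".format(speed)
    return [
        "ffmpeg", "-i", src,
        "-filter_complex", video_chain + ";" + audio_chain,
        # Both labelled outputs have to be mapped, otherwise one is dropped.
        "-map", "[v]", "-map", "[a]",
        dst,
    ]


args = speed_change_args("input.mp4", "output.mp4", 1.2)
print(" ".join(args))
```

The list can be handed to `subprocess.run(args)` if ffmpeg is on the PATH.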

@kkroening
Owner

Sorry for the delayed reply. I'll take a look at this and see what I can figure out.

@kkroening
Owner

kkroening commented Jul 30, 2017

Okay, streams in -filter_complex seem to be either audio or video but not both. This means that in general, we have to process the audio separately and then recombine it at the end, as in your example. Here's what it looks like graphically:

screen shot 2017-07-29 at 8 30 34 pm

The map_audio operator, which doesn't exist in ffmpeg-python yet, should produce the multiple -map parameters you showed above:

import ffmpeg
in_stream = ffmpeg.input('input.mp4')
audio_stream = in_stream.audio()
audio_stream = ffmpeg.filter_(audio_stream, 'atempo', '1.2')
video_stream = in_stream  # automatically equivalent to in_stream.video()
video_stream = ffmpeg.filter_(video_stream, 'fps', fps=20)
video_stream = ffmpeg.filter_(video_stream, 'crop', 'iw*.63', 'ih*.63', 'iw-(iw*.64)', 'ih-(ih*.833)')
video_stream = ffmpeg.filter_(video_stream, 'setpts', '0.8333334*PTS')
out_stream = ffmpeg.map_audio(video_stream, audio_stream)
out_stream = ffmpeg.output(out_stream, 'output.mp4')
ffmpeg.run(out_stream)

Or with fluent style:

in_stream = ffmpeg.input('input.mp4')
audio_stream = in_stream.audio().filter_('atempo', '1.2')
(in_stream
    .filter_('fps', fps=20)
    .filter_('crop', 'iw*.63', 'ih*.63', 'iw-(iw*.64)', 'ih-(ih*.833)')
    .filter_('setpts', '0.8333334*PTS')
    .map_audio(audio_stream)
    .output('output.mp4')
    .run()
)

Actually though, even the simplest -filter_complex example has the same problem with audio getting dropped:

screen shot 2017-07-29 at 8 34 36 pm

In order to get audio here, we'd have to do the -map '[v]' -map '[a]' thing:

screen shot 2017-07-29 at 8 36 32 pm

in_stream = ffmpeg.input('input.mp4')
(in_stream
    .hflip()
    .map_audio(in_stream.audio())
    .output('output.mp4')
    .run()
)

I think it's annoying to have to explicitly put the .map_audio in there, even though ffmpeg requires explicitly doing a second -map. I think we can do better with ffmpeg-python: ffmpeg-python could probably notice that the audio stream needs to be added back in so that the following is equivalent to the above:

(ffmpeg
    .input('input.mp4')
    .hflip()
    .output('output.mp4')
    .run()
)

This diverges from ffmpeg's behavior slightly, but it's probably easier to work with.

And for examples that use audio filters like the one you listed, it could be expressed as follows:

import ffmpeg
stream = ffmpeg.input('input.mp4')
stream = ffmpeg.filter_(stream, 'fps', fps=20)
stream = ffmpeg.filter_(stream, 'crop', 'iw*.63:ih*.63:iw-(iw*.64):ih-(ih*.833)')
stream = ffmpeg.filter_(stream, 'setpts', '0.8333334*PTS')
stream = ffmpeg.filter_(stream, 'atempo', '1.2')
stream = ffmpeg.output(stream, 'output.mp4')
ffmpeg.run(stream)

... or fluently:

(ffmpeg
    .input('input.mp4')
    .filter_('fps', fps=20)
    .filter_('crop', 'iw*.63', 'ih*.63', 'iw-(iw*.64)', 'ih-(ih*.833)')
    .filter_('setpts', '0.8333334*PTS')
    .filter_('atempo', '1.2')
    .output('output.mp4')
    .run()
)

Graphically, it looks like this:

screen shot 2017-07-29 at 8 28 09 pm

Nice and simple. ffmpeg-python would expand it to the more complicated picture as needed:

screen shot 2017-07-29 at 8 30 34 pm

That way we get all of the power of ffmpeg but none of the headache of having to keep track of separate streams, at least for simple examples. And if someone really wants the default ffmpeg behavior of dropping audio streams, all they'd have to do is put .video() on their input stream, and ffmpeg-python won't try to hold their hand.

So the plan is:
1: Add the map_audio operator.
2: Make it so that audio streams are automatically processed and mapped separately, in an intuitive but predictable manner.

I can't work on it tonight but will hopefully get step 1 done in the next day or two. Step 2 will take a bit longer, but I want to get it on the roadmap to keep ffmpeg-python easy to use.

@kkroening
Owner

Thanks for reporting this, @RGuilfoyle!

@kkroening
Owner

kkroening commented Jul 30, 2017

And of course, all of the filters you listed should soon become built-in operators in ffmpeg-python, e.g.:

(ffmpeg
    .input('input.mp4')
    .fps(20)
    .crop('iw*.63', 'ih*.63', 'iw-(iw*.64)', 'ih-(ih*.833)')
    .setpts('0.8333334*PTS')
    .atempo(1.2)
    .output('output.mp4')
    .run()
)

@RGuilfoyle
Author

Thanks for the quick response @kkroening.

@kkroening
Owner

I still haven't had a chance to work on this yet. Is it holding you up, @RGuilfoyle?

@RGuilfoyle
Author

This was for an API demo I was building, nothing urgent!

@depau
Collaborator

depau commented Nov 2, 2017

I like the API proposal. I think the map_audio method should be in the output object though, as you're mapping the audio to a specific output file/stream.

Every stream should have audio and video methods that simply add :a, :v to the stream selectors.

As I'm going to need this, I'll try to implement it and send a PR that behaves like I said.

@kkroening
Owner

Yeah, I just wish hflip and other operators did predictable things. ffmpeg has really confusing behavior when it comes to audio, so if we can make ffmpeg-python take care of the dirty work in a predictable way, that would be best. Might be easier said than done though.

@depau
Collaborator

depau commented Nov 3, 2017

In my opinion it shouldn't be this library's duty to fix ffmpeg. I think it should only provide means to make it easier, without doing it by default.

@kkroening
Owner

kkroening commented Nov 4, 2017

Valid point. Hm, and that's probably true of the automatic -y option - maybe we should actually switch that back, like noahstier suggested.

@leadscloud

AttributeError: module 'ffmpeg' has no attribute 'map_audio'

How do I do this?

@kkroening
Owner

map_audio has not been implemented yet, unfortunately.

@leadscloud

@kkroening

.filter_('atempo', '1.2')

[Parsed_setpts_1 @ 0000014439110320] Media type mismatch between the 'Parsed_setpts_1' filter output pad 0 (video) and the 'Parsed_atempo_2' filter input pad 0 (audio)
[AVFilterGraph @ 0000014438b4b480] Cannot create the link setpts:0 -> atempo:0
Error initializing complex filters.
Invalid argument

@leadscloud

How do I map audio to video?

@skyHALud

skyHALud commented Oct 3, 2019

I tried both versions 0.1.17 and 0.2.0 and the instream.audio() and instream.video() methods are not present (a backwards incompatible API change?). What works though is:

input_stream = ffmpeg.input(input_path)
video_stream = input_stream['v']
audio_stream = input_stream['a']

@LeonLIU08

If I want to process the frames in NumPy and then map the original audio onto the final video, could you please share a possible way to do this?

@mixmastamyk

How do I copy audio with the fluent interface? I still haven't figured it out. As soon as I add a filter, acodec='copy' no longer works, I think because the filter adds a map that excludes the audio.

@enjrolas

I just ran into this same issue today -- my complex filter strips audio. This was my original complex filter:

vid = ffmpeg.input(args.videoFile)
(
    ffmpeg
    .filter([vid, logo], 'overlay', overlayX, overlayY)
    .output(videoOutputFilename, metadata='title=%s' % title, **{'loglevel': 'error', 'stats': None})
    .run(overwrite_output=True)
)

I read through this comment and tried adding

vid = ffmpeg.input(args.videoFile)
(
    ffmpeg
    .filter([vid, logo], 'overlay', overlayX, overlayY)
    .map_audio(vid.audio())
    .output(videoOutputFilename, metadata='title=%s' % title, **{'loglevel': 'error', 'stats': None})
    .run(overwrite_output=True)
)

but I get the error, " .map_audio(vid.audio())
AttributeError: 'FilterableStream' object has no attribute 'map_audio'"

Any idea how to address this?
