

[ffmpeg] Convert from av_audio_convert API to the swresample one. #1290

merged 6 commits into from

8 participants


This is a followup on #882.
No new feature added, only switching from a deprecated internal API to a public one.

It includes support for libavresample if libswresample is not available, which should please libav boyz, but since libav has not made a single release with libavresample they will probably have to wait or use git master.

Team Kodi member

I can't help but wonder why we need this at all, now that we have the audio engine.


well, there were still calls to the internal ffmpeg API.
if you find another way to drop them, I'm happy too :)


OTOH, it may be faster to use swr for audio conversions in the audio engine; they have a lot of mmx/sse/avx optimisations.


^^^ i always said do that and fall back to pure C code, which is the only thing AE should contain itself. but gnif wanted it self-contained. since he's a goner i vote we do it. easy to map avr <-> swr?


for mapping avr <-> swr, see 274679d here
The APIs are really similar, it's not that difficult to map them (only a subset of the API is mapped here though)


okay, if we start with the "simple" bits, it's just converting AEConvert.cpp i guess. ultimately it will be more of a pain though if we want to do remapping and resampling at the same time (which we should), since ae is designed to do those in different steps..


@aballier, remember that there are currently three audio engines, SoftAE, PulseAE and CoreAudioAE.

Also, depending on where AE is doing the processing, it might be working on a single frame at a time, or a bunch of frames.


problem with AEConvert is that its API doesn't map sanely to swr/avr: the latter have a model with a costly initialisation (setting up everything and allocating memory) followed by fast conversion.
AEConvert's API uses a cheap initialisation (a function that returns a function converting from/to float and the requested format).

I can't find a sane way to wrap the swr API into AEConvert's one.

Other options are:

  • Make AEConvert an abstraction layer over swr, adopting a very close API
  • Drop AEConvert and use swr directly
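
To make the mismatch concrete, here is a minimal sketch of the "cheap initialisation" shape described above. The names (`ToFloatFn`, `LookupToFloat`, `SampleFormat`) are hypothetical stand-ins and not the actual AEConvert declarations; the point is only that the lookup returns a stateless function pointer, so there is no object in which a swr/avr context could live between calls.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the "cheap initialisation" style:
// the lookup hands back a stateless function pointer.
typedef unsigned int (*ToFloatFn)(const uint8_t *data, unsigned int samples, float *dest);

enum SampleFormat { FMT_U8, FMT_S16NE };

static unsigned int U8_ToFloat(const uint8_t *data, unsigned int samples, float *dest)
{
  assert(data && dest);
  for (unsigned int i = 0; i < samples; ++i)
    dest[i] = (data[i] - 128) / 128.0f;
  return samples;
}

static unsigned int S16NE_ToFloat(const uint8_t *data, unsigned int samples, float *dest)
{
  assert(data && dest);
  for (unsigned int i = 0; i < samples; ++i)
  {
    int16_t v;
    std::memcpy(&v, data + i * sizeof(v), sizeof(v)); // avoid unaligned reads
    dest[i] = v / 32768.0f;
  }
  return samples;
}

// Nothing is allocated here, so there is nothing for the caller to free,
// and also nowhere to keep a per-stream resampler context alive.
static ToFloatFn LookupToFloat(SampleFormat fmt)
{
  switch (fmt)
  {
    case FMT_U8:    return &U8_ToFloat;
    case FMT_S16NE: return &S16NE_ToFloat;
  }
  return nullptr;
}
```

A caller would fetch the pointer once per format and invoke it per buffer; a swr-backed version would instead need an allocate/init step and a matching free, which this shape cannot express.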

imho only option 1) (or well, something in its spirit) is okay. forcing a dependency on anything ffmpeg, though highly silly, scares vendors.

i guess the resample context needs to be moved from the engine (i.e. resample the incoming stream) to the stream itself (resample before you send to engine). that will allow the resampler context to live longer. this works for everything but nav sounds. problem there is even if we force some format, we may very well have to resample since we may be outputting audio at some other sample rate (nav sounds during media playback). i can't think of a solution to this, perhaps simply resampling those using C code will suffice?


hmm, wait, what is option 1) ? :)


and what do you mean by 'forcing a dependency on anything ffmpeg' ?

Team Kodi member

We now have amlplayer and soon to have omxplayer, which means that it's possible to build/distribute xbmc without a trace of ffmpeg. To some this is a big feature due to license/patent/voodoo concerns.


An LGPL library that depends on neither libavformat nor libavcodec (where you may have some patent issues)???
Well, the voodoo remains :)


back to the subject, nav sounds are already resampled with src (libsamplerate in AEWAVLoader.cpp); however, I don't understand the idea here (nor the problem with nav sounds) :(


as i said it's silly, but they want no trace of ffmpeg. and you cannot argue with phb's.

if i understood correctly, the problem here is that we resample in the engine, using the resample function (the "cheap initialisation" API). to get the context living longer, my idea was to attach it to the AEStream instead. the stream lives for a whole movie/song/whatever. the only problem there is the nav sounds - they are "streams" that are very short lived due to the length of a nav sound..


ok but then I don't get where we're trying to go:
1) convert AE to do its format conversions with swr/avr
2) swr/avr is forbidden as a hard dep

for me the two are in contradiction

if the point is to make AE optionally use avr/swr, then this justifies even more this PR:

  • the first 3 commits will be needed
  • 6fc0730 is a straightforward conversion of existing code
  • 2fd1e6f only removes now unused API from the dll

for the context problem, maybe the following will be simpler:

  • change CAEConvert API a little bit and make CAEConvert::ToFloat / FrFloat return a context
  • add a convert function to CAEConvert taking as arguments the context, uint8_t* buffers and the # of samples
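
The two bullets above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ConvertContext`, `Convert` and `FreeContext` are hypothetical names, the free-function form of `ToFloat` stands in for `CAEConvert::ToFloat`, and the context here merely records the source format, whereas a swr/avr-backed build could store the resampler context in it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// FMT_S24NE3 is listed only to show an unsupported format in this sketch.
enum SampleFormat { FMT_S16NE, FMT_FLOAT, FMT_S24NE3 };

// Hypothetical context; with swr/avr available it could own a SwrContext*.
struct ConvertContext
{
  SampleFormat from;
};

// Costly step, done once per stream. Returns NULL for unsupported formats.
// The caller owns the result and must hand it to FreeContext().
static ConvertContext *ToFloat(SampleFormat from)
{
  if (from != FMT_S16NE && from != FMT_FLOAT)
    return nullptr;
  ConvertContext *ctx = new ConvertContext;
  ctx->from = from;
  return ctx;
}

// Fast step: takes the context, raw byte buffers and the number of samples.
// Returns the number of samples written, or -1 on bad arguments.
static int Convert(ConvertContext *ctx, const uint8_t *in, uint8_t *out, unsigned int samples)
{
  if (!ctx || !in || !out)
    return -1;
  for (unsigned int i = 0; i < samples; ++i)
  {
    float f;
    if (ctx->from == FMT_S16NE)
    {
      int16_t v;
      std::memcpy(&v, in + i * sizeof(v), sizeof(v));
      f = v / 32768.0f;
    }
    else
    {
      std::memcpy(&f, in + i * sizeof(f), sizeof(f)); // float in, float out
    }
    std::memcpy(out + i * sizeof(f), &f, sizeof(f));
  }
  return static_cast<int>(samples);
}

// Mirrors swr_free(): frees the context and clears the caller's pointer.
static void FreeContext(ConvertContext **ctx)
{
  assert(ctx);
  delete *ctx;
  *ctx = nullptr;
}
```

The visible change for callers is ownership: whoever calls `ToFloat` must later call `FreeContext`, which the function-pointer API never required.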

yes, this pr is most welcome in my opinion, independent of any ae change.

the point was to change the AE API to suit avr/swr, then optionally use it and fall back to a pure C implementation. i'm all for a simpler approach, i was just thinking out loud based on what i know about ae and what you said makes avr/swr hard to use.


ok I get it now :)

the simplest way is what i described above imho, adopting a context-based api à la swr; the most important change is that the caller would have to delete/free the context, which is not the case now.

Team Kodi member

cptspiff: I am most certainly not gone, I just do not currently have the time to keep up with XBMC dev. I agree with using a class to allow libav reuse, but I do not agree with removing the old code at all. My plans are still the same: AE will become an external library depending on nothing but the OS sound API.


@gnif : Don't take my lines as an offense. It is my personal standpoint.

For AE I still do not get why we refuse battle-proven code and reinvent the wheel. ffmpeg includes a lot of optimized code from many people, for many platforms, and it is tested a lot. My personal standpoint is: use ffmpeg where it fits and does a good job, instead of reinventing the wheel.

Team Kodi member

@huceke: Sure, that's why I said I am OK with abstracting it, I do like the idea of having the option, just keep the code around for the old stuff for when I/someone is ready to make it an external lib.


@gnif - nice to see you around ;)

Maybe I'm naive here on the ffmpeg / libav issues but does it make sense to just import the swr code into xbmc with acknowledgements? It would seem to solve the issues of AE staying as self-contained as possible while avoiding the ffmpeg / libav distro and / or patent issues without reinventing the wheel?


importing the code is not nice for people already having the shared lib: they don't need the same code twice in ram / on disk :)
moreover, it's not that straightforward to import if you want to avoid symbol collisions


Understand your points - as to the additional code surely the resulting binary size hit would be no more than a few kb. Symbol collision would be a real issue although namespaces and "replace all" are handy ;)


here libswresample + libavutil (dep of libswr) .so's are ~200kb, not negligible imho


200kb is tiny compared with say avcodec (about 4 MBs). Also, dylibs are typically mmap'ed into the address space, not actually loaded into ram, so the OS will handle swapping them into active ram pages on demand. This is how we can run a full blown XBMC on embedded systems that can have as little as 256MBs of ram.

I suggest baby steps, solve the current issue with accessing internal FFMpeg functions. Then think about SoftAE and friends.


@davilla - agreed. Part of my reasoning for bringing libswresample code directly into xbmc would be to avoid having two resamplers (one for gui wav's and one for streams) so the net effect would be a reduction in overall package size, reduced dependencies on external libs and simplicity in the overall resampler calling code.

It also helps with the goal of AE being the sole audio handler. Kinda like how we don't use external codecs for video decoding and rendering such as the K-Lite packs. Seems that's been a core philosophy from the start.

Team Kodi member

Just my opinion. It might not be a good idea to dupe the code because someone has to handle backporting of upstream code fixes. And that could be a time consuming job (i bet the code needs to be adapted heavily for not bringing the whole lib into xbmc but only the parts we need). Compiling that asm foo which is in ffmpeg could also be a problem (atm we compile ffmpeg with mingw on windows but the core with vs).


Hey Memphiz - devil's in the details lol. Regarding backports swr has been around for ages and works just fine, but I agree that patches upstream would be an issue to adapt, and ofc we all love deciphering ffmpeg :P

@gnif - further thoughts?


no no no no no. we will not maintain swr! haven't we learned anything?


what i have in mind is: keep the reinvented wheel as such, and when swr is available route the resampling through it; no code duplication, ae remains self contained by default, every one is happy, no ? :)
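
One way to picture this routing is a small factory that prefers an optional backend and otherwise keeps the built-in code. The sketch below is an assumption-laden illustration, not AE code: `IConverter`, `PureCConverter`, `BackendFactory` and `CreateConverter` are hypothetical names, and `UnavailableBackend` simply simulates a system where swr could not be loaded.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical interface; AE's real conversion entry points differ.
class IConverter
{
public:
  virtual ~IConverter() {}
  virtual std::string Name() const = 0;
  virtual int S16ToFloat(const int16_t *in, float *out, int samples) = 0;
};

// The "reinvented wheel": always available, no external dependency.
class PureCConverter : public IConverter
{
public:
  std::string Name() const override { return "pure-c"; }
  int S16ToFloat(const int16_t *in, float *out, int samples) override
  {
    assert(in && out && samples >= 0);
    for (int i = 0; i < samples; ++i)
      out[i] = in[i] / 32768.0f;
    return samples;
  }
};

// A backend factory returns an empty pointer when its library is missing.
typedef std::unique_ptr<IConverter> (*BackendFactory)();

// Stand-in for "swr could not be loaded" on this system.
static std::unique_ptr<IConverter> UnavailableBackend()
{
  return std::unique_ptr<IConverter>();
}

// Route through the preferred backend when it is usable,
// otherwise keep the self-contained implementation.
static std::unique_ptr<IConverter> CreateConverter(BackendFactory preferred)
{
  if (preferred)
  {
    std::unique_ptr<IConverter> backend = preferred();
    if (backend)
      return backend;
  }
  return std::unique_ptr<IConverter>(new PureCConverter());
}
```

With this shape a build without any ffmpeg library still works unchanged, and a swr-backed `IConverter` could be added later behind its own factory without duplicating code.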


Heh - sure. I'll bow to those still worried about the ffmpeg/libav issues with Linux distros...

@davilla davilla merged commit dcd820f into xbmc:master
@LongChair LongChair added a commit to plexinc/plex-home-theater-public that referenced this pull request
@LongChair LongChair Fix Cinema Trailers doesn't work from dashboard Recently Added fanout #… df03e38
Commits on Aug 14, 2012
  1. @aballier

    SwResample: Map swr_alloc_set_opts, swr_init, swr_free and swr_convert functions.

    aballier committed
    That way we can use the libswresample API instead of the private libavcodec::audioconvert API.
  2. @aballier
  3. @aballier
  4. @aballier

    [dvdplayer]: Use the SwResample API.

    aballier committed
    This drops all usage of the private libavcodec::audioconvert API in xbmc.
  5. @aballier
  6. @aballier
@@ -1412,6 +1412,13 @@ if test "$use_external_ffmpeg" = "yes"; then
# libavcore is optional
+ # one of libswresample or libavresample is needed
+ PKG_CHECK_EXISTS([libswresample], FFMPEG_LIBNAMES="$FFMPEG_LIBNAMES libswresample",
+ [PKG_CHECK_EXISTS([libavresample],
+ AC_MSG_ERROR([You need either libswresample
+ or libavresample.]))])
@@ -1445,6 +1452,9 @@ if test "$use_external_ffmpeg" = "yes"; then
+ # Check for libswresample or libavresample headers.
+ AC_CHECK_HEADERS([libswresample/swresample.h libavresample/avresample.h])
# Check if AVFilterBufferRefVideoProps AVRational member is named
# 'pixel_aspect' or 'sample_aspect_ratio'.
40 lib/DllAvCodec.h
@@ -58,20 +58,8 @@ extern "C" {
#include <ffmpeg/avformat.h>
- /* From non-public audioconvert.h */
- struct AVAudioConvert;
- typedef struct AVAudioConvert AVAudioConvert;
- AVAudioConvert *av_audio_convert_alloc(enum AVSampleFormat out_fmt, int out_channels,
- enum AVSampleFormat in_fmt, int in_channels,
- const float *matrix, int flags);
- void av_audio_convert_free(AVAudioConvert *ctx);
- int av_audio_convert(AVAudioConvert *ctx,
- void * const out[6], const int out_stride[6],
- const void * const in[6], const int in_stride[6], int len);
#include "libavcodec/avcodec.h"
- #include "libavcodec/audioconvert.h"
@@ -115,13 +103,6 @@ class DllAvCodecInterface
virtual int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic)=0;
virtual void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic)=0;
virtual AVCodec *av_codec_next(AVCodec *c)=0;
- virtual AVAudioConvert *av_audio_convert_alloc(enum AVSampleFormat out_fmt, int out_channels,
- enum AVSampleFormat in_fmt , int in_channels,
- const float *matrix , int flags)=0;
- virtual void av_audio_convert_free(AVAudioConvert *ctx)=0;
- virtual int av_audio_convert(AVAudioConvert *ctx,
- void * const out[6], const int out_stride[6],
- const void * const in[6], const int in_stride[6], int len)=0;
virtual int av_dup_packet(AVPacket *pkt)=0;
virtual void av_init_packet(AVPacket *pkt)=0;
@@ -189,17 +170,6 @@ class DllAvCodec : public DllDynamic, DllAvCodecInterface
virtual void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic) { ::avcodec_default_release_buffer(s, pic); }
virtual enum PixelFormat avcodec_default_get_format(struct AVCodecContext *s, const enum PixelFormat *fmt) { return ::avcodec_default_get_format(s, fmt); }
virtual AVCodec *av_codec_next(AVCodec *c) { return ::av_codec_next(c); }
- virtual AVAudioConvert *av_audio_convert_alloc(enum AVSampleFormat out_fmt, int out_channels,
- enum AVSampleFormat in_fmt , int in_channels,
- const float *matrix , int flags)
- { return ::av_audio_convert_alloc(out_fmt, out_channels, in_fmt, in_channels, matrix, flags); }
- virtual void av_audio_convert_free(AVAudioConvert *ctx)
- { ::av_audio_convert_free(ctx); }
- virtual int av_audio_convert(AVAudioConvert *ctx,
- void * const out[6], const int out_stride[6],
- const void * const in[6], const int in_stride[6], int len)
- { return ::av_audio_convert(ctx, out, out_stride, in, in_stride, len); }
virtual int av_dup_packet(AVPacket *pkt) { return ::av_dup_packet(pkt); }
virtual void av_init_packet(AVPacket *pkt) { return ::av_init_packet(pkt); }
@@ -251,13 +221,6 @@ class DllAvCodec : public DllDynamic, DllAvCodecInterface
DEFINE_METHOD2(enum PixelFormat, avcodec_default_get_format, (struct AVCodecContext *p1, const enum PixelFormat *p2))
DEFINE_METHOD1(AVCodec*, av_codec_next, (AVCodec *p1))
- DEFINE_METHOD6(AVAudioConvert*, av_audio_convert_alloc, (enum AVSampleFormat p1, int p2,
- enum AVSampleFormat p3, int p4,
- const float *p5, int p6))
- DEFINE_METHOD1(void, av_audio_convert_free, (AVAudioConvert *p1));
- DEFINE_METHOD6(int, av_audio_convert, (AVAudioConvert *p1,
- void * const p2[6], const int p3[6],
- const void * const p4[6], const int p5[6], int p6))
@@ -288,9 +251,6 @@ class DllAvCodec : public DllDynamic, DllAvCodecInterface
- RESOLVE_METHOD(av_audio_convert_alloc)
- RESOLVE_METHOD(av_audio_convert_free)
- RESOLVE_METHOD(av_audio_convert)
68 lib/DllSwResample.h
@@ -24,6 +24,7 @@
#include "config.h"
#include "DynamicDll.h"
+#include "utils/log.h"
extern "C" {
#ifndef HAVE_MMX
@@ -36,17 +37,36 @@ extern "C" {
#pragma warning(disable:4244)
- #include <libswresample/swresample.h>
+ #include <libswresample/swresample.h>
+ #include <libavresample/avresample.h>
+ #include <libavutil/opt.h>
+ #include <libavutil/samplefmt.h>
+ #define SwrContext AVAudioResampleContext
+ #else
+ #error "Either libswresample or libavresample is needed!"
+ #endif
#include "libswresample/swresample.h"
+class DllSwResampleInterface
+ virtual ~DllSwResampleInterface() {}
+ virtual struct SwrContext *swr_alloc_set_opts(struct SwrContext *s, int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate, int64_t in_ch_layout, enum AVSampleFormat in_sample_fmt, int in_sample_rate, int log_offset, void *log_ctx)=0;
+ virtual int swr_init(struct SwrContext *s)=0;
+ virtual void swr_free(struct SwrContext **s)=0;
+ virtual int swr_convert(struct SwrContext *s, uint8_t **out, int out_count, const uint8_t **in , int in_count)=0;
#if (defined USE_EXTERNAL_FFMPEG) || (defined TARGET_DARWIN)
// Use direct mapping
-class DllSwResample : public DllDynamic
+class DllSwResample : public DllDynamic, DllSwResampleInterface
virtual ~DllSwResample() {}
@@ -58,17 +78,59 @@ class DllSwResample : public DllDynamic
return true;
virtual void Unload() {}
+ virtual struct SwrContext *swr_alloc_set_opts(struct SwrContext *s, int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate, int64_t in_ch_layout, enum AVSampleFormat in_sample_fmt, int in_sample_rate, int log_offset, void *log_ctx) { return ::swr_alloc_set_opts(s, out_ch_layout, out_sample_fmt, out_sample_rate, in_ch_layout, in_sample_fmt, in_sample_rate, log_offset, log_ctx); }
+ virtual int swr_init(struct SwrContext *s) { return ::swr_init(s); }
+ virtual void swr_free(struct SwrContext **s){ return ::swr_free(s); }
+ virtual int swr_convert(struct SwrContext *s, uint8_t **out, int out_count, const uint8_t **in , int in_count){ return ::swr_convert(s, out, out_count, in, in_count); }
+// Wrap the same API through libavresample.
+class DllSwResample : public DllDynamic, DllSwResampleInterface
+ virtual ~DllSwResample() {}
+ // DLL faking.
+ virtual bool ResolveExports() { return true; }
+ virtual bool Load() {
+ CLog::Log(LOGDEBUG, "DllAvFormat: Using libavresample system library");
+ return true;
+ }
+ virtual void Unload() {}
+ virtual struct SwrContext *swr_alloc_set_opts(struct SwrContext *s, int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate, int64_t in_ch_layout, enum AVSampleFormat in_sample_fmt, int in_sample_rate, int log_offset, void *log_ctx) {
+ AVAudioResampleContext *ret = ::avresample_alloc_context();
+ av_opt_set_int(ret, "out_channel_layout", out_ch_layout , 0);
+ av_opt_set_int(ret, "out_sample_fmt" , out_sample_fmt , 0);
+ av_opt_set_int(ret, "out_sample_rate" , out_sample_rate, 0);
+ av_opt_set_int(ret, "in_channel_layout" , in_ch_layout , 0);
+ av_opt_set_int(ret, "in_sample_fmt" , in_sample_fmt , 0);
+ av_opt_set_int(ret, "in_sample_rate" , in_sample_rate , 0);
+ return ret;
+ }
+ virtual int swr_init(struct SwrContext *s) { return ::avresample_open(s); }
+ virtual void swr_free(struct SwrContext **s){ ::avresample_close(*s); *s = NULL; }
+ virtual int swr_convert(struct SwrContext *s, uint8_t **out, int out_count, const uint8_t **in , int in_count){ return ::avresample_convert(s, (void**)out, 0, out_count, (void**)in, 0,in_count); }
-class DllSwResample : public DllDynamic
+class DllSwResample : public DllDynamic, DllSwResampleInterface
+ DEFINE_METHOD9(SwrContext*, swr_alloc_set_opts, (struct SwrContext *p1, int64_t p2, enum AVSampleFormat p3, int p4, int64_t p5, enum AVSampleFormat p6, int p7, int p8, void * p9));
+ DEFINE_METHOD1(int, swr_init, (struct SwrContext *p1))
+ DEFINE_METHOD1(void, swr_free, (struct SwrContext **p1))
+ DEFINE_METHOD5(int, swr_convert, (struct SwrContext *p1, uint8_t **p2, int p3, const uint8_t **p4, int p5))
+ RESOLVE_METHOD(swr_alloc_set_opts)
+ RESOLVE_METHOD(swr_init)
+ RESOLVE_METHOD(swr_free)
+ RESOLVE_METHOD(swr_convert)
/* dependencies of libavformat */
29 xbmc/cores/dvdplayer/DVDCodecs/Audio/DVDAudioCodecFFmpeg.cpp
@@ -56,7 +56,7 @@ bool CDVDAudioCodecFFmpeg::Open(CDVDStreamInfo &hints, CDVDCodecOptions &options
AVCodec* pCodec;
m_bOpenedCodec = false;
- if (!m_dllAvUtil.Load() || !m_dllAvCodec.Load())
+ if (!m_dllAvUtil.Load() || !m_dllAvCodec.Load() || !m_dllSwResample.Load())
return false;
@@ -118,10 +118,7 @@ void CDVDAudioCodecFFmpeg::Dispose()
m_pFrame1 = NULL;
if (m_pConvert)
- {
- m_dllAvCodec.av_audio_convert_free(m_pConvert);
- m_pConvert = NULL;
- }
+ m_dllSwResample.swr_free(&m_pConvert);
if (m_pCodecContext)
@@ -185,18 +182,18 @@ void CDVDAudioCodecFFmpeg::ConvertToFloat()
if(m_pCodecContext->sample_fmt != AV_SAMPLE_FMT_FLT && m_iBufferSize1 > 0)
if(m_pConvert && m_pCodecContext->sample_fmt != m_iSampleFormat)
- {
- m_dllAvCodec.av_audio_convert_free(m_pConvert);
- m_pConvert = NULL;
- }
+ m_dllSwResample.swr_free(&m_pConvert);
m_iSampleFormat = m_pCodecContext->sample_fmt;
- m_pConvert = m_dllAvCodec.av_audio_convert_alloc(AV_SAMPLE_FMT_FLT, 1, m_pCodecContext->sample_fmt, 1, NULL, 0);
+ m_pConvert = m_dllSwResample.swr_alloc_set_opts(NULL,
+ m_dllAvUtil.av_get_default_channel_layout(m_pCodecContext->channels), AV_SAMPLE_FMT_FLT, m_pCodecContext->sample_rate,
+ m_dllAvUtil.av_get_default_channel_layout(m_pCodecContext->channels), m_pCodecContext->sample_fmt, m_pCodecContext->sample_rate,
+ 0, NULL);
- if(!m_pConvert)
+ if(!m_pConvert || m_dllSwResample.swr_init(m_pConvert) < 0)
CLog::Log(LOGERROR, "CDVDAudioCodecFFmpeg::Decode - Unable to convert %d to AV_SAMPLE_FMT_FLT", m_pCodecContext->sample_fmt);
m_iBufferSize1 = 0;
@@ -204,12 +201,8 @@ void CDVDAudioCodecFFmpeg::ConvertToFloat()
- const void *ibuf[6] = { m_pFrame1->data[0] };
- void *obuf[6] = { m_pBuffer2 };
- int istr[6] = { m_dllAvUtil.av_get_bytes_per_sample(m_pCodecContext->sample_fmt) };
- int ostr[6] = { m_dllAvUtil.av_get_bytes_per_sample(AV_SAMPLE_FMT_FLT) };
- int len = m_iBufferSize1 / istr[0];
- if(m_dllAvCodec.av_audio_convert(m_pConvert, obuf, ostr, ibuf, istr, len) < 0)
+ int len = m_iBufferSize1 / m_dllAvUtil.av_get_bytes_per_sample(m_pCodecContext->sample_fmt);
+ if(m_dllSwResample.swr_convert(m_pConvert, &m_pBuffer2, len, (const uint8_t**)m_pFrame1->data, m_pFrame1->nb_samples) < 0)
CLog::Log(LOGERROR, "CDVDAudioCodecFFmpeg::Decode - Unable to convert %d to AV_SAMPLE_FMT_FLT", (int)m_pCodecContext->sample_fmt);
m_iBufferSize1 = 0;
@@ -218,7 +211,7 @@ void CDVDAudioCodecFFmpeg::ConvertToFloat()
m_iBufferSize1 = 0;
- m_iBufferSize2 = len * ostr[0];
+ m_iBufferSize2 = len * av_get_bytes_per_sample(AV_SAMPLE_FMT_FLT);
4 xbmc/cores/dvdplayer/DVDCodecs/Audio/DVDAudioCodecFFmpeg.h
@@ -25,6 +25,7 @@
#include "DllAvCodec.h"
#include "DllAvFormat.h"
#include "DllAvUtil.h"
+#include "DllSwResample.h"
class CDVDAudioCodecFFmpeg : public CDVDAudioCodec
@@ -47,7 +48,7 @@ class CDVDAudioCodecFFmpeg : public CDVDAudioCodec
AVCodecContext* m_pCodecContext;
- AVAudioConvert* m_pConvert;
+ SwrContext* m_pConvert;
enum AVSampleFormat m_iSampleFormat;
CAEChannelInfo m_channelLayout;
int m_iMapChannels;
@@ -66,6 +67,7 @@ class CDVDAudioCodecFFmpeg : public CDVDAudioCodec
DllAvCodec m_dllAvCodec;
DllAvUtil m_dllAvUtil;
+ DllSwResample m_dllSwResample;
void BuildChannelMap();
void ConvertToFloat();