Universal SoundSource for FFmpeg 4.x #1356

uklotzde · 2017-09-28T19:34:09Z

Open Issues

The duration of MP3 files is imprecise as reported by FFmpeg in AVStream. This value might simply be calculated from the average bitrate and the length of the file. Workaround: Parse the whole MP3 file upfront for correctly determining the exact duration. Parsing should at least be faster than decoding. A similar strategy is used in SoundSourceMP3 already.
Duration of MP3 files is reported as a multiple of the MP3 frame size, i.e. 1152 samples. If the last MP3 frame is incomplete then we will simply add silence. This is acceptable for DJing, no action needed.
Reduce priority from HIGHER to LOWER before merging. This will ensure that the existing SoundSources with priority DEFAULT are still favoured.
Fix SoundSourceProxyTest for file cover-test-itunes-12.3.0-aac.m4a
Analyze and fix warning for MP3 files: Warning [AnalyzerThread 0 #1]: SoundSourceFFmpeg4 - Overlapping sample frames in the stream: [-1105 -> 0)
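
To illustrate the frame-size point above: if durations are reported in whole MP3 frames, the reported length is the true sample count rounded up to the next multiple of 1152, and the difference is the silence that gets appended. The following C++ sketch only shows that arithmetic. The function names are hypothetical and not part of Mixxx or FFmpeg.

```cpp
#include <cassert>
#include <cstdint>

// Samples per channel in one MP3 frame, as stated in the open issue above.
const std::int64_t kMp3FrameSize = 1152;

// Rounds a sample count up to the next multiple of the frame size.
inline std::int64_t roundUpToFrameSize(
        std::int64_t numSamples,
        std::int64_t frameSize = kMp3FrameSize) {
    assert(numSamples >= 0);
    assert(frameSize > 0);
    return ((numSamples + frameSize - 1) / frameSize) * frameSize;
}

// Number of silent samples needed to complete the last frame.
inline std::int64_t silencePadding(
        std::int64_t numSamples,
        std::int64_t frameSize = kMp3FrameSize) {
    return roundUpToFrameSize(numSamples, frameSize) - numSamples;
}
```

For a track with 1000 audible samples in its last frame this gives 152 samples of padding, which at 44.1 kHz is roughly 3.4 ms.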

Preamble

~~Based on PR AudioSource v2 API #1317 (AudioSource v2 API) -> will be rebased frequently on this branch~~
Enabled by compiling with ~~ffmpeg31=1~~ ffmpeg4=1
~~NOT supported by Ubuntu Trusty -> disabled in CI builds~~

This one was really a challenge!! But I hope it was worth it.

The immediate reward is reliable M4A and ALAC decoding for (almost) everyone. In the long term I expect this to become the main SoundSource of Mixxx, replacing most of the custom implementations. The first decoders that could (or should?) be declared as legacy are SoundSourceM4A, SoundSourceFFmpeg, ~~SoundSourceMP3~~ (still needed for Windows), and ~~SoundSourceWV~~ (still needed for Windows).

I've tested it with all the corrupt files that I have collected over time. None of them is able to crash Mixxx or trigger a debug assertion. Even better: Some of them no longer need to be considered as corrupt!

Notes:

Uses whitelisting: Only registered for file types that pass all our tests
Chosen with HIGHER priority than DEFAULT to select it before any existing SoundSources -> safe because of whitelisting
OggVorbis decoding is disabled because of failing tests: FFmpeg #3825: Wrong PTS in Ogg Vorbis file
FLAC decoding is disabled because FFmpeg sometimes fails after seeking -> needs to be discussed with the FFmpeg developers

Fixes the following bugs:
https://bugs.launchpad.net/mixxx/+bug/1336982
https://bugs.launchpad.net/mixxx/+bug/1665369

Essentially finished, but this piece of code should be tested thoroughly!

[Update 2017-10-02]
With my revised implementation it should now be much easier to follow and understand the code. It even reveals known bugs in FFmpeg that prevent us from enabling OggVorbis and FLAC support.

[Update 2017-10-02]
Added a workaround for decoding of AAC files that have been encoded with iTunes 12.3.0! I discovered this issue after fixing my calculation of the frame index range that didn't take the start time of a stream into account. The workaround is effective for all AAC files with a start time > 0, because we are not able to distinguish them properly. The only drawback is that some samples at the end might be cut off.
[Libav-user] Wrong duration of AAC files encoded by iTunes 12.3.0
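
To make the start-time issue concrete, here is a minimal sketch of how a frame index range could be derived from a stream's start time and duration, both given in time base units. The struct and function names are hypothetical, the rounding mode is an assumption, and the actual code in this PR may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a rational time base (seconds per tick).
struct TimeBase {
    std::int64_t num;
    std::int64_t den;
};

// Half-open range of sample frame indices: [start, end).
struct FrameRange {
    std::int64_t start;
    std::int64_t end;
};

// Converts a timestamp in time base units to a sample frame index,
// rounding to the nearest frame. No overflow protection; FFmpeg offers
// av_rescale_q() for the real thing.
inline std::int64_t toFrameIndex(
        std::int64_t ts, TimeBase tb, std::int64_t sampleRate) {
    assert(tb.num > 0 && tb.den > 0 && sampleRate > 0);
    const std::int64_t scaled = ts * tb.num * sampleRate;
    const std::int64_t half = tb.den / 2;
    return (scaled >= 0 ? scaled + half : scaled - half) / tb.den;
}

// The range starts at the stream's start time, not at 0.
inline FrameRange frameIndexRange(
        std::int64_t startTime, std::int64_t duration,
        TimeBase tb, std::int64_t sampleRate) {
    assert(duration >= 0);
    return FrameRange{
            toFrameIndex(startTime, tb, sampleRate),
            toFrameIndex(startTime + duration, tb, sampleRate)};
}
```

With a time base of 1/1000 s, a sample rate of 44100 Hz, a start time of 10 and a duration of 1000, the range is [441, 44541) rather than [0, 44100).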

[Update 2018-02-08]
The handling of start_time and duration in AVStream seems to have changed for FFmpeg 4.x. I adjusted the implementation and all tests now pass. The workaround for AAC files encoded with iTunes 12.3.0 is no longer necessary, but we should keep the test files.

illuusio · 2017-10-06T07:50:35Z

I haven't read the code yet, and I noticed that the integration isn't working. What are the main benefits of creating a new FFmpeg SoundSource instead of just refactoring the old one? What are the main things that should be tested?

uklotzde · 2017-10-06T14:45:17Z

There is great demand for M4A/ALAC playback from our user base. Furthermore it is only a question of time until ancient libraries like FAAD2 and MAD will no longer be maintained.

SoundSourceFFmpeg does not pass our rigorous tests and also shows audible artefacts when playing and seeking randomly in the test files. I was not able to understand how the existing code works and how to fix those issues.

These were the decisions for adding a new SoundSourceFFmpeg31 implementation:

No more backward compatibility with older FFmpeg versions up to 3.0
Whole new send/receive/consume loop without any internal caching except buffering of received but not consumed samples
Leverages FFmpeg's resampling/reformatting capabilities instead of using a home brew solution

At least the code for opening files was partially reusable. Reimplementing the decoding loop seemed to be a task of moderate complexity, but I must admit that it was much harder than I expected.

illuusio · 2017-10-08T09:51:28Z

OK. There is much room for improvement. Making FFmpeg work for every codec and container is harder than it should be. If this is supposed to be the one, all other SoundSources should be dropped soon, because people tend to keep using the old ones instead of moving to this one. Then there is no reason to port them to the new SoundSource code.
There should be something like cURL's easy interface to make decoding simpler.

daschuer · 2017-10-08T21:21:42Z

Do we have CPU load measurements comparing MAD with FFmpeg for MP3 encoding?
If there are only negligible differences, it sounds reasonable to switch to FFmpeg.

illuusio · 2017-10-09T06:37:47Z

@daschuer even if the CPU load doubles for encoding, moving to the solid encoder interface provided by FFmpeg is worth it. I haven't found a difference. They use libmp3lame, so you can test with lame and see how it performs, as they don't seem to have a native version.

daschuer · 2017-10-09T17:30:56Z

I have found some benchmark results:
https://multimedia.cx/eggs/gcc-of-multimedia/
Here ffmpeg is slightly faster than libmad.
In most other cases ffmpeg wins by a clear margin.

Can we apply these results to our case?

illuusio · 2017-10-10T07:22:56Z

I've looked at the code a little. @uklotzde are you trusting FFmpeg for the current timestamp and length? In my experience, and this is why things are done the way they are in the old FFmpeg SoundSource, in most cases (like MP3 VBR) DTS/PTS is just a good guess, like you say for Vorbis.
The only way I could get it working was to start from the beginning, seek to the correct point and then start reading from there. That was the whole point of caching, because people tend to just seek back and forth by about 5 seconds to find the right place, not to hunt for hidden messages by playing the whole song backwards.

illuusio · 2017-10-10T07:23:50Z

@daschuer it depends on how extreme you want to go with the FFmpeg build options.

uklotzde · 2017-10-10T15:50:25Z

@illuusio FFmpeg might use dts internally for determining the correct seek position. The SoundSource uses pts from the decoded frames for determining the correct position. We are seeking with the flag AVSEEK_FLAG_BACKWARD which should guarantee that we always land before the target position. Of course, we might need to skip some samples before reaching our target position.
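
The skipping step can be sketched as follows. This assumes that the pts of each decoded frame has already been converted to a sample frame index. The type and function names are hypothetical and only illustrate the arithmetic, not the code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of one decoded frame.
struct DecodedFrame {
    std::int64_t firstSample; // index of the first sample, derived from pts
    std::int64_t numSamples;  // number of samples per channel in this frame
};

// After a backward seek the decoder lands at or before the target.
// Returns how many leading samples of this frame must be discarded
// before the target position is reached.
inline std::int64_t samplesToSkip(
        const DecodedFrame& frame, std::int64_t targetSample) {
    assert(frame.numSamples >= 0);
    if (targetSample <= frame.firstSample) {
        return 0; // frame starts at or after the target, keep everything
    }
    // Discard at most the whole frame, the rest is skipped in later frames.
    return std::min(targetSample - frame.firstSample, frame.numSamples);
}
```

A frame starting at sample 1000 with 1152 samples and a target of 1500 would lose its first 500 samples. If the target were 3000 the whole frame would be discarded and skipping would continue with the next one.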

I added a VBR test file and noticed that there was a bug when the audio file is actually shorter when decoded than initially reported by FFmpeg. Fixed.

illuusio · 2017-10-11T06:45:15Z

Yes @uklotzde I know the difference between PTS and DTS. In older versions they are mostly the same and I don't see why they wouldn't be in newer versions.
The problem that I originally faced with AVSEEK_FLAG_BACKWARD (or any other flag) is that when seeking in MP3 you get a bogus PTS which isn't the same as the one you get if you start reading from the start. As MP3 has a fixed frame size it should be, but at least in the sub-3.1 world it isn't. If that's fixed then everything is fine.
This is also a problem with containers like WMV(2), where FFmpeg sucks badly. The frame size isn't fixed, it can be anything up to 64 KB (divided by 1024), and seeking is just a guess that doesn't land on the same byte every time.

illuusio · 2017-10-12T06:18:42Z

Trying this out. Is there an option to disable all the other SoundSources to make FFmpeg rule them all?

uklotzde · 2017-10-12T14:57:45Z

The priority is already set to HIGHER, that's sufficient. The whitelisting is controlled by getSupportedFileExtensions(), OggVorbis and FLAC are still disabled.
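
How priority and whitelisting interact can be sketched like this. It is a simplified model with hypothetical types, not the actual provider registry in Mixxx: a provider is only considered for extensions on its whitelist, and among those the highest priority wins.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical priority levels, ordered from lowest to highest.
enum class Priority { Lower = 0, Default = 1, Higher = 2 };

// Hypothetical provider description.
struct Provider {
    std::string name;
    Priority priority;
    std::set<std::string> extensions; // whitelist of supported file types
};

// Returns the name of the provider with the highest priority that has the
// extension on its whitelist, or an empty string if none supports it.
inline std::string selectProvider(
        const std::vector<Provider>& providers, const std::string& ext) {
    assert(!ext.empty());
    const Provider* best = nullptr;
    for (const Provider& p : providers) {
        if (p.extensions.count(ext) == 0) {
            continue; // not whitelisted for this file type
        }
        if (best == nullptr || p.priority > best->priority) {
            best = &p;
        }
    }
    return best != nullptr ? best->name : std::string();
}
```

In this model a provider with priority Higher takes over every whitelisted extension, while the same provider with priority Lower only handles extensions that no other provider supports.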

illuusio · 2017-10-13T08:28:17Z

I get this often. The first rows are normal FFmpeg warnings, but do the rest indicate that something is wrong?

Estimating duration from bitrate, this may be inaccurate
[New Thread 0x7ffee63ec700 (LWP 8710)]
[mp3 @ 0x7ffef40ece00] Estimating duration from bitrate, this may be inaccurate
[mp3 @ 0x7ffef428a3a0] Header missing
Warning [AnalyzerQueue 1]: SoundSourceFFmpeg31 - avcodec_send_packet() failed: No description for error code (-1094995529) found 
Warning [AnalyzerQueue 1]: AnalyzerQueue - Aborting analysis after failed to read sample data from "some.mp3" : expected frames = [21143552 -> 21147648) , actual frames = [21143552 -> 21143808)

Or is this what the bug report on the FFmpeg mailing list is about?
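
A note on the numeric code in the log above: FFmpeg builds many of its own error codes as negated four-character tags (the FFERRTAG macro in libavutil/error.h), so the number can be decoded by hand. This is a minimal sketch with a hypothetical helper name. Codes that wrap POSIX errno values are not tags and will not decode to readable text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Decodes a negated four-character tag into its characters.
// The tag is stored with the first character in the lowest byte.
inline std::string errorTag(std::int32_t code) {
    assert(code < 0);
    const std::uint32_t tag =
            static_cast<std::uint32_t>(-static_cast<std::int64_t>(code));
    std::string result(4, '\0');
    for (int i = 0; i < 4; ++i) {
        result[i] = static_cast<char>((tag >> (8 * i)) & 0xFF);
    }
    return result;
}
```

Applied to -1094995529 this yields the tag INDA, which is how AVERROR_INVALIDDATA is defined. That would be consistent with the preceding "Header missing" message, although it does not tell whether the file or our usage of FFmpeg is at fault.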

ronso0 · 2017-10-13T10:13:57Z

I was curious if this PR would fix Bug 1669500 (scratching backwards over loop_in disables loop), so I built with ffmpeg sources for Trusty (3.2.4) from here.

And indeed it does fix it, at least for some wav and mp3 files.
m4a files that failed with master, fail with this branch as well.
no ffmpeg-related entries in log for those files.

uklotzde · 2017-10-13T14:02:15Z

@illuusio Decoding of that MP3 file needs to be analyzed in detail by tracing all packets and frames. It might either be FFmpeg itself or our usage of FFmpeg that causes those failures. Even if the file is corrupt, decoding should not produce unexpected errors.

It is also crucial to test with a clean build. I experienced strange errors with partial builds after switching branches and don't trust scons.

illuusio · 2017-10-30T06:37:33Z

Is there something that needs particular testing? This works as expected with simple playing patterns.

uklotzde · 2017-10-30T10:37:28Z

@illuusio Known issue: The duration for MP3 is imprecise.

Carl Eugen from FFmpeg recommended that we should not rely on the duration reported by the stream! If we need to know the exact duration upfront we should instead parse the file from beginning to end, at least for MP3. They are mostly dealing with infinite streams and don't need to know the exact length even if it may exist.

Parsing should be faster than decoding, but still needs to read the whole file once. And I haven't figured out yet how to do it correctly and efficiently.

illuusio · 2017-10-30T11:20:10Z

@uklotzde the fastest way I figured out is in soundsourceffmpeg. One just reads the frames but does not decode them (which can also lead to an incorrect place with VBR).
After that you can rely on dts/pts. That was also one point which led me not to use the seeking functions. Not many people use WMA2, but the estimated length of a WMA2 file is just a very rough estimate and you can't rely on it.
As wonderful as FFmpeg is, sometimes I wonder whether we should use the VLC library for reading and let them do the FFmpeg stuff.

illuusio · 2018-03-05T09:18:18Z

What is the current status of this? I have used it a few times and it worked OK. Is there something to test?

uklotzde · 2018-03-06T21:22:57Z

There is still one open issue that needs to be solved before integrating this PR: We need a quick scan through encoded files to reliably determine the exact number of sample frames = sum of all encoded frame lengths.
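
The difference between the two ways of obtaining the length can be sketched as follows. The exact count sums the lengths of all encoded frames found while scanning the file, whereas the estimate only uses the file size and an average bitrate. Both function names are hypothetical, and how the per-frame lengths are obtained (for example from packet durations while reading without decoding) is left open here.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Exact length: sum of the sample frame counts of all encoded frames.
inline std::int64_t exactFrameCount(
        const std::vector<std::int64_t>& encodedFrameLengths) {
    return std::accumulate(
            encodedFrameLengths.begin(),
            encodedFrameLengths.end(),
            std::int64_t{0});
}

// Rough length: derived from file size and average bitrate only.
// For VBR files the average bitrate is itself a guess, so this can be off.
inline std::int64_t estimatedFrameCountFromBitrate(
        std::int64_t fileSizeBytes,
        std::int64_t bitsPerSecond,
        std::int64_t sampleRate) {
    assert(fileSizeBytes >= 0);
    assert(bitsPerSecond > 0 && sampleRate > 0);
    return fileSizeBytes * 8 * sampleRate / bitsPerSecond;
}
```

The scan still has to touch every frame header once, which matches the expectation above that parsing is cheaper than decoding but not free.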

illuusio · 2018-07-22T18:30:11Z

Sometimes I feel we need a mixxx-next which contains experimental stuff like this, so that it doesn't get stuck.

daschuer · 2018-10-02T06:22:52Z

src/sources/soundsourceffmpeg31.cpp

+
+SoundSourceProviderPriority SoundSourceProviderFFmpeg31::getPriorityHint(
+        const QString& /*supportedFileExtension*/) const {
+    // TODO: Increase priority to HIGHER if FFmpeg should be used as the


Can we phrase this a bit more weakly? I know there are different opinions that we have to discuss, but this should not block this PR.

How about this:
// FFmpeg has priority LOWER so that it is used as a fallback in case Mixxx has no implementation of its own. Increase the priority to HIGHER if you want FFmpeg to decode all files.

IMHO this should stay at LOWER forever. If we want FFmpeg to be used, we can remove the other implementation.

In case of mp3 and libmad, I am satisfied with the current implementation. While libmad is rather old, we have ironed out many cases and are rejecting files that are played with heavy sound artefacts in ffmpeg.

I agree.

Unfortunately the API of FFmpeg changes constantly, combined with opaque and almost erratic behavior. It's a nightmare! I guess maintenance could become a never-ending story and I'm not willing to tweak this piece of code constantly.

WaylonR · 2018-12-19T13:13:56Z

need to update build/depends.py and build/features.py
depends.py: needs soundsourceffmpeg31.cpp added in sources.
features.py: needs path fixed.

uklotzde · 2018-12-19T13:31:20Z

"Enabled by compiling with ffmpeg31=1"

WaylonR · 2018-12-19T13:54:51Z

.. okay, don't need in depends.py, but without the path fix in features.py, will error.

illuusio · 2019-09-05T09:17:48Z

I did a first read through the code. There are some rough points in FFmpeg that I have been banging my head against while writing code for the FFmpeg 4.x series. If they bug me enough I'll report them. If not, they work and I'll let them be, as the code is solid as it is now.

illuusio

Overall, the bigger problem was that the methods are too long. They should be cut down into more easily understandable blocks.

illuusio · 2019-08-11T15:46:41Z