Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bytesPerSample or sampleBitDepth return confusing results for audio #36

Closed
pps83 opened this issue Mar 28, 2018 · 12 comments
Closed

bytesPerSample or sampleBitDepth return confusing results for audio #36

pps83 opened this issue Mar 28, 2018 · 12 comments

Comments

@pps83
Copy link

pps83 commented Mar 28, 2018

I'm using avcpp and I really like it!
I'm decoding mp4 (h264+acc) and aac decoder returns AV_SAMPLE_FMT_FLTP (planar floats). In my case I have stereo aac and when I get a decoded audio av::AudioSamples frame values that I get for samples count size etc are all confusing.

these are the values that I have:

aframe.raw.nb_samples 1024
aframe.channelsCount() => 2
aframe.size() => 16384
aframe.sampleFormat().bytesPerSample() => 1 (?!)
aframe.sampleBitDepth() => 8 (?!)
aframe->linesize[0] = 8192

aframe.raw.buf[0] and buf[1] both have size 8192, however, looking at the memory in debugger it seems that samples actually take half of that buffer. Either it's ffmpeg allocates more (padding, rounding) or it's a bug? For 1024 FLTP sample it should be 4K in each plane or total 8K for 2 channels.

IMO, aframe.size() should return size of actual samples stored, and but currently it seem to return size of allocated memory. In any case, bytes per sample seems to be completely wrong there (it should be 32 or 64 if a sample is counted for all channels).

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

@pps83 thanks for report! Could you please provide media sample? Just for local testing.

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

According size() call. It just accumulate buf[x]->size and relative this field. Seems that new API call that make difference between actual size and capacity is needed. I know, that best solution have a pair size()/capacity() like std::vector, but it can break existing code silently.

h4tr3d added a commit that referenced this issue Mar 28, 2018
av::AudioSamples::sampleBitDepth()

av::SampleFormat::bytesPerSample() always return 1 instead of actual
sample size.

Ref #36
@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

Either it's ffmpeg allocates more (padding, rounding) or it's a bug?

Not sure. Sample data is welcomed.

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

@pps83 could you please check 21018d2.

Actual data size should be calculated like:

auto planeSize = aframe.sampleFormat().bytesPerSample() * aframe.samplesCount();
auto totalSize = planeSize * aframe.channelsCount();

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

Either it's ffmpeg allocates more (padding, rounding) or it's a bug?

Not sure. Sample data is welcomed.

According FFmpeg code, it store FULL buffer size at the linesize[0]:

ret = av_samples_get_buffer_size(&frame->linesize[0], channels,
                                                   frame->nb_samples, frame->format,
                                                   align);

And allocate it for each plane:

frame->buf[i] = av_buffer_alloc(frame->linesize[0]);

For other details, look into static int get_audio_buffer(AVFrame *frame, int align) at the libavutil/frame.c

From FFmpeg docs:

     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.

@pps83
Copy link
Author

pps83 commented Mar 28, 2018

I pulled all the change and it does not compile for me:

avcpp\src\audioresampler.cpp(11): error C2512: 'av::SampleFormat': no appropriate default constructor available

Also, I build with VS2015 and there is a few things that do not compile. I had to make a few changes:
pps83@717efdb

(I left some comments there).

@pps83
Copy link
Author

pps83 commented Mar 28, 2018

after I made latest code compile bytesPerSample and sampleBitDepth seem to return proper results now.

doese bytesPerSample come directly from avcodec api? It's common for sound libs for bytes per sample to return channles * sampleSize. Perhaps in fmod it's like that (but I might be wrong).
regarding planar audio and linesize[0]: I know all of that. Still what I see kind of strange, all these linesize seem to be pointless (as they do not seem to have any relationship to actual samples stored). It's either a bug in ffmpeg, bug in aac decoder, or weird api that makes these values pointless and actual data sizes have to be calculated from samples count, number of channels and size of each sample.

Clip that I used is temporarily committed to https://github.com/pps83/avcpp/tree/master3 (RobloxAnthemVideo384.mp4)

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 28, 2018

doese bytesPerSample come directly from avcodec api?

Short answer: yes.

It's common for sound libs for bytes per sample to return channles * sampleSize.

I deal with ALSA directly, UAC (USB Audio Class): in all this places you should calc full size like: get_bytes_per_sample_for_format(SOME_FORMAT_LIKE_S16BE) * channels * samples_count. Can you provide some examples of described APIs?

It's either a bug in ffmpeg, bug in aac decoder, or weird api that makes these values pointless and actual data sizes have to be calculated from samples count, number of channels and size of each sample.

I have no complete answer. At least, it is not a bug in AAC: I describe above common utility code that uses to allocate buffers for audio AVFrame in all parts of FFmpeg. It is not a bug, but non-optimized way. In any case, you MUST make calculation in way, that described by me: AVFrame's and AVPacket's now reference-counted and can be reused in unpredictable way, so, real buffers can store more data that currently avail for give buffer. And, this buffer sizes like std::vector::capacity. I think, that I will provide some API that helps to get actual data size in buffer.

Clip that I used is temporarily committed

Thanks, I take it.

Also, I build with VS2015 and there is a few things that do not compile. I had to make a few changes:

According changes with size_t... Could you please revert it and try to add:

#include <stddef.h>

at the top of frame.h? GCC 7.3.1 has no problems with such code.

@pps83
Copy link
Author

pps83 commented Mar 29, 2018

Can you provide some examples of described APIs?

I think it was in fmod that it was like that, I don't remember where it was. I tried to find in docs, cannot find sample size related stuff. Perhaps it's in their headers.

regarding size_t changes. I didn't have issues with size_t, but I don't see a point to use size_t there. I had lots of compilation warnings because of signed/unsigned mismatch, so I patched it. size_t -> unsigned wasn't really needed. IMO it's doesn't make sense to use size_t or ssize_t for things that have very small range of values like stream count etc; ffmpeg also uses plaint int or unsigned int for these things and avcpp should also do the same.

all the changes done in that referenced fork are done for a quick one day project, that's why I have windows.h there and some other unrelated mess.

@pps83
Copy link
Author

pps83 commented Mar 29, 2018

Invalid data when decoding a youtube video.

This is the youtube video that triggers this problem. Sample ffmpeg cmd line:

ffmpeg -i "https://redirector.googlevideo.com/videoplayback?expire=1522321653&c=WEB&mime=video%2Fmp4&pl=33&itag=22&dur=212.091&source=youtube&mm=31%2C29&gcr=us&lmt=1518411497890374&fvip=4&ei=lXS8WpvWMpOr_APYwb_YAw&mt=1522299938&id=o-AFH2YZN9qTEE4epJvePQixNGBWLcAZt1KCkEdKE8EL0W&mn=sn-n4v7knll%2Csn-n4v7sn7s&sparams=dur%2Cei%2Cgcr%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cexpire&key=yt6&ip=2600%3A3c01%3A%3Af03c%3A91ff%3Afe24%3Ab564&ms=au%2Crdu&mv=m&requiressl=yes&ipbits=0&initcwndbps=342500&ratebypass=yes&signature=3A7DF33468445EE8A6240F20413A70936433173A.36EE2C51AE0A26D1091E2A805FB07371ED16A18D" -s 384x216 -crf 15 -c:a copy YT.mp4

this encodes and completes without any errors. However, when I run it through avcpp I get Invalid data found when processing input return from audio decoder at the end of the movie. Last audio packet that goes to decoder produces this error. The packet does not look suspicious or broken.

@h4tr3d
Copy link
Owner

h4tr3d commented Mar 29, 2018

Invalid data when decoding a youtube video.

create new task, please.

@h4tr3d h4tr3d closed this as completed Mar 29, 2018
@pps83
Copy link
Author

pps83 commented Mar 30, 2018

.... Can you provide some examples of described APIs?

https://www.fmod.com/docs/api/content/generated/FMOD_CREATESOUNDEXINFO.html

For example decodebuffersize field is described as:

[w] Optional. Specify 0 to ignore. For streams. This determines the size of the double buffer (in PCM samples) that a stream uses. Use this for user created streams if you want to determine the size of the callback buffer passed to you. Specify 0 to use FMOD's default size which is currently equivalent to 400ms of the sound format created/loaded.

I assure you that this parameter means channel-samples, as I had issues with that and once I converted to calculations to treat all samples as channel-samples all issues got sorted out. Similarly in many other cases it's not even specified, but assumed that all samples are channel-samples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants