Fix the PTS offset logic error when first reading a file with FFmpegReader #565
…der. Use the calculated 0 - PTS, unless it is too large (more than 1 second off from zero)
@ferdnyc Please take a look, and let me know that I'm not crazy, haha. I think this is much "saner" than before, and saner than not filtering out crazy values.
Codecov Report
@@             Coverage Diff             @@
##           develop     #565      +/-   ##
===========================================
+ Coverage    48.66%   49.88%   +1.22%
===========================================
  Files          129      129
  Lines        10043    10846     +803
===========================================
+ Hits          4887     5411     +524
- Misses        5156     5435     +279
I'll take a look! I'm on my phone now, so I only managed a quick perusal with too little context. One question, though: Why is there a GetVideoPTS() method, but then the audio computation just uses
@ferdnyc That method does a little bit of checking before deciding which timestamp to return (PTS or DTS in some situations)
In every situation where it's set, in fact. ...That feels off to me. I'm worried that I'm re-hashing #311 here, but... DTS is the decode timestamp, so by unilaterally overriding the presentation timestamp with it, isn't the code saying, "Always store these frames in the order they were decoded, even if we know they're out of order from how they should be shown"? That's how I read the code above, which sounds off to me. It's also very different from what

    switch (d->avctx->codec_type) {
        case AVMEDIA_TYPE_VIDEO:
            ret = avcodec_receive_frame(d->avctx, frame);
            if (ret >= 0) {
                if (decoder_reorder_pts == -1) {
                    frame->pts = frame->best_effort_timestamp;
                } else if (!decoder_reorder_pts) {
                    frame->pts = frame->pkt_dts;
                }
            }
            break;
        case AVMEDIA_TYPE_AUDIO:
            ret = avcodec_receive_frame(d->avctx, frame);
            if (ret >= 0) {
                AVRational tb = (AVRational){1, frame->sample_rate};
                if (frame->pts != AV_NOPTS_VALUE)
                    frame->pts = av_rescale_q(frame->pts, d->avctx->pkt_timebase, tb);
                else if (d->next_pts != AV_NOPTS_VALUE)
                    frame->pts = av_rescale_q(d->next_pts, d->next_pts_tb, tb);
                if (frame->pts != AV_NOPTS_VALUE) {
                    d->next_pts = frame->pts + frame->nb_samples;
                    d->next_pts_tb = tb;
                }
            }
            break;

They don't even touch the packet timestamps at all, and they don't look at DTS unless they're explicitly prevented from reordering frames by That makes sense to me, as trying to reorder packets before they're decoded into frames feels like rearranging deck chairs on a SCUD missile. You know the thing's going to be thrown away when it's decoded into a more usable format, so why even mess with it? From the If we just hand every packet to
I love it! I think your concerns and suggestions are worth investigating for sure! However, for this PR, I'm fixing the really broken logic back to what I initially intended. I'll add a TODO above this code with some links back to your suggestion, as I'm sure we'll revisit this again. Thx!
Use the calculated 0 - PTS, unless it is too large (more than 1 second off from zero). This is related to a previous PR (with much discussion): #311. This PR fixes the regression related to limiting the offset, and I think it makes more sense than previous suggestions.

The PTS offset was always designed as a quick way to "zero" out the video PTS and audio PTS values, so we could calculate frame numbers based on the starting PTS values, etc. But in my testing, some videos are encoded with negative timestamps. This is especially common with audio tracks, where the audio starts before the 1st video frame. OpenShot is not really designed to deal with this, and it causes us to search for matching packets which never come. This could (and does) introduce some amount of audio drift, depending on how negative the audio timestamps are.

Also, in my testing, I've come across videos that have garbage timestamps (usually negative values), for example -9999999999999999, and then the rest of the packets have normal PTS values. So, this is the background on why I want to limit the offset to some reasonable amount (-1 to +1 seconds in timebase units).
I've done some testing on this with my crazy test videos, and this seems to work great.
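For anyone skimming, here is a rough sketch of the rule described above. It is not the code in this PR: `TimeBase` and `ClampedPtsOffset` are hypothetical names made up for illustration, and falling back to an offset of zero when the first PTS is out of range is an assumption (the description only says the calculated value is not used in that case).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a stream timebase (seconds per tick = num / den).
struct TimeBase {
    int64_t num;
    int64_t den;
};

// Hypothetical helper: returns the offset to add to every PTS so the stream
// starts at zero. If the first PTS is more than one second away from zero,
// it is treated as garbage and the offset is 0 (assumed fallback).
int64_t ClampedPtsOffset(int64_t first_pts, TimeBase tb) {
    assert(tb.num > 0 && tb.den > 0);

    // Number of timebase ticks in one second (integer division).
    const int64_t one_second = tb.den / tb.num;

    // Check the range before negating, so extreme values cannot overflow.
    if (first_pts < -one_second || first_pts > one_second) {
        return 0;  // ignore the PTS, it looks invalid
    }
    return 0 - first_pts;
}
```

With a 1/90000 timebase, a first PTS of -3000 would give an offset of 3000, while a garbage value such as -9999999999999999 would give 0.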