Skip to content
This repository

video playback enhancements for Linux part1: vdpau #870

Closed
wants to merge 11 commits into from
Rainer Hochecker
Collaborator

I have been working on this for quite a while and built dozens of prototypes. Now I think I have something worth for you to review. It is actually an almost complete re-write of vdpau. In order to get the most out of it I had to touch player and rederer as well.

  • Performace improvements
    Separate threads for deinterlacing, opengl interoperation, frame buffering in renderer. This maximizes time for decoding and deinterlacing.

  • Improved deinterlacing
    The advanced methods now use 4 past and 2 future fields, instead of 2 past and 1 future. I have stutter free playback of 1080i@50, temporal/spacial deinterlacing on ION2

  • Fix stuttering caused by overwrite (early reuse) of video surfaces

Joakim Plate
Collaborator

You really need to split this up in separated changes. It's quite impossible to review like this.

For example changes in renderer like:
Add feature more than 3 buffers
Use feature more than 3 buffers for vdpau

Joakim Plate
Collaborator

For example, the whole new logic for dropping support should absolutely be a separate commit.

Rainer Hochecker
Collaborator

I have split the commits.

Joakim Plate elupus commented on the diff May 29, 2012
xbmc/cores/VideoRenderers/LinuxRendererGL.cpp
((112 lines not shown))
  2413
+    {
  2414
+      planes[p].pixpertex_x = 1;
  2415
+      planes[p].pixpertex_y = 1;
  2416
+    }
  2417
+  }
  2418
+  // crop
  2419
+//  m_sourceRect.x1 += vdpau->crop.x1;
  2420
+//  m_sourceRect.x2 -= vdpau->crop.x2;
  2421
+//  m_sourceRect.y1 += vdpau->crop.y1;
  2422
+//  m_sourceRect.y2 -= vdpau->crop.y2;
  2423
+
  2424
+  // set textures
  2425
+  fields[1][0].id = vdpau->texture[0];
  2426
+  fields[1][1].id = vdpau->texture[2];
  2427
+  fields[2][0].id = vdpau->texture[1];
  2428
+  fields[2][1].id = vdpau->texture[3];
2
Joakim Plate Collaborator
elupus added a note May 29, 2012

Where is the progressive frame? fields[0] should be the progressive scaled image.

Rainer Hochecker Collaborator
FernetMenta added a note May 29, 2012

GL_NV_vdpau_interop does not expose a progressive frame, only fields.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
xbmc/cores/VideoRenderers/OverlayRenderer.cpp
... ...
@@ -90,7 +90,7 @@ long COverlayMainThread::Release()
90 90
 CRenderer::CRenderer()
91 91
 {
92 92
   m_render = 0;
93  
-  m_decode = (m_render + 1) % 2;
  93
+//  m_decode = (m_render + 1) % 2;
4
Joakim Plate Collaborator
elupus added a note May 29, 2012

Eh?

Rainer Hochecker Collaborator
FernetMenta added a note May 29, 2012

I should have deleted this line. m_decode is set by render manager. It calls CRenderer::SetBuffer(int idx)

Joakim Plate Collaborator
elupus added a note May 29, 2012

On this note.. My thoughts here has always been to separate m_decode and m_render. dvdplayer decodes, calls flippage, it signals app thread new frame, but doesn't make the new frame available for render yet. FlipPage blocks until a new decode buffer i available.

This mean that app thread can keep rendering previous frame until it request a new frame. When it request a new frame it check if a frame is available after decode. If there is, it swaps this in and make the previous render frame available for decode (this wakes up decode thread that was waiting for a buffer).

In a sense it's a buffer system where each buffer can have 3 states. Idle, Used for render or used for decode.

Rainer Hochecker Collaborator
FernetMenta added a note May 29, 2012

Not sure about this comment, your description is exactly how it works.
We want to keep overlays in sync with video, hence render manager controls the buffer states. dvdplayer video calls CXBMCRenderManager::WaitForBuffer asking for an idle buffer. It keeps decoding until no buffer is available, then it blocks waiting for a flip event.
App thread calls CXBMCRenderManager::HasFrame asking for "used for decode" buffers ready for render. It sets the buffer state by calling CXBMCRenderManager::FlipRenderBuffer and renders the buffer. In case of HasFrame times out, it renders the previous frame.
App thread calls CXBMCRenderManager::NotifyDisplayFlip right after having swapped front/back buffers. This releases a rendered buffer to state idle.

Buffer states are implemented by a circular buffer. Check the comment in RenderManager.h

idle: all buffers between output and displayed
used for decode: all buffers between output and current
used for render: all buffers between current and displayed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Rainer Hochecker
Collaborator

@elupus
I have rebased and updated this PR. The first two commits are from another PR

Deleted user

The content you are editing has changed. Reload the page and try again.

ping

Sending Request…

Attach images by dragging & dropping or selecting them. Octocat-spinner-32 Uploading your images… Unfortunately, we don't support that file type. Try again with a PNG, GIF, or JPG. Yowza, that's a big file. Try again with an image file smaller than 10MB. This browser doesn't support image attachments. We recommend updating to the latest Internet Explorer, Google Chrome, or Firefox. Something went really wrong, and we can't process that image. Try again.

Cyberwizzard

Is this the fix for the VDPAU issue where at the start of playback only audio is played and XBMC hangs? If so, does it need additional testing?

escalade

Shouldn't this be integrated soon?

arnova
Collaborator

@elupus : What's the status of this? @FernetMenta : could you please rebase this PR on master?

davilla
Collaborator

too late for Frodo, this touches to many areas and beta3 should be the last beta.

arnova
Collaborator

That's a pity although I already suspected that. The reason is that currently using vdpau + h264 + upscaling to 1080p doesn't work properly on Ion (although I assume it's a more general nVidia issue), sometimes the video drops to a very low framerate, especially when dialogs are shown on top of a playing video. This PR fixes that. @elupus @FernetMenta : Is there anything that can be cherrypicked specifically for this issue from this PR perhaps for Frodo?

davilla
Collaborator

yes, a pity but last activity on it was a month ago. plenty of time for someone to have handled it then.

Rainer Hochecker
Collaborator

This is a complete redesign of vdpau, hence I don't think anything can be cherry picked. It's really a pity, I would rather not using XBMC at all on Linux than without those changes.

Memphiz
Owner

Frodo won't be our last release for sure ;). And that said - i'm using XBMC on linux with VDPAU since ages (without this patch) - so at least its not totally broken as is.

Lars Op den Kamp
Collaborator

well, it's broken enough. i haven;t used xbmc on nvidia and amd without @FernetMenta's patches since ages, as that's what makes it usable for pvr, so it's really a shame that this won't be in frodo (although completely understandable)

Peter Frühberger
Collaborator

There is a reason that OpenElec v2.0 shipped with the complete vdpau rewrite. This will also be our basis when going for the OpenElec 3.0 release. If you find some time, read the forum - a huge amount of people is upgrading their default xbmc(buntu) installation to the version with the @FernetMenta patches. That brings btw. other goods, like a XVBA implementation, that bases on the same changes and is therefore strongly coupled ...

arnova
Collaborator

Yeah it's a shame I didn't figure out any time sooner, what the cause of my problems was (and that this PR fixes it). The main reason as that the "scene" recently switched from xvid to h264 for SD releases and I therefor didn't use vdpau acceleration that often... I now also patched my htpc setup, but the users out there that don't have that ability are probably going to b*tch about that...

Wolfgang Schupp
Collaborator

Users already bitch about it all over the forums since Eden was released. Thats why we redirect them to FernetMentas branch(alias Xvba branch)

arnova
Collaborator

Then we might as well only release an ATI/Intel version of xbmcbuntu 12 and redirect them to OpenElec for nVidia....

Peter Frühberger
Collaborator

Erm. ATI does not work at all with mainline xbmc - the xvba-va-driver that is used here is not developed since ages. You get huge artefacts with specific H264 files and some VC1 files don't run at all. Stability is unbelievable bad - also there is to my knowledge no deinterlacing at all, which makes AMD useless for every PVR setup. Nothing can be done about this.

Cause of this @FernetMenta and I started writing XVBA, which replaces the buggy support for AMD via VAAPI. This is also integrated in OpenELEC 2.0 - here some mergendising: http://www.youtube.com/watch?v=jc10em9qQBI

Lars Op den Kamp
Collaborator

way to late to do something about this now. next chance will be frodo+1. maybe ati will have fixed some bugs with their driver at that time too :P

macmenot

well as 99% of Linux users will likely be installing from debian/ubuntu repositories/PPAs, and maybe some Archlinux PKGBUILD, is it worth talking to @amejia1 (e.g.,) about submitting this as a clean debian/patches/vdpau.patch on top of the 12.0 release build, for inclusion in the debian/ubuntu unstable / development repos (where the stable/testing repos would hold 12.0 release) ?

David Teirney
Collaborator

@FernetMenta, is it possible for this to be rebased against master? I'd like to try this PR out to see if it helps with the frequent "only audio starts playing and then XBMC locks up when stopping" problem we've been having playing recorded 1080i DVB-T content on Linux using VDPAU. The current PR doesn't cleanly apply to master. Thanks!

Rainer Hochecker
Collaborator

@dteirney I will do soon. In the meantime can you just try my master?

arnova
Collaborator

Shouldn't the PR's description be made more general? This doesn't only affect vdpau, right?

Rainer Hochecker
Collaborator

Other decoders can benefit from the buffering but in the PR it is only enabled for vdpau.

wmyrda

Seems like the time for merging could be now ;)

cc2cf05

Please take a look at this patch as like it have been said here time and again there is plenty of those that would welcome the patch in the official tree :)

arnova
Collaborator

@elupus: What's the status of this?
@FernetMeta: Please rebase

I would like to test this code by pulling it to my code. I never used GIt so How can I pull thing for testing since I currently work with master..

Thank you.

Martijn Kaijser

Github is not a support forum. Use Google.

ronie
Collaborator

@uNiversal
1) download PR as a patch: add .patch at the end of the url
2) apply patch to master: patch -p1 < 870.patch

davilla
Collaborator

follow the instructions when you click on the little i with a circle around it (green merge bar, left side)

Martijn Kaijser

@elupus @FernetMenta
Can we hit this thing moving again so we can start to sync OE upstream?

Ok Testing this PR against current mainline code and add this patch and compile. As ronie suggested
(13.0-ALPHA1 Git:20130209-20f2624)

Results are astounding, it removes the broken/incorrect VDPAU studio level conversion option from XBMC this alone is merge worthy,.

On a ION gen 1 the slight judder on panning is improved considerably (unnoticeable on any test files so far) on all vdpau accelerated content.
All judder when using Video Scaling other than bilinear is 100% gone with this as is without only bilinear worked.

@davilla @FernetMenta
idk what is holding this back from being merged, since this is working like butter.

If there is anything specific FernetMenta like me to test let me know.

uNi

Joakim Plate
Collaborator
arnova
Collaborator

+1 for getting this in, especially since other decoders can eventually also benefit from this.

xbmc/Application.cpp
((10 lines not shown))
2239 2234
 
2240  
-  TightConditionVariable<InversePredicate<int&> > cv(m_frameCond, InversePredicate<int&>(m_frameCount));
2241  
-  cv.wait(lock,timeout);
2242  
-  done = m_frameCount == 0;
  2235
+  if (!g_renderManager.HasFrame() && !m_frameEvent.WaitMSec(timeout))
  2236
+    return false;
5
Joakim Plate Collaborator
elupus added a note February 10, 2013

This is racy. frameEvent can be set right after the call to reset, then we miss it.

Rainer Hochecker Collaborator

If this is the case g_renderManager.HasFrame() returns true. With the old code application tracked frame counter and had to keep it in sync with renderManager. Now renderManager alone knows whether it has anything to render.

Joakim Plate Collaborator
elupus added a note February 10, 2013

Right.. but it should be a loop, and the the m_frameEvent() should be auto reset and never manually reset. Then there is no race.

Rainer Hochecker Collaborator

:) iirc I had a loop here in the last interation and you were not happy with it.
Need to think about it again.

Rainer Hochecker Collaborator

well, there can't be a race here. frameEvent is set by renderManager AFTER it has something to render. Even there was a context switch after reset(), HasFrame would return true.
I think auto reset make no sense here. RenderManager can set the event for e.g. 2 new frames while application is still in one cycle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
xbmc/cores/VideoRenderers/OverlayRenderer.cpp
((26 lines not shown))
166 172
 
167  
-  Release(m_buffers[m_decode]);
  173
+void CRenderer::ReleaseBuffer(int idx)
  174
+{
  175
+  CSingleLock lock(m_section);
  176
+  Release(m_buffers[idx]);
168 177
 }
4
Joakim Plate Collaborator
elupus added a note February 10, 2013

I don't really see the point of the changes to the overlay renderer. You are still going to render stuff in the correct order right? Just split the flip on the decoder side from the flip on the render side. Forcing caller to keep track of order of stuff seem wrong imho.

Rainer Hochecker Collaborator

What do you mean with "forcing the caller". RenderManager is in control of the buffers and keeps overlay in sync with video.

Joakim Plate Collaborator
elupus added a note February 10, 2013

Ok. i can live with it. But why do you need the ReleaseBuffer() function? You can't control the what it will render next with this API anyway. So why can't the Flip call just release previous render buffer?

Rainer Hochecker Collaborator

It is released after front/backbuffers (XBMCRenderManager::NotifyDisplayFlip) have been flipped. This is the time we know the displayed buffer is ready for reuse. It's the most efficient use of the buffers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
xbmc/cores/dvdplayer/DVDCodecs/Video/DVDVideoCodec.h
... ...
@@ -243,4 +250,11 @@ class CDVDVideoCodec
243 250
   {
244 251
     return 0;
245 252
   }
  253
+
  254
+  virtual bool GetPts(double &pts, int &skippedDeint, int &interlaced)
  255
+  {
  256
+    return false;
  257
+  }
  258
+
6
Joakim Plate Collaborator
elupus added a note February 10, 2013

Eh? this is weird. You should never ask codec for pts.

Rainer Hochecker Collaborator

This is needed to control dropping. Consider we just dropped a frame in decoder, this is at the end of the queue. How should player know that it has already catched up? You can't by just looking at the front of the queue.

Joakim Plate Collaborator
elupus added a note February 10, 2013

The my question is. why do we queue stuff in decoder? It shouldn't if at all possible.

If we must, it would make much more sense that the Decode() function return the VC_DROPPED flag even if it currently return a valid picture. (return values of decode doesn't have to refer to current picture)

What is the skippedDeint for? Anything that applies to current output picture, should be marked in the picture structure. Anything that is generic should be in the Decode() return.

Rainer Hochecker Collaborator

In order to get the most out of it the decoder can decode frames in advance (the thread does not block in flipPage). This way the one or other decoding cycle can even last longer than refreshRate without the need of dropping a frame.

vdpau decodes, then does deinterlacing. de-interlacing needs 4 past and 2 future frames. Hence there's quite a couple of frames in the queue.

When doing de-interlacing it is better just to skip a deinterlaced frame instead of dropping in decoder. The decoder signals with this flag that it just skipped a deinterling cycle. This can't be determined by pts.

Joakim Plate Collaborator
elupus added a note February 10, 2013

Right. So technically there is no queue in the codec, it's just the minimum amount of frames needed to do deinterlace. That is perfectly fine.

I just don't like the api change. It should be solvable with return values from decode since I see no need for the PTS value returned. What do you need it for. Note decoder doesn't even know PTS until it has finished decoding a frame.

Could you explain to me what the PTS value is for. Then maybe I can understand the need for it.

Rainer Hochecker Collaborator

PTS is right after ffmpeg decoder. For vdpau (and xvba) those decoded frames queue up in IHardwareDecoder::Decode for post processing. After this stage the frames go to the queue in renderer. For simplicity consider an entire queue (after ffmpeg) of 10 frames. Lateness detection is done best at the end of renderer, just before the frames get rendered to screen. Now if we are late and need to drop a frame it takes a while that this gain in time will become noticeable at the stage we determine lateness.

CDVDPlayerVideo::CalcDropRequirement askes for PTS right after decoder. Doing this it can detect gaps, dropped frames, and correct lateness by already dropped frames. Otherwise we would drop the following 9 frames as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Joakim Plate
Collaborator

We really need to do this pull in two steps. The changes to rendermanager and queueing and the vdpau stuff.

I have not real opinion on the vdpau stuff internally, i'll leave that up to somebody who knows it.

But I do have opinions on the rendermanagers and the codec interfaces in dvdplayer. I really do think it's a good idea to allow queuing of frames in rendermanager/ renderers so we should start working on that as a separate pull.

Note I want queueing working for non HW accel rendering too, maybe it already does with this pull i'm not sure.

Rainer Hochecker
Collaborator

np, I'll create a separate pr for the buffering when back from business trip next weekend.

Note I want queueing working for non HW accel rendering too, maybe it already does with this pull i'm not sure.

I see no reason why it should not work when being activated. But I have to run some tests to be sure.

Fneufneu
Collaborator

PR tested yesterday with fresh master and xbmc crashed when i hit X to stop a video.
xbmc.log: http://pastebin.com/iWqddnTb
backtrace: http://pastebin.com/StAS12WX

Peter Frühberger
Collaborator

Erm, I don't think this is him - it was me: #2289

Fneufneu
Collaborator

fritsch: xbmc continue to crash with your patch and give same backtrace :(

i'll try on another hardware.

Peter Frühberger
Collaborator

Okay, then it was not me - hehe.

Rainer Hochecker
Collaborator

I think I have identified the problem. Interesting that it did not show earlier. Need to think about a solution.

Rainer Hochecker
Collaborator

@Fneufneu hmm, I have not identified this problem. According to your backtrace it crashes when releasing a lock in CVdpau::Release. Do you think you can get the line in this method where this happens?

Fneufneu
Collaborator

@FernetMenta i will !
i tried today on another desktop on FreeBSD and no crash ...
so difference between two systems are: gcc vs clang, boost libs, ..

Fneufneu
Collaborator

@FernetMenta CDecoder::Release() finish completly
and a CLog::Log() after the SAFE_RELEASE(m_pHardware) show nothing :(

Fneufneu
Collaborator

and the most interesting information: it crash only when built with clang !
no problem with gcc 4.6 or 4.7
clang: 3.2 and 3.3 r174891

Rainer Hochecker
Collaborator

Do you think the issue is caused by clang or this pr? Any ideas how to track this further down? I am not familiar with clang.

Fneufneu
Collaborator

can someone with clang try this PR plz ?
i need to know if it's a FreeBSD problem.

davilla
Collaborator

iOS/OSX has clang when building on 10.8 but no vdpau so this does not help

macmenot

@Fneufneu what compiler flags / optimization level did you use? I could build XBMC 12.0 + this PR using clang on debian, but if the issue is due to clang optimization flags (e.g.,) I wouldn't see it.

macmenot

Interestingly I notice that Eden wasn't previously able to compile successfully on clang at all?
http://buildd-clang.debian.net/package.php?p=xbmc&suite=sid

Fneufneu
Collaborator

@macmenot see PR #1535 for this problem
i need it for FreeBSD but i think OSX doesn't

Rainer Hochecker
Collaborator

@Fneufneu As mentioned I am not familiar with llvm but at least the name implies that it does or can do some kind of garbage collection. Is it possible to disable this? It looks like the decoder object is released though it is still ref counted. It is ref counted by structures 2 levels below the same object. Maybe the gc, if any, does not see this.

Fneufneu
Collaborator

so maybe we hit a clang bug here, i'll request some help to debug this.
but does this bug also hit OSX ?

davilla
Collaborator

It can't hit osx in the same place, no vdpau on osx.

Rainer Hochecker FernetMenta closed this August 02, 2013
Rainer Hochecker FernetMenta referenced this pull request August 02, 2013
Merged

VDPAU redesign #3037

Tobias Hieta tru referenced this pull request from a commit in plexinc/plex-home-theater-public November 20, 2013
Tobias Hieta Unknown SRT's should show Unknown not None in the interface.
Fixes #870
f24f7d5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.