High CPU usage on radeon vdpau with subtitles #202

Closed
dolohow opened this issue Aug 29, 2013 · 23 comments
Labels
down-upstream features and bugs that need to be implemented and fixed upstream

Comments

@dolohow

dolohow commented Aug 29, 2013

Mesa: 9.2
Kernel: 3.10
Video card: Radeon 6530D HD
Output: vdpau

When subtitles show up, CPU usage increases dramatically to 40%, when normally it's about 9%. This applies to both no-ass and ass mode.
I cannot reproduce this issue with my nVidia card on another PC.

@pigoz
Member

pigoz commented Aug 29, 2013

Are you using vdpau to attempt accelerated video decoding? If so, why not use vaapi?

@dolohow
Author

dolohow commented Aug 29, 2013

First of all, I'm using the open source drivers, which do not support vaapi.
I also forgot to mention that the problem didn't occur with mplayer, so this is an mpv regression.

@ghost

ghost commented Aug 29, 2013

Well, mpv and mplayer do things a little bit differently by now. With ASS subtitles, performance shouldn't be worse with mpv; actually it should be better. With mplayer OSD or -noass mode, performance might be better than mpv, because mplayer uses a possibly simpler OSD format here (but whether that really is the case... I have my doubts).

IMO this points to a Mesa performance problem rather than an mpv problem.

Note that --no-ass in mpv doesn't really change how things are done on the VO level (unlike in mplayer). In mpv, both -ass and -no-ass use libass for rendering the OSD.

Meanwhile, you can try the --force-rgba-osd-rendering option. This puts more load on the CPU (actually, it makes rendering much slower on my machine), but could help by making things simpler for the driver.

@ghost

ghost commented Sep 15, 2013

Ping?

@aphirst

aphirst commented Sep 28, 2013

I can confirm this behaviour in mpv stable 0.1.7 and git (today, 28th Sep '13).

E2-1800 /w HD7340
linux or linux-ck 3.11
xf86-video-ati 7.2.0
mesa 9.2.0

For me this is best demonstrated using a Hi10P (i.e. not hardware-decodable) file with high bitrate sections, and ASS subtitles, e.g. https://dl.dropboxusercontent.com/u/3219541/Kannagi_OP.mkv

With vo_xv (or in other players such as VLC) this file plays back fine, with the full ASS subtitle track; no desync, no late frames.

With vo_vdpau (which I obviously have set for the sake of all my other hardware-decodable files), I get tons of desync and hundreds of late frames at the high-bitrate "film grain" sections. Disabling ASS helps, but fully disabling the subtitle track (changing sub track, the j key by default) lets the file play through with no desync at all.

--force-rgba-osd-rendering makes things drastically worse; presumably because the extra CPU load further hinders the CPU decoding of the Hi10P file.

So yeah, it kinda looks like the way VDPAU handles the subtitle rendering performs poorly, at least with the radeon driver.

@ghost

ghost commented Sep 28, 2013

Probably a Mesa issue. Either VdpBitmapSurfaceCreate or VdpBitmapSurfacePutBitsNative must be pretty slow.

> Disabling ASS helps; but fully disabling the subtitle track (changing sub track, the j key by default)

What do you mean by disabling ASS?

> --force-rgba-osd-rendering makes things drastically worse

This flag should reduce the load on vdpau a little, but even if it really does, the additional CPU load probably kills it. (This involves unoptimized C code, with trade-offs between quality/correctness and performance decided in favor of quality/correctness.)

Some technical background: ASS subs can lead to a very high number of sub-surfaces, and the driver could be slow or have trouble handling such a large number of surfaces. --force-rgba-osd-rendering blends the libass sub-surfaces into a rather low number of RGBA surfaces, which in theory might help the driver.
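
A toy illustration of that idea, not mpv's or libass's actual code: the sketch below composites several single-colour sub-bitmaps (each with an 8-bit coverage mask, loosely modelled on what libass emits) into one premultiplied RGBA canvas. The names `SubBitmap` and `blend_into_rgba` are hypothetical, and the pure-Python loops are only meant to show the operation, not its real cost.

```python
from dataclasses import dataclass


@dataclass
class SubBitmap:
    """Hypothetical stand-in for one sub-bitmap: one colour plus an 8-bit coverage mask."""
    x: int        # left edge on the canvas
    y: int        # top edge on the canvas
    color: tuple  # (r, g, b), each 0-255
    alpha: list   # rows of coverage values, each 0-255


def blend_into_rgba(width, height, parts):
    """Composite all parts, in order, into one premultiplied RGBA canvas."""
    canvas = [[[0, 0, 0, 0] for _ in range(width)] for _ in range(height)]
    for part in parts:
        for row_idx, row in enumerate(part.alpha):
            for col_idx, a in enumerate(row):
                px, py = part.x + col_idx, part.y + row_idx
                if a == 0 or not (0 <= px < width and 0 <= py < height):
                    continue
                dst = canvas[py][px]
                inv = 255 - a
                # "over" operator on premultiplied values, rounded integer math
                for c in range(3):
                    dst[c] = (part.color[c] * a + dst[c] * inv + 127) // 255
                dst[3] = (a * 255 + dst[3] * inv + 127) // 255
    return canvas
```

After this step the driver would only have to deal with a single surface, at the price of doing the per-pixel blending on the CPU.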

@ghost

ghost commented Sep 28, 2013

Looking at the file you posted, these subtitles are pretty simple. Each event has a 300 ms fade in and fade out, which might make things slower. Can you extract the subtitle track and test the file by passing the extracted subtitle track to -sub? After that, edit the file and remove all {\fad(300, 300)} parts, and test again.
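
For anyone who would rather not edit the file by hand, here is a minimal sketch of the second step. It assumes the fades use the simple `\fad(...)` form quoted above; the helper name `strip_fades` is made up for this example. It also drops override blocks that end up empty, which would remove a literal `{}` in the dialogue text too.

```python
import re

# Matches the simple fade tag, e.g. \fad(300, 300) or \fad(300,300).
FAD_TAG = re.compile(r"\\fad\([^)]*\)")
# Override blocks left empty once the tag is gone, e.g. "{}".
EMPTY_BLOCK = re.compile(r"\{\s*\}")


def strip_fades(ass_text):
    """Return the subtitle text with every simple fade tag removed."""
    without_tags = FAD_TAG.sub("", ass_text)
    return EMPTY_BLOCK.sub("", without_tags)
```

Read the extracted subtitle file, run its contents through `strip_fades`, write the result to a new file, and pass that file to -sub for the second test.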

@aphirst

aphirst commented Sep 28, 2013

I'll repeat some of our IRC conversation here, so that anyone else can get on the same page.

By disabling ASS I meant using --ass=no, which just disables the styles. I think mpv still uses libass for rendering either way.

By extracting the .ass file and using it on a fairly basic test file (some crappy low bitrate AVI), I still get noticeable desync at the text fades. Editing out the fades eliminates that, so vo_vdpau is definitely doing something inefficiently here (but obviously the fault could be anywhere from xf86-video-ati to mesa, mpv's calls to libvdpau, or maybe even vdpau itself?)

I also tested things with --vf=sub and --vf=format=420p,scale,sub, which either had little noticeable difference or made things a little worse. Also, on close inspection of the A-V counter, even with subtitles disabled (j) vo_vdpau still drifts a little with that Hi10P file (where vo_xv doesn't, especially with subtitles also disabled).

I hope we're able to work out what the weak link is here...

@aphirst

aphirst commented Sep 28, 2013

Another point to possibly cement the idea that the problem here is (mostly) subtitle related; I even have a couple of normal 8bit h.264 files (specifically here, the 720p DmonHiro BD rips of Umineko) which decode effortlessly without the subtitles, but the subtitles cause insane desync with vo_vdpau, even when used with a low-bitrate file for testing (just like before). Again, perfectly fine with vo_xv.

I haven't extracted the fonts, but that doesn't seem to have an effect on the performance (or lack thereof). https://dl.dropboxusercontent.com/u/3219541/umineko.ass

I should clarify here that I can't really rely on software decoding for all files, since that doesn't cope with my 1080p 8bit stuff, which is obviously fine with the video card. All my Hi10P stuff is, and has to be, 720p, because the CPU is kinda weak. My experience is that when software-decoding, it being 8 or 10bit doesn't really matter, it's more the bitrate/resolution. So if there are any 1080p 8bit files which get this subtitle-related desync, I can't really use vo_xv :P

@ghost

ghost commented Sep 28, 2013

> I still get noticeable desync at the text fades. Editing out the fades eliminates that

To add to that: when displaying static subs, you have to render the subtitles every frame (makes sense, doesn't it?). Fades, apart from making libass work, additionally require uploading new sub-bitmaps on every frame. So, let's assume that Mesa ideally has fast rendering of sub-bitmaps; even then, slow upload of sub-bitmaps could ruin performance.
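
A rough model of that difference, under loudly stated assumptions: 25 fps video (40 ms per frame), a linear fade, and a player that re-uploads a bitmap only when its contents changed since the previous frame. The function names are hypothetical and this says nothing about how mpv or Mesa are actually implemented.

```python
def fade_opacity(t_ms, start_ms, end_ms, fade_in_ms, fade_out_ms):
    """Opacity (0.0 to 1.0) of an event with linear fade in/out at time t_ms."""
    if t_ms < start_ms or t_ms >= end_ms:
        return 0.0
    if fade_in_ms and t_ms < start_ms + fade_in_ms:
        return (t_ms - start_ms) / fade_in_ms
    if fade_out_ms and t_ms > end_ms - fade_out_ms:
        return (end_ms - t_ms) / fade_out_ms
    return 1.0


def count_frames_and_uploads(start_ms, end_ms, fade_in_ms=0, fade_out_ms=0, frame_ms=40):
    """Return (frames drawn, bitmap uploads) for one subtitle event."""
    frames = uploads = 0
    previous = None
    for t_ms in range(start_ms, end_ms, frame_ms):
        opacity = fade_opacity(t_ms, start_ms, end_ms, fade_in_ms, fade_out_ms)
        frames += 1
        if opacity != previous:  # bitmap contents changed, so upload again
            uploads += 1
            previous = opacity
    return frames, uploads


print(count_frames_and_uploads(0, 4000))            # static 4 s event
print(count_frames_and_uploads(0, 4000, 300, 300))  # same event with 300 ms fades
```

In this model both events are drawn on 100 frames, but the static one needs a single upload while the faded one needs 16, all of them concentrated in the two 300 ms fade windows.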

> --vf=sub

Probably works in 10 bit, and converts to 8 bit later, which should be slower than this:

> --vf=format=420p,scale,sub, which either had little noticeable difference or made things a little worse

That's strange, especially compared to xv, which calls exactly the same code to render subtitles. So maybe Mesa vdpau is generally less efficient than xv.

@aphirst

aphirst commented Sep 29, 2013

> Fades, apart from making libass work, additionally require uploading new sub-bitmaps on every frame. So, let's assume that Mesa ideally has fast rendering of sub-bitmaps, then even slow upload of sub-bitmaps could ruin performance.

I think you might be right. Several other things I've watched in the past few days have had a mixture of static and animated (fade, or karaoke-style left-to-right fill) subs. The animated sections and vo_vdpau do not get on at all, while the static sections seem to have no problems.

@dolohow
Author

dolohow commented Oct 11, 2013

Mesa 9.2.1 fixed this issue for me.

I'll leave it open for now due to the fact that I wasn't the only one.

@aphirst

aphirst commented Oct 12, 2013

Still present for me; I was gonna wait till I was running Linux 3.12 with its Radeon improvements before checking back, I'm hoping something there indirectly affects this.

EDIT: Despite what I initially thought when testing, this behaviour IS present for me with mplayer as well; substantially worse and more pronounced than with mpv.

@aphirst

aphirst commented Nov 14, 2013

OK, well, I've finally upgraded to Linux 3.12, and also to mesa/mesa-libgl/ati-dri 9.2.3, so I thought I'd touch base again.

The behaviour I was getting has definitely improved. For example, the umineko.ass subtitles I mention above no longer desynch by over ten seconds and take over 20 for the video to resynch; instead it's now merely a couple of seconds desynch, returning to normal in only another couple. Additionally, the subtitles from the Victorique BD release of Gosick during the OP used to cause over 10 seconds of desynch, now reduced to a second or so. An improvement, but it's not gone away completely, since CPU usage is still just under full.

The subtitles for the Kannagi OP (again, somewhere above) still cause significant desync; presumably because that video file can't be hardware-decoded (Hi10P), and so the CPU is additionally strained. If I extract the ASS file and use it on another (hardware-decodable) video, then the fade/transition effects only cause one or two late frames, which I think adds to the idea that it's related to CPU usage.

On the other hand, playing the same files/subtitles in mplayer doesn't seem to show any improvement. That comes off as somewhat strange to me, because all the improvements I've mentioned happened when I upgraded linux and mesa, rather than when I upgraded mpv (which has happened several times since I last commented here). I guess mpv is just fundamentally doing things a better way, that can actually take advantage of kernel/graphics improvements.

So yeah. Things are better; but there's still a problem.

@Zehkul

Zehkul commented Nov 20, 2013

You could try switching to performance governor. VDPAU seems really sensitive there, opengl does get a very noticeable performance improvement too though.
I get up to 10 percentage points less CPU usage with vdpau + performance governor, and that’s quite a lot since mpv caps at 30-35% cpu usage (opengl already at 25-30%?) for me when desynch happens. (Quad core, so probably 5-10% for decoding hi10p and one core busy with subtitles?)

Ondemand changes in linux 3.12 might or might not have improved this; I only tested this just now and don't know about older kernels. Opengl is still more efficient and faster than vdpau either way, so not much changed there, and xv performance is ridiculously good. (As usual)
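
To check which governor each core is currently using before comparing numbers, something like the following sketch should work on Linux systems that expose cpufreq through sysfs. The helper name `read_governors` is made up, and the `cpu_root` parameter exists only so the function can be pointed at a different directory.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current cpufreq governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        governors[gov_file.parent.parent.name] = gov_file.read_text().strip()
    return governors
```

Switching governors means writing a name such as `performance` to those same files, which needs root.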

@aphirst

aphirst commented Nov 20, 2013

As far as I can tell, on my hardware setting to 'performance' doesn't have much effect (will faff a bit more and add to this post later). Since my machine's quite weak, CPU usage for intense subtitles can easily hit 80% or 90% for each core.

Yeah, I'm not sure exactly what it is about 3.12 that slightly improved things for me, but I do remember reading about the whole ondemand thing.

Am I hallucinating, or is VDPAU/OpenGL interop to be implemented in the foreseeable future? Being able to decode video frames on hardware, and do all the extra processing via OpenGL would (surely?) improve this a whole lot.

@ghost

ghost commented Nov 20, 2013

> Am I hallucinating, or is VDPAU/OpenGL interop to be implemented in the foreseeable future?

It already is in Mesa git, but it's buggy with mpv. Almost surely a Mesa bug. Not sure if it got fixed already.

@mia-0
Member

mia-0 commented Dec 11, 2013

I also get quite horrible subtitle performance with VDPAU on r300g. It's much better with vo_opengl (especially with recent Mesa), although still not optimal. --force-rgba-osd-rendering has no effect on either VO.

Switching to the "performance" governor has no noticeable effect either (though powersave behavior is apparently distro-specific, and mine seems to go with a "get things done ASAP, don't switch C states too aggressively" strategy with the "ondemand" governor). Some 3.11 kernel.

Anyway, it doesn't seem like there's a lot we can do other than waiting for Mesa to fix itself. I'll try profiling this thing with apitrace later, but I don't expect to find out how to improve performance on our side.

@mia-0
Member

mia-0 commented Jan 7, 2014

Just some update on my plans to apitrace this: Can't do this; Mesa doesn't support some required extension for GPU profiling.

@dolohow
Author

dolohow commented Feb 24, 2014

With Mesa 10.0.3 and mpv 0.3.5, there is only a few percent increase in CPU usage when subtitles appear.

@aphirst

aphirst commented Feb 25, 2014

This has substantially improved in recent months, presumably thanks to improvements in mesa, xf86-video-ati, and of course in libass (SSE/AVX work, also the combining bitmaps thing).

In fact, currently the only noticeable subtitle-related slowdowns I get at all are during Gaussian blurs (the one big thing missing from the libass ASM work), which is of course completely unrelated to mpv.

It kinda seems like the problem has been more sidestepped than solved, but the end result seems to be the same.

@ghost

ghost commented Apr 8, 2014

Is this still a problem?

Gaussian blur still hasn't received any optimizations in libass master.

@aphirst

aphirst commented Apr 10, 2014

@wm4 I would say that it appears to be solved. Other than the Gaussian blur stuff, subtitle-rendering seems to be basically fine with either vo_vdpau or vo_opengl on the affected setups. Perhaps someone else could confirm?

ghost removed the meta:info-needed label Jul 18, 2015
ghost closed this as completed Jul 18, 2015
This issue was closed.

5 participants