Investigate Chromium GPU flags #5378

Open
The-Compiler opened this issue Apr 22, 2020 · 21 comments
Labels
component: performance Issues with performance/benchmarking. priority: 1 - middle Issues which should be done at some point, but aren't that important.

Comments

@The-Compiler
Member

Split off from #5375 (cc @maximbaz @toofar):

There are various GPU-related flags which seem to make a noticeable difference in CPU usage on e.g. https://yourbias.is - for example "ignore-gpu-blacklist", "enable-gpu-rasterization", "enable-native-gpu-memory-buffers", "num-raster-threads=4".

Similar flags are also on various how-to's to force GPU acceleration in Chromium around the web. I also remember some 4chan posts suggesting those as well.

Note that some of those (or their opposites) are also handled by QtWebEngine.

In general, I presume there's a good reason those are set in Chromium and/or Qt. However, it might be good to find out if:

  • Any of those flags are set differently in different Qt versions, and if so, if we can set them with earlier versions already.
  • Any of those flags can be set upstream in QtWebEngine (even if not set in Chromium) because of the different graphics handling.

If they land upstream (without any other changes to Qt's code), it should be safe to enable them for qutebrowser as well. I'd like to avoid blindly enabling them by default though, without knowing what the impact is.

@The-Compiler The-Compiler added component: performance Issues with performance/benchmarking. priority: 1 - middle Issues which should be done at some point, but aren't that important. labels Apr 22, 2020
@maximbaz
Contributor

maximbaz commented Apr 22, 2020

I've spent a bit of time looking into this:

  • enable-gpu-rasterization is the main flag that gives me all the performance; if we could focus on one flag only it should be this one. Qt doesn't enable it, but on the other hand there is no comment saying that they intentionally do so. They do disable this feature but only if the user explicitly asks for software rendering. They might be thinking that this feature is enabled by default (or maybe it is, but in a newer version?).
    • ===> This is a great candidate to be enabled by default.
  • ignore-gpu-blacklist alone doesn't do much, but it is needed if a certain feature is banned from an OS or from a specific subset of drivers/devices. For example, from my time of maintaining AUR/chromium-vaapi I can tell you that VA-API rendering is blacklisted for all Linux users, whether their graphics card supports VA-API or not, and we use this flag to override that and enable VA-API. But general GPU rasterization works well for me even if I don't provide this flag.
    • ===> I would strongly recommend against enabling this flag by default, because those whose hardware does not support GPU rasterization will suffer (we did this in Arch Linux once, and although some were happy for VA-API, others were not happy about a completely broken browser...). People who need to enable it for their hardware should do it manually and understand the risk.
  • enable-native-gpu-memory-buffers changes nothing for me, there is no mention of it in Qt.
    • ===> I propose we ignore this flag
  • enable-zero-copy slightly reduces the performance for me, this flag is part of the group that Qt comments with "These are currently only default on OS X, and we don't support them"
    • ===> I propose we ignore this flag
  • enable-oop-rasterization slightly reduces the performance for me, there is no mention of this in Qt
    • ===> I propose we ignore this flag
  • enable-gpu-memory-buffer-compositor-resources completely breaks rendering for me, this flag is part of the group that Qt comments with "These are currently only default on OS X, and we don't support them"
    • ===> I propose we ignore this flag
  • enable-viz-display-compositor is enabled by default if your GPU is threaded (there probably is a reason for this), it is enabled for me and works well.
    • ===> No action needed
  • num-raster-threads=4 seems to be default, if I launch qutebrowser with no flags, in htop I see QtWebEngineProcess having this flag.
    • ===> No action needed
  • enable-accelerated-2d-canvas seems to be enabled by default, as per chrome://gpu.
    • ===> No action needed

chrome://gpu is a good place to look for the effects of some of these flags.
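
To make the recommendations above concrete, here is a minimal sketch of how they could be expressed in a qutebrowser config.py. The helper name `gpu_qt_args` and its parameters are hypothetical; only the flag names come from the list above, and entries in qutebrowser's `qt.args` setting are written without the leading `--`. Whether these flags help at all depends on the hardware and the Qt version.

```python
def gpu_qt_args(rasterization=True, ignore_blacklist=False):
    """Return Chromium flags (without leading "--") suitable for qt.args.

    Hypothetical helper: it only mirrors the recommendations in the
    list above and is not part of qutebrowser itself.
    """
    args = []
    if rasterization:
        # The one flag reported to make a real difference in the analysis.
        args.append("enable-gpu-rasterization")
    if ignore_blacklist:
        # Opt-in only: this overrides Chromium's driver/device blacklisting
        # and can break rendering on unsupported hardware.
        args.append("ignore-gpu-blacklist")
    return args


# In config.py, where qutebrowser provides the `c` object:
# c.qt.args = gpu_qt_args()
```

After restarting, chrome://gpu remains the place to check whether the flag had any effect.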

@The-Compiler
Member Author

The-Compiler commented Apr 22, 2020

Thanks for the detailed analysis, @maximbaz! One thing I'm still left wondering about, though: Why is GPU rasterization disabled by default in Chromium?

Some interesting links:

@maximbaz
Contributor

It seems it's only disabled by default on Linux. Maybe it's the same story as with VA-API, where it works for some but not for all. I would hope that if a certain hardware is in blacklist, enable-gpu-rasterization would do nothing (unless ignore-gpu-blacklist is also provided), but if this is not the case, then enabling enable-gpu-rasterization for everyone could be just as dangerous as enabling ignore-gpu-blacklist. Not sure how to validate this 🤔

@jgkamat
Member

jgkamat commented Apr 22, 2020

It seems it's only disabled by default on Linux.

FWIW with default profiles rasterization in chrome://gpu seems to be enabled for me on my primary machine for both debian and fedora (but not on arch). Not sure if it's patches/wrappers or something else. Fedora also seems to have video decoding Enabled (??)


@maximbaz
Contributor

Fedora also seems to have video decoding Enabled (??)

Fedora was among the first distros to enable hardware video acceleration by default. It's done via a patch because this is a choice you make during compilation (as opposed to GPU rasterization, which you can enable at runtime via a flag). On Arch we tried this once too, but reverted because it caused too many issues - for now it only exists as AUR/chromium-vaapi.

FWIW with default profiles rasterization in chrome://gpu seems to be enabled for me on my primary machine for both debian and fedora (but not on arch)

This is an interesting observation! I can't easily find why this is the case. Is everything else relatively similar in your setup? Is it the same machine running all 3 OS, or with similar hardware, is the desktop environment the same, etc.?

@jgkamat
Member

jgkamat commented Apr 22, 2020

This is an interesting observation! I can't easily find why this is the case. Is everything else relatively similar in your setup? Is it the same machine running all 3 OS, or with similar hardware, is the desktop environment the same, etc.?

The kernel, graphics driver, and WM are all the same and they are all running simultaneously, each distro here is effectively a chroot. I'm using --user-data-dir=tmp<num> to make sure each profile is clean. I couldn't find any note of enabling this in the debian patches so I'm not sure why either. The only big difference I could find is that arch is at chrome 81 and fedora/debian are on 80.

@The-Compiler
Member Author

It seems it's only disabled by default on Linux. Maybe it's the same story as with VA-API, where it works for some but not for all. I would hope that if a certain hardware is in blacklist, enable-gpu-rasterization would do nothing (unless ignore-gpu-blacklist is also provided), but if this is not the case, then enabling enable-gpu-rasterization for everyone could be just as dangerous as enabling ignore-gpu-blacklist. Not sure how to validate this

If it was the case, I suppose it would be turned on by default in Chromium already? That difference between Debian/Fedora/Arch sure is weird though...

@Hi-Angel
Contributor

Hi-Angel commented Jun 15, 2020

UPD: okay, nvm, apparently the command line isn't used to pass the flags, but I can see the changes in chrome://gpu

Outdated question

What do you use in config to enable these flags though? I have in config.py a line

c.qt.args = ["enable-native-gpu-memory-buffers", "enable-gpu-rasterization", "use-gl=egl", "ignore-gpu-blacklist"]

However when I do ps aux | grep WebEngine, I don't see any of these on the command line:

…
constan+  239310  0.4  0.9 1209808 72696 ?       Sl   01:12   0:00 /usr/lib/qt/libexec/QtWebEngineProcess --type=renderer --disable-gpu-memory-buffer-video-frames --enable-threaded-compositing --use-gl=desktop --enable-features=AllowContentInitiatedDataUrlNavigations,TracingServiceInProcess --disable-features=BackgroundFetch,BlinkGenPropertyTrees,MojoVideoCapture,NetworkServiceNotSupported,OriginTrials,SmsReceiver,UsePdfCompositorServiceForPrint,WebAuthentication,WebAuthenticationCable,WebPayments,WebUSB --lang=en-US --webengine-schemes=qute:lL;qrc:sLV --num-raster-threads=2 --enable-main-frame-before-activation --service-request-channel-token=9272405711495504544 --renderer-client-id=1714 --shared-files
…
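
For anyone who wants to check a process command line like the one above programmatically, here is a small sketch. The function name `chromium_switches` and the shortened sample string are illustrative assumptions. As the update at the top of this comment notes, flags from qt.args do not necessarily appear on the renderer's command line, so a missing switch here is not proof that the flag is inactive; chrome://gpu is the more reliable indicator.

```python
import shlex


def chromium_switches(cmdline):
    """Map switch name -> value (or None) for every --switch in a command line."""
    switches = {}
    for token in shlex.split(cmdline)[1:]:  # skip the executable path
        if not token.startswith("--"):
            continue
        name, _, value = token[2:].partition("=")
        switches[name] = value or None
    return switches


# Shortened, illustrative version of the ps output above.
sample = (
    "/usr/lib/qt/libexec/QtWebEngineProcess --type=renderer "
    "--use-gl=desktop --num-raster-threads=2 --enable-threaded-compositing"
)
found = chromium_switches(sample)
wanted = ["enable-gpu-rasterization", "use-gl", "num-raster-threads"]
missing = [name for name in wanted if name not in found]
print("missing from command line:", missing)
```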

@fmartingr

In my case, Video Decode: Hardware accelerated didn't show up until I turned on the flag ignore-gpu-blacklist.
Using an integrated intel gpu on a Dell XPS laptop.
After enabling these flags it seems that my browser performance improved drastically. I had some freezes while opening tabs previously (mouse/keyboard I/O would freeze but you saw progress on screen sometimes).

@rien333
Contributor

rien333 commented Mar 7, 2021

[@fmartingr wrote:] In my case, Video Decode: Hardware accelerated [showed up]

Because I just spent more than a day trying to get hardware accelerated video to work in qutebrowser/qtwebengine, allow me some observations about what still needs to be done to get it to actually work.

Note that while the flags mentioned by @fmartingr and others in this thread are necessary for hardware accelerated video decoding (h264, vpx, and anything supported by vaapi), they are by themselves not sufficient (check the Video*/0 fields of intel-gpu-top to check if your h264/vpx units are actually being used). For this, you'll need a build of chromium with vaapi support enabled. In the recent year, obtaining such a build has become incredibly easy: you more or less supply a build flag (use_vaapi=true/false) at compile time. And within the last two years, most distros have indeed set this flag to true, and since v88 of chromium, this flag effectively defaults to true within the source code itself. (for more information, see the arch wiki)

Currently, qtwebengine 5.15.3 sits at chromium v87, which should be sufficiently new for everything to work smoothly. And indeed, chromium 87's hw accelerated video decoding works great if I download that version from the arch repos. The thing is, the Qt company hasn't enabled the use_vaapi build flag in qtwebengine 5.15.3, which is conservative even for Debian's standards (to emphasize: this compile time flag enables support, but doesn't turn on anything at runtime). (this negligence fits this whole "sneaky" release of webengine 5.15.3, I guess).

@The-Compiler: Do you think the Qt people will be willing to turn the use_vaapi flag on? Within qutebrowser, you could then in principle make a video_hardware_acceleration flag that turns on ignore-gpu-blacklist and the like. In my opinion, it would be nice for qutebrowser to have the powersavings that I observe from GPU decoding in other programs (I know mpv exists, but this isn't always an option).

@everyone else: I've almost succeeded in turning on vaapi support in my custom build of qt webengine 5.15.3 (you only need to change a couple of lines, mostly just turning on use_vaapi). Still, it errors right at the linking stage, probably because the Qt project doesn't pass the right linker flags when creating QtWebEngineCore. Perhaps someone smarter than me knows how I can avoid these linking errors:

ulimit -n 4096 && g++ @/tmp/makepkg/qt5-webengine/src/build/src/core/release/QtWebEngineCore_o.rsp -Wl,--start-group @/tmp/makepkg/qt5-webengine/src/build/src/core/release/QtWebEngineCore_a.rsp -Wl,--end-group -Wl,-z,noexecstack -Wl,--fatal-warnings -Wl,--build-id=sha1 -fPIC -Wl,-z,relro -Wl,-z,now -Wl,-z,defs -m64 -Wl,-O2 -Wl,--gc-sections -rdynamic -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--enable-new-dtags -Wl,-whole-archive -lqtwebenginecoreapi -Wl,-no-whole-archive -Wl,--no-undefined -Wl,--version-script,QtWebEngineCore.version -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -Wl,--enable-new-dtags -shared -Wl,-Bsymbolic-functions -Wl,-soname,libQt5WebEngineCore.so.5 -o libQt5WebEngineCore.so.5.15.3   /usr/lib/libQt5Quick.so /usr/lib/libQt5Gui.so /usr/lib/libQt5QmlModels.so /usr/lib/libQt5WebChannel.so /usr/lib/libQt5Qml.so /usr/lib/libQt5Network.so /usr/lib/libQt5Positioning.so /usr/lib/libQt5Core.so -lpthread -lGL -lpthread -ldl -lrt -licui18n -licuuc -licudata -lsmime3 -lnss3 -lnssutil3 -lplds4 -lplc4 -lnspr4 -lz -levent -lresolv -ljpeg -lm -lopus -lavcodec -lavformat -lavutil -lX11 -lXcomposite -lXdamage -lXext -lXfixes -lXrender -lXrandr -lXtst -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lpng16 -lwebpmux -lwebpdemux -lwebp -lfreetype -lexpat -lfontconfig -lharfbuzz-subset -lharfbuzz -lre2 -lX11-xcb -lxcb -lxkbcommon -ldbus-1 -ldrm -lgbm -lpci -lasound -lsnappy -lva -lxml2 -lxslt -lminizip -llcms2 -L/tmp/makepkg/qt5-webengine/src/build/src/core/api/release -lGL
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/media/gpu/chromeos/common/platform_video_frame_utils.o: in function `media::CreateNativePixmapDmaBuf(media::VideoFrame const*)':
platform_video_frame_utils.cc:(.text._ZN5media24CreateNativePixmapDmaBufEPKNS_10VideoFrameE+0x14c): undefined reference to `gfx::NativePixmapDmaBuf::NativePixmapDmaBuf(gfx::Size const&, gfx::BufferFormat, gfx::NativePixmapHandle)'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/ui/ozone/platform/drm/gbm/client_native_pixmap_factory_drm.o: in function `ui::CreateClientNativePixmapFactoryDrm()':
client_native_pixmap_factory_drm.cc:(.text._ZN2ui34CreateClientNativePixmapFactoryDrmEv+0x1): undefined reference to `gfx::CreateClientNativePixmapFactoryDmabuf()'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/ui/ozone/platform/drm/gbm/gbm_surface_factory.o: in function `ui::(anonymous namespace)::GLOzoneEGLGbm::CreateOffscreenGLSurface(gfx::Size const&)':
gbm_surface_factory.cc:(.text._ZN2ui12_GLOBAL__N_113GLOzoneEGLGbm24CreateOffscreenGLSurfaceERKN3gfx4SizeE+0x27): undefined reference to `gl::SurfacelessEGL::SurfacelessEGL(gfx::Size const&)'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/ui/ozone/platform/drm/gbm/gbm_surfaceless.o: in function `ui::GbmSurfaceless::~GbmSurfaceless()':
gbm_surfaceless.cc:(.text._ZN2ui14GbmSurfacelessD2Ev+0x19): undefined reference to `gl::SurfacelessEGL::Destroy()'
/usr/bin/ld: gbm_surfaceless.cc:(.text._ZN2ui14GbmSurfacelessD2Ev+0x163): undefined reference to `gl::SurfacelessEGL::~SurfacelessEGL()'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/ui/ozone/platform/drm/gbm/gbm_surfaceless.o: in function `ui::GbmSurfaceless::GbmSurfaceless(ui::GbmSurfaceFactory*, std::unique_ptr<ui::DrmWindowProxy, std::default_delete<ui::DrmWindowProxy> >, unsigned int)':
gbm_surfaceless.cc:(.text._ZN2ui14GbmSurfacelessC2EPNS_17GbmSurfaceFactoryESt10unique_ptrINS_14DrmWindowProxyESt14default_deleteIS4_EEj+0x2d): undefined reference to `gl::SurfacelessEGL::SurfacelessEGL(gfx::Size const&)'
etc.

@The-Compiler
Member Author

Currently, qtwebengine 5.15.3 sits at chromium v87, which should be sufficiently new for everything to work smoothly.

Agreed. The last related bugfix I could find is media/gpu/vaapi: fix VA-API on X11 after Ozone was enabled which is part of 87.0.4280.144.

The thing is, the Qt company hasn't enabled the use_vaapi build flag in qtwebengine 5.15.3, which is conservative even for Debian's standards (to emphasize: this compile time flag enables support, but doesn't turn on anything at runtime). (this negligence fits this whole "sneaky" release of webengine 5.15.3, I guess).

As much as I oppose their decisions, let's not get into unconstructive conspiracy theories please. A major Chromium update usually means millions of changed lines, and takes them about a person-month to adapt (IIRC). Chromium moves at an incredible pace, with commits every couple of minutes, pretty much around the clock (see also). It's easy to miss something in such an environment, especially because apparently nobody really ever asked about this (other than on embedded Linux).

@The-Compiler: Do you think the Qt people will be willing to turn the use_vaapi flag on?

Not sure that'd happen for Qt 5.15 LTS. It's in its strict phase, which limits which changes are accepted. I suppose you could argue this falls under Performance: significant fix improving O() or is a NOP in general, but it looks like it was mistakenly enabled at runtime before (fixed in 87)...

For the next normal release containing QtWebEngine (6.2), I'm guessing this won't be relevant anymore, given that they're working on updating to 88 already.

Doesn't hurt to ask though.

Within qutebrowser, you could then in principle make a video_hardware_acceleration flag that turns on ignore-gpu-blacklist and the like. In my opinion, it would be nice for qutebrowser to have the powersavings that I observe from GPU decoding in other programs (I know mpv exists, but this isn't always an option).

It looks like there's a new enable-accelerated-video-decode flag - with that, I don't think ignore-gpu-blacklist is still required?

@everyone else: I've almost succeeded in turning on vaapi support in my custom build of qt webengine 5.15.3 (you only need to change a couple of lines, mostly just turning on use_vaapi). Still, it errors right at the linking stage, probably because the Qt project doesn't pass the right linker flags when creating QtWebEngineCore. Perhaps someone smarter than me knows how I can avoid these linking errors:

I guess it'd be good to have a patch ready to propose it to Qt (can't hurt to try). Maybe if you can share your changes here I can take a look as well, though I don't really know Chromium's build system that well.

Thanks for all your effort on this! 👍

@rien333
Contributor

rien333 commented Mar 8, 2021

Thanks for the detailed response! I'm in the process of opening a bug report, to at least get it on the radar (EDIT: see https://bugreports.qt.io/browse/QTBUG-91677). Sorry for my snarky comment about Qt, I understand that everyone's/the developers' intention is to do good. I guess I was just frustrated after a day of failed builds.

It looks like there's a new enable-accelerated-video-decode flag - with that, I don't think ignore-gpu-blacklist is still required?

It seems that the ignore-gpu-blocklist flag (note the name change) is necessary for some, but not all. From what I understand, there's a big json file (software_rendering_list.json) that disables hw video decode on some configurations (e.g. old GPUs and drivers). My guess is that some people's systems are blacklisted in that file, and others aren't. This matches my observations on the web, where some people report indeed not needing that flag. Anyone that uses a fairly recent intel GPU (I'm on kaby lake) should be fine.

On my system, chromium v87 (obtained from the Arch repos) only needs two flags to enable hw video decoding on big buck bunny h264 mp4:

--use-gl=desktop # or use-gl=egl on wayland
--enable-accelerated-video-decode

(it works without use-gl, but that does alter the extent to which my GPU is used).
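
As a rough sketch of the flag choice described above, the snippet below picks `use-gl=egl` on Wayland and `use-gl=desktop` otherwise. The function name `video_decode_flags` and the session detection via `XDG_SESSION_TYPE` / `WAYLAND_DISPLAY` are assumptions for illustration; only the two flags themselves come from this comment, and whether they are sufficient depends on the build and the GPU blocklist.

```python
import os


def video_decode_flags(environ=None):
    """Return the two Chromium flags from the comment above as a list.

    The Wayland check is an assumption, not something Chromium or
    qutebrowser does for you.
    """
    env = os.environ if environ is None else environ
    on_wayland = (
        env.get("XDG_SESSION_TYPE") == "wayland" or "WAYLAND_DISPLAY" in env
    )
    gl_backend = "egl" if on_wayland else "desktop"
    return [f"--use-gl={gl_backend}", "--enable-accelerated-video-decode"]


# Possible usage (the binary name "chromium" is an assumption):
# import subprocess
# subprocess.run(["chromium", *video_decode_flags()])
```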

How I (failed to) build qtwebengine with vaapi support

Maybe if you can share your changes here I can take a look as well, though I don't really know Chromium's build system that well.

Taking the Arch Linux PKGBUILD as a base (yay -G qt5-webengine), I've modified the prepare() function, makedepends, and source array as follows:

makedepends=('git' 'python2' 'python' 'gperf' 'jsoncpp' 'ninja' 'qt5-tools' 'poppler' 'libpipewire02' 'nodejs' 'libva' 'libdrm' 'mesa')

...

source=(git+https://code.qt.io/qt/qtwebengine.git#commit=$_commit
        git+https://code.qt.io/qt/qtwebengine-chromium.git
        qt5-webengine-glibc-2.33.patch
        vaapi-gcc-fix.patch
        vaapi-flags.patch)

...

prepare() {
  mkdir -p build

  cd ${_pkgfqn}

  git submodule init
  git submodule set-url src/3rdparty "$srcdir"/qtwebengine-chromium
  git submodule set-branch --branch 87-based src/3rdparty
  git submodule update

  patch -p1 -i "$srcdir"/qt5-webengine-glibc-2.33.patch # Fix text rendering when building with glibc 2.33
  patch -p1 < ../vaapi-flags.patch
  patch -p1 < ../vaapi-gcc-fix.patch
}

The gcc patch

diff --git a/src/3rdparty/chromium/media/gpu/vaapi/vaapi_wrapper.h b/src/3rdparty/chromium/media/gpu/vaapi/vaapi_wrapper.h
index 01ec582..e2c95f1e 100644
--- a/src/3rdparty/chromium/media/gpu/vaapi/vaapi_wrapper.h
+++ b/src/3rdparty/chromium/media/gpu/vaapi/vaapi_wrapper.h
@@ -345,8 +345,8 @@
   // Convenient templatized version of SubmitBuffer() where |size| is deduced to
   // be the size of the type of |*data|.
   template <typename T>
-  bool SubmitBuffer(VABufferType va_buffer_type,
-                    const T* data) WARN_UNUSED_RESULT {
+  bool WARN_UNUSED_RESULT SubmitBuffer(VABufferType va_buffer_type,
+                                       const T* data) {
     return SubmitBuffer(va_buffer_type, sizeof(T), data);
   }
   // Batch-version of SubmitBuffer(), where the lock for accessing libva is

This gcc patch was once shipped in some form by other distributions as well.

The vaapi patch (outdated, see below)

diff --git a/src/core/config/linux.pri b/src/core/config/linux.pri
index 3e490a0..f49905b 100644
--- a/src/core/config/linux.pri
+++ b/src/core/config/linux.pri
@@ -24,6 +24,8 @@ qtConfig(webengine-embedded-build) {

     qtConfig(webengine-ozone-x11) {
         gn_args += ozone_platform_x11=true
+        gn_args += use_vaapi=true
+        gn_args += ozone_platform_drm=true # needed to pass an assert
         gn_args += use_xkbcommon=true
         packagesExist(xscrnsaver): gn_args += use_xscrnsaver=true
         qtConfig(webengine-webrtc): gn_args += rtc_use_x11=true

diff --git a/src/core/config/linux.pri b/src/core/config/linux.pri
--- a/src/core/config/linux.pri
+++ b/src/core/config/linux.pri
@@ -40,7 +40,7 @@ qtConfig(webengine-embedded-build) {
     qtConfig(webengine-system-libxml2):  gn_args += use_system_libxml=true use_system_libxslt=true
     qtConfig(webengine-system-opus):     gn_args += use_system_opus=true
     qtConfig(webengine-system-snappy):   gn_args += use_system_snappy=true
-    qtConfig(webengine-system-libvpx):   gn_args += use_system_libvpx=true
+    qtConfig(webengine-system-libvpx):   gn_args += use_system_libvpx=false # per Gentoo bug #633332
     qtConfig(webengine-system-icu):      gn_args += use_system_icu=true icu_use_data_file=false
     qtConfig(webengine-system-ffmpeg):   gn_args += use_system_ffmpeg=true
     qtConfig(webengine-system-re2):      gn_args += use_system_re2=true

The libvpx change reduces the number of linking errors somewhat, not sure if it's needed in the end.

Again, this builds fine (and largely matches how most distros build chromium), but results in a bunch of opengl related errors right at the final link stage. I think this indicates that the linking stage of QtWebengine doesn't "know" about the vaapi stuff, but I wouldn't know what or how to tell qmake and Qt's build tools about vaapi-related changes (though -lva is passed at this stage, which according to pkgconf --libs libva is right).

Slightly different approach (EDIT)

Setting ozone_platform_drm to true doesn't seem the way to go; the better route would be to rely on ozone_platform_x11. I was sort of misled to focus on drm because qt uses a really weird ui/gfx/linux/BUILD.gn file that I can't really find in the chromium project. The closest I could find was this file. That file does support ozone_platform_x11. Swapping in this upstream chromium file using a patch, I was able to get down to just two linker errors:

/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/media/gpu/vaapi/common/vaapi_wrapper.o: in function `media::VaapiWrapper::ExportVASurfaceAsNativePixmapDmaBuf(media::ScopedVASurface const&)':
vaapi_wrapper.cc:(.text._ZN5media12VaapiWrapper35ExportVASurfaceAsNativePixmapDmaBufERKNS_15ScopedVASurfaceE+0x706): undefined reference to `gfx::NativePixmapDmaBuf::NativePixmapDmaBuf(gfx::Size const&, gfx::BufferFormat, gfx::NativePixmapHandle)'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/media/gpu/chromeos/common/platform_video_frame_utils.o: in function `media::CreateNativePixmapDmaBuf(media::VideoFrame const*)':
platform_video_frame_utils.cc:(.text._ZN5media24CreateNativePixmapDmaBufEPKNS_10VideoFrameE+0x14c): undefined reference to `gfx::NativePixmapDmaBuf::NativePixmapDmaBuf(gfx::Size const&, gfx::BufferFormat, gfx::NativePixmapHandle)'
/usr/bin/ld: /tmp/makepkg/qt5-webengine/src/build/src/core/release/obj/media/gpu/vaapi/vaapi/vaapi_picture_native_pixmap_egl.o: in function `media::VaapiPictureNativePixmapEgl::Allocate(gfx::BufferFormat)':
vaapi_picture_native_pixmap_egl.cc:(.text._ZN5media27VaapiPictureNativePixmapEgl8AllocateEN3gfx12BufferFormatE+0x1e8): undefined reference to `gfx::NativePixmapDmaBuf::NativePixmapDmaBuf(gfx::Size const&, gfx::BufferFormat, gfx::NativePixmapHandle)'
collect2: error: ld returned 1 exit status

@The-Compiler
Member Author

The-Compiler commented Mar 4, 2022

I've recently talked to QtWebEngine devs about those flags again - both the dev (carewolf) and I actually see reduced performance (especially much more CPU usage) with enable-gpu-rasterization.

@maximbaz I know it's been a while, but if you're still interested in this, maybe you could try using perf to confirm that there's some performance change with that option on still? If it's really a benefit for some but not others for whatever reason, I'd like to expose it as a qutebrowser option - but I'd first like to understand what's going on.

@maximbaz
Contributor

maximbaz commented Mar 4, 2022

Thanks for the ping! I can confirm that it doesn't seem to have any significant effect anymore, CPU usage seems similar, sites that heavily use animations, Google Maps, video playback, all seems just as fast and just as smooth with and without the flag.

@toofar
Member

toofar commented Mar 5, 2022

I did a quick test with a 1080p youtube video, on a desktop (amd open source graphics). I believe I (also) see slightly higher CPU usage with rasterization (and hardware acceleration) turned on. That's just by eyeballing a CPU usage graph while playing it fullscreen in two different instances in sequence without much else going on on the machine. I'm not sure what impact that has on power consumption, which seems to be the only practical use case raised so far, but probably less CPU usage and less GPU usage (because less offload? sure seemed to be less going on in radeontop) means it'll be less power consumption too.

For some less practical musing, I wonder if we can see if it has an effect on latency? There is a "Frame Rendering Stats" overlay in the developer tools and I thiiink I see higher frame rates (just a little bit) with hardware acceleration and rasterization on. But the overlay doesn't seem to be reliably updated so I'm not sure how much I trust it. To turn it on open the three-dot menu on the right hand side of the top bar of the developer tools (it can be tricky to hit on Qt6 because it is mostly behind the close button!) then More Tools -> Rendering and check it on that page
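
The eyeballed CPU comparison above could be made a bit more repeatable by reading accumulated CPU time from /proc instead of watching a graph. The following is a Linux-only sketch, not a replacement for perf; the function names and the process name fragment `QtWebEngine` (the kernel truncates process names to 15 characters) are assumptions.

```python
import os


def cpu_seconds_from_stat(stat_line, ticks_per_second):
    """Return user+system CPU seconds from one /proc/<pid>/stat line."""
    # The process name (field 2) is parenthesised and may contain spaces,
    # so split after the last ")" rather than on whitespace alone.
    rest = stat_line.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])  # fields 14 and 15 of the file
    return (utime + stime) / ticks_per_second


def total_cpu_seconds(name_fragment="QtWebEngine"):
    """Sum CPU seconds over all processes whose name contains name_fragment."""
    ticks = os.sysconf("SC_CLK_TCK")
    total = 0.0
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as stat_file:
                line = stat_file.read()
        except OSError:
            continue  # the process exited while we were scanning
        name = line[line.index("(") + 1:line.rindex(")")]
        if name_fragment in name:
            total += cpu_seconds_from_stat(line, ticks)
    return total


# Made-up stat line for illustration: utime=250 ticks, stime=50 ticks.
sample = "1234 (QtWebEngineProc) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 50 0 0 20 0"
print(cpu_seconds_from_stat(sample, 100))
```

Calling `total_cpu_seconds()` before and after playing the same video for the same duration, once with and once without the flag, gives two numbers that can be compared directly.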

@rien333
Contributor

rien333 commented Feb 9, 2023

vaapi support is finally being worked on!

@rien333
Contributor

rien333 commented Sep 9, 2023

Now that vaapi support is merged (or almost merged?), would it be a good idea to expose a flag to users to enable it?

I have some time on my hands and am thinking about getting back into programming, so I'm happy to implement and test said flag.

@The-Compiler
Member Author

I didn't quite follow the topic - at least for vaapi itself, there doesn't seem to be anything needed to enable it at runtime? Or do you mean enabling QT_XCB_GL_INTEGRATION=xcb_egl? It's not quite clear to me what that does precisely, but I'm also not against it (especially given that you know more about the topic than I do).

@toofar
Member

toofar commented Sep 11, 2023

@rien333 Yeah, vaapi looks to be a build time check based on the presence of libgbm-dev and libva-dev. If you build locally you should see a VA-API support entry under Qt WebEngineCore in config.summary. I don't know if it shows up in chrome://gpu/, my builds don't have it because I didn't realise libgbm was a new dependency (I'll kick off a new build later in the week). Although my chrome://gpu/ already says "Video Decode: Hardware accelerated", so I wonder how we can verify it was actually using it (ltrace?)?
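
For checking the build-time side, a small sketch that scans the text of config.summary for the VA-API entry mentioned above. The exact line layout (entry name, dot leaders, then yes/no) is an assumption based on how Qt's summary usually looks, and the function name `vaapi_in_summary` is hypothetical. Note that this only shows whether support was compiled in, not whether it is used at runtime.

```python
import re


def vaapi_in_summary(summary_text):
    """Return True/False for the 'VA-API support' entry, or None if absent."""
    for line in summary_text.splitlines():
        match = re.match(r"\s*VA-API support\s*\.*\s*(yes|no)\s*$", line)
        if match:
            return match.group(1) == "yes"
    return None


# Illustrative excerpt; a real config.summary contains many more entries.
example = """Qt WebEngineCore:
  VA-API support ......................... yes
"""
print(vaapi_in_summary(example))
```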

Anyway, if no-one else weighs in before then I'll try to make a new build later in the week and attempt to verify if it actually works (ldd?). In the meantime, if you are looking for something else to contribute to you are most welcome! Preferably something from the 3.0.1 milestone (there's one there about tab titles (or is it whole tabs 🤔) not rendering, which I think is similar to an issue you raised a while back). Otherwise if you are up for helping verify any of the pending PRs to update userscripts those are usually pretty standalone.

Update:

Rebuilt 6.6 with va-api support allegedly compiled in.
Since I don't have an intel GPU I found that the GPU independent nvtop can show how much of a GPU is spent on decoding video, like so:
[Screenshot: nvtop showing GPU video decode usage]

That picture is from mpv with the --hwdec=vaapi arg though, not from qutebrowser. An initial test seems to show qute didn't use hardware decoding out of the box. I'll try to play around with Qt args later (and verify my build is correct).

@rien333
Contributor

rien333 commented Sep 11, 2023

Sorry, my earlier comment came from a naive place, mostly just from what I remembered chromium doing. I should've re-read what the Qt people actually did in the end.

vaapi looks to be a build time check based on the presence of libgbm-dev and libva-dev

True. I might or might not be confusing everyone, but IIRC, in chromium, vaapi then still required certain flags for it to be actually used. Hence my posting in this flags-related thread 😊 In any case, because I think no one has messed with this in the context of qutebrowser, I will try and experiment today. If anything, it might be useful to make a "hw accelerated video" entry on the FAQ, since questions about qutebrowser's high CPU usage on videos come up quite often.

Although my chrome://gpu/ already says "Video Decode: Hardware accelerated", so I wonder how we can verify it was actually using it (ltrace?)?

intel_gpu_top 😉 (though it would be nice to check if everything works correctly under amd)

if you are looking for something else to contribute to you are most welcome!

I will have a look!

@toofar
Member

toofar commented Sep 16, 2023

I started a discussion here to continue the investigation into Vaapi support: #7917
If you've got Qt6.6 available (looks like it's in the kde-unstable repo) have a look and see if you can help test.

7 participants