Skip to content

Commit

Permalink
Legendary 4638 shared texture perf improvement
Browse files Browse the repository at this point in the history
This fixes remaining performance and frame pacing issues when using CEF
95 with texture sharing on Windows. I hacked Chromium internally to
treat shared textures similarly to how the 3770 method worked.

Let me document my little adventure.

First, we were getting system freezes, and from analysis of the
bluescreen dumps, they were always during synchronization, so I had a
hunch: "are keyed mutexes doing this?" The system freeze issue left me
hopeless. I already had a disdain for the stupid keyed mutexes, and due
to my immense hatred of them, I was angry and I just wanted to try
removing them, because 1.) What if they were the cause? (They were), and
2.) I hated them anyway. It was an irrational vendetta, and I was on a
war path to remove them just in the *slightest* chance that those god
forsaken keyed mutexes were the cause.

So I got angry and decided to remove them from almost everything in
Chromium just to see if it would fix the system freeze issue. I removed
their usage from almost everything in Chromium related to GLImageDXGI.

Let me go on a rant about keyed mutexes. I hate keyed mutexes. I *want*
synchronization between devices, but what I *don't* want is to be forced
to swap stupid "keys" between the two devices; especially if you're in a
situation where you cannot pass the next key value to the next device so
the next device knows what key to use, because then, the original device
can no longer lock the object anymore, and the object is completely
forfeit. Then you get suck in a situation where you're forced to wait
infinitely if you have no way of telling the other device to use the
correct key. I wish I could suggest a better design, but all I know is
that I hate it, it's an insufferable design as it is right now, and I
don't think there's a single human being on the planet who will ever
convince me otherwise. Absolutely insufferable design. I've always hated
them and always will.

Anyway, sorry about that. Like I was saying, I removed keyed mutex usage
from texture sharing inside chromium. But the problem is that with the
4638 version of shared textures, it was about 5 textures which were used
round-robin. Because we were forced to remove keyed mutexes (which were
crashing the entire system), we could no longer synchronize between
client and Chromium, thus frame pacing issues were introduced. Even
flushing wasn't helping. They weren't noticeable, and we were *almost*
just going to use it as it, but I decided I wasn't finished with my war
path.

I had fixed the system freeze issue by removing keyed mutexes, but now
there was this frame pacing issue. So, I was upset, and I tried many
different techniques to try to synchronize. Flushing, more textures,
less textures, trying to adjust timing; I thought it was hopeless, until
I started looking at the 3770 version and started looking around
4638 code for ways I could do the same thing. I had already removed
keyed mutex objects from GLImageDXGI objects, but then I realized: what
if I just add the same staging/CopyResource method 3770 did, and then
just use one shared texture? 3770 worked amazingly well, so why not try
it? Through much toil and experimentation, I got it working.

However, there was still one annoying caveat. Because of the fact that
Chromium now only deals with NT-style HANDLE objects for shared
textures, it would duplicate handles every time it was passed. There was
no way of detecting whether we were already using the same shared
texture (and CompareObjectHandles in KernelBase didn't work), so we had
to recreate the stupid texture object every time. So I introduced a new
OnAcceleratedPaint2 function specifically to specify whether the texture
has been changed -- that way we don't need to have to continually keep
recreating the god-forsaken texture object.

All these things combined has won us a huge victory, not only did we
solve the system freeze issue, but we also reduced the amount of
resources being used from the original version by removing the round
robin, eliminated frame pacing issues, and improved the perf back to
3770 levels. In fact, I'm pretty sure that perf levels are even better
than they were even with the keyed mutex version (if they weren't
causing stupid system freezes), because the keyed mutex version forced
INFINITE lock durations due to the inability to relay keys.

After 27.2 had this issue, I had to delay the release, and I spent a
week toiling over this. To get the system freeze issue fixed, and then
to get perf levels back to 3770 levels. And I did it by sifting through
millions of lines of Chromium code.

Needless to say I feel really damn good right now. This was a legendary
fix. I'm sorry, I need a little bit of ego just for once. I feel like
I've earned it.
  • Loading branch information
jp9000 committed Feb 20, 2022
1 parent 0fcb7eb commit 36fdac1
Show file tree
Hide file tree
Showing 13 changed files with 203 additions and 53 deletions.
7 changes: 7 additions & 0 deletions include/capi/cef_render_handler_capi.h
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,13 @@ typedef struct _cef_render_handler_t {
cef_rect_t const* dirtyRects,
void* shared_handle);

void(CEF_CALLBACK* on_accelerated_paint2)(struct _cef_render_handler_t* self,
struct _cef_browser_t* browser,
cef_paint_element_type_t type,
size_t dirtyRectsCount,
cef_rect_t const* dirtyRects,
void* shared_handle,
bool new_texture);
///
// Called when the user starts dragging content in the web view. Contextual
// information about the dragged content is supplied by |drag_data|. (|x|,
Expand Down
10 changes: 5 additions & 5 deletions include/cef_api_hash.h
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// Copyright (c) 2021 Marshall A. Greenblatt. All rights reserved.
// Copyright (c) 2022 Marshall A. Greenblatt. All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
Expand Down Expand Up @@ -42,13 +42,13 @@
// way that may cause binary incompatibility with other builds. The universal
// hash value will change if any platform is affected whereas the platform hash
// values will change only if that particular platform is affected.
#define CEF_API_HASH_UNIVERSAL "670fa90fc98b659eeed5e6319cc9e41a0ed6beeb"
#define CEF_API_HASH_UNIVERSAL "d8e8564cc94a347224b8c490da315c7d4d8a0215"
#if defined(OS_WIN)
#define CEF_API_HASH_PLATFORM "3e58d20ee6030d7a0ebad757e84689aff7f418ee"
#define CEF_API_HASH_PLATFORM "1d2b58daf1f3c4840050e1dea2a578dc94d8bdb0"
#elif defined(OS_MAC)
#define CEF_API_HASH_PLATFORM "f2aa53d5f4a5ae2f27575d4ab0bfac9b70e3b25a"
#define CEF_API_HASH_PLATFORM "057f6b101288803d5b6ff1c9efa9074c7dae0fd3"
#elif defined(OS_LINUX)
#define CEF_API_HASH_PLATFORM "ab0e6c425fd066ec7377cd9cd79d98cd556ef996"
#define CEF_API_HASH_PLATFORM "8793d9151aa2d32306f17381c69b9445c76e0748"
#endif

#ifdef __cplusplus
Expand Down
6 changes: 6 additions & 0 deletions include/cef_render_handler.h
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,12 @@ class CefRenderHandler : public virtual CefBaseRefCounted {
const RectList& dirtyRects,
void* shared_handle) {}

virtual void OnAcceleratedPaint2(CefRefPtr<CefBrowser> browser,
PaintElementType type,
const RectList& dirtyRects,
void* shared_handle,
bool new_texture) {}

///
// Called when the user starts dragging content in the web view. Contextual
// information about the dragged content is supplied by |drag_data|.
Expand Down
2 changes: 1 addition & 1 deletion libcef/browser/osr/external_renderer_updater.mojom
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,5 @@ import "ui/gfx/geometry/mojom/geometry.mojom";
import "ui/gfx/mojom/buffer_types.mojom";

interface ExternalRendererUpdater {
OnAfterFlip(gfx.mojom.GpuMemoryBufferHandle buffer, gfx.mojom.Rect damage_rect) => ();
OnAfterFlip(gfx.mojom.GpuMemoryBufferHandle buffer, bool new_texture, gfx.mojom.Rect damage_rect) => ();
};
34 changes: 7 additions & 27 deletions libcef/browser/osr/gl_output_surface_external.cc
Original file line number Diff line number Diff line change
Expand Up @@ -151,12 +151,6 @@ void GLOutputSurfaceExternal::EnsureBackbuffer() {
if (size_.IsEmpty())
return;

if (!available_surfaces_.empty()) {
current_surface_ = std::move(available_surfaces_.back());
available_surfaces_.pop_back();
return;
}

gpu::gles2::GLES2Interface* gl = context_provider_->ContextGL();

const int max_texture_size =
Expand All @@ -168,19 +162,15 @@ void GLOutputSurfaceExternal::EnsureBackbuffer() {
gl, context_provider_->ContextCapabilities());
current_surface_->Create(texture_size, color_space_,
gpu_memory_buffer_manager_);
new_texture = true;

if (!fbo_) {
gl->GenFramebuffers(1, &fbo_);
}
}

void GLOutputSurfaceExternal::DiscardBackbuffer() {
displayed_surface_.reset();
displaying_surface_.reset();
current_surface_.reset();
for (auto& surface : in_flight_surfaces_)
surface = nullptr;
available_surfaces_.clear();

gpu::gles2::GLES2Interface* gl = context_provider_->ContextGL();

Expand Down Expand Up @@ -235,34 +225,24 @@ void GLOutputSurfaceExternal::SwapBuffers(OutputSurfaceFrame frame) {

void GLOutputSurfaceExternal::OnSyncWaitComplete(
std::vector<ui::LatencyInfo> latency_info) {
gfx::GpuMemoryBufferHandle handle = current_surface_->GetHandle();
gfx::GpuMemoryBufferHandle handle;
if (new_texture)
handle = current_surface_->GetHandle();

in_flight_surfaces_.push_back(std::move(current_surface_));

if (handle.type != gfx::GpuMemoryBufferType::EMPTY_BUFFER) {
if (current_surface_) {
external_renderer_updater_->OnAfterFlip(
std::move(handle), gfx::Rect(size_),
std::move(handle), new_texture, gfx::Rect(size_),
base::BindOnce(&GLOutputSurfaceExternal::OnAfterSwap,
weak_ptr_factory_.GetWeakPtr(),
std::move(latency_info)));
new_texture = false;
} else {
OnAfterSwap(latency_info);
}
}

void GLOutputSurfaceExternal::OnAfterSwap(
std::vector<ui::LatencyInfo> latency_info) {
if (in_flight_surfaces_.front()) {
if (displayed_surface_) {
available_surfaces_.push_back(std::move(displayed_surface_));
}
if (displaying_surface_) {
displayed_surface_ = std::move(displaying_surface_);
}
displaying_surface_ = std::move(in_flight_surfaces_.front());
}
in_flight_surfaces_.pop_front();

latency_tracker()->OnGpuSwapBuffersCompleted(latency_info);
// Swap timings are not available since for offscreen there is no Swap, just a
// SignalSyncToken. We use base::TimeTicks::Now() as an overestimate.
Expand Down
5 changes: 1 addition & 4 deletions libcef/browser/osr/gl_output_surface_external.h
Original file line number Diff line number Diff line change
Expand Up @@ -48,10 +48,7 @@ class VIZ_SERVICE_EXPORT GLOutputSurfaceExternal : public GLOutputSurface {
ExternalImageData* CreateSurface();

std::unique_ptr<ExternalImageData> current_surface_;
std::unique_ptr<ExternalImageData> displaying_surface_;
std::unique_ptr<ExternalImageData> displayed_surface_;
std::vector<std::unique_ptr<ExternalImageData>> available_surfaces_;
base::circular_deque<std::unique_ptr<ExternalImageData>> in_flight_surfaces_;
bool new_texture = false;

uint32_t fbo_ = 0;
gfx::Size size_;
Expand Down
15 changes: 10 additions & 5 deletions libcef/browser/osr/host_display_client_osr.cc
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ class CefExternalRendererUpdaterOSR

// viz::mojom::ExternalRendererUpdater implementation.
void OnAfterFlip(gfx::GpuMemoryBufferHandle handle,
bool new_texture,
const gfx::Rect& damage_rect,
OnAfterFlipCallback callback) override;

Expand All @@ -53,16 +54,20 @@ CefExternalRendererUpdaterOSR::~CefExternalRendererUpdaterOSR() = default;

void CefExternalRendererUpdaterOSR::OnAfterFlip(
gfx::GpuMemoryBufferHandle handle,
bool new_texture,
const gfx::Rect& damage_rect,
OnAfterFlipCallback callback) {
#if defined (OS_WIN) && !defined(ARCH_CPU_ARM_FAMILY)
HANDLE nthandle = handle.dxgi_handle.Get();
view_->OnAcceleratedPaint(damage_rect, nthandle);
HANDLE nthandle = nullptr;
if (new_texture)
nthandle = handle.dxgi_handle.Get();
view_->OnAcceleratedPaint2(damage_rect, nthandle, new_texture);
#elif defined(OS_MAC)
view_->OnAcceleratedPaint(damage_rect,
(void*)(uintptr_t)(handle.io_surface.get()));
view_->OnAcceleratedPaint2(damage_rect,
(void*)(uintptr_t)(handle.io_surface.get()),
new_texture);
#else
view_->OnAcceleratedPaint(damage_rect, nullptr);
view_->OnAcceleratedPaint2(damage_rect, nullptr, false);
#endif
std::move(callback).Run();
}
Expand Down
30 changes: 30 additions & 0 deletions libcef/browser/osr/render_widget_host_view_osr.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1497,6 +1497,36 @@ void CefRenderWidgetHostViewOSR::OnAcceleratedPaint(
ReleaseResizeHold();
}

void CefRenderWidgetHostViewOSR::OnAcceleratedPaint2(
const gfx::Rect& damage_rect,
void* shared_texture,
bool new_texture) {
if (!use_shared_texture_) {
return;
}

CefRefPtr<CefRenderHandler> handler =
browser_impl_->client()->GetRenderHandler();
CHECK(handler);

gfx::Rect rect_in_pixels(GetViewBounds());
rect_in_pixels.Intersect(damage_rect);

CefRenderHandler::RectList rcList;
rcList.push_back(CefRect(rect_in_pixels.x(), rect_in_pixels.y(),
rect_in_pixels.width(), rect_in_pixels.height()));

handler->OnAcceleratedPaint2(browser_impl_.get(),
IsPopupWidget() ? PET_POPUP : PET_VIEW, rcList,
shared_texture, new_texture);

// Release the resize hold when we reach the desired size. The damage rect is
// currently always equal to the full texture size.
gfx::Size pixel_size(damage_rect.width(), damage_rect.height());
if (hold_resize_ && pixel_size == SizeInPixels())
ReleaseResizeHold();
}

ui::Layer* CefRenderWidgetHostViewOSR::GetRootLayer() const {
return root_layer_.get();
}
Expand Down
3 changes: 3 additions & 0 deletions libcef/browser/osr/render_widget_host_view_osr.h
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,9 @@ class CefRenderWidgetHostViewOSR
const gfx::Size& pixel_size,
const void* pixels);
void OnAcceleratedPaint(const gfx::Rect& damage_rect, void* shared_texture);
void OnAcceleratedPaint2(const gfx::Rect& damage_rect,
void* shared_texture,
bool new_texture);

void OnBeginFame(base::TimeTicks frame_time);

Expand Down
38 changes: 38 additions & 0 deletions libcef_dll/cpptoc/render_handler_cpptoc.cc
Original file line number Diff line number Diff line change
Expand Up @@ -324,6 +324,43 @@ render_handler_on_accelerated_paint(struct _cef_render_handler_t* self,
CefBrowserCToCpp::Wrap(browser), type, dirtyRectsList, shared_handle);
}

void CEF_CALLBACK
render_handler_on_accelerated_paint2(struct _cef_render_handler_t* self,
cef_browser_t* browser,
cef_paint_element_type_t type,
size_t dirtyRectsCount,
cef_rect_t const* dirtyRects,
void* shared_handle,
bool new_texture) {
shutdown_checker::AssertNotShutdown();

DCHECK(self);
if (!self)
return;
// Verify param: browser; type: refptr_diff
DCHECK(browser);
if (!browser)
return;
// Verify param: dirtyRects; type: simple_vec_byref_const
DCHECK(dirtyRectsCount == 0 || dirtyRects);
if (dirtyRectsCount > 0 && !dirtyRects)
return;

// Translate param: dirtyRects; type: simple_vec_byref_const
std::vector<CefRect> dirtyRectsList;
if (dirtyRectsCount > 0) {
for (size_t i = 0; i < dirtyRectsCount; ++i) {
CefRect dirtyRectsVal = dirtyRects[i];
dirtyRectsList.push_back(dirtyRectsVal);
}
}

// Execute
CefRenderHandlerCppToC::Get(self)->OnAcceleratedPaint2(
CefBrowserCToCpp::Wrap(browser), type, dirtyRectsList, shared_handle,
new_texture);
}

int CEF_CALLBACK
render_handler_start_dragging(struct _cef_render_handler_t* self,
cef_browser_t* browser,
Expand Down Expand Up @@ -504,6 +541,7 @@ CefRenderHandlerCppToC::CefRenderHandlerCppToC() {
GetStruct()->on_popup_size = render_handler_on_popup_size;
GetStruct()->on_paint = render_handler_on_paint;
GetStruct()->on_accelerated_paint = render_handler_on_accelerated_paint;
GetStruct()->on_accelerated_paint2 = render_handler_on_accelerated_paint2;
GetStruct()->start_dragging = render_handler_start_dragging;
GetStruct()->update_drag_cursor = render_handler_update_drag_cursor;
GetStruct()->on_scroll_offset_changed =
Expand Down
42 changes: 42 additions & 0 deletions libcef_dll/ctocpp/render_handler_ctocpp.cc
Original file line number Diff line number Diff line change
Expand Up @@ -265,6 +265,48 @@ void CefRenderHandlerCToCpp::OnAcceleratedPaint(CefRefPtr<CefBrowser> browser,
delete[] dirtyRectsList;
}

NO_SANITIZE("cfi-icall")
void CefRenderHandlerCToCpp::OnAcceleratedPaint2(CefRefPtr<CefBrowser> browser,
PaintElementType type,
const RectList& dirtyRects,
void* shared_handle,
bool new_texture) {
shutdown_checker::AssertNotShutdown();

cef_render_handler_t* _struct = GetStruct();
if (CEF_MEMBER_MISSING(_struct, on_accelerated_paint))
return;

// AUTO-GENERATED CONTENT - DELETE THIS COMMENT BEFORE MODIFYING

// Verify param: browser; type: refptr_diff
DCHECK(browser.get());
if (!browser.get())
return;

// Translate param: dirtyRects; type: simple_vec_byref_const
const size_t dirtyRectsCount = dirtyRects.size();
cef_rect_t* dirtyRectsList = NULL;
if (dirtyRectsCount > 0) {
dirtyRectsList = new cef_rect_t[dirtyRectsCount];
DCHECK(dirtyRectsList);
if (dirtyRectsList) {
for (size_t i = 0; i < dirtyRectsCount; ++i) {
dirtyRectsList[i] = dirtyRects[i];
}
}
}

// Execute
_struct->on_accelerated_paint2(_struct, CefBrowserCppToC::Wrap(browser), type,
dirtyRectsCount, dirtyRectsList, shared_handle,
new_texture);

// Restore param:dirtyRects; type: simple_vec_byref_const
if (dirtyRectsList)
delete[] dirtyRectsList;
}

NO_SANITIZE("cfi-icall")
bool CefRenderHandlerCToCpp::StartDragging(CefRefPtr<CefBrowser> browser,
CefRefPtr<CefDragData> drag_data,
Expand Down
5 changes: 5 additions & 0 deletions libcef_dll/ctocpp/render_handler_ctocpp.h
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,11 @@ class CefRenderHandlerCToCpp
PaintElementType type,
const RectList& dirtyRects,
void* shared_handle) override;
void OnAcceleratedPaint2(CefRefPtr<CefBrowser> browser,
PaintElementType type,
const RectList& dirtyRects,
void* shared_handle,
bool new_texture) override;
bool StartDragging(CefRefPtr<CefBrowser> browser,
CefRefPtr<CefDragData> drag_data,
DragOperationsMask allowed_ops,
Expand Down
Loading

2 comments on commit 36fdac1

@defgsus
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there! I'm not really sure what you did here, but your commit message deserves respect already ;-)
It would be nice if github allowed to star commits. Yes. you should feel really damn good!
(found it through good-github)

@jtorjo
Copy link

@jtorjo jtorjo commented on 36fdac1 Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maaan, you are a legend!
Having spent some time myself on Chromium, and countless hours building and rebuilding different branches, what you've done is AMAZING!
Congrats!

Please sign in to comment.