Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow rw locking of texture caches #2433

Merged
merged 1 commit into from
Dec 18, 2019

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Dec 18, 2019

To reduce thread contention on the IC/TS caches, change from unique
locks to reader-writer locks.

  • unordered_map_concurrent: Change the shards from spin_lock to
    spin_rw_lock. Most operations (insert, erase, holding an iterator,
    and find, which returns an iterator) are unique (write) locks. But
    the underappreciated retrieve() method is stateless and can be a reader
    lock!

  • In imagecache.cpp, change a bunch of the find() calls and fiddling
    with iterators into simpler calls to retrieve(), which should be
    able to happen concurrently.

On a highly threaded machine (52 cores) this resulted in a 6% gain
(geometric mean) across a set of production render frames that we like
to test with, but that disguises what's really happening. Most of the
individual tests in that suite don't change at all (or vary within the
+/- ~1% we expect from timing noise and run-to-run variation. But a
handful of the tests sped up by 20-40%! So I think you are unlikely
to see any change for a few threads, or even many threads for most
situations. But for highly threaded renderings where the access pattern
really results in the cache locks as a major rendering bottleneck,
this should help a lot.

To reduce thread contention on the IC/TS caches, change from unique
locks to reader-writer locks.

* unordered_map_concurrent: Change the shards from spin_lock to
  spin_rw_lock. Most operations (insert, erase, holding an iterator,
  and find, which returns an iterator) are unique (write) locks. But
  the underappreciated retrieve() method is stateless can be a reader
  lock!

* In imagecache.cpp, change a bunch of the find() calls and fiddling
  with iterators into simpler calls to retrieve(), which should be
  able to happen concurrently.

On a highly threaded machine (52 cores) this resulted in a 6% gain
(geometric mean) across a set of production render frames that we like
to test with, but that disguises what's really happening.  Most of the
individual tests in that suite don't change at all (or vary within the
+/- ~1% we expect from timing noise and run-to-run variation.  But a
handful of the tests sped up by 20-40%! So I think you are unlikely
to see any change for a few threads, or even many threads for most
situations. But for highly threaded renderings where the access pattern
really resulting in the cache locks as a major rendering bottleneck,
this should help a lot.
@fpsunflower
Copy link
Contributor

LGTM!

@lgritz
Copy link
Collaborator Author

lgritz commented Dec 18, 2019

ok, here goes nothing...

@lgritz lgritz merged commit b3f6a68 into AcademySoftwareFoundation:master Dec 18, 2019
@lgritz lgritz deleted the lg-rw branch December 18, 2019 23:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants