improve parallel scalability #36

Closed
mmp opened this issue Oct 4, 2015 · 2 comments

Comments

@mmp
Owner

mmp commented Oct 4, 2015

I've done some benchmarks of parallel scalability on a 16-core machine, using 1, 2, 4, 8, and 16 threads (for all of the integrators besides Whitted and direct lighting). This spreadsheet summarizes the results (for the moderately complex "breakfast" scene, to come in the pbrt-v3 scenes distribution). For the benchmarks, I modified Film::WriteImage() to return immediately, so that the time measured in the Render() methods didn't include image output.

https://docs.google.com/spreadsheets/d/10gbn0kmaZ-DGq1URB6Yw-jbiaCyaPQyuim0IjcL-zcE/edit?usp=sharing

Interestingly enough, scalability for everything but SPPM is very nearly the same: a not impressive 1.9x with two cores, up to ~12.7x with 16 cores. SPPM is broken into the camera pass, photon pass, and statistics update pass; there, scalability is slightly worse. (Though SPPM does some construction and updating of shared data structures, so it's reasonable that it's a bit worse...)
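
For reference, the two speedup figures quoted above can be turned into parallel efficiency and into the serial fraction that Amdahl's law would imply. This is only a sketch: the function names are hypothetical (not part of pbrt), and Amdahl's law is an assumed model, not something the benchmark itself establishes.

```cpp
#include <cassert>

// Parallel efficiency: measured speedup divided by the ideal (linear) speedup.
double ParallelEfficiency(double speedup, int nThreads) {
    assert(speedup > 0 && nThreads > 0);
    return speedup / nThreads;
}

// Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / n),
// solved for f: f = (n / S - 1) / (n - 1). Only meaningful for n > 1.
double AmdahlSerialFraction(double speedup, int nThreads) {
    assert(speedup > 0 && nThreads > 1);
    return (static_cast<double>(nThreads) / speedup - 1.0) / (nThreads - 1);
}
```

Plugging in the numbers from the post, `ParallelEfficiency(1.9, 2)` is 0.95 and `ParallelEfficiency(12.7, 16)` is roughly 0.79. The implied serial fractions are roughly 5% at 2 threads and roughly 1.7% at 16 threads; the fact that they differ may just mean that a single fixed serial fraction is too crude a model for these measurements.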

@wjakob
Collaborator

wjakob commented Oct 5, 2015

Hi Matt,

Why did you write "not impressive" for path tracing etc.? I think 1.9 is pretty decent given other factors like memory traffic, job distribution, etc.

The issue with SPPM resembles my own scalability benchmarks with Mitsuba. A better way to parallelize things would be to use the Knaus & Zwicker-style iterations where each photon map pass uses a globally uniform photon radius. This removes some interdependences so that each thread can do its own photon map pass. Maybe something for v4? (With things going the way they are now, parallelism should be an even bigger deal a few years down the road.)

Thanks,
Wenzel
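
A minimal sketch of why a globally uniform radius decouples the passes, assuming the radius reduction rule usually attributed to Knaus and Zwicker's formulation, r_{i+1}^2 = r_i^2 (i + alpha) / (i + 1) with 0 < alpha < 1. That rule, the function name, and the parameters are assumptions for illustration; none of this is taken from pbrt or Mitsuba.

```cpp
#include <cassert>

// Squared search radius for photon pass `pass` (1-based), given the initial
// radius r1 and the reduction parameter alpha. Assumed rule:
//   r_{i+1}^2 = r_i^2 * (i + alpha) / (i + 1)
// The result depends only on the pass index, never on per-pixel statistics
// gathered in earlier passes.
double GlobalRadiusSquared(double r1, double alpha, int pass) {
    assert(r1 > 0 && alpha > 0 && alpha < 1 && pass >= 1);
    double r2 = r1 * r1;
    for (int i = 1; i < pass; ++i)
        r2 *= (i + alpha) / (i + 1.0);
    return r2;
}
```

Because the radius for pass i can be computed up front from i alone, a thread could in principle take a pass index, build and query its own photon map with that radius, and only synchronize when its result is accumulated into the image.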

@mmp
Owner Author

mmp commented Dec 8, 2015

Commit 74cab9e, which I'd hoped would help a bit with this, barely moved the needle. SPPM grid construction is now 1.98x faster with 16 cores than 1 core, which is an improvement from the 1.40x before, but there is still a ways to go!

@wjakob I dunno, maybe this is reasonable, but I feel like it could be better. The data is almost entirely read-only, there's a lot of compute and a lot of independent jobs, etc. (Or, coming at it from a different angle, no one has yet carefully run it through a profiler to look for unexpected false sharing or other things that can meaningfully degrade scalability, so it's likely there are other unknown issues that could be tuned up...)
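
To illustrate the kind of false sharing a profiler might turn up, here is a sketch of per-thread counters padded so that each one occupies its own cache line. The 64-byte line size, the type name, and the helper function are assumptions for illustration, not code from pbrt; the `std::vector` of an over-aligned type assumes C++17 aligned allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 64-byte cache lines, typical of current x86-64 CPUs.
constexpr std::size_t kCacheLineSize = 64;

// One counter per worker thread. alignas pads the struct to a full cache
// line, so two threads updating neighboring counters never write to the
// same line.
struct alignas(kCacheLineSize) PaddedCounter {
    std::int64_t value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "each counter should fill exactly one cache line");

// Combine the per-thread counters once all workers have finished.
std::int64_t TotalCount(const std::vector<PaddedCounter> &counters) {
    std::int64_t total = 0;
    for (const PaddedCounter &c : counters) total += c.value;
    return total;
}
```

In use, worker thread t would increment only `counters[t].value`, and `TotalCount` would run after the threads are joined. Without the padding, eight such 8-byte counters would share one 64-byte line, and every increment could invalidate that line in the other cores' caches even though no data is logically shared.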

@mmp mmp closed this as completed Mar 29, 2018