PoC: multithreaded chunked model CPU computation #1837

illwieckz · 2025-10-01T00:38:33Z

Chunked version of:

PoC: multithreaded model CPU computation #1833

For unknown reasons, only one thread is spawned.

illwieckz · 2025-10-01T00:43:33Z

@slipher here is my chunked variant, but it doesn't spawn more than one thread.

Actually, it could be possible to write a custom dispatch function (I don't know how to do it).

illwieckz · 2025-10-01T02:00:04Z

I added a commit that uses standard threading functions instead of __gnu_parallel::for_each() or std::for_each( std::execution::par, …). That would allow us to not require libgomp or libtbb, but it still runs as if everything was done sequentially.

I added a logger to check things and things look correct. I don't know what is missing.
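For illustration, here is a minimal sketch of what chunked dispatch with plain standard threads can look like. It is not the code from the commit: `ForEachChunk` and its parameters are hypothetical names, and the sketch assumes the work on each index range is independent of the others.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not from the PR): splits [0, count) into contiguous
// chunks and runs work(begin, end) for each chunk on its own std::thread.
void ForEachChunk(std::size_t count, std::size_t numThreads,
                  const std::function<void(std::size_t, std::size_t)>& work)
{
    if (count == 0)
    {
        return;
    }

    // Never use more threads than items, and always use at least one.
    numThreads = std::max<std::size_t>(1, std::min(numThreads, count));

    // Round up so that the last chunk absorbs the remainder.
    const std::size_t chunkSize = (count + numThreads - 1) / numThreads;

    std::vector<std::thread> threads;
    threads.reserve(numThreads);

    for (std::size_t begin = 0; begin < count; begin += chunkSize)
    {
        const std::size_t end = std::min(begin + chunkSize, count);
        threads.emplace_back([&work, begin, end] { work(begin, end); });
    }

    // The threads are created and destroyed on every call: nothing is reused.
    for (std::thread& thread : threads)
    {
        thread.join();
    }
}
```

A caller would pass a lambda that loops over `[begin, end)` and touches only its own slice of the data. Spawning and joining threads on every call has a cost of its own, which can easily dominate when each chunk is small.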

illwieckz · 2025-10-01T02:45:39Z

Hmm, one drawback of doing it that way is that, unlike with OpenMP, threads are not reused. Profilers like Orbit then list thousands of threads: not only is that amount crazy to list, but the profiling is just meh because statistics are computed for each thread separately. It would be cool to be able to reuse those threads.

illwieckz · 2025-10-01T02:46:02Z

But at least, if we could get the current implementation working that would be a start.

illwieckz · 2025-10-01T03:02:43Z

Hmm, actually, it seems to work with my custom thread start. It's just so inefficient that performance looks as if nothing was done. When switching from 1 thread to 2 I do see a performance difference, but it's really bad compared to OpenMP.

But now, I don't get why this chunked implementation doesn't work with OpenMP.

illwieckz · 2025-10-01T03:34:13Z

I got it working with OpenMP. I had to use another syntax (which in fact skips my useless vector trick). I now get 438 fps with 16 threads, which is much faster! So yes, my idea of controlling the way it is chunked was good!
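The exact syntax used in the commit is not shown in this thread. As a generic illustration of controlling chunking in OpenMP, a `parallel for` with an explicit `schedule(static, chunk_size)` clause hands out fixed-size blocks of iterations to the threads. The function name and the per-element work below are hypothetical stand-ins, not the model code.

```cpp
#include <vector>

// Hypothetical stand-in for per-vertex work: scales every value in place.
// chunkSize must be positive; it is the number of consecutive iterations
// handed to a thread at a time.
void ScaleValues(std::vector<float>& values, float factor, int chunkSize)
{
    const long count = static_cast<long>(values.size());

    // When built without OpenMP support the pragma is ignored and the loop
    // simply runs sequentially with the same result.
    #pragma omp parallel for schedule(static, chunkSize)
    for (long i = 0; i < count; ++i)
    {
        values[i] *= factor;
    }
}
```

The OpenMP runtime normally keeps its worker threads alive between parallel regions, so repeated calls do not pay for thread creation each time.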

illwieckz · 2025-10-01T03:57:14Z

With this implementation the CPU code is now as fast as the GPU code running on the CPU with the llvmpipe software renderer. With LIBGL_ALWAYS_SOFTWARE=1 I get the exact same framerate whether I enable r_vboVertexSkinning or disable it. Before, the difference wasn't big; now I see none.

illwieckz · 2025-10-01T19:33:46Z

The experiment was a success. I'm closing this and will submit a completed and cleaned-up branch later.

illwieckz added 4 commits September 30, 2025 01:14

cmake: enable C++17 in engine

593e66d

cmake: enable OpenMP in engine

66bac18

PoC: multithreaded model CPU computation

3731e39

PoC: IQM model CPU chunked processing

8c6aa41

illwieckz marked this pull request as draft October 1, 2025 00:38

illwieckz force-pushed the illwieckz/poc-multithread-cpu-model-gnu-chunk branch 3 times, most recently from d250969 to 8c6aa41 Compare October 1, 2025 00:42

WIP: threads

564e26e

comment

f976903

illwieckz force-pushed the illwieckz/poc-multithread-cpu-model-gnu-chunk branch from 50dccf8 to f976903 Compare October 1, 2025 02:53

illwieckz added 5 commits October 1, 2025 05:11

cargo cult

701cae3

Revert "cargo cult"

d562092

This reverts commit 701cae3.

Revert "comment"

ed3fdf9

This reverts commit f976903.

Revert "WIP: threads"

8084578

This reverts commit 564e26e.

get it working with openmp

1efa631

cleanup

e36bb4a

illwieckz closed this Oct 1, 2025

illwieckz deleted the illwieckz/poc-multithread-cpu-model-gnu-chunk branch October 1, 2025 19:33

illwieckz mentioned this pull request Oct 2, 2025

multithread the MD5/IQM model CPU code using OpenMP #1838

Open
