Multi-threading! #56

Open · rcombs opened this issue Mar 1, 2014 · 5 comments · May be fixed by #636
Comments

rcombs (Member) commented Mar 1, 2014

Multi-threaded rendering would almost certainly be a performance boost for libass, but we need to answer a lot of questions before we start work, including:

  • At what stage(s?) do we use threads? Per-event? Per-glyph?
  • Which thread library do we use? We need cross-platform support.
  • How will this affect caching? What do we need to do to use mutexes efficiently, avoid deadlocks, etc.?
  • Will this have any impact on multithreaded video decoding? If so, how can it be mitigated?
  • How should we determine how many threads to use, especially when considering video decoders, already-multithreaded players, and potential render-in-advance scenarios?
  • How will this interact with #55 (Provide a way to render ahead-of-time) in general?
ghost commented Mar 1, 2014

At what stage(s?) do we use threads? Per-event? Per-glyph?

IMO both per-event and per-glyph sound promising. We should probably profile first to see what takes the most time. My guess is rasterization and Gaussian blur?

Which thread library do we use? We need cross-platform support.

pthreads. If you dislike the pthreads-w32 dependency and want to make extra work for yourself, you could still write a minimal pthread wrapper like x264 and FFmpeg do.
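
For reference, a minimal sketch of what such a wrapper could look like, mapping a tiny thread/mutex API onto pthreads or Win32 primitives. This is only an illustration in the spirit of the x264/FFmpeg shims; the ass_thread_*/ass_mutex_* names are made up and are not libass API:

    /* thread_compat.h -- illustrative pthread shim, not real libass code */
    #ifndef THREAD_COMPAT_H
    #define THREAD_COMPAT_H

    #ifdef _WIN32
    #include <windows.h>
    #include <process.h>

    typedef struct { HANDLE h; void *(*func)(void *); void *arg; void *ret; } ass_thread_t;
    typedef CRITICAL_SECTION ass_mutex_t;

    /* Win32 threads return unsigned, pthreads return void *; bridge the two. */
    static unsigned __stdcall ass_thread_trampoline(void *p)
    {
        ass_thread_t *t = p;
        t->ret = t->func(t->arg);
        return 0;
    }

    static int ass_thread_create(ass_thread_t *t, void *(*func)(void *), void *arg)
    {
        t->func = func;
        t->arg = arg;
        t->h = (HANDLE)_beginthreadex(NULL, 0, ass_thread_trampoline, t, 0, NULL);
        return t->h ? 0 : -1;
    }

    static int ass_thread_join(ass_thread_t *t, void **ret)
    {
        WaitForSingleObject(t->h, INFINITE);
        CloseHandle(t->h);
        if (ret)
            *ret = t->ret;
        return 0;
    }

    #define ass_mutex_init(m)    InitializeCriticalSection(m)
    #define ass_mutex_lock(m)    EnterCriticalSection(m)
    #define ass_mutex_unlock(m)  LeaveCriticalSection(m)
    #define ass_mutex_destroy(m) DeleteCriticalSection(m)

    #else /* POSIX: just forward to pthreads */
    #include <pthread.h>

    typedef pthread_t       ass_thread_t;
    typedef pthread_mutex_t ass_mutex_t;

    #define ass_thread_create(t, func, arg) pthread_create(t, NULL, func, arg)
    #define ass_thread_join(t, ret)         pthread_join(*(t), ret)
    #define ass_mutex_init(m)               pthread_mutex_init(m, NULL)
    #define ass_mutex_lock(m)               pthread_mutex_lock(m)
    #define ass_mutex_unlock(m)             pthread_mutex_unlock(m)
    #define ass_mutex_destroy(m)            pthread_mutex_destroy(m)
    #endif

    #endif /* THREAD_COMPAT_H */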

How will this affect caching? What do we need to do to use mutexes efficiently, avoid deadlocks, etc.?

Avoiding deadlocks is trivial: keep a clear lock hierarchy, and always acquire locks in the same order according to the hierarchy. (Making a hierarchy that works is less trivial...)
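
As an illustration of such a hierarchy, here is a hedged two-level example with hypothetical cache_lock/event_lock names; the only rule is that level 1 is always taken before level 2:

    #include <pthread.h>

    static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER; /* level 1 */
    static pthread_mutex_t event_lock = PTHREAD_MUTEX_INITIALIZER; /* level 2 */

    static void update_cache_and_event(void)
    {
        pthread_mutex_lock(&cache_lock);   /* level 1 first... */
        pthread_mutex_lock(&event_lock);   /* ...then level 2 */
        /* touch shared cache and per-event state here */
        pthread_mutex_unlock(&event_lock); /* release in reverse order */
        pthread_mutex_unlock(&cache_lock);
    }

    static void update_event_only(void)
    {
        /* Never take cache_lock while already holding event_lock; if the
         * cache is needed, drop event_lock and restart from level 1. */
        pthread_mutex_lock(&event_lock);
        /* per-event state only */
        pthread_mutex_unlock(&event_lock);
    }

Because every path that needs both locks acquires them in the same order, no cycle of waiting threads (and hence no deadlock) can form.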

How should we determine how many threads to use, especially when considering video decoders, already-multithreaded players, and potential render-in-advance scenarios?

One thread per core sounds good for something CPU-bound. Yes, this could slow down other operations running in the background (like video decoding), but in the end you only have limited CPU power. I guess it's better to avoid too many threads if the overhead grows too fast...
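
A rough sketch of how a "one thread per logical core" default could be detected at runtime; the calls below are the usual ones on POSIX-like systems and Windows, but this is just an illustration, not libass code:

    #ifdef _WIN32
    #include <windows.h>
    #else
    #include <unistd.h>
    #endif

    /* Default to one thread per logical core; fall back to 1 if detection fails. */
    static int default_thread_count(void)
    {
    #ifdef _WIN32
        SYSTEM_INFO si;
        GetSystemInfo(&si);
        return si.dwNumberOfProcessors > 0 ? (int)si.dwNumberOfProcessors : 1;
    #else
        long n = sysconf(_SC_NPROCESSORS_ONLN); /* not strictly POSIX, but widely available */
        return n > 0 ? (int)n : 1;
    #endif
    }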

grigorig (Member) commented Mar 1, 2014

At what stage(s?) do we use threads? Per-event? Per-glyph?

Events are completely independent of each other, so they look like a good candidate. But there are issues with caching, and we might end up rendering things twice. Extensive locking would also slow things down unreasonably. Years ago, I experimented with multithreading a bit and split up the rendering of individual glyphs across multiple threads. That worked reasonably well, with an OK speedup, especially when blur was used. I used OpenMP back then.
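
For illustration, a per-glyph OpenMP loop in the spirit of that experiment might look like the following; GlyphInfo, rasterize_glyph() and apply_blur() are stand-ins, not real libass symbols:

    typedef struct GlyphInfo GlyphInfo;  /* stand-in for per-glyph state */
    void rasterize_glyph(GlyphInfo *g);  /* hypothetical rasterizer entry point */
    void apply_blur(GlyphInfo *g);       /* hypothetical blur pass */

    /* Build with e.g. -fopenmp; without it the pragma is simply ignored. */
    static void render_glyphs(GlyphInfo **glyphs, int n_glyphs)
    {
        /* Each glyph is independent, so a plain parallel for is enough;
         * dynamic scheduling helps because blur cost varies a lot per glyph. */
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < n_glyphs; i++) {
            rasterize_glyph(glyphs[i]);
            apply_blur(glyphs[i]);
        }
    }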

Which thread library do we use? We need cross-platform support.

OpenMP is widely supported. Also, C11 includes a standard threading interface; maybe we could use that. Mesa has implementations for various platforms.
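
For completeness, a tiny, purely illustrative example of the C11 <threads.h> interface being referred to:

    #include <stdio.h>
    #include <threads.h>

    /* C11 thread entry points return int (unlike pthreads' void *). */
    static int worker(void *arg)
    {
        printf("rendering event %d\n", *(int *)arg);
        return 0;
    }

    int main(void)
    {
        thrd_t t;
        int event_index = 0;
        if (thrd_create(&t, worker, &event_index) != thrd_success)
            return 1;
        thrd_join(t, NULL);
        return 0;
    }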

ghost commented Mar 1, 2014

C11 is not a good match. The only platform where C11 threads would actually help (MS Windows) will never natively support C11. All other platforms support pthreads just fine.

rcombs (Member, Author) commented Jun 1, 2014

See #107
Answers to my original questions:

  • Per-event
  • pthreads
  • This worked out pretty well
  • Haven't seen a significant impact
  • 1 per vcore seems to work well
  • Still an open question
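
As a purely hypothetical sketch of what "per-event with pthreads" can look like (not what #107 actually implements; Event, render_event() and the queue layout are made up), workers pull independent events from a shared index under a mutex:

    #include <pthread.h>

    typedef struct Event Event;            /* hypothetical per-event state */
    void render_event(Event *ev);          /* hypothetical: renders one event */

    typedef struct {
        Event **events;
        int n_events;
        int next;                          /* index of the next unrendered event */
        pthread_mutex_t lock;              /* protects next */
    } EventQueue;

    static void *worker(void *arg)
    {
        EventQueue *q = arg;
        for (;;) {
            pthread_mutex_lock(&q->lock);
            int i = q->next < q->n_events ? q->next++ : -1;
            pthread_mutex_unlock(&q->lock);
            if (i < 0)
                break;                     /* no events left */
            render_event(q->events[i]);    /* events are independent of each other */
        }
        return NULL;
    }

    static void render_all(EventQueue *q, int n_threads)
    {
        pthread_t threads[16];             /* keep the sketch allocation-free */
        if (n_threads > 16)
            n_threads = 16;
        q->next = 0;
        pthread_mutex_init(&q->lock, NULL);
        for (int i = 0; i < n_threads; i++)
            pthread_create(&threads[i], NULL, worker, q);
        for (int i = 0; i < n_threads; i++)
            pthread_join(threads[i], NULL);
        pthread_mutex_destroy(&q->lock);
    }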

ghost commented Jun 1, 2014

Also relevant: #55

TheOneric linked a pull request on Sep 15, 2022 that will close this issue.