High CPU overhead on very complex scenes #228

Closed
rytone opened this issue Aug 4, 2019 · 3 comments

@rytone rytone commented Aug 4, 2019

Right now I am working on a project where I need to render very complex lines (~2000 points per line, all straight segments per line, usually 3 lines) every frame and have been trying out Pathfinder for it. While I am aware this is probably not an intended use case, I think this may still be valuable for improving the performance of Pathfinder.

The code I am working with can be found here (see line 296 for some of the main rendering code).

First of all, to get realtime performance at all, I had to break each line into smaller PathObjects instead of using a single PathObject per line. With that change, rendering takes roughly 8-11ms per frame (using RayonExecutor, measured very non-scientifically), which doesn't leave enough time for the other things I want to do. I also found that skipping OutlineStrokeToFill cuts the render time to about 2-4ms, although the output is then obviously incorrect. #119 mentions that the stroke algorithm could be done on the GPU, which might greatly improve performance in this scenario.
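For readers who want to try the same workaround, here is a minimal sketch of the splitting step in plain Rust. It is not the code from the linked repository and it does not call any Pathfinder API: `Point` and `split_polyline` are hypothetical names, and the chunk size is an arbitrary parameter to tune. Consecutive chunks share one endpoint so the pieces meet without gaps; how the renderer draws the seam between two separately stroked chunks is not covered here.

```rust
/// A 2D point. Hypothetical stand-in for whatever vector type the renderer uses.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f32,
    pub y: f32,
}

/// Splits a polyline into runs of at most `max_segments` straight segments.
/// Consecutive runs share one endpoint, so the pieces join without gaps.
/// Each returned run would then become its own path object.
pub fn split_polyline(points: &[Point], max_segments: usize) -> Vec<Vec<Point>> {
    assert!(max_segments > 0, "max_segments must be at least 1");
    let mut chunks = Vec::new();
    if points.len() < 2 {
        // Fewer than two points means there is no segment to draw.
        return chunks;
    }
    let last = points.len() - 1;
    let mut start = 0;
    while start < last {
        // Take up to `max_segments` segments, clamped to the final point.
        let end = (start + max_segments).min(last);
        chunks.push(points[start..=end].to_vec());
        // The next run starts on the point where this one ended.
        start = end;
    }
    chunks
}
```

With a 2000-point line and `max_segments` set to 100, this yields 20 chunks: 19 with 101 points each and a final one with 100 points.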

Something else I noticed in this scenario is that RayonExecutor doesn't seem to scale well with added threads. Even though my CPU has 8 threads, RayonExecutor was only about 70% faster than SequentialExecutor (again measured very roughly: about 16-17ms per frame with SequentialExecutor). Additionally, the CPU usage reported by htop was about double that of SequentialExecutor.
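To put a number on the scaling, the rough timings above can be turned into a speedup and a parallel efficiency. This is only back-of-the-envelope arithmetic: it assumes the midpoints of the quoted ranges (16.5ms sequential, 9.5ms with Rayon) and treats all 8 hardware threads as available to the executor. The function names are hypothetical.

```rust
/// Speedup of the parallel run relative to the sequential run.
pub fn speedup(sequential_ms: f64, parallel_ms: f64) -> f64 {
    sequential_ms / parallel_ms
}

/// Fraction of ideal linear scaling achieved on `threads` threads.
/// A value of 1.0 would mean the work scaled perfectly.
pub fn parallel_efficiency(speedup: f64, threads: u32) -> f64 {
    speedup / f64::from(threads)
}
```

Under those assumptions, `speedup(16.5, 9.5)` is about 1.74, and `parallel_efficiency` of that value on 8 threads is about 0.22, i.e. roughly a fifth of ideal scaling.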

Finally, I decided to compare the performance of Pathfinder to Blend2D, a purely CPU-based rasterizer. My code for testing it is on the blend2d branch of the same repo. With it, only 3-5ms were spent drawing the lines. The caveat is that I have later steps that I want to perform on the GPU, and copying the rendered image from the CPU to the GPU takes a significant amount of time (at least with my naive implementation), which brings the total frame time back up to only a little faster than Pathfinder with RayonExecutor. Despite that, the CPU usage reported while using Blend2D is roughly half that of Pathfinder with SequentialExecutor.

@pcwalton pcwalton added the performance label Aug 5, 2019
Collaborator

@pcwalton pcwalton commented Aug 5, 2019

Thanks for trying it out; this feedback is very valuable!

Yes, I think you described the biggest potential performance improvement: rendering strokes with a stroke shader instead of using fills. Another potential win will simply be to make tiling faster; @nical has a work-in-progress branch to improve things significantly. Finally, moving tiling to GPU would help a lot.

Collaborator

@pcwalton pcwalton commented May 16, 2020

You may wish to try again. With @Veedrac's optimizations and the new tiling code that just landed, I've measured an 83% improvement in CPU time on some test cases.

Collaborator

@pcwalton pcwalton commented Jul 28, 2020

Closing for now as there have been many CPU time improvements, including an entirely new backend, since this was filed. Feel free to reopen if more issues are found.

@pcwalton pcwalton closed this Jul 28, 2020