(Do not land) Move tile decomposition to frame building. #2513
Conversation
The PR is in a pretty awful state; it is currently more of a proof of concept than something to land. The awfulness is mostly a lot of code duplication, which comes from the way I think this can be refactored into something acceptable, but it's possible that more involved architecture changes would be preferable to support splitting primitives during frame building (or it's possible that I misunderstood the code that deals with brush segments and that I can rely on something similar when there may be many tiles). If you think so, please let me know sooner rather than later. Even once this has been made nice and clean, I don't think it would be reasonable to land before we are also culling tiles during frame building (which this PR doesn't do yet); otherwise this would badly regress performance in some cases.
One reason we might want to take the time to generalize the notion that a primitive can be split during frame building is that I think it would be good to move gradient decomposition to the frame-building phase as well, for the same reasons I am moving image decomposition.
The code is in better shape now, although I still have to cull tiles before anything can land. After a bit of digging I am erring on the side of not using the brush segmentation infrastructure for image tiles, because:

All in all it looks quite a bit simpler to split image primitives without relying on the segment infrastructure, although I am not saying it is better in the long run. The downsides I can think of are that with segments we can share the clip rect between tiles in the GPU cache, and that overall having a single system for splitting primitives sounds cleaner (although I am not sure how this holds up in practice). Now is a good time to let me know if you can think of a better way than the approach I am going for in this PR.
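To make the tile-splitting idea concrete, here is a minimal sketch of what decomposing an image primitive into tiles during frame building could look like, including the tile culling the PR still needs. The names (`LocalRect`, `for_each_visible_tile`) and the flat callback shape are illustrative stand-ins, not WebRender's actual types or API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct LocalRect { x: f32, y: f32, w: f32, h: f32 }

impl LocalRect {
    fn intersect(&self, other: &LocalRect) -> Option<LocalRect> {
        let x0 = self.x.max(other.x);
        let y0 = self.y.max(other.y);
        let x1 = (self.x + self.w).min(other.x + other.w);
        let y1 = (self.y + self.h).min(other.y + other.h);
        if x1 > x0 && y1 > y0 {
            Some(LocalRect { x: x0, y: y0, w: x1 - x0, h: y1 - y0 })
        } else {
            None
        }
    }
}

/// Visit each tile of `prim_rect`, skipping tiles culled by `visible_rect`.
fn for_each_visible_tile<F: FnMut(u32, u32, LocalRect)>(
    prim_rect: &LocalRect,
    visible_rect: &LocalRect,
    tile_size: f32,
    mut callback: F,
) {
    let cols = (prim_rect.w / tile_size).ceil() as u32;
    let rows = (prim_rect.h / tile_size).ceil() as u32;
    for ty in 0..rows {
        for tx in 0..cols {
            let tile = LocalRect {
                x: prim_rect.x + tx as f32 * tile_size,
                y: prim_rect.y + ty as f32 * tile_size,
                // Edge tiles may be smaller than the nominal tile size.
                w: tile_size.min(prim_rect.w - tx as f32 * tile_size),
                h: tile_size.min(prim_rect.h - ty as f32 * tile_size),
            };
            // Cull tiles that fall entirely outside the visible area.
            if let Some(clipped) = tile.intersect(visible_rect) {
                callback(tx, ty, clipped);
            }
        }
    }
}

fn main() {
    // A 512x512 image split into 256px tiles, with only the top half
    // visible: the two bottom tiles are culled during frame building.
    let prim = LocalRect { x: 0.0, y: 0.0, w: 512.0, h: 512.0 };
    let visible = LocalRect { x: 0.0, y: 0.0, w: 512.0, h: 256.0 };
    let mut tiles = Vec::new();
    for_each_visible_tile(&prim, &visible, 256.0, |tx, ty, rect| {
        tiles.push((tx, ty, rect));
    });
    assert_eq!(tiles.len(), 2);
    println!("{} visible tiles", tiles.len());
}
```

The point of doing this at frame-build time rather than display-list time is that `visible_rect` is known, so off-screen tiles never produce instances at all.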
@nical I haven't looked at this at all yet, but the comment above about prim_store and batch expecting a single batch item per primitive doesn't sound quite right. For example, borders produce 8 instances with 2 shaders for each primitive. I'm not sure if that's relevant here or not. I think using the segment infrastructure here should work, and will probably work out well. I'll try to take a detailed look at this today and offer some more useful comments.
(We discussed this on IRC and @nical is going to take a look at using the segment infrastructure and see if there are any gotchas with it.)
As I am digging into this I realize that images are actually not converted to brushes yet (in the majority of cases). Assuming we do support repeating in the brush image shader, we then need to store per segment:

This bumps the number of vecs per segment from 2 to 3 (for all brush shaders, IIUC). Is this going to be an issue? I'll start by trying to selectively port over image primitives that have a stretch size equal to the primitive size.
Edit: Actually it looks like the UV is stored in a separate allocation, and the segment data can get away with storing only a 32-bit address, which can be packed with the stretch size.
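The packing described in the edit above could look something like the following: bitcast the 32-bit address into one f32 lane of a vec4 and carry the stretch size in the other lanes, so the segment data stays at the existing vec count. This is a hypothetical sketch; `pack_segment_extra` and the lane layout are made up for illustration, not WebRender's actual format.

```rust
/// Pack a 32-bit GPU-cache address plus the stretch size into a single
/// vec4-sized slot (four f32 lanes). The shader would re-interpret the
/// first lane as an integer. Names and layout are illustrative only.
fn pack_segment_extra(gpu_address: u32, stretch_w: f32, stretch_h: f32) -> [f32; 4] {
    [
        f32::from_bits(gpu_address), // address bitcast into an f32 lane
        stretch_w,
        stretch_h,
        0.0, // unused lane
    ]
}

/// Recover the address on the CPU side (the shader would do the
/// equivalent with floatBitsToUint / uintBitsToFloat).
fn unpack_address(lane: f32) -> u32 {
    lane.to_bits()
}

fn main() {
    let data = pack_segment_extra(0x0000_2a00, 32.0, 32.0);
    assert_eq!(unpack_address(data[0]), 0x0000_2a00);
    println!("address round-trips: {:#x}", unpack_address(data[0]));
}
```

One caveat with this kind of packing is that the f32 lane must be treated as opaque bits end to end; any pass that normalizes or arithmetic-touches the value could corrupt the address.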
Some additional options to consider: you could have a separate GPU cache handle stored in the brush::image primitive. You can write per-segment data to this, store the handle in the user data of the instance, and then fetch from that in the image shader (by adding the segment index). This allows you to store an arbitrary amount of extra data per segment for images, without affecting other brush primitives. Another option is to do the image repeating on the CPU via extra instances. To handle the bad cases of 1x1 repeats, what we would do is detect if the repeat is less than a threshold (e.g. 128px) and then use
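The CPU-side repeating idea can be sketched as a simple strategy choice per primitive. The original comment is cut off before it says what the small-repeat fallback is, so the sketch below only shows the threshold branch; the enum, names, and the 128px constant are illustrative, not a real WebRender code path.

```rust
/// Below this repeat size, emitting one instance per repetition would
/// explode the instance count (e.g. a 1x1 repeated image), so some
/// other fallback is needed. 128px matches the threshold suggested
/// in the comment above.
const SMALL_REPEAT_THRESHOLD: f32 = 128.0;

#[derive(Debug, PartialEq)]
enum RepeatStrategy {
    /// Emit one GPU instance per repetition of the image.
    InstancePerRepeat { x_count: u32, y_count: u32 },
    /// Repeat is too small for per-instance decomposition; the original
    /// comment is truncated before describing the fallback, so this
    /// variant is just a placeholder.
    SmallRepeatFallback,
}

fn choose_repeat_strategy(
    prim_w: f32,
    prim_h: f32,
    stretch_w: f32,
    stretch_h: f32,
) -> RepeatStrategy {
    if stretch_w < SMALL_REPEAT_THRESHOLD || stretch_h < SMALL_REPEAT_THRESHOLD {
        RepeatStrategy::SmallRepeatFallback
    } else {
        RepeatStrategy::InstancePerRepeat {
            x_count: (prim_w / stretch_w).ceil() as u32,
            y_count: (prim_h / stretch_h).ceil() as u32,
        }
    }
}

fn main() {
    // A 1x1 repeat over a 1000x1000 primitive would need a million
    // instances, so it takes the fallback path.
    assert_eq!(
        choose_repeat_strategy(1000.0, 1000.0, 1.0, 1.0),
        RepeatStrategy::SmallRepeatFallback
    );
    // A 400px repeat over 1000x500 decomposes into 3x2 instances.
    assert_eq!(
        choose_repeat_strategy(1000.0, 500.0, 400.0, 400.0),
        RepeatStrategy::InstancePerRepeat { x_count: 3, y_count: 2 }
    );
    println!("ok");
}
```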
We also discussed on IRC that it's feasible to modify the way brush segment generation works to supply a separate user data field per segment. This is the preferred way to pass texture cache handles, since it avoids any complexities with invalidating GPU cache data if the texture cache entry gets evicted or moved.
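Structurally, the per-segment user data amounts to threading one extra word through segment generation, so each image tile can carry its own texture-cache handle. A hedged sketch, with hypothetical field and function names that do not match WebRender's real `BrushSegment`:

```rust
/// Illustrative segment carrying a per-segment user-data slot.
struct BrushSegment {
    /// Local rect of this segment/tile (x, y, w, h).
    local_rect: [f32; 4],
    /// Per-segment user data, e.g. a texture-cache handle index in
    /// lane 0. Field name and layout are hypothetical.
    user_data: [i32; 4],
}

/// Build one segment per image tile, pairing each tile rect with its
/// own cache handle instead of a single per-primitive value.
fn build_image_segments(
    tile_rects: &[[f32; 4]],
    cache_handles: &[i32],
) -> Vec<BrushSegment> {
    tile_rects
        .iter()
        .zip(cache_handles.iter())
        .map(|(rect, &handle)| BrushSegment {
            local_rect: *rect,
            user_data: [handle, 0, 0, 0],
        })
        .collect()
}

fn main() {
    let rects = [[0.0, 0.0, 256.0, 256.0], [256.0, 0.0, 256.0, 256.0]];
    let handles = [7, 8];
    let segments = build_image_segments(&rects, &handles);
    assert_eq!(segments.len(), 2);
    assert_eq!(segments[1].user_data[0], 8);
    println!("{} segments", segments.len());
}
```

The advantage noted above is that the handle resolves through the texture cache at batch time, so an evicted or moved cache entry doesn't leave stale data sitting in the GPU cache.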
@nical Do we still need this one open?
Closing in favor of #2572.
nical commented Mar 13, 2018 • edited by larsbergstrom
Step 2 of #2370 (Step 1 being #2507).