The fewer places that rely on a full world rect, the easier it is to do
clustered culling, where we accept / reject primitives in groups.

The main use for world rects of primitives is overlap calculations during
batching. Instead, switch to using the surface-relative rect for batching.
This is safe since we know that when we merge batches from different
surfaces, they will never overlap in the allocated render target.

Other changes:

* Refactor the initial picture traversal to use a state object that
  maintains an internal stack of picture / surface info. This is easier to
  reason about, and will be helpful once we start using it to pass
  information about caching state.
* Use the world rect rather than the clipped prim world rect for bounds
  during plane-splitting. This landed previously but was backed out due to
  an unrelated bug in that patch.
* Change get_raster_rects to not calculate the transform, since most uses
  of this method don't require it.
* Change the conservative tiling calculations to use the world rect for
  bounds instead of the clipped prim rect. All we're trying to do here is
  reject tiles outside the viewport, so this simplifies the code and
  removes the need for the primitive world rect in one more location.
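
For illustration only, here is a minimal sketch of how an overlap check during batching might work on surface-relative rects. Every name in it (`Rect`, `BatchKey`, `Batch`, `SurfaceBatchList`, `add`) is hypothetical and is not taken from the patch. It assumes one batch list per surface, so rects from different surfaces are never compared against each other.

```rust
// Hypothetical types for illustration; not the actual types used in the patch.

/// An axis-aligned rect in surface-relative space.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

impl Rect {
    /// True if the two rects share any interior area.
    fn intersects(&self, other: &Rect) -> bool {
        self.x0 < other.x1 && other.x0 < self.x1 &&
        self.y0 < other.y1 && other.y0 < self.y1
    }
}

/// Stand-in for whatever makes two primitives batchable together.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BatchKey(u32);

struct Batch {
    key: BatchKey,
    /// Surface-relative rects of the primitives already in this batch.
    item_rects: Vec<Rect>,
}

/// Batches for a single surface, in draw order.
struct SurfaceBatchList {
    batches: Vec<Batch>,
}

impl SurfaceBatchList {
    fn new() -> Self {
        SurfaceBatchList { batches: Vec::new() }
    }

    /// Adds a primitive and returns the index of the batch that received it.
    fn add(&mut self, key: BatchKey, surface_rect: Rect) -> usize {
        // Walk back from the most recent batch looking for a compatible one.
        for i in (0..self.batches.len()).rev() {
            if self.batches[i].key == key {
                self.batches[i].item_rects.push(surface_rect);
                return i;
            }
            // Moving the primitive earlier than a batch it overlaps would
            // change the visible draw order, so stop searching here.
            if self.batches[i].item_rects.iter().any(|r| r.intersects(&surface_rect)) {
                break;
            }
        }
        // No compatible batch is reachable: start a new one.
        self.batches.push(Batch { key, item_rects: vec![surface_rect] });
        self.batches.len() - 1
    }
}
```

Because each list only ever holds rects from its own surface, the comparison never needs a world-space rect. Batches from other surfaces occupy separate regions of the render target and need no overlap test at all.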