Calculate local clipping rectangle per-primitive run

Instead of using the hierarchical nature of the ClipScrollTree to calculate local clipping rectangles for primitives, calculate them when we are about to render a particular primitive run. This allows us to handle custom clip chains while keeping the local clipping rectangle optimizations. It also greatly simplifies the shader code that used to deal with both clipping and scrolling nodes, as the local clipping rectangle is now bundled into the primitive's local_clip_rect.

TransformOrOffset is the main struct that handles the efficient transformation of clipping rectangles between compatible coordinate systems. It avoids expensive matrix operations wherever possible, which should make these calculations cheaper. That matters now that they are done once per primitive run.
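
As a rough illustration of the idea behind TransformOrOffset (not the code in this patch), the sketch below keeps a plain offset when two coordinate systems differ only by a translation and falls back to a matrix otherwise. The Rect and Affine types, the variant names, and the apply method are assumptions made up for this example.

```rust
/// Hypothetical axis-aligned rectangle (origin plus size).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Hypothetical 2D affine matrix mapping (x, y) to
/// (a*x + c*y + tx, b*x + d*y + ty).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Affine {
    pub a: f32,
    pub b: f32,
    pub c: f32,
    pub d: f32,
    pub tx: f32,
    pub ty: f32,
}

impl Affine {
    fn map_point(&self, x: f32, y: f32) -> (f32, f32) {
        (
            self.a * x + self.c * y + self.tx,
            self.b * x + self.d * y + self.ty,
        )
    }

    /// Slow path: transform all four corners and take their bounding box.
    fn bounding_rect(&self, r: &Rect) -> Rect {
        let corners = [
            self.map_point(r.x, r.y),
            self.map_point(r.x + r.w, r.y),
            self.map_point(r.x, r.y + r.h),
            self.map_point(r.x + r.w, r.y + r.h),
        ];
        let (mut min_x, mut min_y) = corners[0];
        let (mut max_x, mut max_y) = corners[0];
        for &(x, y) in &corners[1..] {
            min_x = min_x.min(x);
            min_y = min_y.min(y);
            max_x = max_x.max(x);
            max_y = max_y.max(y);
        }
        Rect { x: min_x, y: min_y, w: max_x - min_x, h: max_y - min_y }
    }
}

/// Assumed shape: either a cheap translation or a full transform.
pub enum TransformOrOffset {
    Offset { dx: f32, dy: f32 },
    Transform(Affine),
}

impl TransformOrOffset {
    /// Map a clip rectangle into the target coordinate system.
    pub fn apply(&self, r: &Rect) -> Rect {
        match *self {
            // Fast path: two additions, no matrix math.
            TransformOrOffset::Offset { dx, dy } => {
                Rect { x: r.x + dx, y: r.y + dy, w: r.w, h: r.h }
            }
            // Slow path: only taken when a real transform is involved.
            TransformOrOffset::Transform(m) => m.bounding_rect(r),
        }
    }
}
```

In this sketch the common case, where nodes are related by scroll offsets only, costs two additions per rectangle, which is the kind of saving that makes per-run calculation affordable.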