Add a new display_list_benchmarks test suite #29562

Merged
gw280 merged 4 commits into flutter:main from gwright-rasterops-bench on Dec 23, 2021


Contributor

@gw280 gw280 commented Nov 5, 2021

This is an initial work-in-progress that adds a new microbenchmark suite, which will let us measure the relative cost of each raster op defined in our DisplayList format.

@gw280 gw280 requested review from cbracken and flar November 5, 2021 20:45
@google-cla google-cla bot added the cla: yes label Nov 5, 2021
@gw280 gw280 added the Work in progress (WIP) Not ready (yet) for review! label Nov 5, 2021
Contributor Author

gw280 commented Nov 5, 2021

Sample output from a benchmarking run (currently only DrawLine is benchmarked):

Run on (16 X 2300 MHz CPU s)
2021-11-05 13:43:01
Benchmark                                             Time           CPU Iterations
-----------------------------------------------------------------------------------
DrawLine<SoftwareCanvasProvider>/16                   6 ms          6 ms        117
DrawLine<SoftwareCanvasProvider>/32                   7 ms          7 ms         96
DrawLine<SoftwareCanvasProvider>/64                  10 ms         10 ms         66
DrawLine<SoftwareCanvasProvider>/128                 18 ms         18 ms         42
DrawLine<SoftwareCanvasProvider>/256                 33 ms         33 ms         21
DrawLine<SoftwareCanvasProvider>/512                 62 ms         62 ms         11
DrawLine<SoftwareCanvasProvider>/1024               120 ms        119 ms          6
DrawLine<SoftwareCanvasProvider>/2k                 262 ms        260 ms          3
DrawLine<MetalCanvasProvider>/16/real_time            8 ms          8 ms         77
DrawLine<MetalCanvasProvider>/32/real_time            8 ms          8 ms         82
DrawLine<MetalCanvasProvider>/64/real_time            8 ms          8 ms         83
DrawLine<MetalCanvasProvider>/128/real_time           8 ms          8 ms         85
DrawLine<MetalCanvasProvider>/256/real_time           7 ms          7 ms         91
DrawLine<MetalCanvasProvider>/512/real_time           8 ms          8 ms         79
DrawLine<MetalCanvasProvider>/1024/real_time          8 ms          8 ms         94
DrawLine<MetalCanvasProvider>/2k/real_time            8 ms          8 ms         94

Contributor Author

gw280 commented Nov 5, 2021

The basic idea here is to create a unit test for each raster op that:

  • Accepts arguments via the benchmark::State object that allow us to modify parameters for the particular raster op we're profiling
  • Builds the DisplayList to be benchmarked
  • Grabs a SkCanvas to render that DisplayList to using the CanvasProvider interface
  • Times how long it takes to render that DisplayList to the provided SkCanvas

Using a templated CanvasProvider allows us to separate the logic that creates the destination SkSurface from the test itself, so we can reuse the same tests across Metal, Software, GL, and Vulkan CanvasProviders. This will give us a good way of benchmarking different backends against each other.
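
To make the shape of that idea concrete, here is a minimal, self-contained sketch of a benchmark body templated on a canvas provider. It is not the code in this PR: `FakeCanvas`, `FakeDisplayList`, `SoftwareProvider`, `GpuProvider`, `BenchResult`, and `BenchDrawLine` are hypothetical stand-ins so the example compiles without Skia or Google Benchmark, and the timing uses `std::chrono` rather than `benchmark::State`.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for SkCanvas: it only counts the ops it receives.
struct FakeCanvas {
  std::size_t ops_rendered = 0;
  void DrawLine(float, float, float, float) { ++ops_rendered; }
};

// Hypothetical stand-in for a DisplayList holding only line ops.
struct FakeDisplayList {
  struct Line { float x0, y0, x1, y1; };
  std::vector<Line> lines;
  void RenderTo(FakeCanvas& canvas) const {
    for (const Line& l : lines) canvas.DrawLine(l.x0, l.y0, l.x1, l.y1);
  }
};

// Each provider owns its destination surface; the test body never needs to
// know how that surface was created.
struct SoftwareProvider {
  static std::string Name() { return "software"; }
  FakeCanvas& GetCanvas() { return canvas_; }
 private:
  FakeCanvas canvas_;
};

struct GpuProvider {
  static std::string Name() { return "gpu"; }
  FakeCanvas& GetCanvas() { return canvas_; }
 private:
  FakeCanvas canvas_;
};

struct BenchResult {
  std::string backend;
  std::size_t ops;
  double millis;
};

// One test body, reused for every backend via the template parameter.
template <typename CanvasProvider>
BenchResult BenchDrawLine(std::size_t line_length, std::size_t line_count) {
  // 1. Build the display list to be benchmarked.
  FakeDisplayList list;
  for (std::size_t i = 0; i < line_count; ++i) {
    list.lines.push_back({0.0f, 0.0f, static_cast<float>(line_length),
                          static_cast<float>(i)});
  }
  // 2. Grab a canvas from the provider.
  CanvasProvider provider;
  FakeCanvas& canvas = provider.GetCanvas();
  // 3. Time only the rendering step.
  const auto start = std::chrono::steady_clock::now();
  list.RenderTo(canvas);
  const auto end = std::chrono::steady_clock::now();
  const double ms =
      std::chrono::duration<double, std::milli>(end - start).count();
  return {CanvasProvider::Name(), canvas.ops_rendered, ms};
}
```

Calling `BenchDrawLine<SoftwareProvider>(16, 100)` and `BenchDrawLine<GpuProvider>(16, 100)` runs the same body against both stand-in backends, which is the reuse the templated provider is meant to give.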

I haven't yet figured out a good way to handle setting attributes for each test, but I am actively working on that. For example, we'd like to be able to run the DrawLine tests with both AA enabled and disabled. Jim's work in #29470 should make that easier with the refactoring to use DisplayListFlags.

Contributor

flar commented Nov 8, 2021

I'm guessing this is all run on the host for now. Do we have a way to run it on some mobile devices as well?

Contributor Author

gw280 commented Nov 8, 2021

> I'm guessing this is all run on the host for now. Do we have a way to run it on some mobile devices as well?

We don't currently run any of the engine unit tests on device. In theory it shouldn't be too hard, but the infrastructure just isn't there right now.

@CaseyHillers CaseyHillers changed the base branch from master to main November 15, 2021 18:12
@zanderso zanderso changed the base branch from master to main November 19, 2021 02:18
@gw280 gw280 force-pushed the gwright-rasterops-bench branch 2 times, most recently from b3bef36 to 7752af3 on December 9, 2021 21:43
@gw280 gw280 changed the title from "[WIP] Add a new flow_benchmarks test suite" to "Add a new flow_benchmarks test suite" Dec 16, 2021
@gw280 gw280 removed the Work in progress (WIP) Not ready (yet) for review! label Dec 16, 2021
Contributor Author

gw280 commented Dec 16, 2021

I think this is ready for a first pass at reviewing.

Contributor

@flar flar left a comment

I seem to have lost a bunch of comments here during a browser reload, so I'm submitting them as comments (to hopefully bring them back into view) and then resuming the review.

flow/display_list_benchmarks_gl.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
@gw280 gw280 requested a review from flar December 22, 2021 23:42
Contributor

@flar flar left a comment

About the only functional problem is that there may be overflow on some of the drawImage tests depending on the draw counts compared to the bitmap sizes.

I'm also a little wary of the way that the vertices are constructed leading to degenerate triangles.

Other than that, the rest is just a bunch of nits.

flow/display_list_benchmarks_metal.cc (outdated, resolved)
flow/display_list_benchmarks_gl.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
        else
          colors.push_back(SK_ColorBLUE);
      }
      break;
Contributor

N/2+1 vertices

Contributor

For fan mode you could generate N points and have N+1 vertices.
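
As a rough illustration of that layout, the sketch below places the center first and then N points around a circle, giving N+1 vertices in total. `Point` and `MakeFanVertices` are hypothetical names standing in for `SkPoint` and the test helper; this is not the code under review.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SkPoint.
struct Point {
  float x;
  float y;
};

// Returns the center followed by `outer_count` points evenly spaced on a
// circle, i.e. outer_count + 1 vertices suitable for a triangle fan.
std::vector<Point> MakeFanVertices(Point center, float radius,
                                   std::size_t outer_count) {
  std::vector<Point> vertices;
  vertices.reserve(outer_count + 1);
  vertices.push_back(center);  // The fan's shared vertex appears only once.
  const double two_pi = 2.0 * std::acos(-1.0);
  for (std::size_t i = 0; i < outer_count; ++i) {
    const double angle =
        two_pi * static_cast<double>(i) / static_cast<double>(outer_count);
    vertices.push_back(
        {center.x + radius * static_cast<float>(std::cos(angle)),
         center.y + radius * static_cast<float>(std::sin(angle))});
  }
  return vertices;
}
```

Because the center is stored once rather than after every outer point, consecutive vertices never coincide, which avoids the zero-area triangles that repeated center entries would produce.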

        vertices.push_back(center);
        colors.push_back(i % 2 ? SK_ColorBLUE : SK_ColorCYAN);
      }
      break;
Contributor

As written, that for loop produces N/2*2 = N vertices. My proposal above would generate N/2+N/4 vertices for strips and N+N/2 for regular triangles.

Contributor

If you adopt my proposals then you might use 2N/3 points for strips and N/3 for triangles (rounding up?).

// Each vertex colour will alternate through Red, Green, Blue and Cyan.
sk_sp<SkVertices> GetTestVertices(SkPoint center,
                                  float radius,
                                  size_t vertex_count,
Contributor

Would it be better to measure the number of triangles instead of vertices? The triangles represent work done, the vertices represent efficiency of packing.

And, ugh, when I went to investigate this I couldn't find a vertex count getter in SkVertices, meaning that we currently have no way of estimating any of this even if we collect the data on it... :(
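
For reference, the relationship between vertex count and triangle count depends on the vertex mode, and can be sketched as below. The `VertexMode` enum and `TriangleCount` function are hypothetical and only mirror the three modes discussed here; they are not Skia's types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the three vertex modes discussed in this thread.
enum class VertexMode { kTriangles, kTriangleStrip, kTriangleFan };

// Number of triangles described by `vertex_count` non-indexed vertices.
std::size_t TriangleCount(VertexMode mode, std::size_t vertex_count) {
  switch (mode) {
    case VertexMode::kTriangles:
      // Every three vertices form one independent triangle.
      return vertex_count / 3;
    case VertexMode::kTriangleStrip:
    case VertexMode::kTriangleFan:
      // After the first two vertices, each new vertex adds one triangle.
      return vertex_count >= 3 ? vertex_count - 2 : 0;
  }
  return 0;
}
```

So 12 vertices describe 4 triangles in triangles mode but 10 in strip or fan mode, which is the "work done" versus "efficiency of packing" distinction.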

Contributor Author

I think it's better to look at the vertex count because that's the data we actually input. We will need to figure out a way to either grab it, or approximate it. It looks like SkVertices::approximateSize() is a half-decent first-order approximation, as it returns an approximation for the total number of bytes used by the SkVertices object and all the data arrays. If we subtract the size of the SkVertices object itself from that, the remainder is the size of the data arrays, which scales with the vertex count and so gives us an approximation of it.
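
A rough sketch of that arithmetic follows, assuming positions and texture coordinates are two floats each, colors are 32-bit values, and indices are 16-bit. `EstimateVertexCount`, `VertexLayout`, and the `object_overhead_bytes` parameter are hypothetical; the real overhead and which arrays are present would have to come from the actual object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Which optional per-vertex arrays are present (hypothetical description).
struct VertexLayout {
  bool has_tex_coords;
  bool has_colors;
};

// Assumed per-element sizes: a 2D float position, a 2D float texture
// coordinate, and a 32-bit color.
constexpr std::size_t kPositionBytes = 2 * sizeof(float);
constexpr std::size_t kTexCoordBytes = 2 * sizeof(float);
constexpr std::size_t kColorBytes = sizeof(std::uint32_t);

// Estimates the vertex count from an approximate total byte size by removing
// the fixed object overhead and the index array, then dividing by the number
// of bytes stored per vertex.
std::size_t EstimateVertexCount(std::size_t approximate_bytes,
                                std::size_t object_overhead_bytes,
                                std::size_t index_count,
                                VertexLayout layout) {
  std::size_t per_vertex = kPositionBytes;
  if (layout.has_tex_coords) per_vertex += kTexCoordBytes;
  if (layout.has_colors) per_vertex += kColorBytes;

  const std::size_t fixed =
      object_overhead_bytes + index_count * sizeof(std::uint16_t);
  if (approximate_bytes <= fixed) return 0;  // Avoid unsigned underflow.
  return (approximate_bytes - fixed) / per_vertex;
}
```

With positions and colors only, 1264 bytes and a 64-byte overhead give (1264 - 64) / 12 = 100 vertices. The estimate is only as good as the assumed overhead and layout.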

Contributor Author

We can also grab the vertex count from Vertices.cc in dart:ui: https://github.com/flutter/engine/blob/main/lib/ui/painting/vertices.cc#L61

It's a little annoying to have to plumb through that count ourselves but I think it's better than trying to calculate the number of triangles.

Contributor

We'll eventually get to that and flatten all of the Sk structures into the DL, but for now we rely on the data we can grab from the Sk objects.

Contributor Author

We can't grab the triangle count either, so I think it makes more sense to vary the vertex count to get an idea of that.

flow/display_list_benchmarks.cc (outdated, resolved)
flow/display_list_benchmarks.cc (outdated, resolved)
@flutter flutter deleted a comment from gw280 Dec 23, 2021
Contributor Author

gw280 commented Dec 23, 2021

I've rebased this branch off the latest main, which includes Chinmay's changes to move the DisplayList code out of flow. As a result I've squashed the history and renamed the benchmark suite to display_list_benchmarks.

The only review comment that still needs to be addressed is the DrawVertices stuff.

@gw280 gw280 changed the title from "Add a new flow_benchmarks test suite" to "Add a new display_list_benchmarks test suite" Dec 23, 2021
Contributor

@flar flar left a comment

LGTM - some comments about work that you can consider or not or do at a later date, but nothing blocking.

  }
  if (rect.bottom() > canvas_size) {
    rect.offset(0, -canvas_size);
  }
Contributor

Suggestion for future work. Create a bouncing rect class that automates this process.
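
One possible shape for such a helper is sketched below. It mirrors the wrap-around in the snippet above (shifting the rect back by the canvas size once it passes the edge) rather than a true reflection, and `Rect` and `WrappingRect` are hypothetical names, not Skia or engine types.

```cpp
#include <cassert>

// Hypothetical stand-in for SkRect.
struct Rect {
  float left, top, right, bottom;
  void Offset(float dx, float dy) {
    left += dx;
    right += dx;
    top += dy;
    bottom += dy;
  }
};

// Moves a rect across a square canvas and wraps it back once it crosses the
// right or bottom edge, so callers do not repeat the bounds checks.
class WrappingRect {
 public:
  WrappingRect(Rect rect, float canvas_size)
      : rect_(rect), canvas_size_(canvas_size) {
    assert(canvas_size > 0.0f);
  }

  // Offsets the rect by (dx, dy) and returns its new position.
  const Rect& Advance(float dx, float dy) {
    rect_.Offset(dx, dy);
    if (rect_.right > canvas_size_) {
      rect_.Offset(-canvas_size_, 0.0f);
    }
    if (rect_.bottom > canvas_size_) {
      rect_.Offset(0.0f, -canvas_size_);
    }
    return rect_;
  }

  const Rect& rect() const { return rect_; }

 private:
  Rect rect_;
  float canvas_size_;
};
```

A benchmark loop could then call `Advance(dx, dy)` once per draw and use the returned rect directly.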

George Wright added 3 commits December 23, 2021 13:11
…crobenchmark all the rasterops defined in

our DisplayList format on both CPU and GPU canvases.
Clean up test definitions, add missing ones
Add labels to test definitions
@gw280 gw280 merged commit 724ada6 into flutter:main Dec 23, 2021
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Dec 23, 2021
chinmaygarde added a commit to chinmaygarde/flutter_engine that referenced this pull request Dec 24, 2021
No functional change. Makes the display list subsystem easier to navigate as the
major classes are in their own TUs. Also avoids importing unnecessary headers
when the previous kitchen sink header was imported. I've tried to remove all
display list related imports and start from scratch but I may have missed some
files. Minor structs and classes (like the ones in utils, ops, etc.) still
don't get their own TUs though.

There were [two](flutter#29562) [related](flutter#30484) changes being made to this subsystem that have since
landed. So I don't think I am stepping on anyone's toes with the reorganization.
Happy to incorporate any work-in-progress changes being made to this
subsystem before submitting.
gw280 added a commit that referenced this pull request Dec 25, 2021
chinmaygarde added a commit to chinmaygarde/flutter_engine that referenced this pull request Dec 28, 2021
chinmaygarde added a commit that referenced this pull request Dec 28, 2021
…30487)
