Compute skylines and bottom lines in batches (+WebGL): 30 to 60% performance gain #1158
…SkyBottomLineBatchCalculatorBackend.ts
Isn't there a possibility that WebGL logic is not tested correctly due to lack of GPU support in CI/CD test environment?
I thought @paxbun Is it correct that the
@sschmidTU I am very very sorry about this.
No problem, thanks for confirming! (see PR #1160 for this, working around gl not being able to install)
Ah, yes, that sounds like a very good solution! I agree, I didn't know it was difficult to install on macOS as well, so putting it under
…ild errors (#1160)
The build of gl can fail on Linux if you have gcc-11 (instead of gcc-10) installed, so it was moved to optionalDependencies, since it's only used in the generateImages_browserless script and visual regression tests. The import of gl was made dynamic with try/catch, so that the script still works even if gl could not be installed on your system.
squashed commits from PR #1160:
* chore: dynamic import() for gl in generateImages_browserless.mjs
* chore: move gl to optionalDependencies (from devDependencies) in package.json (#1158)
* chore: restore removed generate:blessed NPM script
@paxbun I did some performance benchmarks, and WebGL was clearly faster for me in Chrome and Edge, but clearly slower in Firefox. So I've now set the preferred backend to Plain instead of WebGL for Firefox as well. Obviously, more testing on different machines and better benchmarks would be great.
In #1160 (comment) you mentioned that non-batch was faster than batch for you on a Windows machine. Which browser did you use? Non-batch was about 3 times as slow as batch for me in Chrome, and around 10% slower in Firefox.
My benchmarks:
// render Actor prelude once:
console.time();
osmd.clear();
await osmd.load("ActorPreludeSample.xml");
osmd.render();
console.timeEnd();
// render a piece 30 times in a row (Beethoven Geliebte) (unrealistic use case):
console.time();
for (let i=1; i <= 30; i++) {
osmd.render();
}
console.timeEnd();
Results (Windows):
WebGL-Chrome (Actor 1 times):
Plain-Chrome (Actor 1 times):
WebGL-Chrome-nonbatch (Actor 1 times):
WebGL-LinuxVM-Firefox (Actor 1 times):
Plain-LinuxVM-Firefox (Actor 1 times):
WebGL-Chrome (Beethoven 1 times):
Plain-Chrome (Beethoven 1 times):
WebGL-Firefox (Actor 1 times):
Plain-Firefox (Actor 1 times):
Plain-Firefox-nonbatch (Actor 1 times):
WebGL-Edge (Actor 1 times):
Edge-plain (Actor 1 times):
Rendering the same piece 30 times in a row is slightly slower in WebGL compared to Plain, but I think that's an unrealistic use case, and it might be due to the startup cost of creating a WebGL context (sometimes the browser also complains about too many WebGL contexts being created):
Plain-Chrome (30 renders, Beethoven Geliebte):
This was on a Windows machine. There were other background processes using around 18% CPU.
The most important performance use case in OSMD is loading one big piece once, which can take several seconds on slow machines. In Chrome and Edge, that's significantly faster with WebGL. (In Firefox and Safari, we use Plain by default.)
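Since single `console.time()` readings are noisy, one way to make such comparisons more repeatable is to collect several samples and summarize them. The following is only a sketch: `timeRuns` and `summarizeTimings` are hypothetical helper names, not part of OSMD, and the clock is injectable so the helper can be checked without a browser.

```typescript
// Hypothetical benchmark helpers (not part of OSMD).
interface TimingSummary {
    runs: number;
    meanMs: number;
    medianMs: number;
    minMs: number;
    maxMs: number;
}

// Runs `task` the given number of times and returns one duration per run.
// `now` defaults to Date.now; a finer-grained clock can be passed instead.
function timeRuns(task: () => void, runs: number, now: () => number = () => Date.now()): number[] {
    const durationsMs: number[] = [];
    for (let i = 0; i < runs; i++) {
        const start = now();
        task();
        durationsMs.push(now() - start);
    }
    return durationsMs;
}

// Reduces a list of durations to mean, median, minimum and maximum.
function summarizeTimings(durationsMs: number[]): TimingSummary {
    if (durationsMs.length === 0) {
        throw new Error("at least one sample is required");
    }
    const sorted = durationsMs.slice().sort((a, b) => a - b);
    let sum = 0;
    for (const d of sorted) {
        sum += d;
    }
    const mid = Math.floor(sorted.length / 2);
    const medianMs = sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    return {
        runs: sorted.length,
        meanMs: sum / sorted.length,
        medianMs: medianMs,
        minMs: sorted[0],
        maxMs: sorted[sorted.length - 1]
    };
}

// Possible usage with the snippet above (assumes an `osmd` instance exists):
// const summary = summarizeTimings(timeRuns(() => osmd.render(), 30));
```

Reporting the median alongside the mean makes a one-off cost, such as creating the first WebGL context, easier to spot, because it shifts the mean and maximum much more than the median.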
@sschmidTU It was tested on Windows Edge 99 as well, but with only 100 measures. I don't have detailed results now, but the trend was as follows: That's why I said WebGL was slower. The slope of WebGL's graph was obviously more gradual, but I thought the x-intercept was too high.
Yes, it seems like WebGL has some startup/windup cost. I like your hand-drawn graph!
@paxbun We could use WebGL only if the sheet has a minimum number of measures, like with batch processing. Our default Beethoven piece has 15 measures and 3 staves = 45 graphical measures (3x15), and it's slightly slower with WebGL, so maybe we start using WebGL at a minimum of 60 graphical measures? So, we should count graphical measures, not just the "number of measures" ( Would you be interested in writing some tests or benchmarks for performance? We could also use something like BrowserStack or Selenium to automatically test different browsers and systems.
I think 60 is a reasonable choice. I set this to 5, but considering the test results on my Windows machine and yours, 60 is much better than 5.
I thought I was counting the number of graphical measures (as in
I would really like to, but since I'm working on another project now, I can't guarantee that I can write benchmarks for this at the moment. I will let you know before this weekend, once I finish my current task.
This does look correct, that's just a different way to get all the graphical measures. (though it doesn't respect multi-measure rests, see below)
A measure with a multi-rest is one graphical measure. The graphical measures that would appear if we didn't use a multi-rest measure are undefined and not rendered. So, a piece rendered with multi-measure rests has fewer graphical measures than
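The counting rule discussed above (count only graphical measures that are actually rendered, and switch to WebGL from 60 upwards) could look roughly like this. It is a sketch under assumptions: the function names and the input shape (a two-dimensional list in which unrendered measures are `undefined`) are hypothetical, and only the threshold of 60 comes from this discussion.

```typescript
// Threshold proposed in this thread; the constant name is hypothetical.
const MIN_GRAPHICAL_MEASURES_FOR_WEBGL = 60;

// Counts the entries that are defined, i.e. the graphical measures that are rendered.
// Measures hidden behind a multi-measure rest are assumed to be `undefined`.
function countRenderedGraphicalMeasures<T>(measureList: ReadonlyArray<ReadonlyArray<T | undefined>>): number {
    let count = 0;
    for (const staffMeasures of measureList) {
        for (const measure of staffMeasures) {
            if (measure !== undefined) {
                count++;
            }
        }
    }
    return count;
}

// True if the sheet is large enough for the WebGL startup cost to pay off.
function meetsWebGLThreshold<T>(measureList: ReadonlyArray<ReadonlyArray<T | undefined>>): boolean {
    return countRenderedGraphicalMeasures(measureList) >= MIN_GRAPHICAL_MEASURES_FOR_WEBGL;
}
```

With this rule, the default Beethoven piece (15 measures x 3 staves = 45 graphical measures) stays on the Plain backend.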
…nd AlwaysSetPreferredSkyBottomLineBackendAutomatically (#1158), fix measure counting
fix measure threshold to include only graphical measures that are rendered (applies to multi-measure rests)
refactors, comments, jsdoc
getting ready for release 1.5.0
…leWebGLInSafariAndIOS for options (#1158)
Currently WebGL is always slower in Firefox and Safari, but that may change with new versions of these browsers, so it's good to have the option to enable WebGL in these cases.
I did some performance tests on a MacBook (late 2012 Pro):
WebGL Safari Actor
Plain Safari Actor
Plain Firefox Actor
WebGL Firefox Actor
WebGL Chrome (MacOS) Actor
Plain Chrome (MacOS) Actor
(WebGL is enabled by default for Chrome on MacOS, because
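The default preference that emerges from these measurements (Plain in Firefox, Safari and on iOS, WebGL in Chromium-based browsers, Plain whenever WebGL is unavailable) can be sketched as a pure function of the user agent string. This is an illustration only: `preferredBackend` and `BackendKind` are hypothetical names, the user agent checks are simplified assumptions, and OSMD's actual detection logic may differ.

```typescript
type BackendKind = "Plain" | "WebGL";

// Picks a backend from the user agent string and WebGL availability.
// Simplified assumption: Chromium-based browsers contain "chrome" in their user agent,
// while Safari contains "safari" without "chrome" or "chromium".
function preferredBackend(userAgent: string, webGLAvailable: boolean): BackendKind {
    if (!webGLAvailable) {
        return "Plain"; // fallback when no WebGL context can be created
    }
    const ua = userAgent.toLowerCase();
    const isFirefox = ua.indexOf("firefox") !== -1;
    const isChromium = ua.indexOf("chrome") !== -1 || ua.indexOf("chromium") !== -1;
    const isSafari = ua.indexOf("safari") !== -1 && !isChromium;
    const isIOS = /iphone|ipad|ipod/.test(ua);
    if (isFirefox || isSafari || isIOS) {
        return "Plain"; // WebGL was measured as slower in these browsers
    }
    return "WebGL";
}
```

In a browser, the inputs would typically be `navigator.userAgent` and the result of trying to create a WebGL context on a temporary canvas.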
Introduction
The current rendering logic requires calculating the skyline and bottom line of each measure, which is the main bottleneck of OSMD. For example, rendering Clementi's Sonatina Op. 36 No. 3 with OSMD on Microsoft Edge on an M1 Mac mini takes 765 milliseconds on average, of which skyline and bottom line computation takes about 450 milliseconds. Almost half of that computation is spent drawing measures on a temporary canvas, and the other half is spent retrieving pixel data (ctx.getImageData()) from the canvas.
The current logic draws a single measure on a single canvas, so the number of calls to ctx.getImageData() is equal to the number of measures, which makes the computation very slow. This fix reduces the bottleneck of retrieving pixel data by drawing multiple measures on a single canvas, so OSMD can render the sheet music with fewer calls to functions accessing the pixel data. This fix also introduces WebGL-accelerated skyline and bottom line calculation, which computes the lines of all measures on a single canvas simultaneously.
Summary of change
Removed calculateLines() from SkyBottomLineCalculator.
Renamed SkyBottomLineCalculator to SkyBottomLine.ts, as it does not contain the calculation logic.
Added classes for batch calculation:
SkyBottomLineBatchCalculator
SkyBottomLineBatchCalculatorBackend
PlainSkyBottomLineBatchCalculatorBackend
WebGLSkyBottomLineBatchCalculatorBackend
SkyBottomLineCalculationResult
Updated module rules in webpack.common.js to import GLSL files, and global.d.ts for typing.
Moved calls to SkyBottomLineCalculator from MusicSheetCalculator to VexFlowMusicSheetCalculator. SkyBottomLineCalculator references VexFlow* classes, which creates a circular dependency. For some reason, MusicSheetCalculator in …
Benchmarks
(The width of body is set to 900px in all tests)
(The improvement becomes x1.55 - x1.7 with WebGL on Edge in subsequent renderings)
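To make concrete what the batch approach in these benchmarks does, here is a minimal sketch of computing the lines for several measures from one pixel buffer, as it would be returned by a single ctx.getImageData() call. Everything in it is an assumption for illustration: the function and type names are hypothetical, measures are assumed to be laid out in a grid of equally sized cells, a pixel counts as drawn when its alpha value is above 0, and -1 marks an empty column. It is not the code of PlainSkyBottomLineBatchCalculatorBackend.

```typescript
// Hypothetical result type: one skyline and one bottom line value per pixel column.
interface MeasureLines {
    skyLine: number[];    // y of the topmost drawn pixel per column, -1 if the column is empty
    bottomLine: number[]; // y of the bottommost drawn pixel per column, -1 if the column is empty
}

// Computes the lines of all measures from one RGBA buffer.
// Measure m is assumed to occupy the cell at column (m % columns), row floor(m / columns).
function computeLinesInBatch(
    rgba: Uint8ClampedArray,
    canvasWidth: number,
    cellWidth: number,
    cellHeight: number,
    columns: number,
    measureCount: number
): MeasureLines[] {
    const results: MeasureLines[] = [];
    for (let m = 0; m < measureCount; m++) {
        const originX = (m % columns) * cellWidth;
        const originY = Math.floor(m / columns) * cellHeight;
        const skyLine: number[] = [];
        const bottomLine: number[] = [];
        for (let x = 0; x < cellWidth; x++) {
            let top = -1;
            let bottom = -1;
            for (let y = 0; y < cellHeight; y++) {
                // Index of the alpha channel of pixel (originX + x, originY + y).
                const alphaIndex = ((originY + y) * canvasWidth + originX + x) * 4 + 3;
                if (rgba[alphaIndex] > 0) {
                    if (top < 0) {
                        top = y;
                    }
                    bottom = y;
                }
            }
            skyLine.push(top);
            bottomLine.push(bottom);
        }
        results.push({ skyLine: skyLine, bottomLine: bottomLine });
    }
    return results;
}
```

The per-pixel work is the same as in the non-batch case; the saving comes from reading the pixels back once per batch instead of once per measure.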
What if the browser does not support WebGL?
PlainSkyBottomLineBatchCalculatorBackend operates as the fallback logic.
What makes the WebGL version faster?
From the canvas where the measures are drawn (see the image on the top), the WebGL logic generates an image that has the same width as the canvas and whose height is equal to the number of rows of the canvas. That is, the output has 300 times fewer pixels than the input, which shortens the time taken by gl.readPixels. Also, the WebGL version computes the lines of all measures in the canvas at the same time.
Why is the WebGL version slower on Safari?
While gl.texImage2D takes about 12 milliseconds on Microsoft Edge (Chromium), it takes about 90 milliseconds on Safari. gl.texImage2D converts the canvas where the measures are drawn into a texture, so there might be an implicit pixel data copy. Other browsers might show similar behavior. We have to test on as many browsers as we can, so that we can select a specific SkyBottomLineBatchCalculatorBackend according to the value of navigator.userAgent.
Potential issues
Benchmark codes
(I don't have detailed benchmarks with other IOSMDOptions, but the trend is similar)