Implement batch drawing on WebGL #591

Closed
agmcleod opened this issue Oct 20, 2014 · 46 comments

@agmcleod
Collaborator

This should reduce heavy buffer usage and the number of draw calls required, leading to a performance boost.

parasyte added a commit that referenced this issue Oct 21, 2014
- Platformer CPU usage went down from 30% to 24% on my MBA
- The same patch is needed for fillRect
- The next big win will come from batching [#591]
@parasyte
Collaborator

The referenced commit is not batching. All I did was reuse buffers in drawImage. The same buffer reuse is needed for fillRect.

The process of batching involves keeping a record of everything you want to draw which uses the same texture, then constructing the triangle vertices/texture coordinates/indices, and sending the whole thing in a single drawElements call.

This will make tile layers very fast (an entire map layer can be drawn in a single command), and sprites that use the same texture atlas will get the same benefit.
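As a rough illustration of the index side of this, here is a minimal sketch that builds the index list for a batch of quads sharing one texture. The function name `buildQuadIndices` and the vertex order (top-left, top-right, bottom-left, bottom-right) are assumptions made for this example, not melonJS code.

```javascript
// Hypothetical helper: build the index list for `quadCount` quads.
// Each quad has 4 vertices (assumed order: top-left, top-right,
// bottom-left, bottom-right) and is drawn as two triangles,
// so it contributes 6 indices.
function buildQuadIndices(quadCount) {
    // 16-bit indices can address at most 65536 vertices (16384 quads).
    if (quadCount * 4 > 65536) {
        throw new RangeError("Too many quads for 16-bit indices");
    }
    const indices = new Uint16Array(quadCount * 6);
    for (let q = 0; q < quadCount; q++) {
        const v = q * 4; // first vertex of this quad
        const i = q * 6; // first index slot of this quad
        indices[i + 0] = v + 0;
        indices[i + 1] = v + 1;
        indices[i + 2] = v + 2;
        indices[i + 3] = v + 2;
        indices[i + 4] = v + 1;
        indices[i + 5] = v + 3;
    }
    return indices;
}

// With the vertex and index buffers bound, the whole batch could then be
// issued as one call, for example:
//   gl.drawElements(gl.TRIANGLES, indices.length, gl.UNSIGNED_SHORT, 0);
```

The point of the sketch is that the index count grows with the number of quads while the number of draw calls stays at one per texture.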

@agmcleod
Collaborator Author

Ah yes, I did misunderstand your original post. But regardless, this will definitely help :)

@parasyte
Collaborator

It's a good first step, I agree.

Thinking this morning about how to structure the batching API, it seems natural to use me.TextureAtlas. The whole purpose of this data structure is to store texture coordinates. The coordinate range is a bit different between Canvas and WebGL (pixels vs. floats between 0.0 and 1.0 inclusive), but they require identical behavior.
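To make the difference in coordinate ranges concrete, here is a small sketch that converts a pixel-space atlas region into normalized texture coordinates. The function name `regionToUV` and the `{x, y, w, h}` region shape are assumptions for illustration only.

```javascript
// Hypothetical helper: convert a pixel-space region (as a Canvas renderer
// would use it) into normalized 0.0 .. 1.0 coordinates for WebGL.
function regionToUV(region, textureWidth, textureHeight) {
    return {
        u0: region.x / textureWidth,
        v0: region.y / textureHeight,
        u1: (region.x + region.w) / textureWidth,
        v1: (region.y + region.h) / textureHeight
    };
}

// A 32x32 region at (64, 0) in a 256x128 atlas:
const uv = regionToUV({ x: 64, y: 0, w: 32, h: 32 }, 256, 128);
// uv is { u0: 0.25, v0: 0, u1: 0.375, v1: 0.25 }
```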

So I want to rename me.TextureAtlas to me.CanvasRenderer.Texture, and then extend it as me.WebGLRenderer.Texture. That takes care of the texture coordinates. To build batching on top of that, we have multiple options! yay! My favorite so far is lazy batching:

  • For each me.video.renderer.drawImage(texture, index, x, y, w, h) where index is an atlas index name or number, a reference to the last texture is remembered:
    • If the last reference is the same as the new reference, add index, x, y, w, h to a Batcher object.
    • Else flush the Batcher object (draw the batch, then reset itself) and add index, x, y, w, h
  • At the end of frame drawing, blitSurface is called. This function will flush the Batcher object (draw the batch, then reset)
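The lazy approach above could look roughly like the following sketch. `LazyBatcher` and the `drawBatch` callback are hypothetical names; the callback stands in for whatever would actually build the buffers and issue the draw call.

```javascript
// Hypothetical sketch of lazy batching: items accumulate while the texture
// stays the same, and the batch is flushed when the texture changes.
class LazyBatcher {
    constructor(drawBatch) {
        this.drawBatch = drawBatch; // called as drawBatch(texture, items)
        this.texture = null;        // last texture reference seen
        this.items = [];
    }

    drawImage(texture, index, x, y, w, h) {
        if (this.texture !== texture) {
            // Different texture: draw what we have, then start a new batch.
            this.flush();
            this.texture = texture;
        }
        this.items.push({ index, x, y, w, h });
    }

    // Draw the pending batch (if any), then reset.
    flush() {
        if (this.items.length > 0) {
            this.drawBatch(this.texture, this.items);
            this.items = [];
        }
    }
}

// Usage: record how many batches a frame produces.
const calls = [];
const batcher = new LazyBatcher((texture, items) => {
    calls.push({ texture, count: items.length });
});
batcher.drawImage("tiles", 0, 0, 0, 32, 32);
batcher.drawImage("tiles", 1, 32, 0, 32, 32);
batcher.drawImage("sprites", "hero", 10, 10, 16, 16);
batcher.flush(); // what blitSurface would do at the end of the frame
```

In this run, three drawImage calls collapse into two batches, one per texture.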

My second favorite is explicit batching, where something has to create a new Batcher, add items to draw, then flush the Batcher when ready. IMHO, this will be harder to get working with sprite objects, since there's really no "dividing line" between how sprites should be grouped. TMX has ObjectGroup for this purpose, but there's no guarantee that each object uses the same texture.

The implementation of this Batcher class is TBD, but basically it just needs to take the x, y, w, h screen coordinates, and assemble the vertices for these, appending them to the vertex list. It also uses the texture atlas index to fetch the texture coordinates, and appends them to the texture coordinate list for the batch. The index list (for drawElements) is also updated likewise. It has a second method to flush these lists to the GPU before resetting.
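Since the implementation is still TBD, the following is only a sketch of the two methods described, under assumed names. `QuadBatch`, its `add`/`flush` methods, the `{u0, v0, u1, v1}` shape of the texture coordinates, and the `upload` callback are all hypothetical.

```javascript
// Hypothetical sketch: accumulate vertices, texture coordinates, and
// indices for quads, then hand all three lists over in one go.
class QuadBatch {
    constructor() {
        this.reset();
    }

    reset() {
        this.vertices = [];
        this.texCoords = [];
        this.indices = [];
        this.quadCount = 0;
    }

    // x, y, w, h are screen coordinates; uv holds the region's texture
    // coordinates as { u0, v0, u1, v1 } (an assumed format).
    add(x, y, w, h, uv) {
        const base = this.quadCount * 4; // first vertex of this quad
        this.vertices.push(x, y, x + w, y, x, y + h, x + w, y + h);
        this.texCoords.push(
            uv.u0, uv.v0, uv.u1, uv.v0,
            uv.u0, uv.v1, uv.u1, uv.v1
        );
        this.indices.push(base, base + 1, base + 2, base + 2, base + 1, base + 3);
        this.quadCount++;
    }

    // upload(vertices, texCoords, indices) stands in for the bufferData and
    // drawElements calls a real renderer would make.
    flush(upload) {
        if (this.quadCount > 0) {
            upload(
                new Float32Array(this.vertices),
                new Float32Array(this.texCoords),
                new Uint16Array(this.indices)
            );
        }
        this.reset();
    }
}
```

Plain arrays keep the sketch short; a real batcher would more likely write into preallocated typed arrays to avoid garbage collection.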

As you might tell from this last description, the texture batching will actually "undo" a lot of the bufferData reuse work that I just did! 😉 But it was a super useful exercise that helped me get more familiar with WebGL.

@agmcleod @obiot Please weigh in with your thoughts, especially in regard to how I want to redesign the Texture Atlas.

@agmcleod
Collaborator Author

My only concern with the second one is that it could be limiting in some way. The first method is the one a developer at Big Viking Games used when adding a WebGL renderer to their canvas-like API. Similar to what we're doing, really.

@parasyte
Collaborator

Implicit batching is by far easier to manage. And if you cram everything into a single texture atlas (all tile sets, all sprites) then you can theoretically get the best possible performance by executing a single drawElements call per frame, without any extra coding, or other special setup.

@agmcleod
Collaborator Author

Yep, exactly. Been trying to get better practice at doing so. It's nicer to see a shorter list of files getting uploaded when I SCP stuff to my site.

@parasyte
Collaborator

😋

Also there's something to be said about at least exposing an API to allow custom batching operations. Just in case someone actually wants to manage it themselves for some reason.

@agmcleod
Collaborator Author

agmcleod commented Nov 8, 2014

melonjs/melonjs-spine#1 :)

@parasyte parasyte self-assigned this Dec 20, 2014
@parasyte
Collaborator

I'll start this one. ;)

parasyte added a commit that referenced this issue Dec 20, 2014
- Precalculate the texture regions
- Added a new uvMap to each region for WebGL
parasyte added a commit that referenced this issue Dec 20, 2014
parasyte added a commit that referenced this issue Dec 20, 2014
- Destination coordinates are only useful when trimming is used
- melonJS does not support trimming
@parasyte
Collaborator

  • The TextureAtlas constructor now accepts a new "internal" texture atlas format (for now it's just the same as TexturePacker with "melonJS" in the meta.app field.)
  • The atlas regions each have a uvMap property that specifies texture coordinates (in WebGL triangles) for the region. These can be passed directly to WebGL. For batching, these need to be concatenated together and used with drawElements(gl.TRIANGLES, ...)
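The concatenation step mentioned in the second point might be sketched as follows. The exact layout of `uvMap` is not spelled out here, so this example simply assumes each region carries a `Float32Array`; `concatUVMaps` is a hypothetical helper name.

```javascript
// Hypothetical helper: join the uvMap of every region in a batch into one
// contiguous Float32Array that could be uploaded as a single buffer.
// Assumes each region has a `uvMap` property holding a Float32Array.
function concatUVMaps(regions) {
    let total = 0;
    for (const region of regions) {
        total += region.uvMap.length;
    }
    const out = new Float32Array(total);
    let offset = 0;
    for (const region of regions) {
        out.set(region.uvMap, offset); // copy in batch order
        offset += region.uvMap.length;
    }
    return out;
}
```

The order of the regions has to match the order of the vertices in the batch, otherwise sprites end up with each other's texture coordinates.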

@obiot
Member

obiot commented Dec 21, 2014

But merge "ticket-620" into master first ;)

On 21 Dec. 2014, at 03:29, Jay Oster notifications@github.com wrote:

I'll start this one. ;)



parasyte added a commit that referenced this issue Dec 21, 2014
- The WebGL Renderer adds a texture buffer, UV map, and index buffer to the atlas (these will be used later by the WebGL batcher)
@parasyte
Collaborator

The next step is adding the batcher to the WebGL Renderer. I think this won't be too much trouble. It just needs to do a little bit of memory accounting to avoid GC. I'll start with ~1KB memory buffers, and do the usual growth by 2x pattern (never shrinking).
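The memory accounting described here (start around 1KB, double when full, never shrink) could be sketched like this. `GrowableFloatBuffer` and its methods are hypothetical names, not part of the renderer.

```javascript
// Hypothetical sketch of a grow-only float buffer: capacity doubles when
// exhausted and is kept across resets, so steady-state frames allocate
// nothing.
class GrowableFloatBuffer {
    constructor(initialBytes = 1024) {
        this.data = new Float32Array(initialBytes / Float32Array.BYTES_PER_ELEMENT);
        this.length = 0; // number of floats currently in use
    }

    push(value) {
        if (this.length === this.data.length) {
            // Out of room: allocate twice the capacity and copy over.
            const grown = new Float32Array(this.data.length * 2);
            grown.set(this.data);
            this.data = grown;
        }
        this.data[this.length++] = value;
    }

    // Forget the contents but keep the allocation for the next frame.
    reset() {
        this.length = 0;
    }

    // A view over the used portion only (no copy), suitable for upload.
    view() {
        return this.data.subarray(0, this.length);
    }
}
```

Because the buffer never shrinks, its size settles at the high-water mark of the busiest frame.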

Second, I don't want to change the function signature for drawImage, since that's already well-established. Instead I'll add a public bindTexture method to the Renderer API (there's one right now, but it needs to be replaced) that will remember the provided Texture region, and use it for batching. Using the method will look something like this:

me.video.renderer.bindTexture(texture.getRegion(name)).drawImage(texture, x, y, ...);

That's the best thing I can think of...

Otherwise it should be really straightforward.

@parasyte
Collaborator

On the other hand ... The drawImage method can just introspect the first argument with instanceof. For Image it will use the original signature. And for me.video.renderer.Texture it will expect the region info as the second parameter. Kind of weird, but easy enough!
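A minimal sketch of that introspection, assuming a stand-in `Texture` class and a hypothetical `describeDrawImage` function that only reports which signature was matched (it does not draw anything):

```javascript
// Stand-in for the renderer's Texture type; purely illustrative.
class Texture {
    constructor(regions) {
        this.regions = regions; // name -> { x, y, w, h } in pixels
    }
    getRegion(name) {
        return this.regions[name];
    }
}

// Hypothetical dispatcher: pick the source rectangle depending on the
// type of the first argument.
function describeDrawImage(source, ...args) {
    if (source instanceof Texture) {
        // New signature: (texture, regionName, x, y)
        const [regionName, x, y] = args;
        const region = source.getRegion(regionName);
        return {
            kind: "atlas",
            sx: region.x, sy: region.y, sw: region.w, sh: region.h,
            dx: x, dy: y
        };
    }
    // Anything else is treated as an image with the original signature.
    return { kind: "image", args };
}

const atlas = new Texture({ hero: { x: 64, y: 0, w: 32, h: 32 } });
const viaAtlas = describeDrawImage(atlas, "hero", 10, 20);
const viaImage = describeDrawImage({ width: 256, height: 128 }, 10, 20);
```

Only the selection of source coordinates differs between the two branches; everything after that point can be shared.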

@agmcleod
Collaborator Author

Nice work on this so far, Jason. Do you think using instanceof and checking for different parameter types might duplicate or increase logic in the method? We could potentially refactor it into two methods if you think it would be worth it.

@parasyte
Collaborator

It won't cause any duplicate logic; only the code to select source coordinates will be different. It will also be the cleanest interface, IMHO. (This is probably the design I imagined when I proposed the batcher. I've just forgotten about the details.)

Anyway, let's try it with drawImage getting a new fourth function signature, and see how it works.

@parasyte
Collaborator

This part of the work is actually quite involved (more than I had imagined!) It will require some shader rewrites. The best case scenario is the batcher uploads all of the textures, vertex buffers, UV maps, index buffers, etc to the GPU on the first frame, then only needs to provide transformation matrices for all of the sprites on each frame update.

Some new points need to be addressed:

  • The batcher will need to be reset on each state change (e.g. LoadingScreen -> PlayScreen) to remove old texture state.
  • The GPU has limited texture memory, so fewer textures are better than many. Power-of-two sizing is also important. We need to hard-code the fragment shader to expect a maximum number of textures, like 8. GPUs that have more texture units could benefit from dynamically modifying the shader code before compiling. The number of available texture units is in gl.MAX_COMBINED_TEXTURE_IMAGE_UNITS. The batcher will throw an exception if too many textures are used simultaneously.
  • The fragment shader needs a texture index to select the correct texture. This can be done with an if-statement, similar to the one described here: http://webglsamples.org/sprites/readme.html We will use a single u8 instead of a vec4; the value of the u8 selects the texture.
  • The vertex shader needs to accept an array of 2D transformation matrices for each sprite. (The matrix allows arbitrary translation/rotation/scaling operation order outside of the shader.) Some optimizations to this data structure could be used, to reduce the total amount of data sent to the GPU each frame. This presentation has some great info on the subject: https://docs.google.com/presentation/d/12AGAUmElB0oOBgbEEBfhABkIMCL3CUX7kdAPLuwZ964/edit#slide=id.i177 e.g. each float element can be packed into a u16 fixed point format, and unpacked by the shader, reducing GPU bandwidth requirements by half.
  • We need to remove this magic code:
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    to support hardware accelerated image repeating for me.ImageLayer (see: Optimize ImageLayer "repeat" modes #603) -- requires power-of-two textures.
  • Allowing custom shaders also means allowing a custom batcher; since the batcher is an API for the shaders. Custom shaders may use the same API, but we should provide the flexibility to use a custom API.
  • One day melonJS will host 3D games with the right set of shaders and batcher.
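The idea of generating the fragment shader for a given maximum number of textures, with an if-chain selecting the sampler, might look something like this sketch. The function name and the GLSL variable names (`uSampler`, `vRegion`, `vTextureId`) are assumptions, and the texture index is passed as a float varying here because that is what the sketch assumes the shader receives.

```javascript
// Hypothetical generator: emit fragment shader source that picks one of
// `maxTextures` samplers based on a per-vertex texture index.
function buildFragmentShader(maxTextures) {
    const branches = [];
    for (let i = 0; i < maxTextures; i++) {
        const keyword = i === 0 ? "if" : "else if";
        // Compare against i + 0.5 so small interpolation error in the
        // float index does not select the wrong texture.
        branches.push(
            `    ${keyword} (vTextureId < ${i}.5) {\n` +
            `        gl_FragColor = texture2D(uSampler[${i}], vRegion);\n` +
            `    }`
        );
    }
    return [
        "precision mediump float;",
        `uniform sampler2D uSampler[${maxTextures}];`,
        "varying vec2 vRegion;",
        "varying float vTextureId;",
        "void main(void) {",
        branches.join("\n"),
        "}"
    ].join("\n");
}

// The count could come from the GL context at startup instead of being
// fixed at 8; the resulting string would then be compiled as usual.
const source = buildFragmentShader(8);
```

Each sampler is indexed with a literal constant in its own branch, which is the reason for the if-chain rather than a single dynamic array lookup.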

parasyte added a commit that referenced this issue Dec 28, 2014
- It wasn't a 3D matrix (a 3D matrix is 4x4)
- The only difference with me.Matrix2d is that me.Matrix3d also stored the hidden row to make the matrix square (required by WebGL)
- This patch adds the equivalent hidden row storage to me.Matrix2d
- Fixed me.Matrix2d.translate()
- Some optimizations in me.Matrix2d

Breaking changes:
- me.Matrix2d.set() now requires all 9 arguments instead of only 6
- me.video.renderer.transform() now requires a me.Matrix2d instance instead of individual number arguments
parasyte added a commit that referenced this issue Jan 16, 2015
- Fixes fragment shader compilation on some systems
parasyte added a commit that referenced this issue Jan 16, 2015
… uniforms!

- That was easier than I imagined. ;)
- This should make it nicer to work with when building custom compositors!
parasyte added a commit that referenced this issue Jan 16, 2015
- This change allows multiple shader programs to be created; the singleton API is no longer used for shader program storage
- Also fixes a missing description in documentation
parasyte added a commit that referenced this issue Jan 16, 2015
- No need for a createShader method
- Lookup the texture unit within the uploadTexture method
@parasyte
Collaborator

Alright, I think it's finally in a pretty stable state! The stuff I did tonight focuses on customizability of the WebGL environment. I don't want to tie any users down in regards to how they use WebGL. So now it's possible to use an entirely custom Compositor class by passing the compositor option to me.video.init()! I don't think I'll be writing a different compositor any time soon, but I like that we can provide this flexibility for others who wish to experiment.

Another important change is that the me.video.shader.createShader() method is now independent of the singleton, allowing it to compile multiple shader programs. This I will be using in our default compositor for the line-rendering (e.g. fillStroke) shader program. The compositor just needs to flush when switching between shader programs.

Most of my TODO lists are already done, which is exciting! And I heard today from @ldd that his tests show a nice improvement in rendering speed. Apparently in his tests, CanvasRenderer is capable of 142 objects max and WebGLRenderer is capable of over 500. This is a good start, but I want more! :)

parasyte added a commit that referenced this issue Jan 18, 2015
parasyte added a commit that referenced this issue Jan 18, 2015
- This shows a potential optimization path; melonJS should only use known regions, so we can remove this safety net code!
- Requires Texture [atlas] support for TMX tilesets and bitmap fonts, at the very least
obiot added a commit that referenced this issue Jan 19, 2015
…spritesheet

some redundant code to be removed later.
parasyte added a commit that referenced this issue Jan 19, 2015
- Lots of new FIXME comments :\
- The attributes and uniforms need to be configured as part of the `Compositor.useShader` function (maybe with a callback?)
- The lines feel like they are rendered with poor precision; they are drawn perfectly on pixel boundaries without antialiasing...
parasyte added a commit that referenced this issue Jan 20, 2015
- It seems the WebGLRenderer.setColor() overwrites the alpha channel, even for colors that don't specify an alpha component.
- Maybe we should "fix" this in CanvasRenderer.setColor(); use the same me.Color code there
@parasyte
Collaborator

With the last few commits, stroke (line rendering) is finally in place. It's not efficient, though. At first glance, it appears that the depth buffer can make it very efficient; we just need a way to get the Z-coordinate information into the compositor. That will likely depend on the work in #637

In the meantime, there are a few FIXME comments that need to be addressed (especially with how the uniform variables are set, and the attribute bindings are handled).

Second to that, getting fonts working (and in particular replacing the RTT thing in the debugPanel) is a priority for release. There's also a weird ghosting effect seen on the debugPanel with WebGLRenderer. That needs to be investigated further.

parasyte added a commit that referenced this issue Feb 4, 2015
…WebGL

- Changes me.Font API to accept a Renderer reference (was Context2D reference)
- renderer.drawFont() is now private
- renderer.measureText() is gone! Use me.Font.measureText()
- TODO: Create a second texture cache for font textures, so we aren't creating thousands of new textures each frame in the font_test example
- XXX: Adds a new renderer.fontContext2D reference that we probably don't want to keep. Replace this with the secondary texture cache
- XXX: Find a better way to integrate renderer.drawFont() and me.Font.draw(); using the renderer.fontContext2D hack SUCKS!
@parasyte
Collaborator

parasyte commented Feb 4, 2015

Started working on font support in WebGL. The hack in the branch is pretty ugly, but it does make the me.Font API consistent! (Solves #619)

It's currently very slow with the font_test example, because it spends most of its time creating and uploading massive textures. 😆 The secondary texture cache (proposed in the commit) will help that a little bit.

A better way to support fonts in WebGL will be important long-term, but this will work for 2.1!

@agmcleod
Collaborator Author

agmcleod commented Feb 4, 2015

Awesome! For now we can recommend keeping usage of the me.Font API simple, or using Canvas instead :)

obiot added a commit that referenced this issue Feb 6, 2015
…webgl renderer

this "FIXME" was just bothering me :P:P:P
obiot added a commit that referenced this issue Feb 8, 2015
@obiot
Member

obiot commented Mar 26, 2015

@parasyte if you don't mind, could you maybe create one or several small tickets to better identify what's left to be done for this one?

@parasyte
Collaborator

Everything left to do is a task here and here

@obiot
Member

obiot commented Mar 27, 2015

Oh sorry, missed that, but in my defense this ticket is super long now ;P

@obiot
Member

obiot commented Mar 29, 2015

did you guys see that ?
http://patriciogonzalezvivo.com/2015/thebookofshaders/

@obiot obiot modified the milestones: 2.1.1, 2.1.0 May 8, 2015
@parasyte
Collaborator

parasyte commented Jun 3, 2015

Closing this. Followup ticket is #637

@parasyte parasyte closed this as completed Jun 3, 2015