Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with converting the OpenGL glDrawElements function to WebGL in emscripten #4214

Closed
hnn0003 opened this issue Mar 31, 2016 · 50 comments
Closed
Labels

Comments

@hnn0003
Copy link

hnn0003 commented Mar 31, 2016

I'm running into an issue with how emscripten converts the OpenGL glDrawElements function to the WebGL equivalent (Glctx.drawElements). I had been getting a number of weird triangles popping around in my graphics, and I believe I’ve tracked it down to this:

In emscripten’s library_gl.js file resides the glDrawElements function that replaces the OpenGL call with the same name. Part of what this function does is set up a buffer for the indices (called GLctx.ELEMENT_ARRAY_BUFFER) and (in the GL.preDrawHandleClientVertexAttribBindings function call) it sets up a buffer for the vertices (called GLctx.ARRAY_BUFFER).

The problem is that in the GL.preDrawHandleClientVertexAttribBindings function, it is setting how large the GLctx.ARRAY_BUFFER data array will be based on the variable ‘count’, which is the size of the GLctx.ELEMENT_ARRAY_BUFFER (i.e. the array of indices). This is a problem because what the Glctx.drawElements function will actually do is use the values in the index array to determine where it needs to get its data from in the vertex array (i.e. GLctx.ARRAY_BUFFER). So if the values in the index array point to an index larger than the size of the index array, the Glctx.drawElements function will try to get data from outside the bounds of the GLctx.ARRAY_BUFFER (which just gives junk data).

An example of where the current code would have this happen (i.e. failing to produce the correct graphics) would be if you defined a triangle with an index array (or GLctx.ELEMENT_ARRAY_BUFFER) of [0, 5, 6] and a vertex array (or GLctx.ARRAY_BUFFER) containing only the position in (x,y,z) with value of
[-1,-1,-5,
1, 2, 3,
2, 3, 4,
3, 4, 5,
4, 5, 6,
1, -1, -5,
1, 1, -5].
After running through emscripten, the program would calculate the size of the index array to be 3 (since only 3 indices were given) and each position has 3 values (x, y, and z), so the GLctx.ARRAY_BUFFER would be made to hold only the first 9 values:
[-1,-1,-5,
1, 2, 3,
2, 3, 4]
However, when the WebGL function GLctx.drawElements is called, it would be looking for values in the GLctx.ARRAY_BUFFER array at 0, 5 and 6 (which should give it -1, -1, -5 and 1, -1, -5 and 1, 1, -5 respectively). But since the size of the GLctx.ARRAY_BUFFER is based on the fact that there are 3 indices (and therefore doesn’t include the values after the first 9), it will actually just be getting junk values for indices 5 and 6.

Solution:
I believe the best way to fix this (and at least the only way I can see to fix it) would be to base the size of the GLctx.ARRAY_BUFFER data array on the highest/maximum index value in the index array instead of basing it on the size of the index array. This would ensure that you would not accidentally make the GLctx.ARRAY_BUFFER data array shorter than it should be. To this end, I came up with a fix for this problem (I put a copy of this diff below). I know that this works for my code, but I believe it should also work in general. But if there’s a better or more concise way to fix this, great.

Basically what I’ve done is find the maximum index value in the index array and use that as the parameter for the GL.preDrawHandleClientVertexAttribBindings function instead of using the size of the index array (which makes the size of the GLctx.ARRAY_BUFFER data array be based on the maximum index value instead of the size of the index array).

@@ -6975,6 +6975,33 @@
   glDrawElements: function(mode, count, type, indices) {
#if FULL_ES2
     var buf;
+
+    // Calculate the maximum index number
+    var maxIndex = count;
+    switch (GL.byteSizeByType[type - GL.byteSizeByTypeRoot]) {
+        case 1: 
+            var indicesArray = HEAPU8.subarray((indices>>0), (indices>>0) + count);
+            break;
+        case 2: 
+            var indicesArray = HEAPU16.subarray((indices>>1), (indices>>1) + count);
+            break;
+        case 4:
+            var indicesArray = HEAPU32.subarray((indices>>2), (indices>>2) + count);
+            break;
+        case 8:
+            var indicesArray = HEAP64.subarray((indices>>3), (indices>>3) + count);
+            break;
+        default:
+    }
+    if (indicesArray) {
+        maxIndex = indicesArray[0];
+        for (var i = 1; i < count; i++) {
+            if (indicesArray[i] > maxIndex)
+                maxIndex = indicesArray[i];
+        }
+        maxIndex += 1; // indices start at 0, so final size is max+1
+    }
+
     if (!GL.currElementArrayBuffer) {
       var size = GL.calcBufLength(1, type, 0, count);
       buf = GL.getTempIndexBuffer(size);
@@ -6987,7 +7014,7 @@
     }
     // bind any client-side buffers
-    GL.preDrawHandleClientVertexAttribBindings(count);
+    GL.preDrawHandleClientVertexAttribBindings(maxIndex);
#endif
     GLctx.drawElements(mode, count, type, indices);

@eukarpov
Copy link
Contributor

I checked glDrawElements in library_gl.js and it looks good. Can you put here a small example which you use for rendering?

@kripken
Copy link
Member

kripken commented Apr 1, 2016

An example would also be useful as a testcase for the test suite, assuming this turns out to be a bug.

@hnn0003
Copy link
Author

hnn0003 commented Apr 5, 2016

Sorry I haven't had a chance to reply till now. Hopefully this works for the example you requested:

Below is a snipet of opengl code that should work to test against. It simply produces (or is supposed to produce) a red triangle in the middle of the screen with a black background. If you use the variables vVertices and ndxArray for the data for the vertex array and the index array, then it works fine in opengl, but fails (as in, the triangle doesn't appear as it is supposed to) when using the current emscripten to generate the js/webGL version (though it worked for me using emscripten when I tested this with the fix I posted above). If you use the variables vVertices2 and ndxArray2 for the data, then it works fine for both emscripten and for the traditional opengl.

Testing wise, in this example the size of the vertex array that is made in the GL.preDrawHandleClientVertexAttribBindings function should contain 7 generic vertex attribute values (so 7 values for vertex attribs * 3 components--x,y,z--for each attribute = 21 values in the vVertices array * 4 bytes for floats = a total of 84 bytes) for the vVertices/ndxArray set of data and 3 generic vertex attribute values (for a total of 36 bytes) for the vVertices2/ndxArray2 set of data.

So, in the GL.preDrawHandleClientVertexAttribBindings function, the variable 'size' should equal 84 bytes when the vVertices/ndxArray data is used and 34 bytes when the vVertices2/ndxArray2 is used. (though, of course, you could test it in a few other ways too)


    // Load the vertex shader
    const GLchar *vShaderStr =
        "attribute vec4 vPosition;    \n"
        "void main()                  \n"
        "{                            \n"
        "   gl_Position = vPosition;  \n"
        "}                            \n";
    GLuint vertexShader = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(vertexShader, 1, &vShaderStr, NULL);
    glCompileShader(vertexShader);
    //Load the fragment shader
    const GLchar *fShaderStr =
        "precision mediump float;\n"\
        "void main()                                  \n"
        "{                                            \n"
        "  gl_FragColor = vec4 ( 1.0, 0.0, 0.0, 1.0 );\n"
        "}                                            \n";
    GLuint fragmentShader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fragmentShader, 1, &fShaderStr, NULL);
    glCompileShader(fragmentShader);

    // Create the program object
    GLuint programObject = glCreateProgram();

    glDisable(GL_DEPTH_TEST);
    glAttachShader(programObject, vertexShader);
    glAttachShader(programObject, fragmentShader);

    // Bind vPosition to attribute 0   
    glBindAttribLocation(programObject, 0, "vPosition");

    // Link the program
    glLinkProgram(programObject);

    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);


    GLfloat vVertices[] = { 
        0.0f, 0.5f, 0.0f,
        1.0f, 2.0f, 3.0f,
        2.0f, 3.0f, 4.0f,
        3.0f, 4.0f, 5.0f,
        4.0f, 5.0f, 6.0f,
        -0.5f, -0.5f, 0.0f,
        0.5f, -0.5f, 0.0f };
    GLfloat vVertices2[] = { 0.0f,  0.5f, 0.0f,
        -0.5f, -0.5f, 0.0f,
        0.5f, -0.5f, 0.0f };
    GLshort ndxArray[] = { 0,5,6 };
    GLshort ndxArray2[] = { 0,1,2 };

    // Clear the color buffer
    glClear(GL_COLOR_BUFFER_BIT);

    // Use the program object
    glUseProgram(programObject);

    // Load the vertex data
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, vVertices);
    glEnableVertexAttribArray(0);

    glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_SHORT, ndxArray);

One other thing: A college of mine just told me about opengl's primitive restart functionality, which I've never used before. It's apparently used to sort-of restart when you're drawing a triangle strip, by inserting a specific number (such as the max int allowed) into the index array, which lets opengl know to start a new triangle strip. This would potentially mess up the example fix that I posted here. I am very fuzzy on how it works or what versions of openGL actually support this feature. From basic googling, it looks to me like it's possibly in/going to be in WebGL2, ES 3, and OpenGL 4.3. I've no idea how emscripten is handling / will handle that, or precisely what to do with this new information in regards to the bug I'm seeing in the glDrawElements function, as I'm not totally clear on how to see what the current primitive restart index would be or how emscripten handles this now. Any thoughts or ideas?

Thank you both for all your help on this, and please let me know if you need anything else from me.

@kripken
Copy link
Member

kripken commented Apr 6, 2016

Thanks for the testcase. This does look like a problem. Let's see what @juj thinks (might be delayed, he's on vacation).

@eukarpov
Copy link
Contributor

eukarpov commented Apr 6, 2016

It does and in such a important function. glDrawRangeElements can be a workaround without losing performance but this function is available only in OpenGL ES 3.0

@eukarpov
Copy link
Contributor

eukarpov commented Apr 8, 2016

The problem can be solved by using vbo because glDrawElements doesn't have to calculate buffer for loading. This example works fine for me.

    // Load the vertex data
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vVertices), vVertices, GL_STATIC_DRAW);

    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
    glEnableVertexAttribArray(0);

    glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_SHORT, ndxArray);

    glDisableVertexAttribArray(0);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glDeleteBuffers(1, &vbo);

However glDrawElements has to change in case if data is on client side. Here is important to optimize the function because of performance and detect case when max element should be calculated .

draft

--- a/src/library_gl.js
+++ b/src/library_gl.js
@@ -6981,6 +6981,7 @@ var LibraryGL = {
   glDrawElements: function(mode, count, type, indices) {
 #if FULL_ES2
     var buf;
+    var vertexes = 0;
     if (!GL.currElementArrayBuffer) {
       var size = GL.calcBufLength(1, type, 0, count);
       buf = GL.getTempIndexBuffer(size);
@@ -6988,12 +6989,26 @@ var LibraryGL = {
       GLctx.bufferSubData(GLctx.ELEMENT_ARRAY_BUFFER,
                                0,
                                HEAPU8.subarray(indices, indices + size));
+    
+     // Detecting vertex count if attributes' data should be loaded
+     for (var i = 0; i < GL.currentContext.maxVertexAttribs; ++i) {
+        var cb = GL.currentContext.clientBuffers[i];
+        if (cb.clientside && cb.enabled) {
+            var array_classes = {
+                '5121' /* GL_UNSIGNED_BYTE */: Uint8Array,
+                '5123' /* GL_UNSIGNED_SHORT */: Uint16Array,
+                '5125' /* GL_UNSIGNED_INT */: Uint32Array};
+            if (array_classes[type])
+                vertexes = Math.max.apply(null, new array_classes[type](HEAPU8.buffer, indices, count)) + 1;
+            break;
+        }
+     }
       // the index is now 0
       indices = 0;
     }

     // bind any client-side buffers
-    GL.preDrawHandleClientVertexAttribBindings(count);
+    GL.preDrawHandleClientVertexAttribBindings(vertexes);
 #endif

     GLctx.drawElements(mode, count, type, indices);  

@eukarpov
Copy link
Contributor

eukarpov commented Apr 8, 2016

commit a4fcb6b

@hnn0003
Copy link
Author

hnn0003 commented Apr 8, 2016

Awesome. Thanks for all your help!

Do you know when/approximately when this fix will make it into a release?

@kripken
Copy link
Member

kripken commented Apr 8, 2016

I'm confused, how does that commit show up as part of this repo? I don't see a PR for it, and it doesn't look merged into any branch? I also don't see it in my local git clone.

@eukarpov
Copy link
Contributor

eukarpov commented Apr 8, 2016

Yeah, me too. This is just a commit to my fork, i don't why it looks like this.

@kripken
Copy link
Member

kripken commented Apr 8, 2016

Weird... might be a github bug?

Anyhow, if that code is ready, please open a PR with it. And thanks for looking into this :)

@juj
Copy link
Collaborator

juj commented Apr 9, 2016

Agreed that there is a problem that the FULL_ES2=1 mode is not able to know how many vertices are needed to be uploaded. The assumption has been that there are (at most) as many vertices than there are indices, so if any of the index values is larger than the total number of indices drawn (like your example code), the issue does occur.

In GLES2, the function glDrawRangeElements has been added exactly for this purpose, so that the application can signal GL directly what the min and max indices in the index array are. See https://www.opengl.org/sdk/docs/man/html/glDrawRangeElements.xhtml. This has the advantage that an application might know this without having to do the costly iteration through the index array to know it.

The proposed PR #4238 looks like it has extremely high performance footprint, since it does O(#of rendered indices) amount of work to figure out the max index, and allocates a new JS object at each index. I'd like this to be solved in a different way that does not impact performance and is optional to only be applied in codebases that know they need this fix so they don't receive the performance impact needlessly:

  1. add support for glDrawRangeElements in FULL_ES2=1 mode. This lets applications submit the min and max indices to resolve the issue in a fast manner.
  2. retain the assumption that max index < # of indices in FULL_ES2=1 builds, and for codebases that one doesn't want to convert to using glDrawRangeElements, add a new mode e.g. FULL_ES2=2, which does the work of counting on demand like proposed in this PR. I think the counting should be done via an implementation in asm.js for speed, since this is an extremely performance critical portion of a rendering loop. The code could be hand-implemented in asm.js in src/library_gl.js, and make it call glDrawRangeElements internally once it has the answer.
  3. possibly add in GL_ASSERTIONS=1 + FULL_ES2=1 builds a check that counts the max and ensures that it is small enough, so that people can use this mode to detect if they are hitting this problem (and then choose if they want to fix by using FULL_ES2=2 mode or by migrating to glDrawRangeElements).

@eukarpov: How does that sound?

@eukarpov
Copy link
Contributor

eukarpov commented Apr 9, 2016

How I mentioned before glDrawRangeElements is available only in OpenGL ES 3.0
https://www.khronos.org/opengles/sdk/docs/man3/html/glDrawRangeElements.xhtml
Relative to PR #4238

  1. complexity O(n) in case if there is a buffer on client side
  2. Why?

allocates a new JS object at each index

I was thinking about other solution, but not sure if it's possible to implement.
in case if a buffer is on client side, use a buffer's pointer to detect heap block and its size. then load a block from the buffer pointer till end of the memory block.

@ringoz
Copy link
Contributor

ringoz commented Oct 21, 2016

When people use FULL_ES2, they already know there is a performance impact involved.
But in return they expect proper emulation, which glDrawElements does not deliver currently.

@dwhipps
Copy link

dwhipps commented Apr 21, 2017

For the record, the fix proposed in PR #4238 also seems to solve my problem, but I agree that there could be a significant performance impact. When can we expect a proper fix for this to be included in Master branch?

@dwhipps
Copy link

dwhipps commented Apr 24, 2017

@juj @kripken This is a very big issue and has been identified for over a year. What's the best way I can help getting this fix in there?

@kripken
Copy link
Member

kripken commented Apr 24, 2017

I'm not a GL expert, but based on @juj's last comment here, I think we can add this if it's a new option (so it doesn't add overhead). Or if we show benchmark numbers that prove it has no new overhead, we might not need a new option.

@dwhipps
Copy link

dwhipps commented Apr 24, 2017

@kripken I'm sure it's fine to add as a new option, but to be clear, this is definitely a bug. The function doesn't work as advertised, in its current incarnation. Seems strange to have to use a special option in order to have the function "work." I implemented the fix locally, but as pointed out above, there are likely better / more efficient ways of doing it.

@kripken
Copy link
Member

kripken commented Apr 24, 2017

Yeah, agreed it's a bug, but if it's one that is rarely encountered and has severe perf impact to properly fix, then we need to weigh those factors too, I think that's what was mentioned before.

Alternatively maybe the fixed/slow state should be the default, and an option added to speed things up unsafely with the bug. I guess it depends on the speed difference here, we should measure that.

@eukarpov
Copy link
Contributor

eukarpov commented Apr 24, 2017

I think this problem is critical for people who just started working with openGL and emscripten.
Instead of simple example how it was originally in this bug, they need to understand how VBO works. And usually if developers need to improve their performance they switch to VBO.
It should be solved to engage new developers.
Simple message on compile or running time about performance should be enough.

@kripken
Copy link
Member

kripken commented Apr 25, 2017

Makes sense. Sounds like most people here would prefer this be fixed by default + add an option to speed things up unsafely? (and add a warning somewhere, perhaps)

@juj
Copy link
Collaborator

juj commented Apr 26, 2017

The bug here affects a specific set type of usage, specifically those codebases that refer to indices >= the size of the index buffer. Everyone agrees that this is a bug, but the appropriate method to fix was not yet agreed above.

If this was fixed with a method that causes large performance impact for existing developers, then those developers would likely complain a performance regression bug against Emscripten. That is why I proposed a second flag, and a development time assertions check, so that developers can investigate whether they need the more involved emulation or not. I know a number of developers are currently shipping production code with -s FULL_ES2=1 and they do not use index buffers with indices >= buffer size, so they would not appreciate an unjustified (from their perspective) slowdown.The above PR had the concern of being always enabled and possibly slow since it causes GC pressure from each draw call, and the most recent state was that we have not yet progressed to resolve those concerns. A good fix should accommodate all people here.

It would be best to use a C side unrolled loop to compute the max index element so that it does not generate garbage. Using a technique similar to used in the PR #4898 it's possible to mix C and JS code to implement this emulation. Happy to review if someone wants to write a PR to do that, or I can also put this on my queue if that would be preferable?

@dwhipps
Copy link

dwhipps commented Apr 26, 2017

@juj It sounds like you have the best handle on this, and it would make sense for you to implement an efficient fix, but of course that's not my call. Thanks again to all of you for re-engaging on this.

@eukarpov
Copy link
Contributor

@juj How to write performance and garbage test for that code? Just i wanted to be sure that it is so bad and C implementation is needed there

@dwhipps
Copy link

dwhipps commented May 30, 2018

Any solution here? I got bit by this again, because I had updated Emscripten and forgot to apply my patch. Could we not just implement this with another flag?

@kripken
Copy link
Member

kripken commented May 30, 2018

@dwhipps it looks like @juj doesn't have time to resolve this. I can try to help out here. Reading the above discussion, I'm not sure what you mean by "my patch" - is it the PR linked, or the patch pasted in a comment here, or something else? (I'm trying to read the code with the best solution for the problem, which is what I assume is in that patch?)

@dwhipps
Copy link

dwhipps commented May 31, 2018

@kripken. Sorry, I meant the solution proposed here:
a4fcb6b

The issue is that it's not really performant. I guess there's not much we can do about that?

@kripken
Copy link
Member

kripken commented May 31, 2018

Hmm, it doesn't look that slow to me - those loops should be fast? I don't see concrete measurements in the discussion above, that might be worth doing.

Anyhow, I see no risk in adding this behind a flag, that way at least you don't need to maintain a patch. But hopefully we can land it enabled by default if it's fast enough.

@kripken
Copy link
Member

kripken commented May 31, 2018

As a starting point, if you want to open a PR with that as an option, that would be good I think.

@dwhipps
Copy link

dwhipps commented May 31, 2018

Ok. Might just be my code that is slow... I haven't done any performance testing on that specific method, just my use of it. Not sure I'm comfortable adding a new flag, and the code is literally there to copy paste. Would prefer someone with a bit more experience contributing to handle it, but happy to help out if you guys don't have the bandwidth.

@eukarpov
Copy link
Contributor

eukarpov commented Jun 1, 2018

As I said before i guess the performance is fine here and complicity is just O(n).
I can open a PR once again with this commit and add some tests if you suggest the best place for that. Thanks!

@kripken
Copy link
Member

kripken commented Jun 4, 2018

@eukarpov thanks! that sounds good. For tests, maybe a good relevant one that uses FULL_ES2 is test_cubegeom_pre_vao_es (in tests/test_browser.py). Can create a similar test to that, basically adding 2 lines to test_browser.py, the source file, and the output file (for the output, I usually compile the test manually, run it, and do save-as of the image shown).

My feeling is that this behavior should be correct by default, so no need for a new flag. But we can optionally add a flag to disable it, if people want. Could be left to a followup later.

@kripken
Copy link
Member

kripken commented Jun 4, 2018

Actually, it might be nice to add that flag with that PR, and add a test for what happens without that flag set, that is, doing the assumption that leads to incorrect results. Those two tests together would give better testing overall, I think.

@stale
Copy link

stale bot commented Sep 18, 2019

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

@caiiiycuk
Copy link
Contributor

caiiiycuk commented Jul 18, 2024

Hi, faced this issue, even this issue is reported long time ago it was hard to find for me. In my case game rendering works in 99% of cases, only some objects are missing. I read documentation many times but didn't pay attention to issue#4214 note 🥇 .

Anyway after digging and debugging I also realized that size in GL.calcBufLength(cb.size, cb.type, cb.stride, count) is not correct. And if I understand correctly, there is no workaround for it.

@juj What do you think if

1st.

Add in emscripten.h:

void emscripten_glVertexAttribPointer (
   GLuint index, 
   GLint size, GLenum type, 
   GLboolean normalized, 
   GLsizei stride, 
   const void *pointer, 
   GLuint exact_size // NEW
);

2st.

Change library_webgl.js like this:

  glVertexAttribPointer: (index, size, type, normalized, stride, ptr, exact_size) => {
#if FULL_ES2
    var cb = GL.currentContext.clientBuffers[index];
#if GL_ASSERTIONS
    assert(cb, index);
#endif
    if (!GLctx.currentArrayBufferBinding) {
      cb.exact_size = exact_size; // NEW

// ...

    preDrawHandleClientVertexAttribBindings: (count) => {
      GL.resetBufferBinding = false;

      // TODO: initial pass to detect ranges we need to upload, might not need
      // an upload per attrib
      for (var i = 0; i < GL.currentContext.maxVertexAttribs; ++i) {
        var cb = GL.currentContext.clientBuffers[i];
        if (!cb.clientside || !cb.enabled) continue;

        GL.resetBufferBinding = true;

        var size = cb.exact_size ?? GL.calcBufLength(cb.size, cb.type, cb.stride, count); // NEW

I think this change is quite simple and effective. This also solve my case, in current proejct I always know the size of buffer at binding time.

@eukarpov
Copy link
Contributor

Perhaps it can be fixed like this, however it is not OpenGL ES 2.0 API.
https://registry.khronos.org/OpenGL-Refpages/es3.0/html/glVertexAttribPointer.xhtml

I still think that to calculate buffer size how it was proposed above is a correct direction with a warning message that this function might need an optimization or to switch to OpenGL ES 3.0 which has this support.
https://registry.khronos.org/OpenGL-Refpages/gl4/html/glDrawRangeElements.xhtml

Leaving this bug unsolved does not seem right way.

@caiiiycuk
Copy link
Contributor

This is exactly what I want, so here what I can do in my project:

#ifdef EMSCRIPTEN
#include <webgl/webgl2.h>
#endif

//...
#ifdef EMSCRIPTEN
    emscripten_glDrawRangeElements(GL_TRIANGLES, 0, m_NbVertex, count, GL_UNSIGNED_SHORT, indices);
#else
    glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, indices);

But, unfortunately glDrawRangeElements does nothing, it just call glDrawElements:

  glDrawRangeElements: (mode, start, end, count, type, indices) => {
    // TODO: This should be a trivial pass-though function registered at the bottom of this page as
    // glFuncs[6][1] += ' drawRangeElements';
    // but due to https://bugzilla.mozilla.org/show_bug.cgi?id=1202427,
    // we work around by ignoring the range.
    _glDrawElements(mode, count, type, indices);
  },

There is some implemenation in LEGACY_GL_EMULATION mode, but I think it something different. Also I don't want to enable FULL_ES3, cause produces some rendering artifacts if compare to FULL_ES2

@eukarpov
Copy link
Contributor

Could you please try this solution a4fcb6b and provide performance feedback?
It might the best way to resolve this issue without mixing different APIs.

@caiiiycuk
Copy link
Contributor

Could you please try this solution a4fcb6b and provide performance feedback? It might the best way to resolve this issue without mixing different APIs.

I tested it. Visually it works just fine: no missed geometries, no artifacts. But I can't tell about perfomance, my game is pretty simple and always run on 60FPS. I will try to test it on more devices.

@eukarpov
Copy link
Contributor

I suggest to proceed with this solution which supports default API and add a warning message about a potential performance issue. If it is an acceptable resolution for your project I will reopen this PR next week.

@caiiiycuk
Copy link
Contributor

For the current project, yes, this is the best option. However I'm also a bit scared about performance, maybe we can add another definition that is not enabled by default, like -sGL_PRECIES_DRAWELEMENTS or something like that?

@eukarpov
Copy link
Contributor

My point of view is that full, proper emulation should be supported out of the box and not by hiding bugs under feature flags.

@kripken
Copy link
Member

kripken commented Jul 22, 2024

@eukarpov I am not a GL expert but we have similar code to yours in other parts of FULL_ES2, so I think your proposed fix is reasonable. In general FULL_ES2 does add some overhead in return for proper emulation out of the box, as you said. So I support moving forward with your change.

@caiiiycuk I also agree we should be careful about performance, but do you think this is more risky than other parts of FULL_ES2 code that also loop on GL.currentContext.maxVertexAttribs? (I hope it is no slower than them, but again, I am not a GL expert...)

@caiiiycuk
Copy link
Contributor

@kripken I tested average Unity Game with lot of draw calls that works on 40FPS on iPhone8 / Android, applying this changes does not affect performance at all. So I also agreed to complete this PR.

@eukarpov
Copy link
Contributor

@kripken
Copy link
Member

kripken commented Jul 31, 2024

Sounds good, thanks @eukarpov @caiiiycuk !

@juj
Copy link
Collaborator

juj commented Aug 1, 2024

@kripken I tested average Unity Game with lot of draw calls that works on 40FPS on iPhone8 / Android, applying this changes does not affect performance at all. So I also agreed to complete this PR.

I don't think Unity renders from client side memory at all, so the slow path in the proposed PR would not be incurred by testing in Unity.

@caiiiycuk
Copy link
Contributor

@kripken I tested average Unity Game with lot of draw calls that works on 40FPS on iPhone8 / Android, applying this changes does not affect performance at all. So I also agreed to complete this PR.

I don't think Unity renders from client side memory at all, so the slow path in the proposed PR would not be incurred by testing in Unity.

Idk, but when I comment body of that function the game renders nothing

@juj
Copy link
Collaborator

juj commented Aug 1, 2024

Unity definitely uses glDrawElements() to render most of its mesh data, but it does not use rendering from client side memory (to the best of my understanding).

@caiiiycuk
Copy link
Contributor

Anyway, I use the following script to patch the resulting js. Maybe it will be useful for someone.
Run it like this:

python fix-gles.py <path to emscripten js>

Source:

#!/bin/python

import sys

file = sys.argv[1]
pattern = "_glDrawElements=(mode,count,type,indices)=>{"

with open(file, "r") as f:
    text = f.read()
    start = text.find(pattern) + len(pattern) - 1
    end = start + 1
    balance = 0
    while 0 <= balance:
        if text[end] == "{":
            balance = balance + 1
        if text[end] == "}":
            balance = balance - 1
        
        end = end + 1
    
    text = text[0:start + 1] + """
    var buf;
    var vertexes = 0;
    if (!GLctx.currentElementArrayBufferBinding) {
        var size = GL.calcBufLength(1, type, 0, count);
        buf = GL.getTempIndexBuffer(size);
        GLctx.bindBuffer(34963, buf);
        GLctx.bufferSubData(34963, 0, HEAPU8.subarray(indices, indices + size));
        // Calculating vertex count if shader's attribute data is on client side
        if (count > 0) {
            for (var i = 0; i < GL.currentContext.maxVertexAttribs; ++i) {
                var cb = GL.currentContext.clientBuffers[i];
                if (cb.clientside && cb.enabled) {
                    var array_classes = {
                        5121 /* GL_UNSIGNED_BYTE */: Uint8Array,
                        5123 /* GL_UNSIGNED_SHORT */: Uint16Array,
                        5125 /* GL_UNSIGNED_INT */: Uint32Array};
                    if (array_classes[type])
                        vertexes = Math.max.apply(null, new array_classes[type](HEAPU8.buffer, indices, count)) + 1;
                    break;
                }
            }
        }
        indices = 0
    }
    GL.preDrawHandleClientVertexAttribBindings(vertexes);
    GLctx.drawElements(mode, count, type, indices);
    GL.postDrawHandleClientVertexAttribBindings();
    if (!GLctx.currentElementArrayBufferBinding) {
        GLctx.bindBuffer(34963, null)
    }
    """ + text[end - 1:]

with open(file, "w") as f:
    f.write(text)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants
@kripken @juj @ringoz @caiiiycuk @eukarpov @dwhipps @hnn0003 and others