WebGL Internal error: 0x00000502: Vertex buffer is not big enough for the draw call #3578

Closed
madmaxio opened this issue Mar 11, 2023 · 13 comments · Fixed by #3597
Labels
api: gles (Issues with GLES or WebGL); area: validation (Issues related to validation, diagnostics, and error handling); help wanted (Contributions encouraged); type: bug (Something isn't working)

Comments

@madmaxio

This happens on WebGL when recreating a buffer:

[.WebGL-0000554C38753500] GL_INVALID_OPERATION: Error: 0x00000502, in ..\..\third_party\angle\src\libANGLE\renderer\d3d\VertexDataManager.cpp, reserveSpaceForAttrib:520. Internal error: 0x00000502: Vertex buffer is not big enough for the draw call.

Sadly, it is hard to provide a small example that reproduces it at the moment.

@madmaxio
Author

This error doesn't appear on wgpu 0.14.0, so the problem was introduced somewhere between wgpu 0.14.0 and 0.15.2, and it looks like it is not a libANGLE bug.

@grovesNL
Collaborator

Still having trouble simplifying a case to reproduce this. I'm not seeing much that would've changed between 0.14.0 and 0.15.2.

Is it possible it's the STATIC_DRAW to DYNAMIC_DRAW change in #3391? You could try reverting that locally to see if it helps.

@grovesNL added the type: bug, help wanted, area: validation, and api: gles labels on Mar 13, 2023

@madmaxio
Author

> Still having trouble simplifying a case to reproduce this. I'm not seeing much that would've changed between 0.14.0 and 0.15.2.
>
> Is it possible it's the STATIC_DRAW to DYNAMIC_DRAW change in #3391? You could try reverting that locally and seeing if it helps.

That's exactly it. Reverting to glow::STATIC_DRAW from glow::DYNAMIC_DRAW fixes the problem. So this is about #3391 and #3371.

@grovesNL
Collaborator

Very weird. We can't back that fix out either because that breaks some older devices. STATIC_DRAW and DYNAMIC_DRAW are just supposed to be usage hints, so this whole situation is a bit silly...

> usage is provided as a performance hint only. The specified usage value does not constrain the actual usage pattern of the data store.

Is it possible that the buffer sizes are different between the buffer at creation vs. the one we use during draw? I don't think so, because I believe we're supposed to be pre-allocating the buffer fully and copying into that original buffer.

I wonder if we can easily trace this code path through ANGLE to see how we can work around it, or possibly file a bug upstream to ANGLE if the buffer size is correct.
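
One way to check the buffer-size question from the application side is to read the size back right before the draw. The following is only a sketch in plain WebGL JavaScript: `boundBufferSizeMatches` is a hypothetical helper name (not a wgpu or WebGL API), and it assumes the buffer of interest is currently bound to `target`. `getBufferParameter` with `BUFFER_SIZE` is a standard WebGL call.

```javascript
// Hypothetical helper: compares the size the GL implementation reports for the
// currently bound buffer against the size we believe we allocated.
function boundBufferSizeMatches(gl, target, expectedBytes) {
  // BUFFER_SIZE is the size in bytes of the buffer's data store.
  const actualBytes = gl.getBufferParameter(target, gl.BUFFER_SIZE);
  return { actualBytes, matches: actualBytes === expectedBytes };
}
```

If this reports a match while the draw still fails, that would point away from the application and towards the validation inside the implementation.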

@kdashg

kdashg commented Mar 14, 2023

There is a very distant possibility that this is the driver accidentally exposing its limited size for host_mappable|gpu_local memory buffers, but you're right that the spec says it's a hint only, and the allocation should fail over to whatever STATIC_DRAW does.

That said, I don't understand what's going on in #3391. STATIC_DRAW should always, always work. This sounds like another bug elsewhere, honestly. Otherwise we should have hit this long ago in WebGL implementations in browsers.

So we should definitely prove out whether it's ANGLE at fault first!

@Dinnerbone
Contributor

Dinnerbone commented Mar 14, 2023

> That said, I don't understand what's going on in #3391. STATIC_DRAW should always, always work. This sounds like another bug elsewhere, honestly. Otherwise we should have hit this long ago in WebGL implementations in browsers.

I did manage to reproduce that issue with regular WebGL JS (excluding wgpu from the picture). I'd have to bring out an old device to test it again, but it was literally something like "after STATIC_DRAW, more copies/writes are ignored until the next frame", in all browsers, but only on specific older (usually Intel) devices.

wgpu always hit it because we basically allocated, zeroed out, and then copied the initial data. The STATIC_DRAW buffer was frozen in place on either the initial allocation or the zeroed-out copy, and wouldn't accept the initial data. If we didn't do the initial zeroing-out copy then it may have worked more often, but I hadn't put much time into investigating that.
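
For readers following along, the allocate, zero, copy sequence described above could look roughly like this in plain WebGL JavaScript. This is a sketch under assumptions: `createInitializedBuffer` is a hypothetical name, and wgpu's real code is Rust on top of glow and may order or batch these calls differently.

```javascript
// Hypothetical helper mirroring the three steps described in the comment above.
function createInitializedBuffer(gl, data, usage) {
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  // 1. Allocate the full data store up front, with no initial contents.
  gl.bufferData(gl.ARRAY_BUFFER, data.byteLength, usage);
  // 2. Zero the whole store.
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, new Uint8Array(data.byteLength));
  // 3. Copy the real initial data over the zeros.
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, data);
  return buffer;
}
```

On the affected devices, the claim is that step 3 (or possibly step 2) would be silently dropped when `usage` is `gl.STATIC_DRAW`.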

@jleibs
Contributor

jleibs commented Mar 16, 2023

We are hitting the exact same issue here, specifically in Chrome on Windows (no problem on other OSes or in Firefox).

After coming across the reference to #3391 in the code, we also tried reverting DYNAMIC_DRAW back to STATIC_DRAW and can confirm it "fixes" the problem for us as well.

@Wumpf
Member

Wumpf commented Mar 16, 2023

@jleibs and I investigated this for a bit and found someone else fixing it in their library by always using draw-instanced: mosra/magnum#539
We're getting a workaround fix ready.
EDIT: Done!

Orthogonal to that, the workaround of using DYNAMIC_DRAW over STATIC_DRAW is still concerning :(. It would be nice if we could narrow down which devices STATIC_DRAW doesn't work on and apply the workaround only there.

@madmaxio
Author

> Very weird. We can't back that fix out either because that breaks some older devices. STATIC_DRAW and DYNAMIC_DRAW are just supposed to be usage hints, so this whole situation is a bit silly...
>
> > usage is provided as a performance hint only. The specified usage value does not constrain the actual usage pattern of the data store.
>
> Is it possible that the buffer sizes are different between the buffer at creation vs. the one we use during draw? I don't think so, because I believe we're supposed to be pre-allocating the buffer fully and copying into that original buffer.
>
> I wonder if we can easily trace this code path through ANGLE to see how we can work around it, or possibly file a bug upstream to ANGLE if the buffer size is correct.

Probably

int64_t maxByte = GetMaxAttributeByteOffsetForDraw(attrib, binding, maxVertexCount);

in ANGLE is bugged, because it returns some nonsense when checked against

static_cast<int64_t>(bufferD3D->getSize())

But I'm not sure how to prove or debug it.
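
To make the comparison concrete, here is a rough sketch of the arithmetic a check like this has to do for a single attribute. It is for illustration only and is not ANGLE's code: `requiredBytes` and the fields on `attrib` are hypothetical names, and the "stepped per vertex" case at the end is just one way the numbers could come out too large, not a confirmed diagnosis.

```javascript
// Illustrative only: not ANGLE's code. All names here are hypothetical.
// Returns the number of bytes a buffer must hold for one attribute in one draw.
function requiredBytes(attrib, vertexCount, instanceCount) {
  // A stride of 0 means "tightly packed": step by the element size.
  const stride = attrib.stride === 0 ? attrib.elementBytes : attrib.stride;
  // divisor 0: one element per vertex; divisor N: one element per N instances.
  const elements =
    attrib.divisor === 0
      ? vertexCount
      : Math.ceil(instanceCount / attrib.divisor);
  if (elements === 0) return 0;
  // Offset of the last element read, plus the size of that element.
  return attrib.offset + stride * (elements - 1) + attrib.elementBytes;
}

// One float that advances once per instance (divisor 1), drawn with 3 vertices
// and a single instance.
const perInstance = { offset: 0, stride: 0, elementBytes: 4, divisor: 1 };
const asInstance = requiredBytes(perInstance, 3, 1); // 4 bytes
// The same attribute if it were stepped once per vertex instead.
const asVertex = requiredBytes({ ...perInstance, divisor: 0 }, 3, 1); // 12 bytes
```

With a 4-byte buffer, the first result passes a size check and the second does not, which is the kind of difference that would be worth looking for when tracing `maxByte`.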

@grovesNL
Collaborator

It would be interesting to see how maxByte changes between STATIC_DRAW and DYNAMIC_DRAW, then trace it back to see what causes the difference.

@grovesNL
Collaborator

grovesNL commented Mar 17, 2023

Minimal repro:

<canvas></canvas>
<script>
  const gl = document.querySelector("canvas").getContext("webgl2");
  const program = gl.createProgram();
  const vertexShader = gl.createShader(gl.VERTEX_SHADER);
  const fragmentShader = gl.createShader(gl.FRAGMENT_SHADER);
  gl.shaderSource(
    vertexShader,
    `#version 300 es
    in float scale; void main() { gl_Position = vec4(vec2(gl_VertexID % 2, gl_VertexID % 3) * scale, 0, 1); }`
  );
  gl.shaderSource(
    fragmentShader,
    `#version 300 es
    precision mediump float; out vec4 color; void main() { color = vec4(1, 0, 0, 1); }`
  );
  gl.compileShader(vertexShader);
  gl.compileShader(fragmentShader);
  gl.attachShader(program, vertexShader);
  gl.attachShader(program, fragmentShader);
  gl.linkProgram(program);
  gl.detachShader(program, vertexShader);
  gl.detachShader(program, fragmentShader);
  gl.useProgram(program);
  gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
  gl.bufferData(
    gl.ARRAY_BUFFER,
    new Float32Array([0.5]),
    // Switch to STATIC_DRAW and both will work
    gl.DYNAMIC_DRAW
  );
  const loc = gl.getAttribLocation(program, "scale");
  gl.enableVertexAttribArray(loc);
  gl.vertexAttribPointer(loc, 1, gl.FLOAT, false, 0, 0);
  gl.vertexAttribDivisor(loc, 1);

  // Both of these should succeed
  gl.drawArraysInstanced(gl.TRIANGLES, 0, 3, 1);
  gl.drawArrays(gl.TRIANGLES, 0, 3);
</script>

@grovesNL
Collaborator

I submitted a bug to Chrome/ANGLE at https://bugs.chromium.org/p/chromium/issues/detail?id=1425606

The correct workaround is to use drawArraysInstanced as proposed in #3597 for now.
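
In plain WebGL2 terms, that workaround amounts to something like the sketch below. `drawArraysWorkaround` is a hypothetical helper name used only for illustration; the actual change in #3597 lives in wgpu's Rust GLES backend and may look different.

```javascript
// Hypothetical helper (not part of wgpu or WebGL): routes a non-indexed draw
// through the instanced entry point with an instance count of 1, which should
// render the same thing as gl.drawArrays(mode, first, count).
function drawArraysWorkaround(gl, mode, first, count) {
  gl.drawArraysInstanced(mode, first, count, 1);
}
```

In the minimal repro above, this would mean replacing the final `gl.drawArrays(gl.TRIANGLES, 0, 3)` call with `drawArraysWorkaround(gl, gl.TRIANGLES, 0, 3)`.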

@grovesNL
Collaborator

This has also been fixed in ANGLE upstream (i.e., including Chrome) now 🎉
