Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

macOS: glgsg: Invalid operation when using p3d_TextureMatrix #846

Closed
el-dee opened this issue Jan 15, 2020 · 12 comments
Closed

macOS: glgsg: Invalid operation when using p3d_TextureMatrix #846

el-dee opened this issue Jan 15, 2020 · 12 comments
Assignees
Labels
Milestone

Comments

@el-dee
Copy link
Contributor

@el-dee el-dee commented Jan 15, 2020

On macOS, if a shader uses the uniform p3d_TextureMatrix defined like this :

uniform mat4 p3d_TextureMatrix[];

The glgsg aborts due to invalid operation :

:display:gsg:glgsg(error): at 2299 of panda/src/glstuff/glShaderContext_src.cxx : invalid operation

Here is an apitrace :
apitrace.zip

The error can be seen in frame 24 > render#0 > opaque

@rdb

This comment was marked as outdated.

Copy link
Member

@rdb rdb commented Jan 15, 2020

This should be a mat3. However, we should give a better error message.

@el-dee

This comment was marked as outdated.

Copy link
Contributor Author

@el-dee el-dee commented Jan 15, 2020

@el-dee

This comment was marked as outdated.

Copy link
Contributor Author

@el-dee el-dee commented Jan 15, 2020

Ah , with

uniform mat3 p3d_TextureMatrix[];

I get something else :

:display:gsg:glgsg(error): p3d_TextureMatrix should be mat4[], not mat3[]!
@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 17, 2020

I apologise, it needs to be mat4 after all, I must have been mixed up with another matrix.

I'm struggling to replay your .trace file correctly on a non-macOS machine (since it makes CGL calls which don't appear to be fully emulated by apitrace). However, from your shader, I see that it is declared as so:

uniform mat4 p3d_TextureMatrix[];

I did not think that GLSL allows declaring unbounded uniform arrays, though apparently glslangValidator does accept this. Maybe there is a particularity about macOS' GLSL compiler that treats these arrays without bounds differently? Does the error still occur when you add an explicit bound?

@el-dee

This comment has been minimized.

Copy link
Contributor Author

@el-dee el-dee commented Jan 18, 2020

You're right, unbound array uniforms are not allowed by glsl, unless the uniform is bind to a ssbo. Though, on nvidia the compiler does not complain and the shader works fine. (either by sheer luck, or some magic trick of the driver).

On the other hand on macOS, I get the error even with an array with an explicit size. To be precise, I get the error if I'm setting the exact size or bigger. I I set too small a size, I get a warning that I'm reading out of the array bound, but the gsg does not abort
This Mac is using an Intel iris pro, I will try later with one using an AMD Radon.

@el-dee

This comment has been minimized.

Copy link
Contributor Author

@el-dee el-dee commented Jan 18, 2020

Same problem on the Radeon, so it's not a driver bug.

Here is a sample program to check the problem:

import os

from panda3d.core import Texture, TextureStage, CardMaker, Shader, load_prc_file_data
from direct.showbase.ShowBase import ShowBase

load_prc_file_data("", "gl-version 3 2\n"
                   "gl-check-errors #t\n")
def shader():
    return Shader.make(Shader.SL_GLSL,
                       vertex="""
#version 410

uniform mat4 p3d_ProjectionMatrix;
uniform mat4 p3d_ModelViewMatrix;

in vec4 p3d_Vertex;
in vec4 p3d_MultiTexCoord0;

out vec4 texcoord;

void main() {
    gl_Position = p3d_ProjectionMatrix * (p3d_ModelViewMatrix * p3d_Vertex);
    texcoord = p3d_MultiTexCoord0;
}
""",
                       fragment="""
#version 410

uniform mat4 p3d_TextureMatrix[2];
uniform sampler2D p3d_Texture0;
uniform sampler2D p3d_Texture1;

in vec4 texcoord;

out vec4 frag_color;

void main() {
    vec4 texcoord_tex0 = p3d_TextureMatrix[0] * texcoord;
    //vec4 texcoord_tex0 = texcoord;    
    vec4 tex0 = texture(p3d_Texture0, texcoord_tex0.xy);
    vec4 texcoord_tex1 = p3d_TextureMatrix[1] * texcoord;
    //vec4 texcoord_tex1 = texcoord;
    vec4 tex1 = texture(p3d_Texture1, texcoord_tex1.xy);
    frag_color = tex0 * tex1;
}
""")


base = ShowBase()

gsg = base.win.gsg

print("Hardware Vendor:", gsg.driver_vendor)
print("Driver Renderer: %s (%s)" % (gsg.driver_renderer, gsg.driver_version))

cm = CardMaker('card')
card = render.attachNewNode(cm.generate())
card.setPos(-0.5, 3, -0.5)
card.set_shader(shader())

tex = loader.loadTexture('maps/smiley.rgb')
ts = TextureStage('first')
card.setTexture(ts, tex)
card.setTexScale(ts, 1, -1)

tex = loader.loadTexture('maps/noise.rgb')
ts = TextureStage('second')
card.setTexture(ts, tex)
card.setTexScale(ts, 2, 2)

base.run()
@rdb rdb added this to the 1.10.6 milestone Jan 19, 2020
@rdb rdb added macos bug labels Jan 19, 2020
@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 19, 2020

Same problem on the Radeon, so it's not a driver bug.

If I'm not mistaken, the situation on macOS is that not the IHVs, but Apple develops the OpenGL drivers. This is also the reason why the OpenGL support on macOS is lagging behind so much and supports much less than on other platforms. So it's still possible for this to be a bug in Apple's OpenGL implementation, or leniency in other implementations in the case of a Panda bug.

It may also be the reason why the code is working fine for me on Linux. I'll have to see if I can replicate it on my Mac Mini sometime.

@el-dee

This comment has been minimized.

Copy link
Contributor Author

@el-dee el-dee commented Jan 21, 2020

I did some quick tests, transferring the transform matrices using PTAMat4 works without problem on macOS, so it's not the array of matrices the problem.
Looking at the Panda code for the TextureMatrix, the array is not updated as a whole but matrix by matrix (and in the api trace only the second glUniformMatrix4fv actually fails). So it seems that partial update of a array uniform is not supported on macOS, or at least not the way Panda does it. Sadly, the OpenGL documentation from Apple is almost non-existent and I could not find anything related.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 22, 2020

I had a chance to run your code on my old Mac. This is what I see in the debug output:

Active uniform p3d_TextureMatrix[0] with size 2 and type 0x8b5c is bound to location 0
Active uniform p3d_Texture0 with size 1 and type 0x8b5e is bound to location 8
Active uniform p3d_ModelViewMatrix with size 1 and type 0x8b5c is bound to location 9
Active uniform p3d_Texture1 with size 1 and type 0x8b5e is bound to location 13
Active uniform p3d_ProjectionMatrix with size 1 and type 0x8b5c is bound to location 14

Note how indices 1..7 are unassigned. So despite there being only two array entries, it's reserving 8 locations. And after p3d_ModelViewMatrix, there are also three empty slots. Is the macOS driver counting locations... per row/column, rather than per matrix?

Okay, so let's see what happens if I assume the next matrix is four locations down instead of one:

diff --git a/panda/src/glstuff/glShaderContext_src.cxx b/panda/src/glstuff/glShaderContext_src.cxx
index 618b228754..35e894a6af 100644
--- a/panda/src/glstuff/glShaderContext_src.cxx
+++ b/panda/src/glstuff/glShaderContext_src.cxx
@@ -890,7 +890,7 @@ reflect_uniform(int i, char *name_buffer, GLsizei name_buflen) {
 
         // Add it once for each index.
         for (bind._index = 0; bind._index < param_size; ++bind._index) {
-          bind._id._seqno = p + bind._index;
+          bind._id._seqno = p + bind._index * 4;
           _shader->_mat_spec.push_back(bind);
         }
         _shader->_mat_deps |= bind._dep[0];

No more error!

I'm quite sure I remember that uniform locations are meant to be consecutive for array elements, but I will need to validate this.

@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 22, 2020

GLSL 4.40 spec, section 4.4.3:

Individual elements of a uniform array are assigned consecutive locations with the first element taking location location.

And this bit (which also actually states that [] is legal):

Locations can be assigned to default-block uniform arrays and structures. The first inner-most scalar, vector or matrix member or element takes the specified location and the compiler assigns the next inner-most member or element the next incremental location value. Each subsequent inner-most member or element gets incremental locations for the entire structure or array. This rule applies to nested structures and arrays and gives each inner-most scalar, vector, or matrix type a unique location. For arrays without an explicit size, the size is calculated based on its static usage.

However, I then looked up the GLSL 4.10 spec, which surprisingly enough has no such section at all about uniform locations, and appears to make no guarantees. In fact, Apple's implementation seems to apply the same location assignment rules that apply to varying inputs (and outputs), in which case it does need to be one location per column.

Not entirely sure what the best solution is here. We could simply multiply the location offset by four #ifdef __APPLE__, but technically the safest thing to do is to just supply the whole array in one go.

rdb added a commit that referenced this issue Jan 22, 2020
@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Jan 25, 2020

Still working on a solution for master. It requires rethinking how the matrix cache works (but I think I can rethink it in a way that it will be faster, too).

@rdb rdb self-assigned this Jan 25, 2020
rdb added a commit that referenced this issue Jan 25, 2020
If multiple ShaderMatSpec entries use the same state matrix, this should result in a reduction in the number of times that state matrix is fetched.  This is especially so for arrays, which are now fetched once rather than once for every item.

This is the first step towards trying to solve #846.
rdb added a commit that referenced this issue Jan 25, 2020
If multiple ShaderMatSpec entries use the same state matrix, this should result in a reduction in the number of times that state matrix is fetched.  This is especially so for arrays, which are now fetched once rather than once for every item.

This is the first step towards trying to solve #846.
rdb added a commit that referenced this issue Jan 31, 2020
If multiple ShaderMatSpec entries use the same state matrix, this should result in a reduction in the number of times that state matrix is fetched.  This is especially so for arrays, which are now fetched once rather than once for every item.

This is the first step towards trying to solve #846.
@rdb

This comment has been minimized.

Copy link
Member

@rdb rdb commented Mar 19, 2020

I'm going to close this since we have a hack that works for 1.10. I'll open a new issue for addressing it more cleanly on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.