Fast-Path performance issue for YUV! Need "packed" GL_RGBA instead of GL_LUMINANCE #427
Comments
We are moving from the firmware-side graphics driver to the ARM-side graphics driver.
Wow. I had no idea. Where do I go for information on this? All I can read is that once it is turned on, everything 3D breaks. I'm using GLES 2.0. What is the migration process for developers? What changes are coming for MMAL and camera access? (Sorry - I know you are all busy and any help is much appreciated and not required)
The ARM-side driver is a standard Mesa driver. Any standard Linux app that uses OpenGL or OpenGL ES should work without specific Pi porting (e.g. you can […]). The initial announcement was long ago. Searching for […] Currently, integration with MMAL/camera is in progress but not released.
You can also "pack" it yourself in a "compress" shader and then use glReadPixels with (width / 4). Shader should look like:
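The shader itself is missing from this copy of the thread. As a rough, non-authoritative sketch of the arithmetic such a packing pass relies on (this is not the commenter's code), the C helper below computes where each of the four source luminance texels sits for a given packed output pixel. A fragment shader rendering into a target that is width / 4 pixels wide would sample the luminance texture at these four coordinates and write the results to the R, G, B and A channels. The name `packed_source_u` and its parameters are hypothetical.

```c
/* Hypothetical helper, for illustration only.
 *
 * Returns the normalized x coordinate (0..1) of the centre of the k-th
 * source luminance texel (k = 0..3) that feeds packed output pixel out_x.
 * src_width is the width of the luminance texture in texels; the packed
 * render target is assumed to be src_width / 4 pixels wide.
 */
static float packed_source_u(int out_x, int k, int src_width)
{
    /* Source texel index is 4 * out_x + k; adding 0.5 selects its centre. */
    return ((float)(4 * out_x + k) + 0.5f) / (float)src_width;
}
```

For an 8-texel-wide luminance texture, packed output pixel 1 would read the texels centred at 4.5/8, 5.5/8, 6.5/8 and 7.5/8. Sampling at texel centres with nearest filtering avoids blending neighbouring luminance values.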
Closing due to lack of activity. Please request to be reopened if you feel this issue is still relevant.
Pardon me if this is the wrong place, since this is not a bug but rather an important performance suggestion. All other sites seem to have only very old discussions on them, so I'll try here:
The problem with EGL_IMAGE_BRCM_MULTIMEDIA_Y is that it gets linked to a GL_LUMINANCE texture via the fast-path direct GPU GL texture access. GL_LUMINANCE, however, is not all that useful from a performance perspective. It would be much more useful to get the YUV component channels delivered via a GL_RGBA texture at 1/4 the x dimension (i.e. pack 4 luminance values into a single RGBA pixel). This way I can load 4 values in a shader with a single texture read call. It is 4x faster!!! (On the Pi, a GL_LUMINANCE texture read is no cheaper than a GL_RGBA texture read.)
Put another way - I get about the same performance by saving the YUV buffer to CPU memory and packing it into a GL_RGBA texture that is 1/4 the width for further processing. Imagine running a Sobel filter with 1/4 of the pixel reads required in a shader.
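As a minimal sketch of the CPU-side packing described above, the following C function copies an 8-bit luminance plane into a buffer meant to be uploaded as a GL_RGBA / GL_UNSIGNED_BYTE texture that is width / 4 pixels wide. Since that format stores each pixel as the bytes R, G, B, A in memory order, four adjacent Y samples become one RGBA pixel with a plain row copy. The function name `pack_luma_rows` and the `src_stride` parameter (row padding in the source buffer) are assumptions for illustration, not something taken from the Pi's APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 *
 * Packs a luminance plane of width x height samples, whose rows start
 * src_stride bytes apart, into a tightly packed RGBA8888 buffer that is
 * width / 4 pixels wide. Sample 4*x + 0 lands in R, +1 in G, +2 in B and
 * +3 in A of packed pixel x.
 *
 * rgba must hold at least width * height bytes.
 * Returns 0 on success, -1 if the arguments are unusable.
 */
static int pack_luma_rows(const uint8_t *luma, int width, int height,
                          int src_stride, uint8_t *rgba)
{
    if (luma == NULL || rgba == NULL || width <= 0 || height <= 0 ||
        src_stride < width || (width % 4) != 0)
        return -1;

    for (int y = 0; y < height; ++y) {
        /* Each packed row is (width / 4) pixels * 4 bytes = width bytes. */
        memcpy(rgba + (size_t)y * (size_t)width,
               luma + (size_t)y * (size_t)src_stride,
               (size_t)width);
    }
    return 0;
}
```

The resulting buffer could then be passed to glTexImage2D with a width of width / 4, format GL_RGBA and type GL_UNSIGNED_BYTE. Widths that are not a multiple of 4 are rejected here rather than padded, which keeps the sketch short.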
So fast-path seems to offer no benefit for YUV, since you have to make a copy anyway to pack the values into an RGBA texture for better performance down the line.
Or perhaps I'm a moron and don't know how to trick ES2.0 to use a GL_LUMINANCE texture as a GL_RGBA texture? I would be super thankful if someone had some ideas for a workaround on this...
In any case I would think that this is an easy feature to implement and it will increase FPS - especially on a Sobel type of filter.
And feel free to yell at me if this is not the right place for this feature request. I'll remove it.