Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-built shader support #292

Closed
wants to merge 3 commits into from
Closed

Pre-built shader support #292

wants to merge 3 commits into from

Conversation

downor
Copy link
Contributor

@downor downor commented Mar 22, 2018

These two changes are for making compositor be able to use prebuilt-shader program to save time to build it from sources.

There are two steps to generate prebuilt shaders, First one is to create shader source file that can be parsed by shader-db offline compiler tool using common/compositor/gl/gl_shader_pre_built/generate_shader_test.sh.

Second step is to use shader-db tool to compile "hwc_shader_prog_<n_layers>.shader_test" generated from first step.

The pre-built shader needs to be placed in the same folder where the application is running.

Jira: None.
Test: Verify testlayer works with and without pre-built shaders

Signed-off-by: Dongwon Kim dongwon.kim@intel.com

gl-program now tries to use pre-built shader program if available
to save time by skipping run-time build.

pre-built shader file name accepted by the program is,

'hwc_shader_prog_<number of layers>.shader_test.bin'

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
generate_shader_test.sh <layer_count> creates compositor.shader_test
(shader source code with some options) that can be compiled by shader-db
offline shader compiler.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
@csravelar
Copy link
Contributor

csravelar commented Mar 22, 2018

Looking good, we should probably follow:
https://github.com/android-ia/manifest/wiki/Contributions

Copy link
Contributor

@strassek strassek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline review feedback. Also, consider moving the test script to the test/ directory.

if (shader_prog_fp) {
/* check the size of file */
fseek(shader_prog_fp, 0, SEEK_END);
int binary_sz = ftell(shader_prog_fp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ftell returns a long


FILE *shader_prog_fp;
static PFNGLPROGRAMBINARYOESPROC program_binary_oes =
(PFNGLPROGRAMBINARYOESPROC)eglGetProcAddress("glProgramBinaryOES");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, can you place this with the other get_proc calls in common/compositor/gl/shim.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

std::ostringstream shader_program_fname;
shader_program_fname << "hwc_shader_prog_"
<< num_textures
<<".shader_test.bin";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need a full path here? Where would these files be stored on Android?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assumption was this shader program is placed in the same location as executable. I am also wondering if there's any dedicated location of those binaries loaded by compositor. I will check this and update the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to read in the shader program from disk at runtime? It seems far safer to bake the binary data into the HWC binary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make that last point more clear, you can take a look at how we are including SPIR-V (binary) shaders in the vulkan backend.

static const uint8_t vkcomp_vert_spv[] = {
#include "vkcomp.vert.h"
};
static const uint8_t vkcomp_frag_spv[] = {
#include "vkcomp.frag.h"
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I got your point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the shader program binary comes with sha-1 key in its header that contains i965's build id that is re-generated every time i965 is rebuilt (even if the source code is same)

So mesa rejects anything that doesn't match? This seems like a sensible restriction on Mesa's part, but I don't think we can lean on this behavior. At least not unless that restriction is coming from spec. And if it isn't, we are required to supply and load only the binaries that match the current hardware and driver build. In other words, we can't just keep pushing various binaries to Mesa until we get a match (not that anyone was suggesting otherwise).

On the HWC side how can we check if the binary we want to load is suitable for the underlying driver and hardware? Can we even ask the driver to give us its build id? @downor does shader-db give you the option of building the shader binary for anything other than the current environment?

We don't have a good reason to even attempt to shipping binaries for anything but the exact environment of the target machine, and that just isn't known until installation time.

do you want this arrays created automatically during make or you want to pre-generate these manually with fixed target and the driver version (i965_dri.so)?

I think the reality that we are arriving at is that it just isn't tractable to pre-generate, and you can't expect products to do anything manually.

Can we instead explore the possibility of allowing the driver to generate the binaries on first run? I know that may be unacceptable for some products, but it seems like a much saner default case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So mesa rejects anything that doesn't match? This seems like a sensible restriction on Mesa's part, but I don't think we can lean on this behavior.

DW] This is because glProgramBinary we use to load the pre-built binary is paired with glGetProgramBinary that we also use to generate shader program binary. Both API should be from the same driver (i965) to guarantee the pre-built binary works. This is done by checking build-id embedded.

@downor does shader-db give you the option of building the shader binary for anything other than the current environment?

DW] Yes, shader-db does the cross compilation for any hardware and driver (as far as it's intel). We specify the target driver by setting "LIBGL_DRIVERS_PATH" and target architecture by passing in target GEN's PCI_ID.

Can we instead explore the possibility of allowing the driver to generate the binaries on first run? I know that may be unacceptable for some products, but it seems like a much saner default case.

DW] Unfortunately the whole point of using pre-built shaders is to make things boot up faster by letting the system bypass shader build process. BTW, "shader-cache" mechanism is already up and running in i965. This allows the system to cache shader program/object whenever it's built for the same purpose but this doesn't cover very first running.

DW] Anyway, I already started to work on a script and simple tool that generates .h with binary arrays that can be embedded in composer library. I hope we can use this for the automation. I will share these in couple of days.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if program fails to link the pre-compiled shaders we should blacklist and compile it during run time. hmm Instead of PCI_ID, can we just pass in the Gen (i.e. 7, 7.5) ? This should make it easy to tell during compile time if you want to target a specific HW Version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We specify the target driver by setting "LIBGL_DRIVERS_PATH" and target architecture by passing in target GEN's PCI_ID.

Got it, so it sounds like we need to require that the target PCI_ID gets set at configure time, otherwise don't try enabling the feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@strassek : Yeah, it should be pre-determined and yes, if not, we need to disable it.
@kalyankondapally : Each gen architecture has several variants and each variant in a group has its own PCI ID. So only way to differentiate it completely is using PCI_ID for now.

}

fclose(shader_prog_fp);
free(binary_prog);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a possible double free. Maybe you meant these two calls to be in an else statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right. I will fix this.

int binary_sz = ftell(shader_prog_fp);
rewind(shader_prog_fp);

char *binary_prog = (char *)malloc(binary_sz);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a bad idea without doing some bounds checking first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, will do check on binary_prog before moving further


if (fread(binary_prog, 1, binary_sz, shader_prog_fp) == binary_sz) {
fclose(shader_prog_fp);
program_binary_oes(program, GL_PROGRAM_BINARY_FORMAT_MESA, binary_prog,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please query if GL_MESA_program_binary_formats extension is supported first and also same with program binary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I will add code for query.

@tpalli
Copy link
Contributor

tpalli commented Mar 27, 2018

We have now EGL_ANDROID_blob_cache support, is there reason why hwcomposer cannot use that? I've noticed Surfaceflinger does not use it, why is this? As example systemui utilizes it. If it would be used, the shaders would get generated on first boot and then sequential boots would utilize prebuilt shaders.

@kalyankondapally
Copy link
Contributor

kalyankondapally commented Mar 27, 2018

@tpalli I remember with SF, we need to enable build flag. I am waiting for all latest code to settle down a bit before turning it on. A

Also, here we are trying to include pre-compiled shaders even during first boot. Another reason is that eventually we want to have support as weston plugin too. In this case, we will not have support for EGL Android Blob Cache extension.

@tpalli
Copy link
Contributor

tpalli commented Mar 28, 2018

The effort required here looks quite a bit just for first boot, I wast thinking maybe enabling blob cache would be more seamless approach. With Weston you will have the same functionality as what EGL_ANDROID_blob_cache provides via regular disk cache, it is enabled now (from Mesa 18.0) for Linux desktop builds.

@downor
Copy link
Contributor Author

downor commented Mar 28, 2018

@tpalli Would shader program stay in the database even after power-cycle if we use EGL_ANDROID_blob_cache? As far as I know, disk-shader-cache helps when the same application is re-executed over and over again without rebooting the system. Is EGL_ANDROID_blob_cache different?

@tpalli
Copy link
Contributor

tpalli commented Mar 29, 2018

@downor Yes, disk cache does not get destroyed during reboot, on regular Linux desktop you can find it from $HOME/.cache/mesa_shader_cache/ and on Android it is under /data/user_de/0, for example for Settings app it is /data/user_de/0/com.android.settings/code_cache/com.android.opengl.shaders_cache.

@kalyankondapally
Copy link
Contributor

@tpalli Missed u comments hmm

@eero-t
Copy link

eero-t commented Apr 13, 2018

For things that are built as part of the OS image, shader cache content could be pre-generated at the build time. That way it would match Mesa version on the OS image and cover things needed at first boot of the device.

@downor
Copy link
Contributor Author

downor commented Apr 13, 2018

@eero-t Do you mean we can run 3D app during build time to get those shaders compiled? How can you handle cross compilation? Are you considering overriding PCI-ID?

@eero-t
Copy link

eero-t commented Apr 13, 2018

Yes, PCI ID would obviously need to be overriden (like e.g. Mesa shader-db does). Instead of the real target applications, which may be hard to run on build environment, there could be a separate shader compilation program that just compiles the relevant shaders with suitable GL state setup, to seed the shader cache.

Note that some GL state differences would result in different shader compilation results, that's why state needs to match fairly well, otherwise this program would do redundant compilations for things that wouldn't match the run-time shader identification.

Maybe one possibility for that kind of a program could be enhancing Mesa's shader-db shader source dumping facility:
https://www.mesa3d.org/shading.html#capture

to contain all necessary state information (the items that cause shader compilation differences), and then use shader-db program to do the cache preceeding during build?

As last step, one would then need to include the generated shader cache content to the final OS image.

@downor
Copy link
Contributor Author

downor commented Apr 13, 2018

@eero-t
Have you checked a new pull request:
#310

I added automation there in the new pull request. With PCI-ID fed from configure script, it runs shader-db to create shader programs for the target then integrate it to hw composer (after doing binary to arrays). Compositor code tries to load this pre-built shaders from arrays on run-time. If it fails, it checks if there are any shader binary files under dedicated location in the file system. If this fails as well, it goes through run-time compilation path.

I think this is pretty much identical with what you described except that it use get_program_binary instead of directly using "shader-cache".

To cover difference in GL state, output shader program binary contains not only GEN program byt also NIR form of shader program. So I hope most of GL state difference can be handled by regenerating GEN program from NIR if there are any.

@eero-t
Copy link

eero-t commented Apr 16, 2018

Are there any additional security concerns that need to be taken into account?

Mesa's shader cache is shared by all processes running under same user, but I don't think all such processes (e.g. browser) should be able to substitute other processes cache items -> either the cache storage needs to be read-only, or these locations should be isolated. If OS already doesn't do that...

@tpalli
Copy link
Contributor

tpalli commented Apr 16, 2018

If I've understood correctly this cache implemented here is only for the HWC so there should be no security concerns, nobody else uses this cache.

What comes to disk cache, on Android the disk cache is separated across processes, every process gets it's own directory of items to mess with.

@kalyankondapally
Copy link
Contributor

#310 merged as part of this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants