New OpenGL ES Capture is breaking ANGLE back-end capture (GLES on Vulkan/D3D11/GL) #1045

Closed
null77 opened this issue Jul 18, 2018 · 16 comments

@null77

commented Jul 18, 2018

I've been using RenderDoc quite a bit to debug ANGLE's Vulkan and D3D backends over the past few years. I believe recent changes introduced to enable capturing OpenGL ES on Windows are breaking the ANGLE back-end capture. I've noticed RenderDoc is remapping ANGLE's eglMakeCurrent and creating a GLES 3.0 context. This was generating an EGL Error with the Vulkan back-end since we don't support GLES 3.0. There are likely other problems from this.

Is it possible to either disable GLES ANGLE capture or detect when we want to capture ANGLE's front-end vs back-end APIs?

@baldurk

Owner

commented Jul 18, 2018

Yes, I added the RENDERDOC_HOOK_EGL environment variable, which you can set to 0; this should disable the hooking of GLES emulators on Windows. Environment variables suck for user experience, and it'd be nice to have a proper UI. In theory it would be nice to be able to select on the fly which API you want to hook and capture, but realistically nested APIs like this are a very niche case, so it's a low-priority improvement.
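For anyone scripting this, a minimal Python sketch of applying the variable is below. It assumes only that RENDERDOC_HOOK_EGL must be present in the environment the captured process inherits; the helper names are hypothetical and not part of RenderDoc.

```python
import os
import subprocess

def env_with_egl_hooking_disabled(base_env=None):
    """Return a copy of the environment with RENDERDOC_HOOK_EGL set to "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["RENDERDOC_HOOK_EGL"] = "0"
    return env

def launch_with_egl_hooking_disabled(command):
    """Start `command` (a list of arguments) so it inherits the variable.

    Child processes inherit this environment too, so launching the capture
    tool this way should pass the setting on to the application it starts.
    """
    return subprocess.Popen(command, env=env_with_egl_hooking_disabled())
```

The copy is deliberate: the caller's own environment is left untouched, so the setting only affects processes started through the helper.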

I'm not sure what you mean about RenderDoc remapping eglMakeCurrent. I don't do any remapping, but when a context is first activated some work happens to run a few checks that detect IHV bugs and similar things. That might create a GLES 3.0 context; is that what you mean? I can look at a fallback path there, although it might be a bit tricky, as part of that check is seeing whether vertex array objects are shared between contexts, and GLES 2.0 doesn't have VAOs.

@null77

Author

commented Jul 23, 2018

Agree that ANGLE is a bit of a marginal case. Disabling EGL capture via the UI would be a good improvement.

Regarding your second question: eglMakeCurrent was being intercepted by RenderDoc. Internally in RenderDoc's interception call it was creating a GLES 3.0 context on top of ANGLE's Vulkan back-end. Since our back-end doesn't support GLES 3.0 this was causing a context creation failure. I can provide more details or repro steps if you're interested.

There might have been other problems even if the Context creation was fixed. But I found it was easy to disable the EGL capture so I went with that.

@baldurk

Owner

commented Jul 24, 2018

Right yes, that's the internal context for tests that I mentioned. I should be able to fix that at some point although I'll need to check the context version and make sure I don't do any GLES 3.0 tests on a GLES 2 context.

Just to clarify, setting the env variable to disable EGL capture worked for you, so I can close this issue? Having a UI option would be ideal but is unlikely to happen any time soon.

@baldurk baldurk closed this Jul 30, 2018

@null77

Author

commented Aug 27, 2018

Hey Baldur, thanks for the responses. I agree it's not a high-priority issue, as the environment variable does work, but this is confusing for any ANGLE developer not familiar with the context of this issue. Can you reopen this and leave it as a low-priority feature request?

@baldurk

Owner

commented Aug 28, 2018

Thinking about this I'm a little conflicted honestly. I don't mind having a low priority feature request issue in the general case if it's something I might get to at some point, but I can't decide if this is something that will ever be implemented. If I know it's never going to be implemented I'd rather not have an issue sitting around pointlessly.

When I was thinking about a UI option before I was conflating a few things. I was thinking about having UI for handling applications that use two different APIs independently (or with interop), which is already possible today with enough fiddling and app-side code, but could use some UI that pops up when that situation arises. I don't think I'd want to promote the existing RENDERDOC_HOOK_EGL var to a UI option. It can't appear situationally because it has to be enabled before the application even loads. That means it has to appear for everyone. I don't think the cost/benefit works out for adding another toggle that can confuse people for such a niche feature; it's better to keep it as a more internal flag that is documented but unadvertised.

The real problem is a GLES-on-GL wrapper. For GLES-on-D3D or Vulkan you might be able to get away with just capturing both and treating them as if they're a weird version of an application using two APIs. I haven't tested it but I think it might work. With GLES-on-GL I cannot support capturing both ends of the wrapper because the application calls GLES, which I intercept and call forward into the wrapper, but then the wrapper calls into GL which I also intercept and call forward again into the wrapper, and stack overflow...

The only way to get that to work properly would be some kind of internal layering system inside RenderDoc where I could tell which part of the stack I'm on and have different onward function pointers, but that also requires having different hook implementations. And critically without that proper fix any GLES-on-GL wrapper will hard crash. I don't know what the wrapper is going to do beforehand, so it's not something that can be supported incrementally.

The environment variable can work today because it is read at hooking time, and I basically choose to either hook the GLES libraries and blacklist hooking any onward calls they make, or else I skip hooking the GLES calls and only hook any onwards calls. i.e. either the front or the back, not both.
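The recursion and the front-or-back choice described above can be illustrated with a toy Python model. This is not RenderDoc code: the functions and import tables are hypothetical stand-ins, and the single onward pointer per hooked function is an assumption drawn from the description in this comment.

```python
def simulate(mode):
    """Trace one glDrawArrays call; mode is 'front', 'back' or 'both'."""
    trace = []

    def driver_draw():
        # The real GL driver entry point.
        trace.append("driver GL")

    def wrapper_draw():
        # The GLES-on-GL wrapper translates the call and calls into GL.
        trace.append("wrapper GLES")
        wrapper_imports["glDrawArrays"]()

    def hook_draw():
        # One hook per function name, with one onward pointer.
        trace.append("hook")
        onward()

    if mode == "front":
        # Hook the GLES library; its onward calls are not hooked.
        onward = wrapper_draw
        app_imports = {"glDrawArrays": hook_draw}
        wrapper_imports = {"glDrawArrays": driver_draw}
    elif mode == "back":
        # Skip the GLES library; hook only the GL calls it makes.
        onward = driver_draw
        app_imports = {"glDrawArrays": wrapper_draw}
        wrapper_imports = {"glDrawArrays": hook_draw}
    elif mode == "both":
        # Both ends land in the same hook, which forwards to the wrapper.
        onward = wrapper_draw
        app_imports = {"glDrawArrays": hook_draw}
        wrapper_imports = {"glDrawArrays": hook_draw}
    else:
        raise ValueError(mode)

    try:
        app_imports["glDrawArrays"]()
    except RecursionError:
        return trace[:4] + ["... stack overflow"]
    return trace
```

In this model `simulate("front")` and `simulate("back")` both reach the driver through exactly one hook, while `simulate("both")` bounces between the hook and the wrapper until Python's recursion limit is hit.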

As a compromise would it be useful to you to control this toggle from code somehow instead of just as an env var? It might be a little tricky, but if you were able to toggle on a 'developer mode' for ANGLE that then doesn't ship, and just automagically does the right thing with renderdoc, would that be useful? As long as it is minimal impact I don't mind adding something like that, but either way the onus is going to be on those specific developers knowing what to enable for their use-case.

@null77

Author

commented Aug 29, 2018

Yeah, having a toggle in the code is good. Ideally this could be done in ANGLE's startup. Not sure when RenderDoc checks for the hooking variable. Is that before or after EGL startup?

@baldurk

Owner

commented Aug 29, 2018

It happens really early because it needs to be known while hooks are being applied to the process, which is inside renderdoc.dll's DllMain. It would have to be something like a symbol exported in a module loaded in the process before renderdoc.dll - which could be libEGL.dll if the program links against it. Either a symbol that acts as a flag itself like NvOptimusEnablement, or else a C-exported function that can be called and returns an int 1/0 flag.
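As a rough illustration of the second option, here is a Python/ctypes sketch that probes an already-loaded module for a C-exported function returning 1 or 0. The symbol name ExampleWantsBackendCapture is purely hypothetical; nothing in this thread says RenderDoc or ANGLE defines such an export, and a real check would live in native code on RenderDoc's hooking path rather than in Python.

```python
import ctypes

def query_int_flag(module, symbol):
    """Call a no-argument C function exported by `module`; return its int result.

    Returns None when the module does not export `symbol`.
    """
    try:
        func = getattr(module, symbol)
    except AttributeError:
        return None
    func.argtypes = []
    func.restype = ctypes.c_int
    return int(func())

def should_hook_egl(module, symbol="ExampleWantsBackendCapture"):
    # Hypothetical policy: hook EGL unless the export exists and returns 1.
    return query_int_flag(module, symbol) != 1
```

A module that lacks the export keeps the default behaviour, which matches the idea that only developer builds would carry the flag.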

@null77

Author

commented Sep 5, 2018

A C-exported function is pretty easy. You could check for the presence of ANGLEGetDisplayPlatform. Question is, would this mess up other users who are trying to capture with ANGLE?

@baldurk

Owner

commented Sep 5, 2018

That's the thing, it would have to be exported only on developer builds (i.e. you folks who want to capture the back-end of ANGLE as the default) and not on shipping builds. Or else it would have to be a C-exported function that I could actually call and would then implement some logic to detect developers however you want - e.g. reading a config file or something like that.

@ShabbyX

commented Feb 14, 2019

@baldurk, please make the RENDERDOC_HOOK_EGL workaround cross-platform. ANGLE now supports a Vulkan back-end, and this issue prevents running RenderDoc for ANGLE/Vulkan debugging on Linux.

@baldurk

Owner

commented Feb 14, 2019

That workaround can't be made cross-platform. On Linux the hooking happens through LD_PRELOAD, so by the time the library is loaded it's too late to decide not to hook any symbols that are already exported.

@ShabbyX

commented Feb 15, 2019

Is a workaround conceivable?

@baldurk

Owner

commented Feb 15, 2019

I think the most feasible solution is to support both the GLES input commands to ANGLE and the vulkan output commands from ANGLE, and then allow you to choose which to capture from (or perhaps even do a synchronised capture from both).

In theory this should work on all platforms. The problem on Windows is that there exist GLES-over-GL wrappers, and the code currently cannot support two distinct GL wrappers in the same process, so it has to exclusively pick which interface to intercept. GLES-over-D3D and GLES-over-Vulkan should be doable in principle, though they would still require infrastructure to support without conflicts.

@ShabbyX

commented Feb 15, 2019

Actually now that ANGLE dynamically loads the EGL and GLES .so files, it fails to even properly get its symbols (so it cannot run). That happens because dlopen("libEGL.so") ends up pointing to renderdoc's .so file which obviously doesn't have ANGLE's symbols.

So, it's not an issue of renderdoc intercepting the GLES calls, but rather it disallowing ANGLE from loading the libs (thus, not running at all). renderdoc capturing both streams would not help in this case.

I have to look deeper in the code to be able to tell, but would it be possible for example not to preload EGL when the environment variable is set? (Just to avoid confusion, I set the environment variable when running renderdoc itself, not when launching the application)

@baldurk

Owner

commented Feb 15, 2019

I don't really understand the situation. I thought that ANGLE itself provided libEGL.so and libGLES.so and then called out to, in this case, Vulkan, not that it loaded those libraries itself, which seems circular.

LD_PRELOAD doesn't work in such a way as to selectively filter libraries or symbols. It places librenderdoc.so at the head of the list of libraries to search when resolving symbols, meaning any ELF file that looks for eglGetProcAddress or anything else will find RenderDoc's hooked implementation first. The interception of dlopen is just to make this consistent between runtime symbol lookup and the dynamic linker.

The only way to change that is at compile time, by not exporting those symbols at all. You could, for example, build renderdoc with -DENABLE_GLES=OFF -DENABLE_GL=OFF to accomplish that by completely disabling GL and GLES support. I don't think it would be enough to disable only GLES support: Linux doesn't scope symbols to particular import libraries the way Windows does, so all of the GL functions would still be exported, and any program linked to libGLES.so expecting to find glDrawArrays would hit RenderDoc's hook.
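To make the lookup order concrete, here is a toy Python model of the behaviour described above. It is not how the dynamic linker is implemented, and the export sets are assumptions chosen for illustration (in particular, which symbols each build configuration exports); only the library and symbol names come from this thread.

```python
def resolve(symbol, search_order):
    """Return the first library in `search_order` that exports `symbol`."""
    for library, exports in search_order:
        if symbol in exports:
            return library
    return None

# Hypothetical export lists for the libraries the program would normally use.
ANGLE_LIBS = [
    ("libEGL.so", {"eglGetProcAddress", "eglMakeCurrent"}),
    ("libGLES.so", {"glDrawArrays"}),
]

def renderdoc_exports(enable_gl, enable_gles):
    """Assumed exports of librenderdoc.so for a given build configuration."""
    exports = set()
    if enable_gles:
        exports |= {"eglGetProcAddress", "eglMakeCurrent"}
    if enable_gl or enable_gles:
        # GL and GLES share function names such as glDrawArrays.
        exports |= {"glDrawArrays"}
    return exports

def search_order(enable_gl, enable_gles):
    """LD_PRELOAD puts librenderdoc.so ahead of every other library."""
    preload = ("librenderdoc.so", renderdoc_exports(enable_gl, enable_gles))
    return [preload] + ANGLE_LIBS
```

Under these assumptions, `resolve("glDrawArrays", search_order(True, False))` still finds librenderdoc.so first, which is the case where disabling only GLES is not enough; with both options off, the lookup falls through to libGLES.so.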

@ShabbyX

commented Feb 15, 2019

A recent change made ANGLE's tests load libEGL.so and libGLESv2.so dynamically. The point was to be able to run ANGLE's tests with a native implementation for comparison. That means ANGLE's tests are no longer directly linked to ANGLE.

I'll go ahead and build renderdoc with Vulkan-only then, as there doesn't seem to be a way around it.
