Creation of ICoreWebView2Controller fails with E_FAIL #850

Closed
danjag opened this issue Jan 25, 2021 · 21 comments
Labels: bug (Something isn't working), tracked (We are tracking this work internally.)

Comments

@danjag

danjag commented Jan 25, 2021

Description
I have been trying to get our WebView2-based browser running in a customer's remote desktop environment.
This environment has very restricted network access ("no internet"), and I think this may be causing the problem.

When I start our browser, it fails to create the controller instance for the browser.
When I start the installed Edge browser, it also fails, but more gracefully, by showing an error page.

Version
SDK: 89.0.721.0
Runtime: 89.0.767.0 dev
Framework: Win32
OS: Windows Server 2016 1607

Additional context
I have noticed that a number of network calls are made during the ICoreWebView2Environment::CreateCoreWebView2Controller call, before the completion handler ICoreWebView2CreateCoreWebView2ControllerCompletedHandler::Invoke is called.

In the customer environment this completion function is called after quite some time with a 0x80004005 (E_FAIL) errorCode.
I'm guessing that the reason for the delay could be due to some network connection timeout.

By using Fiddler I could identify network calls for at least:

smartscreen (https://europe.smartscreen.microsoft.com/api/browser/edge/actions)
shopping (https://www.bing.com/api/shopping/v1/getsupporteddomains)
skype(?) (https://config.edge.skype.com/config/v1/Edge/88.0.705.49)

I managed to disable SmartScreen by using --disable-features=msSmartScreenProtection, as described in issue #834.
It seems shopping has been enabled by default since version 87.0.664.55. This may make perfect sense in the official Edge browser, but in the embedded use case I think it makes no sense at all.

  • Could a failure in any of these network calls be causing my E_FAIL error?

  • Are there any ways to disable the remaining features?

AB#31482191

@danjag danjag added the bug Something isn't working label Jan 25, 2021
@champnic
Member

Hm, I'm not sure what testing we've done in a no internet environment. I'll open this bug on our backlog and we'll take a look. It's possible that one of the network calls is causing a failure, but it would surprise me.

Disabling those network requests is currently tracked in #819.

Thanks!

@champnic champnic added the tracked We are tracking this work internally. label Jan 26, 2021
@champnic champnic self-assigned this Jan 26, 2021
@danjag
Author

danjag commented Jan 27, 2021

Yep, I also noticed these "garbage" requests which are mentioned in issue #819.
There seems to be a policy named EdgeShoppingAssistantEnabled which may affect the "shopping" request made to www.bing.com.
Perhaps I will try that later.

@danjag
Author

danjag commented Feb 1, 2021

I have been trying out some policies that seem related to these "hidden" network requests:

HKLM\SOFTWARE\Policies\Microsoft\Edge\DNSInterceptionChecksEnabled=0

This setting did suppress the three network requests to random hosts as mentioned in #819
Perhaps this feature should be disabled by default due to Chromium’s impact on root DNS traffic

HKLM\SOFTWARE\Policies\Microsoft\Edge\EdgeShoppingAssistantEnabled=0

This setting suppresses the request to https://www.bing.com/api/shopping/v1/getsupporteddomains

HKLM\SOFTWARE\Policies\Microsoft\Edge\ExperimentationAndConfigurationServiceControl=0

This setting suppresses the request to https://config.edge.skype.com/config/v1/Edge/88.0.705.49
I didn't see the request in our webview2 based application, so it may be off by default?

However, these policies only seem to be read by the "official" Edge browsers - which is logical given the namespace of the registry settings (SOFTWARE\Policies\Microsoft\Edge).

But how do we achieve the same thing in a custom application which only embeds the WebView2?
Are there any features that can be disabled through the commandline (like --disable-features=msSmartScreenProtection)?

I have tried to inspect the commandline of the msedge.exe rendering process, but I can't really see any changes due to my policy changes. When I inspect the msedgewebview2.exe which is used in our application, I can see a list of disabled/enabled features.
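
On the command-line question: WebView2 documents an environment variable, WEBVIEW2_ADDITIONAL_BROWSER_ARGUMENTS, whose value is added to the browser process command line. The following is a minimal Python sketch of a launcher helper that sets it before starting a host application. The helper name `with_disabled_features` and the `OurHost.exe` placeholder are illustrative assumptions, and whether a given runtime version honors a particular feature name is not confirmed anywhere in this thread.

```python
import os

# Environment variable documented for WebView2; its value is appended to the
# browser process command line.
ENV_VAR = "WEBVIEW2_ADDITIONAL_BROWSER_ARGUMENTS"


def with_disabled_features(env, features):
    """Return a copy of env that asks the runtime to disable the given features.

    Hypothetical helper: it does not merge with a --disable-features switch
    that may already be present in the variable.
    """
    child_env = dict(env)
    switch = "--disable-features=" + ",".join(features)
    existing = child_env.get(ENV_VAR, "").strip()
    child_env[ENV_VAR] = (existing + " " + switch).strip()
    return child_env


child_env = with_disabled_features(os.environ, ["msSmartScreenProtection"])
print(child_env[ENV_VAR])

# A launcher could then start the host with this environment, for example:
# subprocess.Popen(["OurHost.exe"], env=child_env)  # OurHost.exe is a placeholder
```

Setting the variable only for the child process keeps the change local to one application instead of affecting every WebView2 host on the machine.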

@champnic
Member

champnic commented Feb 2, 2021

I just tried to repro the original bug (creation failure when no network), and in a simple WinForms app I get an error page:
[image]
Can you confirm that the app works fine when the remote desktop has network access? Or that it works when run locally? Just trying to narrow down a possible cause.

We are working on disabling the shopping features by default, because as you mentioned it doesn't make much sense for WebView2. The policies indeed only affect the Microsoft Edge browser, so we're going to build other controls and defaults for those features.

@danjag
Author

danjag commented Feb 2, 2021

No, sorry but I can't change the network conditions in the customer environment so I can not test if it works.

However, I have been testing some more and I think I have at least seen what is happening...

During the CreateCoreWebView2Controller call, all the msedgewebview2.exe subprocesses seem to be started.
When I run in the problematic customer environment I can see the msedgewebview2.exe process tree starting up, and then
all these processes immediately shut down again. After a while my host exe gets notified with an E_FAIL.

I have also noticed that the outcome seems to depend on our host exe.
The problem seems to occur when I start our host exe which is built on our Jenkins build server, but if I start a release build copied directly from my developer machine it starts up as intended. When we build on our build server we do some additional postprocessing on the modules (e.g. patching version information, UPX-ing and signing).

When the msedgewebview2.exe is launched the name of the host exe is passed on the command line: --webview-exe-name=OurHost.exe.

  • What does msedgewebview2.exe do with this information?
  • Are there any implicit assumptions about our host exe that can fail due to our postprocessing?

It should be noted that the postprocessed exe file runs without problems on my Windows 10 machine.
The customer machine is running Windows Server 2012 R2

@danjag
Author

danjag commented Feb 2, 2021

Perhaps the problem somehow originates in WebView2Loader.dll.

When I start the release build from my developer machine (the one that works) I get this commandline for the topmost msedgewebview2.exe (the one launched from our host exe):

"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe" 
--embedded-browser-webview=1 
--webview-exe-name=WASyncBrowserR.exe 
--webview-exe-version=3.6 
--user-data-dir="C:\Users\5danjag\AppData\Roaming\WEADD\WASyncBrowser\EBWebView" 
--no-default-browser-check --disable-component-extensions-with-background-pages 
--no-first-run 
--disable-default-apps 
--noerrdialogs 
--embedded-browser-webview-dpi-awareness=2 
--disable-features=msEdgeOnRampFRE,msEdgeOnRampImport, ... ,SpareRendererForSitePerProcess 
--disable-popup-blocking 
--enable-features=ForwardMemoryPressureEventsToGpuProcess 
--internet-explorer-integration=none 
--js-flags="--harmony-weak-refs-with-cleanup-some --expose-gc" 
--winhttp-proxy-resolver 
--mojo-named-platform-channel-pipe=6204.3620.5614225147673504067

When I start our postprocessed Jenkins build, the same commandline lacks a lot of stuff:

"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe" 
--embedded-browser-webview=1 
--webview-exe-name=WASyncBrowser9.exe 
--webview-exe-version=browser.beta9

I also tested an additional Jenkins build which had a "traditional" product version (9.99.99.99), in case our somewhat odd product version (browser.beta9) was the problem - but it wasn't:

"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe" 
--embedded-browser-webview=1 
--webview-exe-name=WASyncBrowser99.exe 
--webview-exe-version=9.99.99.99

This build fails in the same way.

Could some code in WebView2Loader.dll fail in such a way that only the first three parameters are added to the command line?
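
The comparison of the two command lines above can be automated. This is a small Python sketch that lists which switches are present in the working launch but absent from the failing one. The command lines are abbreviated copies of the ones quoted in this comment, and the helper name `switch_names` is made up for illustration.

```python
import re


def switch_names(command_line):
    """Return the set of switch names (the part before '=') in a command line."""
    # Blank out quoted values first so that switches nested inside a quoted
    # value (for example the contents of --js-flags="...") are not counted.
    unquoted = re.sub(r'"[^"]*"', '""', command_line)
    return set(re.findall(r'(?<!\S)(--[A-Za-z0-9-]+)', unquoted))


# Abbreviated copies of the command lines quoted above.
working = r'''"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe"
--embedded-browser-webview=1
--webview-exe-name=WASyncBrowserR.exe
--webview-exe-version=3.6
--user-data-dir="C:\Users\5danjag\AppData\Roaming\WEADD\WASyncBrowser\EBWebView"
--no-first-run
--js-flags="--harmony-weak-refs-with-cleanup-some --expose-gc"
--mojo-named-platform-channel-pipe=6204.3620.5614225147673504067'''

failing = r'''"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe"
--embedded-browser-webview=1
--webview-exe-name=WASyncBrowser9.exe
--webview-exe-version=browser.beta9'''

missing = sorted(switch_names(working) - switch_names(failing))
for name in missing:
    print("missing in failing launch:", name)
```

Everything after --webview-exe-version shows up as missing, which is consistent with the command line being cut off at that point rather than individual switches being skipped.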

@danjag
Author

danjag commented Feb 2, 2021

I recently upgraded to WebView2Loader.dll version 1.0.705.50.
I don't think this upgrade has anything to do with the problem - it has been around for some time...

@danjag
Author

danjag commented Feb 3, 2021

Hmm, when I was investigating my problem as described in issue #893, I uninstalled the dev channel version in order to get our browser to select the standard Edge version. When that didn't work I installed the beta channel instead, as I described.

When using the beta version this problem disappeared...

The commandline for the topmost msedgewebview2.exe now became:

"C:\Program Files (x86)\Microsoft\Edge Beta\Application\88.0.705.56\msedgewebview2.exe" 
--embedded-browser-webview=1 
--user-data-dir="C:\Users\5danjag\AppData\Roaming\WEADD\WASyncBrowser\EBWebView" 
--no-default-browser-check 
--disable-component-extensions-with-background-pages 
--no-first-run 
--disable-default-apps 
--noerrdialogs 
--embedded-browser-webview-dpi-awareness=2 
--disable-features=msEdgeOnRampFRE, ... ,SpareRendererForSitePerProcess 
--disable-popup-blocking 
--enable-features=ForwardMemoryPressureEventsToGpuProcess 
--internet-explorer-integration=none 
--js-flags="--harmony-weak-refs-with-cleanup-some --expose-gc" 
--winhttp-proxy-resolver 
--mojo-named-platform-channel-pipe=8664.6684.11875398230195540071

In this case these parameters were not added:

--webview-exe-name=WASyncBrowser99.exe 
--webview-exe-version=9.99.99.99

I have been trying to get our Jenkins server to make multiple versions of our browser exe with different degrees of postprocessing, but now they all worked...

Maybe I will try to reinstall the dev channel again and see if I can pinpoint the problem.

@danjag
Author

danjag commented Feb 9, 2021

The exact same problem suddenly occurred in the beta version when it reached 89.0.774.x.

Without any changes on our side besides using the new beta, I noticed that these parameters, which I previously saw in the dev version, were now also added to the beta target:

--webview-exe-name=WASyncBrowser99.exe 
--webview-exe-version=9.99.99.99

I have tracked down that it is sufficient for us to "patch" the version information in our host exe to trigger the problem.

So, my wild guess is that the loader runs into some problem when reading our "patched" version info and fails to produce a correct commandline for msedgewebview2.exe, which then fails to launch properly, and that the loader does this additional version reading step if the targeted version is 89 or greater.

@champnic can you confirm if this is the case?

My main concern is that this will cause our application to break once version 89 reaches stable.

We use a somewhat custom and old version of verpatch to "replace" the version information. However, my understanding is that a new version information record is added to the exe, and that the old resource remains "unused".

We have never had any problems with this, and GetFileVersionInfo has no problem reading the "correct" resource. But I guess that the two version resources could be confusing if this information is read in some other way...

@champnic
Member

champnic commented Feb 9, 2021

Hey @danjag - thanks for the continued investigation and added info.

I just tried a basic app and it does look like the parameters are new, but I'm not seeing a crash as a result of it. I believe the parameters were added so that we can tie crash reports from the webview runtime to a specific app and see if specific types of usage are resulting in crashes.

Is it possible for you to try unpatched binaries? Can you keep everything else the same in your scenario, but just don't run verpatch so that you get the default version behavior, and then see if you still repro the issue? That would hopefully demonstrate that there is a problem with the versioning and how it's sent over, and not some other problem that showed up between the 88 and 89 runtimes.

@danjag
Author

danjag commented Feb 10, 2021

> I just tried a basic app and it does look like the parameters are new, but I'm not seeing a crash as a result of it.

Did you verpatch your app?
If you did verpatch your app - can you see if your app got two version resources?

My exe contains two version resources, and the lang:0 one was added by verpatch:
image

> Is it possible for you to try unpatched binaries?

I did try out various binaries with different "levels" of postprocessing, and once I tried a verpatched version I got the problem, while the non-verpatched version worked just fine - everything else was the same.

Could you see any reason why the commandline could be shorter?

When the problem occurs the commandline is "short":

"C:\Program Files (x86)\Microsoft\Edge Dev\Application\89.0.774.14\msedgewebview2.exe" 
--embedded-browser-webview=1 
--webview-exe-name=WASyncBrowserR.exe 
--webview-exe-version=3.6 

It is terminated directly after the new command-line switches, which are based on our app resource, are added.
Could some failure here (perhaps due to problems related to the resource) cause the following parameters not to be added:

--user-data-dir="C:\Users\5danjag\AppData\Roaming\WEADD\WASyncBrowser\EBWebView" 
--no-default-browser-check --disable-component-extensions-with-background-pages 
--no-first-run 
--disable-default-apps 
--noerrdialogs 
--embedded-browser-webview-dpi-awareness=2 
--disable-features=msEdgeOnRampFRE,msEdgeOnRampImport, ... ,SpareRendererForSitePerProcess 
--disable-popup-blocking 
--enable-features=ForwardMemoryPressureEventsToGpuProcess 
--internet-explorer-integration=none 
--js-flags="--harmony-weak-refs-with-cleanup-some --expose-gc" 
--winhttp-proxy-resolver 
--mojo-named-platform-channel-pipe=6204.3620.5614225147673504067

All these parameters are present when we run the non-verpatched version.

My guess is that it is the WebView2Loader.dll that is responsible for building this commandline and for launching the "topmost" msedgewebview2.exe. Is this correct?

@champnic
Member

@danjag Yes, that is correct - the loader will get the file version info from the host's exe and build the command line. My guess is that something about the version formatting as a result of verpatch is causing it to fail out and not complete the rest of the command line.

@champnic
Member

We use GetFileVersionInfo to read out the version of the exe. I'll try and have an engineer look at this relatively soon, but if you want to keep debugging you could try calling GetFileVersionInfo on both the patched and unpatched versions of your binaries to see if there's an obvious difference.

I'm also generally unfamiliar with the verpatch tool. Could you share how you are using it/calling it (parameters, etc.) on your host exe?

Thanks!

@danjag
Author

danjag commented Feb 11, 2021

We already use GetFileVersionInfo to read version info from our verpatched modules, and it works for us.

Does the loader try to parse the returned version strings?
I guess such string parsing could fail if it doesn't have the assumed format.

This is why I tested a traditional "1.2.3.4" formatted product version - "since it could likely be parsed" - but it didn't matter.
We actually use our build tag in git as the product version, so it can be kind of arbitrary - e.g. 20.A.2rc3.
We made a small change in (our version of) verpatch to allow these special product versions.

> I'm also generally unfamiliar with the verpatch tool. Could you share how you are using it/calling it (parameters, etc.) on your host exe?

From what I remember it is something like (it was a long time ago)

verpatch myapp.exe "1.2.3.4" /pv "1.0.0.0"

where "1.2.3.4" is the file version, and "1.0.0.0" is the product version (which the loader seems to read)
I think myapp.exe should already have a version resource - which is copied, patched and then appended

There apparently is an article on CodeProject about the tool

@danjag
Author

danjag commented Feb 11, 2021

Explorer.exe reads our verpatched modules without problems:

image

The things in the red boxes are patched by our postprocessing.

The product version above is an example where the module was built from our browser.beta9 tag in git.
We allow our modules to have independent file versions - apart from the build number, which is shared between all modules that are built for the product version (i.e. the tag in git).

@RichardSteele

Now that the WebView2 runtime has updated to 89, we're facing the same problem with an executable created by the VB6 compiler. There's no exe hacking going on. Surprisingly, there's no problem when launching in debug mode via the VB6 IDE.

Just opening and saving the executable with ResEdit (http://www.resedit.net/) solves the problem. I'm going to have a closer look at what's going on here.

@RichardSteele

RichardSteele commented Mar 8, 2021

This could be an instance of a buggy resource compiler as described by Raymond Chen.

Asking for the product version of the original executable as created by the VB6 compiler, I get a length of x using VerQueryValueA and a length of 2x using VerQueryValueW.
Asking for the product version of the ResEdit patched executable, I get a length of x using VerQueryValueA and a length of x using VerQueryValueW.

The documentation of VerQueryValueA and VerQueryValueW says that, for version information strings, puLen carries the length in characters. According to Raymond Chen and the String structure documentation, the count stored within the executable is in bytes, not characters. To me it seems that VB6 produces a correct file because it contains the count of bytes. The patched executable created by ResEdit should be wrong because it contains the count of characters.

As the command line of msedgewebview2.exe looks truncated after the version parameter, I suspect that you go via VerQueryValueW and treat puLen as the string's character length. This leads to a string containing too many zero terminators that finally truncate the command line.
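
The suspected mechanism can be illustrated without any Windows API. The Python sketch below is only a model: the function names are hypothetical, memory after the stored string is assumed to be NUL padding, and the "loader" logic (drop one expected terminator, then append further switches) is a guess at what such code might do, not the actual WebView2Loader.dll implementation.

```python
def query_version(stored, reported_len):
    """Mimic reading reported_len UTF-16 units starting at the stored value.

    Memory past the real string is modelled as NUL padding (an assumption).
    """
    memory = stored + "\0" * (2 * len(stored) + 2)
    return memory[:reported_len]


def build_command_line(version_value):
    # Hypothetical loader logic: drop the one expected terminator, then
    # append the remaining switches after the version.
    version = version_value[:-1]
    return f"msedgewebview2.exe --webview-exe-version={version} --no-first-run"


def as_seen_by_process(command_line):
    # An API that takes a NUL-terminated string stops at the first NUL.
    return command_line.split("\0", 1)[0]


stored = "3.6"
good = query_version(stored, len(stored) + 1)        # length in characters, incl. terminator
bad = query_version(stored, 2 * (len(stored) + 1))   # byte count mistaken for characters

print(as_seen_by_process(build_command_line(good)))
print(as_seen_by_process(build_command_line(bad)))
```

In this model, the doubled length leaves extra NUL characters embedded after the version value, so everything appended afterwards is invisible to the launched process, matching the "short" command lines reported earlier in the thread.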

Is VerQueryValueW wrong because it doesn't report the number of characters like VerQueryValueA?

@danjag
Author

danjag commented Mar 8, 2021

Thanks @RichardSteele, this was very helpful.

It appears as if our version of the verpatch utility isn't doing the right thing...

The String structure documentation clearly states that the wValueLength member contains the size, in words (not bytes), of the Value member, which should usually mean that it equals the number of wide characters in a string resource.

If I look at the original "ProductVersion" record which was linked into our .exe, it looks like this:

2C00
0400    (04h = 4 words = 8 bytes)
0100
5000 7200 6F00 6400 7500 6300 7400 5600 6500 7200 7300 6900 6F00 6E00 0000    "ProductVersion"
3300 2E00 3600 0000    "3.6" (3+1 chars = 8 bytes)

and if I look at the "ProductVersion" record produced by our verpatch utility:

4200
1E00    (1Eh = 30 words = 60 bytes) **WRONG**
0100
5000 7200 6F00 6400 7500 6300 7400 5600 6500 7200 7300 6900 6F00 6E00 0000    "ProductVersion"
6200 7200 6F00 7700 7300 6500 7200 2E00 6200 6500 7400 6100 3100 3100 0000    "browser.beta11" (14+1 chars = 30 bytes)

So, it looks like I need to have a look at the source code of our verpatch utility - it should be an easy fix once I find the place...

> Is VerQueryValueW wrong because it doesn't report the number of characters like VerQueryValueA?

In our case this most likely happens because wValueLength appears to contain the number of bytes instead of the number of words.
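
The check done by hand on the two hex dumps above can be scripted. The Python sketch below parses a String record (wLength, wValueLength, wType, a NUL-terminated UTF-16 key, padding to a 32-bit boundary, then the value) and compares the stored wValueLength with the number of words the value actually occupies. The hex strings are copied from the dumps above; the function names are illustrative, and the parser assumes a text record with a NUL-terminated value.

```python
import struct

KEY = "5000 7200 6F00 6400 7500 6300 7400 5600 6500 7200 7300 6900 6F00 6E00 0000"
ORIGINAL = "2C00 0400 0100 " + KEY + " 3300 2E00 3600 0000"
PATCHED = ("4200 1E00 0100 " + KEY +
           " 6200 7200 6F00 7700 7300 6500 7200 2E00"
           " 6200 6500 7400 6100 3100 3100 0000")


def read_utf16z(data, start):
    """Read a NUL-terminated UTF-16LE string; return (text, offset after the NUL)."""
    end = start
    while end + 2 <= len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    return data[start:end].decode("utf-16-le"), end + 2


def check_string_record(hex_dump):
    data = bytes.fromhex(hex_dump)
    # Header: three little-endian 16-bit fields.
    w_length, w_value_length, w_type = struct.unpack_from("<3H", data, 0)
    key, after_key = read_utf16z(data, 6)
    value_start = (after_key + 3) & ~3      # value is aligned to a 32-bit boundary
    value, _ = read_utf16z(data, value_start)
    actual_words = len(value) + 1           # characters plus the terminator
    return {
        "key": key,
        "value": value,
        "stored_words": w_value_length,
        "actual_words": actual_words,
        "consistent": w_value_length == actual_words,
    }


for name, dump in (("original", ORIGINAL), ("patched", PATCHED)):
    print(name, check_string_record(dump))
```

Running a check like this over every string in a version resource as a build step would flag a tool that writes byte counts before the binary ships.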

@champnic
Member

Hey all - we have a fix that makes our app more resilient to this issue. Turns out it's more widespread and was hitting apps in production, so we've pushed the fix down to the stable WebView2 Runtime 89, which should be patched this week. Additionally the fix should show up in Microsoft Edge Canary in a day or two. Apologies for any inconvenience, and huge thanks for the investigation you all did here to track this issue down!

@danjag
Author

danjag commented Mar 10, 2021

Great news @champnic, thanks for letting us know

@champnic
Member

The fix has been checked in and is available in:
WebView2 Runtime version 89.0.774.50+
Microsoft Edge Canary/Dev version 91.0.823.0+

We still encourage you to update your version formats if you were previously running into this.

Thanks!
