vswhere hangs on Windows 2008 R2 #87

Closed
randellhodges opened this issue Jul 21, 2017 · 27 comments

@randellhodges

I first reported my issues over on vsts-agent, but I'm fairly sure it belongs here.
microsoft/azure-pipelines-agent#1090

Windows 2008 R2 server with the latest windows updates.

for /L %n in (1,1,100) do "vswhere.exe" -version [15.0,15.1) -latest -format json

If I run this, it will eventually hang. I might have to run that a couple times, but on my machine I can always get it to hang.

I ran it a few times on a Windows 2016 server and I never got a hang.

I tried v1.0.62 and v2.0.2
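
A rough way to automate the loop above and notice a hang without watching the console is sketched below. It is only an illustration: the helper name `run_until_hang` and the 30-second timeout are assumptions, not part of vswhere or the build agent, and the arguments in the usage comment are copied from the command in this report.

```python
import subprocess


def run_until_hang(cmd, iterations=100, timeout_s=30.0):
    """Run cmd repeatedly; return the 1-based iteration that timed out, or None.

    Hypothetical helper. A run that exceeds timeout_s is treated as a hang.
    """
    for i in range(1, iterations + 1):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising.
            return i
    return None


# Example usage (assumes vswhere.exe is on PATH):
#   cmd = ["vswhere.exe", "-version", "[15.0,15.1)", "-latest", "-format", "json"]
#   print(run_until_hang(cmd, iterations=1000))
```

Because `subprocess.run` kills the child when the timeout expires, this is useful for counting how often the hang occurs, but not for capturing a dump of the hung process; for that, the process has to be left running.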

@heaths heaths self-assigned this Jul 21, 2017
@heaths heaths added the bug label Jul 21, 2017
@heaths
Member

heaths commented Jul 21, 2017

Probably not the same as #72 since it sounds like you're using released versions. I'll see if I can repro this on a VM, but it would help if you uploaded a dump of the hung process.

However, I will point out that Visual Studio 2017 is not supported on Windows Server 2008 R2 (see the System Requirements), so that query would never find anything anyway. If you were searching for the Build Tools 2017 (which are supported on 2008 R2 SP1), you will need to add -products * (or the specific product ID).

@randellhodges
Author

I'll take a look at the requirements. When it does run, it does find it.

[
  {
    "instanceId": "16153ad4",
    "installDate": "2017-05-24T19:33:39Z",
    "installationName": "VisualStudio/15.2.0+26430.16",
    "installationPath": "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Enterprise",
    "installationVersion": "15.0.26430.16",
    "displayName": "Visual Studio Enterprise 2017",
    "description": "Microsoft DevOps solution for productivity and coordination across teams of any size",
    "enginePath": "C:\\Program Files (x86)\\Microsoft Visual Studio\\Installer\\resources\\app\\ServiceHub\\Services\\Microsoft.VisualStudio.Setup.Service",
    "channelId": "VisualStudio.15.Release",
    "channelPath": "C:\\Users\\hodgesr\\AppData\\Local\\Microsoft\\VisualStudio\\Packages\\_Channels\\4CB340F5\\catalog.json",
    "channelUri": "https://aka.ms/vs/15/release/channel",
    "releaseNotes": "https://go.microsoft.com/fwlink/?LinkId=660284#15.1.26430.16",
    "thirdPartyNotices": "https://go.microsoft.com/fwlink/?LinkId=660300"
  }
]
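
For scripts that consume this output, a minimal sketch of reading the `-format json` result follows. The sample is a trimmed copy of the output above; the function name `summarize_instances` is hypothetical, and only property names that appear in the output are used.

```python
import json

# Trimmed copy of the vswhere output shown above (raw string keeps the
# JSON-escaped backslashes intact).
SAMPLE = r'''
[
  {
    "instanceId": "16153ad4",
    "installationPath": "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Enterprise",
    "installationVersion": "15.0.26430.16",
    "displayName": "Visual Studio Enterprise 2017"
  }
]
'''


def summarize_instances(vswhere_json):
    """Return (displayName, installationVersion, installationPath) per instance."""
    return [
        (
            inst.get("displayName"),
            inst.get("installationVersion"),
            inst.get("installationPath"),
        )
        for inst in json.loads(vswhere_json)
    ]


for name, version, path in summarize_instances(SAMPLE):
    print(f"{name} {version} at {path}")
```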

@randellhodges
Author

I am not familiar with dumping hung processes. If you can point me to a howto, I'd be glad to provide one.

@heaths
Member

heaths commented Jul 21, 2017

Assuming Visual Studio 2017 even runs on 2008 R2 (it's not supported),

  1. Start Visual Studio 2017
  2. Click Debug->Attach to process...
  3. Select "vswhere"
  4. Click Attach
  5. Click Debug->Break All
  6. Click Debug->Save dump as...
  7. Select Minidump with heap (*.dmp)
  8. Save the dump
  9. Upload the dump to OneDrive or somewhere else I can access and provide the URL

@randellhodges
Author

It works like a champ.

Here is the dump:
https://1drv.ms/u/s!AoeS4SeEx6cLgfs7_X9PHxIyw2IPow

@heaths
Member

heaths commented Jul 21, 2017

There's a deadlock in the loader:

0:001> ~
#  0  Id: 21c4.26c0 Suspend: 1 Teb: 7efdd000 Unfrozen
.  1  Id: 21c4.2878 Suspend: 1 Teb: 7efda000 Unfrozen
0:001> !cs -l -o
-----------------------------------------
DebugInfo          = 0x77064380
Critical section   = 0x770620c0 (ntdll!LdrpLoaderLock+0x0)
LOCKED
LockCount          = 0x1
WaiterWoken        = No
OwningThread       = 0x00002878
RecursionCount     = 0x1
LockSemaphore      = 0xDC
SpinCount          = 0x00000000
OwningThread DbgId = ~1s
OwningThread Stack =
	ChildEBP RetAddr  Args to Child              
	00fcf6d0 76faebae 000000d0 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
	00fcf734 76faea92 00000000 00000000 00000000 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf75c 0f7114df 0f737b60 00fcf79c 0f712d83 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf768 0f712d83 00000004 feb606c9 00609dc0 Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_lock+0x15 (FPO: [Non-Fpo]) (CONV: cdecl)
	00fcf79c 0f712efd 00fcf7b0 00fcf7d4 00fcf7b4 Microsoft_VisualStudio_Setup_Configuration_Native!__crt_seh_guarded_call<void>::operator()<<lambda_3518db117f0e7cdb002338c5d3c47b6c>,<lambda_b2ea41f6bbb362cd97d94c6828d90b61> &,<lambda_abdedf541bb04549bc734292b4a045d4> >+0x16 (FPO: [Non-Fpo]) (CONV: thiscall)
	00fcf7bc 0f712fc3 00000004 00fcf7d4 00000005 Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_lock_and_call<<lambda_b2ea41f6bbb362cd97d94c6828d90b61> >+0x24 (FPO: [Non-Fpo]) (CONV: cdecl)
	00fcf7dc 0f7131ef 00609dc0 0f737d80 00000002 Microsoft_VisualStudio_Setup_Configuration_Native!construct_ptd+0x72 (FPO: [Non-Fpo]) (CONV: cdecl)
	00fcf7f4 0f712b51 0f7037f0 0f703b2e 00fcf844 Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_getptd_noexit+0x60 (FPO: [0,0,4]) (CONV: cdecl)
	00fcf7f8 0f7037f0 0f703b2e 00fcf844 0f703d3e Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_thread_attach+0x5 (FPO: [0,0,0]) (CONV: cdecl)
	00fcf7fc 0f703b2e 00fcf844 0f703d3e 0f6f0000 Microsoft_VisualStudio_Setup_Configuration_Native!__scrt_dllmain_crt_thread_attach+0x11 (FPO: [0,0,0]) (CONV: cdecl)
	00fcf804 0f703d3e 0f6f0000 00000002 00000000 Microsoft_VisualStudio_Setup_Configuration_Native!dllmain_crt_dispatch+0x2b (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf844 0f703e27 0f6f0000 00000002 00000000 Microsoft_VisualStudio_Setup_Configuration_Native!dllmain_dispatch+0x59 (FPO: [Non-Fpo]) (CONV: cdecl)
	00fcf858 76f99364 0f6f0000 00000002 00000000 Microsoft_VisualStudio_Setup_Configuration_Native!_DllMainCRTStartup+0x1c (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf878 76f99b72 0f703e0b 0f6f0000 00000002 ntdll!zzz_AsmCodeRange_End
	00fcf918 76f9985c 00fcf988 775118a2 00000000 ntdll!LdrpInitializeThread+0x15b (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf964 76f99889 00fcf988 76f60000 00000000 ntdll!_LdrpInitialize+0x1ad (FPO: [Non-Fpo]) (CONV: stdcall)
	00fcf974 00000000 00fcf988 76f60000 00000000 ntdll!LdrInitializeThunk+0x10 (FPO: [Non-Fpo]) (CONV: stdcall)
-----------------------------------------
DebugInfo          = 0x005f8650
Critical section   = 0x0f737b60 (Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_lock_table+0x60)
LOCKED
LockCount          = 0x1
WaiterWoken        = No
OwningThread       = 0x000026c0
RecursionCount     = 0x3
LockSemaphore      = 0xD0
SpinCount          = 0x00000fa0
OwningThread DbgId = ~0s
OwningThread Stack =
	ChildEBP RetAddr  Args to Child              
	0018f09c 76faebae 000000dc 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
	0018f100 76faea92 00000000 00000000 0018f168 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f128 76f90329 770620c0 77b51002 0f729228 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f1c4 76f90262 76210000 0018f200 00000000 ntdll!LdrGetProcedureAddressEx+0x159 (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f1e0 74c11f7c 76210000 0018f200 00000000 ntdll!LdrGetProcedureAddress+0x18 (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f208 0f717923 76210000 0f729228 00000001 KERNELBASE!GetProcAddress+0x44 (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f224 0f717cf6 00000015 0f729228 0f729220 Microsoft_VisualStudio_Setup_Configuration_Native!try_get_function+0x66 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f244 0f713aa0 005fefa8 00603e40 005fefa8 Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_IsValidLocaleName+0x27 (FPO: [Non-Fpo]) (CONV: stdcall)
	0018f450 0f71403f 005fefa8 0018f540 00000083 Microsoft_VisualStudio_Setup_Configuration_Native!_expandlocale+0x227 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f64c 0f713513 00603e40 00000000 005fefa8 Microsoft_VisualStudio_Setup_Configuration_Native!_wsetlocale_nolock+0x1f5 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f668 0f713446 fe5207cd 0018f728 00603e40 Microsoft_VisualStudio_Setup_Configuration_Native!<lambda_e378711a6f6581bf7f0efd7cdf97f5d9>::operator()+0x29 (FPO: [0,0,4]) (CONV: thiscall)
	0018f698 0f71348a 0018f6ac 0018f6d0 0018f6b0 Microsoft_VisualStudio_Setup_Configuration_Native!__crt_seh_guarded_call<void>::operator()<<lambda_c76fdea48760d5f9368b465f31df4405>,<lambda_e378711a6f6581bf7f0efd7cdf97f5d9> &,<lambda_e927a58b2a85c081d733e8c6192ae2d2> >+0x23 (FPO: [Non-Fpo]) (CONV: thiscall)
	0018f6b8 0f7134e2 00000004 0018f6d0 00000000 Microsoft_VisualStudio_Setup_Configuration_Native!__acrt_lock_and_call<<lambda_e378711a6f6581bf7f0efd7cdf97f5d9> >+0x24 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f6e4 0f713400 fe520641 00000000 005fefa8 Microsoft_VisualStudio_Setup_Configuration_Native!<lambda_2af78c5f5901b1372d98f9ab3177dfa6>::operator()+0x54 (FPO: [Non-Fpo]) (CONV: thiscall)
	0018f714 0f713caa 0018f74f 0018f728 0018f73c Microsoft_VisualStudio_Setup_Configuration_Native!__crt_seh_guarded_call<void>::operator()<<lambda_70818de7b02deff9841e8b0962a60ed9>,<lambda_2af78c5f5901b1372d98f9ab3177dfa6> &,<lambda_f51fe5fd7c79a33db34fc9310f277369> &>+0x18 (FPO: [Non-Fpo]) (CONV: thiscall)
	0018f750 0f710f70 00000000 005fefa8 005fc500 Microsoft_VisualStudio_Setup_Configuration_Native!_wsetlocale+0x79 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f76c 0f710fa6 00000000 005fc500 fe5206ed Microsoft_VisualStudio_Setup_Configuration_Native!call_wsetlocale+0x7f (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f7b8 0f702949 00000000 005fc500 0018f82c Microsoft_VisualStudio_Setup_Configuration_Native!setlocale+0x18 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f7cc 0f6fdc97 0018f82c 005fc500 fe520941 Microsoft_VisualStudio_Setup_Configuration_Native!std::_Locinfo::_Locinfo_ctor+0x34 (FPO: [Non-Fpo]) (CONV: cdecl)
	0018f814 0f6fdd3c 005fc500 fe5209f9 00000001 Microsoft_VisualStudio_Setup_Configuration_Native!std::_Locinfo::_Locinfo+0xc7 (FPO: [Non-Fpo]) (CONV: thiscall)

This seems enough like #72 to make me wonder if our CI build uses the wrong CRT. We may have to hardcode it in the vcxproj project.
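
The two stacks above show a classic lock-order inversion: thread 0 holds the CRT lock (inside `setlocale`) and waits for the loader lock in `GetProcAddress`, while thread 1 holds the loader lock (during the DLL's thread-attach notification) and waits for the same CRT lock. The sketch below reproduces only that shape with two ordinary Python locks; it is a toy model, not the CRT or loader code, and all names in it are illustrative. Timeouts are used so the demonstration terminates instead of hanging.

```python
import threading


def demonstrate_lock_inversion(timeout_s=0.2):
    """Model the deadlock shape: each thread holds one lock and wants the other."""
    loader_lock = threading.Lock()  # stands in for ntdll!LdrpLoaderLock
    crt_lock = threading.Lock()     # stands in for the __acrt_lock_table entry
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    acquired_second = {}

    def worker(name, first, second):
        with first:
            # Wait until the other thread also holds its first lock.
            both_hold_first.wait()
            got = second.acquire(timeout=timeout_s)
            acquired_second[name] = got
            # Keep holding the first lock until both attempts have finished.
            both_tried_second.wait()
            if got:
                second.release()

    threads = [
        # Thread 0: setlocale takes the CRT lock, then needs the loader lock.
        threading.Thread(target=worker, args=("thread0_setlocale", crt_lock, loader_lock)),
        # Thread 1: thread attach runs under the loader lock, then needs the CRT lock.
        threading.Thread(target=worker, args=("thread1_thread_attach", loader_lock, crt_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acquired_second


print(demonstrate_lock_inversion())
```

Neither thread obtains its second lock. Without the timeouts, both would wait forever, which is what the hung vswhere process is doing.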

@heaths
Member

heaths commented Jul 21, 2017

@randellhodges can you verify that vswhere.exe in https://ci.appveyor.com/api/buildjobs/srdnp2g8g5a504b1/artifacts/bin%2FRelease.zip works for you?

@randellhodges
Author

Sadly, it eventually hangs. The first loop of 100 worked fine. I changed it to go up to 1000 and a few dozen runs into it, it hung.

I tried the recommendation over on the linked issue. I am not a C++ developer but I went ahead and grabbed the master version, installed C++ and pulled down the 10.0.14393.0 SDK and tried. It too hung.

@heaths
Member

heaths commented Jul 22, 2017

Just in case it's different, can you take and upload another dump? I'm working with the Visual C++ team to determine why this seemingly known issue in an older UCRT is not fixed by the rebuild.

@randellhodges
Author

@heaths
Member

heaths commented Jul 24, 2017

In order to see if the problem is fixed, I really need you to try https://ci.appveyor.com/api/buildjobs/srdnp2g8g5a504b1/artifacts/bin%2FRelease.zip. I don't build an official, recommended version until we know a bug is fixed.

@randellhodges
Author

randellhodges commented Jul 25, 2017

Is that not the same one I posted the dump for 3 days ago? The post right above it that says "new dump"?

Or did I dump the wrong version or something?

@heaths
Member

heaths commented Jul 25, 2017

You said you tested on the version from "master". That's not the fixed version.

@randellhodges
Author

The first dump
https://1drv.ms/u/s!AoeS4SeEx6cLgfs7_X9PHxIyw2IPow
was against master

The second dump (which I said New Dump)
https://1drv.ms/u/s!AoeS4SeEx6cLgfs8uD_ijHUtfpSITg
was against
https://ci.appveyor.com/api/buildjobs/srdnp2g8g5a504b1/artifacts/bin%2FRelease.zip

Maybe I should not have said "New Dump".

@heaths
Member

heaths commented Jul 25, 2017

You said,

I tried the recommendation over on the linked issue. I am not a C++ developer but I went ahead and grabbed the master version, installed C++ and pulled down the 10.0.14393.0 SDK and tried. It too hung.

That doesn't sound like you used the built binary which contains the fix that should prevent the deadlock. The new dump you provided does not contain the symbol while the EXE does.

@randellhodges
Author

randellhodges commented Jul 25, 2017

I had many more comments after that. Let's just chalk it up to a misunderstanding and move on.

I guess that last dump I provided was not against the correct version. My apologies. I thought it was against the one you requested. I am confident that I have now tested the correct one and, while it took a while longer to hang, it still eventually hung.

Here is the dump you requested:
https://1drv.ms/u/s!AoeS4SeEx6cLgfs9MCQAcUraE8CLLg

@heaths
Member

heaths commented Jul 28, 2017

We believe we have identified the problem but cannot reproduce it. Please extract the contents (at least the DLL, and the PDB as well in case symbols are needed) of https://1drv.ms/u/s!AjCAGrsc9hWUgZiMIeKFXZBzVRXdrZI to %ProgramData%\Microsoft\VisualStudio\Setup\x86 and try again.

@randellhodges
Author

@heaths Sorry, I still got the hang. I tried it against both v2.0.2 and the appveyor build you had me try.


Here is the latest dump:
https://1drv.ms/u/s!AoeS4SeEx6cLgfs-tZQKLGXmCDLwUw

Sometimes I could get it to run 100 times fine, but the next time I tried 100 it would hang. If I increase it to 1000, no version ever gets through the whole loop on my machine.

Anything else I can provide for you?

@heaths
Member

heaths commented Jul 28, 2017

I sincerely apologize. I gave you the wrong file that did not include the fix. Please do the same with https://1drv.ms/u/s!AjCAGrsc9hWUgZiMIeKFXZBzVRXdrZI (same link - just replaced the binaries).

@randellhodges
Author

randellhodges commented Jul 28, 2017

Good news, the latest ones seem to solve the issue! I did multiple loops of 1000 and not a single hang!

@heaths
Member

heaths commented Jul 28, 2017

Thank you for the help! Unfortunately, this is a problem in the DLL (as you probably noticed) that ships with Visual Studio 2017, and the fix will be in the next update after Visual Studio 2017 version 15.3 ships, i.e. Visual Studio 2017 version 15.4 preview. We're in final testing for version 15.3, and with a single report of this deadlock it doesn't meet the bar. But if this is blocking you currently, you're welcome to use those private bits, which will be replaced by the official build when it is released.

@mwindsor-beoped

Just had this same issue with VS2017 15.8.3 and VS2015 installed, combined with the latest VSTS packaged version of the build agent (2.140.0, vswhere v1.0.62) plus an updated vswhere (2.5.2).

Only thing that would fix it was the patched version of the DLL above (previously had 1.17.1230).

Thanks for keeping the download available.

And +1 to the reports :)

@heaths
Member

heaths commented Sep 11, 2018

Which version of vswhere were you running? A newer, 2.x version against any DLL version 1.13 or newer that we released shouldn't repro this. Both binaries had to be recompiled against a new UCRT.

@mwindsor-beoped

I believe all this information was in my original comment, however, putting it another way...

Initially I was using vswhere v1.0.62. Then, having done some research, I updated vswhere to 2.5.2, which had the same issue. It was only by downloading and installing the patched DLL that I was able to fix the issue. Previously I'd had version 1.17.1230.

@heaths
Member

heaths commented Sep 20, 2018

You said,

Just had this same issue with VS2017 15.8.3 and VS2015 installed, combined with the latest VSTS packaged version of the build agent (2.140.0, vswhere v1.0.62) plus an updated vswhere (2.5.2).

Which one were you running? You list at least two locations of vswhere that are different versions. The "patched DLL" is older - I figured the deadlock in the COM server DLL long before 1.17, but it also required a change to vswhere. The two must go together. So a 1.0 version of vswhere - if you were running the build agent one - could be the problem.

If using a 2.0 version with 1.17, I'd need to see a minidump or at least the stack to confirm it's somehow the same issue, or a new one.
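
The pairing rule described in this thread (a 2.x vswhere together with a DLL of version 1.13 or newer) can be written down as a small check. This is a sketch of that stated expectation only: the function names are hypothetical, the thresholds come solely from the comments above, and a passing check does not guarantee the absence of a hang.

```python
def parse_version(text):
    """Turn 'v1.0.62' or '1.17.1230' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def pair_expected_fixed(vswhere_version, dll_version):
    """True when both binaries meet the versions named in the thread.

    Assumption: vswhere major version >= 2 and DLL version >= 1.13.
    """
    vswhere_ok = parse_version(vswhere_version)[0] >= 2
    dll_ok = parse_version(dll_version)[:2] >= (1, 13)
    return vswhere_ok and dll_ok


# The build agent's bundled vswhere with DLL 1.17.1230:
print(pair_expected_fixed("v1.0.62", "1.17.1230"))
# The updated vswhere with the same DLL:
print(pair_expected_fixed("2.5.2", "1.17.1230"))
```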

@mwindsor-beoped

Yes, I mentioned multiple versions because it failed with both (i.e. I tried running both), and I tried to give a hint of why I'd tried the different versions (normally more information is better than less?) In short, the 1.0 version failed, which it sounds like you would have expected and which was what made me try a later version. I then tried 2.5.2, which still failed.
Installing the patched version resolved the issue for me. Was just sharing my experience in case it helped others consider the solution, and have no interest in doing diagnostic tests now that I have an operating system.
Basically, I don't want to break what is a brittle system.
Thanks for your attention though.

@heaths
Member

heaths commented Sep 21, 2018

The patched version is older and does not support many things on which VS and partner applications depend. By not helping to identify why a similar behavior still occurs on your system (the original problem was a deadlock in the CRT), this system is made brittle for everyone. At the very least, your system does not have the required version for all features to work correctly and will subsequently get updated at some later date when we add more features to the query API on which new VS features may depend.
