Crash on startup, related to defterm #10233

Closed
zadjii-msft opened this issue May 27, 2021 · 12 comments · Fixed by #10261
Labels: Area-DefApp, Needs-Tag-Fix (doesn't match tag requirements), Priority-1 (P1), Resolution-Fix-Committed (fix is checked in, but it might be 3-4 weeks until a release), Severity-Crash (crashes are real bad news)

Comments

@zadjii-msft
Member

zadjii-msft commented May 27, 2021

Microsoft.WindowsTerminalPreview_1.9.1445.0, internal OS build

We're hitting the CATCH_FAIL_FAST:

    void TerminalPage::_StartInboundListener()
    {
        if (_shouldStartInboundListener)
        {
            _shouldStartInboundListener = false;

            try
            {
                winrt::Microsoft::Terminal::TerminalConnection::ConptyConnection::StartInboundListener();
            }
            // If we failed to start the listener, it will throw.
            // We should fail fast here or the Terminal will be in a very strange state.
            // We only start the listener if the Terminal was started with the COM server
            // `-Embedding` flag and we make no tabs as a result.
            // Therefore, if the listener cannot start itself up to make that tab with
            // the inbound connection that caused the COM activation in the first place...
            // we would be left with an empty terminal frame with no tabs.
            // Instead, crash out so COM sees the server die and things unwind
            // without a weird empty frame window.
            CATCH_FAIL_FAST()
        }
    }

Dump is in c:\users\migrie\dev\startup-crash.dmp. Repo's been hot yesterday so this might have already been filed - I'll dedupe later.

stack
0:000> k
 # Child-SP          RetAddr               Call Site
00 00000066`5b8f9820 00007ff8`82bee109     KERNELBASE!RaiseFailFastException+0x152 [minkernel\kernelbase\xcpt.c @ 1198] 
01 00000066`5b8f9e00 00007ff8`82beb428     TerminalApp!wil::details::WilDynamicLoadRaiseFailFastException+0x49 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 1773] 
02 00000066`5b8f9e30 00007ff8`82beb40c     TerminalApp!wil::details::WilRaiseFailFastException+0x18 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 1713] 
03 00000066`5b8f9e60 00007ff8`82beb9c3     TerminalApp!wil::details::WilFailFast+0xa8 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3420] 
04 00000066`5b8f9f30 00007ff8`82beb512     TerminalApp!wil::details::ReportFailure_NoReturn<3>+0x21b [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3463] 
05 00000066`5b8fb420 00007ff8`82c2cffc     TerminalApp!wil::details::ReportFailure_Base<3,0>+0x2e [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3488] 
06 00000066`5b8fb480 00007ff8`82c2cf7a     TerminalApp!wil::details::ReportFailure_CaughtExceptionCommonNoReturnBase<3>+0x80 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3590] 
07 (Inline Function) --------`--------     TerminalApp!wil::details::ReportFailure_CaughtExceptionCommon+0x14 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3599] 
08 00000066`5b8fb510 00007ff8`82c31723     TerminalApp!wil::details::ReportFailure_CaughtException<3>+0x32 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 3807] 
09 00000066`5b8fc570 00007ff8`82bbc8e1     TerminalApp!wil::details::in1diag3::FailFast_CaughtException+0x13 [E:\BA\97\s\dep\wil\include\wil\result_macros.h @ 4719] 
0a 00000066`5b8fc5c0 00007ff8`bcf81080     TerminalApp!`winrt::TerminalApp::implementation::TerminalPage::_StartInboundListener'::`1'::catch$0+0x22 [E:\BA\97\s\src\cascadia\TerminalApp\TerminalPage.cpp @ 359] 
0b 00000066`5b8fc5f0 00007ff8`bcf826a5     VCRUNTIME140_1!_CallSettingFrame_LookupContinuationIndex+0x20 [D:\a01\_work\26\s\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm @ 98] 
0c 00000066`5b8fc620 00007ff8`c7ce74c6     VCRUNTIME140_1!__FrameHandler4::CxxCallCatchBlock+0x115 [D:\a01\_work\26\s\src\vctools\crt\vcruntime\src\eh\frame.cpp @ 1393] 
0d 00000066`5b8fc700 00007ff8`82bbc8b9     ntdll!RcFrameConsolidation+0x6 [minkernel\ntos\rtl\amd64\capture.asm @ 1124] 
0e 00000066`5b8fe9c0 00007ff8`82bab2cb     TerminalApp!winrt::TerminalApp::implementation::TerminalPage::_StartInboundListener+0x19 [E:\BA\97\s\src\cascadia\TerminalApp\TerminalPage.cpp @ 348] 
0f 00000066`5b8fe9f0 00007ff8`82baacdb     TerminalApp!winrt::TerminalApp::implementation::TerminalPage::_OnFirstLayout+0xbb [E:\BA\97\s\src\cascadia\TerminalApp\TerminalPage.cpp @ 330] 
10 (Inline Function) --------`--------     TerminalApp!winrt::Windows::Foundation::TypedEventHandler<winrt::Windows::UI::Xaml::FrameworkElement,winrt::Windows::Foundation::IInspectable>::<lambda_c45f47b7e8e0f80bc2a45684739b042d>::operator()+0x1a [E:\BA\97\s\src\cascadia\TerminalApp\Generated Files\winrt\Windows.Foundation.h @ 2511] 
11 00000066`5b8fea40 00007ff8`81857cd3     TerminalApp!winrt::impl::delegate<winrt::Windows::Foundation::TypedEventHandler<winrt::Windows::UI::Xaml::FrameworkElement,winrt::Windows::Foundation::IInspectable>,<lambda_c45f47b7e8e0f80bc2a45684739b042d> >::Invoke+0x3b [E:\BA\97\s\src\cascadia\TerminalApp\Generated Files\winrt\Windows.Foundation.h @ 894] 
12 00000066`5b8fea90 00007ff8`81857b9a     Windows_UI_Xaml!GetErrorContextIndex+0x117c03
13 00000066`5b8feaf0 00007ff8`8185c2e8     Windows_UI_Xaml!GetErrorContextIndex+0x117aca
14 00000066`5b8feb30 00007ff8`81994285     Windows_UI_Xaml!GetErrorContextIndex+0x11c218
15 00000066`5b8feb80 00007ff8`8185cc09     Windows_UI_Xaml!DllCanUnloadNow+0x34925
16 00000066`5b8febf0 00007ff8`8182404d     Windows_UI_Xaml!GetErrorContextIndex+0x11cb39
17 00000066`5b8fec90 00007ff8`818231af     Windows_UI_Xaml!GetErrorContextIndex+0xe3f7d
18 00000066`5b8fed20 00007ff8`819f51f4     Windows_UI_Xaml!GetErrorContextIndex+0xe30df
19 00000066`5b8fee30 00007ff8`819f5001     Windows_UI_Xaml!DllCanUnloadNow+0x95894
1a 00000066`5b8fee90 00007ff8`819f4f20     Windows_UI_Xaml!DllCanUnloadNow+0x956a1
1b 00000066`5b8feed0 00007ff8`818ba7c1     Windows_UI_Xaml!DllCanUnloadNow+0x955c0
1c 00000066`5b8fef30 00007ff8`818ba6a3     Windows_UI_Xaml!DllGetActivationFactory+0x49e71
1d 00000066`5b8fef70 00007ff8`817fcfb0     Windows_UI_Xaml!DllGetActivationFactory+0x49d53
1e 00000066`5b8fefa0 00007ff8`817fce40     Windows_UI_Xaml!GetErrorContextIndex+0xbcee0
1f 00000066`5b8fefe0 00007ff8`817e6e55     Windows_UI_Xaml!GetErrorContextIndex+0xbcd70
@zadjii-msft zadjii-msft added Severity-Crash Crashes are real bad news. Area-DefApp labels May 27, 2021
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels May 27, 2021
@zadjii-msft zadjii-msft self-assigned this May 27, 2021

@zadjii-msft
Member Author

Oh no

I tried reproing this with the debugger attached to the launch of the Terminal, so I could breakpoint immediately. Guess what, the problem went away, and now it's no longer hitting at all. Like it won't repro even without the debugger. Shoot.

I'm a tad worried that the package was somehow busted and launching with the debugger made the OS just sort its shit out. But no way to know now 😕

Niksa said he'd take a look at the dump though, so I'll leave this open for him. I don't think there's anything valuable in it, considering we're in the CATCH_FAIL_FAST, and not where the exception was thrown. If he can't find anything from it, then I'll let him just close this out, and keep an eye out for other reports.

@zadjii-msft zadjii-msft assigned miniksa and unassigned zadjii-msft May 27, 2021
@zadjii-msft
Member Author

It also looks like this is (failure?) 13cbe5b8-092a-28b2-3256-509530451824 in watson. So it's definitely real.

@zadjii-msft zadjii-msft added this to the Terminal v2.0 milestone May 27, 2021
@miniksa
Member

miniksa commented May 27, 2021

How to find the real exception from a wil fail fast:

  1. Starting from fail fast dump, do .ecxr to get into the exception record.
  2. kP to dump the stack with parameters
  3. From the frame with VCRUNTIME140_1!__FrameHandler4::CxxCallCatchBlock, find the struct _EXCEPTION_RECORD * pExcept parameter and call .exr <address> with that parameter address. In my case the address was 0x00000066`5b8fd3d0.
  4. I got one with Exception Code 0x80000029 (STATUS_UNWIND_CONSOLIDATE) and 15 parameters. Three of the parameters in this exception record hold addresses close to the exception record's own address: parameters 1, 4, and 6 for me.
0:000> .exr 0x00000066`5b8fd3d0
ExceptionAddress: 0000000000000000
   ExceptionCode: 80000029
  ExceptionFlags: 00000022
NumberParameters: 15
   Parameter[0]: 00007ff8bcf82590
   Parameter[1]: 000000665b8fd538
   Parameter[2]: 00007ff882bbc8bf
   Parameter[3]: 0000000000000000
   Parameter[4]: 000000665b8fdfd0
   Parameter[5]: 00007ff882b50000
   Parameter[6]: 000000665b8fe810
   Parameter[7]: 0000000000000000
   Parameter[8]: 0000000019930520
   Parameter[9]: 00007ff882b50000
   Parameter[10]: 0000000000000000
   Parameter[11]: 0000000000000001
   Parameter[12]: 0000000000000000
   Parameter[13]: 0000000000000000
   Parameter[14]: 0000000000000000	
  5. Try .exr-ing each of those addresses. Params 1 and 4 looked like nonsense, but I struck gold on 6, which dumped out a C++ EH exception.
    Param 1:
0:000> .exr 000000665b8fd538
ExceptionAddress: 0000000000000000
   ExceptionCode: 5b8fe9c0
  ExceptionFlags: 00000066
NumberParameters: 1536153568
   Parameter[0]: 000000665b8fd5f8
   Parameter[1]: 00007ff882d41b60
   Parameter[2]: 000000665b8fd5b8
   Parameter[3]: 0000006600000000
   Parameter[4]: 0000000000000000
   Parameter[5]: 0000024830c0f400
   Parameter[6]: 0000006600000000
   Parameter[7]: 0000000000000000
   Parameter[8]: 0000000100000000
   Parameter[9]: 00007ff882d41bac
   Parameter[10]: 000000665b8fd810
   Parameter[11]: 000000665b8fd620
   Parameter[12]: 0000000000000000
   Parameter[13]: 001e551100000001
   Parameter[14]: 00007ff8c7c46770

That Exception Code looks like nonsense. They usually start with 8xxx or Cxxx or 0000 depending on whether it's an HRESULT, NTSTATUS, or Win32 error.

Param 4:

0:000> .exr 000000665b8fdfd0
ExceptionAddress: 0000000000000000
   ExceptionCode: 00000000
  ExceptionFlags: 00000000
NumberParameters: 0

Yeah that looks like nothing.

Param 6:

0:000> .exr 000000665b8fe810
ExceptionAddress: 00007ff8c552491c (KERNELBASE!RaiseException+0x000000000000006c)
   ExceptionCode: e06d7363 (C++ EH exception)
  ExceptionFlags: 00000081
NumberParameters: 4
   Parameter[0]: 0000000019930520
   Parameter[1]: 000000665b8fe960
   Parameter[2]: 00007ff882d41b88
   Parameter[3]: 00007ff882b50000
  pExceptionObject: 000000665b8fe960
  _s_ThrowInfo    : 00007ff882d41b88

Jackpot.
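The triage in steps 4 and 5 can be sketched in code. This is only an illustration of the rule of thumb above, not anything the debugger or WIL provides: `CodeKind`, `classifyExceptionCode`, `isNearRecord`, and the 64 KiB window are all made-up names and values for this sketch.

```cpp
#include <cstdint>

// Hypothetical categories for a 32-bit ExceptionCode, following the
// "8xxx / Cxxx / 0000" rule of thumb from the comment above.
enum class CodeKind { Zero, Win32Like, HresultLike, NtStatusLike, CppException, Implausible };

// The code shown by .exr for a C++ EH exception in the dump above.
constexpr std::uint32_t kCppExceptionCode = 0xE06D7363u;

constexpr CodeKind classifyExceptionCode(std::uint32_t code) noexcept
{
    if (code == 0) return CodeKind::Zero;                          // "looks like nothing"
    if (code == kCppExceptionCode) return CodeKind::CppException;  // the jackpot case
    if ((code >> 16) == 0) return CodeKind::Win32Like;             // 0000xxxx
    switch (code >> 28)
    {
        case 0x8: return CodeKind::HresultLike;   // may also be an NTSTATUS warning
        case 0xC: return CodeKind::NtStatusLike;
        default:  return CodeKind::Implausible;   // e.g. 5b8fe9c0 from Param 1
    }
}

// True when `candidate` is within `window` bytes of `recordAddress`.
// The default window is an arbitrary assumption for this sketch.
constexpr bool isNearRecord(std::uint64_t candidate,
                            std::uint64_t recordAddress,
                            std::uint64_t window = 0x10000) noexcept
{
    return candidate >= recordAddress ? (candidate - recordAddress) <= window
                                      : (recordAddress - candidate) <= window;
}
```

With the values from this dump, parameters 1, 4, and 6 pass `isNearRecord` against 0x00000066`5b8fd3d0, and only parameter 6 points at a record whose code classifies as `CppException`.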

  6. Find a Raymond Chen blog post that says how to dig this apart (https://devblogs.microsoft.com/oldnewthing/20100730-00/?p=13273) and follow the directions.

dd on Parameter 2

0:000> dd 00007ff882d41b88 l4
00007ff8`82d41b88  00000000 0005aefc 00000000 001f1ba8

dd on the 4th chunk + Parameter 3

0:000> dd 00007ff882b50000+001f1ba8 l2
00007ff8`82d41ba8  00000001 001f1b60

dd on the 2nd chunk + Parameter 3

0:000> dd 00007ff882b50000+001f1b60 l2
00007ff8`82d41b60  00000000 001f73d0

da on the 2nd chunk + Parameter 3 + 0x10

0:000> da 00007ff882b50000+001f73d0+10
00007ff8`82d473e0  ".?AUhresult_error@winrt@@"

OK so it's a ::winrt::hresult_error. Now let's try to see what's inside.
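For anyone who wants the dd/da chain above as code, here is a rough sketch of the same walk over a module image held in a byte buffer. It assumes the x64 layout used in this dump, where each pointer-like field is a 32-bit offset from the image base (Parameter 3). The function names are invented for this example, and it only looks at the first catchable type.

```cpp
#include <cstdint>

// Read a little-endian 32-bit value at byte offset `rva` of an image buffer.
constexpr std::uint32_t read32(const unsigned char* image, std::uint32_t rva) noexcept
{
    return static_cast<std::uint32_t>(image[rva])
         | (static_cast<std::uint32_t>(image[rva + 1]) << 8)
         | (static_cast<std::uint32_t>(image[rva + 2]) << 16)
         | (static_cast<std::uint32_t>(image[rva + 3]) << 24);
}

// Follow the same three hops as the dd commands above and return the
// offset (relative to the image base) of the decorated type name string.
constexpr std::uint32_t firstThrownTypeNameRva(const unsigned char* image,
                                               std::uint32_t throwInfoRva) noexcept
{
    // 4th dword of the throw info: offset of the catchable type array.
    const std::uint32_t catchableArrayRva = read32(image, throwInfoRva + 12);
    // 1st dword of the array is a count; the 2nd is the first entry's offset.
    const std::uint32_t firstCatchableRva = read32(image, catchableArrayRva + 4);
    // 2nd dword of the catchable type: offset of the type descriptor.
    const std::uint32_t typeDescriptorRva = read32(image, firstCatchableRva + 4);
    // The name string starts 0x10 bytes into the type descriptor on x64.
    return typeDescriptorRva + 0x10;
}

// Turn an image-relative offset into the address to hand to dd or da.
constexpr std::uint64_t rvaToVa(std::uint64_t imageBase, std::uint32_t rva) noexcept
{
    return imageBase + rva;
}
```

Plugging in this dump's numbers, `rvaToVa(0x00007ff882b50000, 0x001f73d0 + 0x10)` gives 0x00007ff8`82d473e0, the address where da printed the type name.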

  7. With type information in hand, dump the type on Parameter 1 (the pExceptionObject address).
0:000> dt 000000665b8fe960 winrt::hresult_error
WindowsTerminal!winrt::hresult_error
   +0x000 m_debug_reference : winrt::handle_type<winrt::impl::bstr_traits>
   +0x008 m_debug_magic    : 0xaabbccdd
   +0x00c m_code           : winrt::hresult
   +0x010 m_info           : winrt::com_ptr<winrt::impl::IRestrictedErrorInfo>

HEY-O. That feels almost reasonable.

Clicky through on m_code which will do

0:000> dx -r1 (*((WindowsTerminal!winrt::hresult *)0x665b8fe96c))
(*((WindowsTerminal!winrt::hresult *)0x665b8fe96c))                 [Type: winrt::hresult]
    [+0x000] value            : -2147023728 [Type: int]

0:000> ? 0n-2147023728
Evaluate expression: -2147023728 = ffffffff`80070490

0:000> !err 80070490
0x80070490 (FACILITY_WIN32 - Win32 Undecorated Error Codes): Element not found.

OK so 80070490. That's clearly an HRESULT. Lookup turns it into E_NOTFOUND.
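To make the arithmetic explicit, here is a small sketch of how the signed value the debugger printed maps onto the HRESULT fields. `HresultParts` and `decodeHresult` are names made up for this example; the facility mask mirrors the one used by the `HRESULT_FACILITY` macro.

```cpp
#include <cstdint>

// Hypothetical helper type: the pieces of an HRESULT we care about here.
struct HresultParts
{
    bool failure;            // top (severity) bit
    std::uint16_t facility;  // 7 is FACILITY_WIN32
    std::uint16_t code;      // low 16 bits; for FACILITY_WIN32, a Win32 error code
};

constexpr HresultParts decodeHresult(std::int32_t hr) noexcept
{
    // Reinterpret the signed value as raw bits before shifting.
    const std::uint32_t bits = static_cast<std::uint32_t>(hr);
    return HresultParts{
        (bits >> 31) != 0,
        static_cast<std::uint16_t>((bits >> 16) & 0x1FFF),
        static_cast<std::uint16_t>(bits & 0xFFFF)};
}
```

For -2147023728 (0x80070490) this yields failure = true, facility = 7, and code = 0x490, which is 1168 decimal, the Win32 "Element not found." error that !err reported.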

Clicky through on m_info which will do

0:000> dx -r1 (*((WindowsTerminal!winrt::com_ptr<winrt::impl::IRestrictedErrorInfo> *)0x665b8fe970))
(*((WindowsTerminal!winrt::com_ptr<winrt::impl::IRestrictedErrorInfo> *)0x665b8fe970))                 [Type: winrt::com_ptr<winrt::impl::IRestrictedErrorInfo>]
    [+0x000] m_ptr            : 0x2483bf54278 [Type: winrt::impl::IRestrictedErrorInfo *]

Clicky through on m_ptr which will do...

0:000> dx -r1 ((WindowsTerminal!winrt::impl::IRestrictedErrorInfo *)0x2483bf54278)
((WindowsTerminal!winrt::impl::IRestrictedErrorInfo *)0x2483bf54278)                 : 0x2483bf54270 [Type: CRestrictedError * (derived from winrt::impl::IRestrictedErrorInfo *)]
    [+0x000] Microsoft::WRL::RuntimeClass<Microsoft::WRL::RuntimeClassFlags<2>,Microsoft::WRL::ChainInterfaces<ICreateRestrictedErrorInfo3,ICreateRestrictedErrorInfo2,ICreateRestrictedErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfo,Microsoft::WRL::ChainInterfaces<IErrorInfoWithRestrictedPropagation,IErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfoContext,Microsoft::WRL::ChainInterfaces<ILanguageExceptionErrorInfo2,ILanguageExceptionErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfoTelemetry,IRestrictedErrorInfoInternal,IMarshal,IRestrictedErrorRpcMarshal> : ref count=2 [Type: 
Microsoft::WRL::RuntimeClass<Microsoft::WRL::RuntimeClassFlags<2>,Microsoft::WRL::ChainInterfaces<ICreateRestrictedErrorInfo3,ICreateRestrictedErrorInfo2,ICreateRestrictedErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfo,Microsoft::WRL::ChainInterfaces<IErrorInfoWithRestrictedPropagation,IErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfoContext,Microsoft::WRL::ChainInterfaces<ILanguageExceptionErrorInfo2,ILanguageExceptionErrorInfo,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil,Microsoft::WRL::Details::Nil>,IRestrictedErrorInfoTelemetry,IRestrictedErrorInfoInternal,IMarshal,IRestrictedErrorRpcMarshal>]
    [+0x050] _pszDescription  : 0x2483bec17b0 : "Element not found..." [Type: wchar_t *]
    [+0x058] _pszRestrictedDescription : 0x2483bec8590 : "Element not found..." [Type: wchar_t *]
    [+0x060] _pszCapabilitySid : 0x0 [Type: wchar_t *]
    [+0x068] _hrError         : 0x80070490 (Element not found.) [Type: HRESULT]
    [+0x070] _pszSectionName  : 0x0 [Type: wchar_t *]
    [+0x078] _hSection        : 0x0 [Type: void *]
    [+0x080] _dwVersion       : 0x10002 [Type: unsigned long]
    [=0x7ff8c70e6058] _szPrefix        : "RestrictedErrorObject-" [Type: wchar_t [0]]
    cMaxFramesCountInStackText : 0xf [Type: unsigned short]
    cOffsetBufferSize : 0x11 [Type: unsigned int]
    [+0x084] _cStackBackTrace : 0x29 [Type: unsigned short]
    [+0x088] _ppvStackBackTrace : 0x2483ab26690 [Type: void * *]
    [+0x090] _stackBackTracePointerSize : 0x8 [Type: unsigned short]
    [+0x098] _stowedExceptionInformation : 0x80070490 (Element not found.) [Type: _STOWED_EXCEPTION_INFORMATION_V2]
    [+0x0d0] _spLanguageException : {...} [Type: Microsoft::WRL::ComPtr<IUnknown>]
    [+0x0d8] _correlationId   : {5C6E72EA-C795-3658-2381-6B507ED4B09D} [Type: _GUID]
    [+0x0e8] _spPreviousCRestrictedError : {...} [Type: Microsoft::WRL::ComPtr<IRestrictedErrorInfo>]
    [+0x0f0] _spPropagationCRestrictedErrorHead : {...} [Type: Microsoft::WRL::ComPtr<IRestrictedErrorInfo>]
    [+0x0f8] _bIsTransformed  : 0 [Type: int]
    [+0x0fc] _bRestrictMarshalingData : 0 [Type: int]
    [+0x100] _numPropagations : 0x0 [Type: unsigned short]
    [+0x108] _pszSignature    : 0x2483be4fc80 : "cb4e986d262520cea7cccbd9e149900e" [Type: wchar_t *]
    [+0x110] m_pofProcessName : 0x2483beca150 : "WindowsTerminal.exe" [Type: wchar_t *]
    [+0x118] m_pofStackText   : 0x24833f09130 : "combase.dll{7211F699-80A6-CBCE-F336-18686CA7E916}+0xf4694..TerminalApp.dll{701E349C-364C-47CC-97F5-35EC1F45FEAD}+0x2e47b..TerminalApp.dll+0x2e39a..TerminalApp.dll+0x2e2e2..TerminalApp.dll+0xe626c..TerminalApp.dll+0x6c8b9..TerminalApp.dll+0x5b2cb..TerminalApp.dll+0x5acdb..Windows.UI.Xaml.dll{E4CA2B0B-5146-136F-DE5C-F21AF51E3290}+0x1a7cd3..Windows.UI.Xaml.dll+0x1a7b9a..Windows.UI.Xaml.dll+0x1ac2e8..Windows.UI.Xaml.dll+0x2e4285..Windows.UI.Xaml.dll+0x1acc09..Windows.UI.Xaml.dll+0x17404d..Wi... [Type: wchar_t *]
    [+0x120] m_stowedExceptionErorText [Type: wil::unique_any_t<wil::details::unique_storage<wil::details::resource_policy<unsigned short *,void * (__cdecl*)(void *),&LocalFree,wistd::integral_constant<unsigned __int64,0>,unsigned short *,unsigned short *,0,std::nullptr_t> > >]
    [+0x128] _snapshotHandle  : 0x0 [Type: HPSS__ *]
    [+0x130] _snapshotContext [Type: _CONTEXT]
    [+0x600] _snashpotThreadId : 0x0 [Type: unsigned long]
    [+0x608] _snapshotReturnAddress : 0x0 [Type: void *]
    [+0x610] _pSnapshotCleanupTimer : 0x0 [Type: _TP_TIMER *]
    [+0x618] _dwOriginatingProcessId : 0xe84 [Type: unsigned long]
    [+0x61c] _dwDestinationProcessId : 0x0 [Type: unsigned long]
    CURRENT_VERSION  : 0x10002 [Type: unsigned long]

And it looks like we are almost in business. Now to figure out how to decode this...

@DHowett DHowett added Priority-1 A description (P1) and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels May 27, 2021
@DHowett DHowett modified the milestones: Terminal v2.0, Terminal v1.10 May 27, 2021
@miniksa
Member

miniksa commented May 27, 2021

This particular line ended up a dead end. Doing dps on the property _ppvStackBackTrace (which is the same value as the one inside _stowedExceptionInformation) didn't reveal anything in the stack except the rethrowing of this error at the boundary between TerminalApp and TerminalConnection. Unfortunately, the original error of 0x80070490 (E_NOTFOUND) is generated inside TerminalConnection, so this felt like a dead end.

Except with @zadjii-msft identifying the Watson bucket 13cbe5b8-092a-28b2-3256-509530451824, I found crash dumps on the back end that contained additional trace data from the COM activation system. I asked @brialmsft internally and he helped crack it open and identify that somehow despite the application package full name being Microsoft.WindowsTerminalPreview_1.9.1445.0_x64__8wekyb3d8bbwe, the registration of the COM Server object in TerminalConnection ended up looking up Microsoft.WindowsTerminalPreview_1.8.1032.0_x64__8wekyb3d8bbwe and thus not finding what it needed. He also confirmed that this is where the E_NOTFOUND error is coming from.
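As an illustration of the mismatch described above, a quick sketch (C++17) that pulls the version field out of a package full name and compares two of them. It assumes the usual Name_Version_Architecture_ResourceId_PublisherId layout; `packageVersion` and `sameVersion` are hypothetical helpers, not part of any platform API.

```cpp
#include <string_view>

// Returns the Version field of a package full name, or an empty view if
// the name does not contain at least two underscores.
constexpr std::string_view packageVersion(std::string_view fullName) noexcept
{
    const auto first = fullName.find('_');
    if (first == std::string_view::npos) return {};
    const auto second = fullName.find('_', first + 1);
    if (second == std::string_view::npos) return {};
    return fullName.substr(first + 1, second - first - 1);
}

// True only when both names parse and carry the same version.
constexpr bool sameVersion(std::string_view a, std::string_view b) noexcept
{
    return !packageVersion(a).empty() && packageVersion(a) == packageVersion(b);
}
```

For the two names in this comment, the versions come out as 1.9.1445.0 and 1.8.1032.0, so `sameVersion` returns false, which is the situation that produced the E_NOTFOUND.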

This shouldn't have happened. But... we need to transfer this issue internally to have the platform teams figure out why. I will link this to #10243 as a main bug because there are several others related that look like the same version sharding cause.

@miniksa
Member

miniksa commented May 27, 2021

(In this bug, I will remove the fail fast, turning it into a log statement instead, and defapp just won't work until you start another session. Better than crashing but mildly unsatisfying. Also should only happen when new versions are pushed which isn't every day.)
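A rough sketch of the behavior described here, combined with the linked PR's note that the fail fast stays when COM started the process with `-Embedding`. This is not the actual Terminal code: it uses plain standard C++ instead of WIL, and `FailureAction`, `actionOnListenerFailure`, `tryStartInboundListener`, and the `startListener` callback are all stand-ins invented for illustration.

```cpp
#include <exception>
#include <functional>
#include <iostream>

// Hypothetical policy: what to do when the inbound listener cannot start.
enum class FailureAction { FailFast, LogAndContinue };

constexpr FailureAction actionOnListenerFailure(bool launchedByCom) noexcept
{
    // If COM activated us, dying quickly lets COM see the server go away
    // instead of waiting out its timeout. Otherwise a log entry is enough.
    return launchedByCom ? FailureAction::FailFast : FailureAction::LogAndContinue;
}

// `startListener` stands in for the real call that may throw.
// Returns true when the listener started, false when the failure was logged.
inline bool tryStartInboundListener(const std::function<void()>& startListener,
                                    bool launchedByCom)
{
    try
    {
        startListener();
        return true;
    }
    catch (const std::exception& e)
    {
        std::cerr << "inbound listener failed to start: " << e.what() << '\n';
    }
    catch (...)
    {
        // Covers exception types that do not derive from std::exception.
        std::cerr << "inbound listener failed to start: unknown exception\n";
    }

    if (actionOnListenerFailure(launchedByCom) == FailureAction::FailFast)
    {
        std::terminate();
    }
    return false;
}
```

A caller that gets `false` back carries on with a normal window; defapp handoff simply won't work for that session.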

@miniksa
Member

miniksa commented May 27, 2021

See internal MSFT:33501832

@ghost ghost added the In-PR This issue has a related PR label May 28, 2021
@ghost ghost closed this as completed in #10261 May 28, 2021
@ghost ghost removed the In-PR This issue has a related PR label May 28, 2021
ghost pushed a commit that referenced this issue May 28, 2021
…arch startup (#10261)

Stop startup crash by logging when monarch fails to register inbound connections, but still crash when COM attempted to start us

## References
- See also #10243 

## PR Checklist
* [x] Closes #10233
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA

## Detailed Description of the Pull Request / Additional comments
- This should stop the crash on launch until we can get the internal teams to resolve the catalog issue
- I left the COM -Embedding start fail fast though so it won't take forever to time out (as default timeout is 3-5 minutes). I will change that if it becomes necessary.

## Validation Steps Performed
- I basically have to guess at this one based on the crash dump and Watson logs because it happens sporadically when the platform messes up on us.
@ghost ghost added the Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. label May 28, 2021
DHowett pushed a commit that referenced this issue Jun 1, 2021
…arch startup (#10261)


(cherry picked from commit d8647e0)
@ghost

ghost commented Jul 14, 2021

🎉 This issue was addressed in #10261, which has now been successfully released as Windows Terminal v1.9.1942.0. 🎉

@ghost

ghost commented Jul 14, 2021

🎉 This issue was addressed in #10261, which has now been successfully released as Windows Terminal Preview v1.10.1933.0. 🎉

This issue was closed.