Interop/UnmanagedCallersOnly fails on OSX. #38189

Closed
sandreenko opened this issue Jun 20, 2020 · 5 comments · Fixed by #38254
Labels
area-Interop-coreclr blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' bug

@sandreenko
Contributor

See examples in https://dev.azure.com/dnceng/public/_build/results?buildId=697078&view=ms.vss-test-web.build-test-results-tab:

 Starting:    Interop.UnmanagedCallersOnly.XUnitWrapper
    Interop/UnmanagedCallersOnly/UnmanagedCallersOnlyTest/UnmanagedCallersOnlyTest.sh [FAIL]
      corerun(40162,0x7000034a6000) malloc: *** error for object 0x7ff51f42ac90: pointer being freed was not allocated
      *** set a breakpoint in malloc_error_break to debug
      /private/tmp/helix/working/AB440946/w/B9E409AE/e/Interop/UnmanagedCallersOnly/UnmanagedCallersOnlyTest/UnmanagedCallersOnlyTest.sh: line 317: 40162 Illegal instruction: 4  $LAUNCHER $ExePath "${CLRTestExecutionArguments[@]}"
      
      Return code:      1
      Raw output file:      /private/tmp/helix/working/AB440946/w/B9E409AE/e/Interop/UnmanagedCallersOnly/Reports/Interop.UnmanagedCallersOnly/UnmanagedCallersOnlyTest/UnmanagedCallersOnlyTest.output.txt
      Raw output:
@sandreenko sandreenko added bug blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' labels Jun 20, 2020
@sandreenko sandreenko added this to the 5.0.0 milestone Jun 20, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-Interop-coreclr untriaged New issue has not been triaged by the area owner labels Jun 20, 2020
@AaronRobinsonMSFT
Member

@sandreenko This seems to be an issue with a native thread created via std::thread entering the runtime and then being destroyed.

The test is as follows:

namespace
{
    struct ProxyCallContext
    {
        CALLBACKPROC CallbackProc;
        int N;
        int Result;
    };

    void ProxyCall(ProxyCallContext* cxt)
    {
        cxt->Result = CallManagedProc(cxt->CallbackProc, cxt->N);
    } // The A/V occurs as the thread exits; cxt has already been updated correctly.
}

extern "C" DLL_EXPORT int STDMETHODCALLTYPE CallManagedProcOnNewThread(CALLBACKPROC pCallbackProc, int n)
{
    ProxyCallContext cxt{ pCallbackProc, n, 0 };
    std::thread newThreadToRuntime{ ProxyCall, &cxt };

    // Wait for new thread to complete
    newThreadToRuntime.join();

    return cxt.Result;
}

I think the new native thread entering the runtime is the issue here, but I don't know why as of yet.

@AaronRobinsonMSFT
Member

I honestly have no idea why this is failing, but it is (a) recent and (b) doesn't happen when I use pthreads manually. Therefore, I am going to use pthreads manually on non-Windows platforms.
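
For illustration, here is a minimal sketch of what the pthreads version of the helper might look like. It is not the code from the actual fix. The `CALLBACKPROC` typedef and the `CallManagedProc` body are hypothetical stand-ins so the snippet compiles on its own, and the `extern "C" DLL_EXPORT` / `STDMETHODCALLTYPE` decorations from the test are omitted.

```cpp
#include <pthread.h>

// Hypothetical stand-ins for the test's real declarations (assumptions).
typedef int (*CALLBACKPROC)(int);

static int CallManagedProc(CALLBACKPROC callback, int n)
{
    return callback(n);
}

namespace
{
    struct ProxyCallContext
    {
        CALLBACKPROC CallbackProc;
        int N;
        int Result;
    };

    // pthread entry points take and return void*, so the context is cast back.
    void* ProxyCall(void* arg)
    {
        ProxyCallContext* cxt = static_cast<ProxyCallContext*>(arg);
        cxt->Result = CallManagedProc(cxt->CallbackProc, cxt->N);
        return nullptr;
    }
}

int CallManagedProcOnNewThread(CALLBACKPROC pCallbackProc, int n)
{
    ProxyCallContext cxt{ pCallbackProc, n, 0 };

    // The thread is created explicitly, so no std::thread bookkeeping
    // object is allocated on our behalf.
    pthread_t newThreadToRuntime;
    if (pthread_create(&newThreadToRuntime, nullptr, ProxyCall, &cxt) != 0)
        return -1; // Sketch only: -1 is also a possible callback result.

    // Wait for new thread to complete
    pthread_join(newThreadToRuntime, nullptr);

    return cxt.Result;
}
```

The context lives on the caller's stack and is handed to the thread as a raw pointer, which is safe here only because the caller joins before returning.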

@janvorli If you are feeling bored and interested in picking at some weird behavior, this is something for you :)

@AaronRobinsonMSFT
Member

I finally found the issue. During the std::thread constructor call, an inner data structure is allocated via the call stack below, which uses the operator new provided by coreclr. During cleanup, a different delete is used, one that calls the system free. This inner data structure exists merely for convenience within the std::thread API. I am confident in the fix since it is now explicit instead of implied, and since I know the underlying cause.

Allocation:

operator new(unsigned long) (/runtime/src/coreclr/src/utilcode/clrhost_nodependencies.cpp:322)
std::__1::thread::thread<void* (&)(void*), (anonymous namespace)::ProxyCallContext*, void>(void* (&)(void*), (anonymous namespace)::ProxyCallContext*&&) (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/thread:362)
std::__1::thread::thread<void* (&)(void*), (anonymous namespace)::ProxyCallContext*, void>(void* (&)(void*), (anonymous namespace)::ProxyCallContext*&&) (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/thread:373)
::CallManagedProcOnNewThread(CALLBACKPROC, int) (/runtime/src/coreclr/tests/src/Interop/UnmanagedCallersOnly/UnmanagedCallersOnlyDll.cpp:52)
1093B6BD0 (@1093b6bd0..1093b6c0c:3)
1097E0171 (@1097e0171..1097e01ba:3)
1093B72C3 (@1093b72c3..1093b7316:3)
1093AA66C (@1093aa66c..1093aa6ca:3)
CallDescrWorkerInternal (/runtime/src/coreclr/src/vm/amd64/calldescrworkeramd64.S:98)
CallDescrWorkerWithHandler(CallDescrData*, int) (/runtime/src/coreclr/src/vm/callhelpers.cpp:69)
MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) (/runtime/src/coreclr/src/vm/callhelpers.cpp:545)
MethodDescCallSite::Call_RetArgSlot(unsigned long const*) (/runtime/src/coreclr/src/vm/callhelpers.h:459)
RunMainInternal(Param*) (/runtime/src/coreclr/src/vm/assembly.cpp:1456)
RunMain(MethodDesc*, short, int*, REF<PtrArray>*)::$_1::operator()(Param*) const::'lambda'(Param*)::operator()(Param*) const (/runtime/src/coreclr/src/vm/assembly.cpp:1524)
RunMain(MethodDesc*, short, int*, REF<PtrArray>*)::$_1::operator()(Param*) const (/runtime/src/coreclr/src/vm/assembly.cpp:1526)
RunMain(MethodDesc*, short, int*, REF<PtrArray>*) (/runtime/src/coreclr/src/vm/assembly.cpp:1526)
Assembly::ExecuteMainMethod(REF<PtrArray>*, int) (/runtime/src/coreclr/src/vm/assembly.cpp:1636)
CorHost2::ExecuteAssembly(unsigned int, char16_t const*, int, char16_t const**, unsigned int*) (/runtime/src/coreclr/src/vm/corhost.cpp:385)
::coreclr_execute_assembly(void *, unsigned int, int, const char **, const char *, unsigned int *) (/runtime/src/coreclr/src/dlls/mscoree/unixinterface.cpp:415)
ExecuteManagedAssembly(char const*, char const*, char const*, int, char const**) (/runtime/src/coreclr/src/hosts/unixcoreruncommon/coreruncommon.cpp:507)

Deallocation:

free (@free:4)
_pthread_tsd_cleanup (@_pthread_tsd_cleanup:120)
_pthread_exit (@_pthread_exit:26)
_pthread_start (@_pthread_override_qos_class_end_direct:3)
thread_start (@thread_start:8)
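
To illustrate why pairing one allocator's allocation with another allocator's free can fail, here is a small sketch. `PrivateAlloc`, `PrivateFree`, and `UnderlyingBlock` are hypothetical names, and the header-in-front layout is only an assumption about how a private allocator could behave; it is not a description of CoreCLR's actual operator new.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical private allocator (assumption, not CoreCLR's implementation):
// it reserves a small header in front of every block it hands out.
constexpr std::size_t HeaderSize = alignof(std::max_align_t);

void* PrivateAlloc(std::size_t size)
{
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(HeaderSize + size));
    if (raw == nullptr)
        return nullptr;
    return raw + HeaderSize; // The caller never sees the pointer malloc returned.
}

// Recovers the pointer that malloc originally returned.
void* UnderlyingBlock(void* p)
{
    return static_cast<unsigned char*>(p) - HeaderSize;
}

// The matching release function undoes the offset before calling free.
void PrivateFree(void* p)
{
    if (p != nullptr)
        std::free(UnderlyingBlock(p));
}
```

With an allocator like this, passing the pointer from `PrivateAlloc` straight to the system `free` hands it an address it never allocated, which is the kind of mismatch that produces a "pointer being freed was not allocated" error.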

@jkotas
Member

jkotas commented Jun 23, 2020

Does this mean that third-party code that happens to use std::thread is going to hit the same problem?

We are supposed to link the CoreCLR overridden operator new with private visibility, so that it is used just by the CoreCLR code and nothing else. Maybe this got broken again?

@AaronRobinsonMSFT
Member

> Does this mean that third-party code that happens to use std::thread is going to hit the same problem?

Probably, since the test doesn't link against anything special here.

> We are supposed to link the CoreCLR overridden operator new with private visibility, so that it is used just by the CoreCLR code and nothing else. Maybe this got broken again?

Boo. That is unfortunate. Let me look into this some more then. If that is indeed the case, it might make sense to leave this code as is, since it is helpful for finding that regression.

@AaronRobinsonMSFT AaronRobinsonMSFT removed the untriaged New issue has not been triaged by the area owner label Jun 23, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020