Assertion failed 'retExpr->gtOper == GT_RETURN' caused by SuppressGCTransition #13654

Closed
jkotas opened this issue Oct 26, 2019 · 17 comments · Fixed by dotnet/coreclr#27473
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI bug

@jkotas
Member

jkotas commented Oct 26, 2019

Starting:    System.Runtime.Extensions.Tests (parallel test collections = on, max threads = 2)

Assert failure(PID 29805 [0x0000746d], Thread: 29824 [0x7480]): Assertion failed 'retExpr->gtOper == GT_RETURN' in 'System.Diagnostics.Tests.StopwatchTests:GetTimestamp()' (IL size 26)

    File: /__w/2/s/src/jit/morph.cpp Line: 18653
    Image: /home/helixbot/work/409fcbf4-4e6f-4b59-a1b8-38c6da3609f0/Payload/dotnet
@jkotas
Member Author

jkotas commented Oct 26, 2019

@AaronRobinsonMSFT We should hold off adding SuppressGCTransition in any more places until this is fixed. (I expect that the problem is affecting all platforms. We just got lucky to not hit it anywhere else yet.)

@AaronRobinsonMSFT
Member

@jkotas Agreed.

@AaronRobinsonMSFT AaronRobinsonMSFT self-assigned this Oct 26, 2019
tannergooding referenced this issue in dotnet/coreclr Oct 26, 2019
* Update dependencies from https://github.com/dotnet/corefx build 20191024.13

- Microsoft.NETCore.Platforms - 5.0.0-alpha.1.19524.13
- Microsoft.Private.CoreFx.NETCoreApp - 5.0.0-alpha.1.19524.13

* Update dependencies from https://github.com/dotnet/corefx build 20191024.14

- Microsoft.NETCore.Platforms - 5.0.0-alpha.1.19524.14
- Microsoft.Private.CoreFx.NETCoreApp - 5.0.0-alpha.1.19524.14

* Update dependencies from https://github.com/dotnet/corefx build 20191025.6

- Microsoft.NETCore.Platforms - 5.0.0-alpha.1.19525.6
- Microsoft.Private.CoreFx.NETCoreApp - 5.0.0-alpha.1.19525.6

* Workaround https://github.com/dotnet/coreclr/issues/27465
Dotnet-GitSync-Bot referenced this issue in Dotnet-GitSync-Bot/corefx Oct 26, 2019
Dotnet-GitSync-Bot referenced this issue in Dotnet-GitSync-Bot/corert Oct 26, 2019
@AaronRobinsonMSFT
Member

AaronRobinsonMSFT commented Oct 26, 2019

The issue doesn't repro on Windows x64. I am using the following code and setting COMPlus_TieredCompilation=0 to try to reproduce it. The code generated on Windows is unusual, but the JIT doesn't assert. I will try Linux x64 or macOS next and, if I still don't see the issue, fall back to the CoreFX test itself.

Managed

using System;
using System.Threading;
using System.Runtime.InteropServices;

namespace System.Runtime.InteropServices
{
    public class SuppressGCTransitionAttribute : Attribute
    {
        public SuppressGCTransitionAttribute() { }
    }
}

partial class SW
{
    public static long GetTimestamp()
    {
        return QueryPerformanceCounter();
    }
}

partial class SW
{
    private static long QueryPerformanceCounter()
    {
        return (long)NativeLib.GetTimestamp();
    }

    static class NativeLib
    {
        public const string Path = @"NativeLib.dll";

        [DllImport(Path, EntryPoint = "SystemNative_GetTimestamp", ExactSpelling = true)]
        [SuppressGCTransition]
        internal static extern ulong GetTimestamp();

    }
}

namespace PInvokeTesting
{
    public unsafe class Program
    {
        private static readonly ManualResetEvent s_sleepEvent = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            long ts1 = SW.GetTimestamp();
            Sleep();
            long ts2 = SW.GetTimestamp();
            Assert.NotEqual(ts1, ts2);
        }

        private static void Sleep(int milliseconds = 1)
        {
            s_sleepEvent.WaitOne(milliseconds);
        }
    }

    public static class Assert
    {
        static public void NotEqual(long a, long b)
        {
            if (a == b)
            {
                throw new Exception("Bad");
            }
        }
    }
}

Native

#include <atomic>
#include <cstdint>

#define EXPORT extern "C" __declspec(dllexport)
#define CALLCONV __stdcall

namespace
{
    std::atomic<uint64_t> _n{ 0 };
}

EXPORT
uint64_t CALLCONV SystemNative_GetTimestamp()
{
    return (++_n);
}
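
The native stub above relies on `__declspec(dllexport)` and `__stdcall`, which are specific to Windows toolchains. For trying the same repro on Linux or macOS, a portable variant might look like the sketch below. The `#else` branch (a GCC/Clang visibility attribute and an empty `CALLCONV`) is an assumption added for illustration and is not part of the original comment.

```cpp
#include <atomic>
#include <cstdint>

// Assumed portable export macros; only the Windows branch comes from the
// original stub, the other branch is a hypothetical GCC/Clang equivalent.
#if defined(_WIN32)
  #define EXPORT extern "C" __declspec(dllexport)
  #define CALLCONV __stdcall
#else
  #define EXPORT extern "C" __attribute__((visibility("default")))
  #define CALLCONV
#endif

namespace
{
    // Monotonic counter standing in for a real timestamp source.
    std::atomic<uint64_t> _n{ 0 };
}

// Returns a strictly increasing value on every call.
EXPORT
uint64_t CALLCONV SystemNative_GetTimestamp()
{
    return (++_n);
}
```

On Linux this could be built as a shared library with something like `g++ -shared -fPIC`, in which case the `Path` constant on the managed side would need to name the resulting library instead of `NativeLib.dll`.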

Dotnet-GitSync-Bot referenced this issue in Dotnet-GitSync-Bot/mono Oct 26, 2019
stephentoub referenced this issue in dotnet/corefx Oct 26, 2019
jkotas referenced this issue in dotnet/corert Oct 26, 2019
@AaronRobinsonMSFT
Member

I am also unable to reproduce this on macOS using the above source. I am going to try Linux next, but I am growing less confident that the code above captures the issue; alternatively, this may be specific to the Linux platform.

@jkotas
Member Author

jkotas commented Oct 27, 2019

You may need to disable tiered compilation to reproduce this.

@AaronRobinsonMSFT
Member

@jkotas I have been passing COMPlus_TieredCompilation=0 in all scenarios and verifying it via JIT dump. Is there another way to accomplish/verify this?

@tannergooding
Member

I reproduced it locally on Linux by running the CoreFX test directly.

It required a checked CoreCLR and a Release build of CoreFX. I disabled tiered compilation by exporting an environment variable in bash.

@AaronRobinsonMSFT
Member

@tannergooding Yeah, that is what I am doing now. I'm not very experienced with CoreFX testing and find it hard to do quickly, so, alas, I am building everything.

If you still have a local repo, is there any way you could provide the JIT dump for the test? (For example, COMPlus_JitDump=GetTimestamp. I don't think that will be exactly what I want because the name is ambiguous, but that is the spirit.)

@tannergooding
Member

I won't be able to tonight, but I can tomorrow morning.

@AaronRobinsonMSFT
Member

I just don't understand how CoreFX tests work. I am following the directions here with very little luck or progress. I built from the root first with ./build.sh -restore -build -buildtests -c Release -f netcoreapp /p:CoreCLROverridePath=[HOME]/source/coreclr/bin/Product/Linux.x64.Checked but I am unable to figure out how to run just a single test.

~/source/corefx/src/System.Runtime.Extensions/tests$ ~/source/corefx/.dotnet/dotnet build -restore -c Release -f netcoreapp /t:RebuildAndTest /p:XunitMethodName=System.Diagnostics.Tests.StopwatchTests:GetTimestamp
...
[HOME]/source/corefx/.dotnet/sdk/3.0.100/Sdks/Microsoft.NET.Sdk/targets/Microsoft.NET.TargetFrameworkInference.targets(93,5): error NETSDK1013: The TargetFramework value '' was not recognized. It may be misspelled. If not, then the TargetFrameworkIdentifier and/or TargetFrameworkVersion properties must be specified explicitly. [[HOME]/source/corefx/src/System.Runtime.Extensions/tests/System.Runtime.Extensions.Tests.csproj]
...

@tannergooding
Member

I just manually invoke the dotnet that lives under artifacts/bin/testhost on the xunit.console.dll that lives under the appropriate test folder (artifacts/bin/System.Runtime.Extensions.Tests), specifying the runtime config JSON, the test DLL, and -method Namespace.Class.Method.

@tannergooding
Member

(Paths are from memory, as I don't have my computer handy, but they should be roughly correct.)

Also worth noting that the failure log gives a full command line for running all tests for a given test.dll (it should be something like: To run the test manually: pushd .; dotnet <lots of args>; popd).

@tannergooding
Member

tannergooding commented Oct 27, 2019

tagoo@GOODING-UBUNTU:~/source/repos/corefx/artifacts/bin/System.Runtime.Extensions.Tests/netcoreapp-Unix-Release$ ~/source/repos/corefx/artifacts/bin/testhost/netcoreapp-Linux-Release-x64/dotnet exec --runtimeconfig System.Runtime.Extensions.Tests.runtimeconfig.json xunit.console.dll System.Runtime.Extensions.Tests.dll -method System.Diagnostics.Tests.StopwatchTests.GetTimestamp > ./JitDump.txt

Assert failure(PID 4430 [0x0000114e], Thread: 4445 [0x115d]): Assertion failed 'retExpr->gtOper == GT_RETURN' in 'System.Diagnostics.Tests.StopwatchTests:GetTimestamp()' (IL size 26)

    File: /home/tagoo/source/repos/coreclr/src/jit/morph.cpp Line: 18653
    Image: /home/tagoo/source/repos/corefx/artifacts/bin/testhost/netcoreapp-Linux-Release-x64/dotnet

Aborted (core dumped)

JitDump.txt

marek-safar referenced this issue in mono/mono Oct 27, 2019
@AaronRobinsonMSFT
Member

AaronRobinsonMSFT commented Oct 27, 2019

[EDIT] Now have a repro on Windows x64

This is an interesting issue that I am going to need some help with from @dotnet/jit-contrib.

If I remove the XUnit.Assert.NotEqual() call, or remove RunTest() and place its contents directly into Main(), the issue goes away. I am still trying to figure out why this happens, but I can reproduce the failure on Linux using the following example:

using System;
using System.Diagnostics;
using System.Threading;

public class Program
{
    static void Main(string[] args)
    {
        RunTest();
    }
    static void RunTest()
    {
        long ts1 = Stopwatch.GetTimestamp();
        long ts2 = Stopwatch.GetTimestamp();
        NotEqual(ts1, ts2);
    }
    static public void NotEqual(long a, long b)
    {
        Console.WriteLine("NotEqual");
    }
}

Windows repro

using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime.InteropServices;

namespace System.Runtime.InteropServices
{
    public class SuppressGCTransitionAttribute : Attribute
    {
        public SuppressGCTransitionAttribute() { }
    }
}
static class NativeLib
{
    public const string Path = @"NativeLib.dll";
    [DllImport(Path, EntryPoint = "SystemNative_GetTimestamp")]
    [SuppressGCTransition]
    internal static extern ulong GetTimestamp();
}
namespace PInvokeTesting
{
    public unsafe class Program
    {
        static void Main(string[] args)
        {
            RunTest();
        }
        static void RunTest()
        {
            long ts1 = (long)NativeLib.GetTimestamp();
            long ts2 = (long)NativeLib.GetTimestamp();
            NotEqual(ts1, ts2);
        }
        static public void NotEqual(long a, long b)
        {
            Console.WriteLine("NotEqual");
        }
    }
}

@AaronRobinsonMSFT
Member

I think I see what is going on here and it is not great. The issue is fgCreateGCPoll(). This function just wasn't made to do what is desired for the SuppressGCTransition scenario. The generated code is acceptable for many scenarios, particularly loops, but in some circumstances (e.g. a sequence of unmanaged calls with suppress GC transition) the generated code is definitely not what is desired. I am going to need to dig into this function and probably rewrite the insertion of a GC_POLL call.
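
As a rough illustration of the shape being discussed, here is a conceptual model of a sequence of unmanaged calls with a GC poll after each one. It is a sketch only and is not the JIT's code: every name in it (`g_suspendRequested`, `PollSlowPath`, `GcPoll`, `NativeTimestamp`, `TwoCalls`) is hypothetical, and it assumes a poll that is a cheap flag check with a slow path taken only when the runtime has asked threads to stop.

```cpp
#include <atomic>
#include <cstdint>

// Conceptual model only; none of these names are CoreCLR APIs.
namespace model
{
    // Set by the (imaginary) runtime when it wants threads to stop for a GC.
    std::atomic<bool> g_suspendRequested{ false };
    // Counts how many times the slow path actually ran.
    int g_pollsTaken = 0;

    // Slow path: where a thread would park so a GC could proceed.
    // Here it only records the visit and clears the request.
    void PollSlowPath()
    {
        ++g_pollsTaken;
        g_suspendRequested.store(false, std::memory_order_release);
    }

    // Inline poll: a flag check, with the slow path taken only on request.
    inline void GcPoll()
    {
        if (g_suspendRequested.load(std::memory_order_acquire))
        {
            PollSlowPath();
        }
    }

    // Stand-in for an unmanaged call made without a GC transition.
    uint64_t NativeTimestamp()
    {
        static uint64_t n = 0;
        return ++n;
    }

    // Two back-to-back calls, each followed by its own poll, mirroring the
    // "sequence of unmanaged calls" case from the comment above.
    uint64_t TwoCalls()
    {
        uint64_t ts1 = NativeTimestamp();
        GcPoll();
        uint64_t ts2 = NativeTimestamp();
        GcPoll();
        return ts2 - ts1;
    }
}
```

In this model each poll introduces a branch in the middle of otherwise straight-line code, which hints at why inserting polls into a block that ends in a return needs care; how fgCreateGCPoll() actually splits blocks is not shown here.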

/cc @jkotas @dotnet/jit-contrib @briansull

@AaronRobinsonMSFT
Member

See dotnet/coreclr#27473

@AaronRobinsonMSFT AaronRobinsonMSFT changed the title from "[linux/x64] Assertion failed 'retExpr->gtOper == GT_RETURN' caused by SuppressGCTransition" to "Assertion failed 'retExpr->gtOper == GT_RETURN' caused by SuppressGCTransition" Oct 27, 2019
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020