When an assembly is marked with ParallelScope.None and there are Parallelizable tests, NUnit hangs #2261

Closed
danpowell88 opened this Issue Jun 16, 2017 · 8 comments

5 participants
danpowell88 commented Jun 16, 2017

Using the following packages:

<packages>
  <package id="NUnit" version="3.7.1" targetFramework="net452" />
  <package id="NUnit.Console" version="3.6.1" targetFramework="net452" />
  <package id="NUnit.ConsoleRunner" version="3.6.1" targetFramework="net452" />
  <package id="NUnit.Extension.NUnitProjectLoader" version="3.5.0" targetFramework="net452" />
  <package id="NUnit.Extension.NUnitV2Driver" version="3.6.0" targetFramework="net452" />
  <package id="NUnit.Extension.NUnitV2ResultWriter" version="3.5.0" targetFramework="net452" />
  <package id="NUnit.Extension.TeamCityEventListener" version="1.0.2" targetFramework="net452" />
  <package id="NUnit.Extension.VSProjectLoader" version="3.5.0" targetFramework="net452" />
</packages>

in a class library targeting .NET Framework 4.5.2

If I include
[assembly: Parallelizable(ParallelScope.None)]

in my AssemblyInfo.cs file and then have tests like the following

using NUnit.Framework;

namespace ClassLibrary1
{

    namespace ClassLibrary1.Tests
    {
        [TestFixture]
        [Parallelizable(ParallelScope.Children)]
        public class Tests
        {
            [Test]
            public void Test1()
            {
                Assert.Fail();
            }

            [Test]
            public void Test2()
            {
                Assert.Fail();
            }
        }
    }
}

The test runner hangs and will not return. As soon as I remove the assembly attribute, the tests run fine.

Repro solution here https://ufile.io/1zbgk

danpowell88 commented Jun 16, 2017

This may be related also.

If I change the tests to be parallelizable at the Fixture level, while keeping None on the Assembly, the tests do not hang and do return a result, but my ReSharper test runner tells me:

All unit tests finished, but test process still running. Abort?
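
For clarity, one plausible reading of "Fixture level" above is moving the Parallelizable scope onto the fixture itself. A sketch based on the repro project, not verified against the uploaded solution:

```csharp
// AssemblyInfo.cs -- the assembly still opts out of parallelism
[assembly: Parallelizable(ParallelScope.None)]

// Tests.cs -- parallelism declared for the fixture itself rather than its children
using NUnit.Framework;

namespace ClassLibrary1.Tests
{
    [TestFixture]
    [Parallelizable(ParallelScope.Self)] // was ParallelScope.Children
    public class Tests
    {
        [Test]
        public void Test1() => Assert.Fail();
    }
}
```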

appel1 commented Jun 16, 2017

It looks like the WorkItemQueue is restored to an empty queue after the parallel work items have been queued, so they are lost. This causes NUnit to get stuck in ParallelWorkItemDispatcher.OnEndOfShift, since State is not Complete but there are no shifts to start because all queues are empty.

Member

CharliePoole commented Jun 16, 2017

The problem is that the queue state is restored without having previously been saved. It happens because we save the queue when the test worker becomes busy after dequeuing a non-parallel item. In this case, it encounters a non-parallel item while running a higher-level non-parallel item and so runs it directly on the same thread, without enqueuing or dequeuing anything.
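
The unbalanced save/restore can be illustrated in the abstract (hypothetical types and names; this is not NUnit's actual code):

```csharp
using System.Collections.Generic;

// Hypothetical sketch: a queue whose contents can be checkpointed.
// If Restore() runs without a matching Save(), pending items are lost,
// which mirrors the hang described above.
class SaveRestoreQueue
{
    private readonly Stack<Queue<string>> _saved = new Stack<Queue<string>>();
    private Queue<string> _items = new Queue<string>();

    public void Enqueue(string item) => _items.Enqueue(item);
    public int Count => _items.Count;

    public void Save()
    {
        _saved.Push(_items);
        _items = new Queue<string>(); // work continues against a fresh queue
    }

    public void Restore()
    {
        // With no prior Save, this silently discards everything that was
        // enqueued since the last (nonexistent) checkpoint.
        _items = _saved.Count > 0 ? _saved.Pop() : new Queue<string>();
    }
}
```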

Member

CharliePoole commented Jun 16, 2017

The workaround in this case is to eliminate the use of ParallelScope.None on the assembly. It's generally not needed, since non-parallel execution is the default, but this bug is a side effect of using it.
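
Concretely, the workaround amounts to deleting one line from AssemblyInfo.cs:

```csharp
// AssemblyInfo.cs

// Before (triggers the hang when parallel children are present):
// [assembly: Parallelizable(ParallelScope.None)]

// After: no assembly-level Parallelizable attribute at all. Non-parallel
// execution is already the default, so test behavior is unchanged while
// the bug is avoided.
```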

Member

CharliePoole commented Jun 17, 2017

Notes toward a solution...

  1. Any Non-Parallel TestFixture or SetUpFixture may have Parallel children, except for TestFixtures that specify single-threaded. When that happens, we want to be running from an isolated set of queues.

  2. At the time of starting the fixture, we don't know if it will have parallel children, so initially we want to deal with it for all such fixtures. We may be able to optimize this later on. (Optimization would require TestWorkers to know more than they now do.)

  3. Fixtures are started by the TestWorker, upon pulling it from the queue. Since any worker is running only parallel or non-parallel tests, the dispatcher can tell when a non-p fixture is being started by handling the Busy event from non-p workers only. It creates the isolated queues at that point.

  4. When a non-p Assembly or SetUpFixture starts a non-p TestFixture, it currently doesn't enqueue the work item but directly executes it on the same thread, as an optimization. However, this means that the dispatcher does not know such a fixture is starting.

  5. OTOH, the dispatcher always knows when a shift is ending. At the end of a shift, it always drops one level of isolation. This causes a potential mismatch between entering and exiting isolation levels.

  6. One possible solution is to always enqueue child items, forgoing the optimization of item (4). We could also limit the optimization to test cases, since that's what it is really aimed at. Having non-p cases run quickly is important because it takes work by the user to modify a fixture in order to run test cases in parallel.

  7. Another possible solution is to have the CompositeWorkItem know when an isolated set of queues is needed and to ask the dispatcher to do it. This means that the CompositeWorkItem will need to be aware of a number of things it is not now aware of. I'll have to look at this more closely.

  8. The hardest part of this is testing it. Since the errors involve a race condition, it takes multiple runs of a test assembly to detect it, and it requires multiple test assemblies with different mixes of attributes to do it thoroughly. For now, I'll continue to do this on an ad hoc basis.

@nunit/framework-team Any thoughts on the above would be useful as I move ahead.

Member

rprouse commented Jun 17, 2017

@CharliePoole, you know the code better than me, but removing the optimization in step 4 seems to be the conceptually correct fix. Option 7 adds additional complexity. The optimization seems reasonable and I would have probably done the same thing, but it feels like it disconnects those test runs from the rest of the system. Do we have any numbers on the increase in test time for queuing non-p test fixtures rather than running them on the same thread?

Member

CharliePoole commented Jun 17, 2017

I wish we did have some numbers. I've observed improvements, but it's anecdotal and also, I believe, masked by the Sleep(1) issue that is discussed elsewhere.

I'd like to see a set of integration tests, completely separate from our unit tests, which ran a bunch of assemblies with instrumentation.

Contributor

jnm2 commented Jun 17, 2017

I think the same as @rprouse.

I'd like to see a set of integration tests, completely separate from our unit tests, which ran a bunch of assemblies with instrumentation.

Maybe we should track this with an issue and add to it each time we think of another scenario where we need integration tests?
