Skip to content

Commit

Permalink
Threading Diagnoser (#1227)
Browse files Browse the repository at this point in the history
* implement tests

* implement the diagnoser

* make the tests pass

* make it work for CoreRT

* add console line argument and new attribute

* make it a warning, not error (to support -f netcoreapp2.x --runtimes netcoreapp3.0)

* add docs

* update release docs

* avoid "No data found" exception for dotnet test -f netcoreapp2.1

* code review fixes
  • Loading branch information
adamsitnik committed Aug 21, 2019
1 parent ea0468f commit 4080abd
Show file tree
Hide file tree
Showing 31 changed files with 485 additions and 22 deletions.
20 changes: 20 additions & 0 deletions docs/_changelog/header/v0.11.6.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@
[#1214](https://github.com/dotnet/BenchmarkDotNet/pull/1214),
[#1218](https://github.com/dotnet/BenchmarkDotNet/pull/1218),
[#1219](https://github.com/dotnet/BenchmarkDotNet/pull/1219)
* **ThreadingDiagnoser** - it uses [new APIs](https://github.com/dotnet/corefx/issues/35500) exposed in .NET Core 3.0 to report `Completed Work Items` and `Lock Contentions`.
[#1154](https://github.com/dotnet/BenchmarkDotNet/issues/1154),
[#1227](https://github.com/dotnet/BenchmarkDotNet/pull/1227)
* MemoryDiagnoser includes memory allocated by **all Threads** that were live during benchmark execution: a new GC API was exposed in .NET Core 3.0 preview6+. It allows to get the number of allocated bytes for all threads.
[#1155](https://github.com/dotnet/BenchmarkDotNet/pull/1155),
[#1153](https://github.com/dotnet/BenchmarkDotNet/issues/1153),
Expand Down Expand Up @@ -122,6 +125,23 @@ You should redesign your benchmark and remove the side-effects. You can use `Ope

In this case, you should try to reduce the number of invocation, by adding `[ShortRunJob]` attribute or using `Job.Short` for custom configuration.

## ThreadingDiagnoser

The `ThreadingDiagnoser` uses [new APIs](https://github.com/dotnet/corefx/issues/35500) exposed in .NET Core 3.0 to report:

* Completed Work Items: The number of work items that have been processed in ThreadPool (per single operation)
* Lock Contentions: The number of times there **was contention** upon trying to take a Monitor's lock (per single operation)

### Source code

[!code-csharp[IntroThreadingDiagnoser.cs](../../../samples/BenchmarkDotNet.Samples/IntroThreadingDiagnoser.cs)]

### Output

| Method | Mean | StdDev | Median | Completed Work Items | Lock Contentions |
|-------------------- |--------------:|-----------:|--------------:|---------------------:|-----------------:|
| CompleteOneWorkItem | 8,073.5519 ns | 69.7261 ns | 8,111.6074 ns | 1.0000 | - |

## Multiple frameworks support


Expand Down
7 changes: 7 additions & 0 deletions docs/articles/configs/diagnosers.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ The current Diagnosers are:
- Native Memory Profiler (`NativeMemoryProfiler`)
It uses `EtwProfiler` to profile the code using ETW and adds the extra columns `Allocated native memory` and `Native memory leak`.
Please see Wojciech Nagórski's [blog post](https://wojciechnagorski.com/2019/08/analyzing-native-memory-allocation-with-benchmarkdotnet/) for all the details.
- Threading Diagnoser (`ThreadingDiagnoser`) - .NET Core 3.0+ diagnoser that reports some Threading statistics.

## Usage

Expand All @@ -54,6 +55,7 @@ private class Config : ManualConfig
Add(MemoryDiagnoser.Default);
Add(new InliningDiagnoser());
Add(new EtwProfiler());
Add(ThreadingDiagnoser.Default);
}
}
```
Expand All @@ -66,6 +68,7 @@ You can also use one of the following attributes (apply it on a class that conta
[EtwProfiler]
[ConcurrencyVisualizerProfiler]
[NativeMemoryProfiler]
[ThreadingDiagnoser]
```

In BenchmarkDotNet, 1kB = 1024B, 1MB = 1024kB, and so on. The column Gen X means number of GC collections per 1000 operations for that generation.
Expand All @@ -77,6 +80,8 @@ In BenchmarkDotNet, 1kB = 1024B, 1MB = 1024kB, and so on. The column Gen X means
* Mono currently [does not](http://stackoverflow.com/questions/40234948/how-to-get-the-number-of-allocated-bytes-in-mono) expose any api to get the number of allocated bytes. That's why our Mono users will get `?` in Allocated column.
* In order to get the number of allocated bytes in cross platform way we are using `GC.GetAllocatedBytesForCurrentThread` which recently got [exposed](https://github.com/dotnet/corefx/pull/12489) for netcoreapp1.1. That's why BenchmarkDotNet does not support netcoreapp1.0 from version 0.10.1.
* MemoryDiagnoser is `99.5%` accurate about allocated memory when using default settings or Job.ShortRun (or any longer job than it).
* Threading Diagnoser:
* Works only for .NET Core 3.0+
* HardwareCounters:
* Windows 8+ only (we plan to add Unix support in the future)
* No Hyper-V (Virtualization) support
Expand Down Expand Up @@ -113,3 +118,5 @@ In BenchmarkDotNet, 1kB = 1024B, 1MB = 1024kB, and so on. The column Gen X means
[!include[IntroTailcall](../samples/IntroTailcall.md)]

[!include[IntroNativeMemory](../samples/IntroNativeMemory.md)]

[!include[IntroThreadingDiagnoser](../samples/ThreadingDiagnoser.md)]
27 changes: 27 additions & 0 deletions docs/articles/samples/IntroThreadingDiagnoser.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
---
uid: BenchmarkDotNet.Samples.IntroThreadingDiagnoser
---

## Sample: IntroThreadingDiagnoser

The `ThreadingDiagnoser` uses new APIs [exposed](https://github.com/dotnet/corefx/issues/35500) in .NET Core 3.0 to report:

* Completed Work Items: The number of work items that have been processed in ThreadPool (per single operation)
* Lock Contentions: The number of times there **was contention** upon trying to take a Monitor's lock (per single operation)

### Source code

[!code-csharp[IntroThreadingDiagnoser.cs](../../../samples/BenchmarkDotNet.Samples/IntroThreadingDiagnoser.cs)]

### Output

| Method | Mean | StdDev | Median | Completed Work Items | Lock Contentions |
|-------------------- |--------------:|-----------:|--------------:|---------------------:|-----------------:|
| CompleteOneWorkItem | 8,073.5519 ns | 69.7261 ns | 8,111.6074 ns | 1.0000 | - |

### Links

* @docs.diagnosers
* The permanent link to this sample: @BenchmarkDotNet.Samples.IntroThreadingDiagnoser

---
19 changes: 19 additions & 0 deletions samples/BenchmarkDotNet.Samples/IntroThreadingDiagnoser.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
using BenchmarkDotNet.Attributes;
using System.Threading;

namespace BenchmarkDotNet.Samples
{
[ThreadingDiagnoser] // ENABLE the diagnoser
public class IntroThreadingDiagnoser
{
[Benchmark]
public void CompleteOneWorkItem()
{
ManualResetEvent done = new ManualResetEvent(initialState: false);

ThreadPool.QueueUserWorkItem(m => (m as ManualResetEvent).Set(), done);

done.WaitOne();
}
}
}
14 changes: 14 additions & 0 deletions src/BenchmarkDotNet/Attributes/ThreadingDiagnoserAttribute.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
using System;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;

namespace BenchmarkDotNet.Attributes
{
[AttributeUsage(AttributeTargets.Class)]
public class ThreadingDiagnoserAttribute : Attribute, IConfigSource
{
public IConfig Config { get; }

public ThreadingDiagnoserAttribute() => Config = ManualConfig.CreateEmpty().With(ThreadingDiagnoser.Default);
}
}
2 changes: 1 addition & 1 deletion src/BenchmarkDotNet/Code/CodeGenerator.cs
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ internal static string Generate(BuildPartition buildPartition)
.Replace("$PassArguments$", passArguments)
.Replace("$EngineFactoryType$", GetEngineFactoryTypeName(benchmark))
.Replace("$Ref$", provider.UseRefKeyword ? "ref" : null)
.Replace("$MeasureGcStats$", buildInfo.Config.HasMemoryDiagnoser() ? "true" : "false")
.Replace("$MeasureExtraStats$", buildInfo.Config.HasExtraStatsDiagnoser() ? "true" : "false")
.Replace("$Encoding$", buildInfo.Config.Encoding.ToTemplateString())
.Replace("$DisassemblerEntryMethodName$", DisassemblerConstants.DisassemblerEntryMethodName)
.Replace("$WorkloadMethodCall$", provider.GetWorkloadMethodCall(passArguments)).ToString();
Expand Down
4 changes: 4 additions & 0 deletions src/BenchmarkDotNet/Configs/ImmutableConfig.cs
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,10 @@ internal ImmutableConfig(

public bool HasMemoryDiagnoser() => diagnosers.Contains(MemoryDiagnoser.Default);

public bool HasThreadingDiagnoser() => diagnosers.Contains(ThreadingDiagnoser.Default);

public bool HasExtraStatsDiagnoser() => HasMemoryDiagnoser() || HasThreadingDiagnoser();

public IDiagnoser GetCompositeDiagnoser(BenchmarkCase benchmarkCase, RunMode runMode)
{
var diagnosersForGivenMode = diagnosers.Where(diagnoser => diagnoser.GetRunMode(benchmarkCase) == runMode).ToImmutableHashSet();
Expand Down
3 changes: 3 additions & 0 deletions src/BenchmarkDotNet/ConsoleArguments/CommandLineOptions.cs
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,9 @@ public class CommandLineOptions
[Option('m', "memory", Required = false, Default = false, HelpText = "Prints memory statistics")]
public bool UseMemoryDiagnoser { get; set; }

[Option('t', "threading", Required = false, Default = false, HelpText = "Prints threading statistics")]
public bool UseThreadingDiagnoser { get; set; }

[Option('d', "disasm", Required = false, Default = false, HelpText = "Gets disassembly of benchmarked code")]
public bool UseDisassemblyDiagnoser { get; set; }

Expand Down
2 changes: 2 additions & 0 deletions src/BenchmarkDotNet/ConsoleArguments/ConfigParser.cs
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,8 @@ private static IConfig CreateConfig(CommandLineOptions options, IConfig globalCo

if (options.UseMemoryDiagnoser)
config.Add(MemoryDiagnoser.Default);
if (options.UseThreadingDiagnoser)
config.Add(ThreadingDiagnoser.Default);
if (options.UseDisassemblyDiagnoser)
config.Add(DisassemblyDiagnoser.Create(new DisassemblyDiagnoserConfig(recursiveDepth: options.DisassemblerRecursiveDepth, printPrologAndEpilog: true, printDiff: options.DisassemblerDiff)));
if (!string.IsNullOrEmpty(options.Profiler))
Expand Down
5 changes: 4 additions & 1 deletion src/BenchmarkDotNet/Diagnosers/DiagnoserResults.cs
Original file line number Diff line number Diff line change
Expand Up @@ -5,17 +5,20 @@ namespace BenchmarkDotNet.Diagnosers
{
public class DiagnoserResults
{
public DiagnoserResults(BenchmarkCase benchmarkCase, long totalOperations, GcStats gcStats)
public DiagnoserResults(BenchmarkCase benchmarkCase, long totalOperations, GcStats gcStats, ThreadingStats threadingStats)
{
BenchmarkCase = benchmarkCase;
TotalOperations = totalOperations;
GcStats = gcStats;
ThreadingStats = threadingStats;
}

public BenchmarkCase BenchmarkCase { get; }

public long TotalOperations { get; }

public GcStats GcStats { get; }

public ThreadingStats ThreadingStats { get; }
}
}
75 changes: 75 additions & 0 deletions src/BenchmarkDotNet/Diagnosers/ThreadingDiagnoser.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Analysers;
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Loggers;
using BenchmarkDotNet.Portability;
using BenchmarkDotNet.Reports;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Toolchains.DotNetCli;
using BenchmarkDotNet.Validators;

namespace BenchmarkDotNet.Diagnosers
{
public class ThreadingDiagnoser : IDiagnoser
{
public static readonly ThreadingDiagnoser Default = new ThreadingDiagnoser();

private ThreadingDiagnoser() { }

public IEnumerable<string> Ids => new[] { nameof(ThreadingDiagnoser) };

public IEnumerable<IExporter> Exporters => Array.Empty<IExporter>();

public IEnumerable<IAnalyser> Analysers => Array.Empty<IAnalyser>();

public void DisplayResults(ILogger logger) { }

public RunMode GetRunMode(BenchmarkCase benchmarkCase) => RunMode.NoOverhead;

public void Handle(HostSignal signal, DiagnoserActionParameters parameters) { }

public IEnumerable<Metric> ProcessResults(DiagnoserResults results)
{
yield return new Metric(CompletedWorkItemCountMetricDescriptor.Instance, results.ThreadingStats.CompletedWorkItemCount / (double)results.ThreadingStats.TotalOperations);
yield return new Metric(LockContentionCountMetricDescriptor.Instance, results.ThreadingStats.LockContentionCount / (double)results.ThreadingStats.TotalOperations);
}

public IEnumerable<ValidationError> Validate(ValidationParameters validationParameters)
{
if (!RuntimeInformation.IsNetCore || NetCoreAppSettings.Current.Value.IsOlderThan(TargetFrameworkMoniker.NetCoreApp30))
{
yield return new ValidationError(false, $"{nameof(ThreadingDiagnoser)} supports only .NET Core 3.0+");
}
}

private class CompletedWorkItemCountMetricDescriptor : IMetricDescriptor
{
internal static readonly IMetricDescriptor Instance = new CompletedWorkItemCountMetricDescriptor();

public string Id => "CompletedWorkItemCount";
public string DisplayName => "Completed Work Items";
public string Legend => "The number of work items that have been processed in ThreadPool (per single operation)";
public string NumberFormat => "#0.0000";
public UnitType UnitType => UnitType.Dimensionless;
public string Unit => "Count";
public bool TheGreaterTheBetter => false;
}

private class LockContentionCountMetricDescriptor : IMetricDescriptor
{
internal static readonly IMetricDescriptor Instance = new LockContentionCountMetricDescriptor();

public string Id => "LockContentionCount";
public string DisplayName => "Lock Contentions";
public string Legend => "The number of times there was contention upon trying to take a Monitor's lock (per single operation)";
public string NumberFormat => "#0.0000";
public UnitType UnitType => UnitType.Dimensionless;
public string Unit => "Count";
public bool TheGreaterTheBetter => false;
}
}
}
23 changes: 14 additions & 9 deletions src/BenchmarkDotNet/Engines/Engine.cs
Original file line number Diff line number Diff line change
Expand Up @@ -42,14 +42,14 @@ public class Engine : IEngine
private readonly EnginePilotStage pilotStage;
private readonly EngineWarmupStage warmupStage;
private readonly EngineActualStage actualStage;
private readonly bool includeMemoryStats;
private readonly bool includeExtraStats;

internal Engine(
IHost host,
IResolver resolver,
Action dummy1Action, Action dummy2Action, Action dummy3Action, Action<long> overheadAction, Action<long> workloadAction, Job targetJob,
Action globalSetupAction, Action globalCleanupAction, Action iterationSetupAction, Action iterationCleanupAction, long operationsPerInvoke,
bool includeMemoryStats, Encoding encoding, string benchmarkName)
bool includeExtraStats, Encoding encoding, string benchmarkName)
{

Host = host;
Expand All @@ -64,7 +64,7 @@ internal Engine(
IterationSetupAction = iterationSetupAction;
IterationCleanupAction = iterationCleanupAction;
OperationsPerInvoke = operationsPerInvoke;
this.includeMemoryStats = includeMemoryStats;
this.includeExtraStats = includeExtraStats;
BenchmarkName = benchmarkName;

Resolver = resolver;
Expand Down Expand Up @@ -128,16 +128,16 @@ public RunResults Run()

Host.AfterMainRun();

var workGcHasDone = includeMemoryStats
? MeasureGcStats(new IterationData(IterationMode.Workload, IterationStage.Actual, 0, invokeCount, UnrollFactor))
: GcStats.Empty;
(GcStats workGcHasDone, ThreadingStats threadingStats) = includeExtraStats
? GetExtraStats(new IterationData(IterationMode.Workload, IterationStage.Actual, 0, invokeCount, UnrollFactor))
: (GcStats.Empty, ThreadingStats.Empty);

if (EngineEventSource.Log.IsEnabled())
EngineEventSource.Log.BenchmarkStop(BenchmarkName);

var outlierMode = TargetJob.ResolveValue(AccuracyMode.OutlierModeCharacteristic, Resolver);

return new RunResults(idle, main, outlierMode, workGcHasDone, Encoding);
return new RunResults(idle, main, outlierMode, workGcHasDone, threadingStats, Encoding);
}

public Measurement RunIteration(IterationData data)
Expand Down Expand Up @@ -177,7 +177,7 @@ public Measurement RunIteration(IterationData data)
return measurement;
}

private GcStats MeasureGcStats(IterationData data)
private (GcStats, ThreadingStats) GetExtraStats(IterationData data)
{
// we enable monitoring after main target run, for this single iteration which is executed at the end
// so even if we enable AppDomain monitoring in separate process
Expand All @@ -186,15 +186,20 @@ private GcStats MeasureGcStats(IterationData data)

IterationSetupAction(); // we run iteration setup first, so even if it allocates, it is not included in the results

var initialThreadingStats = ThreadingStats.ReadInitial(); // this method might allocate
var initialGcStats = GcStats.ReadInitial();

WorkloadAction(data.InvokeCount / data.UnrollFactor);

var finalGcStats = GcStats.ReadFinal();
var finalThreadingStats = ThreadingStats.ReadFinal();

IterationCleanupAction(); // we run iteration cleanup after collecting GC stats

return (finalGcStats - initialGcStats).WithTotalOperations(data.InvokeCount * OperationsPerInvoke);
GcStats gcStats = (finalGcStats - initialGcStats).WithTotalOperations(data.InvokeCount * OperationsPerInvoke);
ThreadingStats threadingStats = (finalThreadingStats - initialThreadingStats).WithTotalOperations(data.InvokeCount * OperationsPerInvoke);

return (gcStats, threadingStats);
}

private void GcCollect()
Expand Down
2 changes: 1 addition & 1 deletion src/BenchmarkDotNet/Engines/EngineFactory.cs
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ private static Engine CreateEngine(EngineParameters engineParameters, Job job, A
engineParameters.IterationSetupAction,
engineParameters.IterationCleanupAction,
engineParameters.OperationsPerInvoke,
engineParameters.MeasureGcStats,
engineParameters.MeasureExtraStats,
engineParameters.Encoding,
engineParameters.BenchmarkName);
}
Expand Down
2 changes: 1 addition & 1 deletion src/BenchmarkDotNet/Engines/EngineParameters.cs
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ public class EngineParameters
public Action GlobalCleanupAction { get; set; }
public Action IterationSetupAction { get; set; }
public Action IterationCleanupAction { get; set; }
public bool MeasureGcStats { get; set; }
public bool MeasureExtraStats { get; set; }

[PublicAPI] public Encoding Encoding { get; set; }

Expand Down
Loading

0 comments on commit 4080abd

Please sign in to comment.