Merged

V4 #43

22 changes: 7 additions & 15 deletions .github/workflows/publish-nuget-packages.yaml
@@ -4,24 +4,16 @@ on:
release:
types: [published, prereleased]

jobs:
publish-v2:
jobs:
publish:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/setup-dotnet@v1
with:
dotnet-version: '5.0.100'
- run: dotnet pack src/prometheus-net.DotNetRuntime --include-symbols -c "ReleaseV2" --output "build/"
- run: dotnet nuget push "build/prometheus-net.DotNetRuntime.2.*.symbols.nupkg" -k ${{ secrets.NUGET_API_KEY }} -s "https://api.nuget.org/v3/index.json" -n true


publish-v3:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/setup-dotnet@v1
with:
dotnet-version: '5.0.100'
- run: dotnet pack src/prometheus-net.DotNetRuntime --include-symbols -c "ReleaseV3" --output "build/"
- run: dotnet nuget push "build/prometheus-net.DotNetRuntime.3.*.symbols.nupkg" -k ${{ secrets.NUGET_API_KEY }} -s "https://api.nuget.org/v3/index.json" -n true
- run: arrTag=(${GITHUB_REF//\// })
- run: VERSION="${arrTag[2]}"
- run: echo "Version is $VERSION"
- run: dotnet pack src/prometheus-net.DotNetRuntime --include-symbols -c "Release" -p:PackageVersion=$VERSION --output "build/"
- run: dotnet nuget push "build/prometheus-net.DotNetRuntime.*.symbols.nupkg" -k ${{ secrets.NUGET_API_KEY }} -s "https://api.nuget.org/v3/index.json" -n true
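A possible single-step variant of the version parsing above, offered as a sketch rather than as what the workflow does. Each `run` step in GitHub Actions starts its own shell, so a variable assigned in one step is not available in the next unless it is exported through a mechanism such as `$GITHUB_ENV`. Keeping the parsing and the use of the value in one script avoids that. `extract_version` is a hypothetical helper name, and the fallback ref `refs/tags/4.0.0` is an example value only.

```shell
#!/usr/bin/env bash
# Sketch: derive the package version from a tag ref within ONE shell session.
# extract_version is a hypothetical helper; it is not part of this PR.
extract_version() {
  ref="$1"
  # Strip the leading "refs/tags/" prefix, leaving only the tag name.
  printf '%s\n' "${ref#refs/tags/}"
}

# Fall back to an example ref when GITHUB_REF is unset (e.g. when run locally).
VERSION="$(extract_version "${GITHUB_REF:-refs/tags/4.0.0}")"
echo "Version is $VERSION"
```

In a workflow, a script like this would sit in a single multi-line `run:` block together with the `dotnet pack` call that consumes `$VERSION`.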
25 changes: 10 additions & 15 deletions .github/workflows/run-tests.yaml
@@ -6,22 +6,17 @@ on:
pull_request:

jobs:
test-v2:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v1
- uses: actions/setup-dotnet@v1
- name: Setup .NET Core 3.1
uses: actions/setup-dotnet@v1
with:
dotnet-version: 3.1.x
- name: Setup .NET Core 5.0
uses: actions/setup-dotnet@v1
with:
dotnet-version: '5.0.100'
# excluding When_IO_work_is_executed_on_the_thread_pool_then_the_number_of_io_threads_is_measured for now, for some reason we don't seem to be
# generating IO thread events in the github actions environment
- run: dotnet test -c "DebugV2" --filter Name!=When_IO_work_is_executed_on_the_thread_pool_then_the_number_of_io_threads_is_measured

test-v3:
runs-on: ubuntu-latest
steps:
dotnet-version: 5.0.x
- uses: actions/checkout@v1
- uses: actions/setup-dotnet@v1
with:
dotnet-version: '5.0.100'
- run: dotnet test -c "DebugV3" --filter Name!=When_IO_work_is_executed_on_the_thread_pool_then_the_number_of_io_threads_is_measured
# This test constantly passes locally (Windows + Linux) but fails in the test environment. Don't have the time/inclination to figure out why right now.
- run: dotnet test -c "Debug" --filter Name!=When_blocking_work_is_executed_on_the_thread_pool_then_thread_pool_delays_are_measured
77 changes: 38 additions & 39 deletions README.md
@@ -1,5 +1,5 @@
# prometheus-net.DotNetMetrics
A plugin for the [prometheus-net](https://github.com/prometheus-net/prometheus-net) package, exposing .NET core runtime metrics including:
A plugin for the [prometheus-net](https://github.com/prometheus-net/prometheus-net) package, [exposing .NET core runtime metrics](docs/metrics-exposed.md) including:
- Garbage collection collection frequencies and timings by generation/ type, pause timings and GC CPU consumption ratio
- Heap size by generation
- Bytes allocated by small/ large object heap
@@ -8,22 +8,21 @@ A plugin for the [prometheus-net](https://github.com/prometheus-net/prometheus-n
- Lock contention
- Exceptions thrown, broken down by type

These metrics are essential for understanding the peformance of any non-trivial application. Even if your application is well instrumented, you're only getting half the story- what the runtime is doing completes the picture.
These metrics are essential for understanding the performance of any non-trivial application. Even if your application is well instrumented, you're only getting half the story- what the runtime is doing completes the picture.

## Installation
Supports .NET core v2.2+ but **.NET core v3.0+ is recommended**. There are a [number of bugs present in the .NET core 2.2 runtime](https://github.com/djluck/prometheus-net.DotNetRuntime/issues?q=is%3Aissue+is%3Aopen+label%3A".net+core+2.2+bug")
that can impact metric collection or runtime stability.
## Using this package
### Requirements
- .NET core 3.1 (runtime version 3.1.11+ is recommended) or .NET 5.0
- The [prometheus-net](https://github.com/prometheus-net/prometheus-net) package

Add the packge from [nuget](https://www.nuget.org/packages/prometheus-net.DotNetRuntime):
### Install it
The package can be installed from [nuget](https://www.nuget.org/packages/prometheus-net.DotNetRuntime):
```powershell
# If you're using v3.* of prometheus-net
dotnet add package prometheus-net.DotNetRuntime

# If you're using v2.* of prometheus-net
dotnet add package prometheus-net.DotNetRuntime --version 2.2.0
```

And then start the collector:
### Start collecting metrics
You can start metric collection with:
```csharp
IDisposable collector = DotNetRuntimeStatsBuilder.Default().StartCollecting()
```
@@ -34,49 +33,49 @@ IDisposable collector = DotNetRuntimeStatsBuilder
.Customize()
.WithContentionStats()
.WithJitStats()
.WithThreadPoolSchedulingStats()
.WithThreadPoolStats()
.WithGcStats()
.WithExceptionStats()
.StartCollecting();
```

Once the collector is registered, you should see metrics prefixed with `dotnet_` visible in your metric output (make sure you are [exporting your metrics](https://github.com/prometheus-net/prometheus-net#http-handler)).
## Sample Grafana dashboard
The metrics exposed can drive a rich dashboard, giving you a graphical insight into the performance of your application ( [exported dashboard available here](examples/NET_runtime_metrics_dashboard.json)):

![Grafana dashboard sample](docs/grafana-example.PNG)
## Performance impact
### Choosing a `CaptureLevel`
By default the library generates metrics based on [event counters](https://docs.microsoft.com/en-us/dotnet/core/diagnostics/event-counters). This allows for basic instrumentation of applications with very little performance overhead.

You can enable higher-fidelity metrics by providing a custom `CaptureLevel`, e.g.:
```
DotNetRuntimeStatsBuilder
.Customize()
.WithGcStats(CaptureLevel.Informational)
.WithExceptionStats(CaptureLevel.Errors)
...
```

Most builder methods allow the passing of a custom `CaptureLevel`- see the [documentation on exposed metrics](docs/metrics-exposed.md) for more information.

### Performance impact of `CaptureLevel.Errors`+
The harder you work the .NET core runtime, the more events it generates. Event generation and processing costs can stack up, especially around these types of events:
- **JIT stats**: each method compiled by the JIT compiler emits two events. Most JIT compilation is performed at startup and depending on the size of your application, this could impact your startup performance.
- **GC stats**: every 100KB of allocations, an event is emitted. If you are consistently allocating memory at a rate > 1GB/sec, you might like to disable GC stats.
- **.NET thread pool scheduling stats**: For every work item scheduled on the thread pool, two events are emitted. If you are scheduling thousands of items per second on the thread pool, you might like to disable scheduling events or decrease the sampling rate of these events.
- **GC stats with `CaptureLevel.Verbose`**: every 100KB of allocations, an event is emitted. If you are consistently allocating memory at a rate > 1GB/sec, you might like to disable GC stats.
- **Exception stats with `CaptureLevel.Errors`**: for every exception thrown, an event is generated.
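
To put the GC stats threshold above in perspective, here is a rough back-of-the-envelope estimate of the event rate it implies. It assumes decimal units (1 GB = 1,000,000,000 bytes, 100 KB = 100,000 bytes) and exactly one event per 100 KB allocated. Both are simplifications for illustration, not measurements.

```shell
# Rough estimate only: one event per 100 KB allocated, decimal units assumed.
alloc_bytes_per_sec=1000000000   # 1 GB/sec allocation rate
bytes_per_event=100000           # one event per 100 KB
events_per_sec=$(( alloc_bytes_per_sec / bytes_per_event ))
echo "$events_per_sec events/sec"
```

Under these assumptions, an application allocating 1 GB/sec would emit on the order of 10,000 allocation events per second.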

### Sampling
To counteract some of the performance impacts of measuring .NET core runtime events, sampling can be configured on supported collectors:
```csharp
IDisposable collector = DotNetRuntimeStatsBuilder.Customize()
// Only 1 in 10 contention events will be sampled
.WithContentionStats(sampleRate: SampleEvery.TenEvents)
// Only 1 in 100 JIT events will be sampled
.WithJitStats(sampleRate: SampleEvery.HundredEvents)
// Every event will be sampled (disables sampling)
.WithThreadPoolSchedulingStats(sampleRate: SampleEvery.OneEvent)
.StartCollecting();
```
There is also a [performance issue present in .NET core 3.1](https://github.com/dotnet/runtime/issues/43985#issuecomment-800629516) that will see CPU consumption grow over time when long-running trace sessions are used.

The default sample rates are listed below:
## Examples
An example `docker-compose` stack is available in the [`examples/`](examples/) folder. Start it with:

| Event collector | Default sample rate |
| ------------------------------ | ------------------------|
| `ThreadPoolSchedulingStats` | `SampleEvery.TenEvents` |
| `JitStats` | `SampleEvery.TenEvents` |
| `ContentionStats` | `SampleEvery.TwoEvents` |
```
docker-compose up -d
```

You can then visit [`http://localhost:3000`](http://localhost:3000) to view metrics being generated by a sample application.

While the default sampling rates provide a decent balance between accuracy and resource consumption if you're concerned with the accuracy of metrics at all costs,
then feel free to change the sampling rate to `SampleEvery.OneEvent`. If minimal resource consumption (especially memory), is your goal you might like to
reduce the sampling rate.
### Grafana dashboard
The metrics exposed can drive a rich dashboard, giving you a graphical insight into the performance of your application ( [exported dashboard available here](examples/grafana/provisioning/dashboards/NET_runtime_metrics_dashboard.json)):

![Grafana dashboard sample](docs/grafana-example.PNG)

## Further reading
- The mechanism for listening to runtime events is outlined in the [.NET core 2.2 release notes](https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotnet-core-2-2#core).