Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/stanza] Remove batching in LogEmitter behind feature gate #38428

Conversation

andrzej-stencel
Copy link
Member

Description

Removes batching in LogEmitter to prevent data loss during ungraceful shutdown of the collector. See #35456 for details.

This is done behind a feature gate, as it may have a negative performance impact, depending on user configuration. See the added documentation for the feature gate.

On implementation side, this was done by renaming the existing LogEmitter struct to BatchingLogEmitter and introducing a new SynchronousLogEmitter, see pkg/stanza/adapter/emitter.go.

Link to tracking issue

Testing

Added unit tests in pkg/stanza/adapter/emitter_test.go.

Adapted the benchmarks pkg/stanza/adapter/receiver_test.go to run for both the existing BatchingLogEmitter and the new SynchronousLogEmitter.

Documentation

Added documentation for the feature gate.

@github-actions github-actions bot added the processor/logstransform Logs Transform processor label Mar 7, 2025
@andrzej-stencel
Copy link
Member Author

Here is the result of the BenchmarkReceiver* benchmarks from pkg/stanza/adapter/receiver_test.go. I have replaced test case names for readability in GitHub:

  • BenchmarkReceiverWithBatchingLogEmitter = BRWBLE - this is the original behavior without this change
  • BenchmarkReceiverWithSynchronousLogEmitter = BRWSLE - this is the behavior of a
  • BenchmarkReceiverWithSynchronousLogEmitterAndBatchingInput = BRWSLEABI
$ go test -benchmem -run=^$ -bench BenchmarkReceiver . -v -benchtime=1s -count=6 | tee /tmp/benchmark-receiver.txt
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
BRWBLE
BRWBLE/1_logs
BRWBLE/1_logs-20                     10         100047159 ns/op            4875 B/op         83 allocs/op
BRWBLE/1_logs-20                     10         100156176 ns/op            4753 B/op         82 allocs/op
BRWBLE/1_logs-20                     10         100109411 ns/op            4881 B/op         83 allocs/op
BRWBLE/1_logs-20                     10         100086203 ns/op            4924 B/op         83 allocs/op
BRWBLE/1_logs-20                     10         100134626 ns/op            4753 B/op         82 allocs/op
BRWBLE/1_logs-20                     10         100120652 ns/op            4753 B/op         82 allocs/op
BRWBLE/10_logs
BRWBLE/10_logs-20                    10         100071178 ns/op           19344 B/op        509 allocs/op
BRWBLE/10_logs-20                    10         100082806 ns/op           19492 B/op        511 allocs/op
BRWBLE/10_logs-20                    10         100112047 ns/op           19411 B/op        510 allocs/op
BRWBLE/10_logs-20                    10         100160894 ns/op           19395 B/op        510 allocs/op
BRWBLE/10_logs-20                    10         100116103 ns/op           19401 B/op        510 allocs/op
BRWBLE/10_logs-20                    10         100112331 ns/op           19435 B/op        509 allocs/op
BRWBLE/100_logs
BRWBLE/100_logs-20                 2781            443204 ns/op          163959 B/op       4733 allocs/op
BRWBLE/100_logs-20                 2708            425692 ns/op          163955 B/op       4733 allocs/op
BRWBLE/100_logs-20                 3194            403997 ns/op          163952 B/op       4733 allocs/op
BRWBLE/100_logs-20                 3884            429406 ns/op          163954 B/op       4733 allocs/op
BRWBLE/100_logs-20                 2511            457171 ns/op          163963 B/op       4733 allocs/op
BRWBLE/100_logs-20                 2527            431096 ns/op          163959 B/op       4733 allocs/op
BRWBLE/1000_logs
BRWBLE/1000_logs-20                 298           3816645 ns/op         1639628 B/op      47337 allocs/op
BRWBLE/1000_logs-20                 256           3918620 ns/op         1639646 B/op      47337 allocs/op
BRWBLE/1000_logs-20                 289           3778077 ns/op         1639638 B/op      47337 allocs/op
BRWBLE/1000_logs-20                 243           4751336 ns/op         1639649 B/op      47337 allocs/op
BRWBLE/1000_logs-20                 270           4061866 ns/op         1639576 B/op      47337 allocs/op
BRWBLE/1000_logs-20                 256           4108661 ns/op         1639636 B/op      47337 allocs/op
BRWBLE/10000_logs
BRWBLE/10000_logs-20                 26          40603455 ns/op        16395677 B/op     473367 allocs/op
BRWBLE/10000_logs-20                 24          42668238 ns/op        16396379 B/op     473375 allocs/op
BRWBLE/10000_logs-20                 28          43403587 ns/op        16396335 B/op     473374 allocs/op
BRWBLE/10000_logs-20                 27          41481401 ns/op        16395769 B/op     473372 allocs/op
BRWBLE/10000_logs-20                 27          41116547 ns/op        16396223 B/op     473374 allocs/op
BRWBLE/10000_logs-20                 39          43809102 ns/op        16396483 B/op     473375 allocs/op
BRWBLE/100000_logs
BRWBLE/100000_logs-20                 3         388334858 ns/op        163967154 B/op   4733749 allocs/op
BRWBLE/100000_logs-20                 3         442210269 ns/op        163963349 B/op   4733754 allocs/op
BRWBLE/100000_logs-20                 3         421985974 ns/op        163962837 B/op   4733726 allocs/op
BRWBLE/100000_logs-20                 3         437625712 ns/op        163967992 B/op   4733762 allocs/op
BRWBLE/100000_logs-20                 3         414695915 ns/op        163964504 B/op   4733736 allocs/op
BRWBLE/100000_logs-20                 3         393860437 ns/op        163962885 B/op   4733739 allocs/op
BRWSLE
BRWSLE/1_logs
BRWSLE/1_logs-20                 153057              7070 ns/op            2625 B/op         73 allocs/op
BRWSLE/1_logs-20                 206821              7541 ns/op            2625 B/op         73 allocs/op
BRWSLE/1_logs-20                 190917              6373 ns/op            2625 B/op         73 allocs/op
BRWSLE/1_logs-20                 154072              6750 ns/op            2625 B/op         73 allocs/op
BRWSLE/1_logs-20                 215745              7299 ns/op            2625 B/op         73 allocs/op
BRWSLE/1_logs-20                 158467              6811 ns/op            2625 B/op         73 allocs/op
BRWSLE/10_logs
BRWSLE/10_logs-20                 17884             68644 ns/op           26254 B/op        730 allocs/op
BRWSLE/10_logs-20                 16659             66368 ns/op           26253 B/op        730 allocs/op
BRWSLE/10_logs-20                 18289             62490 ns/op           26252 B/op        730 allocs/op
BRWSLE/10_logs-20                 21823             61393 ns/op           26252 B/op        730 allocs/op
BRWSLE/10_logs-20                 18362             64411 ns/op           26251 B/op        730 allocs/op
BRWSLE/10_logs-20                 18589             67560 ns/op           26251 B/op        730 allocs/op
BRWSLE/100_logs
BRWSLE/100_logs-20                 1807            639261 ns/op          262530 B/op       7301 allocs/op
BRWSLE/100_logs-20                 1986            594824 ns/op          262508 B/op       7301 allocs/op
BRWSLE/100_logs-20                 1736            591616 ns/op          262512 B/op       7301 allocs/op
BRWSLE/100_logs-20                 1924            626313 ns/op          262511 B/op       7301 allocs/op
BRWSLE/100_logs-20                 1810            574889 ns/op          262512 B/op       7301 allocs/op
BRWSLE/100_logs-20                 2218            624204 ns/op          262503 B/op       7301 allocs/op
BRWSLE/1000_logs
BRWSLE/1000_logs-20                 189           6159109 ns/op         2625101 B/op      73011 allocs/op
BRWSLE/1000_logs-20                 212           5585857 ns/op         2625106 B/op      73011 allocs/op
BRWSLE/1000_logs-20                 265           4516101 ns/op         2625192 B/op      73012 allocs/op
BRWSLE/1000_logs-20                 273           4185511 ns/op         2625193 B/op      73011 allocs/op
BRWSLE/1000_logs-20                 213           6201304 ns/op         2625387 B/op      73013 allocs/op
BRWSLE/1000_logs-20                 182           6206619 ns/op         2625147 B/op      73011 allocs/op
BRWSLE/10000_logs
BRWSLE/10000_logs-20                 19          57581495 ns/op        26251194 B/op     730114 allocs/op
BRWSLE/10000_logs-20                 24          47268343 ns/op        26251490 B/op     730116 allocs/op
BRWSLE/10000_logs-20                 19          60669838 ns/op        26252080 B/op     730123 allocs/op
BRWSLE/10000_logs-20                 20          61554539 ns/op        26251274 B/op     730118 allocs/op
BRWSLE/10000_logs-20                 19          59685981 ns/op        26251741 B/op     730116 allocs/op
BRWSLE/10000_logs-20                 19          56506868 ns/op        26251557 B/op     730116 allocs/op
BRWSLE/100000_logs
BRWSLE/100000_logs-20                 2         605129432 ns/op        262515432 B/op   7301165 allocs/op
BRWSLE/100000_logs-20                 2         610875858 ns/op        262515648 B/op   7301185 allocs/op
BRWSLE/100000_logs-20                 2         627519854 ns/op        262517232 B/op   7301184 allocs/op
BRWSLE/100000_logs-20                 2         599077728 ns/op        262514352 B/op   7301174 allocs/op
BRWSLE/100000_logs-20                 2         565507172 ns/op        262514348 B/op   7301157 allocs/op
BRWSLE/100000_logs-20                 2         607427582 ns/op        262516880 B/op   7301191 allocs/op
BRWSLEABI
BRWSLEABI/1_logs
BRWSLEABI/1_logs-20                3331            365600 ns/op          149093 B/op       4305 allocs/op
BRWSLEABI/1_logs-20                3477            345958 ns/op          149092 B/op       4305 allocs/op
BRWSLEABI/1_logs-20                3944            367017 ns/op          149089 B/op       4305 allocs/op
BRWSLEABI/1_logs-20                2964            374290 ns/op          149088 B/op       4305 allocs/op
BRWSLEABI/1_logs-20                3148            392954 ns/op          149092 B/op       4305 allocs/op
BRWSLEABI/1_logs-20                4531            424137 ns/op          149086 B/op       4305 allocs/op
BRWSLEABI/10_logs
BRWSLEABI/10_logs-20               2730            419272 ns/op          149088 B/op       4305 allocs/op
BRWSLEABI/10_logs-20               2827            414739 ns/op          149089 B/op       4305 allocs/op
BRWSLEABI/10_logs-20               2505            399265 ns/op          149088 B/op       4305 allocs/op
BRWSLEABI/10_logs-20               4522            378317 ns/op          149090 B/op       4305 allocs/op
BRWSLEABI/10_logs-20               4126            394739 ns/op          149087 B/op       4305 allocs/op
BRWSLEABI/10_logs-20               3508            409009 ns/op          149090 B/op       4305 allocs/op
BRWSLEABI/100_logs
BRWSLEABI/100_logs-20              2593            398444 ns/op          149090 B/op       4305 allocs/op
BRWSLEABI/100_logs-20              2935            386873 ns/op          149090 B/op       4305 allocs/op
BRWSLEABI/100_logs-20              4172            388919 ns/op          149087 B/op       4305 allocs/op
BRWSLEABI/100_logs-20              3108            418106 ns/op          149088 B/op       4305 allocs/op
BRWSLEABI/100_logs-20              2932            404387 ns/op          149091 B/op       4305 allocs/op
BRWSLEABI/100_logs-20              2776            411405 ns/op          149088 B/op       4305 allocs/op
BRWSLEABI/1000_logs
BRWSLEABI/1000_logs-20              256           3956125 ns/op         1490843 B/op      43056 allocs/op
BRWSLEABI/1000_logs-20              312           3734846 ns/op         1490873 B/op      43056 allocs/op
BRWSLEABI/1000_logs-20              339           3713383 ns/op         1490875 B/op      43056 allocs/op
BRWSLEABI/1000_logs-20              279           4010198 ns/op         1490888 B/op      43056 allocs/op
BRWSLEABI/1000_logs-20              270           4138121 ns/op         1490878 B/op      43056 allocs/op
BRWSLEABI/1000_logs-20              292           3729990 ns/op         1490879 B/op      43056 allocs/op
BRWSLEABI/10000_logs
BRWSLEABI/10000_logs-20              31          40553944 ns/op        14908861 B/op     430567 allocs/op
BRWSLEABI/10000_logs-20              38          36818228 ns/op        14908922 B/op     430567 allocs/op
BRWSLEABI/10000_logs-20              28          38044832 ns/op        14908732 B/op     430565 allocs/op
BRWSLEABI/10000_logs-20              32          37831686 ns/op        14908660 B/op     430567 allocs/op
BRWSLEABI/10000_logs-20              34          40752769 ns/op        14908635 B/op     430565 allocs/op
BRWSLEABI/10000_logs-20              31          35594256 ns/op        14908478 B/op     430563 allocs/op
BRWSLEABI/100000_logs
BRWSLEABI/100000_logs-20              3         408265650 ns/op        149088696 B/op   4305659 allocs/op
BRWSLEABI/100000_logs-20              3         381525954 ns/op        149087117 B/op   4305646 allocs/op
BRWSLEABI/100000_logs-20              3         371295412 ns/op        149087602 B/op   4305655 allocs/op
BRWSLEABI/100000_logs-20              3         376515194 ns/op        149085784 B/op   4305645 allocs/op
BRWSLEABI/100000_logs-20              3         372438997 ns/op        149085954 B/op   4305640 allocs/op
BRWSLEABI/100000_logs-20              3         369667676 ns/op        149086434 B/op   4305640 allocs/op
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter    190.230s
$ benchstat /tmp/benchmark-receiver.txt
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/adapter
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
                                                                 │ /tmp/benchamrk-receiver-syncemit.txt │
                                                                 │                sec/op                │
ReceiverWithBatchingLogEmitter/1_logs-20                                                   100.1m ±  0%
ReceiverWithBatchingLogEmitter/10_logs-20                                                  100.1m ±  0%
ReceiverWithBatchingLogEmitter/100_logs-20                                                 430.3µ ±  6%
ReceiverWithBatchingLogEmitter/1000_logs-20                                                3.990m ± 19%
ReceiverWithBatchingLogEmitter/10000_logs-20                                               42.07m ±  4%
ReceiverWithBatchingLogEmitter/100000_logs-20                                              418.3m ±  7%
ReceiverWithSynchronousLogEmitter/1_logs-20                                                6.941µ ±  9%
ReceiverWithSynchronousLogEmitter/10_logs-20                                               65.39µ ±  6%
ReceiverWithSynchronousLogEmitter/100_logs-20                                              609.5µ ±  6%
ReceiverWithSynchronousLogEmitter/1000_logs-20                                             5.872m ± 29%
ReceiverWithSynchronousLogEmitter/10000_logs-20                                            58.63m ± 19%
ReceiverWithSynchronousLogEmitter/100000_logs-20                                           606.3m ±  7%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1_logs-20                                370.7µ ± 14%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10_logs-20                               404.1µ ±  6%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100_logs-20                              401.4µ ±  4%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1000_logs-20                             3.845m ±  8%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10000_logs-20                            37.94m ±  7%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100000_logs-20                           374.5m ±  9%
geomean                                                                                    5.813m

                                                                 │ /tmp/benchamrk-receiver-syncemit.txt │
                                                                 │                 B/op                 │
ReceiverWithBatchingLogEmitter/1_logs-20                                                   4.701Ki ± 2%
ReceiverWithBatchingLogEmitter/10_logs-20                                                  18.95Ki ± 0%
ReceiverWithBatchingLogEmitter/100_logs-20                                                 160.1Ki ± 0%
ReceiverWithBatchingLogEmitter/1000_logs-20                                                1.564Mi ± 0%
ReceiverWithBatchingLogEmitter/10000_logs-20                                               15.64Mi ± 0%
ReceiverWithBatchingLogEmitter/100000_logs-20                                              156.4Mi ± 0%
ReceiverWithSynchronousLogEmitter/1_logs-20                                                2.563Ki ± 0%
ReceiverWithSynchronousLogEmitter/10_logs-20                                               25.64Ki ± 0%
ReceiverWithSynchronousLogEmitter/100_logs-20                                              256.4Ki ± 0%
ReceiverWithSynchronousLogEmitter/1000_logs-20                                             2.504Mi ± 0%
ReceiverWithSynchronousLogEmitter/10000_logs-20                                            25.04Mi ± 0%
ReceiverWithSynchronousLogEmitter/100000_logs-20                                           250.4Mi ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1_logs-20                                145.6Ki ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10_logs-20                               145.6Ki ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100_logs-20                              145.6Ki ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1000_logs-20                             1.422Mi ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10000_logs-20                            14.22Mi ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100000_logs-20                           142.2Mi ± 0%
geomean                                                                                    902.7Ki

                                                                 │ /tmp/benchamrk-receiver-syncemit.txt │
                                                                 │              allocs/op               │
ReceiverWithBatchingLogEmitter/1_logs-20                                                     82.50 ± 1%
ReceiverWithBatchingLogEmitter/10_logs-20                                                    510.0 ± 0%
ReceiverWithBatchingLogEmitter/100_logs-20                                                  4.733k ± 0%
ReceiverWithBatchingLogEmitter/1000_logs-20                                                 47.34k ± 0%
ReceiverWithBatchingLogEmitter/10000_logs-20                                                473.4k ± 0%
ReceiverWithBatchingLogEmitter/100000_logs-20                                               4.734M ± 0%
ReceiverWithSynchronousLogEmitter/1_logs-20                                                  73.00 ± 0%
ReceiverWithSynchronousLogEmitter/10_logs-20                                                 730.0 ± 0%
ReceiverWithSynchronousLogEmitter/100_logs-20                                               7.301k ± 0%
ReceiverWithSynchronousLogEmitter/1000_logs-20                                              73.01k ± 0%
ReceiverWithSynchronousLogEmitter/10000_logs-20                                             730.1k ± 0%
ReceiverWithSynchronousLogEmitter/100000_logs-20                                            7.301M ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1_logs-20                                 4.305k ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10_logs-20                                4.305k ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100_logs-20                               4.305k ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/1000_logs-20                              43.06k ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/10000_logs-20                             430.6k ± 0%
ReceiverWithSynchronousLogEmitterAndBatchingInput/100000_logs-20                            4.306M ± 0%
geomean                                                                                     25.47k

Comparing results for 1, 10 logs doesn't make much sense as batching in LogEmitter with flush timeout 100ms causes only 10 iterations per second.

Comparing results for 100 and more logs, we can see that using SynchronousLogEmitter without batching gives ~40-60% worse performance than with BatchingLogEmiiter:

  • time per op: 609.5µ vs. 430.3µ
  • alloc bytes per op: 256.4Ki vs. 160.1Ki
  • allocs per op: 7301 vs. 4733

SynchronousLogEmitter with batching gives ~10% better performance than BatchingLogEmitter:

  • time per op: 401.4µ vs. 430.3µ
  • alloc bytes per op: 145.6Ki vs. 160.1Ki
  • allocs per op: 4305 vs. 4733

@djaglowski djaglowski merged commit 6fa23c0 into open-telemetry:main Mar 7, 2025
171 checks passed
@github-actions github-actions bot added this to the next release milestone Mar 7, 2025
@andrzej-stencel andrzej-stencel deleted the synchronous-logemitter-behind-feature-gate branch March 7, 2025 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[pkg/stanza] Prevent data loss due to LogEmitter's buffer
3 participants