Improved and batched logs translation #2892

pmalek-sumo
Contributor

Description: Introduce an aggregation layer to internal/stanza that translates entry.Entry into pdata.Logs aggregating logs coming from the same Resource into one entry.

Link to tracking Issue: #2330

Testing: unit tests added
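
As a rough illustration of the aggregation idea described above, here is a minimal sketch that groups entries by their resource. It does not use the real `entry.Entry` or `pdata.Logs` types: `Entry`, `ResourceLogs`, `resourceKey`, and `aggregate` are hypothetical, simplified stand-ins invented for this example, and the PR's actual converter may work differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entry is a hypothetical, simplified stand-in for stanza's entry.Entry.
type Entry struct {
	Resource map[string]string
	Body     string
}

// ResourceLogs is a hypothetical stand-in for one per-resource group of logs.
type ResourceLogs struct {
	Resource map[string]string
	Bodies   []string
}

// resourceKey builds a deterministic key from a resource map by sorting its
// keys. This is an assumption made for the sketch; keys or values containing
// '=' or ';' could collide.
func resourceKey(res map[string]string) string {
	keys := make([]string, 0, len(res))
	for k := range res {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var sb strings.Builder
	for _, k := range keys {
		sb.WriteString(k)
		sb.WriteByte('=')
		sb.WriteString(res[k])
		sb.WriteByte(';')
	}
	return sb.String()
}

// aggregate groups entries that share the same resource into a single
// ResourceLogs value, keeping groups in first-seen order.
func aggregate(entries []Entry) []ResourceLogs {
	index := make(map[string]int) // resource key -> position in out
	var out []ResourceLogs
	for _, e := range entries {
		key := resourceKey(e.Resource)
		i, ok := index[key]
		if !ok {
			i = len(out)
			index[key] = i
			out = append(out, ResourceLogs{Resource: e.Resource})
		}
		out[i].Bodies = append(out[i].Bodies, e.Body)
	}
	return out
}

func main() {
	groups := aggregate([]Entry{
		{Resource: map[string]string{"host": "a"}, Body: "first"},
		{Resource: map[string]string{"host": "b"}, Body: "second"},
		{Resource: map[string]string{"host": "a"}, Body: "third"},
	})
	for _, g := range groups {
		fmt.Println(g.Resource, g.Bodies)
	}
}
```

The point of grouping is that the resource attributes are stored once per group rather than once per log record.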

@codecov

codecov bot commented Mar 26, 2021

Codecov Report

Merging #2892 (b69f891) into main (736647a) will decrease coverage by 0.01%.
The diff coverage is 92.14%.

@@            Coverage Diff             @@
##             main    #2892      +/-   ##
==========================================
- Coverage   91.55%   91.53%   -0.02%     
==========================================
  Files         465      465              
  Lines       22848    22962     +114     
==========================================
+ Hits        20918    21019     +101     
- Misses       1437     1446       +9     
- Partials      493      497       +4     
Flag       |Coverage Δ                  |
-----------|----------------------------|
integration|68.96% <ø> (ø)              |
unit       |90.51% <92.14%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown.

Impacted Files                        |Coverage Δ                  |
--------------------------------------|----------------------------|
internal/stanza/config.go             |100.00% <ø> (ø)             |
internal/stanza/converter.go          |95.29% <88.54%> (-4.71%) ⬇️ |
internal/stanza/factory.go            |95.00% <100.00%> (+1.25%) ⬆️|
internal/stanza/receiver.go           |100.00% <100.00%> (ø)       |
receiver/filelogreceiver/filelog.go   |100.00% <100.00%> (ø)       |
receiver/k8sclusterreceiver/watcher.go|95.29% <0.00%> (-2.36%) ⬇️  |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 736647a...b69f891.

@tigrannajaryan
Member

Does this resolve all comments raised in #2694?

Please resolve merge conflicts.

@pmalek-sumo pmalek-sumo force-pushed the issue-2330-improved-logs-translation branch 3 times, most recently from 5eb053b to d869864 Compare March 30, 2021 17:15
@pmalek-sumo
Contributor Author

> Does this resolve all comments raised in #2694 ?
>
> Please resolve merge conflicts.

@tigrannajaryan
To the best of my knowledge I've addressed all of the comments from #2694.

There are just 2 that might be worth discussing/agreeing on:


Another thing I'm looking into, which I believe is worth a separate tracking issue, is implementing this PR with a configurable number of workers. Each worker would convert entries in its own goroutine and pipe the results through a channel to the place where the aggregation happens (using the already implemented flushing mechanism).
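
To make the worker idea above concrete, here is a minimal sketch of that shape: several goroutines convert items and send them over a channel to a single aggregator that flushes a batch when it is full or when a timer fires. Everything here is an assumption for illustration: `convert`, `runPipeline`, and the `maxBatch`/`flushEvery` parameters are hypothetical names, and strings stand in for entries and converted logs. It is not the code in this PR or in the follow-up draft.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"time"
)

// convert is a hypothetical placeholder for the entry-to-log conversion step.
func convert(raw string) string { return strings.ToUpper(raw) }

// runPipeline converts inputs on `workers` goroutines and aggregates the
// results in the calling goroutine. A batch is flushed when it reaches
// maxBatch items, when flushEvery elapses, or when the input is exhausted.
func runPipeline(inputs []string, workers, maxBatch int, flushEvery time.Duration) [][]string {
	in := make(chan string)
	converted := make(chan string)

	// Workers: each converts items independently and shares no state.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for raw := range in {
				converted <- convert(raw)
			}
		}()
	}

	// Feeder: pushes the inputs, then signals the workers to stop.
	go func() {
		for _, raw := range inputs {
			in <- raw
		}
		close(in)
	}()

	// Close the output channel once every worker has finished.
	go func() {
		wg.Wait()
		close(converted)
	}()

	// Aggregator: only this goroutine touches batch, so no lock is needed.
	var batches [][]string
	var batch []string
	flush := func() {
		if len(batch) > 0 {
			batches = append(batches, batch)
			batch = nil
		}
	}
	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()
	for {
		select {
		case item, ok := <-converted:
			if !ok {
				flush() // final flush of any partial batch
				return batches
			}
			batch = append(batch, item)
			if len(batch) >= maxBatch {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}

func main() {
	batches := runPipeline([]string{"a", "b", "c", "d", "e"}, 2, 2, 50*time.Millisecond)
	total := 0
	for _, b := range batches {
		total += len(b)
	}
	fmt.Println("batches:", len(batches), "items:", total)
}
```

Note that with more than one worker the order of converted items is not guaranteed, which a real implementation would need to consider.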

@pmalek-sumo
Contributor Author

I've run the testbed tests on this PR once more and am posting the results here for reference:

( make otelcontribcol && cd testbed/ && TEST_ARGS="-run TestLog10kDPS -count 1" ./runtests.sh )
GO111MODULE=on CGO_ENABLED=0 go build -o ./bin/otelcontribcol_darwin_amd64 \
                -ldflags "-X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.GitHash=d869864db -X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.Version=v0.23.0-62-gd869864db -X go.opentelemetry.io/collector/internal/version.BuildType=release" ./cmd/otelcontribcol
=== RUN   TestLog10kDPS
=== RUN   TestLog10kDPS/OTLP
2021/03/31 15:21:03 Starting mock backend...
2021/03/31 15:21:03 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:21:03 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/OTLP/agent.log
2021/03/31 15:21:04 Agent running, pid=56338
2021/03/31 15:21:05 Starting load generator at 10000 items/sec.
2021/03/31 15:21:06 Agent RAM (RES):   0 MiB, CPU: 0.0% | Sent:     10800 items | Received:     9,700 items (3,232/sec)
2021/03/31 15:21:09 Agent RAM (RES):  45 MiB, CPU: 7.3% | Sent:     40800 items | Received:    39,800 items (6,633/sec)
2021/03/31 15:21:12 Agent RAM (RES):  48 MiB, CPU: 9.0% | Sent:     70800 items | Received:    69,900 items (7,766/sec)
2021/03/31 15:21:15 Agent RAM (RES):  48 MiB, CPU: 8.7% | Sent:    100800 items | Received:    99,900 items (8,325/sec)
2021/03/31 15:21:18 Agent RAM (RES):  49 MiB, CPU: 9.0% | Sent:    130800 items | Received:   130,000 items (8,666/sec)
2021/03/31 15:21:20 Stopped generator. Sent:    149900 items
2021/03/31 15:21:20 Gracefully terminating Agent pid=56338, sending SIGTEM...
2021/03/31 15:21:20 Stopping process monitor.
2021/03/31 15:21:20 Agent process stopped, exit code=0
2021/03/31 15:21:20 Sent and received data matches.
2021/03/31 15:21:20 Stopping mock backend...
2021/03/31 15:21:20 Stopped backend. Received:   149,900 items (8,835/sec)
=== RUN   TestLog10kDPS/filelog
2021/03/31 15:21:20 Starting mock backend...
2021/03/31 15:21:20 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:21:20 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/filelog/agent.log
2021/03/31 15:21:20 Agent running, pid=57215
2021/03/31 15:21:20 Starting load generator at 10000 items/sec.
2021/03/31 15:21:23 Agent RAM (RES):   7 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,600 items (8,866/sec)
2021/03/31 15:21:26 Agent RAM (RES):  48 MiB, CPU:16.6% | Sent:     59900 items | Received:    57,400 items (9,567/sec)
2021/03/31 15:21:29 Agent RAM (RES):  49 MiB, CPU:15.7% | Sent:     89900 items | Received:    86,800 items (9,644/sec)
2021/03/31 15:21:32 Agent RAM (RES):  49 MiB, CPU:15.7% | Sent:    119900 items | Received:   117,000 items (9,750/sec)
2021/03/31 15:21:35 Agent RAM (RES):  49 MiB, CPU:15.7% | Sent:    149900 items | Received:   147,100 items (9,806/sec)
2021/03/31 15:21:35 Stopped generator. Sent:    150000 items
2021/03/31 15:21:36 Gracefully terminating Agent pid=57215, sending SIGTEM...
2021/03/31 15:21:36 Stopping process monitor.
2021/03/31 15:21:36 Agent process stopped, exit code=0
2021/03/31 15:21:36 Sent and received data matches.
2021/03/31 15:21:36 Stopping mock backend...
2021/03/31 15:21:36 Stopped backend. Received:   150,000 items (9,713/sec)
=== RUN   TestLog10kDPS/kubernetes_containers
2021/03/31 15:21:36 Starting mock backend...
2021/03/31 15:21:36 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:21:36 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/kubernetes_containers/agent.log
2021/03/31 15:21:36 Agent running, pid=58001
2021/03/31 15:21:36 Starting load generator at 10000 items/sec.
2021/03/31 15:21:39 Agent RAM (RES):   9 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,900 items (8,967/sec)
2021/03/31 15:21:42 Agent RAM (RES):  53 MiB, CPU:44.2% | Sent:     59900 items | Received:    57,000 items (9,499/sec)
2021/03/31 15:21:45 Agent RAM (RES):  53 MiB, CPU:42.3% | Sent:     89900 items | Received:    87,000 items (9,666/sec)
2021/03/31 15:21:48 Agent RAM (RES):  54 MiB, CPU:42.7% | Sent:    119900 items | Received:   116,900 items (9,741/sec)
2021/03/31 15:21:51 Agent RAM (RES):  54 MiB, CPU:42.7% | Sent:    149900 items | Received:   146,900 items (9,793/sec)
2021/03/31 15:21:51 Stopped generator. Sent:    150000 items
2021/03/31 15:21:51 Gracefully terminating Agent pid=58001, sending SIGTEM...
2021/03/31 15:21:51 Stopping process monitor.
2021/03/31 15:21:51 Agent process stopped, exit code=0
2021/03/31 15:21:51 Sent and received data matches.
2021/03/31 15:21:51 Stopping mock backend...
2021/03/31 15:21:51 Stopped backend. Received:   150,000 items (9,772/sec)
=== RUN   TestLog10kDPS/k8s_CRI-Containerd
2021/03/31 15:21:51 Starting mock backend...
2021/03/31 15:21:51 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:21:51 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/k8s_CRI-Containerd/agent.log
2021/03/31 15:21:51 Agent running, pid=58802
2021/03/31 15:21:51 Starting load generator at 10000 items/sec.
2021/03/31 15:21:54 Agent RAM (RES):   8 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,900 items (8,967/sec)
2021/03/31 15:21:57 Agent RAM (RES):  52 MiB, CPU:38.5% | Sent:     59900 items | Received:    56,900 items (9,484/sec)
2021/03/31 15:22:00 Agent RAM (RES):  53 MiB, CPU:37.7% | Sent:     89900 items | Received:    87,100 items (9,678/sec)
2021/03/31 15:22:03 Agent RAM (RES):  53 MiB, CPU:38.3% | Sent:    119900 items | Received:   117,200 items (9,767/sec)
2021/03/31 15:22:06 Agent RAM (RES):  54 MiB, CPU:38.3% | Sent:    149900 items | Received:   147,200 items (9,813/sec)
2021/03/31 15:22:06 Stopped generator. Sent:    150000 items
2021/03/31 15:22:06 Gracefully terminating Agent pid=58802, sending SIGTEM...
2021/03/31 15:22:06 Stopping process monitor.
2021/03/31 15:22:06 Agent process stopped, exit code=0
2021/03/31 15:22:06 Sent and received data matches.
2021/03/31 15:22:06 Stopping mock backend...
2021/03/31 15:22:06 Stopped backend. Received:   150,000 items (9,776/sec)
=== RUN   TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops
2021/03/31 15:22:06 Starting mock backend...
2021/03/31 15:22:06 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:22:06 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops/agent.log
2021/03/31 15:22:06 Agent running, pid=59573
2021/03/31 15:22:06 Starting load generator at 10000 items/sec.
2021/03/31 15:22:09 Agent RAM (RES):   8 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,900 items (8,966/sec)
2021/03/31 15:22:12 Agent RAM (RES):  51 MiB, CPU:30.2% | Sent:     59900 items | Received:    56,800 items (9,467/sec)
2021/03/31 15:22:15 Agent RAM (RES):  52 MiB, CPU:31.0% | Sent:     89900 items | Received:    86,800 items (9,644/sec)
2021/03/31 15:22:18 Agent RAM (RES):  52 MiB, CPU:30.7% | Sent:    119900 items | Received:   117,000 items (9,750/sec)
2021/03/31 15:22:21 Agent RAM (RES):  52 MiB, CPU:30.6% | Sent:    149900 items | Received:   146,900 items (9,793/sec)
2021/03/31 15:22:21 Stopped generator. Sent:    150000 items
2021/03/31 15:22:22 Gracefully terminating Agent pid=59573, sending SIGTEM...
2021/03/31 15:22:22 Stopping process monitor.
2021/03/31 15:22:22 Agent process stopped, exit code=0
2021/03/31 15:22:22 Sent and received data matches.
2021/03/31 15:22:22 Stopping mock backend...
2021/03/31 15:22:22 Stopped backend. Received:   150,000 items (9,772/sec)
=== RUN   TestLog10kDPS/CRI-Containerd
2021/03/31 15:22:22 Starting mock backend...
2021/03/31 15:22:22 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 15:22:22 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/CRI-Containerd/agent.log
2021/03/31 15:22:22 Agent running, pid=60377
2021/03/31 15:22:22 Starting load generator at 10000 items/sec.
2021/03/31 15:22:25 Agent RAM (RES):   9 MiB, CPU: 0.0% | Sent:     29900 items | Received:    27,000 items (9,001/sec)
2021/03/31 15:22:28 Agent RAM (RES):  50 MiB, CPU:17.6% | Sent:     59900 items | Received:    56,700 items (9,450/sec)
2021/03/31 15:22:31 Agent RAM (RES):  51 MiB, CPU:17.7% | Sent:     89900 items | Received:    86,700 items (9,633/sec)
2021/03/31 15:22:34 Agent RAM (RES):  51 MiB, CPU:18.7% | Sent:    119900 items | Received:   116,900 items (9,741/sec)
2021/03/31 15:22:37 Agent RAM (RES):  51 MiB, CPU:17.0% | Sent:    149900 items | Received:   146,900 items (9,793/sec)
2021/03/31 15:22:37 Stopped generator. Sent:    150000 items
2021/03/31 15:22:37 Gracefully terminating Agent pid=60377, sending SIGTEM...
2021/03/31 15:22:37 Stopping process monitor.
2021/03/31 15:22:37 Agent process stopped, exit code=0
2021/03/31 15:22:37 Sent and received data matches.
2021/03/31 15:22:37 Stopping mock backend...
2021/03/31 15:22:37 Stopped backend. Received:   150,000 items (9,775/sec)
--- PASS: TestLog10kDPS (93.81s)
    --- PASS: TestLog10kDPS/OTLP (16.97s)
    --- PASS: TestLog10kDPS/filelog (15.44s)
    --- PASS: TestLog10kDPS/kubernetes_containers (15.35s)
    --- PASS: TestLog10kDPS/k8s_CRI-Containerd (15.34s)
    --- PASS: TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops (15.35s)
    --- PASS: TestLog10kDPS/CRI-Containerd (15.35s)
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/testbed/tests 95.082s

@pmalek-sumo
Contributor Author

I have also prepared a draft PR #2949 with my approach to adding a worker pool to this PR, which I believe is a better design due to separation of concerns (smaller, independent units of work, no lock contention, communication via channels). Unfortunately, in the current state of that PR there is slightly higher resource consumption.

I'm attaching the report from testbed tests:

( make otelcontribcol && cd testbed/ && TEST_ARGS="-run TestLog10kDPS -count 1" ./runtests.sh )
GO111MODULE=on CGO_ENABLED=0 go build -o ./bin/otelcontribcol_darwin_amd64 \
                -ldflags "-X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.GitHash=cd58655d5 -X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.Version=v0.23.0-63-gcd58655d5 -X go.opentelemetry.io/collector/internal/version.BuildType=release" ./cmd/otelcontribcol
=== RUN   TestLog10kDPS
=== RUN   TestLog10kDPS/OTLP
2021/03/31 16:25:28 Starting mock backend...
2021/03/31 16:25:28 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:25:28 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/OTLP/agent.log
2021/03/31 16:25:29 Agent running, pid=10119
2021/03/31 16:25:30 Starting load generator at 10000 items/sec.
2021/03/31 16:25:31 Agent RAM (RES):   0 MiB, CPU: 0.0% | Sent:     11200 items | Received:     9,900 items (3,300/sec)
2021/03/31 16:25:34 Agent RAM (RES):  46 MiB, CPU: 7.3% | Sent:     41200 items | Received:    39,900 items (6,649/sec)
2021/03/31 16:25:37 Agent RAM (RES):  48 MiB, CPU: 8.3% | Sent:     71200 items | Received:    70,000 items (7,777/sec)
2021/03/31 16:25:40 Agent RAM (RES):  49 MiB, CPU:11.3% | Sent:    101200 items | Received:   100,000 items (8,333/sec)
2021/03/31 16:25:43 Agent RAM (RES):  49 MiB, CPU:10.0% | Sent:    131200 items | Received:   130,100 items (8,673/sec)
2021/03/31 16:25:45 Stopped generator. Sent:    149900 items
2021/03/31 16:25:45 Gracefully terminating Agent pid=10119, sending SIGTEM...
2021/03/31 16:25:45 Stopping process monitor.
2021/03/31 16:25:45 Agent process stopped, exit code=0
2021/03/31 16:25:45 Sent and received data matches.
2021/03/31 16:25:45 Stopping mock backend...
2021/03/31 16:25:45 Stopped backend. Received:   149,900 items (8,804/sec)
=== RUN   TestLog10kDPS/filelog
2021/03/31 16:25:45 Starting mock backend...
2021/03/31 16:25:45 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:25:45 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/filelog/agent.log
2021/03/31 16:25:45 Agent running, pid=10176
2021/03/31 16:25:45 Starting load generator at 10000 items/sec.
2021/03/31 16:25:48 Agent RAM (RES):   7 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,600 items (8,866/sec)
2021/03/31 16:25:51 Agent RAM (RES):  48 MiB, CPU:20.2% | Sent:     59900 items | Received:    56,600 items (9,434/sec)
2021/03/31 16:25:54 Agent RAM (RES):  48 MiB, CPU:20.3% | Sent:     89900 items | Received:    86,800 items (9,644/sec)
2021/03/31 16:25:57 Agent RAM (RES):  49 MiB, CPU:20.7% | Sent:    119900 items | Received:   117,300 items (9,775/sec)
2021/03/31 16:26:00 Agent RAM (RES):  49 MiB, CPU:19.0% | Sent:    149900 items | Received:   147,300 items (9,820/sec)
2021/03/31 16:26:00 Stopped generator. Sent:    150000 items
2021/03/31 16:26:01 Gracefully terminating Agent pid=10176, sending SIGTEM...
2021/03/31 16:26:01 Stopping process monitor.
2021/03/31 16:26:01 Agent process stopped, exit code=0
2021/03/31 16:26:01 Sent and received data matches.
2021/03/31 16:26:01 Stopping mock backend...
2021/03/31 16:26:01 Stopped backend. Received:   150,000 items (9,776/sec)
=== RUN   TestLog10kDPS/kubernetes_containers
2021/03/31 16:26:01 Starting mock backend...
2021/03/31 16:26:01 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:26:01 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/kubernetes_containers/agent.log
2021/03/31 16:26:01 Agent running, pid=10228
2021/03/31 16:26:01 Starting load generator at 10000 items/sec.
2021/03/31 16:26:04 Agent RAM (RES):   7 MiB, CPU: 0.0% | Sent:     29900 items | Received:    27,000 items (9,001/sec)
2021/03/31 16:26:07 Agent RAM (RES):  54 MiB, CPU:46.8% | Sent:     59900 items | Received:    56,900 items (9,483/sec)
2021/03/31 16:26:10 Agent RAM (RES):  54 MiB, CPU:48.3% | Sent:     89900 items | Received:    87,100 items (9,678/sec)
2021/03/31 16:26:13 Agent RAM (RES):  55 MiB, CPU:46.7% | Sent:    119900 items | Received:   117,300 items (9,775/sec)
2021/03/31 16:26:16 Agent RAM (RES):  55 MiB, CPU:47.0% | Sent:    149900 items | Received:   146,800 items (9,787/sec)
2021/03/31 16:26:16 Stopped generator. Sent:    150000 items
2021/03/31 16:26:16 Gracefully terminating Agent pid=10228, sending SIGTEM...
2021/03/31 16:26:16 Stopping process monitor.
2021/03/31 16:26:16 Agent process stopped, exit code=0
2021/03/31 16:26:16 Sent and received data matches.
2021/03/31 16:26:16 Stopping mock backend...
2021/03/31 16:26:16 Stopped backend. Received:   150,000 items (9,741/sec)
=== RUN   TestLog10kDPS/k8s_CRI-Containerd
2021/03/31 16:26:16 Starting mock backend...
2021/03/31 16:26:16 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:26:16 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/k8s_CRI-Containerd/agent.log
2021/03/31 16:26:16 Agent running, pid=10278
2021/03/31 16:26:16 Starting load generator at 10000 items/sec.
2021/03/31 16:26:19 Agent RAM (RES):   8 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,900 items (8,966/sec)
2021/03/31 16:26:22 Agent RAM (RES):  53 MiB, CPU:42.2% | Sent:     59900 items | Received:    56,900 items (9,483/sec)
2021/03/31 16:26:25 Agent RAM (RES):  55 MiB, CPU:43.0% | Sent:     89900 items | Received:    86,700 items (9,633/sec)
2021/03/31 16:26:28 Agent RAM (RES):  55 MiB, CPU:41.0% | Sent:    119900 items | Received:   116,900 items (9,741/sec)
2021/03/31 16:26:31 Agent RAM (RES):  56 MiB, CPU:42.3% | Sent:    149900 items | Received:   147,100 items (9,806/sec)
2021/03/31 16:26:31 Stopped generator. Sent:    150000 items
2021/03/31 16:26:31 Gracefully terminating Agent pid=10278, sending SIGTEM...
2021/03/31 16:26:31 Stopping process monitor.
2021/03/31 16:26:32 Agent process stopped, exit code=0
2021/03/31 16:26:32 Sent and received data matches.
2021/03/31 16:26:32 Stopping mock backend...
2021/03/31 16:26:32 Stopped backend. Received:   150,000 items (9,753/sec)
=== RUN   TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops
2021/03/31 16:26:32 Starting mock backend...
2021/03/31 16:26:32 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:26:32 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops/agent.log
2021/03/31 16:26:32 Agent running, pid=10329
2021/03/31 16:26:32 Starting load generator at 10000 items/sec.
2021/03/31 16:26:35 Agent RAM (RES):   8 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,800 items (8,934/sec)
2021/03/31 16:26:38 Agent RAM (RES):  51 MiB, CPU:38.2% | Sent:     59900 items | Received:    57,000 items (9,500/sec)
2021/03/31 16:26:41 Agent RAM (RES):  52 MiB, CPU:39.6% | Sent:     89900 items | Received:    86,800 items (9,644/sec)
2021/03/31 16:26:44 Agent RAM (RES):  52 MiB, CPU:38.7% | Sent:    119900 items | Received:   117,000 items (9,750/sec)
2021/03/31 16:26:47 Agent RAM (RES):  52 MiB, CPU:39.7% | Sent:    149900 items | Received:   147,300 items (9,820/sec)
2021/03/31 16:26:47 Stopped generator. Sent:    150000 items
2021/03/31 16:26:47 Gracefully terminating Agent pid=10329, sending SIGTEM...
2021/03/31 16:26:47 Stopping process monitor.
2021/03/31 16:26:47 Agent process stopped, exit code=0
2021/03/31 16:26:47 Sent and received data matches.
2021/03/31 16:26:47 Stopping mock backend...
2021/03/31 16:26:47 Stopped backend. Received:   150,000 items (9,760/sec)
=== RUN   TestLog10kDPS/CRI-Containerd
2021/03/31 16:26:47 Starting mock backend...
2021/03/31 16:26:47 Starting Agent (/Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/bin/otelcontribcol_darwin_amd64)
2021/03/31 16:26:47 Writing Agent log to /Users/pmalek/code/opentelemetry/opentelemetry-collector-contrib-rebase/testbed/tests/results/TestLog10kDPS/CRI-Containerd/agent.log
2021/03/31 16:26:47 Agent running, pid=10379
2021/03/31 16:26:47 Starting load generator at 10000 items/sec.
2021/03/31 16:26:50 Agent RAM (RES):   8 MiB, CPU: 0.0% | Sent:     29900 items | Received:    26,800 items (8,935/sec)
2021/03/31 16:26:53 Agent RAM (RES):  50 MiB, CPU:25.6% | Sent:     59900 items | Received:    57,200 items (9,534/sec)
2021/03/31 16:26:56 Agent RAM (RES):  52 MiB, CPU:27.0% | Sent:     89900 items | Received:    87,000 items (9,667/sec)
2021/03/31 16:26:59 Agent RAM (RES):  52 MiB, CPU:25.3% | Sent:    119900 items | Received:   117,400 items (9,783/sec)
2021/03/31 16:27:02 Agent RAM (RES):  53 MiB, CPU:26.0% | Sent:    149900 items | Received:   147,200 items (9,813/sec)
2021/03/31 16:27:02 Stopped generator. Sent:    150000 items
2021/03/31 16:27:02 Gracefully terminating Agent pid=10379, sending SIGTEM...
2021/03/31 16:27:02 Stopping process monitor.
2021/03/31 16:27:02 Agent process stopped, exit code=0
2021/03/31 16:27:02 Sent and received data matches.
2021/03/31 16:27:02 Stopping mock backend...
2021/03/31 16:27:02 Stopped backend. Received:   150,000 items (9,759/sec)
--- PASS: TestLog10kDPS (93.90s)
    --- PASS: TestLog10kDPS/OTLP (17.03s)
    --- PASS: TestLog10kDPS/filelog (15.34s)
    --- PASS: TestLog10kDPS/kubernetes_containers (15.40s)
    --- PASS: TestLog10kDPS/k8s_CRI-Containerd (15.38s)
    --- PASS: TestLog10kDPS/k8s_CRI-Containerd_no_attr_ops (15.37s)
    --- PASS: TestLog10kDPS/CRI-Containerd (15.37s)
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/testbed/tests 95.274s

cc @djaglowski @tigrannajaryan

@djaglowski
Member

Worker pool aside for the moment, I do see an improvement in performance on these changes alone, and the code looks good.

I think we just need a few more tests to meet code coverage standards.

Relevant results of make e2e-test run locally on main:

Test                                    |Result|Duration|CPU Avg%|CPU Max%|RAM Avg MiB|RAM Max MiB|Sent Items|Received Items|
----------------------------------------|------|-------:|-------:|-------:|----------:|----------:|---------:|-------------:|
Log10kDPS/filelog                       |PASS  |     15s|    26.6|    28.3|         41|         48|    149900|        149900|
Log10kDPS/kubernetes_containers         |PASS  |     15s|    58.4|    59.3|         46|         55|    150000|        150000|
Log10kDPS/k8s_CRI-Containerd            |PASS  |     15s|    55.4|    56.0|         44|         52|    149900|        149900|
Log10kDPS/k8s_CRI-Containerd_no_attr_ops|PASS  |     15s|    47.8|    49.4|         44|         52|    150000|        150000|
Log10kDPS/CRI-Containerd                |PASS  |     15s|    32.4|    33.0|         42|         50|    149900|        149900|

The same on this branch:

Test                                    |Result|Duration|CPU Avg%|CPU Max%|RAM Avg MiB|RAM Max MiB|Sent Items|Received Items|
----------------------------------------|------|-------:|-------:|-------:|----------:|----------:|---------:|-------------:|
Log10kDPS/filelog                       |PASS  |     15s|    22.9|    23.9|         40|         48|    150000|        150000|
Log10kDPS/kubernetes_containers         |PASS  |     15s|    55.7|    57.0|         44|         52|    150000|        150000|
Log10kDPS/k8s_CRI-Containerd            |PASS  |     15s|    52.8|    55.4|         44|         52|    150000|        150000|
Log10kDPS/k8s_CRI-Containerd_no_attr_ops|PASS  |     15s|    42.8|    43.7|         44|         52|    150000|        150000|
Log10kDPS/CRI-Containerd                |PASS  |     15s|    25.8|    27.5|         42|         50|    149900|        149900|
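
For a quick side-by-side of the two tables, the following sketch computes the relative change in CPU Avg% per scenario. The numbers are copied from the tables above; the `run` type and the `results` and `relativeChange` helpers are hypothetical names used only for this example.

```go
package main

import "fmt"

// run holds the CPU Avg% of one testbed scenario on main and on this branch.
type run struct {
	name      string
	mainCPU   float64
	branchCPU float64
}

// results returns the CPU Avg% columns copied from the two tables above.
func results() []run {
	return []run{
		{"Log10kDPS/filelog", 26.6, 22.9},
		{"Log10kDPS/kubernetes_containers", 58.4, 55.7},
		{"Log10kDPS/k8s_CRI-Containerd", 55.4, 52.8},
		{"Log10kDPS/k8s_CRI-Containerd_no_attr_ops", 47.8, 42.8},
		{"Log10kDPS/CRI-Containerd", 32.4, 25.8},
	}
}

// relativeChange returns the change from base to next as a percentage of base.
func relativeChange(base, next float64) float64 {
	return (next - base) / base * 100
}

func main() {
	for _, r := range results() {
		fmt.Printf("%-42s %5.1f -> %5.1f (%+.1f%%)\n",
			r.name, r.mainCPU, r.branchCPU, relativeChange(r.mainCPU, r.branchCPU))
	}
}
```

These are single local runs, so the percentages should be read as indicative rather than precise.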

@tigrannajaryan
Member

@djaglowski do you mind if I assign this to you to review?

@djaglowski
Member

@tigrannajaryan No problem!

Member

@tigrannajaryan tigrannajaryan left a comment

@pmalek-sumo would it be possible for you to add an ASCII diagram that shows how Converter works, what goroutines exist, how they interact using channels? It would help tremendously to understand the logic.

@pmalek-sumo
Contributor Author

pmalek-sumo commented Apr 7, 2021

> @pmalek-sumo would it be possible for you to add an ASCII diagram that shows how Converter works, what goroutines exist, how they interact using channels? It would help tremendously to understand the logic.

I can try to sketch something up 👍

@pmalek-sumo pmalek-sumo force-pushed the issue-2330-improved-logs-translation branch from d869864 to b69f891 Compare April 9, 2021 11:39
@pmalek-sumo
Contributor Author

@djaglowski @tigrannajaryan

I've added a diagram in converter.go. I do realize that the design is not ideal, hence my proposal to pursue the worker pool approach at a later time (when this gets merged? 🤞)

Member

@djaglowski djaglowski left a comment

Looks great to me. Nice diagram!

Member

@tigrannajaryan tigrannajaryan left a comment

Thanks @pmalek-sumo

@tigrannajaryan tigrannajaryan merged commit 1638363 into open-telemetry:main Apr 9, 2021
@pmalek-sumo pmalek-sumo deleted the issue-2330-improved-logs-translation branch April 13, 2021 16:57
pmatyjasek-sumo pushed a commit to pmatyjasek-sumo/opentelemetry-collector-contrib that referenced this pull request Apr 28, 2021
Introduce an aggregation layer to internal/stanza that translates [entry.Entry](https://github.com/open-telemetry/opentelemetry-log-collection/blob/83ae56123ba0bd4cd284c3a20ed7450a606af513/entry/entry.go#L43-L51) into pdata.Logs aggregating logs coming from the same Resource into one entry.

**Link to tracking Issue:** open-telemetry#2330

**Testing:** unit tests added
alexperez52 referenced this pull request in open-o11y/opentelemetry-collector-contrib Aug 18, 2021
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.38.0 to 1.39.0.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](golangci/golangci-lint@v1.38.0...v1.39.0)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
mstumpfx referenced this pull request in mstumpfx/opentelemetry-collector-contrib Aug 31, 2021