Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: span compression bug where a buffered span would not be sent when an incompressible sibling ended #3076

Merged
merged 4 commits into from Dec 20, 2022

Conversation

trentm
Copy link
Member

@trentm trentm commented Dec 19, 2022

https://github.com/elastic/apm/blob/main/specs/agents/handling-huge-traces/tracing-spans-compress.md#span-buffering says:

A buffered span gets reported when

  1. its parent ends
  2. a non-compressible sibling ends

Before this change, case 2 was not being handled. Further, a possibly
buffered span on the ending sibling would incorrectly get sent twice.
The scenario is as follows:

   trans t0
     span s1: _bufferdSpan=s2
       span s2: ended, compressible -> buffered on s1
       span s3: incompressible, _bufferdSpan=s4
         span s4: ended, compressible -> buffered on s3

What happens when s3 ends? We expect:

  • s4 is encoded and sent
  • s2 is encoded and sent
  • s3 is encoded and sent

Before this change the agent would send: s4, s4, s3

Refs: https://discuss.elastic.co/t/apm-agent-send-duplicate-record-in-mysql/321440

…n an incompressible sibling ended

https://github.com/elastic/apm/blob/main/specs/agents/handling-huge-traces/tracing-spans-compress.md#span-buffering says:
> A buffered span gets reported when
>  1. its parent ends
>  2. a non-compressible sibling ends

Before this change, case 2 was not being handled. Further, a possibly
buffered span on the ending sibling would incorrectly get sent *twice*.
The scenario is as follows:

   trans t0
     span s1: _bufferdSpan=s2
       span s2: ended, compressible -> buffered on s1
       span s3: incompressible, _bufferdSpan=s4
         span s4: ended, compressible -> buffered on s3

What happens when s3 ends? We expect:
- s4 is encoded and sent
- s2 is encoded and sent
- s3 is encoded and sent

Before this change the agent would send: s4, s4, s3

Refs: https://discuss.elastic.co/t/apm-agent-send-duplicate-record-in-mysql/321440
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Dec 19, 2022
@elastic-apm-tech elastic-apm-tech added this to In Progress in APM-Agents (OLD) Dec 19, 2022
@trentm trentm self-assigned this Dec 19, 2022
@apmmachine
Copy link
Collaborator

apmmachine commented Dec 19, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-12-20T20:18:55.528+0000

  • Duration: 28 min 38 sec

Test stats 🧪

Test Results
Failed 0
Passed 320238
Skipped 0
Total 320238

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run module tests for <modules> : Run TAV tests for one or more modules, where <modules> can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules

  • run benchmark tests : Run the benchmark test only.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -414,24 +414,26 @@ Instrumentation.prototype.addEndedSpan = function (span) {
// if I have ended and I have something buffered, send that buffered thing
if (span.getBufferedSpan()) {
this._encodeAndSendSpan(span.getBufferedSpan())
span.setBufferedSpan(null)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REVIEW NOTE: This clearing avoids the bug where we would encode and send a span twice -- the first time here, the second time in the next if-block.

Comment on lines +422 to +425
const buffered = parentSpan && parentSpan.getBufferedSpan()
if (buffered) {
this._encodeAndSendSpan(buffered)
span.setBufferedSpan(null)
parentSpan.setBufferedSpan(null)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REVIEW NOTE: This corrects the logic to make the spec pseudo code: this block is meant to be able looking at the possibly buffered span on the parent. This fixes the case where a buffered span would not get sent if it was followed by an incompressible sibling span.

@trentm
Copy link
Member Author

trentm commented Dec 20, 2022

Here is a snippet of the output of a test run showing the newly added test failing before the fix:

# ensure a span buffered on parent is sent when incompressible span ends
ok 18 s2 is a child of s1
ok 19 s2 is buffered on s1
ok 20 s3 is a child of s1
ok 21 s4 is a child of s3
ok 22 s4 is buffered on s3
ok 23 3 events sent
ok 24 3 spans sent
not ok 25 sent spans are s4,s2,s3; in that order
  ---
    operator: equal
    expected: 's4,s2,s3'
    actual:   's4,s4,s3'
    at: <anonymous> (/Users/trentm/el/apm-agent-nodejs6/test/instrumentation/span-compression.test.js:185:9)
    stack: |-
      Error: sent spans are s4,s2,s3; in that order
          at Test.assert [as _assert] (/Users/trentm/el/apm-agent-nodejs6/node_modules/tape/lib/test.js:312:48)
          at Test.bound [as _assert] (/Users/trentm/el/apm-agent-nodejs6/node_modules/tape/lib/test.js:95:17)
          at Test.strictEqual (/Users/trentm/el/apm-agent-nodejs6/node_modules/tape/lib/test.js:476:7)
          at Test.bound [as equals] (/Users/trentm/el/apm-agent-nodejs6/node_modules/tape/lib/test.js:95:17)
          at /Users/trentm/el/apm-agent-nodejs6/test/instrumentation/span-compression.test.js:185:9
          at Timeout._onTimeout (/Users/trentm/el/apm-agent-nodejs6/test/_mock_http_client.js:80:7)
          at listOnTimeout (node:internal/timers:559:17)
          at processTimers (node:internal/timers:502:7)
  ...

1..25
# tests 25
# pass  24
# fail  1

@trentm trentm merged commit 01effc5 into main Dec 20, 2022
@trentm trentm deleted the trentm/span-compression-dupe-spans branch December 20, 2022 21:48
APM-Agents (OLD) automation moved this from In Progress to Done Dec 20, 2022
@trentm trentm mentioned this pull request Dec 21, 2022
@trentm
Copy link
Member Author

trentm commented Dec 21, 2022

elastic-apm-node@3.41.1 is published with this fix now.

v1v added a commit to v1v/apm-agent-nodejs that referenced this pull request Jan 16, 2023
…lemetry-actions

* upstream/main: (148 commits)
  chore(deps): bump jsonwebtoken and @azure/msal-node (elastic#3087)
  fix bitrot.js dev-util to work for Next.js versions (elastic#3086)
  synchronize json schema specs (elastic#3082)
  chore(deps): bump json5 from 1.0.1 to 1.0.2 (elastic#3085)
  synchronize json schema specs (elastic#3078)
  chore(deps-dev): bump fastify from 4.10.2 to 4.11.0 (elastic#3083)
  chore(deps): bump next (elastic#3081)
  docs: fix header name with sampled flag (elastic#3069)
  3.41.1 (elastic#3077)
  fix: span compression bug where a buffered span would not be sent when an incompressible sibling ended (elastic#3076)
  chore(deps-dev): bump wait-on from 6.0.1 to 7.0.1 (elastic#3075)
  chore(deps-dev): bump undici from 5.12.0 to 5.14.0 (elastic#3068)
  chore(deps-dev): bump got from 11.8.5 to 11.8.6 (elastic#3067)
  chore(deps-dev): bump koa from 2.13.4 to 2.14.1 (elastic#3066)
  chore(deps-dev): bump @hapi/hapi from 21.0.0 to 21.1.0 (elastic#3058)
  chore(deps-dev): bump @fastify/formbody from 7.3.0 to 7.4.0 (elastic#3057)
  3.41.0 (elastic#3064)
  Support publishing a snapshot build for each commit to main (elastic#3050)
  fix: Add `tracestate` to the `TransactionOptions` TypeScript type (elastic#3063)
  fix: avoid IPv4 vs IPv6 ambiguity in default 'serverUrl' by using '127.0.0.1' rather than 'localhost' (elastic#3049)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-nodejs Make available for APM Agents project planning.
Projects
Development

Successfully merging this pull request may close these issues.

None yet

2 participants