Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUM-699 fix: RUM context not being linked to started span #1615

Merged

Conversation

ncreated
Copy link
Collaborator

@ncreated ncreated commented Jan 5, 2024

What and why?

🐞 Fixes the bug of RUM information not being linked correctly on tracer.startSpan().

Screenshot 2024-01-05 at 14 31 46

How?

The problem was introduced in V2 where we started sending RUM information in DatadogContext by propagating its updates on message bus. Because message bus adds one more asynchronous call, the information was not available in to tracer at the right time:

rumMonitor.startView(key: "view", name: "FooView")
let span = tracer.startSpan(operationName: "span") // <--- here, we haven't yet 
                                                   // seen Foo's `view.id` in `DatadogContext`
// ...
span.finish()

Under the hood, the propagation model was this:

  • (in RUM) set RUM info in DatadogContext (async on context queue)
  • (in Core) broadcast DatadogContext update to all modules (async on message bus queue)
  • (in Trace) receive DatadogContext and store RUM info in MessageReceiver
  • (in Trace) on starting span: capture received RUM info from MessageReceiver
  • (in Trace) on finishing span: create & write span (async on context queue)

It is now simplified and changed to:

  • (in RUM) set RUM info in DatadogContext (async on context queue)
  • (in Trace) on starting span: capture latest DatadogContext (async on context queue)
  • (in Trace) on finishing span: create & write span with DatadogContext captured on starting it (async on context queue)

Ultimately, RUM context is no longer passed to Trace through message bus and MessageReceiver. Instead, it is queried by DatadogTrace with the new capability added to FeatureScope:

public protocol FeatureScope {
    /// Retrieve the core context.
    ///
    /// A feature can use this method to request the Datadog context valid at the moment of the call.
    ///
    /// - Parameter block: The block to execute; it is called on the context queue.
    func context(_ block: @escaping (DatadogContext) -> Void)

    // ...
}

🎁 Part of the requirement to enable this work was removing tracer.queue and replacing it with @ReadWriteLock to synchronise tracer and span states. According to my benchmark measures, there is no performance change coming from this PR (both in Tracer and in DatadogCore).

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference
  • Add CHANGELOG entry for user facing changes

Custom CI job configuration (optional)

  • Run unit tests for Core, RUM, Trace, Logs, CR and WVT
  • Run unit tests for Session Replay
  • Run integration tests
  • Run smoke tests
  • Run tests for tools/

@ncreated ncreated self-assigned this Jan 5, 2024
@ncreated ncreated force-pushed the ncreated/RUM-699/fix-rum-context-not-associating-with-spans branch from 8d089ce to 271509b Compare January 5, 2024 13:30
@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Jan 5, 2024

Datadog Report

Branch report: ncreated/RUM-699/fix-rum-context-not-associating-with-spans
Commit report: 2ecfae1
Test service: dd-sdk-ios

✅ 0 Failed, 2739 Passed, 0 Skipped, 12m 40.75s Wall Time
🔻 Test Sessions change in coverage: 9 decreased, 5 increased

🔻 Code Coverage Decreases vs Default Branch (9)

This report shows up to 5 code coverage decreases.

  • test DatadogLogsTests iOS 44.73% (Δ-0.16%) - Details
  • test DatadogLogsTests tvOS 44.80% (Δ-0.15%) - Details
  • test DatadogInternalTests tvOS 81.21% (Δ-0.06%) - Details
  • test DatadogRUMTests iOS 79.38% (Δ-0.03%) - Details
  • test DatadogWebViewTrackingTests iOS 23.05% (Δ-0.02%) - Details

@ncreated ncreated marked this pull request as ready for review January 5, 2024 13:59
@ncreated ncreated requested review from a team as code owners January 5, 2024 13:59
@ncreated ncreated marked this pull request as draft January 5, 2024 14:02
@ncreated ncreated force-pushed the ncreated/RUM-699/fix-rum-context-not-associating-with-spans branch from 271509b to 1845541 Compare January 5, 2024 14:10
@ncreated ncreated marked this pull request as ready for review January 5, 2024 14:10
Comment on lines -323 to -329
// Still on context thread: send `Writer` to EWC caller. The writer implements `AsyncWriter`, so
// the implementation of `writer.write(value:)` will run asynchronously without blocking the context thread.
do {
try block(context, writer)
} catch {
telemetry.error("Failed to execute feature scope", error: error)
}
Copy link
Collaborator Author

@ncreated ncreated Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is intentional and makes the block() not throwing. Rationale is: the core being agnostic has no idea on how to handle or recover from the error that occurs in block passed to Feature (like RUM or SR). We had zero occurrence of "Failed to execute feature scope" in telemetry as no existing implementation of this block actually throws.

maxep
maxep previously approved these changes Jan 10, 2024
Copy link
Member

@maxep maxep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! so glad to see the tracer queue removed 🎉

ganeshnj
ganeshnj previously approved these changes Jan 11, 2024
Copy link
Contributor

@ganeshnj ganeshnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good cleanup. Nice work.

@ncreated ncreated dismissed stale reviews from ganeshnj and maxep via 947f8b8 January 11, 2024 09:12
@ncreated ncreated force-pushed the ncreated/RUM-699/fix-rum-context-not-associating-with-spans branch from 1845541 to 947f8b8 Compare January 11, 2024 09:12
@ncreated ncreated force-pushed the ncreated/RUM-699/fix-rum-context-not-associating-with-spans branch from 7efade8 to 2ecfae1 Compare January 11, 2024 09:45
@ncreated ncreated merged commit 3686259 into develop Jan 11, 2024
8 checks passed
@ncreated ncreated deleted the ncreated/RUM-699/fix-rum-context-not-associating-with-spans branch January 11, 2024 10:15
@ncreated ncreated mentioned this pull request Jan 25, 2024
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants