Skip to content

chore(rfc): Datadog traces (from trace-agent) early support#9634

Merged
prognant merged 20 commits intomasterfrom
prognant/rfc/support-datadog-traces
Dec 16, 2021
Merged

chore(rfc): Datadog traces (from trace-agent) early support#9634
prognant merged 20 commits intomasterfrom
prognant/rfc/support-datadog-traces

Conversation

@prognant
Copy link
Copy Markdown
Contributor

@prognant prognant commented Oct 15, 2021

closes #9572

Quick summary of this RFC:

  • It brings context to implement a datadog traces (traces coming out of the trace-agent) source & sink
  • It suggests to implement a (relatively) vendor agnostic trace struct inside vector-core that would allow lossless Datadog traces & OTLP traces representation.

rendered

Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
@netlify
Copy link
Copy Markdown

netlify Bot commented Oct 15, 2021

✔️ Deploy Preview for vector-project canceled.

🔨 Explore the source changes: 636b05e

🔍 Inspect the deploy log: https://app.netlify.com/sites/vector-project/deploys/61b9f1c2b84e18000854b4ca

Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Comment on lines +90 to +95
* User will be able to ingest trace from the trace agent
* Vector config would then consist in: `datadog_trace` source -> some filtering/enrichment transform ->
`datadog_trace` sink
* Datadog trace agent can be configured to send traces to any arbitrary endpoint using the `apm_config.apm_dd_url`
[config key](https://github.com/DataDog/datadog-agent/blob/34a5589/pkg/config/apm.go#L61-L87)
* This change is a pure addition to Vector, there will be no impact on existing feature
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want to support the datadog_agent -> datadog_trace pipeline? And enabling trace forwarding would be similar to DataDog/datadog-agent#9450 in that the user would configure the agent like so:

vector.traces.enabled: true
vector.traces.url: http://vector.company.tld/

Copy link
Copy Markdown
Contributor Author

@prognant prognant Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this makes sense, and the other way around, if a user want to split traces ingestion from logs/metrics ingestion a separate source could still be added to the topology. This however might push for additional options to enable/disable some datatype per datadog_agent source.

Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
Signed-off-by: prognant <pierre.rognant@datadoghq.com>
@bits-bot
Copy link
Copy Markdown

bits-bot commented Nov 3, 2021

CLA assistant check
All committers have signed the CLA.

Copy link
Copy Markdown
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to understand the decision better to use log events before approving this, since the high-level UX is affected by that.

Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md
@binarylogic
Copy link
Copy Markdown
Contributor

See #9634 (comment) for my updated opinion. Once that is reflected in here I will approve.

@prognant prognant requested a review from binarylogic December 13, 2021 12:42
Comment thread rfcs/2021-10-15-9572-accept-datadog-traces.md Outdated
Comment on lines +127 to +131
To keep vector-core as generic as possible, the first implementation will decode datadog traces as `LogEvent`, the
resulting event will be deeper than usual but this should not be a problem. To keep track that those `LogEvent`s are
traces, we could add a simple `Option<bool>` to the `EventMetadata` struct. However this may not be immediately
required. While being naive this does not predates future vector evolution towards event traits by not adding a new
concrete type into the `Event` enum.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was chatting with @lukesteensen this morning about this and I think there may be some confusion here about the modeling. We were thinking that we would add a new enum type, Trace, to Event and just have it wrap LogEvent. That is:

enum Event {
  // ....
  Trace(LogEvent)
}

This would allow us to treat traces as a first-class type in Vector, but reuse most of the LogEvent implementation, for now at least.

Comment on lines +127 to +129
To keep vector-core as generic as possible, the first implementation will decode datadog traces as `LogEvent`, the
resulting event will be deeper than usual but this should not be a problem. To keep track that those `LogEvent`s are
traces, we could add a simple `Option<bool>` to the `EventMetadata` struct. However this may not be immediately
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think previously we had discussed using a new enum variant on Event to keep things straight and ensure traces cannot simply be used in all the same way logs can without us explicitly enabling it. So the current definition of Event would change to something like:

pub enum Event {
    Log(LogEvent),
    Metric(Metric),
    Trace(LogEvent),
}

This has the same benefit of reusing internals, but the additional benefit of locking down what we can do with traces in these early days. If they're treated as logs from a user or even topology perspective, we could run into weird incompatibilities or difficulty changing uses if/when we decide to implement a more specific API for traces.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, guess I should have refreshed before posting this 😄 @jszwedko was too quick.

prognant and others added 2 commits December 15, 2021 11:05
@prognant
Copy link
Copy Markdown
Contributor Author

prognant commented Dec 15, 2021

@jszwedko @lukesteensen I don't remember we spoke about this this solution, or I missed something - my fault. Anyway it is a very convenient solution to have a dedicated trace type while leveraging existing code. I amended the RFC to reflect that.

@prognant prognant changed the title chore(rfcs): Datadog traces (from trace-agent) early support chore(rfc): Datadog traces (from trace-agent) early support Dec 15, 2021
Copy link
Copy Markdown
Collaborator

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @prognant !

@prognant prognant enabled auto-merge (squash) December 15, 2021 15:55
Copy link
Copy Markdown
Contributor

@binarylogic binarylogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work

@prognant prognant merged commit 717dcac into master Dec 16, 2021
@prognant prognant deleted the prognant/rfc/support-datadog-traces branch December 16, 2021 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC] Write a RFC about ingesting trace from the Datadog trace agent

9 participants