@duncanpharvey duncanpharvey commented Mar 28, 2025

What does this PR do?

Migrate trace-mini-agent crate from libdatadog to this repo.

Motivation

Develop library crates used solely by Serverless independently from libdatadog.

https://datadoghq.atlassian.net/browse/SVLS-6543

Additional Notes

  • Renamed from trace-mini-agent to trace-agent

Commands used to migrate the crate, along with its commit history, from libdatadog:

cd libdatadog
git subtree split -P trace-mini-agent -b duncan-harvey/split-trace-mini-agent
git checkout duncan-harvey/split-trace-mini-agent
git push

cd serverless-components
git checkout -b duncan-harvey/trace-mini-agent
git subtree add -P crates/trace-mini-agent git@github.com:DataDog/libdatadog.git duncan-harvey/split-trace-mini-agent
git push

Describe how to test/QA your changes

Update trace-agent dependency in Cargo.toml to reference the latest commit in this feature branch.

trace-agent = { git = "https://github.com/DataDog/serverless-components/", rev = "b59a13734db60b36300fcfa2528760d3b3d6b18c" }

For testing Azure Functions I updated the dependency in the datadog-serverless-trace-mini-agent crate, built the libdatadog binaries, deployed them to Azure, and confirmed both custom and runtime metrics are sent to Datadog successfully.

thedavl and others added 30 commits April 4, 2023 16:08
* Implemented tracing and agent sampling in sidecar

* Address CR feedback

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

* Polyfill memfd on old glibc targets

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

* Small nit from CR applied

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>

---------

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
* Explicitly mark all tests where MIRI doesn't run

* Update the github action to run miri on the entire library
* Only setting the timeout to 0 ms is not enough, and there was a race
  in the test between the dns lookup and the timeout that intermittently
  failed the test (most likely on Windows).
duncanpharvey and others added 18 commits October 15, 2024 10:37
* add span tags for azure spring apps

* fix unit test
* feat: Prefer DD_PROXY_HTTPS over HTTPS_PROXY

* fix: no proxy on ints

* fix: clippy thx
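
The precedence described in this commit can be sketched as below. This is a minimal illustration, not the crate's actual code: the function names are hypothetical, and treating an empty value as unset is an assumption. The lookup is passed in as a closure so the precedence can be exercised without touching the process environment.

```rust
/// Resolve the HTTPS proxy URL, preferring `DD_PROXY_HTTPS` over `HTTPS_PROXY`.
/// `lookup` stands in for an environment lookup (hypothetical helper shape).
fn resolve_https_proxy<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    // Variables are checked in priority order; the first non-empty value wins.
    // Assumption: an empty or whitespace-only value counts as "not set".
    ["DD_PROXY_HTTPS", "HTTPS_PROXY"]
        .iter()
        .filter_map(|name| lookup(*name))
        .find(|value| !value.trim().is_empty())
}

/// Convenience wrapper that reads the real process environment.
fn https_proxy_from_env() -> Option<String> {
    resolve_https_proxy(|name| std::env::var(name).ok())
}
```

With both variables set, `resolve_https_proxy` returns the `DD_PROXY_HTTPS` value; `HTTPS_PROXY` is used only as a fallback.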
These features will help us migrate to 1.x, as suggested in the
migration guide: https://hyper.rs/guides/1/upgrading/ .
* fix: potentially dangling temporary

I was trying out the Rust 2024 edition which required a new nightly.
There's now a warning for dangling_pointers_from_temporaries. This
could be a false positive but I am not certain.

* test: skip problematic miri tests on MacOS

These are all due to kqueue, example:

```
error: unsupported operation: can't call foreign function `kqueue` on OS `macos`
   --> /levi.morrison/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-1.0.2/src/sys/unix/selector/kqueue.rs:76:48
    |
76  |         let kq = unsafe { OwnedFd::from_raw_fd(syscall!(kqueue())?) };
    |                                                ^^^^^^^^^^^^^^^^^^ can't call foreign function `kqueue` on OS `macos`
```

* test: fix env test when running locally

If the local env has this set, it fails.
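
For context on the `dangling_pointers_from_temporaries` warning mentioned above: the commit does not show the flagged code, so the sketch below only illustrates the general pattern the lint targets and the usual fix. The function name `name_len` is hypothetical.

```rust
use std::ffi::{CStr, CString};

/// Returns the byte length of `name` as seen through a C string pointer.
fn name_len(name: &str) -> usize {
    // Flagged pattern (do not do this): the CString is a temporary that is
    // dropped at the end of the statement, so the pointer would dangle.
    //     let ptr = CString::new(name).unwrap().as_ptr();

    // Fix: bind the owning CString to a local so it outlives the pointer.
    let owned = CString::new(name).expect("name must not contain NUL bytes");
    let ptr = owned.as_ptr();

    // SAFETY: `ptr` points into `owned`, which is alive until the end of
    // this function and is NUL-terminated by construction.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}
```

Binding the owner to a named local is enough to silence the lint when the warning is a true positive; a false positive would need a closer look at the actual call site.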
info is too noisy for customers
…gle Cloud Functions (#770)

* Add _dd.gcrfx.resource_name to mini agent for Google Cloud Functions

* fix lint

* fix lint

* add create_test_gcp_span

* add create_test_gcp_span

* update test_process_trace

* update tests

* simplify function

* reformat file

* reformat file

* FIX TESTS!!!!!!

* move helpers function

* move helpers function

* lint

* update code from comments

* update enrich_span_with_google_cloud_function_metadata

* lint

* update tag to remove _dd prefix

* lint

* update enrich_span_with_google_cloud_function_metadata to todo instead of return
- APMSP-1756 add initial integration tests for trace exporter
- APMSP-1779 add span deserialization and Payload collection construction to send data integration tests
- Return the offending key in the error message when decoding v04 msgpack spans and we encounter an invalid key
- Fix confusing var name in dogstatsd client from PR #890
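
The "return the offending key" change can be illustrated with a small sketch. The type names and the error wording here are assumptions for illustration, not the decoder's real API; only the idea (carry the unknown key inside the error) comes from the commit message.

```rust
use std::fmt;
use std::str::FromStr;

/// A few span field keys (illustrative subset).
#[derive(Debug, PartialEq)]
enum SpanKey {
    Service,
    Name,
    Resource,
}

/// Error that carries the key that failed to decode (hypothetical type).
#[derive(Debug, PartialEq)]
struct InvalidKeyError {
    key: String,
}

impl fmt::Display for InvalidKeyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid span key: {}", self.key)
    }
}

impl FromStr for SpanKey {
    type Err = InvalidKeyError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "service" => Ok(SpanKey::Service),
            "name" => Ok(SpanKey::Name),
            "resource" => Ok(SpanKey::Resource),
            // Keep the offending key so the caller can report it.
            other => Err(InvalidKeyError { key: other.to_string() }),
        }
    }
}
```

A misspelled key such as `"servce"` then shows up verbatim in the error message instead of a generic "invalid key" failure.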
* add `aggregator.rs`

adds an aggregator which gives batches limited by intake payload size

* update `TraceFlusher` to use the `TraceAggregator`

* make `MiniAgent` use `TraceAggregator` on `TraceFlusher`

* fmt + clippy

* add license
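
A size-limited batching aggregator of the kind described above might look roughly like this. Only the `TraceAggregator` name comes from the commit; the method names, the raw-bytes payload type, and the choice to emit an oversized payload on its own are assumptions made for the sketch.

```rust
use std::collections::VecDeque;

/// Queues encoded payloads and hands them out in size-limited batches.
struct TraceAggregator {
    queue: VecDeque<Vec<u8>>,
    /// Maximum total bytes per batch (the real intake limit is not stated here).
    max_batch_bytes: usize,
}

impl TraceAggregator {
    fn new(max_batch_bytes: usize) -> Self {
        TraceAggregator { queue: VecDeque::new(), max_batch_bytes }
    }

    /// Enqueue one encoded payload.
    fn add(&mut self, payload: Vec<u8>) {
        self.queue.push_back(payload);
    }

    /// Pop payloads in FIFO order until the next one would exceed the limit.
    /// Assumption: a single payload larger than the limit is returned alone
    /// rather than being dropped or blocking the queue.
    fn get_batch(&mut self) -> Vec<Vec<u8>> {
        let mut batch = Vec::new();
        let mut size = 0;
        loop {
            let next_len = match self.queue.front() {
                Some(payload) => payload.len(),
                None => break,
            };
            if !batch.is_empty() && size + next_len > self.max_batch_bytes {
                break;
            }
            if let Some(payload) = self.queue.pop_front() {
                size += next_len;
                batch.push(payload);
            }
        }
        batch
    }
}
```

A flusher would call `get_batch` repeatedly until it returns an empty batch, sending each batch as one intake request.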
* Add v05 decoding.

* Integrate v05 decoding/encoding in the trace exporter.

* Use TraceChunkSpan as a common placeholder for v04 and v05 incoming payloads.
* Add support for V05 payloads in TracerPayloadCollection.

* Refactor send and send_deser_ser.

* Prevent invalid modes combination in the builder in order to avoid panicking.

* Improve error handling in collect_trace_chunks_function.

* Enable conversion from v04 to v05.

* Add integration tests for v05 format.

* Solve PR comments.
Add clippy warnings and allows for panic macros to most crates

Add an extension crate for mutex in ddcommon to isolate the unwrap to one location in code to avoid the allow annotations. We aren't going to stop unwrapping mutexes anytime soon. 

Replace use of lazy_static with OnceLock for ddcommon, ddtelemetry, live-debugger, tools, sidecar
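
The two patterns named above, an extension trait that confines the mutex `unwrap` to one place and a `OnceLock` in place of `lazy_static`, can be sketched together. The trait and method names and the `registry`/`record` helpers are illustrative assumptions, not necessarily what ddcommon exposes.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard, OnceLock};

/// Extension trait so the single allowed `unwrap` lives in one place.
trait MutexExt<T> {
    fn lock_or_panic(&self) -> MutexGuard<'_, T>;
}

impl<T> MutexExt<T> for Mutex<T> {
    // The only spot that needs the allow annotation.
    #[allow(clippy::unwrap_used)]
    fn lock_or_panic(&self) -> MutexGuard<'_, T> {
        self.lock().unwrap()
    }
}

/// Lazily initialized global, using `OnceLock` instead of `lazy_static!`.
fn registry() -> &'static Mutex<HashMap<String, u64>> {
    static REGISTRY: OnceLock<Mutex<HashMap<String, u64>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Increment and return the counter stored under `name`.
fn record(name: &str) -> u64 {
    let mut map = registry().lock_or_panic();
    let count = map.entry(name.to_string()).or_insert(0);
    *count += 1;
    *count
}
```

Call sites then use `lock_or_panic()` and need no lint annotations of their own.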
…ptionnal (#929)

# What does this PR do?

* Add a new feature "mini_agent" on trace_utils and do not enable it by default. This is going to be used to progressively separate code used only by the mini agent from common code
* Hide flate2 usage under the compression flag
* Hide hyper-proxy dependency under the proxy feature flag
# What does this PR do?

Update all direct dependencies from hyper 0.14 to hyper 1.x

The path of migration is not always obvious:
* The hyper::Client struct was dropped
  * Migration: use hyper_util::client::legacy
* hyper::Body was dropped in favor of a trait that people are free to implement
  * Migration: This one is hard to fix. hyper::Body was a unifying struct over multiple behaviors (incoming and outgoing requests; empty, single, and streaming bodies).
    To match the previous behavior, I implemented hyper_migration::Body, an enum abstracting over all the listed use cases.
    This type is used everywhere to unify the Body types used in libdatadog
* hyper::Server was dropped in favor of a simpler, connection oriented server 
  * Migration: implement the accept/spawn/respond task loop ourselves. This is the part where I am least sure we keep the previous behavior. Looking at the hyper 0.14 code, I don't think the migrated mini-agent code is missing anything, but without integration tests it's hard to be sure.
* Because of the migration, hyper_util::Error is less descriptive than hyper::Error, which makes migration of the TraceExporterError not easy. We try to unwrap and find the source error as much as we can to preserve the behavior, and use anyhow to store the source error
* hyper_proxy, used in trace-utils for the mini-agent, has not migrated to hyper 1 (and is generally unmaintained)
  * Migration: Use `hyper-http-proxy`, a fork that did the update. We could also vendor the code because it's pretty small
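
To make the "one enum over empty, single, and streaming bodies" idea concrete, here is a deliberately simplified, std-only sketch. It is not the real `hyper_migration::Body`: that type works with hyper 1.x's asynchronous body trait, whereas this one uses a plain iterator for the streaming case, and every name below is illustrative.

```rust
/// Simplified stand-in for a unifying body type (synchronous, std-only).
enum Body {
    /// No payload at all.
    Empty,
    /// A fully buffered payload.
    Single(Vec<u8>),
    /// A payload delivered in chunks; a real implementation would be async.
    Stream(Box<dyn Iterator<Item = Vec<u8>>>),
}

impl Body {
    fn empty() -> Self {
        Body::Empty
    }

    fn from_bytes<B: Into<Vec<u8>>>(bytes: B) -> Self {
        Body::Single(bytes.into())
    }

    /// Collapse any variant into one contiguous buffer.
    fn into_bytes(self) -> Vec<u8> {
        match self {
            Body::Empty => Vec::new(),
            Body::Single(bytes) => bytes,
            Body::Stream(chunks) => chunks.flatten().collect(),
        }
    }
}
```

Callers can accept a single `Body` type and stay agnostic about which variant they were handed, which is the property the migration relies on.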


Sadly, we're not completely free of hyper 0.14, because httpmock, a dev-dependency of the data-pipeline crate, has not yet migrated to hyper 1.
# What does this PR do?

This refactor splits the logic in `collect_trace_chunks` between the trace exporter spans (v04 and v05) and the mini agent spans (pb::Spans).
It completely removes usage of the `TraceCollection` struct from data-pipeline, and instead introduces the `TraceChunks` enum to differentiate between v04 and v05.
 
Currently, the way the code is structured makes replacing ByteString with a slice harder because of the shared lifetime.
Furthermore, the enum encodes two different code paths, span bytes and pb spans, which never interact with each other. So having functions that handle both span bytes and pb spans is pure complexity overhead.
This refactor also removes a bunch of panics and lines of code that existed only to handle the "fake" overlap between pb spans and trace exporter spans, which in practice never happens.

Lastly, this removes the TracerParams struct. Every use of it created the struct and then immediately invoked `TryInto<TracerCollection>` on it, so replacing it with a simple function is a lot less complex for the same feature set.
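
As a rough sketch of the shape this refactor describes: an enum with one variant per wire format, built by a plain function rather than a params struct plus `TryInto`. The span structs, the function signature, and the dictionary layout for v05 below are assumptions for illustration; only the `TraceChunks` and `collect_trace_chunks` names appear in this PR.

```rust
use std::collections::HashMap;

/// v04-style span: strings stored inline (illustrative subset of fields).
struct SpanV04 {
    name: String,
    service: String,
}

/// v05-style span: fields are indices into a shared string dictionary.
struct SpanV05 {
    name: u32,
    service: u32,
}

/// One variant per format, so the two code paths never mix.
enum TraceChunks {
    V04(Vec<Vec<SpanV04>>),
    V05((Vec<String>, Vec<Vec<SpanV05>>)),
}

/// Return the dictionary index for `s`, adding it on first sight.
fn intern(dict: &mut Vec<String>, index: &mut HashMap<String, u32>, s: &str) -> u32 {
    if let Some(&i) = index.get(s) {
        return i;
    }
    let i = dict.len() as u32;
    dict.push(s.to_string());
    index.insert(s.to_string(), i);
    i
}

/// Plain function in place of a params struct plus `TryInto` (hypothetical signature).
fn collect_trace_chunks(traces: Vec<Vec<SpanV04>>, use_v05_format: bool) -> TraceChunks {
    if !use_v05_format {
        return TraceChunks::V04(traces);
    }
    let mut dict = Vec::new();
    let mut index = HashMap::new();
    let chunks: Vec<Vec<SpanV05>> = traces
        .into_iter()
        .map(|chunk| {
            chunk
                .into_iter()
                .map(|span| SpanV05 {
                    name: intern(&mut dict, &mut index, &span.name),
                    service: intern(&mut dict, &mut index, &span.service),
                })
                .collect()
        })
        .collect();
    TraceChunks::V05((dict, chunks))
}
```

Because the format is chosen once, at construction, downstream code matches on the variant and never has to handle a mixed or "fake" case.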

# Motivation

Prepare for using `SpanSlice<'a>` instead of `SpanBytes` in the trace exporter.
…a1c474074a4367f'

git-subtree-dir: crates/trace-mini-agent
git-subtree-mainline: 1be056e
git-subtree-split: 94f84ec
@duncanpharvey duncanpharvey force-pushed the duncan-harvey/trace-mini-agent branch 2 times, most recently from 85c95b2 to d9dc949 Compare April 16, 2025 17:50
@duncanpharvey duncanpharvey force-pushed the duncan-harvey/trace-mini-agent branch from d9dc949 to b59a137 Compare April 16, 2025 17:54
@duncanpharvey duncanpharvey changed the title Migrate trace-mini-agent from libdatadog Migrate trace-mini-agent from libdatadog to serverless-components as trace-agent Apr 16, 2025
@duncanpharvey duncanpharvey marked this pull request as ready for review April 18, 2025 15:21
@@ -0,0 +1,33 @@
[package]
name = "trace-agent"
description = "A subset of the trace agent that is shipped alongside tracers in serverless environments"
Contributor

Let's either remove it or shorten it to end with "in serverless environments", dropping "alongside tracers".

Collaborator Author

Removed!

@duncanpharvey duncanpharvey merged commit 7867680 into main Apr 18, 2025
18 checks passed
@duncanpharvey duncanpharvey deleted the duncan-harvey/trace-mini-agent branch April 18, 2025 19:02