Build libdatadog using builder crate#24

Closed
hoolioh wants to merge 9 commits into main from julio/use-builder-cargo-install

@hoolioh hoolioh commented Mar 27, 2026

Build libdatadog using builder crate

Why?

The previous approach hardcoded SHA256 checksums and download URLs for each platform release in LIB_GITHUB_RELEASES (Rakefile), required the http gem, and needed manual updates on every libdatadog release.

Two parallel PRs tackled this independently: #24 introduced the full build-from-source + pre-built download workflow with supply-chain hardening, and #33 introduced a cleaner modular structure using namespaced tasks, Pathname-based path handling, and a dedicated tasks/ directory. This PR consolidates both, taking the best of each.

What does this PR do?

Replaces the old fetch-and-extract workflow with two complementary tasks:

libdatadog:build — builds libdatadog from source for the current platform using libdatadog's own builder crate. Requires a Rust toolchain.

  • Fetches the builder via cargo install --rev <LIB_COMMIT_SHA> --locked (pinned to a commit SHA, not a mutable tag, for supply-chain safety)
  • Supports local development via LIBDATADOG_SOURCE=/path/to/libdatadog
  • Supports feature overrides via LIBDATADOG_FEATURES
  • Outputs artifacts directly to vendor/libdatadog-<version>/<platform>/

fetch_release_artifacts — downloads pre-built tarballs from the GitHub release for all supported platforms into vendor/. Used by CI during publish; no Rust required.

The package task is now driven by the gemspec, which automatically includes all files found under vendor/ for the supported platforms. The hardcoded per-platform gem list in push_to_rubygems is replaced with a Dir.glob.

Structure (from #33)

Build logic lives in tasks/build.rake with a modular BuildFromSource structure (Target, Paths, Builder sub-modules) using Pathname throughout. Tasks are namespaced under libdatadog: (libdatadog:build, libdatadog:clean), keeping the top-level Rakefile focused on packaging and release. The libdatadog:clean task removes intermediates (tmp/) and vendor output in one step.
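A minimal sketch of what a Pathname-based paths helper could look like. The module and method names (`PathsSketch`, `vendor_dir`, `tmp_dir`) and the version string in the usage line are illustrative assumptions, not the names used in tasks/build.rake; only the `vendor/libdatadog-<version>/<platform>/` layout and the `tmp/` directory come from the description above.

```ruby
require "pathname"

# Hypothetical module; names are assumptions for illustration only.
module PathsSketch
  module_function

  # <root>/vendor/libdatadog-<version>/<platform>
  def vendor_dir(root, version, platform)
    Pathname(root).join("vendor", "libdatadog-#{version}", platform)
  end

  # <root>/tmp, the intermediates directory a clean task would remove.
  def tmp_dir(root)
    Pathname(root).join("tmp")
  end
end

puts PathsSketch.vendor_dir("/project", "1.2.3", "aarch64-linux")
```

Keeping paths as `Pathname` objects until the last moment avoids scattered `File.join` calls and makes helpers such as `#exist?` and `#children` available to the tasks.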

Security (from #24)

cargo install uses --rev <LIB_COMMIT_SHA> --locked rather than a mutable git tag, ensuring the builder binary is reproducibly pinned to a specific commit. LIB_COMMIT_SHA is tracked alongside LIB_VERSION in version.rb.
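Because the pin only helps if the value really is a commit, a guard along these lines could reject tags, branch names, and abbreviated hashes before they reach `cargo install --rev`. This is a sketch, not part of the PR: the constant and method names are hypothetical, and it assumes 40-character SHA-1 object names.

```ruby
# Hypothetical guard: a full 40-character lowercase hexadecimal commit SHA.
FULL_SHA_PATTERN = /\A[0-9a-f]{40}\z/

# Returns true only for a complete commit SHA; tags, branch names and
# abbreviated hashes are rejected.
def full_commit_sha?(value)
  FULL_SHA_PATTERN.match?(value.to_s)
end

puts full_commit_sha?("v1.0.0")
```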

How to test the change?

Build from source (requires Rust 1.84.1+):

bundle exec rake libdatadog:build

Build from a local libdatadog checkout:

LIBDATADOG_SOURCE=/path/to/libdatadog bundle exec rake libdatadog:build

Download pre-built artifacts and package the gem locally:

bundle exec rake package_from_github

Clean up intermediates and vendor output:

bundle exec rake libdatadog:clean

Additional Notes

  • Drops the http gem dependency; uses stdlib net/http instead
  • Rust toolchain is only needed for libdatadog:build; all other tasks including the default test suite work without it
  • LIB_COMMIT_SHA added to version.rb alongside LIB_VERSION; both must be updated on each libdatadog release
  • Nix dev shell updated with rustc, cargo, cmake, and supporting build tools
  • Consolidates #24 (Build libdatadog using builder crate) and #33 (Build using libdatadog's builder crate)

@hoolioh hoolioh changed the title from "Use builder crate to build the artifacts." to "Build libdatadog using builder crate" on Mar 27, 2026
@hoolioh hoolioh force-pushed the julio/use-builder-cargo-install branch from c27ac53 to 66e9cd0 on April 21, 2026 10:35
@hoolioh hoolioh marked this pull request as ready for review on April 21, 2026 11:11
@hoolioh hoolioh requested a review from a team as a code owner on April 21, 2026 11:11
Comment thread spec/gem_packaging.rb Outdated
  it "prefixes all public symbols in .so files" do
    so_files = Dir.glob("vendor/libdatadog-#{Libdatadog::LIB_VERSION}/**/*.so")
-   expect(so_files.size).to be 4
+   expect(so_files.size).to be >= 1
Member

Something feels off here, I don't think we'd want to be dynamic, as we want to make sure we package the expected files.

Pinning this to 4 acts as a proxy for checking the exact files; we should probably make that a separate, dedicated test.

Author

I've modified it to expect one file for each supported target.
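One way to express "one file for each supported target" is to check an explicit platform list rather than counting whatever the glob finds. The following is only a sketch: the helper name `platforms_missing_shared_lib` and the version string used in the usage note are hypothetical, and the directory layout is assumed from the `vendor/libdatadog-<version>/<platform>/` shape in the PR description.

```ruby
# Hypothetical helper: returns the platforms from an explicit list that have
# no .so or .dylib anywhere under vendor/libdatadog-<version>/<platform>/.
def platforms_missing_shared_lib(root, version, platforms)
  platforms.reject do |platform|
    pattern = File.join(
      root, "vendor", "libdatadog-#{version}", platform, "**", "*.{so,dylib}"
    )
    Dir.glob(pattern).any?
  end
end
```

A spec could then assert that `platforms_missing_shared_lib(Dir.pwd, version, expected_platforms)` is empty, which fails with the names of the missing platforms instead of a bare count mismatch.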

Comment thread spec/libdatadog_spec.rb Outdated
context "for the current platform" do
let(:current_platform) do
platform = Gem::Platform.local.to_s
platform = platform[0..-5] if platform.end_with?("-gnu")
Member

Hmm, looks like some normalisation slipped through.

Member

Minor: I don't quite understand why this is needed -- we already had some tests for when the prefix ends with -gnu; why did this part need updating? 👀
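For reference, the slice in the excerpt (`platform[0..-5]` when the string ends with `-gnu`) removes exactly the four-character suffix. A sketch of the same normalisation using `String#delete_suffix` (Ruby 2.5+) follows; the method name `strip_gnu_suffix` is hypothetical and does not appear in the PR.

```ruby
# Hypothetical helper: same effect as
#   platform = platform[0..-5] if platform.end_with?("-gnu")
# but without the magic index. Strings without the suffix are returned unchanged.
def strip_gnu_suffix(platform)
  platform.delete_suffix("-gnu")
end

puts strip_gnu_suffix("aarch64-linux-gnu")
puts strip_gnu_suffix("arm64-darwin")
```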

Comment thread Rakefile Outdated
Comment on lines +38 to +39
rustc_output = `rustc -vV`
raise "rustc not found or failed" unless $?.success?
Member

I see a couple of tools are needed. Either we expect them to be present, or we should move this check to a dedicated prerequisite task that the build task depends on.

Author

I've added a new task that checks that the toolchain is present, and wrote a README section on how to install it.
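A prerequisite check of this kind could look roughly like the sketch below. It is not the task added in the PR: the helper name `missing_tools` and the tool list are assumptions, and the lookup simply scans `PATH` for executable files rather than running the tools.

```ruby
# Hypothetical helper: returns the subset of `tools` for which no executable
# file exists in any directory of the given PATH string.
def missing_tools(tools, path: ENV.fetch("PATH", ""))
  dirs = path.split(File::PATH_SEPARATOR)
  tools.reject do |tool|
    dirs.any? do |dir|
      candidate = File.join(dir, tool)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

# Assumed tool list; the real prerequisite task may check more than these.
missing = missing_tools(%w[rustc cargo])
warn "Missing build tools: #{missing.join(", ")}" unless missing.empty?
```

A Rake prerequisite task could call this and abort when the result is non-empty, so the build task fails early with one message listing everything that is missing.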

Comment thread Rakefile Outdated
Comment on lines +54 to +60
[
"cargo", "install",
"--path", File.join(source_path, "builder"),
"--bin", "release",
"--root", install_root,
"--force"
]
Member

This shares a lot with the command below (diff: --path vs --git + --tag) and should probably be factored together.
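A possible shape for that factoring is sketched below. The flags themselves (`--path`, `--git`, `--rev`, `--locked`, `--bin`, `--root`, `--force`) are standard `cargo install` options, but the method name, its keyword arguments, and the choice to share `--force` between both variants are assumptions; the remote variant uses `--rev` as in the PR description rather than `--tag`.

```ruby
# Hypothetical helper: builds the argv for `cargo install`, sharing every flag
# except the part that selects where the builder crate comes from.
def builder_install_command(install_root:, source_path: nil, repo_url: nil, rev: nil)
  source_args =
    if source_path
      # Local checkout: install the builder from <source_path>/builder.
      ["--path", File.join(source_path, "builder")]
    else
      unless repo_url && rev
        raise ArgumentError, "repo_url and rev are required without source_path"
      end
      # Remote: pin to an exact commit and honour Cargo.lock.
      ["--git", repo_url, "--rev", rev, "--locked"]
    end

  ["cargo", "install", *source_args, "--bin", "release", "--root", install_root, "--force"]
end
```

Returning an array keeps the result usable with the multi-argument form of `system`, which avoids shell interpolation of paths.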

Comment thread Rakefile Outdated
puts "Building libdatadog for #{ruby_platform} (#{host_triple})"
puts "Output: #{target_directory}"

system(env, binary, "--out", target_directory) || raise("Builder failed")
Member

How does it know which source tree to build from? Neither source_path nor LIBDATADOG_SOURCE_PATH seem to have been communicated.

Author

LIBDATADOG_SOURCE_PATH is intended for using the builder from a local folder, so the builder, once installed, takes that repo as its source tree. The same thing happens when it is pulled from a tag: once installed, it knows it was installed from that specific tag and pulls all the needed packages from the same reference.

Comment thread Rakefile Outdated

# macOS package (Apple Silicon)
Helpers.package_for(gemspec, ruby_platform: "arm64-darwin", files: Helpers.files_for("arm64-darwin"))
built_platforms = Dir.glob("vendor/libdatadog-#{Libdatadog::LIB_VERSION}/*/").map { |d| File.basename(d) }
Member

This makes built_platforms dynamic instead of explicit, which might lead to issues.

Member

+1 the "fallback" gem should always contain all the platforms we specified above and this shouldn't be picked up dynamically. A "fallback" gem without all the platforms is an incorrect fallback gem.

Alternatively, we could allow the gem to be packaged still, but never released on rubygems.org, although since it's easy to build locally I'm not sure it's useful to have this "incomplete fallback gem" build ability.

To be clear, in my opinion this is a blocker for merging this PR.

Comment thread Rakefile Outdated
Helpers.fix_file_permissions_for_gem(gemspec.files)

# Fallback package with all built binaries
Helpers.package_for(gemspec, ruby_platform: nil, files: Helpers.files_for(*built_platforms))
Member

If only one platform is built (e.g. the current one), it looks like this would build a ruby package containing a single platform. A silent failure or other mistake that leaves expected files missing would therefore produce an incomplete package.

Comment thread Rakefile Outdated
Comment on lines +131 to +133
built_platforms.each do |platform|
Helpers.package_for(gemspec, ruby_platform: platform, files: Helpers.files_for(platform))
end
Member

I would suggest three distinct sets of tasks:

  • build for current platform: must run on expected native platform
  • package binary gem for one platform: can run on any platform but must be provided platform-specific file(s) to package; if platform is same as above it can all be done locally in sequence; otherwise file(s) must come as artifact from another build job
  • package for ruby (== "all" platforms) gem platform: can run on any platform but must be provided all files to package; file(s) must come as artifact from another build job

In all cases tasks fail when any prerequisite is missing.

This creates a system that cannot fail by being always consistent.
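The "fail when any prerequisite is missing" rule could be enforced with a small guard that packaging tasks call before doing any work. This is a sketch under assumptions: `MissingArtifactError` and `require_platform_dirs!` are hypothetical names, and "present" is taken to mean a non-empty directory per platform under the vendor root.

```ruby
# Hypothetical error type raised when an expected platform has no artifacts.
class MissingArtifactError < StandardError; end

# Hypothetical guard: every platform in the explicit list must have a
# non-empty directory under vendor_root, otherwise packaging is refused.
def require_platform_dirs!(vendor_root, platforms)
  missing = platforms.reject do |platform|
    dir = File.join(vendor_root, platform)
    Dir.exist?(dir) && !Dir.empty?(dir)
  end
  unless missing.empty?
    raise MissingArtifactError, "missing artifacts for: #{missing.join(", ")}"
  end
  platforms
end
```

The per-platform packaging task would pass a one-element list, while the fallback (`ruby` platform) task would pass the full explicit list, so neither can silently produce an incomplete gem.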

Comment thread Rakefile
- "gem push pkg/libdatadog-#{Libdatadog::VERSION}-aarch64-linux.gem",
- "gem push pkg/libdatadog-#{Libdatadog::VERSION}-arm64-darwin.gem"
- ].each do |command|
+ Dir.glob("pkg/libdatadog-#{Libdatadog::VERSION}*.gem").each do |gem_file|
Member

Same here, explicit is better than implicit.

We need to explicitly spawn CI jobs on specific kinds of workers anyway so it's not like we're making anything automatic.
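An explicit variant of the push step might derive the gem file names from a fixed platform list and refuse to push if any of them is absent. In the sketch below the helper names and the `pkg_dir:` keyword are hypothetical; the `libdatadog-<version>-<platform>.gem` naming comes from the removed lines in the excerpt, and the suffix-less name for the fallback gem follows the usual RubyGems convention.

```ruby
# Hypothetical helper: the fallback gem plus one gem per explicit platform.
def expected_gem_files(version, platforms, pkg_dir: "pkg")
  names = ["libdatadog-#{version}.gem"] +
          platforms.map { |platform| "libdatadog-#{version}-#{platform}.gem" }
  names.map { |name| File.join(pkg_dir, name) }
end

# Hypothetical helper: returns one `gem push` argv per expected file, or
# raises if any expected gem has not been built.
def gem_push_commands(version, platforms, pkg_dir: "pkg")
  files = expected_gem_files(version, platforms, pkg_dir: pkg_dir)
  missing = files.reject { |file| File.file?(file) }
  raise ArgumentError, "refusing to push, missing gems: #{missing.join(", ")}" unless missing.empty?
  files.map { |file| ["gem", "push", file] }
end
```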

Member

@ivoanjo ivoanjo left a comment

This looks overall sane/reasonable but the dynamic platform detection at this point is dangerous as it would allow us to release in a broken state, so I'm marking as request changes.

Comment thread spec/gem_packaging.rb
Comment on lines -37 to 56
- it "prefixes all public symbols in .so files" do
-   so_files = Dir.glob("vendor/libdatadog-#{Libdatadog::LIB_VERSION}/**/*.so")
-   expect(so_files.size).to be 4
-
-   so_files.each do |so_file|
-     raw_symbols = `nm -D --defined-only #{so_file}`
-
-     symbols = raw_symbols.split("\n").map { |symbol| symbol.split(" ").last.downcase }.sort
+ it "prefixes all public symbols in shared library files" do
+   shared_lib_files = Dir.glob("vendor/libdatadog-#{Libdatadog::LIB_VERSION}/**/*.{so,dylib}")
+   expect(shared_lib_files.size).to be >= 1
+
+   shared_lib_files.each do |shared_lib_file|
+     # macOS nm doesn't use a dynamic symbol table (-D); use -g (global) instead
+     nm_flags = shared_lib_file.end_with?(".dylib") ? "-g --defined-only" : "-D --defined-only"
+     raw_symbols = `nm #{nm_flags} #{shared_lib_file}`
+
+     # macOS nm prefixes C symbols with "_"; strip it for consistent matching.
+     # Linker-injected symbols (e.g. __mh_dylib_header) have two leading underscores and
+     # still start with "_" after stripping one — reject them to avoid false failures.
+     symbols = raw_symbols.split("\n")
+       .map { |symbol| symbol.split(" ").last.downcase.sub(/\A_/, "") }
+       .reject { |sym| sym.start_with?("_") }
+       .sort
      expect(symbols.size).to be > 20 # Quick sanity check
      expect(symbols).to all(
        start_with("ddog_").or(start_with("blaze_"))
      )
Member

Minor: This test is so simple that I would soft-suggest breaking it into two: one for Linux, another one for macOS.

That would avoid all the awkward branching between the two since, in the end, very little code is shared.
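If the test were split, the symbol parsing could live in two small helpers so that each spec stays free of platform branching. This is a sketch only: `linux_symbols` and `macos_symbols` are hypothetical names, and the `nm` line format (`<address> <type> <name>`) used in the usage note is an illustrative assumption.

```ruby
# Hypothetical helper: last whitespace-separated field of each `nm` output
# line, lowercased and sorted. Blank lines are ignored.
def linux_symbols(raw_nm_output)
  raw_nm_output.lines
               .map { |line| line.split(" ").last }
               .compact
               .map(&:downcase)
               .sort
end

# Hypothetical helper: as above, then strip the single leading "_" that macOS
# adds to C symbols and drop linker-injected names that still start with "_".
def macos_symbols(raw_nm_output)
  linux_symbols(raw_nm_output)
    .map { |symbol| symbol.sub(/\A_/, "") }
    .reject { |symbol| symbol.start_with?("_") }
end
```

With these, the Linux spec would feed `nm -D --defined-only` output to `linux_symbols` and the macOS spec would feed `nm -g --defined-only` output to `macos_symbols`, both followed by the same `ddog_` / `blaze_` prefix expectation.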

Comment thread .gitignore Outdated
/Gemfile.lock

/vendor/
/ext/
Member

Minor: Why is an ext folder needed?

Comment thread Rakefile Outdated
"aarch64-unknown-linux-gnu" => "aarch64-linux",
"aarch64-unknown-linux-musl" => "aarch64-linux-musl",
"aarch64-apple-darwin" => "arm64-darwin",
"x86_64-apple-darwin" => "arm64-darwin"
Member

This is not correct, we deliberately chose to not start supporting the legacy macOS x86-64.

I guess we could've documented it better -- can I ask that you replace this with a comment saying we explicitly are not supporting it?
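The requested change might look something like the sketch below. The constant and method names are hypothetical, and the mapping is deliberately partial: it contains only the entries visible in the excerpt above, so the real table may have more.

```ruby
# Partial mapping of Rust target triples to Ruby gem platforms, limited to the
# entries visible in this thread.
RUST_TRIPLE_TO_RUBY_PLATFORM = {
  "aarch64-unknown-linux-gnu" => "aarch64-linux",
  "aarch64-unknown-linux-musl" => "aarch64-linux-musl",
  "aarch64-apple-darwin" => "arm64-darwin"
  # "x86_64-apple-darwin" is deliberately absent: legacy macOS x86-64 is
  # explicitly not supported.
}.freeze

# Hypothetical lookup that fails loudly for unsupported targets instead of
# mapping them onto another platform.
def ruby_platform_for(host_triple)
  RUST_TRIPLE_TO_RUBY_PLATFORM.fetch(host_triple) do
    raise ArgumentError, "unsupported target: #{host_triple}"
  end
end
```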

Comment thread Rakefile Outdated
target_file_hash = Digest::SHA256.hexdigest(File.read(target_file))
desc "Build libdatadog FFI library from source using the builder crate"
task :build_ffi do
rustc_output = `rustc -vV`
Member

So I was hoping we could avoid the "everyone builds with a different rust version" kinda thing.

At the very least, I'd like to make it opt-in, if possible. That is, by default I think a given rust version should be enforced by the builder crate, and you would need to pass an extra flag to say "I want to use my own rust".

Would that be possible? This way for instance in the future we could bump the rust version everyone uses in the builder crate itself (+ update our images) rather than having a rag-tag of rust versions that every library picks.

Member

How would that rust version be obtained? A bump means:

  • everyone's pipeline would break until they figure out how to install the new version

OR

  • builder installs the necessary rustc via rustup or something; I'm not sure that's in scope

I'm not quite sure of the expected gain here.

Tangent: probably the easiest way to achieve that is via Nix. It's already the case in this repo for Nix users because of flake.lock; making it work across repos would merely require libdatadog to provide its own flake and consumers to use it as an input. Then you'd get a shot at actual reproducibility.

Member

> How would that rust version be obtained? A bump means:
>
> • everyone's pipeline would break until they figure out how to install the new version

Yeap, I think that's fine and exactly what I'm suggesting -- if upstream moves, our CI goes red, and we have two options -- pass the --allow-custom-rust-version or update our version, which I think makes sense here?

As a reminder, we don't pick up new major versions of libdatadog anyway from dd-trace-rb, so having this CI needing small updates when something changes upstream is not particularly new.
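On the Ruby side, the opt-in behaviour discussed here could be approximated by parsing the `release:` line of `rustc -vV` and comparing it with a pinned version. Everything in this sketch is hypothetical: the builder crate is not known to expose such a check, `allow_custom:` merely stands in for the `--allow-custom-rust-version` idea mentioned above, and the exact-match policy is an assumption.

```ruby
# Hypothetical error raised when the local toolchain differs from the pin.
class RustVersionMismatch < StandardError; end

# Extracts the value of the "release:" line from `rustc -vV` output,
# or nil if no such line is present.
def rustc_release(rustc_vv_output)
  line = rustc_vv_output.lines.find { |l| l.start_with?("release:") }
  line && line.split(":", 2).last.strip
end

# Hypothetical check: requires an exact match with `expected` unless the
# caller explicitly opts in to using their own toolchain.
def check_rust_version!(rustc_vv_output, expected:, allow_custom: false)
  actual = rustc_release(rustc_vv_output)
  raise RustVersionMismatch, "could not parse rustc version" unless actual
  return actual if actual == expected || allow_custom

  raise RustVersionMismatch,
        "expected Rust #{expected}, found #{actual}; opt in explicitly to use a custom version"
end
```

With a default of `allow_custom: false`, an upstream bump of the pinned version turns CI red until either the toolchain is updated or the opt-in is passed, which is the trade-off debated in this thread.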

Comment thread Rakefile Outdated
end
install_cmd = if source_path
[
"cargo", "install",
Member

Minor: If cargo is missing, packaging will fail anyway, so I'm not sure it's worth adding an explicit check.

Comment thread Rakefile Outdated

env = {
"PROFILE" => "release",
"OPT_LEVEL" => "3",
Member

+1000, yes please: this should be a default in the builder crate, and we should not override it unless there's a really good reason to.

Comment thread Rakefile Outdated

# macOS package (Apple Silicon)
Helpers.package_for(gemspec, ruby_platform: "arm64-darwin", files: Helpers.files_for("arm64-darwin"))
built_platforms = Dir.glob("vendor/libdatadog-#{Libdatadog::LIB_VERSION}/*/").map { |d| File.basename(d) }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the "fallback" gem should always contain all the platforms we specified above and this shouldn't be picked up dynamically. A "fallback" gem without all the platforms is an incorrect fallback gem.

Alternatively, we could allow the gem to be packaged still, but never released on rubygems.org, although since it's easy to build locally I'm not sure it's useful to have this "incomplete fallback gem" build ability.

To be clear this is a blocker in my opinion for merging this PR

Comment thread Rakefile
Comment on lines +175 to +188
def self.fix_file_permissions_for_gem(files)
files.each do |path|
next unless File.file?(path)

filename = File.basename(path)
current_permissions = File.stat(path).mode & 0o777
expected = EXECUTABLE_FILES.include?(filename) ? 0o755 : 0o644

if current_permissions != expected
puts "Fixing permissions for #{path}: #{current_permissions.to_s(8)} -> #{expected.to_s(8)}"
FileUtils.chmod(expected, path)
end
end
end
Member

Minor: This looks like a copy of the method after it... why is this needed again?

@hoolioh hoolioh force-pushed the julio/use-builder-cargo-install branch from 80e8dc3 to e87bfce on May 4, 2026 10:56
@hoolioh hoolioh requested a review from a team as a code owner May 5, 2026 08:50
@hoolioh hoolioh closed this May 8, 2026

3 participants