Skip to content

Validate and publish#37

Open
lloeki wants to merge 5 commits into
lloeki/builder-cifrom
lloeki/builder-ci-publish
Open

Validate and publish#37
lloeki wants to merge 5 commits into
lloeki/builder-cifrom
lloeki/builder-ci-publish

Conversation

@lloeki
Copy link
Copy Markdown
Member

@lloeki lloeki commented May 5, 2026

Why?

FUP to #34 (chained PR)

What does this PR do?

Publish (conditionally) to rubygems.org; but before publishing, ensure that all variants of the gem:

  • can be installed properly on every target platform
  • can be used to compile C against the shared library on every target platform

How to test the change?

  • CI should be green
  • the publish step cannot be tested until an actual release happens

Additional Notes:

This is intentionally additive: the previous workflow is kept to allow a release "the old way" in short order until the "new way" is fully validated.

Once validated, a subsequent PR will remove the "old way" CI process and (optionally?) the fetch+extract Rake tasks.

JIRA:

@lloeki lloeki changed the base branch from main to lloeki/builder-ci May 5, 2026 16:05
@lloeki lloeki force-pushed the lloeki/builder-ci-publish branch 2 times, most recently from 37cceac to 32e42eb Compare May 5, 2026 19:19
@lloeki lloeki marked this pull request as ready for review May 6, 2026 10:40
@lloeki lloeki requested a review from a team as a code owner May 6, 2026 10:41
@lloeki lloeki requested a review from hoolioh May 6, 2026 10:41
@lloeki lloeki mentioned this pull request May 6, 2026
Comment on lines +166 to +261
validate-linux:
strategy:
fail-fast: false
matrix:
gem_source: [binary, ruby]
platform:
- ruby_platform: x86_64-linux
gem_platform: x86_64-linux
base: ubuntu-24.04
image: ghcr.io/datadog/images-rb/engines/ruby:4.0-gnu-gcc
- ruby_platform: x86_64-linux-musl
gem_platform: x86_64-linux
base: ubuntu-24.04
image: ghcr.io/datadog/images-rb/engines/ruby:4.0-musl-gcc
- ruby_platform: aarch64-linux
gem_platform: aarch64-linux
base: ubuntu-24.04-arm
image: ghcr.io/datadog/images-rb/engines/ruby:4.0-gnu-gcc
- ruby_platform: aarch64-linux-musl
gem_platform: aarch64-linux
base: ubuntu-24.04-arm
image: ghcr.io/datadog/images-rb/engines/ruby:4.0-musl-gcc
name: Validate (${{ matrix.platform.ruby_platform }}, ${{ matrix.gem_source }})
needs: [package]
runs-on: ${{ matrix.platform.base }}
steps:
- name: Download gem
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: gem-${{ matrix.gem_source == 'ruby' && 'ruby' || matrix.platform.gem_platform }}
path: pkg
- name: Start container
run: docker run --rm --detach --name validate --volume "${PWD}:${PWD}" -w "${PWD}" ${{ matrix.platform.image }} sleep 86400
- name: Install gem
run: docker exec validate gem install pkg/*.gem
- name: Validate gem
run: |
docker exec validate ruby -e '
require "libdatadog"
puts "version: #{Libdatadog::VERSION}"
puts "platform: #{RUBY_PLATFORM}"
puts "binaries: #{Libdatadog.available_binaries}"
puts "pkgconfig: #{Libdatadog.pkgconfig_folder}"
puts "receiver: #{Libdatadog.path_to_crashtracking_receiver_binary}"
raise "pkgconfig_folder is nil" unless Libdatadog.pkgconfig_folder
raise "crashtracking receiver not found" unless Libdatadog.path_to_crashtracking_receiver_binary
'
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
sparse-checkout: spec/smoke
- name: Smoke test (compile and run C program against libdatadog)
run: docker exec validate ruby spec/smoke/run.rb
- name: Stop container
if: always()
run: docker stop validate || true

validate-macos:
strategy:
fail-fast: false
matrix:
gem_source: [binary, ruby]
platform:
- ruby_platform: arm64-darwin
gem_platform: arm64-darwin
base: macos-15
name: Validate (${{ matrix.platform.ruby_platform }}, ${{ matrix.gem_source }})
needs: [package]
runs-on: ${{ matrix.platform.base }}
steps:
- uses: ruby/setup-ruby@7372622e62b60b3cb750dcd2b9e32c247ffec26a # v1.302.0
with:
ruby-version: "4.0"
- name: Download gem
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: gem-${{ matrix.gem_source == 'ruby' && 'ruby' || matrix.platform.gem_platform }}
path: pkg
- name: Install gem
run: gem install pkg/*.gem
- name: Validate gem
run: |
ruby -e '
require "libdatadog"
puts "version: #{Libdatadog::VERSION}"
puts "platform: #{RUBY_PLATFORM}"
puts "binaries: #{Libdatadog.available_binaries}"
puts "pkgconfig: #{Libdatadog.pkgconfig_folder}"
puts "receiver: #{Libdatadog.path_to_crashtracking_receiver_binary}"
raise "pkgconfig_folder is nil" unless Libdatadog.pkgconfig_folder
raise "crashtracking receiver not found" unless Libdatadog.path_to_crashtracking_receiver_binary
'
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
sparse-checkout: spec/smoke
- name: Smoke test (compile and run C program against libdatadog)
run: ruby spec/smoke/run.rb
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite a bit of duplication here -- can we maybe push this inside Rake?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we can have much in rake beyond this block:

          ruby -e '
            require "libdatadog"
            puts "version:    #{Libdatadog::VERSION}"
            puts "platform:   #{RUBY_PLATFORM}"
            puts "binaries:   #{Libdatadog.available_binaries}"
            puts "pkgconfig:  #{Libdatadog.pkgconfig_folder}"
            puts "receiver:   #{Libdatadog.path_to_crashtracking_receiver_binary}"
            raise "pkgconfig_folder is nil" unless Libdatadog.pkgconfig_folder
            raise "crashtracking receiver not found" unless Libdatadog.path_to_crashtracking_receiver_binary
          '

Is that what you mean?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not everything else? E.g. installing the gem, running the smoke test etc -- at least to me rake is a lot easier to debug and test locally, and CI then could be a single "bundle exec rake do_the_testing"

Copy link
Copy Markdown
Member Author

@lloeki lloeki May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not everything else?

      - name: Start container
        run: docker run --rm --detach --name validate --volume "${PWD}:${PWD}" -w "${PWD}" ${{ matrix.platform.image }} sleep 86400
- uses: ruby/setup-ruby@7372622e62b60b3cb750dcd2b9e32c247ffec26a # v1.302.0

Well, we can't really install via rake without ruby? Also the execution model differs between macos and linux.

      - name: Download gem
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1

Not relevant to do locally? Or do you want to be able to download an artifact using a simple rake command, say, from an Actions URL? I could do add that as nice-to-have rake github:artifact:fetch[${id}] tooling to save clicking around, but I feel it's out of scope for this PR. Also I think we'd keep using the github artifact action in the workflow.

Alternatively there's rake libdatadog:build to build for the current platform.

      - name: Install gem
        run: gem install pkg/*.gem

This needs an artifact gem from above, otherwise there's standard rake install already.

      - name: Validate gem
        run: |
          ruby -e '
            require "libdatadog"
            puts "version:    #{Libdatadog::VERSION}"
            puts "platform:   #{RUBY_PLATFORM}"
            puts "binaries:   #{Libdatadog.available_binaries}"
            puts "pkgconfig:  #{Libdatadog.pkgconfig_folder}"
            puts "receiver:   #{Libdatadog.path_to_crashtracking_receiver_binary}"
            raise "pkgconfig_folder is nil" unless Libdatadog.pkgconfig_folder
            raise "crashtracking receiver not found" unless Libdatadog.path_to_crashtracking_receiver_binary
          '

Probably the most complex and useful to factor out; but then it would need the code checked out, which it currently doesn't have (see item just below)

Locally, even though it needs an artifact gem from above, it could be useful.

      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          sparse-checkout: spec/smoke

Locally, the repository is checked out already. But most importantly this check and all previous tests are done without ever cloning the repository: they essentially check that everything works fine outside of a development or development-like environment (gemfile around, etc...); that's why it does a sparse checkout in cone mode (subfolder).

      - name: Smoke test (compile and run C program against libdatadog)
        run: ruby spec/smoke/run.rb

The role of this test is to be run outside of the repo: checking that the installed gem behaves as expected in an isolated environment. Sure it can work in the repo too, and could be useful, but making it a rake task or a spec starts making it more dependent on a specific environment (Gemfile, task files, rspec/minitest/whatever, etc...). Admittedly the run.rb script could be a rake task that can be evaluated in both a dev context and this freestanding environment.

I aimed for simple here though. I'm OK to make some of these additions to have more useful tooling in subsequent PRs, but I think it is out of scope for this PR.

Comment thread tasks/gem.rake
Comment on lines +1 to +24
# frozen_string_literal: true

require "pathname"
require "rubygems/package"
require "rubygems/package/tar_reader"
require "zlib"

# Builds .gem packages from vendored libdatadog artifacts.
#
# Intentionally duplicates some Rakefile logic so both can coexist during
# the migration; the Rakefile originals will be removed once the new tasks
# are fully adopted.
module GemPackaging
module Platform
# Mapping from gem platform names to the vendor platform directories
# included in that gem. We bundle glibc and musl variants together
# for Linux to work around https://github.com/rubygems/rubygems/issues/3174
GEM_TO_VENDOR = {
"x86_64-linux" => ["x86_64-linux", "x86_64-linux-musl"],
"aarch64-linux" => ["aarch64-linux", "aarch64-linux-musl"],
"arm64-darwin" => ["arm64-darwin"]
}.freeze

ALL_VENDOR = GEM_TO_VENDOR.values.flatten.freeze
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kinda weird/unexpected to me. If we want to make a release with the previous process, can we make it from a branch?

I'm slightly concerned that we're adding a lot of complexity to avoid using a branch 👀 :hurtrealbad:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean that some code exists in both tasks/gem.rake and Rakefile?

My process here is to make sure the existing workflows don't break; especially publish.yml which can only be tested by publishing.

As I described in the PR:

This is intentionally additive: the previous workflow is kept to allow a release "the old way" in short order until the "new way" is fully validated.

Once validated, a subsequent PR will remove the "old way" CI process and (optionally?) the fetch+extract Rake tasks.

The idea behind the duplication is that some bits are copied over and changed as needed for the new gem building steps, but making them shared is both useless churn since they should go away once validated and more risky than "no change at all".

I get the feeling about the branch, but then the branch would get out of sync if/when a libdatadog update lands, plus while the worfklow_dispatch of the publish workflow allows for branch selection, it is unknown whether releasing from a branch actually works in code within the workflow, so I just thought it best to not change anything and be sure it's 100% identical, code or environment.

Once merged, on the next release we can do this process:

  • attempt release with the new way
    • if succeeded:
      • open PR to remove the old way
    • if failed:
      • immediately proceed with release the old way
      • fix the new way without self-inflicted artificial pressure

It's a noop noop noop maneuver.

@lloeki lloeki force-pushed the lloeki/builder-ci branch from 2809bd4 to 3a47bdf Compare May 21, 2026 11:49
@lloeki lloeki force-pushed the lloeki/builder-ci-publish branch from 8015815 to fabf682 Compare May 21, 2026 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants