Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offering: changes required for macOS agent builds on Apple Silicon #18173

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

timsutton
Copy link

@timsutton timsutton commented Jul 13, 2023

What does this PR do?

This PR is a demonstration of changes I needed to make in order to perform a full agent build on Apple Silicon, including a couple changes in Datadog's forks of omnibus and omnibus-software. Many of the changes are explicit overrides to use arm arch over x86_64, so those should all be considered as hacks and not intended to actually merge. I have been just hacking at trying to get something that builds end to end and installs + runs on a machine without Rosetta.

The high-level summary:

  • pull in a current libffi dependency
  • replace x86_64 with arm build target on the DD Mac App and openssl3 dependencies
  • skip including some extra dependencies that ended up building and shipping pgadmin/pg tools, lxml etc. when it wasn't needed
  • tell omnibus to ignore git-fsmonitor stuff from .git (a quick hack, not related to Apple Silicon but just due to newer git tooling – I didn't check omnibus upstream for this)
  • hack to include the proper architectures in the Distribution XML ERB used in omnibus resources (best would be if we could directly use Add hostArchitectures to macOS Distribution file chef/omnibus#1005 but it relies on some other methods we didn't have in the DD fork)
  • as a bonus, remove a swift-stdlib-tool flag that seems to have been removed in Xcode 14.3, so that current Xcode versions can be used to build this

Motivation

At my dayjob I work for a large customer of DataDog. We've opened a feature req (via a support ticket) but so far it's been hinted that this is probably not coming up around the corner. We run a fairly large Mac build cluster that is very performance-sensitive and currently this agent is the only reason we need to install Rosetta2.

Additional Notes

I am definitely not aiming to get these changes landed in exactly their current form. I just wanted to present these changes as something for y'all to use if you haven't done this work internally already. I'm also happy to work with folks on making some of these changes more backwards-compatible. However they do also still require changes in the omnibus/omnibus-software forks as well.

I don't think there's value in trying to compile all these dependencies as Universal, btw. The python stuff gets annoying, go doesn't provide an out-of-the-box way to do it and it would make the package extremely large. I'd assume that when you folks are building Apple Silicon packages that you'd be building packages which aren't dual-architecture.

I also realize that you maybe don't have Apple Silicon machines available for CI. My goal here is to first just get the support landed into the main branches, and leave that part up to you :). In doing this work I'm now confident that I'm able to build these independently out of tagged branches.

Possible Drawbacks / Trade-offs

This branch can only be used to build an arm64 darwin pkg, x86_64 code won't work.

Describe how to test/QA your changes

I took an empty Ventura (macOS 13) VM with:

Reviewer's Checklist

  • If known, an appropriate milestone has been selected; otherwise the Triage milestone is set.
  • Use the major_change label if your change either has a major impact on the code base, is impacting multiple teams or is changing important well-established internals of the Agent. This label will be use during QA to make sure each team pay extra attention to the changed behavior. For any customer facing change use a releasenote.
  • A release note has been added or the changelog/no-changelog label has been applied.
  • Changed code has automated tests for its functionality.
  • Adequate QA/testing plan information is provided if the qa/skip-qa label is not applied.
  • At least one team/.. label has been applied, indicating the team(s) that should QA this change.
  • If applicable, docs team has been notified or an issue has been opened on the documentation repo.
  • If applicable, the need-change/operator and need-change/helm labels have been applied.
  • If applicable, the k8s/<min-version> label, indicating the lowest Kubernetes version compatible with this feature.
  • If applicable, the config template has been updated.

@timsutton timsutton requested review from a team as code owners July 13, 2023 21:05
@bits-bot
Copy link
Collaborator

bits-bot commented Jul 13, 2023

CLA assistant check
All committers have signed the CLA.

@timsutton
Copy link
Author

timsutton commented Jul 14, 2023

I am going to start splitting out these changes so that they are easier to integrate. I haven't filled out all the PR descriptions yet but please let me know if I'm on the right track here to get these landed. This is most of them, still 1-2 more to come:

Who should I tag for reviews on these once they're all in non-draft? @KSerrania would you be interested in helping me get these changes integrated?

@timsutton
Copy link
Author

timsutton commented Jul 15, 2023

I looked more closely at how one might integrate the libffi dependency update. Would it be preferable to keep the existing 3.2.1 libffi (which is six years old!) as the default_version and then we could override that version for just osx + arm? (It needs changing the url from sourceware.org over to github.com so there's some more overriding to do there)

It also seems like indeed these steps aren't necessary on at least 3.4.4, because an earlier step in make install already performs: install -m ffi.h ffitarget.h '/opt/datadog/embedded/include'. So I would plan to just bypass that for 3.4.4.

@KSerrania
Copy link
Contributor

Hey @timsutton,

Thanks for taking the time to make this PR!

We understand that the extra dependency of Rosetta in order to run our current macOS Agent build may add some overhead to the Agent.

As much as we would like to merge the PR to enable your team to support your builds, we don’t have the infrastructure to test, build and release the macOS M1 Agent to be confident that we can maintain this yet.

Our current stance is to revisit the support of macOS M1 once our CI providers support this platform. We will leave this PR open, and revisit it and acknowledge your contribution for these changes once we schedule work on native M1 support.

@cedricvanrompay-datadog cedricvanrompay-datadog removed the request for review from a team August 10, 2023 11:30
@timsutton
Copy link
Author

timsutton commented Aug 10, 2023

Hey @timsutton,

Thanks for taking the time to make this PR!

Hi @KSerrania, thanks for the response. Mine inline here!

We understand that the extra dependency of Rosetta in order to run our current macOS Agent build may add some overhead to the Agent.

It does. When we examine our CPU usage, the Intel agent running via Rosetta is always at the top of the usage list after any build/test activity, which makes them stand out (usually around 2-5 %) whenever we audit for performance issues. By contrast, the native builds drop this down below 1% and thus fall into general background noise.

As much as we would like to merge the PR to enable your team to support your builds, we don’t have the infrastructure to test, build and release the macOS M1 Agent to be confident that we can maintain this yet.

Our current stance is to revisit the support of macOS M1 once our CI providers support this platform. We will leave this PR open, and revisit it and acknowledge your contribution for these changes once we schedule work on native M1 support.

I can understand, you and the maintainers currently don't have the CI infra to test and perform release builds for Apple Silicon.

I'd just like to state my original intentions from this work. My hope was to be able to merge in support into these repos that would affect only arm64 darwin packages, and not alter any dependencies or configurations for other platforms.

The end goals here would be:

  1. Enable the possibility for manual Apple Silicon builds directly from the default branch of the datadog-agent repo. When this depends on (just a few) small fixes in datadog/omnibus-ruby and datadog/omnibus-software (in pinned branches), there's more friction since needing to keep these up to date.

  2. Make it easier for maintainers to begin testing/releasing Apple Silicon package builds whenever they are ready to do so in the future, without immediately introducing any additional testing required (for example the existing Intel darwin packages).

This would also make it easier for us to integrate Apple-Silicon native packages earlier and potentially contribute any followup fixes if we notice any others which are needed.

Would you be open to taking PRs in these other supporting omnibus repos as an initial step? The only dependency I could see which actually needed an updated version was libffi in omnibus-software, but for example we could leave the existing 3.2.1 version as the default, and then specify the newer libffi version only for the arm64 darwin platform combination. This way it shouldn't need to change and dependencies' versions for all the existing platforms for the Datadog agent packages.

@timsutton
Copy link
Author

timsutton commented Nov 28, 2023

@KSerrania Want to ping on this again, since GitHub Actions now offers M1-based runner infrastructure, is there interest in picking this up? Like I mentioned earlier, I'd be open to at least starting to get incremental support landed (for example in omnibus software definitions – py2/3 wheels, xcode targets, libffi – etc.) so that existing Intel macOS builds' dependencies or any other logic should not need to change.

@KSerrania
Copy link
Contributor

Hey @timsutton, thanks for the reminder! According to actions/runner-images#8439 (comment), M1 macOS GitHub Actions runners are indeed available, but only in their paid version, even for open-source repositories. We currently use the free offer for open-source repositories for the Intel macOS Agent builds, and switching to these paid runners would be quite costly, given that they are expensive ($0.16/minute).

The free version of these runners for open-source repositories will be available at a later date. Let's revisit this request when this happens.

@kgantchev
Copy link

kgantchev commented Jan 15, 2024

M1 macOS GitHub Actions runners are indeed available, but only in their paid version, even for open-source repositories.

Might I make a suggestion? Feel free to use FlyCI's M2 runners. Our runners are on average 2x faster and 2x cheaper than GitHub's AND we have a free tier for OSS projects (see below).

Install Instructrions

Easily replace your runners:

jobs:
 ci:
-    runs-on: macos-latest
+    runs-on: flyci-macos-large-latest-m2
   steps:
   - name: 👀 Checkout repo
     uses: actions/checkout@v4

If your repo is public, then FlyCI offers 500 mins/month of free M2 runner usage.

@cb-clintonreece
Copy link

Hey @timsutton, thanks for the reminder! According to actions/runner-images#8439 (comment), M1 macOS GitHub Actions runners are indeed available, but only in their paid version, even for open-source repositories. We currently use the free offer for open-source repositories for the Intel macOS Agent builds, and switching to these paid runners would be quite costly, given that they are expensive ($0.16/minute).

The free version of these runners for open-source repositories will be available at a later date. Let's revisit this request when this happens.

Hi @KSerrania, FYI that these free OSS runners were made available earlier this year.

https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/

@timsutton
Copy link
Author

I'd been meaning to post the same info! Some of the same patches I have in my forks omnibus-ruby and omnibus-software are still needed in order to get Apple Silicon native pkgs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants