Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Add macos build test #3994

Merged
merged 39 commits into from
May 28, 2024
Merged

[CI] Add macos build test #3994

merged 39 commits into from
May 28, 2024

Conversation

Jokeren
Copy link
Contributor

@Jokeren Jokeren commented May 24, 2024

No description provided.

~/.triton/nvidia
~/.triton/pybind11
~/.triton/json
key: ${{ runner.os }}-${{ runner.arch }}-llvm-${{ steps.cache-key.outputs.llvm }}-nvidia-${{ steps.cache-key.outputs.nvidia }}-pybind11-${{ steps.cache-key.outputs.pybind11 }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copy-pasting sections like this one, can we use the YAML anchors? Replace this section with

- *cache-build-dependencies-step

(Also it seems that we're caching ~/.triton/json here but I don't see it in the CUDA runner. That's part of the reason to use the anchors, so that code that's supposed to be the same is the same.)

Copy link
Contributor Author

@Jokeren Jokeren May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I'll do that. Just trying to make the build runner pass first.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty ty

Sorry that there's some unused cruft in here. I'm happy to remove it if after you're done with your changes if @ThomasRaoux is ok with that.

@@ -155,7 +155,7 @@ jobs:
runs-on: ${{ matrix.runner }}
timeout-minutes: 30
env:
CMAKE_BUILD_TYPE: "MIN_SIZE_REL"
CMAKE_BUILD_TYPE: "Debug"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to over-communicate, I would not expect "Debug" or "MinSizeRel" is the fastest build; I'd expect -O1 is probably faster than both.

But also seeing as Debug seems to be the same speed as Release (15m) maybe something else is causing the slowness...

@jlebar
Copy link
Collaborator

jlebar commented May 27, 2024

github is not loading the logs for the most recent build so I wasn't able to investigate this myself, but it may be worth investigating if you haven't how much of the 12m30s build step is due to downloading dependencies versus actually building Triton. IIRC downloading the CUDA and nvidia dependencies was slow for Linux, so we started caching them. But I'm not sure if the cache works here yet.

@Jokeren Jokeren marked this pull request as ready for review May 28, 2024 12:53
@Jokeren Jokeren requested a review from ptillet as a code owner May 28, 2024 12:53
@Jokeren
Copy link
Contributor Author

Jokeren commented May 28, 2024

Hi @jlebar, it's ready for re-review. :)

Comment on lines 410 to 411
brew install ccache
brew install llvm
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
brew install ccache
brew install llvm
brew install ccache llvm

TRITON_BUILD_WITH_O1: "true"
# macos-latest has a 3-core M1 vcpu
# https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories
MAX_JOBS: 3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite the right comment. ninja -j will read the number of CPUs and launch f(ncpu) jobs, I think 2*ncpu.

You're restricting to 3 jobs and because I presume you found it's faster? Probably due to limited RAM?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Due to limited RAM.

I'd like to launch ncpu instead of 2*cpu because M1 doesn't have hyperthreading.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, just suggest updating the comment to explain the reason, which isn't really that M1 only has 3 cores.

(Also usually build systems will do 2ncpu even if there's no hyperthreading, because sometimes jobs are I/O bound. Indeed I think they probably do 2nvcpu, i.e. if M1 had hyperthreading it would by default launch 12 jobs.)

python3 -m pip install cython setuptools wheel cmake==3.24 ninja pytest-xdist lit
- name: Install Triton
env:
TRITON_BUILD_WITH_CCACHE: "true"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see us caching the ccache directory, in which case ccache is just overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot the caching step...

@@ -35,7 +35,8 @@ endif()

# Check if the platform is MacOS
if(APPLE)
set(PROTON_PYTHON_LDFLAGS "-undefined dynamic_lookup -flto")
set(CMAKE_SHARED_LIBRARY_SUFFIX ".so")
set(PROTON_PYTHON_LDFLAGS "-undefined dynamic_lookup")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
set(PROTON_PYTHON_LDFLAGS "-undefined dynamic_lookup")
# Other platforms build with -flto, but we found that this adds significant overhead to our macos CI without providing a major benefit.
set(PROTON_PYTHON_LDFLAGS "-undefined dynamic_lookup")

Copy link
Collaborator

@jlebar jlebar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo, I'm excited for more CI.

@jlebar
Copy link
Collaborator

jlebar commented May 28, 2024

LGTM!

@Jokeren Jokeren merged commit 706174d into main May 28, 2024
6 checks passed
@Jokeren Jokeren deleted the keren/macos branch May 28, 2024 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants