Multithreaded CI builds #3311

Merged
merged 12 commits into ggerganov:master from netrunnereve:ci_threads on Sep 28, 2023

Conversation

netrunnereve (Contributor):

The CUDA install is still super slow, but if you have the threads you might as well use them! This halves build time for some targets.
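
For context, the change amounts to passing a parallel job count to the build tool in .github/workflows/build.yml. A minimal sketch of the idea, assuming a cmake-based step (the step name and the hardcoded count are illustrative; later commits in this PR detect the count automatically):

```yaml
# Hypothetical excerpt from .github/workflows/build.yml:
- name: Build
  run: |
    mkdir build
    cd build
    cmake ..
    # "-j 4" asks for four parallel compile jobs instead of a single-threaded build
    cmake --build . --config Release -j 4
```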

@cebtenzzre (Collaborator):

Is there any reason we shouldn't use -j $(nproc) on Linux and -j $(sysctl -n hw.logicalcpu) on macOS?

@staviq (Collaborator) commented Sep 23, 2023:

> Is there any reason we shouldn't use -j $(nproc) on Linux and -j $(sysctl -n hw.logicalcpu) on macOS?

I second this. Since adding self-hosted runners to forks for testing PRs is trivial, detecting the local CPU count can be very useful.

While we're at it:

GitHub appears to automatically prioritize self-hosted runners, and the only thing you need to do is give them the same tag (windows-latest etc.). Docker isn't needed; self-hosted runners work just fine on a persistent image (a normal computer with a normal OS), which makes subsequent jobs even faster too.

As soon as you register a self-hosted runner in your forked repo, CI jobs will run on it automatically.
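
As a sketch of the setup described above: once a self-hosted runner is registered on a fork with the same label a hosted job requests, the workflow itself needs no changes at all. The job name below is illustrative:

```yaml
# Hypothetical job in .github/workflows/build.yml — left unchanged:
jobs:
  windows-build:
    runs-on: windows-latest   # a self-hosted runner registered on the fork with the
                              # "windows-latest" label picks this job up, and (per the
                              # comment above) takes priority over GitHub's hosted runner
```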

@netrunnereve (Contributor, Author):

> Is there any reason we shouldn't use -j $(nproc) on Linux and -j $(sysctl -n hw.logicalcpu) on macOS?

Done, and added for Windows too. That's a much better way of doing it.
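
A sketch of what the per-OS detection looks like after this change. The Linux and macOS commands are the ones quoted above; the Windows step is an assumption that the build runs under PowerShell (the runner default) and reads the standard NUMBER_OF_PROCESSORS environment variable:

```yaml
# Hypothetical per-OS build steps:

# Ubuntu runner (bash):
- name: Build (Linux)
  run: make -j $(nproc)

# macOS runner (bash):
- name: Build (macOS)
  run: make -j $(sysctl -n hw.logicalcpu)

# Windows runner (PowerShell):
- name: Build (Windows)
  run: cmake --build . --config Release -j ${env:NUMBER_OF_PROCESSORS}
```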

@@ -472,7 +472,7 @@ jobs:
        run: |
          sudo pkg update
          sudo pkg install -y gmake automake autoconf pkgconf llvm15 clinfo clover opencl clblast openblas
-         gmake CC=/usr/local/bin/clang15 CXX=/usr/local/bin/clang++15
+         gmake CC=/usr/local/bin/clang15 CXX=/usr/local/bin/clang++15 -j `sysctl -n hw.ncpu`
Collaborator:

any reason you switched syntax for this one?

Contributor Author:

The FreeBSD shell doesn't support the $() syntax.

Collaborator:

I see. Should we switch all the others then?
This is probably the difference between sh and bash.

Collaborator:

> This is probably the difference between sh and bash.

Nope. Modern POSIX sh supports $(); you can try it out for yourself with the dash shell. FreeBSD defaults to tcsh, which doesn't support $().
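
A quick way to see the difference for yourself (assuming a machine with tcsh installed; hw.ncpu is the FreeBSD/macOS sysctl from the diff above):

```sh
# POSIX sh (likewise dash and bash): both forms of command substitution work
sh -c 'echo "cores: $(sysctl -n hw.ncpu)"'
sh -c 'echo "cores: `sysctl -n hw.ncpu`"'

# tcsh: only backticks work
tcsh -c 'echo "cores: `sysctl -n hw.ncpu`"'
tcsh -c 'echo "cores: $(sysctl -n hw.ncpu)"'   # fails with "Illegal variable name."
```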

@ErikBjare left a comment:

LGTM!

@ggerganov merged commit 0512d66 into ggerganov:master on Sep 28, 2023
9 of 10 checks passed
@netrunnereve deleted the ci_threads branch on September 28, 2023 at 19:34
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 2, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  ggml-cuda : perform cublas mat mul of quantized types as f16 (ggerganov#3412)
  llama.cpp : add documentation about rope_freq_base and scale values (ggerganov#3401)
  train : fix KQ_pos allocation (ggerganov#3392)
  llama : quantize up to 31% faster on Linux and Windows with mmap (ggerganov#3206)
  readme : update hot topics + model links (ggerganov#3399)
  readme : add link to grammars app (ggerganov#3388)
  swift : fix build on xcode 15 (ggerganov#3387)
  build : enable more non-default compiler warnings (ggerganov#3200)
  ggml_tensor: update the structure comments. (ggerganov#3283)
  ggml : release the requested thread pool resource (ggerganov#3292)
  llama.cpp : split llama_context_params into model and context params (ggerganov#3301)
  ci : multithreaded builds (ggerganov#3311)
  train : finetune LORA (ggerganov#2632)
  gguf : basic type checking in gguf_get_* (ggerganov#3346)
  gguf : make token scores and types optional (ggerganov#3347)
  ci : disable freeBSD builds due to lack of VMs (ggerganov#3381)
  llama : custom attention mask + parallel decoding + no context swaps (ggerganov#3228)
  docs : mark code as Bash (ggerganov#3375)
  readme : add Mistral AI release 0.1 (ggerganov#3362)
  ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (ggerganov#3370)
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
* mac and linux threads

* windows

* Update build.yml

* Update build.yml

* Update build.yml

* automatically get thread count

* windows syntax

* try to fix freebsd

* Update build.yml

* Update build.yml

* Update build.yml