Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jl_getaffinity and jl_setaffinity #53402

Merged
merged 18 commits into from
May 25, 2024
Merged

Conversation

carstenbauer
Copy link
Member

@carstenbauer carstenbauer commented Feb 20, 2024

This PR adds two functions jl_getaffinity and jl_setaffinity to the runtime, which are slim wrappers around uv_thread_getaffinity and uv_thread_setaffinity and can be used to set the affinity of Julia threads.

This will

  • simplify thread pinning (ThreadPinning.jl currently pins threads by spawning tasks that run the necessary ccalls) and
  • enable users to also pin GC threads (or, more generally, all Julia threads).

Example:

bauerc@n2lcn0146 julia git:(cb/affinity)  
➜ ./julia -q --startup-file=no --threads 2,3 --gcthreads 4,1
julia> cpumasksize = @ccall uv_cpumask_size()::Cint
1024

julia> mask = zeros(Cchar, cpumasksize);

julia> jl_getaffinity(tid, mask, cpumasksize) = ccall(:jl_getaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_getaffinity (generic function with 1 method)

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

julia> mask[1] = 0;

julia> jl_setaffinity(tid, mask, cpumasksize) = ccall(:jl_setaffinity, Int32, (Int16, Ptr{Cchar}, Int32), tid, mask, cpumasksize)
jl_setaffinity (generic function with 1 method)

julia> jl_setaffinity(1, mask, cpumasksize)
0

julia> fill!(mask, 0);

julia> jl_getaffinity(1, mask, cpumasksize)
0

julia> print(mask[1:Sys.CPU_THREADS])
Int8[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

(cc @vchuravy, @gbaraldi)

Would be great to get this into 1.11 (despite feature freeze) because otherwise we won't be able to pin GC threads until 1.12 (likely not until the end of the year).

Closes #53073

@carstenbauer carstenbauer added the domain:multithreading Base.Threads and related functionality label Feb 20, 2024
@carstenbauer
Copy link
Member Author

carstenbauer commented Feb 20, 2024

Open questions:

  • Currently I use jl_atomic_load_acquire to query jl_n_threads and jl_atomic_load_relaxed to get jl_all_tls_states. Is this ok?
  • I put the declaration at the end of julia_threads.h. Is this fine?
  • I added the two new functions to jl_exported_funcs.inc. Is this how I should do it, or is this auto-generated somehow?
  • To what extend do we test/document these functions?

Copy link
Sponsor Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs at least some minimal tests

src/threading.c Outdated Show resolved Hide resolved
src/threading.c Outdated Show resolved Hide resolved
@carstenbauer
Copy link
Member Author

Added minimal tests and incorporated the changes suggested by @vchuravy.

src/threading.c Outdated Show resolved Hide resolved
@carstenbauer
Copy link
Member Author

carstenbauer commented Feb 21, 2024

Adding backport label for 1.11 for the reasons mentioned above (should also be very straightforward to backport). Please chime in if you disagree.

@carstenbauer carstenbauer added the backport 1.11 Change should be backported to release-1.11 label Feb 21, 2024
@KristofferC KristofferC mentioned this pull request Feb 26, 2024
28 tasks
test/threads.jl Outdated Show resolved Hide resolved
KristofferC added a commit that referenced this pull request Mar 1, 2024
Backported PRs:
- [x] #53361 <!-- 🤖 [master] Bump the SparseArrays stdlib from c9f7293
to cb602d7 -->
- [x] #53300 <!-- allow external absint to hold custom data in
`codeinst.inferred` -->
- [x] #53342 <!-- Add `Base.wrap` to docs -->
- [x] #53372 <!-- Silence warnings in `test/file.jl` -->
- [x] #53357 <!-- 🤖 [master] Bump the Pkg stdlib from 6dd0e7c9e to
76070d295 -->
- [x] #53373 <!-- fix sysimage-native-code=no option with pkgimages -->
- [x] #53333 <!-- More consistent return value for annotations, take 2
-->
- [x] #53354 <!-- fix code coverage bug in tail position and `else` -->
- [x] #53407 <!-- fix sysimage-native-code=yes option -->
- [x] #53388 <!-- Fix documentation: thread pool of main thread -->
- [x] #53355 <!-- Fix synchronization issues on the GC scheduler. -->
- [x] #53429 <!-- Subtype: skip slow-path in `local_∀_∃_subtype` if
inputs contain no ∃ typevar. -->
- [x] #53437 <!-- Add debug variant of loader_trampolines.o -->
- [x] #53284 <!-- Add annotate! method for AnnotatedIOBuffer -->
- [x] #53466 <!-- [MozillaCACerts_jll] Update to v2023-12-12 -->
- [x] #53467 <!-- [LibGit2_jll] Update to v1.7.2 -->
- [x] #53326 <!-- RFC: when loading code for internal purposes, load
stdlib files directly, bypassing DEPOT_PATH, LOAD_PATH, and stale checks
-->
- [x] #53332
- [x] #53320 <!-- Add `Sys.isreadable, Sys.iswriteable`, update `ispath`
-->
- [x] #53476

Contains multiple commits, manual intervention needed:
- [ ] #53285 <!-- Add update mechanism for Terminfo, and common
user-alias data -->

Non-merged PRs with backport label:
- [ ] #53424 <!-- yet more atomics & cache-line fixes on work-stealing
queue -->
- [ ] #53408 <!-- task splitting: change additive accumulation to
multiplicative -->
- [ ] #53403 <!-- Move parallel precompilation to Base -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53391 <!-- Default to the medium code model in x86 linux -->
- [ ] #53125 <!-- coverage: count coverage where explicitly requested by
inference only -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
@KristofferC KristofferC mentioned this pull request Mar 1, 2024
60 tasks
src/threading.c Outdated Show resolved Hide resolved
@vtjnash vtjnash added the status:merge me PR is reviewed. Merge when all tests are passing label Mar 7, 2024
@IanButterworth
Copy link
Sponsor Member

Windows failures appear related

Test Failed at C:\buildkite-agent\builds\win2k22-amdci6-1\julialang\julia-master\julia-a50eb050b7\share\julia\test\threads.jl:353
--
  | Expression: jl_setaffinity(1, mask, cpumasksize) == 0
  | Evaluated: -4071 == 0

@carstenbauer
Copy link
Member Author

Strange. Looking at the libuv source code and https://docs.libuv.org/en/v1.x/errors.html, this return code seems to refer to "invalid argument". We pass 3 arguments, 1, mask, and cpumasksize. The latter is queried from libuv itself and thus should be safe. I think we can also rule out that 1 is the issue. That leaves us with mask, which is just mask = fill!(zeros(Cchar, cpumasksize), 1). Unfortunately, I don't have a Windows machine to test this live. On Linux, when I try to provide illegal masks, I don't get the error code we're seeing here:

julia> jl_setaffinity(1, fill!(zeros(Cchar, cpumasksize), 0), cpumasksize) # different error code if all mask entries are zero
-22

julia> jl_setaffinity(1, fill!(zeros(Cchar, cpumasksize+3), 1), cpumasksize) # increasing the mask size beyond cpumasksize doesn't seem to be a problem?
0

Of course, we could simply not test this on Windows but, in contrast to macOS, it should work there...

KristofferC added a commit that referenced this pull request Mar 17, 2024
Backported PRs:
- [x] #39071 <!-- Add a lazy `logrange` function and `LogRange` type -->
- [x] #51802 <!-- Allow AnnotatedStrings in log messages -->
- [x] #53369 <!-- Orthogonalize re-indexing for FastSubArrays -->
- [x] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [x] #53482 <!-- add IR encoding for EnterNode -->
- [x] #53499 <!-- Avoid compiler warning about redefining jl_globalref_t
-->
- [x] #53507 <!-- update staled `Core.Compiler.Effects` documentation
-->
- [x] #53408 <!-- task splitting: change additive accumulation to
multiplicative -->
- [x] #53523 <!-- add back an alias for `check_top_bit` -->
- [x] #53377 <!-- add _readdirx for returning more object info gathered
during dir scan -->
- [x] #53525 <!-- fix InteractiveUtils call in Base.runtests on failure
-->
- [x] #53540 <!-- use more efficient `_readdirx` for tab completion -->
- [x] #53545 <!-- use `_readdirx` for `walkdir` -->
- [x] #53551 <!-- revert "Add @create_log_macro for making custom styled
logging macros (#52196)" -->
- [x] #53554 <!-- Always return a value in 1-d circshift! of
abstractarray.jl -->
- [x] #53424 <!-- yet more atomics & cache-line fixes on work-stealing
queue -->
- [x] #53571 <!-- Update Documenter to v1.3 for inventory writing -->
- [x] #53403 <!-- Move parallel precompilation to Base -->
- [x] #53589 <!-- add back `unsafe_convert` to pointer for arrays -->
- [x] #53596 <!-- build: remove extra .a file -->
- [x] #53606 <!-- fix error path in `precompilepkgs` -->
- [x] #53004 <!-- Unexport with, at_with, and ScopedValue from Base -->
- [x] #53629 <!-- typo fix in scoped values docs -->
- [x] #53630 <!-- sroa: Fix incorrect scope counting -->
- [x] #53598 <!-- Use Base parallel precompilation to build stdlibs -->
- [x] #53649 <!-- precompilepkgs: package in boths deps and weakdeps are
in fact only weak -->
- [x] #53671 <!-- Fix bootstrap Base precompile in cross compile
configuration -->
- [x] #52125 <!-- Load Pkg if not already to reinstate missing package
add prompt -->
- [x] #53602 <!-- Handle zero on arrays of unions of number types and
missings -->
- [x] #53516 <!-- permit NamedTuple{<:Any, Union{}} to be created -->
- [x] #53643 <!-- Bump CSL to 1.1.1 to fix libgomp bug -->
- [x] #53679 <!-- move precompile workload back from Base -->
- [x] #53663 <!-- add isassigned methods for reinterpretarray -->
- [x] #53662 <!-- [REPL] fix incorrectly cleared line after completions
accepted -->
- [x] #53611 <!-- Linalg: matprod_dest for Diagonal and adjvec -->
- [x] #53659 <!-- fix #52025, re-allow all implicit pointer casts in
cconvert for Array -->
- [x] #53631 <!-- LAPACK: validate input parameters to throw informative
errors -->
- [x] #53628 <!-- Make some improvements to the Scoped Values
documentation. -->
- [x] #53655 <!-- Change tbaa of ptr_phi to tbaa_value  -->
- [x] #53391 <!-- Default to the medium code model in x86 linux -->
- [x] #53699 <!-- Move `isexecutable, isreadable, iswritable` to
`filesystem.jl` -->
- [x] #41232 <!-- Fix linear indexing for ReshapedArray if the parent
has offset axes -->
- [x] #53527 <!-- Enable analyzegc checks for try catch and fix found
issues -->
- [x] #52092 
- [x] #53682 <!-- Increase build precompilation -->
- [x] #53720 
- [x] #53553 <!-- typeintersect: fix `UnionAll` unaliasing bug caused by
innervars. -->

Contains multiple commits, manual intervention needed:
- [ ] #53305 <!-- Propagate inbounds in isassigned with CartesianIndex
indices -->

Non-merged PRs with backport label:
- [ ] #53736 <!-- fix literal-pow to return the right type when the base
is -1 -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53696 <!-- add invokelatest to on_done callback in bracketed
paste -->
- [ ] #53660 <!-- put Logging back in default sysimage -->
- [ ] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51928 <!-- Styled markdown, with a few tweaks -->
- [ ] #51816 <!-- User-themable stacktraces -->
- [ ] #51811 <!-- Make banner size depend on terminal size -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC mentioned this pull request Mar 20, 2024
41 tasks
@carstenbauer
Copy link
Member Author

carstenbauer commented Mar 21, 2024

Test are passing now (I choose a mask where only the first CPU thread is marked as 1, i.e. pinning to the first CPU thread).

However, depending on how the CI actually runs (does it have exclusive machine access?) I could imagine that the first CPU thread isn't always guaranteed to be available to the process..... Let me test the original mask (all ones) once again to see if the Windows CI failure was just a fluke. If it wasn't I think I would vote to simply disable this test on Windows.

@carstenbauer
Copy link
Member Author

Alright same test that was passing above is now suddenly failing...

@carstenbauer
Copy link
Member Author

@IanButterworth, given we're fine with not testing on Windows - I don't know why it fails and don't know how to reliably fix the test for it - this can be merged now.

While not ideal, I don't think it's too bad to not test this on Windows. After all, (1) this isn't documented API and (2) pinning threads typically is only really used on Linux (clusters) anyways. But I'm open to suggestions, of course.

@IanButterworth
Copy link
Sponsor Member

Seems this needs a rebase now. Also, I just wanted to double-check that merging this wouldn't introduce a flaky test?

@IanButterworth
Copy link
Sponsor Member

Bump (given this has merge me)

@KristofferC KristofferC mentioned this pull request Apr 17, 2024
59 tasks
carstenbauer and others added 4 commits April 17, 2024 11:55
Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
@carstenbauer
Copy link
Member Author

@IanButterworth I've just rebased. Let's see if the test is flaky. (If it turns out to be problematic, maybe we shouldn't test this - internal feature - at all?)

@giordano
Copy link
Contributor

There's a test failure in 32-bit windows job in "threads" set, but on the phone I can't easily see the error message and tell if it's relevant.

@DilumAluthge
Copy link
Member

CI is now all green. @vtjnash @vchuravy

@vchuravy vchuravy merged commit 065aeb6 into JuliaLang:master May 25, 2024
7 checks passed
@DilumAluthge DilumAluthge removed the status:merge me PR is reviewed. Merge when all tests are passing label May 25, 2024
KristofferC added a commit that referenced this pull request May 28, 2024
Backported PRs:
- [x] #53665 <!-- use afoldl instead of tail recursion for tuples -->
- [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error
messages -->
- [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector -->
- [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` -->
- [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign
abstract interpreters -->
- [x] #53750 <!-- inference correctness: fields and globals can revert
to undef -->
- [x] #53984 <!-- Profile: fix heap snapshot is valid char check -->
- [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray
-->
- [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer,
typemax(Int64))` -->
- [x] #54013 <!-- Support case-changes to Annotated{String,Char}s -->
- [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer -->
- [x] #54137 <!-- Fix typo in docs for `partialsortperm` -->
- [x] #54129 <!-- use correct size when creating output data from an
IOBuffer -->
- [x] #54153 <!-- Fixup IdSet docstring -->
- [x] #54143 <!-- Fix `make install` from tarballs -->
- [x] #54151 <!-- LinearAlgebra: Correct zero element in
`_generic_matvecmul!` for block adj/trans -->
- [x] #54213 <!-- Add `public` statement to `Base.GC` -->
- [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions.
-->
- [x] #54233 <!-- set MAX_OS_WRITE on unix -->
- [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and
overflow. -->
- [x] #54259 <!-- Fix typo in `readuntil` -->
- [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large
array -->
- [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing
imaginary part on diagonal -->
- [x] #54248 <!-- ensure package callbacks are invoked when no valid
precompile file exists for an "auto loaded" stdlib -->
- [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show -->
- [x] #54302 <!-- Specialised substring equality for annotated strs -->
- [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a
single package -->
- [x] #54350 <!-- add a precompile signature to Artifacts code that is
used by JLLs -->
- [x] #54331 <!-- correctly track freed bytes in
jl_genericmemory_to_string -->
- [x] #53509 <!-- revert moving "creating packages" from Pkg.jl -->
- [x] #54335 <!-- When accessing the data pointer for an array, first
decay it to a Derived Pointer -->
- [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}`
-->
- [x] #54288
- [x] #54067
- [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} -->
- [x] #54289 <!-- Rework annotation ordering/optimisations -->
- [x] #53815 <!-- create phantom task for GC threads -->
- [x] #54130 <!-- inference: handle `LimitedAccuracy` in
`handle_global_assignment!` -->
- [x] #54428 <!-- Move ConsoleLogging.jl into Base -->
- [x] #54332 <!-- Revert "add unsetindex support to more copyto methods
(#51760)" -->
- [x] #53826 <!-- Make all command-line options documented in all
related files -->
- [x] #54465 <!-- typeintersect: conservative typevar subtitution during
`finish_unionall` -->
- [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path
of type instantiation -->
- [x] #54499 <!-- make `@doc x` work without REPL loaded -->
- [x] #54210 <!-- attach finalizer in `mmap` to the correct object -->
- [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup -->

Non-merged PRs with backport label:
- [ ] #54471 <!-- Actually setup jit targets when compiling
packageimages instead of targeting only one -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #54323 <!-- inference: fix too conservative effects for recursive
cycles -->
- [ ] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [ ] #54191 <!-- make `AbstractPipe` public -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53882 <!-- Warn about cycles in extension precompilation -->
- [ ] #53707 <!-- Make ScopedValue public -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC mentioned this pull request May 29, 2024
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.11 Change should be backported to release-1.11 domain:multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option to pin GC threads
6 participants