Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some aliasing warnings to docstrings for mutating functions in Base #50824

Merged
merged 17 commits into from Oct 27, 2023

Conversation

gdalle
Copy link
Contributor

@gdalle gdalle commented Aug 7, 2023

Functions like sum!(B, A) have undefined behavior when A and B share memory. We might fix that in the long run, but in the short run, doc warnings are better than nothing.

Related issues:

See also: https://discourse.julialang.org/t/did-julia-community-do-something-to-improve-its-correctness/102515/

  • accumulate!
  • all!
  • any!
  • append! (unsure)
  • asyncmap!
  • broadcast!
  • circcopy!
  • circshift!
  • clamp!
  • conj!
  • copy!
  • copyto!
  • count!
  • cumprod!
  • cumsum!
  • delete!
  • deleteat!
  • digits!
  • empty!
  • extrema!
  • fill!
  • filter!
  • findmax!
  • findmin!
  • get!
  • hex2bytes!
  • insert!
  • intersect!
  • invpermute!
  • keepat!
  • kron!
  • map!
  • maximum!
  • merge! (unsure)
  • mergewith! (unsure)
  • minimum!
  • modifyproperty!
  • partialsort!
  • partialsortperm!
  • permute!
  • permutedims!
  • pop!
  • popat!
  • popfirst!
  • prepend! (unsure)
  • prod!
  • push!
  • pushfirst!
  • put!
  • read!
  • readbytes!
  • replace!
  • replaceproperty!
  • resize!
  • reverse!
  • setdiff!
  • setindex!
  • setproperty!
  • sizehint!
  • sort!
  • sortperm!
  • splice!
  • sum!
  • swapproperty!
  • symdiff!
  • take!
  • union!
  • unique!
  • unsafe_copyto!
  • unsafe_store!

@oscardssmith oscardssmith added the domain:docs This change adds or pertains to documentation label Aug 7, 2023
@oscardssmith
Copy link
Member

oscardssmith commented Aug 7, 2023

Should reduce! be on the list?

@gbaraldi
Copy link
Member

gbaraldi commented Aug 7, 2023

I would even change should to must.

@gdalle
Copy link
Contributor Author

gdalle commented Aug 7, 2023

Should reduce! be on the list?

Looked for it in the code base but the search didn't yield anything. Perhaps it doesn't exist 🤯

@gdalle
Copy link
Contributor Author

gdalle commented Aug 7, 2023

I would even change should to must.

done

@gustaphe
Copy link

gustaphe commented Aug 7, 2023

Is "alias with" a self-explanatory phrase? Not a native speaker, but I would understand this better if it said "share memory with" or something similar.

@gdalle
Copy link
Contributor Author

gdalle commented Aug 7, 2023

Is "alias with" a self-explanatory phrase? Not a native speaker, but I would understand this better if it said "share memory with" or something similar.

I took the formulation from LinearAlgebra.mul! but I agree that "alias" is not obvious for beginners, and it's a CS-specific term. If this gets one or two likes I'm changing it

@gdalle
Copy link
Contributor Author

gdalle commented Aug 7, 2023

Replacement done, I think this is ready for review

@CameronBieganek
Copy link
Contributor

I agree with adding warnings to the docstrings. However, I'm not sure what to do with map!. I believe the map!(f, x, x) idiom is meant to be supported, e.g., the following should work:

julia> x = [1, 2];

julia> map!(x -> 2x, x, x); x
2-element Vector{Int64}:
 2
 4

In fact, there is a unit test for that use case.

@gdalle
Copy link
Contributor Author

gdalle commented Aug 7, 2023

I just took a cue from #50814 but it's easy to remove. Should I leave map! aside?

@mikmoore
Copy link
Contributor

mikmoore commented Aug 7, 2023

Good catch. map! is tricky because it should definitely work when the destination is not aliased or is identical to a source, because it reads the element before writing and only reads/writes that element once. But you can still cause mischief with misaligned aliasing:

julia> x = ones(Int,4); @views map!(+, x[begin+1:end], x[begin+1:end], x[begin:end-1]); x
4-element Vector{Int64}:
 1
 2
 3
 4

We need to specify (or non-specify) an evaluation order for map!. If we want to commit to a specific order, something like (1) The collections are evaluated in iteration order. But if there are cases where that currently is not true, or we desire to leave room for such cases in the future, then we might do better to say (2) The order in which the collections are iterated is undefined.

Either of these would mean that map!(f,x,x) is well-behaved, but (1) would introduce a contract for my misaligned example while (2) would leave it as undefined behavior. A specified order would preclude SIMD or other re-ordering unless the compiler could prove there was no possible dependency. That makes me more sympathetic to (2). Going (2) to (1) in the future would be non-breaking but (1) to (2) would be breaking. For that reason, I think the "easy" answer is (2) unless we want to debate the contract here.

We should also consider the behavior of broadcast!, although it is not necessary that it match map!.

@CameronBieganek
Copy link
Contributor

I actually had a PR (#47012) for the map! docstring a while ago, but I closed it since no one responded to it. However, after closing, Jameson did respond with this comment:

I think there may be exceptions (e.g. SparseArrays) to this, so it may not be quite right to say it is necessarily the same order as iteration, only that eventually each element will be updated once.

@CameronBieganek
Copy link
Contributor

(2) The order in which the collections are iterated is undefined.

Yeah, that might be the safest thing to do for now.

@danielwe
Copy link
Contributor

danielwe commented Aug 7, 2023

The "...intended to operate without making any allocations" clause seems redundant and buries the key point. How about sticking to mul!s version? I.e., "Note that destination must not share memory with any of the elements in collection..." etc.

@gdalle
Copy link
Contributor Author

gdalle commented Aug 8, 2023

I removed the redundancy and fixed the map! docstring

base/accumulate.jl Outdated Show resolved Hide resolved
@gdalle
Copy link
Contributor Author

gdalle commented Aug 8, 2023

Gonna use @jakobnissen's suggestion and run

filter(map(String, names(Base))) do i
    endswith(i, '!')
end

to see if I missed a few spots.

@gdalle
Copy link
Contributor Author

gdalle commented Aug 9, 2023

OK just went through the whole list, see the first message. If anyone objects to my selection of "aliasing-vulnerable" mutating functions, let them speak now or forever hold their peace

@CameronBieganek
Copy link
Contributor

Can this be backported to 1.10? If 1.10 becomes an LTS release, it might be nice to have these changes in the docs.

Co-authored-by: Cameron Bieganek <8310743+CameronBieganek@users.noreply.github.com>
@gdalle
Copy link
Contributor Author

gdalle commented Aug 9, 2023

Can this be backported to 1.10? If 1.10 becomes an LTS release, it might be nice to have these changes in the docs.

Don't ask me, I'm just visiting around these parts 🤣

@LilithHafner
Copy link
Member

Question for triage:

Should we add a boilerplate warning to every mutating function that can misbehave when passed aliased inputs?

@LilithHafner LilithHafner added the status:triage This should be discussed on a triage call label Oct 2, 2023
KristofferC added a commit that referenced this pull request Oct 2, 2023
Backported PRs:
- [x] #48625 <!-- add replace(io, str, patterns...) -->
- [x] #48387 <!-- Improve documentation of sort-related functions -->
- [x] #48363 <!-- Revise sort.md and docstrings in sort.jl -->
- [x] #48977 <!-- Update SparseArrays.jl stdlib for SuiteSparse 7 -->
- [x] #50719 <!-- fix `CyclePadding(::DataType)` -->
- [x] #50694 <!-- inference: permit recursive type traits -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [x] #50802 <!-- Makes IntrusiveLinkedListSynchronized mutable to avoid
allocation on poptask -->
- [x] #50858 <!-- Add a `threadpool` parameter to `Channel` constructor
-->
- [x] #50874 <!-- Restrict COFF to a single thread when symbol count is
high -->
- [x] #50822 <!-- Add default method for setmodifiers! -->
- [x] #50730 <!-- Fix integer overflow in `isapprox` -->
- [x] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [x] #50809 <!-- Limit type-printing in MethodError -->
- [x] #50915 <!-- Add note the `Task` about sticky bit -->
- [x] #50929 <!-- when widening tuple types in tmerge, only widen the
complex parts -->
- [x] #50928 <!-- Bump JuliaSyntax to 0.4.6 -->
- [x] #50959 <!-- Update libssh2 patches -->
- [x] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [x] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [x] #50912 <!-- Separate foreign threads into a :foreign threadpool
-->
- [x] #51010 <!-- Add ORIGIN to SuiteSparse rpath on Linux/FreeBSD -->
- [x] #50753 <!-- cat: remove unused promote_eltype methods that confuse
inference -->
- [x] #51027 <!-- Implement realloc accounting correctly -->
- [x] #51019 <!-- fix a case of potentially use of undefined variable
when handling error in distributed message processing -->
- [x] #51039 <!-- Revert "Optimize findall(f, ::AbstractArray{Bool})
(#42202)" -->
- [x] #51036 <!-- add missing invoke edge for nospecialize targets -->
- [x] #51042 <!-- inference: fix return_type_tfunc modeling of concrete
functions -->
- [x] #51114 <!-- Workaround upstream FreeBSD issue #272992 -->
- [x] #50892 <!-- Add `JL_DLLIMPORT` to `small_typeof` declaration -->
- [x] #51154 <!-- broadcast: use recursion rather than ntuple to map
over a tuple -->
- [x] #51153 <!-- fix debug typo in "add missing invoke edge for
nospecialize targets (#51036)" -->
- [x] #51222 <!-- Check again if the tty is open inside the IO lock -->
- [x] #51236 <!-- Add lock around uv_unref during init -->
- [x] #51243 <!-- GMP: Gracefully handle more overflows. -->
- [x] #51254 <!-- Ryu: make sure adding zeros does not overwrite
trailing dot -->
- [x] #51175 <!-- shorten stale_age for cachefile lock -->
- [x] #51300 <!-- fix method definition error for bad vararg -->
- [x] #51307 <!-- fix force-throw ctrl-C on Windows -->
- [x] #51303 <!-- ensure revising structs is safe -->
- [x] #51393 
- [x] #51403 

Need manual backport:
- [x] #51009 <!-- fix #50562, regression in `in` of tuple of Symbols -->
- [x] #51053 <!-- Bump Statistics stdlib -->
- [x] #51013 <!-- fix #50709, type bound error due to free typevar in
sparam env -->
- [x] #51305 <!-- reduce test time for rounding and floatfuncs -->

Contains multiple commits, manual intervention needed:
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] #51092 <!-- inference: fix bad effects for recursion -->
- [x] #51247 <!-- Check if malloc has succeeded before incrementing gc
stats -->
- [x] #51294 <!-- use LibGit2_jll for LibGit2 library -->

Non-merged PRs with backport label:
- [ ] #51132 <!-- Handle `AbstractQ` in concatenation -->
- [x] #51029 <!-- testdefs: make sure that if a test set changes the
active project, they change it back when they're done -->
- [ ] #50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions -->
- [x] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
@KristofferC KristofferC mentioned this pull request Oct 3, 2023
31 tasks
@gdalle
Copy link
Contributor Author

gdalle commented Oct 23, 2023

@LilithHafner did this come up in triage? Honestly at this point I'm happy to revert to whatever state of the PR is deemed more mergeable

@LilithHafner
Copy link
Member

No. Last Traige we only talked about the Memory type (#51319) I'll try to make sure we talk about this this week.

@rafaqz
Copy link
Contributor

rafaqz commented Oct 23, 2023

Should we add a boilerplate warning to every mutating function that can misbehave when passed aliased inputs?

Yes

@mkitti
Copy link
Contributor

mkitti commented Oct 24, 2023

  1. We should clearly document which arguments may be mutated.
  2. We should provide a warning, possibly, common as it is now.
  3. The common warning should link to a section in the docs that discusses aliasing and mutating functions.
  4. This section should explain to a general user what aliasing and mutating means.

@LilithHafner
Copy link
Member

Triage says merge!

@LilithHafner LilithHafner added status:merge me PR is reviewed. Merge when all tests are passing and removed status:triage This should be discussed on a triage call labels Oct 26, 2023
Copy link
Member

@LilithHafner LilithHafner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some usages of _DOCS_ALIASING_WARNING need to be qualified for Julia to build.

base/sort.jl Outdated Show resolved Hide resolved
base/sort.jl Outdated Show resolved Hide resolved
@LilithHafner LilithHafner merged commit 58030da into JuliaLang:master Oct 27, 2023
5 of 7 checks passed
@LilithHafner LilithHafner removed the status:merge me PR is reviewed. Merge when all tests are passing label Oct 27, 2023
@LilithHafner
Copy link
Member

Thanks @gdalle! Sorry I made this process so long.

@gdalle
Copy link
Contributor Author

gdalle commented Oct 27, 2023

Don't apologize, thorough reviews are better than no reviews :)

@gdalle gdalle deleted the doc_aliasing_warnings branch October 27, 2023 14:44
KristofferC added a commit that referenced this pull request Nov 2, 2023
Backported PRs:
- [x] #50932 <!-- types: fix hash values of Vararg -->
- [x] #50975 <!-- Use rr-safe `nopl; rdtsc` sequence -->
- [x] #50989 <!-- fix incorrect results in `expm1(::Union{Float16,
Float32})` -->
- [x] #51284 <!-- Avoid infinite loop when doing SIGTRAP in arm64-apple
-->
- [x] #51332 <!-- Add s4 field to Xoshiro -->
- [x] #51397 <!-- call Pkg precompile hook in latest world -->
- [x] #51405 <!-- Remove fallback that assigns a module to inlined
frames. -->
- [x] #51491 <!-- Throw clearer ArgumentError for strip with two string
args -->
- [x] #51531 <!-- fix `_tryonce_download_from_cache` (busybox.exe
download error) -->
- [x] #51541 <!-- Fix string index error in tab completion code -->
- [x] #51530 <!-- Don't mark nonlocal symbols as hidden -->
- [x] #51557 <!-- Fix last startup & shutdown precompiles -->
- [x] #51512 <!-- avoid limiting Type{Any} to Type -->
- [x] #51595 <!-- reset `maxprobe` on `empty!` -->
- [x] #51582 <!-- Aggressive constprop in LinearAlgebra.wrap -->
- [x] #51592 <!-- correctly track element pointer in heap snapshot -->
- [x] #51326 <!-- complete false & true more generally as vals -->
- [x] #51376 <!-- make `hash(::Xoshiro)` compatible with `==` -->
- [x] #51557 <!-- Fix last startup & shutdown precompiles -->
- [x] #51845 
- [x] #51840 
- [x] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [x] #51863 <!-- LLVM 15.0.7-9 -->

Contains multiple commits, manual intervention needed:

- [ ] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [ ] #51092 <!-- inference: fix bad effects for recursion -->

Non-merged PRs with backport label:
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
- [ ] #51414 <!-- improvements on GC scheduler shutdown -->
- [ ] #51366 <!-- Handle infix operators in REPL completion -->
- [ ] #50919 <!-- Code loading: do the "skipping mtime check for stdlib"
check regardless of the value of `ispath(f)` -->
- [ ] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions in Base -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
KristofferC pushed a commit that referenced this pull request Nov 6, 2023
@KristofferC KristofferC mentioned this pull request Nov 6, 2023
39 tasks
@KristofferC KristofferC added backport 1.10 Change should be backported to the 1.10 release and removed backport 1.10 Change should be backported to the 1.10 release labels Nov 27, 2023
KristofferC added a commit that referenced this pull request Dec 2, 2023
Backported PRs:
- [x] #51213 <!-- Wait for other threads to finish compiling before
exiting -->
- [x] #51520 <!-- Make allocopt respect the GC verifier rules with non
usual address spaces -->
- [x] #51598 <!-- Use a simple error when reporting sysimg load
failures. -->
- [x] #51757 <!-- fix parallel peakflop usage -->
- [x] #51781 <!-- Don't make pkgimages global editable -->
- [x] #51848 <!-- allow finalizers to take any locks and yield during
exit -->
- [x] #51847 <!-- add missing wait during Timer and AsyncCondition close
-->
- [x] #50824 <!-- Add some aliasing warnings to docstrings for mutating
functions in Base -->
- [x] #51885 <!-- remove chmodding the pkgimages -->
- [x] #50207 <!-- [devdocs] Improve documentation about building
external forks of LLVM -->
- [x] #51967 <!-- further fix to the new promoting method for
AbstractDateTime subtraction -->
- [x] #51980 <!-- macroexpand: handle const/atomic struct fields
correctly -->
- [x] #51995 <!-- [Artifacts] Pass artifacts dictionary to
`ensure_artifact_installed` dispatch -->
- [x] #52098 <!-- Fix errors in `sort` docstring -->
- [x] #52136 <!-- Bump JuliaSyntax to 0.4.7 -->
- [x] #52140 <!-- Make c func `abspath` consistent on Windows. Fix
tracking path conversion. -->
- [x] #52009 <!-- fix completion that resulted in startpos of 0 for `\\
-->
- [x] #52192 <!-- cap the number of GC threads to number of cpu cores
-->
- [x] #52206 <!-- Make have_fma consistent between interpreter and
compiled -->
- [x] #52027 <!-- fix Unicode.julia_chartransform for Julia 1.10 -->
- [x] #52217 <!-- More helpful error message for empty `cpu_target` in
`Base.julia_cmd` -->
- [x] #51371 <!-- Memoize `cwstring` when used for env lookup /
modification on Windows -->
- [x] #52214 <!-- Turn Method Overwritten Error into a PrecompileError
-- turning off caching -->
- [x] #51895 <!-- Devdocs on fixing precompile hangs, take 2 -->
- [x] #51596 <!-- Reland "Don't mark nonlocal symbols as hidden"" -->
- [x] #51834 <!-- [REPLCompletions] allow symbol completions within
incomplete macrocall expression -->
- [x] #52010 <!-- Revert "Support sorting iterators (#46104)" -->
- [x] #51430 <!-- add support for async backtraces of Tasks on any
thread -->
- [x] #51471 <!-- Fix segfault if root task is NULL -->
- [x] #52194 <!-- Fix multiversioning issues caused by the parallel llvm
work -->
- [x] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are
now first class -->
- [x] #52030 <!-- Bump Statistics -->
- [x] #52189 <!-- codegen: ensure i1 bool is widened to i8 before
storing -->
- [x] #52228 <!-- Widen diagonal var during `Type` unwrapping in
`instanceof_tfunc` -->
- [x] #52182 <!-- jitlayers: replace sharedbytes intern pool with one
that respects alignment -->

Contains multiple commits, manual intervention needed:
- [ ] #51092 <!-- inference: fix bad effects for recursion -->

Non-merged PRs with backport label:
- [ ] #52196 <!-- Fix creating custom log level macros -->
- [ ] #52170 <!-- fix invalidations related to `ismutable` -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed the backport 1.10 Change should be backported to the 1.10 release label Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:docs This change adds or pertains to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet