Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add devdocs on UB #54099

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add devdocs on UB #54099

wants to merge 2 commits into from

Conversation

Keno
Copy link
Member

@Keno Keno commented Apr 16, 2024

This adds some basic devdocs for undefined behavior, since there keep being complaints that it's not written down anywhere. The list of UB is just what I remembered off the top of my head, it's probably incomplete, since we haven't historically been particularly careful about this. This way, at least when people add new sources of UB, they'll have a centralized place to write that down.

This adds some basic devdocs for undefined behavior, since
there keep being complaints that it's not written down anywhere.
The list of UB is just what I remembered off the top of my head,
it's probably incomplete, since we haven't historically been
particularly careful about this. This way, at least when people
add new sources of UB, they'll have a centralized place to write
that down.
Copy link
Contributor

@Seelengrab Seelengrab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for writing this down! I've added some comments/fixed some typos.
It's great to see this being written down in the context of julia :)

doc/src/devdocs/ub.md Outdated Show resolved Hide resolved
doc/src/devdocs/ub.md Outdated Show resolved Hide resolved
doc/src/devdocs/ub.md Outdated Show resolved Hide resolved

Additionally, the allowable behaviors for a given program are not unique. For example, the `@fastmath` macro gives wide semantic latitude for floating point math rearrangements and two subsequent invocation of the same operation, even on the same values is not semantically required to produce the same answer. The situation is similar for asynchronous operations, random number generation, et.c

*Undefined Behavior* (UB) occurs when a julia program semantically perform an operation that is assumed to never happen. In such a situation, the language semantics do not constrain the behavior of the implementation, so any behavior of the program is allowable, including crashes, memory corruption, incorrect behavior, etc. As such, it is very important to avoid writing programs that semantically execute undefined behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*Undefined Behavior* (UB) occurs when a julia program semantically perform an operation that is assumed to never happen. In such a situation, the language semantics do not constrain the behavior of the implementation, so any behavior of the program is allowable, including crashes, memory corruption, incorrect behavior, etc. As such, it is very important to avoid writing programs that semantically execute undefined behavior.
*Undefined Behavior* (UB) occurs when a julia program semantically performs an operation that is assumed to never happen. In such a situation, the language semantics do not constrain the behavior of the implementation, so any behavior of the program is allowable, including crashes, memory corruption, incorrect behavior, etc. As such, it is very important to avoid writing programs that semantically execute undefined behavior.

I think it's not good to call this "Undefined Behavior". The behavior that does happen is (usually) something sensible/understandable, albeit outside of the semantics of the language. E.g. crashes & memory corruption is a common result when going outside the semantics of the language - running rm -rf --no-preserve-root / is not (at least not as a result of the existence of UB alone - the compiler inserting such a thing when it detects/takes advantage of UB for some optimization is a bit absurd).

I understand that this follows the definition C uses (as described e.g. here in sufficient detail), but IMO that's not a good definition. Zig is also planning to move away from this term for much the same reasons as stated in the article.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

running rm -rf --no-preserve-root /

This can definitely happen though. Obviously the compiler won't try to insert it by itself, but if it's present anywhere else in the code, UB can and will be able to cause incorrect jumps to such code.

As for undefined behavior, I'm sympathetic to the argument that undefined is not the correct word in the absence of multiple implementations, and I have no problem with using a different name, but I do think that undefined behavior is a correct description of what's happening.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously the compiler won't try to insert it by itself, but if it's present anywhere else in the code, UB can and will be able to cause incorrect jumps to such code.

Yes exactly. I think what bothers me about the current wording is that it can be read as if that incorrect jump would be (in other execution chains) correct behavior of the original code, which isn't necessarily true either. For example, you can build up that string & call system on it, if the appropriate gadgets are available to use e.g. return oriented programming for malicious code execution. Such uses quite clearly aren't a "behavior of the program".

As for undefined behavior, I'm sympathetic to the argument that undefined is not the correct word in the absence of multiple implementations, and I have no problem with using a different name, but I do think that undefined behavior is a correct description of what's happening.

Yes, that's my stance as well. Behaviors outside of the defined semantics of Julia can in general have the same effects as "UB" in C, I just think that the term is too loaded with C specifics to be all that useful :)

doc/src/devdocs/ub.md Outdated Show resolved Hide resolved
doc/src/devdocs/ub.md Show resolved Hide resolved
- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [1]
- Retention of pointers to GC-tracked objects outside of a `@GC.preserve` region
- Memory modification of GC-tracked objects without appropriate write barriers from outside of julia (e.g. in native calls, debuggers, etc.)
- Violations of the memory model using `unsafe` operations (e.g. `unsafe_load` of an invalid pointer, pointer provenance violations, etc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the memory model has been specified by now, can this link to the definition and #46739 be closed?

Also, how should provenance be established for globally constant pointers, e.g. for MMIO-based hardware interactions (e.g. a UART device on a raspberry pi, which lives at a fixed location in memory)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory model is not fully specified, it's been a longstanding item on Jameson's docket.

doc/src/devdocs/ub.md Outdated Show resolved Hide resolved
doc/src/devdocs/ub.md Show resolved Hide resolved

## Special cases explicitly NOT undefined behavior

- As of Julia 1.11, access to undefined bits types is no longer undefined behavior. It is still allowed to return an arbitrary value of the bits type, but the value returned must be the same for every access and use thereof is not undefined behavior. In LLVM terminology, the value is `freeze undef`, not `undef`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is referring to cases like x = Ref{Int}(); println(x[]), right? Does this imply that e.g. padding can be assumed to be constant too?

Moreover, not all bitpatterns are necessarily valid values of the bitstype - e.g. a type such as

struct MaskedByte
    byte::UInt8
    MaskedByte(b::UInt8) = new(b & 0x0f)
end

doesn't have the same valid bitpatterns as UInt8, so Ref{MaskedByte}()[] can't guarantee to produce an arbitrary (valid) value of that type without going through the constructor:

julia> reinterpret(UInt8, Ref{MaskedByte}()[])
0x70

It's infeasible to guarantee calling a constructor for such uses though, so maybe the values produced like that (if not undefined) should have their own name? This would of course also cover invalid values of @enum, which currently uses the term invalid for this:

julia> @enum Foo A=0x0 B C D

julia> reinterpret(Foo, Int32(0xff))
<invalid #255>::Foo = 255

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessing padding is UB, as mentioned in the other comment. Violating inner constructor constraints for bits types is currently not UB or even disallowed, although I would like to clamp down on that in the future (by adding a bit in the type that disallows this).

Co-authored-by: Sukera <11753998+Seelengrab@users.noreply.github.com>
Copy link
Sponsor Member

@aviatesk aviatesk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this writeup.


To illustrate the distinction, consider a statement like `print(Ref(1).x)`. The language semantics may specify that the observable behavior of this statement is that the value `1` is printed to `stdout`. However, whether or not the object `Ref` is actually allocated may not be semantically observable (even though it may be implicitly observable by looking at memory use, number of allocations, generated code, etc.). Because of this, the implementation is allowed to replace this statement with `print(1)`, which preserves all semantically observable behaviors.

Additionally, the allowable behaviors for a given program are not unique. For example, the `@fastmath` macro gives wide semantic latitude for floating point math rearrangements and two subsequent invocation of the same operation inside of that macro, even on the same values, is not semantically required to produce the same answer. The situation is similar for asynchronous operations, random number generation, etc.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Additionally, the allowable behaviors for a given program are not unique. For example, the `@fastmath` macro gives wide semantic latitude for floating point math rearrangements and two subsequent invocation of the same operation inside of that macro, even on the same values, is not semantically required to produce the same answer. The situation is similar for asynchronous operations, random number generation, etc.
Additionally, the allowable behaviors for a given program are not unique. For example, the `@fastmath` macro gives wide semantic latitude for floating point math rearrangements and two subsequent invocations of the same operation inside of that macro, even on the same values, are not semantically required to produce the same answer. The situation is similar for asynchronous operations, random number generation, etc.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say parallel executions rather than asynchronous operations? I suppose those are basically the same thing, but we do make a fairly good number of defined behaviors of async calls (memory order for when they start, what random number they start with, etc)

Note that this explicitly applies to *semantically executed* undefined behavior. While julia's compiler is allowed to and does aggressively perform speculative execution of pure functions. Since the execution point is not semantically observable (though again indirectly observable through execution performance), this is allowable by the as-if rule. As such, speculative execution is inhibited unless the code in question is proven to be
free of undefined behavior.

The presence of undefined behavior is modeled as part of julia's effect system using the `:noub` effect bit. See the documentation for `@assume_effects` for more information on querying the compiler's effect model or overriding it for specific situations (e.g. where a dynamic check precludes potential UB from every actually being reached).
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The presence of undefined behavior is modeled as part of julia's effect system using the `:noub` effect bit. See the documentation for `@assume_effects` for more information on querying the compiler's effect model or overriding it for specific situations (e.g. where a dynamic check precludes potential UB from every actually being reached).
The presence of undefined behavior is modeled as part of julia's effect system using the `:noub` effect bit. See the documentation for [`Base.@assume_effects`](@ref) for more information on querying the compiler's effect model or overriding it for specific situations (e.g. where a dynamic check precludes potential UB from ever actually being reached).

The following is a list of sources of undefined behavior,
though it should currently not be considered exhaustive:

- Replacement of `const` values. Note that in interactive mode the compiler will issue a warning for this and some care is taken to mimize impact as a user convenience, but the behavior is not defined.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Replacement of `const` values. Note that in interactive mode the compiler will issue a warning for this and some care is taken to mimize impact as a user convenience, but the behavior is not defined.
- Replacement of `const` values. Note that in interactive mode the compiler will issue a warning for this and some care is taken to minimize the impact as a user convenience, but the behavior is not defined.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is not defined" -> "it may cause undefined behavior"? Should we try to be consistent in calling it exactly UB, or are various equivalent phrases also acceptable?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Replacement of `const` values. Note that in interactive mode the compiler will issue a warning for this and some care is taken to mimize impact as a user convenience, but the behavior is not defined.
- Replacement of `const` values. While the language itself does not define this behavior and the compiler may assume `const` values are never redefined, the compiler does take some care to minimize impact in interactive mode for user convenience.

I think equivalent phrases are ok if when it's clear we're talking about Julia-the-language — writing it as the above I think might have helped Matt-from-three-days-ago.

though it should currently not be considered exhaustive:

- Replacement of `const` values. Note that in interactive mode the compiler will issue a warning for this and some care is taken to mimize impact as a user convenience, but the behavior is not defined.
- Various modification of global state during precompile. Where possible, this is detected and yields an error, but detection is incomplete.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Various modification of global state during precompile. Where possible, this is detected and yields an error, but detection is incomplete.
- Various modifications of global state during precompile. Where possible, this is detected and yields an error, but detection is incomplete.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add an asterisk here that the intent is to detect all forms of this in the future, but the current implementation does not?

A lemma of this maybe worth adding here is that any observation of mutable state from inside a generated function also will trigger UB (such as accessing a global Dict, or similar other examples that are documented already as forbidden)

- Incorrect implementation of a `Core.OptimizedGenerics` interface [1]
- Any invocation of undefined behavior in FFI code (e.g. `ccall`, `llvmcall`) according to the semantics of the respective language
- Incorrect signature types in `ccall` or `cfunction`, even if those signatures happen to yield the correct ABI on a particular platform
- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [1]
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [1]
- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [^1]

- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [1]
- Retention of pointers to GC-tracked objects outside of a `@GC.preserve` region
- Memory modification of GC-tracked objects without appropriate write barriers from outside of julia (e.g. in native calls, debuggers, etc.)
- Violations of the memory model using `unsafe` operations (e.g. `unsafe_load` of an invalid pointer, pointer provenance violations, etc)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Violations of the memory model using `unsafe` operations (e.g. `unsafe_load` of an invalid pointer, pointer provenance violations, etc)
- Violations of the memory model using `unsafe_` operations (e.g. `unsafe_load` of an invalid pointer, pointer provenance violations, etc)

- A value other than `false` (`reinterpret(UInt8, b) == 0x00`) or `true` (`reinterpret(UInt8, b) == 0x01`) for a `Bool` `b`.
- Invoking undefined behavior via compiler intrinsics.

[1] Incorrect use here may be UB, even if not semantically executed, please see the specific documentation of the feature.
Copy link
Sponsor Member

@aviatesk aviatesk Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[1] Incorrect use here may be UB, even if not semantically executed, please see the specific documentation of the feature.
[^1] Incorrect use here may be UB, even if not semantically executed, please see the specific documentation of the feature.


## Special cases explicitly NOT undefined behavior

- As of Julia 1.11, access to undefined bits types is no longer undefined behavior. It is still allowed to return an arbitrary value of the bits type, but the value returned must be the same for every access and use thereof is not undefined behavior. In LLVM terminology, the value is `freeze undef`, not `undef`.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you remember when this change was implemented? It might mean we are now able to remove the check at

if !(length(argtypes) 2 && getfield_notundefined(obj, argtypes[2]))

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, that should be removable. We already taint the consistency when allocating the object, so there is no longer any UB with accessing an object (since #52169 in November):

julia> struct A; b::Int; A() = new(); end
julia> Base.infer_effects(()) do; A(); end
(!c,+e,+n,+t,+s,+m,+u)

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, if it's assured that an undefined isbitstype-field consistently returns the same value, shouldn't those allocations also be considered :consistent, unless they are mutable?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A specific, known allocation always returns the same value when observed, but a specific call / allocation site does not

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 16, 2024

I really don't like this, but thank you.

My previous — obviously wrong — understanding was that the Julia compiler was trying to guard us from LLVM's undefined behaviors via freezes and such. In particular, because the observable behavior of Julia is the spec, it's impossible to know whether, e.g., accessing undef bitstypes are simply arbitrary values or UB — users only see the arbitrary values. Apparently this particular thing isn't UB anymore (yay!), but users are regularly leaning on other sorts of UB here, too.

I think part of my issue here is just how aggressively and adversarially other compiler teams have historically used a very legalistic reading of UB to do absurd things. For example, my mental model was that changing a const was just putting me at risk of using the old value in some places — and those places simply aren't specified. That's just fine. Now, sure, you might not decide to make entire function bodies no-ops when they contain a changed-const var, but another compiler dev could look at this list and see a juicy spot for a great optimization.

@oscardssmith
Copy link
Member

oscardssmith commented Apr 16, 2024

@mbauman there are already cases where changing a const can lead to a method being optimized out. Consider, for example

const a = true
r() = a
r()
a=false
function f()
    if r() == a
        println("always happens")
    else
        println("never happens")
    end
end
using Debugger
@run f()
always happens
f()
never happens

Despite the fact that r is defined to return a, this code currently prints out "never happens". In general, most inconsistencies end up as full scale UB because if the compiler can derive false, anything can happen.

Since we have had the ability to define typed globals, I wonder if we should make redefining a const into an error...

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 16, 2024

That matches my expectations, though. It's just fine that r() uses an old value of a and f() uses a new one and that the two may not be consistent. That feels markedly different than UB — which grants the compiler leeway to compile f() to nothing (executing neither branch and outputting nothing) because you've broken some devdocs law.

Edit to add: I fully understand your point about how inconsistencies like these can "bubble up" to larger unexpected behaviors, but to my understanding the distinction between an undefined behavior and an unspecified behavior is exactly how the language behaves in doing that "bubbling up." If the value of a is unspecified, then all language semantics must still be followed; you just aren't sure what value you'll see at any given point. If using a is undefined behavior, however, then the language is free to use alternative semantics — it can decide to skip calls to f(a) entirely, throw away whole code blocks, etc.

@giordano giordano added the doc This change adds or pertains to documentation label Apr 16, 2024
- Any invocation of undefined behavior in FFI code (e.g. `ccall`, `llvmcall`) according to the semantics of the respective language
- Incorrect signature types in `ccall` or `cfunction`, even if those signatures happen to yield the correct ABI on a particular platform
- Incorrect use of annotations like `@inbounds`, `@assume_effects` in violation of their requirements [1]
- Retention of pointers to GC-tracked objects outside of a `@GC.preserve` region
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Observation and retention?

E.g.

GC.@preserve x begin
    p = Base.pointer_from_objref(x)
end

VS

p1 = Base.pointer_from_objref(x) # UB
GC.@preserve x begin
    p2 = Base.pointer_from_objref(x)
    @assert p1 == p2 # May not be true.
end

If we ever want to allow for a moving GC.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retention may be okay, since I am not certain the GC implements that anyways. In particular, if the only use of an object is ===, and we turn that into a pointer equality check later, does GC rooting know to preserve that pointer?

The other significant behavior of note here is that unsafe_pointer_from_objref does not "recover" the GC-tracked safety property, such that this is UB code as well, in all its forms, since the Base.pointer_from_objref(x) is allowed to escape the preserve region (via Base.unsafe_pointer_from_objref), which is not permitted:

x = GC.@preserve x Base.unsafe_pointer_from_objref(Base.pointer_from_objref(x))

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean unsafe_pointer_to_objref

@Seelengrab
Copy link
Contributor

Seelengrab commented Apr 16, 2024

My previous — obviously wrong — understanding was that the Julia compiler was trying to guard us from LLVM's undefined behaviors via freezes and such.

That is not so - admittedly, without a specified memory model it's a bit hard to do for everything in this PR (things involving pointers in particular, though those operations generally inherit the semantics of C by necessity), but in principle everything here can be justified as being in this document through Julia semantics alone. Undefined behavior is not something that bubbles up from a lower level of abstraction, but is inherent to every level of abstraction, to some extent (and with sometimes more disliked behavior than other times). Getting an error when calling lowercase on a String containing malformed UTF-8 is just as much "undefined behavior" (in the sense that it's not defined what exactly happens when you try to do that) as misusing @inbounds is - the latter usually just has much more disastrous effects. Actually getting an error you can handle is a gazillion times better than silent memory corruption.

This difficulty is part of the reason why I want to get away from the term "undefined behavior", because people generally think "ah, this is this nasty C business with pointers and boundschecking and stuff" instead of "I'm relying on behavior that is not specified/guaranteed to continue to exist".

*Undefined Behavior* (UB) occurs when a julia program semantically perform an operation that is assumed to never happen. In such a situation, the language semantics do not constrain the behavior of the implementation, so any behavior of the program is allowable, including crashes, memory corruption, incorrect behavior, etc. As such, it is very important to avoid writing programs that semantically execute undefined behavior.

Note that this explicitly applies to *semantically executed* undefined behavior. While julia's compiler is allowed to and does aggressively perform speculative execution of pure functions. Since the execution point is not semantically observable (though again indirectly observable through execution performance), this is allowable by the as-if rule. As such, speculative execution is inhibited unless the code in question is proven to be
free of undefined behavior.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, I think we only require (and implement) the slightly weaker form that it does not execute the undefined behavior, not that it is entire free of it (basically what the first point said of the runtime, but reiterated from the perspective of the compiler calling speculating the call)

- Violations of the memory model using `unsafe` operations (e.g. `unsafe_load` of an invalid pointer, pointer provenance violations, etc)
- Violations of TBAA guarantees (e.g. using `unsafe_wrap`)
- Mutation of data promised to be immutable (e.g. in `Base.StringVector`)
- Data races
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Data races
- Data races, although some limits are still placed upon the allowable behavior, per the memory model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very vague to me, to the point of not really being helpful 🤔 Even Rust doesn't give more justification to data races being UB other than calling them out for being UB.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust puts a limit that data races therefore are impossible because they would be UB if possible. Ours is closer to the Java memory model: while they are possible, they are not full UB, since we do define some limits on their (mis)behaviors

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. It's pretty easy to construct data races that result in UB though, e.g. with a function that vectorizes a loop over a Vector while simultaneously push!ing to that vector from another task. The reallocation results in OOB reads (and possibly writes) once the original Memory is GCed.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot be GCd while there remains a reference to the Memory. The question there is whether we force MemoryRef to always use double-word atomic relaxed loads and stores to make sure even the concurrent update is safe. Otherwise you could get a partially torn read/write there currently.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And because Array uses inbounds annotations internally, it is already covered by that bullet point on the correctness of that

- Violations of TBAA guarantees (e.g. using `unsafe_wrap`)
- Mutation of data promised to be immutable (e.g. in `Base.StringVector`)
- Data races
- Modification of julia-internal mutable state (e.g. task schedulers, data types, etc.)
Copy link
Sponsor Member

@vtjnash vtjnash Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...including overloading key functions such as getproperty(::Type{T}, ...), or show(io::IO, ::Type{T}) 😅

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 16, 2024

Getting an error when calling lowercase on a String containing malformed UTF-8 is just as much "undefined behavior" (in the sense that it's not defined what exactly happens when you try to do that) as misusing @inbounds is

In C standards and related common usage, the former is called "unspecified behavior", not undefined behavior and makes a precise distinction between them. However it is also true that many UTF-8 decoders are unsound if they encounter any malformed data and therefore they will cause undefined behavior (but this is not true of the Julia decoder).


Additionally, the allowable behaviors for a given program are not unique. For example, the `@fastmath` macro gives wide semantic latitude for floating point math rearrangements and two subsequent invocation of the same operation inside of that macro, even on the same values, is not semantically required to produce the same answer. The situation is similar for asynchronous operations, random number generation, etc.

*Undefined Behavior* (UB) occurs when a julia program semantically perform an operation that is assumed to never happen. In such a situation, the language semantics do not constrain the behavior of the implementation, so any behavior of the program is allowable, including crashes, memory corruption, incorrect behavior, etc. As such, it is very important to avoid writing programs that semantically execute undefined behavior.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is perhaps worth a note here that the term "unsafe", when it appears in julia documentation or function names, is typically intended to be interpreted exactly in these terms: of causing UB if any of the arguments do not carefully follow the contract of that function

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 16, 2024

I don't disagree with any of the things in this PR being marked as unspecified or implementation-defined or flat-out invalid.

What I dislike is the term UB and the privilege that it traditionally grants the compiler — it's not generic "nasty C pointer stuff," it's the way in which UB wrecks mayhem around it. It's how UB propagates not just forwards in the execution context but also backwards in how it affects the entire function's compilation. It's how C compilers will notoriously delete error checks that were naively (and, yes, improperly) trying to "catch" some UB.

It's how, if invalid string codeunits were included in this list as UB, it'd grant Julia the license to skip the error entirely. And how that might only happen in some contexts. And how trying to test for invalid codeunits would itself be invalid and eligible for deletion. Even if the UB is scoped to s = lowercase(invalid_string), it still would allow Julia to skip the error entirely. And then do arbitrary deletions with downstream uses of s. That's why I dislike UB.

This document cuts two ways: it sets guardrails for users, and it grants license to the compiler devs. The guardrails for user APIs need to be in user docs, not devdocs.

@Keno
Copy link
Member Author

Keno commented Apr 16, 2024

That matches my expectations, though. It's just fine that r() uses an old value of a and f() uses a new one and that the two may not be consistent. That feels markedly different than UB — which grants the compiler leeway to compile f() to nothing (executing neither branch and outputting nothing) because you've broken some devdocs law.

const replacement is definitely full on UB. The runtime tries to recover as best it can to limit fallout (by disallowing some of the more egregious cases), but it's not at all modeled. Because of the checks, it took me a few minutes to come up with an example, but it's not particularly hard:

julia> struct Foo
       x::Any
       end

julia> const r = Foo(Ref{Any}());

julia> @noinline function f()
       x = getfield(@__MODULE__, :r).x
       Base.donotdelete(x)
       Base.compilerbarrier(:const, x)
       end
f (generic function with 1 method)

julia> g() = f()[]
g (generic function with 1 method)

julia> h() = g()
h (generic function with 1 method)

julia> @code_typed h()
CodeInfo(
1 ─ %1 = invoke Main.f()::Base.RefValue{Any}
│   %2 = Base.getfield(%1, :x)::Any
└──      return %2
) => Any

julia> const r = Foo(1.0)
WARNING: redefinition of constant Main.r. This may fail, cause incorrect answers, or produce other errors.
Foo(1.0)

julia> g()

[64236] signal (11.2): Segmentation fault: 11
in expression starting at none:0
typekeyvalue_hash at /Users/keno/julia/src/jltypes.c:1637 [inlined]
lookup_typevalue at /Users/keno/julia/src/jltypes.c:1067
jl_inst_arg_tuple_type at /Users/keno/julia/src/jltypes.c:2199
jl_f_tuple at /Users/keno/julia/src/builtins.c:915
indexed_iterate at ./pair.jl:42 [inlined]
indexed_iterate at ./pair.jl:42

Doing an example to turn this into launching space invaders is left as an exercise to the reader.

@Keno
Copy link
Member Author

Keno commented Apr 16, 2024

invalid string codeunits were included in this list as UB

Clearly they're not

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 17, 2024

Clearly, I know.

That const replacement crash is a good example, thanks. But it still feels different to me than how UB has traditionally been exploited by other compilers. Maybe it’s just my feels, but even in that crash the compiler seems doing something reasonable — albeit problematic. I know that reasonable is in the eye of the beholder, but that’s at the crux of it — users and compiler devs have historically had very different views on what reasonable behaviors might be.

I just don’t want to have to argue with a legalistic and maliciously compliant compiler.

I dunno, obviously this ship has already sailed. It’s definitely better to have it written down.

@gbaraldi
Copy link
Member

gbaraldi commented Apr 17, 2024

I guess to be clear, we exploit similar things in julia i.e assuming consts don't change. It's just that our UB surface area is a lot smaller than C's which is the issue with C in itself. Any code with UB is a code with bugs, most of those become runtime errors for us, but C doesn't have that so it just assumes they never happen.

We could create a new term for it, but it would mean exactly the same as UB

@Seelengrab
Copy link
Contributor

In C standards and related common usage, the former is called "unspecified behavior", not undefined behavior and makes a precise distinction between them.

You're right, it's unspecified behavior, not undefined behavior. The exact definition of what that is though is also not consistent across languages, so it seems to me we'd be in need of a definition for that as well...

It's how, if invalid string codeunits were included in this list as UB, it'd grant Julia the license to skip the error entirely.

Note that I haven't claimed either "invalid string codeunits" nor "malformed string codeunits" (very different things!) themselves to be UB - String itself only guarantees to be a "bag of bytes", which is perfectly fine. In practice, I think we generally assume String to be well-formed UTF-8 though, which is why I'm saying that "UB is behavior that we assume never happens" is a bad definition of UB (and is where IMO the angle "let's leverage this for optimization" comes from). I'd much rather define UB in terms of hard language semantics/invariants that must not be violated, but for that we'll need a memory model...

This document cuts two ways: it sets guardrails for users, and it grants license to the compiler devs. The guardrails for user APIs need to be in user docs, not devdocs.

Yes, I agree with that! One caveat though: since Julia doesn't have a spec, writing this down doesn't grant more license to compiler devs than they already had, since the current compiler is the exact "guaranteed" semantics. What I'd love to see is a longer list of "these things are guaranteed NOT to be UB (i.e., they are always allowed/guaranteed by the language semantics)", because that is what ultimately restricts the sorts of optimizations that are possible/valid.

@adienes
Copy link
Contributor

adienes commented Apr 17, 2024

const replacement is definitely full on UB

IMO anything that is

  • easily detectable
  • "full on UB"

should error, rather than warn?

maybe also applies to the a value other than true or false for bool one


- As of Julia 1.11, access to undefined bits types is no longer undefined behavior. It is still allowed to return an arbitrary value of the bits type, but the value returned must be the same for every access and use thereof is not undefined behavior. In LLVM terminology, the value is `freeze undef`, not `undef`.

- As of Julia 1.12, loops that do not make forward progress are no longer considered undefined behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- As of Julia 1.12, loops that do not make forward progress are no longer considered undefined behavior.
+ Loops that do not make forward progress are not considered undefined behavior.

I think as far as most regular users are concerned, ALL of this information may as well be "as of 1.12"

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying, but this is still valuable. Perhaps they could be expressed with !!! compats.

Suggested change
- As of Julia 1.12, loops that do not make forward progress are no longer considered undefined behavior.
!!! compat "Julia 1.12"
Loops that do not make forward progress are no longer considered undefined behavior.

@jonas-schulze
Copy link
Contributor

Should the associativity within reduce and mapreduce be added to the list of implementation-defined behavior? See discussion in #53871.

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 18, 2024

reduce and mapreduce be added to the list of implementation-defined behavior

No, I think that's a different sort of thing; every single function needs to specify the behaviors its callers can expect and its methods need to satisfy. Or similarly grant leeway. This is more about the semantics of the language itself.


I think I can finally express why I so strongly dislike the language of "undefined behavior."

I've known forever that you shouldn't rely on Core.Compiler.return_type. That it's not stable across versions, that it's not always going to work as expected depending upon the compilation mode, etc. And yet I wouldn't have thought to call that UB. It's just something that you shouldn't do.

And that's at the core of why I don't like the terminology of Undefined Behavior — in particular in a language without a standard to define behaviors. Undefined Behavior turns around the "above don't do that" into a threat. UB is "you can do that, but I'll trash your program and it'll be your fault, because it was invalid."

It is jargon, because of how it typically differs from an unspecified behavior, or even a plain reading of "the behavior is undefined." The value from undef bitstype array is unspecified, and you can use it. But were accessing it an undefined behavior, then you couldn't touch it or your program is trashed and it's be your fault. This is beyond the plain interpretation of the two words — and twisting them into a backwards threat. Only people who know compilers can see the distinction between unspecified and undefined. It's fine to use UB in devdocs, but these guidelines need to be user visible. And at that point, I'd much rather simply call these things invalid.

So back to const redefinitions. Redefining const, even in that segfaulting example, is currently ok if I also redefine all the functions that use that binding, too. It makes no sense to say it's full-on UB and at the same time "minimize impact" to preserve some behaviors. There are some behaviors that are defined — the warning is even documented! So what does it mean to me that it appears on this list?

All this to say, this is a great list. We need to do more of this. Other candidates include return types and Tuple{Union{}} and maybe even Tuple{:wat}.

@Keno
Copy link
Member Author

Keno commented Apr 18, 2024

And yet I wouldn't have thought to call that UB.

It's not UB

The value from undef bitstype array is unspecified, and you can use it. But were accessing it an undefined behavior, then you couldn't touch it or your program is trashed and it's be your fault.

It now is, because it's no longer UB. That was a language specification change as a result of #26764.

So back to const redefinitions. Redefining const, even in that segfaulting example, is currently ok if I also redefine all the functions that use that binding, too.

No, it's not guaranteed - there could have been world age capture.

It makes no sense to say it's full-on UB and at the same time "minimize impact" to preserve some behaviors.

UB is about guarantees. After UB occurs the system can no longer make any guarantees about the future behavior the system, basically because a proof of false now exists in the system. We would like to make it work (#40399) and of course we could make it an error right now, but general experience is that users would prefer the system to keep running, even if it becomes crashy, rather than having to restart the system (because the likely outcome of a crash is that you're restarting anyway).

Also finally, I don't know why people keep interpreting what I'm saying as trying to add UB everywhere. If you look at both the issues where we've removed significant sources of UB (#26764, #40009) and the one I mentioned above that may remove more (#40399), both of them were filed by me, as was the work that actually made #40009 not a disaster performance-wise (the termination tracking in the effect system). I think any reasonable observer would conclude that I'm in favor of removing as much UB as possible without compromising the overall performance of the system ;).

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 18, 2024

I don't know why people keep interpreting what I'm saying as trying to add UB everywhere

That's not at all what I'm doing here. This whole thing started because I didn't understand what UB meant for Julia and because I thought others wouldn't understand the term. I've seen and so much appreciate how you've been removing it. Turns out, I still don't understand it — to the point that I can't communicate cogently with you (let alone the compiler). I mean, I get LLVM's poison values and things that are assumed to not happen. I really get and love and appreciate the idea of writing down "third rails" that we can't touch. I want more documented third rails!

But I don't understand UB in higher-level Julia terms. I've used the words "the behavior is undefined" in docstrings before, probably inappropriately. Am I using my words correctly if I describe const redefinition in these terms: There's a clear meaning to what redefining a const should do: it throws a warning in some cases, an error in others, and Julia tries to behave as though that's the value of the binding going forward. But doing so is a source for undefined behavior. It breaks the runtime in a fundamental manner that is not recoverable. Is that accurate? See, that's very different to me than saying that Julia doesn't have any meaningful behaviors when you redefine const, which is what I (and I think others) think when we read that const redefinition is UB.

The feedback I hope you hear is that this is a very specific term of art that I don't understand.

@Keno
Copy link
Member Author

Keno commented Apr 18, 2024

Am I using my words correctly if I describe const redefinition in these terms

Kind of? I think part of the problem is the focusing on the behavior of julia the program, but a lot of this is really about Julia the language, for which julia the program is an implementation. Julia the language has semantics that are independent of julia the program, although julia the program is required to be a correct implementation of Julia the language.

There's additional confusion, because some choices of the semantics of Julia the language are made explicitly due to implementation constraints of julia the program.

So what do we do in those cases? In some cases we, we change the semantics of the language to something that is suboptimal but reasonable (integer overflow overflows silently, does not error). In other cases, we change the semantics to produce an error, but in a small set of cases, an error is not feasible or desirable so the semantics are "julia the program may assume this does not happen".

So the question then is, what happens if the third case does happen? Basically, the answer is 🤷. From the moment any undefined behavior is executed, julia the program ceases to guarantee that it correctly implements the semantics of Julia the language. This state is in principle non-recoverable as you say. Of course it's possible that nothing happens and everything keeps working, but there's no guarantee. It's not really about corruption in the runtime, although that is of course a possible mechanism here (although there are other).

Maybe think of Julia the language as a set of tracks where julia the program is a train that runs on top of them. Generally, it may switch between tracks, and go up and down, through tunnels, over bridges, what have you, but it always stays on the track. Except that in some places, some asshole went and removed a big piece of track and put up a sign "warning: undefined rails ahead". So what happens if the train goes there anyway? Again 🤷. It depends on way too many circumstances - the weight of the train, the wind conditions, wear of the wheels, etc. Best we can say is 🤷 - it might crash, it might accidentally end up back on the tracks and keep going, or maybe something even weirder happened like it ended up at running through the backyard of a nuclear power plant while you're thinking, "weird, I didn't think there were rails there".

free of undefined behavior.

The presence of undefined behavior is modeled as part of julia's effect system using the `:noub` effect bit. See the documentation for `@assume_effects` for more information on querying the compiler's effect model or overriding it for specific situations (e.g. where a dynamic check precludes potential UB from every actually being reached).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
While the language semantics do allow, for example, formatting your hard drive if you redefine a const, we nevertheless try to make the behavior of the `julia` executable reasonable and user friendly wherever possible. Even in the presence of undefined behavior, we make an effort—though not always a successful one—to produce intuitive and predictable results.

In the spirit of @mbauman's comments, I'd like to amend "we can't promise anything" to "we can't promise anything, but we'll still try to be nice if we can". No semantic change, just a friendly reminder that we're all on the same team trying to make user experience better. This also documents an effort we already do make (c.f. the constant redefinition example which works pretty well in practice).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if for no other reason than to avoid SEO-association of "ransomware" with "Julia", I might use another description

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to the less extreme, more canonical "format your hard drive" example.

@LilithHafner
Copy link
Member

The view that "if there is UB then anything goes" is incompatible with the real world where UB is prevalent. For example, according to that semantic, 1+1 == 3 would be valid because, as a nontrivial codebase, of course Base has UB in it and therefore anything goes.

For folks who are skeptical that Base has UB, I decided to find an example in the latest stable release (1.10.2). This function is marked as :total, which includes :nothrow, which applies to all possible arguments, not just semantic executions:

@assume_effects :total function array_subpadding(S, T)
lcm_size = lcm(sizeof(S), sizeof(T))
s, t = CyclePadding(S), CyclePadding(T)

And on some arguments it does throw:

julia> Base.array_subpadding(1, 2)
ERROR: MethodError: no method matching Base.CyclePadding(::Int64)

Closest candidates are:
  Base.CyclePadding(::P, ::Int64) where P
   @ Base reinterpretarray.jl:681
  Base.CyclePadding(::DataType)
   @ Base reinterpretarray.jl:719

Stacktrace:
 [1] array_subpadding(S::Int64, T::Int64)
   @ Base ./reinterpretarray.jl:731
 [2] top-level scope
   @ REPL[23]:1

To continue the train example, if I'm sitting around in the back yard of a nuclear power plant and suddenly a train runs over my clover garden, that's bad. It's reasonable for me to say "hey, there aren't tracks over here, please keep your trains on the tracks". The presence of a bit of "undefined tracks" in the train manufacturing plant (Base) does not not give the train driver license to careen their train into my clover garden unannounced.

Now, I understand that it's hard for train drivers to keep trains well behaved when they drive over "undefined tracks". But in a world where undefined tracks are everywhere, it is a part of the train driver's professional responsibilities.


"Julia is fast" is not in the spec, and is not promised by any legalistic agreements, but we still do a pretty good job, provided users write reasonable code. I think we should treat "Julia doesn't have crazy UB behaviors" similarly. If users are not going out of their way to construct pathological UB exploits, they shouldn't see crazy UB behaviors. And I think we should document that aspiration.


This is not blocking and I'd be happy to add this note (with compiler folks' approval) in a separate PR after this one merges.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 19, 2024

UB triggers only if the user executes it. The compiler is not allowed to speculatively insert UB. (Well, except possibly nothrow as we haven't decided yet if that annotation allows the compiler to invent UB if it does actually throw or if it does not permit hoisting call and only allows eliminating calls)

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 19, 2024

Julia the language has semantics that are independent of julia the program, although julia the program is required to be a correct implementation of Julia the language.

Ahhh, thank you @Keno. OK, now I can much better understand what's written in this PR. I had this backwards — that julia the program itself is what defines Julia the language, and it does so through its observable behaviors, with caveats on what is allowed to be observable. And that's why I was finding UB to be backwards — and why I just wanted to define what's allowed to be observable or not allowed to be performed (like how I said I wanted to just document the "third rails"!).

This is a subtle shift in my own understanding of Julia, and I still don't quite fully appreciate why this distinction is valuable, but now I get what UB means in the context of Julia the language and how it affects julia the program. I think. Mostly.

@Seelengrab
Copy link
Contributor

This is a subtle shift in my own understanding of Julia, and I still don't quite fully appreciate why this distinction is valuable, but now I get what UB means in the context of Julia the language. I think. Mostly.

It's a very subtle thing, yeah - in Python, quite a lot of behavior from cpython has been "adopted" into Python-the-language, which makes it very hard to do meaningful changes. There's this very good talk by Armin Ronacher (How Python was shaped by leaky internals) that gets into some details about a similar distinction there.

Hopefully the distinction between "Julia-The-Language" and "julia-the-program" becomes more clear/obvious with "juliac-the-static-compiler" 👀

@Keno
Copy link
Member Author

Keno commented Apr 19, 2024

And on some arguments it does throw:

That definitely needs to be fixed. @assume_effects is too hard to use correctly, but that's yet another track of effort (#49273 and related).

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 19, 2024

That definitely needs to be fixed

Should we split this into 2 cases? A case where any throw is UB (so hoisting is permitted without introducing UB), and a case where throwing is fine, but relying on that throw to exist is UB (so call elimination is fine without introducing UB, but hoisting is not fine)?

@mbauman
Copy link
Sponsor Member

mbauman commented Apr 19, 2024

One thing that's a little fuzzy to me is the various concepts and the wordings that are used for them:

  • Julia-the-language can be any of:
    • Julia
    • the language
  • julia-the-program:
    • julia
    • language implementation(s)
    • implementation(s)
    • compiler
    • julia's compiler
  • .jl-files-I-write and things I type into a REPL:
    • "a julia program"
    • programs

Let me try to use what I've learned:

Julia-the-language has semantics that define (or don't define!) and constrain (or don't constrain) behaviors that julia-the-program must have when executing my Julia programs. The "don't define" there is UB, the "don't constrain" is implementation-defined.

But then the programs I write can define (or not define) and constrain (or not) what behaviors subtypes should have or what functions mean or what method implementations must do. This is not the Language's UB or implementation-defined, it's just a program.

One question I still have: are the "non-unique allowable behaviors" things from the beginning (@fastmath, rng, async, etc) "implementation-defined" things? Or are they a not-a-language-spec thing? Or something else entirely?

@oscardssmith
Copy link
Member

One question I still have: are the "non-unique allowable behaviors" things from the beginning (@fastmath, rng, async, etc) "implementation-defined" things? Or are they a not-a-language-spec thing? Or something else entirely?

They are something else entirely (except for rng stream which is implementation defined). They define equivalence classes on what is allowable by the as-if rule. @fastmath in particular is a very fancy one which gives the compiler pretty wide latitude to change all your floating point results as long as the produced program has the right vibes.

@giordano
Copy link
Contributor

Minor nitpick to try and reduce further confusion: based on #53873, people can be confused by what Julia is, and our own CONTRIBUTING.md suggests the language should be referred to as "Julia", while the monospace julia should be limited to the executable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc This change adds or pertains to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet