Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: World-age partition bindings #54654

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

Keno
Copy link
Member

@Keno Keno commented Jun 2, 2024

This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that const bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this.

Motivation

The reasons for this change are manifold:

  1. The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very begining of Revise (redefining struct timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature request. A past attempt to implement the necessary julia support in Support type renaming #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of Support type renaming #22721.

  2. A secondary motivation is to make const-redefinition no longer UB (although const redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in Add devdocs on UB #54099 and Behavior of reassignment to a const #38584.

  3. Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (Track backedges through not-yet-defined bindings? #53958).

  4. Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like compiling top-level expressions forces global binding resolution #14055.

Implementation

In this PR:

  • Binding gets min_world/max_world fields like CodeInstance
  • Various lookup functions walk this linked list using the current task world_age as a key
  • Inference accumulates world bounds as it would for methods
  • Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef
  • One primary complication is that our IR definition permits const globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in Core.Compiler that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds.
  • A new @world macro can be used to access bindings from old world ages. This is used in printing for old objects.
  • The const-override behavior was changed to only be permitted at toplevel. The warnings about it being UB was removed.

Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like:

f() = Any
g(::f()) = 1
f() = Int

Does not redefine g. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings.

Demo

julia> struct Foo
               a::Int
       end

julia> g() = Foo(1)
g (generic function with 1 method)

julia> g()
Foo(1)

julia> f(::Foo) = 1
f (generic function with 1 method)

julia> fold = Foo(1)
Foo(1)

julia> struct Foo
               a::Int
               b::Int
       end

julia> g()
ERROR: MethodError: no method matching Foo(::Int64)
The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  Foo(::Int64, ::Int64)
   @ Main REPL[6]:2
  Foo(::Any, ::Any)
   @ Main REPL[6]:2

Stacktrace:
 [1] g()
   @ Main ./REPL[2]:1
 [2] top-level scope
   @ REPL[7]:1

julia> f(::Foo) = 2
f (generic function with 2 methods)

julia> methods(f)
# 2 methods for generic function "f" from Main:
 [1] f(::Foo)
     @ REPL[8]:1
 [2] f(::@world(Foo, 0:26898))
     @ REPL[4]:1

julia> fold
@world(Foo, 0:26898)(1)

Performance consideration

On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly.

Semantic TODO

  • Do we want to change the resolution time of bindings to (semantically) resolve them immediately?
  • Do we want to introduce guard bindings when inference assumes the absence of a binding?
  • When (if ever) do globals get declared implicitly?

Implementation TODO

@Keno Keno requested a review from timholy June 2, 2024 21:50
@giordano giordano changed the title WIP: World-age parition bindings WIP: World-age partition bindings Jun 2, 2024
@Keno
Copy link
Member Author

Keno commented Jun 4, 2024

Summarizing discussion with @JeffBezanson (only part of it) @StefanKarpinski @vtjnash @topolarity @gbaraldi @oscardssmith from today about the remaining semantic questions:

Semantic TODO

When does binding resolution happen semantically?

Example 1

Right now, the precise point-in-time of binding resolution is ill-defined. To illustrate this, consider:

module Exporter
	export foo
	foo = 1
end

using .Exporter
f() = foo
julia> f()
1

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1
julia> global foo = 2
2

julia> f()
2

but also

julia> code_typed(f)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Main.foo::Any
└──      return %1
) => Any

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

Basically, right now bindings get resolved whenever anything in the system happens to look at a binding, but since the running-or-not of inference is outside the semantics of the language, these semantics are ill-defined.

Example 2

Similar for binding ambiguousness:

module Exporter1
	export foo
	foo = 1
end

module Exporter2
	export foo
	foo = 2
end
using .Exporter1

with

julia> foo
1

julia> using .Exporter2
WARNING: using Exporter2.foo in module Main conflicts with an existing identifier.

julia> foo
1
julia> using .Exporter1

julia> using .Exporter2

julia> foo
WARNING: both Exporter2 and Exporter1 export "foo"; uses of it in module Main must be qualified
ERROR: UndefVarError: `foo` not defined in `Main`
Hint: It looks like two or more modules export different bindings with this name, resulting in ambiguity. Try explicitly importing it from a particular module, or qualifying the name with the module it should come from.

(similar to example 1, the explicit reference of foo is not necessary, anything in the system that might resolve the binding counts).

Proposed semantics

The proposed semantics are (and this was not fully spelled out in the discussion, so there may be some further debate on this) that bindings resolve (in decreasing order of priority)

  1. The most recently declared (in world age terms) global/const binding, unless explicitly deleted.
  2. The most recent (in world age terms) explicit import using Foo: x, unless explicitly deleted.
  3. An implicit import from all using'd modules available in the world age.

Note 1: Only toplevel global declares new bindings. A function level global declares a new (Any-typed) binding only if there was no previous top-level global.
Note 2: global x (even at top level) does not declare a new binding if previously declared global, otherwise equivalent to global x::Any.
Note 3: global x::T does not declare a new binding if x is already a global in the current module and T is egal to the currently declared type of x.

To see concrete effects, the assignment in Example 1, global foo = 2 becomes allowed. f() subsequently returns 2. In the first case of Example 2, the second use of foo will give the same ambiguous error as the first example does.

Overall, these semantics remove any consideration of code-execution order (with the exception of the ordering of world-age-incrementing top-level declarations) and should be a lot clearer. There should also be no change in binding resolution in cases that are not currently errors or warnings. Cases that are may change slightly, but as discussed above, we currently don't actually guarantee those resolutions, because binding resolution may happen at any time.

The one wrinkle here is that currently, there is precisely one case where bindings are introduced by non-toplevel code as discovered in #54607. We discussed this somewhat extensively. This change was introduced in 1.9 to allow modification of existing globals in other modules, but accidentally also permitted the creation of new bindings. The overall consensus was the independent of any semantics changes here, we need to correct this oversight, I will be submitting a PR shortly to attempt to correct this issue for 1.11, though we may have to go through a deprecation since it was in the system for two releases. Regardless, it shouldn't be an impediment for this change.

What to do about replacement of mutable bindings?

The semantics of binding replacement for const bindings are fairly clear and match the semantics of method replacement reasonably closely: The values that you see are the values that happened to be assigned when the world age is captured. However, this issue becomes trickier with mutable bindings. Here are some examples (I'll be using opaque closures as the canonical world age capture mechanism, but feel free to substitute tasks or whatever other world-age capture mechanism you prefer).

Example

For example, what does the following do:

global x::Integer = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

Now, consider the following:

@world(x, old_age)
oc();
x
@world(x, old_age)

As implemented in this PR, bindings are fully partitioned, so this would give 1 2 3 2. However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.

Other options

For completeness, I will list all the options we discussed, although some of these are probably bad:

Suggestion 1: Merge bindings with egal metadata across world ages.

This is not quite relevant to the example, but there was a proposal that in:

global x::Int # World Age 1
global x::Float64 # World Age 2
global x::Int # World Age 3

the bindings in world ages 1 and 3 should alias. I think we largely discarded this proposal as

  1. Not addressing the problem fully and
  2. Being confusing to users, as narrowing/widening does not get these semantics, so for people who don't know the precise semantics here, when bindings are shared or not can be confusing.
Suggestion 2: setglobal! assigns in every compatible world age

Basically, the semantics here would be getglobal reads the last setglobal! that assigned a value of a compatible type. So the result in the above example would change to 3 4 4 4, but in this slightly modification:

global x::Float64 = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

you would get 1.0, 2.0, 3, 2.0 as in the current semantics.

We liked this semantically, but were concerned about the difficulty of implementation.

Suggestion 3: Writing outdated mutable bindings is an error

Relatively straightforward. If you replace a binding, then from that point on, all code running in old world ages will error upon assignment. In our running example, we would have 1 OldBindingError 3 1.
There are two primary downside here:

  1. an additional pointer-sized load on every global assignment, but that's likely reasonable and if you really wanted to could probably be fixed by rewriting the binding pointer.
  2. Any setglobal! call can never be proven nothrow

@topolarity raised the point that it seems odd to disallow this at top level, while permitting mutations through mutable objects, but that same concern applies to mutable values accessed through const

Suggestion 4: Do that, but also error on reads

Basically, as soon as you replace a mutable binding, the old one becomes toxic and loudly errors. In our running example, this gives OldBindingError, OldBindingError, 3, OldBindingError. This is quite aggressive, but I do think it is acceptable, because it requires the combination of code that:

  1. Runs in an old world
  2. Makes use of global mutable bindings
  3. Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

An open question is whether we want to extend this behavior to const bindings with mutable type.

Suggestion 5 [late submission]: type assert (using the old binding type) on read and write

Similar to suggestion 4, but rather than erroring unconditionally, old world bindings would gain typeasserts (on read with the old type, on write with the new type). global x::T is equivalent to global x::T = convert(T, @world(x, get_world_counter()). Should also still be compatible with suggestion 2.

Conclusion

I think we arrived at starting with suggestion 4. It's the most conservative and we should explicitly reserve the potential of revisiting the error cases in the future. Suggestion 2 was also well liked, but implementation difficulty was a concern. Additionally, with the indicated semantic reservation, switching from suggestion 4 to suggestion 2 is feasible, but not vice versa.

Eager resolution of bindings?

@JeffBezanson is concerned about losing the redefinition error in the following case:

f() = sin(1)
f()
sin = 4

also known as the missing=false issue. There were several proposals to improve this:

  1. Eagerly resolving whether a binding is imported to a global and disallowing switching this until the module is closed (otherwise as above).
  2. Have an opt-in mode that disallows shadowing imported bindings entirely

Should we do guard entries?

I originally made this a semantic question, although it's really more of a performance consideration. That said, with the changes to binding resolution discussed above, I believe guard entries are required for correctness anyway, so I think this question is moot.

Keno added a commit to JuliaLang/Distributed.jl that referenced this pull request Jun 5, 2024
As discussed in [1], the implicit creation of bindings through the
setglobal! intrinsic was accidentally added in 1.9 unintentionally
and will be removed (ideally) or at the very least deprecated in 1.11.

The recommended replacement syntax is `Core.eval(mod, Expr(:global, sym))` to
introduce the binding and `invokelatest(setglobal!, mod, sym, val)` to set
it. The invokelatest is not presently required, but may be required
for JuliaLang/julia#54654, so it's included
in the recommendation.

[1] JuliaLang/julia#54607
Keno added a commit that referenced this pull request Jun 5, 2024
PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax.
However, the intention of this syntax was always to modify *existing* globals in other modules.
Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even
if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner
case, so nobody caught it at the time.

After some extensive discussions and taking into account the near future direction we want to go
with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit
creation of bindings (but not the ability to assign the *value* of globals in other modules).
Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many
packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer
removal strategy.

Across base and stdlib, there's two cases affected by this change:
1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR.
2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to
   create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102.

As noted in that PR, the recommended replacement where implicit binding creation is desired is:
```
Core.eval(mod, Expr(:global, sym))
invokelatest(setglobal!, mod, sym, val)
```

The `invokelatest` is not presently required, but may be needed by #54654, so it's included
in the recommendation now.

Fixes #54607
@vchuravy
Copy link
Sponsor Member

vchuravy commented Jun 5, 2024

I am not sure about any resolution that will cause code running to error.

I do think it is acceptable, because it requires the combination of code that:

  1. Runs in an old world
  2. Makes use of global mutable bindings
  3. Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.

I do understand the concerns, but for me the conclusion seems more to be that globals must be accessed in the world-age of the task / or long running tasks must be restarted.

I am not convinced that the Revise use-case outweighs the statement: "Old code must be able to keep running"

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

Pretty much yes. I will admit that I generally think mutable globals are the wrong tool for anything beyond basic repl usage and I think of frozen world ages as a somewhat advanced feature, so I don't mind putting the complexity there as much. That said, separate compilation units can of course have their own entirely separate semantics and not observe global assignments from outside at all.

"Old code must be able to keep running"

What about Suggestion 5 where old code does mostly keep running, unless you assign a value to the global that is incompatible with the type restriction that was active in the captured world age. That seems like it would mostly let old code keep running unless you're changing something massively about the globals.

Keno added a commit to JuliaLang/Distributed.jl that referenced this pull request Jun 5, 2024
As discussed in [1], the implicit creation of bindings through the
setglobal! intrinsic was accidentally added in 1.9 unintentionally
and will be removed (ideally) or at the very least deprecated in 1.11.

The recommended replacement syntax is `Core.eval(mod, Expr(:global, sym))` to
introduce the binding and `invokelatest(setglobal!, mod, sym, val)` to set
it. The invokelatest is not presently required, but may be required
for JuliaLang/julia#54654, so it's included
in the recommendation.

[1] JuliaLang/julia#54607
@vchuravy
Copy link
Sponsor Member

vchuravy commented Jun 5, 2024

It feels weird to add extra runtime overhead to a language feature. In some sense I would want to keep the current semantics that long-running tasks must opt-in to seeing "new"/"unexpected" state with invokelatest.

global running::Bool =  true

@spawn begin
      while running
          # 
      end
end

global running::Int

That is the scenario we are worried about? If I now change running my task will never finish.

For me that is equivalent to

const running =  Ref{Bool}(true)

@spawn begin
      while running[]
          # 
      end
end

const running = Ref{Int}(0)

Of course you raised this point in your proposal:

An open question is whether we want to extend this behavior to const bindings with mutable type.

For me information shouldn't travel back in time and thus anything we do here can't impact code executing in a prior world-age. I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

The "right" way fro me to write long-running tasks for that in a hot reloading scenario is probably:

global running::Bool = true
function execute()
     # work
     return running
end

@spawn while true
           invokelates(execute) || break
end

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

It feels weird to add extra runtime overhead to a language feature
Yes, but the overhead is actually less than you might expect, because precompile can already unset bindings, which is somewhat equivalent.

That is the scenario we are worried about? If I now change running my task will never finish.

Yes, or even something more benign like global running::Union{Int, Bool} where the user would be confused by setting running = false after that declaration does not terminate the loop.

For me that is equivalent to
I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

Yes, that was also what I originally wanted to do and is what is implemented in this PR, but you and I both have a very sophisticated understanding of world ages. I fear that most users may not.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

If you modify a binding

One of the biggest problems I have is that "modifying a binding" is not necessarily an explicit operation. From the users perspective, they just changed the type on a binding, or even worse in the:

struct Foo
   a::Int # changed to `Integer` by the user using Revise
end
global foo::Foo = Foo(1)

case, the user may not be touching the binding at all at the source level, but revise still has to do the rebinding.

@vchuravy
Copy link
Sponsor Member

vchuravy commented Jun 5, 2024

Absolutely! For me the question is if the Revise use-case trumps :

  1. No additional run-time costs
  2. Code that ran once in a world-age will not break, e.g. can we keep freezing world-ages

For me 2. is more important than the Revise use-case. We should empower Revise as much as possible, but also acknowledge that it can't do a perfect job with top-level state.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

I don't think it's really about Revise. Revise is fine with either semantics. The question is more about the semantics of old-world code, for which there are two competing priorities:

  1. It should keep running as is
  2. It should match the mental model of the users

For 1, I really don't think that the typeassert solution is that bad. In the pre-typed-globals world (which is still extremely common), people would regularly do things like while running::Bool; end. Doing that implicitly is pretty much the semantics of Suggestion 5 with identical old-world behavior and I think it's a lot simpler of a mental model, because it does not require explaining to users the precise extents of world ages.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

Said another way, I think the issue is mostly about type restriction on globals. If we didn't have that feature, I think the answer would be fairly obvious that there's only one mutable location for a binding (even if you can shadow it with a const for some subset of the world age). So then the question is if the type restriction feature is compelling enough of a reason to world-age fracture by type restriction semantically, and it just seems too niche for that to be realistic.

I think the semantics of:

  • There's only one global location for a given binding
  • Optionally you may declare a (world-age-fractured) type for the global which introduces implicit converts and typeasserts.

seems like a very simple mental model.

@StefanKarpinski
Copy link
Sponsor Member

I think I like that last approach, but I want to be sure what it means. Is the idea:

  • There is exactly one current value for a mutable global binding
  • If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T
  • If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)
  • If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere
  • In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global
  • In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Is that what you have in mind here, @Keno? One thing to consider is: if the user writes x::T = v do we consider v to be the true value of the global or do we consider convert(T, v)::T to be the true value?

@vchuravy
Copy link
Sponsor Member

vchuravy commented Jun 5, 2024

Thinking about it some more this morning, I think "Suggestion 5" may be workable. typeassert on read and write with the type being world-age based.

It allows changing global running::Bool to global running::Nothing and then back, once the user realized the mistake.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

  • There is exactly one current value for a mutable global binding

Yes

  • If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T

I wasn't suggesting the additional convert on access (only assignment), since we don't usually convert on access. Although arguably we don't have to, since the can never have a mismatch.

  • If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)

Yes

  • If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere

Yes

  • In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global

Yes, modulo above question on convert-on-access

  • In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Yes

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

only assignment

Thinking about this some more (and consistent with what I wrote yesterday), I don't think we can introduce old-world converts to/from the new type either on access or on write, because in general the new type may be defined in a new world, so the convert is likely to fail. Of course, we could implicitly transition to the latest world, but I think that's too magical. I think the only semantics that are sensible are:

# M.x
getglobal(M, :x)::get_binding_type(M, :x)
# M.x = val
setglobal!(M, :x, convert(get_binding_type(M, :x), val)::invokelatest(get_binding_type, M, :x))

Where get_binding_type looks up the binding type for running world and get set/getglobal are semantically untyped (for these purposes anyway, those type asserts are probably implicit in those intrinsics).

Keno added a commit that referenced this pull request Jun 5, 2024
PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax.
However, the intention of this syntax was always to modify *existing* globals in other modules.
Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even
if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner
case, so nobody caught it at the time.

After some extensive discussions and taking into account the near future direction we want to go
with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit
creation of bindings (but not the ability to assign the *value* of globals in other modules).
Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many
packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer
removal strategy.

Across base and stdlib, there's two cases affected by this change:
1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR.
2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to
   create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102.

As noted in that PR, the recommended replacement where implicit binding creation is desired is:
```
Core.eval(mod, Expr(:global, sym))
invokelatest(setglobal!, mod, sym, val)
```

The `invokelatest` is not presently required, but may be needed by #54654, so it's included
in the recommendation now.

Fixes #54607
@StefanKarpinski
Copy link
Sponsor Member

The reason I mentioned convert-on-access is because you could have a situation like this:

global x::Any     # world 1
global x::String  # world 2
global x::Float64 # world 3
global x::Int     # world 4

What happens when one does x = 1 in world 4? If you had conversion on-access (or conversion on assignment to all compatible types), then when accessing x in world 3, you would get 1.0. I think that doing the convert eagerly versus lazily with caching would be largely equivalent. Either way, accessing x after that in world 2 would be an error. But one of the subtleties I wanted to tease out is what happens if you do x = big(1) in world 4 and then access x in world 1? Does world 1 see big(1) or Int(1)?

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

I think doing anything beyond the standard conversion on assignment we do right now is too magical. convert does not preserve object identity, so if you have a mutable object, different accesses to the objects no longer alias, and it all just becomes very confusing. I think it's fine in your example, for world 3 accesses to error as well.

@StefanKarpinski
Copy link
Sponsor Member

So basically:

  • Convert in the world where assignment happens
  • If the resulting value is type-compatible with an older world, also bind it there
  • In worlds where the new value is not type compatible, access becomes an error

A couple of clarifying questions:

  • What if assignment occurs in a world that is not the latest? Error?
  • Should we stop scanning historical worlds at the first failed assignment or continue?

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

Basically, the way to think about it is that there is only one storage location for Mod.sym, semantically typed by the binding type in the latest world age.

  • Convert in the world where assignment happens

yes

  • If the resulting value is type-compatible with an older world, also bind it there
  • In worlds where the new value is not type compatible, access becomes an error

There's only one storage location - if the actual stored value is not compatible with the binding type that was declared in a previous world age, access becomes an error.

A couple of clarifying questions:

  • What if assignment occurs in a world that is not the latest? Error?

Convert according to the old binding type and then attempt to assign (without convert) into the global. If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

  • Should we stop scanning historical worlds at the first failed assignment or continue?

There's no scanning of historical worlds, the typeassert is on access.

Keno added a commit that referenced this pull request Jun 5, 2024
PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax.
However, the intention of this syntax was always to modify *existing* globals in other modules.
Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even
if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner
case, so nobody caught it at the time.

After some extensive discussions and taking into account the near future direction we want to go
with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit
creation of bindings (but not the ability to assign the *value* of globals in other modules).
Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many
packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer
removal strategy.

Across base and stdlib, there's two cases affected by this change:
1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR.
2. Distributed wanting to create new bindings. This is a legimitate use case for wanting to
   create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102.

As noted in that PR, the recommended replacement where implicit binding creation is desired is:
```
Core.eval(mod, Expr(:global, sym))
invokelatest(setglobal!, mod, sym, val)
```

The `invokelatest` is not presently required, but may be needed by #54654, so it's included
in the recommendation now.

Fixes #54607
@topolarity
Copy link
Member

If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

Otherwise, it seems like global x::Int has to be an error if the contents of x are incompatible with the new type, even though the variable may be initialized before any use in the new world.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

It is feasible to make it symmetric, but then every new world access will need to have a type-assert on read, even if the binding hasn't been replaced, which has performance implications. I also think I like the semantics of erroring in the old world better, but I'm open to discussion.

Otherwise, it seems like global x::Int has to be an error if the contents of x are incompatible with the new type, even though the variable may be initialized before any use in the new world.

It tries to convert, and is an error if not possible.

@Keno
Copy link
Member Author

Keno commented Jun 5, 2024

but then every new world access will need to have a type-assert on read

I guess we could cache the typeassert for the newest world along with the world age and collapse the check into one.

So yeah, I think either is probably feasible. Your proposal does have the advantage of letting old code keep running if the new code never actually touches the global.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jun 13, 2024

I may have just misunderstood. I thought the intent of this PR is to clarify that those resolve dynamically in the execution world, instead of statically at the time it was defined. Is there a part of this PR semantics which also captures the exact world value (aka primary) at the time the definition?

@Keno
Copy link
Member Author

Keno commented Jun 13, 2024

The binding itself resolves in the execution world, but the execution world cannot declare new globals, i.e in

function foo()
    global x
    x = 1
end

It always look up the current x in execution world to decide what the binding is. However, "binding resolution" (the act of deciding what a particular binding means, whether a global, an import, etc) cannot happen at the execution point, because it now always require (at least semantically) a world age increment. So, at some point prior to execution, we would like to resolve that global x binding. Clearly if somebody did an explicit global x::Any or global x::Int at some point, that takes care of it, but to preserve current code, we also sometimes need to decide implicitly. My suggestion above was for when and how to do that.

Keno added a commit that referenced this pull request Jun 13, 2024
The same definition exists a few lines earlier. Split from #54654 where
it would be an error to try to do this before the invalidation mechanism
exists, but the change is generally applicable.
@vtjnash
Copy link
Sponsor Member

vtjnash commented Jun 13, 2024

So, at some point prior to execution, we would like to resolve that global x binding. Clearly if somebody did an explicit global x::Any or global x::Int at some point, that takes care of it, but to preserve current code, we also sometimes need to decide implicitly.

Okay, I think we are on the same page now with that. Thanks for clarifying. I thought this specific example was already handled (since there is only one possible meaning here: that of declaring there is a mutable global), but it looks like it was currently left as an unreliable runtime check for the behavior of the write. At least with respect to binding resolution, we already have defined that this occurs during method definition time:

julia> f() = (global sin = 3; global cos; nothing)

julia> sin
ERROR: UndefVarError: `sin` not defined in `Main`
Suggestion: add an appropriate import or assignment. This global was declared but not assigned.
Hint: a global variable of this name also exists in Base.

julia> cos
ERROR: UndefVarError: `cos` not defined in `Main`
Suggestion: add an appropriate import or assignment. This global was declared but not assigned.
Hint: a global variable of this name also exists in Base.

I thought one of the main points of this PR though was that Revise should be able to change what kind of binding these is afterwards. Or are you proposing that Revise would just re-evaluate all of the methods afterwards?

@Keno
Copy link
Member Author

Keno commented Jun 13, 2024

Well, the current behavior (at definition time) depends on whether or not the binding was resolved, which is not a notion that is semantically available anymore. For example, in my proposed semantics, this would introduce a global, rather than error (but without the Main., it would behave as it does right now).

julia> Main.sin
sin (generic function with 14 methods)

julia> f() = (global sin = 1; nothing)
f (generic function with 1 method)

julia> f()
ERROR: cannot assign a value to imported variable Base.sin from module Main
Stacktrace:
 [1] f()
   @ Main ./REPL[2]:1
 [2] top-level scope
   @ REPL[3]:1

Keno added a commit that referenced this pull request Jun 13, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 13, 2024
This is a re-worked extraction of #54654, adjusted to support the
new semantics arrived at over the course of that thread. Note that
this is the data-structure change only. The semantics in this PR
still match current master to the greatest extent possible. In
particular, this does not fully implement thread-safety of the
old semantics, since that works differently after #54654 (this
PR essentially implements the old semantics by mutating in place
data structures that are supposed to be immutable), though note
that there were some thread safety concerns in the existing also.

The core idea here is to split `Binding` in two: A new `Binding`
with minimal data and a `BindingPartition` that holds all data
that is world-age partitioned. In the present PR, these are always
in 1:1 correspondednce, but after #54654, there will be multiple
`BindingPartition`s for every `Binding`.

Essentially the `owner` and `ty` fields have been merged into
a new `restriction` field of `BindingPartition`, which may also
hold the value of a constant (consistent with the final semantics
reached in #54654). The non-partitioned binding->value field is
now used exclusively for non-constant globals. The disambiguation
for how to interpret the `restriction` field happens via flags.
`->imported` grew to 2 bits and can now be one of `NONE`/`IMPLICIT`/
`EXPLICIT`/`GUARD`. `GUARD` corresponds to the old `b->owner == NULL`
case. `NONE` corresponds to the old `b->owner == b` case, while
IMPLICIT/EXPLICIT correspond to `b->owner != b` and the old `imported`
flag. Other than that, the behavior of the flags is unchanged.

Additionally, fields are provided for `min_world`/`max_world`/`next`
and a `declared` flag in `Binding`, but these are unused in this PR
and prepratory only.

Because of above mentioned thread-safety concerns, I am not anticipating
merging this PR by itself, but I think it is a good review partition
to cut down on the number of changes in the main PRs.
KristofferC pushed a commit that referenced this pull request Jun 13, 2024
PR #44231 (part of Julia 1.9) introduced the ability to modify globals
with `Mod.sym = val` syntax. However, the intention of this syntax was
always to modify *existing* globals in other modules. Unfortunately, as
implemented, it also implicitly creates new bindings in the other
module, even if the binding was not previously declared. This was not
intended, but it's a bit of a syntax corner case, so nobody caught it at
the time.

After some extensive discussions and taking into account the near future
direction we want to go with bindings (#54654 for both), the consensus
was reached that we should try to undo the implicit creation of bindings
(but not the ability to assign the *value* of globals in other modules).
Note that this was always an error until Julia 1.9, so hopefully it
hasn't crept into too many packages yet. We'll see what pkgeval says. If
use is extensive, we may want to consider a softer removal strategy.

Across base and stdlib, there's two cases affected by this change:
1. A left over debug statement in `precompile` that wanted to assign a
new variable in Base for debugging. Removed in this PR.
2. Distributed wanting to create new bindings. This is a legimitate use
case for wanting to create bindings in other modules. This is fixed in
JuliaLang/Distributed.jl#102.

As noted in that PR, the recommended replacement where implicit binding
creation is desired is:
```
Core.eval(mod, Expr(:global, sym))
invokelatest(setglobal!, mod, sym, val)
```

The `invokelatest` is not presently required, but may be needed by
#54654, so it's included in the recommendation now.

Fixes #54607

(cherry picked from commit b7e7232)
@topolarity
Copy link
Member

If I have:

f() = (global x = 1; nothing)
...
const x = 2

And then I delete the const x definition (or rename it to avoid the conflict), how will Revise update the binding to the mutable state?

Keno added a commit that referenced this pull request Jun 14, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 14, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 14, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 14, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 19, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
Keno added a commit that referenced this pull request Jun 19, 2024
This is a prepratory commit for #54654 to change the lowering of
`const` and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations
are actually assign-once globals. Note however, that this is
not syntactically reachable, since we have a parse error for
plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age.
The semantics in #54654 require that the const-ness and the value
are established atomically (with respect to world age, potentially
on another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T = val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a), :a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form `Expr(:globaldecl, :a, T)`.
This is not technically required, but we currently do not use intrinsics for
world-age affecting side-effects anywhere else in the system. In particular, after #54654,
it would be illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be intrinsic functions
for global modifications (method table additions, etc), currently only reachable through
`Core.eval`, but such an intrinsic would require semantics that differ from both
the current `set_binding_type!` and the new `:globaldecl`.
Using an Expr form here is the most consistent with our current practice for
these sort of things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics,
although there could in principle be a reordering of
side-effects within an expression (e.g. things like
`global a::(@isdefined(a) ? Int : Float64)` might behave
differently after this commit. However, we never defined
the order of side effects (which is part of what this is
cleaning up, although, I am not formally defining any specific
ordering here either - #54654 will do some of that), and that
is not a common case, so this PR should be largely considered
non-semantic with respect to the syntax change.
@Keno Keno mentioned this pull request Jun 23, 2024
Keno added a commit that referenced this pull request Jun 23, 2024
This is a prepratory commit for #54654 to change the lowering of `const`
and typed globals to be compatible with the new semantics.

Currently, we lower `const a::T = val` to:
```
const a
global a::T
a = val
```
(which further expands to typed-globals an implicit converts).

This works, because, under the hood, our const declarations are actually
assign-once globals. Note however, that this is not syntactically
reachable, since we have a parse error for plain `const a`:

```
julia> const a
ERROR: ParseError:
# Error @ REPL[1]:1:1
const a
└─────┘ ── expected assignment after `const`
Stacktrace:
 [1] top-level scope
   @ none:1
```

However, this lowering is not atomic with respect to world age. The
semantics in #54654 require that the const-ness and the value are
established atomically (with respect to world age, potentially on
another thread) or undergo invalidation.

To resolve this issue, this PR changes the lowering of `const a::T =
val` to:
```
let
    local a::T = val
    const (global a) = a
end
```
where the latter is a special syntax form `Expr(:const, GlobalRef(,:a),
:a)`.

A similar change is made to const global declarations, which previously
lowered via intrinsic, i.e. `global a::T = val` lowered to:
```
global a
Core.set_binding_type!(Main, :a, T)
_T = Core.get_binding_type(Main, :a)
if !isa(val, _T)
    val = convert(_T, val)
end
a = val
```

This changes the `set_binding_type!` to instead be a syntax form
`Expr(:globaldecl, :a, T)`. This is not technically required, but we
currently do not use intrinsics for world-age affecting side-effects
anywhere else in the system. In particular, after #54654, it would be
illegal to call `set_binding_type!` in anything but top-level context.
Now, we have discussed in the past that there should potentially be
intrinsic functions for global modifications (method table additions,
etc), currently only reachable through `Core.eval`, but such an
intrinsic would require semantics that differ from both the current
`set_binding_type!` and the new `:globaldecl`. Using an Expr form here
is the most consistent with our current practice for these sort of
things elsewhere and accordingly, this PR removes the intrinsic.

Note that this PR does not yet change any syntax semantics, although
there could in principle be a reordering of side-effects within an
expression (e.g. things like `global a::(@isdefined(a) ? Int : Float64)`
might behave differently after this commit. However, we never defined
the order of side effects (which is part of what this is cleaning up,
although, I am not formally defining any specific ordering here either -
#54654 will do some of that), and that is not a common case, so this PR
should be largely considered non-semantic with respect to the syntax
change.

Also fixes #54787 while we're at it.
Keno added a commit that referenced this pull request Jun 23, 2024
This is a re-worked extraction of #54654, adjusted to support the
new semantics arrived at over the course of that thread. Note that
this is the data-structure change only. The semantics in this PR
still match current master to the greatest extent possible. In
particular, this does not fully implement thread-safety of the
old semantics, since that works differently after #54654 (this
PR essentially implements the old semantics by mutating in place
data structures that are supposed to be immutable), though note
that there were some thread safety concerns in the existing also.

The core idea here is to split `Binding` in two: A new `Binding`
with minimal data and a `BindingPartition` that holds all data
that is world-age partitioned. In the present PR, these are always
in 1:1 correspondednce, but after #54654, there will be multiple
`BindingPartition`s for every `Binding`.

Essentially the `owner` and `ty` fields have been merged into
a new `restriction` field of `BindingPartition`, which may also
hold the value of a constant (consistent with the final semantics
reached in #54654). The non-partitioned binding->value field is
now used exclusively for non-constant globals. The disambiguation
for how to interpret the `restriction` field happens via flags.
`->imported` grew to 2 bits and can now be one of `NONE`/`IMPLICIT`/
`EXPLICIT`/`GUARD`. `GUARD` corresponds to the old `b->owner == NULL`
case. `NONE` corresponds to the old `b->owner == b` case, while
IMPLICIT/EXPLICIT correspond to `b->owner != b` and the old `imported`
flag. Other than that, the behavior of the flags is unchanged.

Additionally, fields are provided for `min_world`/`max_world`/`next`
and a `declared` flag in `Binding`, but these are unused in this PR
and prepratory only.

Because of above mentioned thread-safety concerns, I am not anticipating
merging this PR by itself, but I think it is a good review partition
to cut down on the number of changes in the main PRs.
Keno added a commit that referenced this pull request Jul 1, 2024
Currently we error when attempting to serialize Bindings that do not
beloing to the incremental module (GlobalRefs have special logic to
avoid looking at the binding field). With #54654, Bindings will
show up in more places, so let's just unique them properly by
their module/name identity. Of course, we then have two objects
so serialized (both GlobalRef and Binding), which suggests that
we should perhaps finish the project of unifying them. This is
not currently possible, because the existence of a binding object
in the binding table has semantic content, but this will change
with #54654, so we can do such a change thereafter.
Keno added a commit that referenced this pull request Jul 2, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 2, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 2, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 2, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 4, 2024
Currently we error when attempting to serialize Bindings that do not
beloing to the incremental module (GlobalRefs have special logic to
avoid looking at the binding field). With #54654, Bindings will show up
in more places, so let's just unique them properly by their module/name
identity. Of course, we then have two objects so serialized (both
GlobalRef and Binding), which suggests that we should perhaps finish the
project of unifying them. This is not currently possible, because the
existence of a binding object in the binding table has semantic content,
but this will change with #54654, so we can do such a change thereafter.
Keno added a commit that referenced this pull request Jul 4, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 5, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type
definition contains an early `isdefined` that forces a decision
on binding resolution before the assignment of the actual binding.
In the current implementation, this doesn't matter much, but
with #54654, this would incur a binding invalidation we would like
to avoid.

To get around this, we extend the (internal) `isdefined` form to take
an extra argument specifying whether or not to permit looking at
imported bindings. If not, resolving the binding is not required
semantically, but for the purposes of type definition (where assigning
to an imported binding would error anyway), this is all we need.
Keno added a commit that referenced this pull request Jul 6, 2024
This is a follow up to resolve a TODO left in #54773 as part of
preparatory work for #54654. Currently, our lowering for type definition
contains an early `isdefined` that forces a decision on binding
resolution before the assignment of the actual binding. In the current
implementation, this doesn't matter much, but with #54654, this would
incur a binding invalidation we would like to avoid.

To get around this, we extend the (internal) `isdefined` form to take an
extra argument specifying whether or not to permit looking at imported
bindings. If not, resolving the binding is not required semantically,
but for the purposes of type definition (where assigning to an imported
binding would error anyway), this is all we need.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants