Julep: Expr provenance when lowering #31162

Open
timholy opened this issue Feb 23, 2019 · 3 comments
Labels
compiler:lowering: Syntax lowering (compiler front end, 2nd stage)
kind:feature: Indicates new feature / enhancement requests
kind:julep: Julia Enhancement Proposal

Comments

timholy commented Feb 23, 2019

When lowering code we have line-number provenance, but I want expression-provenance too.
I'd love it if

ex = quote
    if Sys.islinux()
        f() = 1; g() = 2
    else
        g() = 3
    end
end
thunk, src = Meta.lower_logged(m, ex)

resulted in something along the lines of

# Lowered result                                           Corresponding src entry
:($(Expr(:thunk, CodeInfo(
    @ REPL[3]:2 within `top-level scope'
1 ─ %1  = Base.getproperty(Sys, :islinux)                  :(Sys.islinux)
│   %2  = (%1)()                                           :(Sys.islinux())
└──       goto #3 if not %2                                ex
    @ REPL[3]:3 within `top-level scope'
2 ─       $(Expr(:thunk, CodeInfo(                         :(f() = 1)
    @ none within `top-level scope'
1 ─     return $(Expr(:method, :f))
)))
│         $(Expr(:method, :f))                             :(f() = 1)
│   %6  = Core.Typeof(f)                                   :(f)
│   %7  = Core.svec(%6)                                    :(f())
│   %8  = Core.svec()                                      :(f())
│   %9  = Core.svec(%7, %8)                                :(f())
│         $(Expr(:method, :f, :(%9), CodeInfo(quote        :(f() = 1)
    return 1
end)))
│         $(Expr(:thunk, CodeInfo(                         :(g() = 2)
    @ none within `top-level scope'
1 ─     return $(Expr(:method, :g))
)))
│         $(Expr(:method, :g))                             :(g() = 2)
│   %13 = Core.Typeof(g)                                   :(g)
│   %14 = Core.svec(%13)                                   :(g())
│   %15 = Core.svec()                                      :(g())
│   %16 = Core.svec(%14, %15)                              :(g())
│         $(Expr(:method, :g, :(%16), CodeInfo(quote       :(g() = 2)
    return 2
end)))
└──       return g                                         :(g)
    @ REPL[3]:5 within `top-level scope'
3 ─       $(Expr(:method, :g))                             :(g() = 3)
│   %20 = Core.Typeof(g)                                   :(g)
│   %21 = Core.svec(%20)                                   :(g())
│   %22 = Core.svec()                                      :(g())
│   %23 = Core.svec(%21, %22)                              :(g())
│         $(Expr(:method, :g, :(%23), CodeInfo(quote       :(g() = 3)
    return 3
end)))
└──       return g                                         :(g)
))))

src here doesn't contain any "new" expressions, just references to the nested contents of ex, specifically the object you've recursed into at the moment you inserted the corresponding line in thunk. It's essentially a log of "what I'm working on now."
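As a rough illustration of how such a log might be consumed, consecutive lowered statements that point at the same source object can be grouped by identity. This is only a sketch: `Meta.lower_logged` is the proposed function and does not exist, `group_by_source` is a hypothetical helper, and the log below is written by hand to mimic the entries for `f() = 1` in the listing above.

```julia
# Hypothetical helper: group consecutive statement indices that were logged
# against the same source expression. Identity (===) is used on purpose,
# because the log is assumed to hold references into `ex`, not copies.
function group_by_source(src)
    groups = Pair{Any,Vector{Int}}[]
    for (i, ex) in enumerate(src)
        if !isempty(groups) && last(groups).first === ex
            push!(last(groups).second, i)
        else
            push!(groups, Pair{Any,Vector{Int}}(ex, [i]))
        end
    end
    return groups
end

# Hand-written stand-in for a log (an assumption, not real lowering output).
def  = :(f() = 1)
call = def.args[1]              # the nested :(f()) object itself
src  = Any[:f, call, call, call, def]

groups = group_by_source(src)
```

Grouping by `===` rather than `==` keeps two textually identical but distinct sub-expressions apart, which is the property a debugger would need.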

You can sometimes get this from the line number info, but as this example partially demonstrates, that approach is fragile (not guaranteed to return complete expressions) and incomplete (a line might correspond to a block independent of expressions). Moreover, if there aren't any LineNumberNodes in ex, it's impossible. In contrast, the log approach seems quite robust.

I'd implement this myself except for the fact that the key part of this work is in Scheme, and I simply don't have time to develop the relevant mastery. I should therefore say I'm not particularly picky about the exact contents of each entry in src; this is just to convey the general idea.

There are interesting questions about whether logs should have sub-logs for the lowered bodies of methods and other internal :thunk expressions. Rather than returning src, one could simply make this part of CodeInfo, and then the sub-logs would happen naturally. If one went that way, then there's the question of whether this information should survive compression and later retrieval with uncompressed_ast. (I'm guessing that's not on the table, as I think it would substantially slow Julia's startup due to pointer relocation.) I'm fine with this information being transient; getting it out just when lowering manually would be a big win.

timholy added the compiler:lowering and kind:feature labels on Feb 23, 2019

Keno commented Feb 23, 2019

I've wanted this for a long time, though ideally with lowering from CSTParser in order to be able to give exact source locations.

timholy commented Feb 23, 2019

That would be great too, but I'd guess that's a much harder change. If this were Julia code I'd guess this one would be pretty easy, just adding src === nothing || push!(src, ex) every place you see push!(code.code, stmt).
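To make the idea concrete, here is a toy sketch in Julia. It is not Julia's real lowering pass (that is written in Scheme); `flatten!` is a hypothetical function that only linearizes nested calls, and it exists purely to show a provenance log being filled next to the statement list.

```julia
# Toy "lowering": flatten nested calls into a linear list of statements and,
# for each emitted statement, record the expression being processed.
# Passing `src = nothing` turns logging off.
function flatten!(code::Vector{Any}, src::Union{Nothing,Vector{Any}}, ex)
    # Symbols and literals need no statement of their own.
    (ex isa Expr && ex.head === :call) || return ex
    # Recurse into the arguments first so operands are emitted before use.
    args = Any[flatten!(code, src, a) for a in ex.args]
    push!(code, Expr(:call, args...))
    src === nothing || push!(src, ex)   # log a reference to `ex`, not a copy
    return Core.SSAValue(length(code))
end

ex = :(f(g(x), h(y)))
code, src = Any[], Any[]
flatten!(code, src, ex)

# The same pass without a log.
code_nolog = Any[]
flatten!(code_nolog, nothing, ex)
```

Here `code` and `src` end up the same length, and each `src` entry is the very object found inside `ex`, so no new expressions are created.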

timholy commented Mar 18, 2019

The place this is immediately noticeable is in the debugger(s) when stepping through calls to kwarg methods; it would be great to be able to detect that all the "preparatory" steps are associated with a single call. Unfortunately there is a certain amount of diversity here:

julia> a = [4,2,3,1];

julia> m = @which sort(a)
sort(v::AbstractArray{T,1} where T) in Base.Sort at sort.jl:742

julia> Base.uncompressed_ast(m)
CodeInfo(
1 ─ %1 = (Core.NamedTuple)()
│   %2 = (Base.pairs)(%1)
│   %3 = (Base.Sort.:(#sort#8))(%2, #self#, v)
└──      return %3
)

julia> m = first(methods(getfield(Base.Sort, Symbol("#sort#8"))))
#sort#8(kws, ::Any, v::AbstractArray{T,1} where T) in Base.Sort at sort.jl:742

julia> Base.uncompressed_ast(m)
CodeInfo(
1 ─ %1 = (Base.Sort.copymutable)(v)
│   %2 = (Base.NamedTuple)()
│   %3 = (Base.merge)(%2, kws)
│   %4 = (Base.isempty)(%3)
└──      goto #3 if not %4
2 ─ %6 = (Base.Sort.sort!)(%1)
└──      return %6
3 ─ %8 = (Core.kwfunc)(Base.Sort.sort!)
│   %9 = (%8)(%3, Base.Sort.sort!, %1)
└──      return %9
)

julia> g(a) = sort(a; rev=true)
g (generic function with 1 method)

julia> Base.uncompressed_ast(first(methods(g)))
CodeInfo(
1 ─ %1 = (:rev,)
│   %2 = (Core.apply_type)(Core.NamedTuple, %1)
│   %3 = (Core.tuple)(true)
│   %4 = (%2)(%3)
│   %5 = (Core.kwfunc)(Main.sort)
│   %6 = (%5)(%4, Main.sort, a)
└──      return %6
)

That's diverse enough that I'm a bit fearful about making guesses. I think I can write a pattern-matcher (there's already a limited one) that will pick up most of these and yet not be confused by explicitly-written operations like

julia> f() = (x=1,)
f (generic function with 1 method)

julia> Base.uncompressed_ast(first(methods(f)))
CodeInfo(
1 ─ %1 = (:x,)
│   %2 = (Core.apply_type)(Core.NamedTuple, %1)
│   %3 = (Core.tuple)(1)
│   %4 = (%2)(%3)
└──      return %4
)

but I worry it might be a little magical.
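For a sense of what such a heuristic might look like, here is a minimal sketch. It is not the existing matcher: `callee_name` and `kwcall_span` are hypothetical names, and the statement lists are written by hand to mirror the 2019 printouts above, since other Julia versions may lower keyword calls differently. The only rule applied is "a `kwfunc` lookup whose result is later called".

```julia
# Name of the function called by a lowered statement, or `nothing`.
function callee_name(stmt)
    (stmt isa Expr && stmt.head === :call) || return nothing
    f = stmt.args[1]
    f isa GlobalRef && return f.name
    # Quoted form such as :(Core.kwfunc), i.e. Expr(:., :Core, QuoteNode(:kwfunc))
    if f isa Expr && f.head === :. && length(f.args) == 2 && f.args[2] isa QuoteNode
        return f.args[2].value
    end
    return f isa Symbol ? f : nothing
end

# Return (k, j) where statement k looks up `kwfunc` and statement j calls the
# result %k; return `nothing` if the pattern is absent.
function kwcall_span(stmts)
    k = findfirst(s -> callee_name(s) === :kwfunc, stmts)
    k === nothing && return nothing
    j = findnext(s -> s isa Expr && s.head === :call &&
                      s.args[1] === Core.SSAValue(k), stmts, k + 1)
    return j === nothing ? nothing : (k, j)
end

# Hand-written copy of the body of g(a) = sort(a; rev=true) shown above.
g_stmts = Any[
    (:rev,),
    :(Core.apply_type(Core.NamedTuple, $(Core.SSAValue(1)))),
    :(Core.tuple(true)),
    Expr(:call, Core.SSAValue(2), Core.SSAValue(3)),
    :(Core.kwfunc(Main.sort)),
    Expr(:call, Core.SSAValue(5), Core.SSAValue(4), :(Main.sort), :a),
    Expr(:return, Core.SSAValue(6)),
]

# Hand-written copy of the body of f() = (x=1,): same prefix, no kwfunc.
f_stmts = Any[
    (:x,),
    :(Core.apply_type(Core.NamedTuple, $(Core.SSAValue(1)))),
    :(Core.tuple(1)),
    Expr(:call, Core.SSAValue(2), Core.SSAValue(3)),
    Expr(:return, Core.SSAValue(4)),
]

g_span = kwcall_span(g_stmts)
f_span = kwcall_span(f_stmts)
```

The sketch separates these two cases, but it still has to guess where the NamedTuple construction begins, which is exactly the information a provenance log would supply directly.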

Even if there's nothing stored for use by Base.uncompressed_ast, Revise is starting from top-level expressions and lowering everything again; that gives it the opportunity to hang on to any extra returned arguments from a special call to lowering that returns the provenance. Revise can then insert what it learns into its own internal data stores.

ViralBShah added the kind:julep label on Mar 13, 2022