Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise does not apply functions in the two-argument form of include #634

Open
ranocha opened this issue Jun 18, 2021 · 27 comments
Open

Revise does not apply functions in the two-argument form of include #634

ranocha opened this issue Jun 18, 2021 · 27 comments
Labels
wishlist Feature requests

Comments

@ranocha
Copy link
Contributor

ranocha commented Jun 18, 2021

The two-argument form include(mapexpr::Function, path::AbstractString) allows transforming the code in the file at path. However, Revise does not seem to apply the same transformation mapexpr when re-loading code that is included using this form. For example

julia> using Pkg; Pkg.activate(temp=true); Pkg.add("Revise")
  Activating new environment at `/tmp/jl_D95G2q/Project.toml`
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General`
   Resolving package versions...
    Updating `/tmp/jl_D95G2q/Project.toml`
  [295af30f] + Revise v3.1.17
[...]

julia> "BrokenModule" |> mkdir |> cd

julia> "src" |> mkdir |> cd

julia> open("BrokenModule.jl", "w") do io
           println(io, """
               module BrokenModule

               include(expr -> quote @fastmath begin \$expr end end, "broken_a.jl")

               end
           """)
       end

julia> open("broken_a.jl", "w") do io
           println(io, """
               foo(a, b, c) = a * b + c
           """)
       end

julia> cd(dirname(pwd()))

julia> open("Project.toml", "w") do io
           println(io, """
               name = "BrokenModule"
               uuid = "79e551b4-d023-11eb-261f-b9feb8f119ff"
               authors = ["Someone"]
               version = "0.0.1"

               [compat]
               julia = "1.6"
           """)
       end

julia> run(`git init`)
Initialized empty Git repository in ~/.julia/dev/BrokenModule/.git/
Process(`git init`, ProcessExited(0))

julia> run(`git add .`)
Process(`git add .`, ProcessExited(0))

julia> run(`git commit -m 'initial commit'`)
[master (root-commit) 78b2121] initial commit
 3 files changed, 16 insertions(+)
 create mode 100644 Project.toml
 create mode 100644 src/BrokenModule.jl
 create mode 100644 src/broken_a.jl
Process(`git commit -m 'initial commit'`, ProcessExited(0))

julia> Pkg.add(url=pwd())
    Updating git-repo `~/.julia/dev/BrokenModule`
   Resolving package versions...
    Updating `/tmp/jl_D95G2q/Project.toml`
  [79e551b4] + BrokenModule v0.0.1 `~/.julia/dev/BrokenModule#master`
    Updating `/tmp/jl_D95G2q/Manifest.toml`
  [79e551b4] + BrokenModule v0.0.1 `~/.julia/dev/BrokenModule#master`

julia> Pkg.develop("BrokenModule")
Path `~/.julia/dev/BrokenModule` exists and looks like the correct package. Using existing path.
   Resolving package versions...
    Updating `/tmp/jl_D95G2q/Project.toml`
  [79e551b4] ~ BrokenModule v0.0.1 `~/.julia/dev/BrokenModule#master`  v0.0.1 `~/.julia/dev/BrokenModule`
    Updating `/tmp/jl_D95G2q/Manifest.toml`
  [79e551b4] ~ BrokenModule v0.0.1 `~/.julia/dev/BrokenModule#master`  v0.0.1 `~/.julia/dev/BrokenModule`

julia> using Revise, BrokenModule
[ Info: Precompiling BrokenModule [79e551b4-d023-11eb-261f-b9feb8f119ff]

julia> BrokenModule.foo(1.0, 2.0, 3.0) # This used FMA, as desired when using `@fastmath`
5.0

julia> @code_native debuginfo=:none syntax=:intel BrokenModule.foo(1.0, 2.0, 3.0)
        .text
        vfmadd213sd     xmm0, xmm1, xmm2        # xmm0 = (xmm1 * xmm0) + xmm2
        ret
        nop     word ptr cs:[rax + rax]

julia> open(joinpath(dirname(pathof(BrokenModule)), "broken_a.jl"), "w") do io
           println(io, """
               foo(a, b, c) = a + b * c
           """)
       end

julia> BrokenModule.foo(1.0, 2.0, 3.0) # This doesn't use FMA although we have `@fastmath`
7.0

julia> @code_native debuginfo=:none syntax=:intel BrokenModule.foo(1.0, 2.0, 3.0)
        .text
        vmulsd  xmm1, xmm1, xmm2
        vaddsd  xmm0, xmm1, xmm0
        ret
        nop     dword ptr [rax]

The expected output of the last line would be something like

julia> @fastmath foo(a, b, c) = a + b * c
foo (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel foo(1.0, 2.0, 3.0)
        .text
        vfmadd231sd     xmm0, xmm2, xmm1        # xmm0 = (xmm2 * xmm1) + xmm0
        ret
        nop     word ptr cs:[rax + rax]

Hence, this demonstrates that Revise did not apply the transformation expr -> quote @fastmath begin $expr end end passed to include.

@sloede
Copy link

sloede commented Jun 25, 2021

Is this an issue that seems easily fixable or is this a tougher nut to crack? It would be really nice to be able to use the 2-argument include with Revise :-)

@timholy timholy added the wishlist Feature requests label Jul 18, 2021
@timholy
Copy link
Owner

timholy commented Jul 18, 2021

It's not trivial, but not impossible either. Someone would need to modify Revise's internal data structures to keep track of a mapexpr for each source file. I see some links to Trixi above, is that the only package that uses this feature?

@ranocha
Copy link
Contributor Author

ranocha commented Jul 18, 2021

We wanted to use this feature of include but discovered the issue described above while experimenting with it. Our current workaround is to basically wrap each file in a macro to get the same results while Revise works fine. It seems to be more clean if we could use the two-argument form of include instead.

@MilesCranmer
Copy link

MilesCranmer commented Jun 1, 2024

Friendly ping on this; would you be able to add this @timholy? Seems like it's the 2nd-most requested feature (and of those top two, perhaps requires the least changes?)

It seems like this might be the reason that Revise.jl doesn't work with DispatchDoctor.jl in #820?

@MilesCranmer
Copy link

MilesCranmer commented Jun 1, 2024

I tried to get started on this by modifying

const watched_files = Dict{String,WatchList}()

to be Dict{String,Tuple{WatchList,Union{Function,Nothing}}}() and then
function track(mod::Module, file; mode=:sigs, kwargs...)
to take a required mapexpr::Union{Function,Nothing}=nothing argument. And likewise for includet

However I quickly got lost. @timholy I think you might need to help with this one... Sorry I couldn't be more useful! Would be really nice to have this for compatibility with DispatchDoctor.jl as apparently the precompile times become slower with it.

@timholy
Copy link
Owner

timholy commented Jun 12, 2024

I don't have time for general Julia development very often these days. This is a bigger project than I can tackle currently.

@MilesCranmer
Copy link

@timholy perhaps you could simply point out specifically what needs to be changed for this to work?

@timholy
Copy link
Owner

timholy commented Jun 12, 2024

It's not so obvious that I can confidently predict it. You want to grab the expression that gets produced by mapexpr and add that to Revise's internal cache. While complex, the internals are described pretty thoroughly: https://timholy.github.io/Revise.jl/stable/internals/ You have likely already read that, but if not, start there.

@disberd
Copy link

disberd commented Jun 12, 2024

@MilesCranmer i would be willing to help if possible in trying to achieve this, I have been trying to understand better revise internals for using it in a package of mine and I would love to have this functionality as well.

That being said, I believe an additional challenge is that the specific mapexpr to use is not easy to extract as it's not known by the include_callback called during loading in Julia and it's not tracked in the cache of the precompiled module, so it would have to be extracted after loading from the module itself.

In any case we could have a chat if you are interested in trying to collaborate on this (I'll also be at juliacon in Eindhoven to meet in person if you'll attend)

@timholy
Copy link
Owner

timholy commented Jun 12, 2024

Yeah, you'll need to store the mapexpr for each included file in some (new) cache in Revise. The handling of include directives is in src/lowered.jl, so that's the place to grab it from.

@MilesCranmer
Copy link

@disberd that’s great! I won’t be at juliacon unfortunately but good to hear you are interested in fixing this. (I don’t think I have much time to actually work on this though, aside from some tiny/quick contributions)

@timholy This sounds like the “proper” way to do it, but is there a way I can just hack it in? Let’s say I don’t care about having the smallest possible eval. Say I was happy to evaluate all the included code every single time any change is detected.

I ask because right now it throws an error. But even if it was extremely inefficient, I think it would be still useful to just have something working. A useful hack is still useful.

@timholy
Copy link
Owner

timholy commented Jun 12, 2024

I can't imagine a hack that would work that doesn't do it this way. Here's the issue:

module MyPackage
function mymapexpr(ex)
    ...
end

include(mymapexpr, "anotherfile.jl")
end

Now revise "anotherfile.jl". How do you know you're supposed to apply mymapexpr unless Revise is storing that info? If Revise just compares the source text, I'm pretty sure you'll generate incorrect expressions (the new code won't include the effects of mymapexpr).

@MilesCranmer
Copy link

Well one could always leave it as a regular include and then modify the file to be

eval(mymapexpr(quote
...
end))

But maybe there’s something in between this very manual approach and a full solution within Revise..?

@timholy
Copy link
Owner

timholy commented Jun 12, 2024

I'm not sure you need any changes to Revise for that. Worth testing.

@MilesCranmer
Copy link

I think it would work since it’s just a regular eval at that point. But the downside is it’s no longer automatic. So I guess I’m wondering if Revise could just read any 2-arg include as this. Sure it would lose the nice incremental evaluation (instead operating at a per-file level), but it’s a working solution in the interim.

But maybe this is exactly the same as the full solution… I’m not sure

@MilesCranmer
Copy link

Right now I have a macro that acts on an entire library, and does so by converting include(file) into include(mapexpr, file): https://github.com/MilesCranmer/DispatchDoctor.jl/blob/f0c0ec72a9e6e03c67e3e89f94dd2da58b8dd1df/src/stabilization.jl#L142-L147

I am just wondering if there's a simpler approach that does not require a full solution to this issue?

@MilesCranmer
Copy link

MilesCranmer commented Jun 13, 2024

I feel like this would require structural changes in Revise.jl because it seems like if you had the situation

include(foo, "file.jl")
include(bar, "file.jl")

Many assumptions made throughout Revise would break. For example watched_files being a Dict rather than an Array means you can't track a directory/file multiple times with different mapexpr...

Probably easier to find some hacky workaround for this at a per-instance level? Unless of course someone wants to fix it for real

@timholy
Copy link
Owner

timholy commented Jun 13, 2024

I won't commit hacks to Revise. It's hard enough to debug as it is. Better for it to be as correct as it can be, somewhat maintainable, and limited than "works for me but wow is this a mess."

And yes, Revise assumes that each file gets included just once. But if you're adding a mapexpr function that could become part of the key, and that would support the use-case above.

@MilesCranmer
Copy link

MilesCranmer commented Jun 13, 2024

Sorry that's not what I meant. And by "hacky" I am mostly just being self-deprecating :-) It's actually a decent solution and more robust than what happens currently –

What I meant was for Revise.jl to expose an interface for specifying mapexpr for specific files, similar to how it supports user-defined callbacks:

"""
Revise.user_callbacks_by_file
Global variable, maps files (identified by their absolute path) to the set of
callback keys registered for them.
"""
const user_callbacks_by_file = Dict{String, Set{Any}}()

For example you could have

const user_mapexpr_by_file = Dict{String, Function}()

And just require the user to add their mapexpr manually if they wish to use the two-argument form.

Then, within DispatchDoctor.jl, I can write a Revise.jl extension that will tag each "stabilized" file with the right mapexpr, rather than what I currently do which is modifying include(file) -> include(mapexpr, file) – which is incompatible with Revise.jl at the moment (Revise.jl converts it back to include(file), removing the type stability analysis upon first revision).

@timholy
Copy link
Owner

timholy commented Jun 13, 2024

Here's what I don't like about that: there is a solution that keeps Revise "invisible" and makes it Just Work. Revise has gotten to where it is by being willing to do the hard work to support as much of Julia as it can as correctly as possible. Heck, I wrote an entire lowered-code interpreter (with a lot of help from some key people) just so I could reason about what is happening for the ~10% of functions that are defined via an @eval. There's also a whole selective-code evaluation module in LoweredCodeUtils that analyzes call graphs to figure out the minimum dependencies for a particular statement, to minimize side effects when you re-evaluate code. All that added up to months of work but it is what made Revise able to do what it can today.

If we break that tradition now, then there will be a release of Revise that requires package authors to distort their code to handle limitations of Revise---Revise will no longer be invisible. Then sometime later when someone puts in the real work, we say "now you can put your package code back the way you wanted it in the first place. And oh by the way, until you do that, your users can't use Revise 4.x, it's broken for those packages that used the hack on Revise 3.x." I'm not certain the latter will happen (it's probably less than a 50% chance), but it seems that it could be a risk.

That's why I don't like taking shortcuts. Until now, I've added new capabilities to Revise when I was prepared to support them.

@MilesCranmer
Copy link

@timholy I understand and agree with your intentions to maintain the integrity of Revise. However, I want to re-emphasize that the current behavior of Revise.jl with the two-argument include is incorrect and silently results in bugs, as it removes the mapexpr upon revising.

Therefore, even a temporary, less elegant solution – like enabling authors to add their own workaround – is preferable to the current state, which silently changes the behavior of revised Julia code, without any workaround possible.

An alternative, correct solution that aligns with your ethos could be to throw an error on any two-argument include until a proper fix is implemented.

Thanks for considering this.

@timholy
Copy link
Owner

timholy commented Jun 13, 2024

We could add it to the limitations section of the documentation. Yes, it's wrong now. 2-arg include is a relatively late-breaking change and so niche it has never been a major priority before.

An alternative, correct solution that aligns with your ethos could be to throw an error on any two-argument include until a proper fix is implemented.

It's a good idea (I would make it a warning and not an error), but how would this work? Other than turning on file-watching, Revise only gets engaged when you save changes to a tracked file. Suppose you start editing the file that got 2-arg included; since the 2-arg include happens in a different file, how do you know?

I suspect the only solution is to parse the entire package. That is surely doable, but it makes Revise "heavier" in both speed and memory consumption. You have to defensively parse every file of every package that has been revised. (That's true even of a "proper" fix to this issue, I fear.)

I like MilesCranmer/DispatchDoctor.jl#40 much better 😛. Or of course, me or someone sitting down for a week and implementing 2-arg include correctly. I'm not saying I won't merge a proper fix to this issue (to the contrary, subject to the caveat about "heaviness" above), but I'm putting my foot down when it comes to a dirty hack. I don't intend to change that decision.

Some other alternatives are:

  • to discourage use of 2-arg include
  • to figure out whether there is some way we could annotate the cache file so Revise knows which files have been 2-arg included (and with which functions) without having to parse the whole thing.

@MilesCranmer
Copy link

MilesCranmer commented Jun 14, 2024

I think a warning should be prioritized for now if possible. For example, when using DispatchDoctor with Revise, I often forget that Revise removes the generated @stable from my included files (which is added with the mapexpr), leading me to mistakenly assume I fixed all my type instabilities.

While my use case is fairly innocuous as its just a developer tool, others might use the two-argument include for larger codebases with more complex mapexpr. This could produce unexpected bugs when mixed with Revise. Since the two-argument include is fully documented in Julia, a warning would at least make this clear. (People often miss such details in documentation, so a warning would be very helpful :-))

@timholy
Copy link
Owner

timholy commented Jun 14, 2024

The main issue is, how does Revise issue this warning without defensively parsing the whole package? In other words, to catch something that is rarely used we have to make Revise clunkier. How do we decide what to do about that tradeoff?

@MilesCranmer
Copy link

MilesCranmer commented Jun 14, 2024

I should say I think one should be cautious about relying on usage frequency in open-source code for such decisions – I had an instance where a 100k-line closed-source Julia codebase at a company used my package and sent me some unexpected bug reports (for a different thing).

I think basically that anything exported in the Julia standard library is fair game and should either be handled correctly or result in an error/warning. This is because we can’t measure what features are used in internal code, especially in industry settings where coding practices can differ significantly from open-source conventions. Niche Julia features, as long as they are documented and official, might be used extensively within internal codebases.

So regarding the question of catching this, I'm not sure. I feel like the prudent thing to do would be to implement the slower/clunkier solution now, just to preserve correctness (especially since the type of changes performed with a 2-arg include may be very subtle and not immediately noticed). And then once JuliaLang/julia#54794 is fixed, it can go back to the more efficient approach. (And anybody complaining about slowness would be motivated to help us fix this 😉). But there's certainly a tradeoff here.

@timholy
Copy link
Owner

timholy commented Jun 14, 2024

Do keep in mind Revise is a developer-efficiency tool, not a compiler. What you're describing is a bit more appropriate for a compiler, where any failure to handle any language construct at all is quite catastrophic. That's why we have PkgEval for Julia development.

In VSCode, would you prefer it if F12 resulted in a stacktrace if it couldn't find the source lines for the given call? Or if GitHub Copilot errored noisily if it wasn't sure about your intent? Even DispatchDoctor seems to be something you intend to be transient and quite selective, as you're presumably well aware that if you enforced it globally virtually nothing would run.

But yes, it's probably OK to implement the slower thing here. Presumably the number of packages with edits in any given session is modest.

@MilesCranmer
Copy link

Do keep in mind Revise is a developer-efficiency tool, not a compiler. What you're describing is a bit more appropriate for a compiler, where any failure to handle any language construct at all is quite catastrophic. That's why we have PkgEval for Julia development.

Fair enough! I think perhaps Revise.jl might just be a victim of its own success a bit here :) in that it is so reliable and smooth, that people just have it on all the time (at least I do!) and so the expectations are artificially inflated a bit relative to other dev tools. But I totally get your point though, and I would agree we shouldn't hold it to the standard of a compiler.

Presumably the number of packages with edits in any given session is modest.

Good point, yes this sounds reasonable to me too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wishlist Feature requests
Projects
None yet
Development

No branches or pull requests

5 participants