Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

references to interpolated values persist in modules #9375

Closed
mlubin opened this issue Dec 16, 2014 · 16 comments
Closed

references to interpolated values persist in modules #9375

mlubin opened this issue Dec 16, 2014 · 16 comments
Labels
won't change Indicates that work won't continue on an issue or pull request

Comments

@mlubin
Copy link
Member

mlubin commented Dec 16, 2014

The following code:

type MyType
    function MyType()
        t = new()
        finalizer(t,t->println("finalized"))
        t
    end
end


function go()
    t = MyType()
    return
end

for i in 1:10
    go()
    println("returned from go()")
    gc()
end
println("finished")

prints

returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
returned from go()
finished
finalized
finalized
finalized
finalized
finalized
finalized
finalized
finalized
finalized
finalized

(It sometimes also segfaults in uv_write2, hoping that that's already captured in some issue...)

Why doesn't the GC finalize the object when gc() is called? If I use a large object like an array instead, I see the expected behavior. I'm guessing that the object seems too small to care about, but this behavior is particularly bad if the object really represents an opaque C struct whose memory the Julia GC isn't aware of.

Ref jump-dev/JuMP.jl#330

@simonster
Copy link
Member

I think this is because MyType has no fields. Julia represents it with a single instance (i.e. A === B for all A::MyType and B::MyType) and the finalizers on that instance are not called until Julia exits. If I add a field to MyType it looks like the finalizer gets called on gc(). The segfault in uv_write2 is probably #8050.

@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

Hmm, this was meant to be a reduced test case. I have a more complex example where this occurs with a type that has fields.

@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

Here's a larger test case involving Ipopt and JuMP:

Run this gist with the following patch to Ipopt.jl

diff --git a/src/Ipopt.jl b/src/Ipopt.jl
index d572289..b195ca3 100644
--- a/src/Ipopt.jl
+++ b/src/Ipopt.jl
@@ -195,6 +195,7 @@ end
   # TODO: Not even expose this? Seems dangerous, should just destruct
   # the IpoptProblem object via GC
   function freeProblem(prob::IpoptProblem)
+    exit(1)
     ccall((:FreeIpoptProblem, libipopt), Void, (Ptr{Void},), prob.ref)
   end

What's a good way to debug this?

@simonster
Copy link
Member

Jeff's tip here is the only way I know. I tried this and it looks like there are some functions that maintain references to the model that are dynamically created in ReverseDiffSparse. A function created here maintains a reference to a JuMPArray that eventually references a JuMP.Model that references the IpoptProblem. Because eval evaluates at top level and the functions have unique names, they are never garbage collected and neither are the objects they reference. Using the pattern from broadcast.jl fixes that leak, but it looks like references to the JuMPArray still get added to ReverseDiffSparse's constant_table and I don't know what to do about that.

@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

This is very helpful. I can reorganize ReverseDiffSparse to not embed references to values in function definitions. I don't quite understand the constant_table issue, do values embedded in functions get stored in a global module state? Maybe they should be made weak references if the reference from the functions themselves already count for gc purposes?

@simonster
Copy link
Member

The constant table issue may be related to #6597. According to the source, the constant_table is where the "pointers to non-AST-ish objects in a compressed tree" are stored, but it's a property of the module, not the function. It looks like when Julia compresses the AST of a function, it references objects that were interpolated into the AST from the module constant_table rather than keeping them with the function. They can't be weak references, because then the objects could be collected even if the function is still around. This might make GC a bit faster since it only has to traverse the list of constants for each module and not each function, but the variables can only be collected when the module is collected, even after the function falls out of scope. I don't really know this code, so hopefully @JeffBezanson or @vtjnash will correct me if I'm wrong.

mlubin added a commit to mlubin/ReverseDiffSparse.jl that referenced this issue Dec 16, 2014
@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

I can work around this behavior if it's decided that this is how the GC needs to work, though it is pretty subtle and surprising.

@mlubin mlubin changed the title forced garbage collection of "small" objects fails references to interpolated values persist in modules Dec 16, 2014
@vchuravy
Copy link
Member

@simonster How do I mark a module for collection? This sounds related to a problem that I am having.

@simonster
Copy link
Member

@vchuravy Once there are no longer any references to a module, it should be collected. That part seems to work for me. Defining a module as module X end will create a new constant in the enclosing module, but if you create a new module with the same name, it will overwrite the old one. You can also create a module using m = Module(:modname) and then use eval(m, expr) to put code in it and it should be collected when there are no longer any references to m or its functions.

@vtjnash
Copy link
Member

vtjnash commented Dec 16, 2014

It's nearly impossible to remove all references to a module, since you would need to eliminate any methods it added to other modules and all instances of types defined in that module.

Why does your AST contain non-AST things? An closure does not. You would need to eval a function definition AST with an interpolated value in it to get that.

@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

Why does your AST contain non-AST things? An closure does not. You would need to eval a function definition AST with an interpolated value in it to get that.

That's exactly what was done in ReverseDiffSparse, but I can work around it by using closures instead.

@mlubin
Copy link
Member Author

mlubin commented Dec 16, 2014

There's actually a pretty significant performance issue here. I've been interpolating functions into ASTs so that type inference works correctly. Moving over to a closure gives me a 40% slowdown in one of my benchmarks. Do I have to eval the generated functions into a new module that I can dispose of later?

@vtjnash
Copy link
Member

vtjnash commented Dec 17, 2014

Can you point to the case where this is needed for inference? Can you just add a typeassert, or improve the inference stage to handle your specific case?

@mlubin
Copy link
Member Author

mlubin commented Dec 17, 2014

This is buried pretty deep in ReverseDiffSparse. I think the issue is more that using the closure form leads to extra allocations because there are tuples and splatting and runtime dispatch involved. If the function is interpolated into the AST, I'm assuming that Julia is doing something smart and determining which method to call at compile time. If the function is passed in, the dispatch needs to happen at runtime, which is slow in itself and also seemingly triggers more work for the GC because the function is being called as f(x...) where the type of x is known statically. Does this make sense?

@mlubin
Copy link
Member Author

mlubin commented Feb 11, 2015

I don't mind if this is closed as wontfix. I understand now that values spliced into functions become module variables which won't be gc'd with the function itself.

@vtjnash vtjnash added the won't change Indicates that work won't continue on an issue or pull request label Mar 25, 2016
@vtjnash
Copy link
Member

vtjnash commented Mar 25, 2016

improvements have been made here (especially with the anonymous function rewrite)

@vtjnash vtjnash closed this as completed Mar 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
won't change Indicates that work won't continue on an issue or pull request
Projects
None yet
Development

No branches or pull requests

4 participants