
Generated code memory leaks #14495

Open
tmptrash opened this issue Dec 28, 2015 · 25 comments
Labels
compiler:codegen Generation of LLVM IR and native code

Comments

@tmptrash

Hello guys!

I found a memory leak related to generated code. I wrote a post about it on the julia-lang users group, but it looks like no one has had time for it. I emailed @JeffBezanson about this issue and he knows about it. I can't fix this myself, because I don't have the necessary C experience, so I decided to create this issue to track it somehow.

So, here is the problem:

function leak()
    for i = 1:100000
        # eval a fresh anonymous function on every iteration (Julia 0.4 API)
        t = Task(eval(:(function() produce() end)))
        consume(t)
        try
            Base.throwto(t, null)   # `null` is undefined; the error is swallowed
        end
    end
    gc()
end

Every call of leak() eats ~30 MB of memory on my PC. The problem exists on both Linux and Windows.

julia> versioninfo()
Julia Version 0.4.2
Commit bb73f34 (2015-12-06 21:47 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4700HQ CPU @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (NO_LAPACK NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: liblapack.so.3
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
@ViralBShah
Member

Thanks for filing this so that it can be tracked.

@ViralBShah ViralBShah added the kind:bug Indicates an unexpected problem or unintended behavior label Dec 28, 2015
@ViralBShah ViralBShah added this to the 0.5.0 milestone Dec 28, 2015
@yuyichao
Contributor

Is this just because we don't free JIT code?

@vtjnash
Member

vtjnash commented Dec 28, 2015

We don't free the results of eval because the memory cost is lower than the computational cost (in the Julia cost model where functions -- even closures -- are statically defined). Clearly, the eval is trivially unnecessary here, so this won't be fixed unless someone finds a real use case.
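
For comparison, a minimal eval-free version of the reproducer (a sketch, using the same Julia 0.4 produce/consume API; InterruptException stands in for the original's undefined null): a plain closure is a single statically-defined function, so every iteration reuses the same compiled code and memory stays flat.

function noleak()
    for i = 1:100000
        t = Task(() -> produce())   # one closure type, compiled once
        consume(t)
        try
            Base.throwto(t, InterruptException())
        end
    end
    gc()
end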

@vtjnash vtjnash closed this as completed Dec 28, 2015
@tmptrash
Author

  • I have a use case in my app; the example above is a simplified version of it. In short, I have to generate a lot of code from a string that is modified all the time, so in my case eval(parse("...")) is called constantly. It's something like a self-modifying application, used in evolutionary biology research.
  • I also have a case where I modify an AST and then call eval() on it many times. (I was glad to find this feature, AST modification, in Julia, by the way. :)) The memory leak appears there as well: after 20 minutes of running, my app eats 6 GB of memory. :)

Is it possible to solve this?

@JeffBezanson
Member

Reopened. I think this is a real issue that I'd like to fix eventually.

@tmptrash
Author

This is great news :) For us this is a real showstopper. Thanks Jeff.

@afbarnard

Is it generated code that is causing the maxrss to monotonically increase with every set of tests in the entire suite? If so, this issue affects my ability to run all the tests on my laptop, which has 4 GB. Right now I can only run the tests single-threaded (make testall1), because running them multi-threaded (make testall) runs out of memory twice (two test workers are terminated). Besides, generated code taking up so much memory doesn't match my intuition that tests run independently. Perhaps I should open an issue? (Note I have been using the release-0.4 branch, not master; see #13719 for backstory.)

@JeffBezanson
Member

Yes, that's probably a significant part of the problem. Also ref #14626

@JeffBezanson
Member

We could also perhaps restart workers more frequently, e.g. every few test files, to use less persistent memory.

@tkelman
Contributor

tkelman commented Jan 19, 2016

There's an environment variable you can set to do just that. I'd have to look at runtests to check exactly how it's spelled.

@yuyichao
Contributor

We already have JULIA_TEST_MAXRSS_MB
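
For reference, a sketch of how this knob can be used (the 3800 MB cap is an arbitrary example for a 4 GB machine); the test harness replaces a worker once its maxrss exceeds the limit:

# set before launching the test suite; spawned test processes inherit it
ENV["JULIA_TEST_MAXRSS_MB"] = "3800"
Base.runtests(["all"])   # or from a shell: JULIA_TEST_MAXRSS_MB=3800 make testall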

@robertfeldt
Contributor

There are also use cases from genetic programming and other code generation/synthesis situations, where one wants to compile and run a large number of programs and then select some subset of them.

See discussion here:
https://discourse.julialang.org/t/is-mem-of-compiled-evaled-functions-garbage-collected/2231

Also see the (closed) issue here:
#20755 (comment)

@Nosferican
Contributor

Status of this?

@freddycct

My issue #37560 was closed, so I am posting my MWE here. I used Flux/Zygote with pmap.

using Distributed
addprocs(4)

@everywhere mutable struct A
    a::Float32
end

@everywhere function genprog(n, p::A)
    map(1:n) do i
        y = rand()
        mdname = gensym()
        # build and eval a brand-new module on every iteration; its generated
        # code is never freed
        expr = :(module $mdname
            f(x) = 2*x + $y + $p.a
            end
        )
        m = eval(expr)
        Base.invokelatest(m.f, p.a)
    end
end

function main()
    i = 0
    x = A(rand())
    while true
        println("epoch $(i)")
        @everywhere GC.gc()
        
        tasks = rand(1:100, 100)
        _, timeTaken, _, _, _ = @timed let x=x
            pmap(tasks) do n
                genprog(n, x)
            end
        end
        @show timeTaken
        x.a = rand()
        i += 1
    end
end

main()
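
As a hedged aside: when the generated body has a fixed shape like the f above, a closure-returning variant (genprog_noeval is a hypothetical name) avoids the per-iteration eval entirely. Every closure produced by the same source expression shares one compiled method, so repeated calls do not accumulate generated code:

@everywhere function genprog_noeval(n, p::A)
    map(1:n) do i
        y = rand()
        f = x -> 2*x + y + p.a   # same body as the generated f, no eval
        f(p.a)
    end
end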

@aeisman

aeisman commented Sep 17, 2020

In another use case, I have been having the same problem with a combination of Distributed and RCall.jl. Repeated uses of RCall appear to cause a similar memory leak; in my case, up to 1 TB of combined RAM and VRAM usage.

@schlichtanders

schlichtanders commented Jun 30, 2023

Just closed my fresh issue as a duplicate of this one. It is about repeatedly creating closures.

  • I ran into the problem while using Pluto: in certain cases Pluto falls back to plain eval of the code, and when that happens repeatedly, the memory stacks up.

Here is my minimal reproducible example:

for _ in 1:50
    @eval function myfunc() end   # redefine the same function repeatedly
    GC.gc(true); GC.gc(false)     # full, then incremental, collection
    Core.println(Base.gc_live_bytes() / 2^20)   # live MiB; grows every pass
end

@vchuravy
Member

GC for code is going to be rather challenging. Besides the mechanics of being able to free the memory, one must be able to prove that the code has become unreachable.

While Julia's world-age mechanism might be reusable for that, we also have an intrinsic, invokeinworld, that potentially makes everything reachable.
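
For readers unfamiliar with the mechanism, a minimal illustration of world age (a sketch, independent of this issue): a running function keeps dispatching in the world it started in, which is the property that could, in principle, bound a method's reachability.

f() = "old"
function caller()
    @eval f() = "new"   # redefines f in a newer world
    return f()          # still dispatches in the world caller() started in
end
caller()   # "old": the redefinition is invisible inside caller
f()        # "new": top level runs in the latest world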

@schlichtanders

schlichtanders commented Jun 30, 2023

Okay, I see; that is why even a function with the same name cannot "overwrite" itself, because of different world ages...
at least not in an easy, automated way.

Is there a manual way to completely clean up such functions? (Let's assume we know their name.)

@vchuravy
Member

Not currently; it would require re-engineering parts of the JIT compiler to separate functions into individual JITDylibs (either per world, or per function compilation) and then expose a mechanism to evict specific JITDylibs.

The only current way is to restart your Julia session ;)

@vinhpb

vinhpb commented Mar 30, 2024

Hi,
I have an evolutionary algorithm that searches for solutions in the form of functions, in which I use the package RuntimeGeneratedFunctions.jl to generate the functions. As my code runs, memory use increases gradually over time until my system crashes. Does that also sound like a memory leak?
I thought the package was built in a way that allows the GC to collect functions once they go out of scope (as mentioned in SciML/RuntimeGeneratedFunctions.jl#7), so I am confused about whether a memory leak is actually happening. I would really appreciate it if someone could explain this to me, since I am completely new to this area of code generation. :)
This is what my code looks like:

using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)   # required once per module

function main(...)
    ...
    while iterate > 0
        Threads.@threads for i in 1:n_threads
            expr = ...   # call the function that generates an expr
            fitness = eval_solution(expr, data, eval_genfunc)
            ...
        end
        ...
        iterate -= 1
    end
    ...
end

function eval_solution(expr, data, eval_genfunc)
    f1 = @RuntimeGeneratedFunction(expr)
    fitness = eval_genfunc(f1, data)
    return fitness
end

function eval_genfunc(f1, data)
    parameters = f1(data)
    score = g(parameters)   # g runs a simulation with the given parameters
                            # and extracts some information as the score
    return score
end

@chriselrod
Contributor

I don't believe JITed functions can be freed (aside from exiting the process).
One trick I've used to deal with memory leaks in the past is to use Distributed and do the work in another process. You can periodically call rmprocs on the worker and replace it with a new addprocs. Cumbersome, but better than crashing your system.
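
A sketch of that trick, assuming the leaky workload is wrapped in a function build_and_run (a hypothetical name) that each fresh worker can load, e.g. from a package:

using Distributed

function run_recycled(tasks; batch = 100)
    w = addprocs(1)[1]
    results = Vector{Any}(undef, length(tasks))
    for (i, t) in enumerate(tasks)
        results[i] = remotecall_fetch(Main.build_and_run, w, t)
        if i % batch == 0   # retire the worker periodically so the OS
            rmprocs(w)      # reclaims all of its JIT-compiled code
            w = addprocs(1)[1]
        end
    end
    rmprocs(w)
    return results
end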

@robertfeldt
Contributor

Yes, I had similar goals, but concluded back then (a few years ago) that generated functions are not GC'ed, so they can't easily be used for genetic-programming-style algorithms.

@vchuravy
Member

It should be possible to eventually GC native code, but doing so is hard, and the use-case is limited.

For genetic programming it might be better to use https://github.com/JuliaDebug/JuliaInterpreter.jl
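
A minimal demonstration of that route (a sketch using the package's documented @interpret entry point): the call runs through the interpreter, so it produces no native code, though methods defined via eval still occupy the method table.

using JuliaInterpreter

square(x) = x * x
@interpret square(21)   # == 441, evaluated by the interpreter, nothing JIT'd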

@chriselrod
Contributor

For genetic programming it might be better to use https://github.com/JuliaDebug/JuliaInterpreter.jl

It might also be easy to write a custom "interpreter" if the functions have a limited enough set of behaviors: e.g., a vector for storing temporaries, and a while loop with if/else branching on an enum of the (limited) set of functions you may call, plus indices saying which temporaries to use as arguments and where to store the result; see the sketch below.
How much further you optimize it depends on just how limited the set of behaviors your functions may have.
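
A hedged sketch of the simplest form of this idea; rather than the enum-driven loop described above, it walks a restricted Expr tree directly, so candidate programs are plain data and nothing is ever eval'd or JIT-compiled:

# whitelist of callable operations; anything else is rejected
const OPS = Dict(:+ => +, :- => -, :* => *, :/ => /, :sin => sin, :cos => cos)

evaltree(e::Number, x) = e
evaltree(e::Symbol, x) = e === :x ? x : error("unknown symbol $e")
function evaltree(e::Expr, x)
    e.head === :call || error("unsupported expression head $(e.head)")
    f = OPS[e.args[1]]
    f(map(a -> evaltree(a, x), e.args[2:end])...)
end

expr = :(2 * x + sin(x))   # a "generated" candidate program, as data
evaltree(expr, 0.5)        # evaluate at x = 0.5 without eval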

@vinhpb

vinhpb commented Mar 30, 2024

Thanks for all the tips, guys! I really appreciate it. I will consider which one is most suitable for my application and try it out.
@vchuravy: Just out of curiosity, can you tell me a bit about what it would take to GC native code?
