
Compile time and memory usage grow quickly with function size #19158

@toivoh

Description


This is a follow-up to #16434.
Once again, my test code creates functions like

function f(x0)
    x1 = x0 + 1
    x2 = x1 + 1
    x3 = x2 + 1
    x4 = x3 + 1
    x5 = x4 + 1
    return x5
end

and records the compilation time as a function of the number of computations n.
What I'm actually interested in is compiler performance for huge functions that contain heterogeneous code that has been generated somehow, but I hope that this serves as a more minimal example.

Unlike #16434, I called the generated functions with a Float64 argument, which seems to take much longer than with an Int argument. I did the tests on a recent master (from a few days ago).

Results:

      n        t           t/n          t/n^2        a            a/n         a/n^2
   10.0     0.00271183   0.000271183  2.71183e-5   100557.0     10055.7      1005.57
  100.0     0.00923331   9.23331e-5   9.23331e-7   931393.0      9313.93       93.1393
 1000.0     0.123555     0.000123555  1.23555e-7   1.71152e7    17115.2        17.1152
 2000.0     0.282674     0.000141337  7.06686e-8   5.05424e7    25271.2        12.6356
 5000.0     0.926126     0.000185225  3.7045e-8    2.46851e8    49370.2         9.87404
10000.0     2.81464      0.000281464  2.81464e-8   8.93756e8    89375.6         8.93756
20000.0    10.1834       0.000509169  2.54585e-8   3.38861e9    1.69431e5       8.47154
30000.0    27.0231       0.00090077   3.00257e-8   7.48242e9    2.49414e5       8.3138

where n is the number of computations in the generated function, t is the time taken to eval the generated AST and then call f(0.0) for the first time, and a is the number of bytes allocated during these two steps. I got similar (though slightly smaller) figures with julia -O0.

As can be seen, the compilation time quickly grows to be quadratic in n, as does the number of bytes allocated. I don't know exactly how the peak memory usage grows, but at n = 40000 my julia process was killed by the system after attempting to use more than 75% of the machine's 16 GB of RAM.
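For what it's worth, the log-log slope between the two largest n values backs up the quadratic trend. This is just a quick check against the numbers in the table above, not part of the original measurements:

```julia
# Rough scaling check using two data points from the table above.
n1, t1 = 10_000.0, 2.81464
n2, t2 = 30_000.0, 27.0231

# Exponent p in t ~ n^p, estimated from the log-log slope:
slope = log(t2 / t1) / log(n2 / n1)  # ≈ 2.06, i.e. roughly quadratic
```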

Test code:

# Build an expression defining a function f with n chained additions:
# x1 = x0 + 1; x2 = x1 + 1; ...; return xn
function code_big_function(n::Int)
    code = [:( $(Symbol("x$k")) = $(Symbol("x$(k-1)")) + 1 ) for k = 1:n]
    quote
        let
            function f(x0)
                $(code...)
                return $(Symbol("x$n"))
            end
        end
    end
end

@show code_big_function(5)

ns = [10,100, 1000, 2000, 5000, 10000, 20000, 30000]
ts = Float64[]
alloced = Int[]

for n in ns
    println("-"^79)
    @time code = code_big_function(n)
    # Time the combined cost of eval'ing the AST and the first call,
    # which triggers compilation of the Float64 method.
    data = @timed begin
        @time f = eval(code)
        @time f(0.0)
    end
    push!(ts, data[2])       # elapsed time in seconds
    push!(alloced, data[3])  # bytes allocated
end

display(vcat(["n", "t", "t/n", "t/n^2", "a", "a/n", "a/n^2"]',
             hcat(ns, ts, ts./ns, ts./ns.^2, alloced, alloced./ns, alloced./ns.^2)))
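As an aside, on newer Julia versions (1.5 and later, I believe) @timed returns a NamedTuple, so the indexed accesses data[2]/data[3] above can be written by field name. A minimal sketch, assuming such a Julia version:

```julia
# @timed on Julia ≥ 1.5 returns a NamedTuple with named fields.
data = @timed sum(rand(1000))
t_elapsed = data.time   # elapsed seconds   (data[2] in the tuple form above)
bytes     = data.bytes  # bytes allocated   (data[3] in the tuple form above)
```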

Do we know what the bottleneck is? Is there anything we can do about it?
