Memory leak in solving ODE many times #1946

Closed
MasonProtter opened this issue May 24, 2023 · 17 comments

@MasonProtter

The following code has monotonically increasing memory usage for me and keeps growing until the system runs out of RAM and crashes:

julia> using SparseArrays, LinearAlgebra, OrdinaryDiffEq

julia> function memory_leak(;N=100^2)
           for i ∈ 1:100
               H = sprandn(N, N, 0.00125/2)
               H = H + H'
               ψ0 = normalize!(randn(ComplexF64, N))
               tspan=(0, 10)
               ∂ₜ(ψ, (; H,), t) = -im * (H * ψ)
               prob = ODEProblem(∂ₜ, ψ0, tspan, (;H))
               ψ = solve(prob, Tsit5(), reltol=1e-8,abstol=1e-8)
           end
       end

julia> memory_leak()

This seems to happen whether or not H is passed as a parameter or is a captured variable in ∂ₜ.


Here's my system:

julia> versioninfo()
Julia Version 1.9.0
Commit 8e63055292* (2023-05-07 11:25 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: 12 × AMD Ryzen 5 5600X 6-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver3)
  Threads: 6 on 12 virtual cores
Environment:
  JULIA_NUM_THREADS = 6

(@v1.9) pkg> st OrdinaryDiffEq
Status `~/.julia/environments/v1.9/Project.toml`
  [1dea7af3] OrdinaryDiffEq v6.51.2

@ChrisRackauckas
Member

Does it go away if you add an explicit GC call in the loop?

@MasonProtter
Author

Yes, it seems to go away with an explicit call in the loop. If I interrupt the loop though and then do a GC.gc() outside of the loop, the previously allocated memory does not go away.
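
As a rough illustration of the workaround discussed here, the sketch below forces a collection every few iterations of a loop. `run_with_gc` and its `gc_every` keyword are hypothetical names, not part of OrdinaryDiffEq, and the workload is a stand-in allocation rather than the ODE solve from the report.

```julia
# Hypothetical helper (not part of OrdinaryDiffEq): call `work(i)` for i in 1:n
# and force a full garbage collection every `gc_every` iterations.
# Returns how many collections were requested.
function run_with_gc(work, n::Integer; gc_every::Integer = 1)
    collections = 0
    for i in 1:n
        work(i)
        if i % gc_every == 0
            GC.gc()  # full collection; expensive, so call it sparingly
            collections += 1
        end
    end
    return collections
end

# Stand-in workload: allocate a large temporary array on every iteration.
# In the report above, this would be the sprandn + ODEProblem + solve body.
n_collections = run_with_gc(i -> sum(rand(10^6)), 5; gc_every = 2)
println("forced collections: ", n_collections)
```

Collecting on every iteration is the most aggressive setting; a larger `gc_every` trades peak memory for speed.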

@ChrisRackauckas
Member

It sounds like GC behavior then, which should be reported upstream? There's nothing really special that we're doing here, no manual mallocs and allocs or anything calling C, so there's nothing we'd do in this codebase to change this behavior. It would be good for some base dev to look at though.

@vbertret

Do you have any news on this problem? I'm actually facing exactly the same issue.

@oscardssmith
Contributor

This doesn't reproduce for me. What Julia version and OS are you on? Also, what's the output of ]st?

@vbertret

It's not exactly the same problem (maybe this isn't the right place to post it), but I see the same behaviour mentioned above: "Yes, it seems to go away with an explicit call in the loop. If I interrupt the loop though and then do a GC.gc() outside of the loop, the previously allocated memory does not go away." The code I'm using is the following:

using Distributed

addprocs(20);

@everywhere using DifferentialEquations, StaticArrays

# Define ODE problem
@everywhere function simplified_asm1!(dX, X, p::Array, t)

    # Compute stoichiometric_matrix
    Y_A = p[14] ; Y_H = p[15] ; i_XB = p[16]
    R = @SMatrix[ -1/Y_H          -1/Y_H                 0                  1      0     0;
                  -(1-Y_H)/Y_H    0                      -4.57/Y_A+1        0      0     0;
                  0               -(1-Y_H)/(2.86*Y_H)    1.0/Y_A            0      0     0;
                  -i_XB           -i_XB                  -(i_XB+(1.0/Y_A))  0      1     0;
                  0               0                      0                  0     -1     1]

    # Compute process rates
    K_OH = p[2]
    saturation_oxy_1 = (X[2]/(K_OH+X[2]))
    saturation_dco = p[9]*(X[1]/(p[1]+X[1]))
    saturation_no = (X[3]/(p[3]+X[3]))
    saturation_oxy2_no = saturation_no*K_OH/(K_OH+X[2])
    process_rates = @SArray [saturation_dco*saturation_oxy_1,
                             p[4]*saturation_dco*saturation_oxy2_no, 
                             p[11]*(X[4]/(p[7]+X[4]))*(X[2]/(p[8]+X[2])), 
                             p[10], 
                             p[12]*X[5], 
                             p[13]*((X[1])/(p[6]+X[1]))*(saturation_oxy_1+p[5]*saturation_oxy2_no)]
    
    # Compute differential equations
    dX[1:5] = (p[20]/p[17]) * (p[21:25] - X[1:5]) + R * process_rates
    dX[6] = 0.0
    dX[2] += X[6] * p[19] * (p[18] - X[2])

end

@everywhere greybox_problem = ODEProblem(simplified_asm1!, zeros(6),  (0, 1), repeat([0.0, 25]))

# Define initial parameters, x, exogenous, u, t
@everywhere begin
    params = [483.1182795698925, 0.2, 0.5, 0.8, 0.8, 228.53031818181822, 1.0, 0.4, 8952.0, 625.37, 83.5, 111.9, 473.0318181818182, 0.24, 0.67, 0.08, 1333.0, 8.000000000154731, 200.0]
    dt_model = 5
    t = 20.0
    exogenous = [15534.0,  66.0936,  0.0093,  3.935,  6.8924, 0.958]
    x = repeat([48.234495894424384, 2.0924266603066993e-5, 1.4307895535558337, 5.466475672448638, 0.5024614026334902], 1, 100)
    u=[1]
end

@everywhere function M_t(x, exogenous, u, params, t)

    # To overcome stabilities issues
    params = max.(params, 0.01)

    ode_params = vcat(params[1:19], exogenous[1:6])

    problem_ite = remake(greybox_problem, u0=zeros(6), tspan=(t, t + dt_model / 1440), p=ode_params)

    n_particules = size(x, 2)
    states = vcat(x, repeat(u, n_particules)')

    function prob_func(prob, i, repeat)
        remake(prob, u0=states[:, i])
    end
    monte_prob = EnsembleProblem(problem_ite, prob_func=prob_func)

    sim_results = solve(monte_prob, AutoTsit5(Rosenbrock23()), trajectories=n_particules, saveat=[t + dt_model / 1440], maxiters=10e5, reltol=10e-8, abstol=10e-8)
    
    return hcat([max.(sim_results[i].u[1][1:5], 0.0) for i in 1:n_particules]...)

end

@everywhere function memory_leak(i)
    for i ∈ 1:10^5
        z = M_t(x, exogenous, u, params, t)
    end
end

pmap(memory_leak, 1:50)

But when I run the code, memory allocation increases slowly, as you can see in the picture below. Since I call this function many times in different processes, it ends up taking a lot of RAM.

[Screenshot from 2024-07-16 23-02-58]

I'm on Ubuntu with Julia 1.10.0, and the output of ]st is:

  [ddbc3d08] ASPSimulator v0.1.0 `../ASPSimulator.jl`
⌃ [c7e460c6] ArgParse v1.1.4
⌃ [6e4b80f9] BenchmarkTools v1.4.0
⌃ [8be319e6] Chain v0.5.0
  [75880514] DataFrameMacros v0.4.1
  [a93c6f00] DataFrames v1.6.1
  [0c46a032] DifferentialEquations v7.13.0
⌃ [31c24e10] Distributions v0.25.107
⌃ [634d3b9d] DrWatson v2.13.0
  [c8e1da08] IterTools v1.10.0
⌃ [033835bb] JLD2 v0.4.45
  [b964fa9f] LaTeXStrings v1.3.1
  [7eb4fadd] Match v2.0.0
  [442fdcdd] Measures v0.3.2
⌃ [76087f3c] NLopt v1.0.1
  [b8a86587] NearestNeighbors v0.4.16
⌃ [429524aa] Optim v1.9.2
⌃ [7f7a1694] Optimization v3.21.2
⌃ [4e6fcdb7] OptimizationNLopt v0.2.0
  [42dfb2eb] OptimizationOptimisers v0.2.1
  [90014a1f] PDMats v0.11.31
⌃ [f0f68f2c] PlotlyJS v0.18.12
⌃ [91a5bcdd] Plots v1.40.0
  [1a8c2f83] Query v1.0.0
  [16b3a121] StateSpaceIdentification v0.1.0 `../StateSpaceIdentification.jl`
⌃ [f3b207a7] StatsPlots v0.15.6
⌃ [e88e6eb3] Zygote v0.6.69
  [10745b16] Statistics v1.10.0

Here I'm using DifferentialEquations.jl rather than OrdinaryDiffEq.jl directly, and I have seen that the latest DifferentialEquations.jl version uses OrdinaryDiffEq.jl "6.53", which is an old one, so maybe that's my problem.

PS: When I wrote up the issue and tried to build a simple example, I also tried removing StaticArrays from the function simplified_asm1!, and the problem disappeared (as shown below), so I think there is a problem with using StaticArrays when solving many ODEs.

[Screenshot from 2024-07-16 23-01-52]

@oscardssmith
Contributor

Does the problem exist when removing Distributed?

@ChrisRackauckas
Member

And is this actually an OrdinaryDiffEq thing or a Julia Distributed thing? Is there no way to recreate this without the ODE solver?

Again, we do not do any manual memory operations in the ODE solver, so it would be really weird for this to be an ODE solver issue. Almost certainly this is something about Base Julia, in which case this would be the wrong people to report it to and we would likely not be the ones to solve it.

@vbertret

I have tried the same program without Distributed and I get the same behaviour.

[Screenshot from 2024-07-17 10-41-30]

So I think the problem is related to the use of StaticArrays when solving ODEs, but maybe this isn't the right place to report it.

@ChrisRackauckas
Member

That's still not very high memory usage. Did you set a GC cap (or whatever it's called) at like 1GB to see if that triggers? I forget the default setting but it may just not be triggering at this level if your computer has enough memory.

@vbertret

Yeah, I tried the "--heap-size-hint" option when starting the process, but it didn't work. Something very weird is that calling GC.gc() outside the function M_t but inside the loop doesn't work, whereas calling GC.gc() inside the function M_t works but the program becomes very slow. It's not high memory usage per process, but when calling the function many times in different processes, I end up using 400GiB of RAM.
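
One way to put numbers on this kind of observation, instead of relying on screenshots, is to sample the GC's live-heap counter next to the process's peak resident set size while the loop runs. The sketch below is only an assumption about how one might instrument it: `memory_snapshot` and `track_memory` are hypothetical helper names, and the workload is a stand-in for `M_t`. Note that `Sys.maxrss()` reports a peak, so it never decreases.

```julia
# Hypothetical diagnostic helper: snapshot of two memory counters, in MiB.
function memory_snapshot()
    mib = 2.0^20
    return (live_heap = Base.gc_live_bytes() / mib,  # bytes the GC currently considers live
            max_rss   = Sys.maxrss() / mib)          # peak resident set size of this process
end

# Run `work(i)` for i in 1:n and record a snapshot every `every` iterations.
function track_memory(work, n::Integer; every::Integer = 10)
    samples = [memory_snapshot()]  # baseline before the loop starts
    for i in 1:n
        work(i)
        i % every == 0 && push!(samples, memory_snapshot())
    end
    return samples
end

# Stand-in workload; in the thread this would be a call to M_t(...).
samples = track_memory(i -> sum(rand(10^5)), 30; every = 10)
for (k, s) in enumerate(samples)
    println("sample $k: live heap = $(round(s.live_heap, digits = 1)) MiB, ",
            "peak RSS = $(round(s.max_rss, digits = 1)) MiB")
end
```

If the live heap stays roughly flat while the resident size keeps climbing, that would suggest the growth is not in objects the GC still tracks as live, which is a different situation from ordinary garbage that simply has not been collected yet.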

@ChrisRackauckas
Member

This probably needs @gbaraldi

@vbertret

For information, I tried adding "--heap-size-hint=1G" with the simple example mentioned above, but the problem is the same:

[Screenshot from 2024-07-17 12-01-24]

@oscardssmith
Contributor

oscardssmith commented Jul 17, 2024

ah, this was fixed half a year ago (#2148), but that doesn't help you if you haven't updated your packages in half a year.

@vbertret

OK, that's what I tried to explain at the end of my second message. Just one last question: why is DifferentialEquations.jl using an old version of OrdinaryDiffEq.jl as a dependency?

@oscardssmith
Contributor

it isn't. Something else is pinning your entire dependency stack back by ~6 months. What happens if you ]add OrdinaryDiffEq@6.85?

@vbertret

vbertret commented Jul 18, 2024

OK, I'm sorry for the misunderstanding. I didn't understand the "[compat]" section in the Project.toml file.

Thanks for the fast resolution of my problem!
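
For readers who land here with the same symptom, the sketch below shows one way to list the bounds declared in a project's "[compat]" section using the TOML standard library. The Project.toml contents are invented for illustration (the thread does not show the actual file), so the specific bounds are assumptions.

```julia
using TOML  # standard library

# Hypothetical Project.toml contents; the bounds below are invented
# for illustration and are not taken from the thread.
project_toml = """
[compat]
OrdinaryDiffEq = "~6.53"
julia = "1.10"
"""

project = TOML.parse(project_toml)
compat = get(project, "compat", Dict{String,Any}())

# Print each declared bound. A tilde bound such as "~6.53" only allows
# 6.53.x, which would keep the resolver from picking a newer release.
for name in sort(collect(keys(compat)))
    println(rpad(name, 20), compat[name])
end
```

In a real project one would parse the file itself with `TOML.parsefile("Project.toml")`. The Pkg REPL command `]st --outdated` should also report which packages are held back and by what.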
