
Array comprehension with many intermediate allocations causes large permanent memory usage #39475

Open
JonasIsensee opened this issue Feb 1, 2021 · 1 comment
Labels
performance Must go faster

Comments

@JonasIsensee (Contributor)

The following code snippet causes the memory usage of my Julia process to grow continuously and significantly.
(I'm aware that there are many better ways to implement makesimpletype.)

Note that the effect only seems to happen when I modify the number of entries in d.
Purely speculating: could this be connected to some over-specialization on the number of elements?
If I only modify the values of the dictionary, the code runs much faster and memory usage stays constant.

makesimpletype(x::Dict{T,T}) where {T<:Integer} =
    Array{T}(vcat([[child parent] for (child, parent) ∈ x]...))

d = Dict(n => n for n = 1:1000000)

function loopfun(d)
    e = deepcopy(d)
    for n = 1:100
        delete!(e, n)
        mat = makesimpletype(e)
        GC.gc(true)
        println("$(round(Sys.maxrss()/2^20, digits=2)) MiB, ", size(mat))
    end
end
julia> loopfun(d)
740.79 MiB, (999999, 2)
740.79 MiB, (999998, 2)
814.68 MiB, (999997, 2)
857.2 MiB, (999996, 2)
946.41 MiB, (999995, 2)
1025.78 MiB, (999994, 2)
1076.62 MiB, (999993, 2)
1104.29 MiB, (999992, 2)
1170.89 MiB, (999991, 2)
1281.26 MiB, (999990, 2)
1307.23 MiB, (999989, 2)
1424.96 MiB, (999988, 2)
1483.08 MiB, (999987, 2)
1578.66 MiB, (999986, 2)
1681.47 MiB, (999985, 2)
1681.47 MiB, (999984, 2)
1681.47 MiB, (999983, 2)
1682.7 MiB, (999982, 2)
1736.14 MiB, (999981, 2)
1789.57 MiB, (999980, 2)
1835.41 MiB, (999979, 2)
1896.5 MiB, (999978, 2)
julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

See https://julialang.slack.com/archives/C67910KEH/p1612172510438000;
I was asked to reference #15543.

@JeffBezanson (Member)

Yes, at some point in the chain we're forming a tuple of all the splatted arrays, so it is constantly generating new types to specialize on. We might be able to fix that. But in the meantime, it's never a good idea to splat a large number of values, i.e. anything O(n) in the size of your data.
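A minimal sketch of one way to follow this advice (makesimpletype2 is a hypothetical name, not from this thread): preallocate the output matrix and fill it in a loop, so no O(n)-length tuple type is ever formed and each dictionary size reuses the same compiled method.

```julia
# Hypothetical non-splatting rewrite: each (child, parent) pair becomes
# one row of a preallocated n×2 matrix. No splatting, no per-size types.
function makesimpletype2(x::Dict{T,T}) where {T<:Integer}
    mat = Matrix{T}(undef, length(x), 2)
    for (i, (child, parent)) in enumerate(x)
        mat[i, 1] = child
        mat[i, 2] = parent
    end
    return mat
end
```

With this version the number of entries in the dictionary no longer appears in any type, so deleting keys in a loop should not trigger fresh specialization.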

@JeffBezanson JeffBezanson added the performance Must go faster label Feb 2, 2021