A large number of non-referenced small objects results in runaway memory increase on 0.4, gradual increase on master #15543

Closed
amitmurthy opened this issue Mar 17, 2016 · 14 comments

@amitmurthy
Contributor

amitmurthy commented Mar 17, 2016

Issue and reduced code sample reported and provided by @kswietli in JuliaLang/Distributed.jl#36.

Have changed pmap in the original code to map.

type Inner
    data :: Int64
end

type Outer
    inners :: Vector{Inner}
end

function generate_outer()
    inners = Array(Inner, 1000)

    for position in 1:1000
        inners[position] = Inner(0)
    end

    return Outer(inners)
end

function loop(unused)
    outers = Array(Outer, 1000)

    for position in 1:1000
        outers[position] = generate_outer()
    end

    return
end    

function sweep()
    while true
        dontcares = Array(Int64, 100)
        map(loop, dontcares)

        #Uncomment this to not hit swap on 0.4.3.
        #gc()
    end
end

sweep()

In 0.4, the above code causes RAM usage to grow in a runaway manner. On master the growth is much slower, but memory usage still increases gradually.
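
For readers who want to try the same allocation pattern on a current release, here is a rough port of the sample above. It is a sketch under assumptions, not part of the original report: it assumes Julia 1.x syntax (`mutable struct` in place of `type`, `Vector{T}(undef, n)` in place of `Array(T, n)`), and the `iterations` argument to `sweep` is a hypothetical addition so that the loop terminates.

```julia
# Port of the reduced sample to Julia 1.x syntax (an assumption; the
# original targets 0.4). `type` declared a mutable type, hence `mutable struct`.
mutable struct Inner
    data::Int64
end

mutable struct Outer
    inners::Vector{Inner}
end

# Build one Outer holding 1000 small heap-allocated Inner objects.
function generate_outer()
    inners = Vector{Inner}(undef, 1000)
    for position in 1:1000
        inners[position] = Inner(0)
    end
    return Outer(inners)
end

# Allocate 1000 Outer objects and drop them all when the function returns.
function loop(unused)
    outers = Vector{Outer}(undef, 1000)
    for position in 1:1000
        outers[position] = generate_outer()
    end
    return
end

# Bounded variant of the original `while true` loop.
# `iterations` is hypothetical and not in the original sample.
function sweep(iterations::Integer)
    for _ in 1:iterations
        dontcares = zeros(Int64, 100)
        map(loop, dontcares)
    end
    return
end
```

Calling `sweep(10)` while watching the process's resident memory should be enough to compare behavior across versions; whether a given release still shows growth is not something this sketch asserts.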

@amitmurthy amitmurthy added the GC Garbage collector label Mar 17, 2016
@amitmurthy
Contributor Author

cc: @carnaval , @yuyichao

@yuyichao
Contributor

RAM usage increasing in a runaway manner when allocating a lot of dead small objects sounds like #13993.

@amitmurthy
Contributor Author

I'll test on the latest 0.4 commit. On master, while the growth is much slower, memory usage keeps growing and does not reach a steady state.

@amitmurthy
Contributor Author

Oh! I didn't realize that #13993 has not been merged. A point to note is that with the call to gc() uncommented, the code runs fine. Runaway growth on latest 0.4 branch too.

@kswietli

Not sure if this is helpful, but I did test it on 0.3.12. There is explosive growth there as well, so this seems to be a longstanding issue.

@amitmurthy
Contributor Author

@carnaval / @yuyichao, as a practical workaround, is there an API to get the current total heap size? Until this is fixed, users could at least write their long-running programs with a check like

if julia_heap_size() > SOME_LIMIT
  gc()
end
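
On a current Julia release, a check of this shape might be sketched as follows. This is an illustration under assumptions rather than an answer from the thread: it assumes Julia 1.x, where the unexported `Base.gc_live_bytes()` reports the bytes held by live objects and `GC.gc()` takes the place of `gc()`. The helper name `gc_if_over` and its `limit` argument are hypothetical.

```julia
# Hypothetical helper: force a full collection once the live heap
# exceeds `limit` bytes. Assumes Julia 1.x (`Base.gc_live_bytes`, `GC.gc`);
# neither name appears in this thread.
function gc_if_over(limit::Integer)
    if Base.gc_live_bytes() > limit
        GC.gc()          # full collection
        return true      # a collection was triggered
    end
    return false         # still under the limit, nothing done
end
```

A long-running loop could call something like `gc_if_over(2^30)` once per iteration; the 1 GiB figure is arbitrary and would need tuning to the machine and workload.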

@amitmurthy amitmurthy changed the title from "collection of non-referenced objects is slow on 0.4" to "A large number of non-referenced small objects results in runaway memory increase on 0.4, gradual increase on master" Apr 29, 2016
@amitmurthy
Contributor Author

Tagging this as 0.5 since it is a deal-breaker for long-running programs. If we cannot have an automatic fix, we should at least provide a workaround of the type mentioned above.

@amitmurthy amitmurthy added this to the 0.5.0 milestone Apr 29, 2016
@yuyichao
Contributor

yuyichao commented May 8, 2016

The issue seems to be that the allocation pattern (allocate a lot of live objects and throw them away at once) breaks the current full collection heuristics. Need to see how we can tweak the heuristic without introducing issues like #12632.

@StefanKarpinski
Sponsor Member

Is the issue here that this memory is never reclaimed or that it is not reclaimed soon enough?

@JeffBezanson JeffBezanson added the performance Must go faster label May 26, 2016
@yuyichao
Contributor

Not soon enough

@StefanKarpinski
Sponsor Member

In that case we need to improve the heuristic for when to do a full collect to handle this better. Perhaps this is stating the obvious, but hopefully it helps focus attention and energy for this issue.

@yuyichao
Contributor

Yes. That's my plan after I come back from this conference.

@amitmurthy
Contributor Author

The "when to do a full collect" question is difficult to get right: it depends on system configuration as well as the load on the system. I still feel a simpler solution is to expose the total allocated amount and then do a full collect either a) automatically at a pre-configured limit or b) when the user checks that amount and calls gc().

yuyichao added a commit that referenced this issue May 28, 2016
* Delete `gc_bits` and `allocd` which are not enough to accurately
  record the necessary information.
* Add `has_marked` and `has_young` to identify free pages and untouched pages.

Fixes #15543
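
As a reading aid, the role of the two per-page flags named in the commit message can be modeled with a toy classifier. Everything below is an assumption made for illustration: the real collector is written in C inside the runtime, and the mapping used here (no marked objects means the page is free; a quick sweep with no young objects means the page is untouched) is one plausible interpretation, not something stated in the commit message. `PageMeta` and `classify` are hypothetical names.

```julia
# Toy model only; field names follow the commit message, semantics are assumed.
struct PageMeta
    has_marked::Bool   # assumed: some object on the page survived marking
    has_young::Bool    # assumed: the page holds live young objects
end

# Decide what a sweep would do with a page under the assumed semantics.
function classify(pg::PageMeta; full_sweep::Bool)
    if !pg.has_marked
        return :free          # nothing live: the whole page can be reclaimed
    elseif !full_sweep && !pg.has_young
        return :untouched     # quick sweep can skip a page with only old objects
    else
        return :needs_sweep   # must walk the page object by object
    end
end
```

Under this model, a page filled entirely with dead small objects always classifies as `:free`, whichever kind of sweep is running, which is the case the reproducer exercises.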
@yuyichao
Contributor

This turns out to be an issue similar to the one fixed by #13993 (I was confused by the GC pause count) and it seems that the sweep is somehow confused by the page metadata and misses the free pages.

However, it seems that the band-aid in #13993 doesn't really fix this case, so I ended up using the scheme @carnaval and I put together for the bit swap GC, which fixes both cases and is surprisingly easy to implement on the current GC.
