
print memory usage and a truncated version of the object in whos() #12791

Merged
7 commits merged from jn/whos into master on Aug 26, 2015

Conversation

@vtjnash commented Aug 25, 2015

This avoids the issues and complications of the previous attempts at this by explicitly defining minimally recursive byte-ownership relationships, rather than recursing over all fields and guessing too much. The general heuristic here is that if something could (in theory) form a recursive tree, then it doesn't exclusively "own" that memory, so that memory doesn't need to get counted. Only when an object is effectively just an indirection (by construction) did I add some rules to try to incorporate more of the fields. In practice, I think this makes the results pretty much as good as they possibly can be.

example output:

julia> whos()
                    @code_llvm    335 bytes  Function : (anonymous function)
                @code_llvm_raw    339 bytes  Function : (anonymous function)
                 @code_lowered    338 bytes  Function : (anonymous function)
                  @code_native    337 bytes  Function : (anonymous function)
                   @code_typed    336 bytes  Function : (anonymous function)
                @code_warntype    339 bytes  Function : (anonymous function)
                         @edit    330 bytes  Function : (anonymous function)
                         @less    330 bytes  Function : (anonymous function)
                        @which    331 bytes  Function : (anonymous function)
                    ArrayViews    137 KB     Module : ArrayViews
                          Base  30951 KB     Module : Base
                         Cairo    144 KB     Module : Cairo
                         Color    322 KB     Module : Color
                        Compat     39 KB     Module : Compat
                          Core   2620 KB     Module : Core
                    DataArrays    512 KB     Module : DataArrays
                    DataFrames    888 KB     Module : DataFrames
                        Docile    268 KB     Module : Docile
             FixedPointNumbers     30 KB     Module : FixedPointNumbers
                          GZip    284 KB     Module : GZip
                      Graphics     61 KB     Module : Graphics
                           Gtk   3307 KB     Module : Gtk
                        Images    879 KB     Module : Images
                          Main  42168 KB     Module : Main
                      Reexport   3110 bytes  Module : Reexport
             SortingAlgorithms     22 KB     Module : SortingAlgorithms
                     StatsBase    360 KB     Module : StatsBase
                     StatsFuns    352 KB     Module : StatsFuns
                             a     24 bytes  3-element Array{Int64,1} : [1,2,3]
                           ans    406 KB     Function : summarysize
                             b     16 bytes  1-element Array{Any,1} : Any[2]
                             c     64 bytes  2x2 Array{Any,2} : Any[2 3…
                     clipboard    831 bytes  Function : clipboard
                 code_warntype   1980 bytes  Function : code_warntype
                             d    322 bytes  Dict{Any,Any} with 2 entries : Dict{Any,Any}("b"=>:B,"a"=>:a)
                      download   1849 bytes  Function : download
                   downloadcmd      0 bytes  Void : nothing
                             e    256 bytes  Dict{Int64,Char} with 2 entries : Dict(2=>'3',1=>'2')
                          edit   6241 bytes  Function : edit
                             f    123 KB     Function : +
                             g    212 bytes  Function : (anonymous function)
 gen_call_with_extracted_types   3005 bytes  Function : gen_call_with_extracted_types
                             h      8 bytes  UInt64 : 0x0000000000000000
                             i     48 bytes  Base.AbstractIOBuffer{Array{UInt8,1}} : IOBuffer(data=UInt8[...], readable=true, writabl…
                          less   1992 bytes  Function : less
                   methodswith   4074 bytes  Function : methodswith
                      runtests   1688 bytes  Function : runtests
                   summarysize    406 KB     Function : summarysize
             type_close_enough    656 bytes  Function : type_close_enough
                       typesof    465 bytes  Function : typesof
                   versioninfo   4238 bytes  Function : versioninfo
                          whos     17 KB     Function : whos
                     workspace    809 bytes  Function : workspace

julia> bigtask = let a = rand(100_000_00)
         Task(()->a)
       end
Task (runnable) @0x00000001122a0010

julia> Base.summarysize(bigtask, true)
80_000_299

@StefanKarpinski

This looks really nice and does seem to give the right intuitive sense of memory usage. Does this "cover" all memory that the program can reach, or does memory in structures that could have cycles get skipped?

@vtjnash commented Aug 25, 2015

Does this "cover" all memory that the program can reach

Probably not, although that's actually quite intentional, since there's a huge ambiguity issue with that goal. Instead, this is hugely conservative about the notion of ownership: basically, if memory can be transferred away to another object without the parent object noticing, then the parent didn't "own" that memory. It also means that if a structure could have a cycle, then I don't want to try to sum all of its memory.

I could give lots of examples, but consider, for example, a node in a linked list. While it's true that you could reach every other node in that list (and then all of their data, and so on), when I am looking at a particular node, my intuition for memory consumption is most closely "how much memory does this node consume", not "how much valid memory can I reach by following this pointer".
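
To make that concrete, here is a hypothetical illustration (ListNode is not a real Base type, and the sizes in the comments are only indicative):

# Hypothetical type, purely to illustrate the ownership intuition.
type ListNode
    value::Vector{Float64}   # data this node arguably "owns"
    next                     # link to the next node (or `nothing`) -- a reference, not ownership
end

# Build a three-node list; each node holds 1000 Float64s (~8000 bytes of data).
n3 = ListNode(rand(1000), nothing)
n2 = ListNode(rand(1000), n3)
n1 = ListNode(rand(1000), n2)

# Under the ownership intuition, asking about n1 should report roughly one
# node's memory (its two fields plus its value array), not the ~24000 bytes
# reachable by walking the whole chain.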

(note, the CI errors look like they may be pre-existing bugs)

@StefanKarpinski

That makes sense. It should just be documented when this does or doesn't comprehensively account for memory usage. It may be useful to have some option for reporting how much reachable memory is not accounted for by this report, and probably some other stats about memory usage.

@vtjnash commented Aug 25, 2015

The Travis failure appears to have been a random OOM event; considering the rest all passed, I think this is ready to merge.

@vtjnash commented Aug 26, 2015

We can tweak this over time if it's not entirely optimal, but let's see how it goes for a bit on master.

@carnaval

As an aside, it's probably an interesting visualization question: even if you had all the information, i.e. a precise breakdown of reachable memory for every subset of roots, how would you display that?

@vtjnash commented Aug 26, 2015

I think the biggest problem is that you really only want to include memory with a clear parent -> child ownership relationship, and reachability isn't really it. I wanted this number to be fairly stable against changes in distant parts of the system.

Although it can also be just as interesting to see how you can mislead the computation of this metric:

julia> begin
       a = let a=randn(100000); (x)->a; end
       b = [a, a, a]
       c = [+, +, +]
       whos()
       end
                          Base  25136 KB     Module : Base
                          Core   2595 KB     Module : Core
                          Main  29113 KB     Module : Main
                             a    781 KB     Function : (anonymous function)
                           ans      0 bytes  Void : nothing
                             b    735 bytes  3-element Array{Function,1} : [(anonymous function),(a…
                            b1    490 bytes  2-element Array{Function,1} : [(anonymous function),(a…
                            b2   1960 bytes  8-element Array{Function,1} : [(anonymous function),(a…
                             c    596 KB     8-element Array{Function,1} : [+,+,+,+,+,+,+,+]

vtjnash added a commit that referenced this pull request Aug 26, 2015
print memory usage and a truncated version of the object in whos()
@vtjnash merged commit 8b95c0d into master Aug 26, 2015
@vtjnash deleted the jn/whos branch August 26, 2015 21:07
@JeffBezanson

I think this might be a bit too arbitrary:

julia> Base.summarysize((+)=>(+),true)
16

julia> Base.summarysize(((+),(+)),true)
180408

julia> Base.summarysize([(+),(+)],true)
153328

@vtjnash commented Aug 26, 2015

Yeah, functions are a bit tricky, since the difference between a Function and a Closure isn't really represented correctly by the type hierarchy.

@JeffBezanson

That's not the issue there. The problem is the difference between a struct, a tuple, and an array. The answers don't make any sense. Why should a tuple lead to deeper recursion for size-counting than an array?

I think there are 2 reasonable approaches: (1) report the space used by only the toplevel object, or (2) recur and avoid duplicates with an ObjectIdDict. Double-counting is very misleading, and I think giving spurious wrong answers here is much worse than giving nothing. Trying to debug memory using this could waste many hours.
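
A minimal sketch of approach (2), assuming a hypothetical helper named reachable_size (this is not the code in this PR, and a real version would need special cases for objects such as Module, Symbol, and Function, where sizeof throws):

# Hypothetical sketch: follow every reference, but record visited objects in an
# ObjectIdDict so shared memory is counted only once.
function reachable_size(x, seen::ObjectIdDict = ObjectIdDict())
    isbits(x) && return sizeof(x)        # plain bits values cannot be shared
    haskey(seen, x) && return 0          # already counted via another path
    seen[x] = true
    sz = sizeof(x)                       # bytes of the object (or array data) itself
    if isa(x, Array)
        if !isbits(eltype(x))
            for i in 1:length(x)
                isdefined(x, i) && (sz += reachable_size(x[i], seen))
            end
        end
    else
        for i in 1:nfields(x)            # works for both composite types and tuples
            isdefined(x, i) && (sz += reachable_size(getfield(x, i), seen))
        end
    end
    return sz
end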

@JeffBezanson

Another suspicious smell here is how many methods summarysize has. I would expect this to look almost identical to deepcopy. I don't think it's useful to have a method for Associative, for example. Given 2 subtypes of Associative, it's not useful for summarysize to say they have the same size. What you'd want to know is which implementation is more memory efficient.

@vtjnash commented Aug 26, 2015

recur and avoid duplicates with an ObjectIdDict.

This will almost always end up grossly exaggerating the memory usage, because it will follow connections that are not logically parent -> child ownership links. I can avoid the double-counting by building a small ObjectIdDict for the single level of recursion; I was waiting to see whether that was worth the cost.

suspicious smell here is how many methods summarysize has

True, I actually suspect a lot of these should have been methods of Base.sizeof.

@JeffBezanson

this will almost always end up grossly exaggerating the memory usage because it will follow connections that are not logically parent -> child ownership links

That's silly. The current implementation doesn't address this at all, and also exaggerates memory usage.

The key here is that you need a mental model of what this thing does to make it useful. "Follows all pointers without double-counting" I would argue is a simple and useful model. That model explains exactly why each node of a linked list would seem to use O(n) space. You can use this right away, for example to see how much more space a linked list uses compared to an array with the same data.

Another way to think about it is that it should reflect the space it would take to serialize an object.
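
For reference, a rough way to get that "serialized size" number is to measure what serialize writes (a sketch, not what this PR computes; the count includes serializer overhead such as type tags):

io = IOBuffer()
serialize(io, rand(1000))        # any object works; rand(1000) is just an example
serialized_bytes = position(io)  # bytes the serializer wrote for the object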

@JeffBezanson

I actually suspect a lot of these should have been methods of Base.sizeof

I mostly disagree. Something like summarysize is free to pick one algorithm among many possibilities (being discussed here) for determining size. sizeof has to give a meaningful answer; in fact equal to the number of bytes written by write (I believe this always holds but not 100% sure).

@vtjnash commented Aug 26, 2015

Another way to think about it is that it should reflect the space it would take to serialize an object.

Not a bad way to think about it, but pretty much worthless for implementing it.

"Follows all pointers without double-counting" I would argue is a simple and useful model. That model explains exactly why each node of a linked list would seem to use O(n) space

Except that following all pointers would imply that you also follow the data pointers.

I think the following patch would help limit the recursion a bit better:

diff --git a/base/interactiveutil.jl b/base/interactiveutil.jl
index 99126c0..8f2d76a 100644
--- a/base/interactiveutil.jl
+++ b/base/interactiveutil.jl
@@ -530,7 +530,7 @@ function summarysize(obj::Tuple, recurse::Bool)
     if recurse
         for val in obj
             if val !== obj && !isbits(val)
# treat Tuple the same as any other iterable
-                size += summarysize(val, recurse)::Int
+                size += summarysize(val, false)::Int
             end
         end
     end
@@ -589,9 +589,11 @@ summarysize(obj::Set, recurse::Bool) =

 function summarysize(obj::Function, recurse::Bool)
     size::Int = sizeof(obj)
# don't unilaterally include `env` in the Function cost
-    size += summarysize(obj.env, recurse)::Int
+    if recurse
+        size += summarysize(obj.env, true)::Int
+    end
     if isdefined(obj, :code)
-        size += summarysize(obj.code, recurse)::Int
+        size += summarysize(obj.code, true)::Int
     end
     return size
 end

@JeffBezanson

except that following all pointers would imply that you you also follow the data pointers

Of course you would; that's part of the data.

@vtjnash commented Aug 26, 2015

Of course you would; that's part of the data.

Yes, but that's O(n*m), not the expected O(n) space estimate.

@JeffBezanson

You're changing the subject. I'm responding to the "gross overcounting" objection. The reason I think it's not gross overcounting is that it's easy to understand what's going on. In contrast, I don't see how to make use of something that recurs 1 level. I don't think that helps you understand how much space something uses.

@vtjnash commented Aug 26, 2015

sizeof has to give a meaningful answer; in fact equal to the number of bytes written by write (I believe this always holds but not 100% sure).

write(Array) is defined essentially as map(write, Array), which actually matches none of Core.sizeof, Base.sizeof, or Base.summarysize.
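
A quick illustration of that mismatch on an Array{Any} (numbers assume a 64-bit build):

a = Any[1, 2.5, 0x01]
sizeof(a)        # storage for 3 pointers: 24 bytes
io = IOBuffer()
write(io, a)     # writes each element's own bytes: 8 + 8 + 1 = 17 bytes
position(io)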

@JeffBezanson

Ok fair enough. sizeof probably needs some work. Good thing the help for sizeof and write do not mention each other.

@vtjnash commented Aug 26, 2015

I don't think that helps you understand how much space something uses.

I was pretty tempted to go with just printing Core.sizeof. But it's not terribly helpful for a Closure (since you can't see how much is in the array). For Arrays, the iteration is basically there as a cheap hack to make Array{Any} look a bit worse than Array{Int}.

@JeffBezanson

But nobody will be able to intuit what these cheap hacks and "bit worse than" heuristics mean or where they come from.

vtjnash added a commit that referenced this pull request Aug 26, 2015
…nly include the summarysize of env in the count when at the toplevel

ref #12791
@vtjnash commented Aug 26, 2015

But nobody will be able to intuit what these cheap hacks and "bit worse than" heuristics mean or where they come from.

That was in reference to this comment from above:

Given 2 subtypes of Associative, it's not useful for summarysize to say they have the same size. What you'd want to know is which implementation is more memory efficient.

I do not think there is any generic algorithm that can tell you that. So far, the options seem to be:

  1. Core.sizeof : only tells you how many fields it has
  2. this : tells you (1) + how many objects are in the Associative but not how much space they use (nor really does it indicate how much space the Associative uses)
  3. full recursion : more accurate for comparisons of similar data structures in exactly the same environment, not useful if the environment changes too much or if it encounters a global object such as a Function or Module (for example DataType->name->module, Function->env->module). Also can be expensive to compute.
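
To see how the three options differ in practice, here is a hedged sketch on a small Dict (it assumes the Base.summarysize(obj, recurse) entry point from this PR; the exact numbers depend on the implementation and platform):

d = Dict{Int,Vector{Float64}}()
for i in 1:10
    d[i] = rand(1000)          # each value holds ~8000 bytes of Float64 data
end

Core.sizeof(d)                 # option 1: only the Dict object's own fields
Base.summarysize(d, true)      # option 2 (this PR): a shallow account of the entries,
                               # but not the space the value arrays occupy
# option 3 (full recursion) would also charge d for the ~10 * 8000 bytes
# reachable through its values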

@JeffBezanson

I agree there isn't a perfect algorithm; I just want one simple enough to reason about. I think you can draw many useful conclusions from the deepcopy algorithm.

I think special cases for certain objects like Module are justifiable. Maybe there could be a default list of types not to recur into, like (Module, Function, Task).

@ScottPJones

Does this have any function you can call to get the results as some sort of collection, rather than printing them?
I'd like to be able to take the output and then display it in different ways, say sorted by size, etc.
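
Not part of this PR, but a small sketch of what such a helper could look like in user code (whos_table is a hypothetical name; it assumes the Base.summarysize(obj, recurse) entry point added here):

# Hypothetical helper: collect (name, size-in-bytes) pairs for a module's
# bindings so they can be sorted or filtered instead of only printed.
function whos_table(m::Module=Main)
    rows = Tuple{Symbol,Int}[]
    for name in sort(names(m, true))
        isdefined(m, name) || continue
        obj = getfield(m, name)
        sz = try
            Base.summarysize(obj, true)
        catch
            -1                       # some bindings may not be measurable
        end
        push!(rows, (name, sz))
    end
    sort(rows, by=r->r[2], rev=true) # largest first
end

whos_table()[1:10] would then give the ten largest bindings in Main.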

vtjnash added a commit that referenced this pull request Aug 27, 2015
…nly include the summarysize of env in the count when at the toplevel

ref #12791
@tkelman commented Aug 27, 2015

I think this may have started causing a segfault in the misc test - http://buildbot.e.ip.saba.us:8010/builders/build_ubuntu14.04-x64/builds/2310/steps/shell_2/logs/stdio

Inline review comment on the summarysize(obj::Array, recurse) method:

function summarysize(obj::Array, recurse::Bool)
    size::Int = sizeof(obj)
    if recurse && !isbits(eltype(obj))
        for i in 1:length(obj)

Contributor:

eachindex ?

@vtjnash (Author):

it's a plain array, so it doesn't need to be fancy

Contributor:

Isn't eachindex just as fast as 1:length(obj)? If so, it seems better to be consistent and always use eachindex (or indexes, if it gets renamed as discussed elsewhere).

@vtjnash commented Aug 27, 2015

I think this may have started causing a segfault in the misc test - http://buildbot.e.ip.saba.us:8010/builders/build_ubuntu14.04-x64/builds/2310/steps/shell_2/logs/stdio

This test can be sensitive to invalid objects that get left behind by past tests.

@tkelman commented Aug 27, 2015

Any way to make it more robust? Otherwise I have a bad feeling we'll start seeing this intermittently on CI, and now's not the time to be introducing any more of those...

@vtjnash commented Aug 27, 2015

Figure out what other test(s) are leaving behind invalid objects (this one appears to be an array for which calling length results in infinite recursion).

@JeffBezanson

This is another argument for why summarysize should depend on abstract interfaces as little as possible. There is nothing more concrete than the number of bytes taken up by an object; it doesn't matter what it reports about itself via a function like length.
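
For example, the sizeof builtin answers for an array directly from the runtime's own record of the data, without dispatching to any generic Julia method such as length (a small illustration, not from the PR):

a = zeros(UInt8, 100)
Core.sizeof(a)    # 100: the byte size of the array's data, as recorded by the
                  # runtime, independent of what length or size might report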

@JeffBezanson

Actually, wasn't there supposed to be a feature freeze a week ago or so? Now we have to flail around with new intermittent CI segfaults, for yet another change that wasn't slated for 0.4. Great.

@vtjnash commented Aug 27, 2015

Let's please not conflate what is likely just increased test coverage raising the likelihood of an existing problem causing a failure with a defect in this otherwise unrelated metric.

for yet another change that wasn't slated for 0.4

It was on the planning list (ViralBShah added this to the 0.4 milestone on Apr 30, 2014).

@JeffBezanson

It was moved to 0.4.x quite a while ago.

The point is that this is not a good time to disrupt the tests, even though such disruption is ultimately good since it exposes more issues to fix. This is why feature freezes exist, though it looks like we didn't officially announce one, which is too bad.

@vtjnash commented Aug 27, 2015

Another way to look at the algorithm here is that it basically gives sizeof, unless that answer was really bad or undefined (e.g. Module, Function, Symbol, etc.), in which case I tried to give it a little help towards giving a plausible answer.

If you want to play with option (3) (totalsizeof), the link is #11461.

@tkelman commented Aug 28, 2015

The failure is 100% repeatable via make testall1.

ihnorton added a commit that referenced this pull request Sep 6, 2015
'(line, linenum, filename, funcname) is used to pass
the original name through to keyword-arg specializations.
See 'line handling in julia-syntax.scm:keywords-method-def-expr

fixes #12791
fixes #12977