Permalink
Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
executable file 1499 lines (1360 sloc) 59.4 KB
#!/usr/bin/env python
# vim: et:ts=4:sw=4:
#
# This file is part of sp-rtrace.
#
# Copyright (C) 2005-2007,2009-2012 by Nokia Corporation
#
# Contact: Eero Tamminen <eero.tamminen@nokia.com>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# version 2 as published by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
# 02110-1301 USA
#
# CHANGES:
#
# 2005-10-21:
# - First version
# 2005-10-24:
# - Changed everything to use classes
# - Changed from tree model to all-nodes hash approach
# - The memory usage is shown in HTML tables to make it more readable
# - Support for showing the hash either top-down or bottom-up
# 2005-10-25:
# - Prevent output recursion
# - Add hyperlinks to HTML output
# - Add list of trace IDs to each node to:
# - Handle recursion in parsing which distorted allocation accounting
# - To be able to leave out functions which are not part of given trace
# although a function in the trace calls them with some other traces
# 2005-10-26:
# - Add GraphViz diagram output
# - Add numeric IDs and percentage to all nodes
# - Calculate node memory usage percentages on Graph completion
# - helps GraphViz (and other) reports
# - In HTML report:
# - use Stats directly
# - node IDs instead of function names in links (saves >10% HTML size)
# - rows are right aligned by default (saves >5% HTML size)
# - Add ASCII histogram output
# - Collect all alloc sizes to Stats
# - Add topmost alloc graph output
# - Store highest address/trace ID when parsing
# - Add command line option parsing
# 2005-10-27:
# - Make graphs more readable by grouping together child nodes below the limit
# - Show always at least one '#' for histogram entries
# - Bugfixes
# 2005-10-28:
# - Make output and graphs smaller / more readable:
# - coalesce same functions having different offsets in libleaks
# - add option for skipping rest of backtrace after given function
# - add option to have graph only of backtraces going through certain
# functions
# - Improve graph readability visually:
# - correction: show percentage, size and count in nodes, not edges
# - add option for setting the graph node emphasize percent
# - show terminator nodes as different shapes
# - make arrow heads larger
# - Add option to select how many highest address traces are shown
# in the topmost graph:
# - collect highest allocation address for each trace into Graph
# - Fixes to the HTML output report and generalizing Graphviz reports
# - Option parsing and show() methods output to stderr
# - Move Stats & Graph finalizing out of parsing
# - Document classes more
# 2005-11-04:
# - Graph readability improvements:
# - Ignore nodes without function names
# - Add option for merging all nodes from a library to one
# - Add node clustering option (useless with recursive Gtk backtraces)
# 2005-11-07:
# - Fix stats with --include-only option, move stats object to Graph
# 2006-02-09:
# - Support for the libleaks "name resolved" backtrace format
# - Fix graph anonymous function counts (it was showing count of calls)
# - Minor improvements to option parsing
# - Add headings to GraphViz reports
# 2006-02-21:
# - Changed '--skip' parameter to '--stop'
# - Added '--ignore' parameter to ignore certain functions in backtraces
# - Finetune GraphViz output (e.g. fill leaf nodes)
# - Allow minimum size limit < 1%
# 2006-02-24:
# - Add options to trim out intermediate and leaf nodes from the internal graph
# - Add option for leaving out "a function/N function ..." stuff from reports
# - Use debug() function instead of stdout/err.write
# - --show-references option for debugging
# - Use iterators for traversing dicts
# 2006-02-27:
# - Grouped options in help, added --ignore-gtk-internals and --verbose options
# 2006-02-28:
# - Added --no-ignore option to prevent ignoring certain nodes
# - Fine-tune --ignore-gtk-internals
# 2006-03-21:
# - Added --own-alloc-limit option for highlighting nodes doing a certain
# factor of allocations (done through them) directly, instead of them
# being done by child functions
# 2006-03-23:
# - Show args, helps in debugging when leaks-calltree is called by a script
# - Fix bug when --include-only doesn't match any nodes
# 2006-05-10:
# - In --ignore-gtk-internals option ignore also g_slice stuff
# 2006-07-14
# - Added --stats="total"/"count" option for statistics report output
# 2007-07-14
# - --ignore-gtk-internals -> --ignore-gtk-details
# 2007-07-17
# - Add support for Gobject traces and manual page
# 2009-06-26
# - Support non-indented backtraces from glibc backtrace_symbols_fd().
# - Allow librefs to report zero size objects, that can result from
# g_type_query() failures.
# 2010-08-02
# - Updated to work with rtrace format, renamed to rtrace-calltree and
# moved to the sp-rtrace package.
# 2010-09-06:
# - Remove support for old sp-libleaks format
# - Fix PyLint reported errors and some of the warnings
# 2010-09-08:
# - Fix handling of resource freeing backtraces (they caused destructors
# to seem allocate resources and main() to be called from them).
# - Include trace functions to nodes (by default, but allow ignoring them)
# - Add more --ignore-* convenience options, split Glib ones to their option
# - Match function names only from the beginning with --stop, --ignore
# and --include-only options
# 2010-09-09:
# - Fix HTML report recursion prevention
# - Option to use function argument value instead of function name
# (useful e.g. for file open traces)
# 2010-09-10:
# - Add --ignore-middle option for matching e.g. template functions
# - Move new/delete under --ignore-libc allocs and change/expand
# --ignore-qt-allocs option to --ignore-qt-internals
# 2010-09-21:
# - Change node text order to fix prefix matching with debug symbols
# - Fix --cluster node parsing and support also source files
# - Rename --join-lib to --join, match also function and source file names
# - Improve Qt internal functions ignoring
# - Add help on how to reduce graph with "sed"
# 2010-09-23:
# - Improve error handling
# - Handle the new Gdb style sp-rtrace backtrace format
# 2010-09-29:
# - Option for specifying size range for included allocations
# 2010-10-05:
# - Fixes to parsing BFD resolved traces and traces with resource frees
# that don't have backtraces + '*' shortcut to --node option
# 2010-11-01:
# - Ignore frees completely, don't count them as zero sized allocs
# (use -1 for free sizes and ignoring stack with negative sizes)
# 2011-02-07:
# - Remove HTML report as useless
# - Remove histogram reports and median alloc size calculation
# as they're now done by sp-rtrace-timeline
# - Change lists to hashes to speedup parsing of large traces
# 2011-02-24:
# - Rename --trace-args to --show-args
# - Support arbitrary function argument names (used by newer sp-rtrace)
# and different sized allocs for each argument instance
# - Add --include-only-args option to complement --include-only option
# 2011-03-09:
# - Correct statistics how many backtraces lines are ignored on
# request and how many because they lack required info
# 2011-03-18:
# - Fix help/usage text (only dot callgraph is output now)
# - Add options for reducing text in callgraph nodes
# - Add "count" callgraph report
# 2011-03-24:
# - Rename&reuse refs to only_counts to indicate kBs aren't needed
# 2012-03-20:
# - Fix max size limit
# 2012-05-24:
# - Consider traced functions for --include-only option too, not
# just functions in backtrace
# - Add trace info like process name to callgraph
#
# Originally this file (alloc-tree) was a part of libleaks package, but
# renamed and moved to the sp-rtrace package during libleaks redesign.
#
"""
CALLTREE
This script reads non-freed resource allocations from sp-rtrace trace
file and accumulates these to functions up in the listed backtraces.
It can then produce a variety of callgraphs from which you can see what
functions account for most of the resource usage and how the resource
allocations are divided between the functions causing them.
"""
import os, sys, re
def debug(msg):
sys.stderr.write("%s\n" % msg)
def error_exit(msg):
sys.stderr.write("\nERROR: %s!\n" % msg)
sys.exit(1)
class TotalStats:
def __init__(self, overhead):
# estimated average allocation overhead, bytes
self.alloc_overhead = overhead
self.zero()
def zero(self):
"zero/initialize statistics"
self.sizes = []
self.sizeitems = {}
self.count = 0
self.size = 0
self.overhead = 0
self.size_with_overhead = 0
def add_sizes(self, sizes):
"add list of given positive sizes to earlier sizes, return their sum"
size = 0
for s in sizes:
size += s
if s in self.sizeitems:
self.sizeitems[s] += 1
else:
self.sizeitems[s] = 1
return size
def complete(self):
"calculate memory allocation count, overhead, total and median"
self.count = sum(self.sizeitems.itervalues())
if not self.count:
self.zero()
return
self.overhead = self.count * self.alloc_overhead
self.size = sum([int(size)*count for size,count in self.sizeitems.iteritems()])
self.size_with_overhead = self.size + self.overhead
self.sizes = self.sizeitems.keys()
self.sizes.sort()
# ----------- Single backtrace node in all-items hash ----------
class Node:
def __init__(self, name, size, count, trace_id):
"initialize node with name, size & count"
# updated while nodes are being added to Graph
self.traces = {trace_id: True} # backtrace IDs
self.count = count # number of allocations done in this node
self.size = size # sum of allocations done in this node
self.name = name # node function name / hash ID
self.children = {} # child node hash IDs
self.parents = {} # parent node hash IDs
# these are filled after all nodes have been added to Graph
self.nid = 0 # node ID
self.leafnode = 0 # set if node has no children or no parents
self.percentage = 0.0 # percentage of memory used by the node
self.own_allocs = 1.0 # factor of own vs. child allocs
def show(self):
"print out node contents"
debug("Node %d:" % self.nid)
debug("- Name: %s" % self.name)
debug("- Alloc count: %d" % self.count)
debug("- Total alloc size: %d" % self.size)
debug("- Percentage of all allocs: %.2f" % self.percentage)
debug("- Traces:")
debug(str(self.traces))
debug("- Parents: ")
debug(str(self.parents))
debug("- Children:")
debug(str(self.children))
def add_size_count(self, size, count):
"add given size & count to node values"
self.count += count
self.size += size
def child_list(self):
"return list of node children"
return self.children.keys()
def parent_list(self):
"return list of node parents"
return self.parents.keys()
def add_child(self, fun_name):
"add given child to node"
self.children[fun_name] = True
def add_parent(self, fun_name):
"add given parent to node"
self.parents[fun_name] = True
def remove_child(self, fun_name):
"removes given child from node"
if fun_name in self.children:
del(self.children[fun_name])
else:
error_exit("function '%s' not in node children" % fun_name)
def remove_parent(self, fun_name):
"removes given parent from node"
if fun_name in self.parents:
del(self.parents[fun_name])
else:
error_exit("function '%s' not in node parents" % fun_name)
def remove_children(self):
"remove all node children"
self.children = {}
def remove_parents(self):
"remove all node parents"
self.parents = {}
def add_trace(self, trace_id):
"add given trace id to node or return False if it already was in"
if trace_id not in self.traces:
self.traces[trace_id] = True
return True
else:
return False
# ---------------- All items hash handling -------------------
class Graph:
def __init__(self, overhead):
# filled through .update()
self.info = None
self.trace_id = 0 # trace count, used as trace ID, start from 1
self.recursed = 0 # count of recursed functions
self.nodes = {} # all nodes in the backtraces
self.addresses = [] # list of (address, trace id) tuples
self.stats = TotalStats(overhead)
def show_contents(self):
"print out Graph contents"
self.stats_full()
debug(str(self.nodes))
def show_nodes(self, nodes, references, remove):
"print out Graph nodes (in readable format)"
for name in nodes:
if name not in self.nodes:
if name == '*':
for node in self.nodes.itervalues():
node.show()
break
debug("Warning: node '%s' does not exist" % name)
continue
self.nodes[name].show()
if references:
for node in self.nodes.itervalues():
if name in node.parents:
debug("parent-ref from '%s'" % node.name)
if name in node.children:
debug("child-ref from '%s'" % node.name)
if name in remove:
debug("- This node will be REMOVED")
debug("")
def stats_full(self):
"print out Graph parsing statistics"
debug("Graph statistics:")
debug("- backtraces parsed: %d" % self.trace_id)
debug("- Recursed functions: %d" % self.recursed)
debug("- Nodes: %d" % len(self.nodes))
def stats_oneliner(self):
"return one-line Graph parsing stats string"
return "%d traces (with %d recursed function calls) parsed to %d nodes." % (self.trace_id, self.recursed, len(self.nodes))
def get_stats(self):
"return TotalStats object with Graph node alloc statistics"
return self.stats
def get_top_nodes(self):
"returns a list of keys for top nodes"
nodes = []
for key,node in self.nodes.iteritems():
# nodes without parents are top ones
if not node.parents:
nodes.append(key)
return nodes
def get_bottom_nodes(self):
"returns a list of keys for bottom nodes"
nodes = []
for key,node in self.nodes.iteritems():
# nodes without children are bottom ones
if not node.children:
nodes.append(key)
return nodes
def add_node(self, trace, idx, size, count):
"add trace item to graph"
#debug("DEBUG: add '%s'" % trace[idx])
name = trace[idx]
node = Node(name, size, count, self.trace_id)
if idx > 0:
node.add_child(trace[idx-1])
if idx+1 < len(trace):
node.add_parent(trace[idx+1])
self.nodes[name] = node
def merge_node(self, trace, idx, size, count):
"merge trace item to graph"
#debug("DEBUG: merge '%s'" % trace[idx])
node = self.nodes[trace[idx]]
# prevent size&count addition if this trace function is recursed
if node.add_trace(self.trace_id):
node.add_size_count(size, count)
else:
self.recursed += 1
if idx > 0:
node.add_child(trace[idx-1])
if idx+1 < len(trace):
node.add_parent(trace[idx+1])
def update(self, trace, sizes, address):
"update graph with given trace[], size, count, topmost address"
size = self.stats.add_sizes(sizes)
if size < 0:
# negative sizes come from frees, ignore them
return
self.trace_id += 1
count = len(sizes)
idx = 0
for line in trace:
if line in self.nodes:
self.merge_node(trace, idx, size, count)
else:
self.add_node(trace, idx, size, count)
idx += 1
self.addresses.append((address, self.trace_id))
def trim_intermediate(self, node, remove):
"""if node has only one parent and child, add it to remove list
and relink surrounding nodes"""
if len(node.parents) == 1 and len(node.children) == 1:
parent_name = node.parent_list()[0]
child_name = node.child_list()[0]
if parent_name == child_name:
return
parent = self.nodes[parent_name]
child = self.nodes[child_name]
# remove references to this node and re-link the nodes
name = node.name
parent.remove_child(name)
child.remove_parent(name)
parent.add_child(child_name)
child.add_parent(parent_name)
node.remove_parents()
node.remove_children()
#debug("Removing intermediate node: '%s'" % name)
remove[name] = 1
def trim_leaf(self, node, remove):
"""if node is a leaf, put it and other nodes in the same leaf
to remove list and remove the refence to the leaf"""
# if this is a leaf node with only one reference,
# remove reference and leaf node, and check next
# node in the same direction
while (not len(node.parents)) and (len(node.children) == 1):
child = self.nodes[node.child_list()[0]]
child.remove_parent(node.name)
node.remove_children()
#debug("Removing leaf node without parents: '%s'" % node.name)
remove[node.name] = 1
node = child
while (not len(node.children)) and (len(node.parents) == 1):
parent = self.nodes[node.parent_list()[0]]
parent.remove_child(node.name)
node.remove_parents()
#debug("Removing leaf node without children: '%s'" % node.name)
remove[node.name] = 1
node = parent
def trim(self, options):
"""if 'intermediates' not set, removes nodes with only one parent
and child. if 'leafs' not set, removes leaf nodes"""
remove = {}
if not options.leafs:
for node in self.nodes.itervalues():
self.trim_leaf(node, remove)
if not options.intermediate:
for node in self.nodes.itervalues():
self.trim_intermediate(node, remove)
# you cannot remove items while iterating, so do it now
if remove and options.node:
debug("NOTE: Removed some node(s), showing updated node infos")
self.show_nodes(options.node, options.references, remove)
for name in remove.iterkeys():
del(self.nodes[name])
def complete(self):
"""sets allocation percentage and creates an ID for each node,
returns stats object"""
self.stats.complete()
alloc_overhead = self.stats.alloc_overhead
total_size = self.stats.size_with_overhead
if not total_size:
debug("WARNING: allocation total is zero, all percentages will be bogus!")
total_size = 0.1
idx = 0
for node in self.nodes.itervalues():
node.percentage = 100.0 * (node.size + node.count*alloc_overhead) / total_size
if (not node.parents) or (not node.children):
node.leafnode = 1
childsize = sum([self.nodes[child].size for child in node.children])
if childsize > 0:
node.own_allocs = float(node.size) / childsize
node.nid = idx
idx += 1
return self.stats
# -------------- rtrace report parsing --------------------
class Rtrace:
def __init__(self, tracefile):
# format: <index>. [@<context id>] [\[<timestamp>\]] <function name>\<<resource type>\>(<resource size>) = <hex ID>
self.alloc_pattern = re.compile("^[0-9]+\.( @[0-9a-fA-F]+|)( \[[^]]+\]|) ([a-zA-Z_].+)\(([0-9]+)\) = (0x[a-fA-F0-9]+)")
# format: <index>. [@<context id>] [\[<timestamp>\]] <function name>\<<resource type>\>(<hex ID>)
self.free_pattern = re.compile("^[0-9]+\.( @[0-9a-fA-F]+|)( \[[^]]+\]|) ([a-zA-Z_].+)\((0x[a-fA-F0-9]+)\)")
# format: at <source file>:<line nro>
self.gdb_src_pattern = re.compile(".* at ([^:]+):([0-9]+)$")
self.leaksfile = open(tracefile)
self.get_info()
def get_info(self):
"get process/trace information from first trace file line"
line = self.leaksfile.readline().split(',')
info = { 'process': "UNKNOWN", 'pid': "0" }
for item in line:
if '=' in item:
name,value = item.strip().split('=')
info[name] = value
self.info = info
def libname_strip(self, libname):
"returns library name stripped of 'lib' prefix and '.so*' postfix"
# format: from <library/binary file>
pos1 = libname.rfind("/")
if pos1 >= 0:
pos1 += 1
elif libname.startswith(" from "):
pos1 = 6
else:
pos1 = 0
pos2 = libname.rfind(".so")
if pos2 > 0:
return libname[pos1:pos2]
return libname[pos1:].strip()
def trace_strip(self, line, joins):
"return relevant information from GDB style backtrace line"
# format (src is only with function):
# \t0x<hex>[ <function name>(<args>)][ at <src>:<line>| from <bin/lib>]
#
# remove address, leave preceeding space
pos = line.find(' ')
if pos <= 0:
return None
line = line[pos:]
pos = line.find(" from ")
if pos == 0:
# just library/binary
return self.libname_strip(line)
if pos > 0:
# function + library/binary
lib = self.libname_strip(line[pos:])
if lib in joins:
return lib
fun = line[:pos]
src = None
else:
lib = None
pos = line.find(" at ")
if pos < 0:
# just function/method (+ args)
fun = line
src = None
else:
fun = line[:pos]
if ':' in line[pos:]:
# function + src:line
match = self.gdb_src_pattern.match(line)
if not match:
error_exit("line containing ' at ' doesn't match <src>:<line>:\n'%s'\n" % line)
src,linenro = match.groups()
else:
# function + src
src = line[pos+4:].strip()
linenro = ''
if src in joins:
return src
fun = fun.strip()
if fun:
# check function joins
pos = fun.find('(')
if pos >= 0:
name = fun[:pos]
else:
name = fun
if name in joins:
return name + "()"
else:
fun = "()"
#debug("----------\nline:\t%s\nfun:\t%s\nsrc:\t%s\nlib:\t%s" % (line.strip(),fun,src,lib))
if lib:
return "%s in %s" % (fun, lib)
elif src:
return "%s at %s:%s" % (fun, src, linenro)
else:
return fun
def get_name_size_addr(self, line):
"return name and size & ID for alloc, but negative sizes for frees"
match = self.alloc_pattern.match(line)
if match:
return (match.group(3), int(match.group(4)), int(match.group(5), 16))
match = self.free_pattern.match(line)
if match:
return (match.group(3), -1, int(match.group(4), 16))
error_exit("this alloc/free line doesn't match the patterns:\n%s" % line)
def ignore(self, line, options):
"return True if given line should be ignored, False otherwise"
for fun in options.ignore:
if line.startswith(fun):
for fun in options.noignore:
if line.startswith(fun):
return False
return True
for fun in options.ignore_middle:
if fun in line:
for fun in options.noignore:
if line.startswith(fun):
return False
return True
return False
def update_args(self, trace, argsfun, argvalues, argsizes):
if argsfun[-1] == '>':
# remove traced function type
argsfun = argsfun[:argsfun.find('<')]
idx = 0
# '*' needed for zip to remove list surrounding value lists
for arglist in zip(*argvalues):
fun = "%s<%s>" % (argsfun, ", ".join(arglist))
#sys.stderr.write(fun+"\n")
graph.update([fun]+trace, argsizes[idx], 0)
idx += 1
def parse(self, graph, options):
"return parsed graph out of the given rtrace report"
graph.info = self.info
only_counts = options.only_counts
top = 0 # highest allocation address/ID for current backtrace
nonames = 0 # backtrace lines without function name
ignores = 0 # otherwise ignored backtrace lines
ignored = 0 # count for single backtrace
trace = [] # current backtrace
sizes = [] # sizes for each alloc/free for current backtrace
argsfunc = "" # name for function with args
argsizes = [] # list of resource sizes for each argument group
argvalues = [] # list of values for each named argument
argkeys = {} # argument { name : valuelist index }
name = ""
skip = False # whether to skip rest of the current backtrace
stack = False # whether current alloc/free includes stack frames
include = False # whether to include this backtrace
for line in self.leaksfile:
# new backtrace?
if line[0] in "0123456789":
oldname = name
(name,size,addr) = self.get_name_size_addr(line)
if name != oldname:
# backtrace can be missing for previous alloc
# or free but it still needs to be handled and
# separated from backtraces following them
stack = True
# backtrace stack frames collected for previous allocs/frees?
if stack:
#if options.verbose:
# debug("sizes: %s, trace: %s" % (sizes,trace))
if options.progress:
# show parsing progress
debug(line[:line.find('.')])
# there are included alloc lines for this backtrace and
# it's marked for inclusion or all BTs are included?
if (sizes or argsizes) and (include or not (options.include_funs or options.include_args)):
if argkeys:
argsizes.append(sizes)
self.update_args(trace, argsfun, argvalues, argsizes)
else:
graph.update(trace, sizes, top)
ignores += ignored
else:
if ignored:
ignores += ignored
else:
ignores += len(sizes)
# initialize new backtrace
argsfun = name
argsizes = []
argvalues = []
argkeys = {}
if self.ignore(name, options):
ignored = 1
trace = []
else:
ignored = 0
trace = [name]
sizes = []
top = 0
skip = False
stack = False
include = False
for fun in options.include_funs:
if name.startswith(fun):
include = True
elif argkeys:
# add sizes for previous args from this backtrace
argsizes.append(sizes)
sizes = []
# get alloc size
if size >= options.size_min and size <= options.size_max:
if only_counts:
size = 1
sizes.append(size)
# get alloc address
if (addr > top):
# this is highest address for this trace
top = addr
elif line.startswith("\t0x"):
# ignore rest of this backtrace?
if skip:
ignored += 1
continue
# parse backtrace line
line = self.trace_strip(line, options.joins)
# ignore lines without function names
if not line:
nonames += 1
continue
# ignore lines specified in options
if self.ignore(line, options):
ignored += 1
continue
# skip rest of the trace?
for fun in options.stop:
if line.startswith(fun):
ignored += 1
skip = True
break
# flag this trace for node inclusion?
for fun in options.include_funs:
if line.startswith(fun):
include = True
break
trace.append(line)
stack = True
continue
elif options.show_args and line.startswith("\t$"):
pos = line.find(' = ')
if pos < 4:
error_exit("line doesn't match traced function argument pattern:\n%s" % line)
key = line[2:pos].strip()
val = line[pos+3:].strip()
if val:
if key in argkeys:
argvalues[argkeys[key]].append(val)
else:
# index to this argument type (e.g. path vs. flags)
argkeys[key] = len(argvalues)
# list of values for this argument type
argvalues.append([val])
for arg in options.include_args:
if arg in val:
include = True
break
if trace and not stack:
# no separate node for traced function name
trace = []
# add previous backtrace to Graph
if stack:
if include or (not options.include_funs and not options.include_args):
if argkeys:
argsizes.append(sizes)
self.update_args(trace, argsfun, argvalues, argsizes)
else:
graph.update(trace, sizes, top)
return (graph, ignores, nonames)
# --------------- GraphViz output reports -----------------
class GraphvizReport:
def __init__(self, graph, options):
self.nodes = graph.nodes
self.show_leafs = options.show_below_limit
# don't show nodes using less memory and which to emphasize
self.emph_limit = options.emph_limit
self.own_allocs = options.own_allocs
self.limit = options.limit
self.counter = 0
self.cluster_nodes = options.cluster
self.only_counts = options.only_counts
self.strip_filenames = not options.keep_filenames
if options.keep_paths:
self.strip_paths = None
else:
self.strip_paths = re.compile("/.*[^ ]/")
if options.keep_types:
self.strip_args = None
else:
self.strip_args = re.compile("\(.*\)")
self.parse_info(graph.info)
# these need to be overridden in subclasses
self.heading = ""
def parse_info(self, traceinfo):
keys = traceinfo.keys()
keys.sort()
info = None
for key in keys:
if key not in ('arch', 'filter', 'pid', 'process', 'timestamp'):
continue
if info:
info += ", %s=%s" % (key, traceinfo[key])
else:
info = "%s=%s" % (key, traceinfo[key])
self.info = info
def header(self):
"output GraphViz header with help, layout and style options"
print """
# Convert this to PostScript with:
# dot -Tps -o alloc-graph.ps <this file>
digraph alloc_graph {
# set layout options
center=1;
#ratio=fill;
ratio=compress;
# avoid line crossing
#mclimit=2.0;
# page size A4
page="11.69,8.27";
size="9.69,6.27";
#page="8.27,11.69";
#size="6.27,9.69";
margin="1.0";
# set style options
color="black";
bgcolor="white";
node [shape="ellipse"];
edge [dir="forward" arrowsize="2"];
labelloc="t";
label="%s\\n\\n(%s)";
""" % (self.heading, self.info)
def footer(self):
"output GraphViz footer"
print "}"
def set_direction_tb(self):
self.next = self.next_down
self.prev = self.next_up
def set_direction_bt(self):
self.next = self.next_up
self.prev = self.next_down
def next_up(self, node):
return node.parent_list()
def next_down(self, node):
return node.child_list()
def node_ok(self, node):
"override in subclass to limit which nodes are output"
return True
def output(self):
"output GraphViz report"
self.header()
self.edges = []
clusters = {}
others = {}
for node in self.nodes.itervalues():
if not self.node_ok(node):
continue
# cluster name is just the library or code file name
name = node.name
pos = name.rfind(' in ')
if pos > 0:
name = name[pos+4:]
pos = name.rfind(":")
if pos > 0:
name = name[:pos]
if name in self.cluster_nodes:
if name in clusters:
clusters[name][node.name] = node
else:
clusters[name] = { node.name: node }
else:
others[node.name] = node
for key in clusters.iterkeys():
print "subgraph \"cluster-%s\" {" % key
self.output_cluster(clusters[key])
print "}"
self.output_cluster(others)
for edge in self.edges:
print edge
self.footer()
def output_cluster(self, nodes):
"output all nodes in a cluster"
for node in nodes.values():
#if self.node_ok(node):
self.output_node(node)
def reduce_name(self, name):
"reduce node names as requested"
if self.strip_args:
name = self.strip_args.sub("()", name)
if self.strip_paths:
name = self.strip_paths.sub("/", name)
if self.strip_filenames:
offset = name.find(" at ")
if offset > 0:
name = name[:offset]
# wrap filenames
return name.replace(" at ", "\\nat ", 1).replace(" in ", "\\nin ", 1)
def output_node(self, node):
"output node name and links to its children"
# this node
percentage = node.percentage
style = ""
if percentage >= self.emph_limit:
if node.own_allocs >= self.own_allocs:
style = " style=filled fillcolor=lightgray color=red"
else:
style = " style=bold color=red"
if node.leafnode:
style += " shape=diamond style=filled fillcolor=lightgray"
nid, size, count = node.nid, node.size, node.count
name = self.reduce_name(node.name)
if self.only_counts:
detail_str = "%d%% (%d)" % (percentage, count)
else:
detail_str = "%d%% (%dKB / %d)" % (percentage, (size + 512) / 1024, count)
print "N%d [label=\"%s\\n%s\"%s];" % (nid, name, detail_str, style)
# child nodes
self.output_child_nodes(node)
def output_child_nodes(self, node):
# TODO: add node labels for how much given route counts for
size = 0
funs = 0
count = 0
percentage = 0
for next_id in self.next(node):
next = self.nodes[next_id]
if self.node_ok(next):
self.edges.append("N%d -> N%d;" % (node.nid, next.nid))
elif self.show_leafs:
percentage += next.percentage
count += next.count
size += next.size
funs += 1
if funs:
if self.only_counts:
detail_str = "%d%% (%d)" % (percentage, count)
else:
detail_str = "%d%% (%dKB / %d)" % (percentage, (size + 512)/1024, count)
if funs > 1:
print "U%d [label=\"%d functions\\n%s\"];" % (self.counter, funs, detail_str)
else:
print "U%d [label=\"a function\\n%s\"];" % (self.counter, detail_str)
self.edges.append("N%d -> U%d;" % (node.nid, self.counter))
self.counter += 1
class TopmostGraphReport(GraphvizReport):
"report showing a graph from topmost allocation to stack top or bottom"
def __init__(self, graph, options):
GraphvizReport.__init__(self, graph, options)
self.heading = "Functions involved in %d topmost allocations" % options.depth
# trace addresses from highest to lowest
graph.addresses.sort()
graph.addresses.reverse()
# how many of the highest address traces we want?
self.traces = []
for addr,tid in graph.addresses[:options.depth]:
print "# 0x%x: %d" % (addr, tid)
self.traces.append(tid)
def node_ok(self, node):
# called by superclass output_child_nodes()
for tid in self.traces:
if tid in node.traces:
return True
return False
class AllocGraphReport(GraphvizReport):
"report showing a graph including the largest backtraces"
def __init__(self, graph, options):
"init GraphViz graph limits"
GraphvizReport.__init__(self, graph, options)
if options.only_counts:
self.heading = "Allocation counts callgraph\\nFunctions which did or caused >=%.1f%% of all allocations" % self.limit
else:
self.heading = "Allocation size totals callgraph\\nFunctions through which >=%.1f%% amount of the allocations total were done" % self.limit
def node_ok(self, node):
# called by superclass output_child_nodes()
if node.percentage >= self.limit:
return True
return False
# --------------------- Options class -----------------
class Options:
"command line option help, parsing and storage"
cluster = [] # which libraries should be clustered
include_funs = [] # include only backtraces matching these functions
include_args = [] # include only backtraces with functions having these args
ignore = [] # prefixes of functions to ignore in the backtraces
ignore_middle = [] # substrings of function to ignore in the backtraces
noignore = [] # prefixes of functions not to ignore
stop = [] # prefixes of functions where to end backtrace
joins = [] # libs/sources which should be shown with single node
node = [] # which node(s) to show info about
depth = 3 # depends on report type
limit = 10 # maximum percentage of total to show
size_min = 0 # minimum alloc size to include
size_max = 0x7fffffff # maximum alloc size to include
emph_limit = 60 # maximum percentage of total to emphasize
own_allocs = 1.2 # factor by how much own allocs can exceed child allocs
report = "graph" # report type
direction = "tb" # traversal direction
leafs = True # leave out leaf nodes with only 1 parent?
intermediate = True # leave out intermediate nodes?
show_below_limit = True # show how many nodes were below limit
keep_filenames = True # keep filenames in nodes
keep_paths = False # keep full paths in nodes
keep_types = False # keep C++ signature types in nodes
show_args = False # replace trace function names with arg values?
references = False # whether to show what references --node=<node>
progress = False # whether to show number of each read record
verbose = False # whether to have some extra output
only_counts = False # whether to talk in output about KBs or count
tracefile = None # resource trace file
def __init__(self, argv):
"parse command line options"
name = argv[0].split('/')[-1]
if len(argv) < 2:
self.help(name, "not enough arguments")
for arg in argv[1:]:
if arg[:2] == "--":
if arg.find('=') < 1:
self.help(name, "option value missing", arg)
option, value = arg[2:].split("=")
else:
if arg[0] == "-":
self.help(name, "Unknown option", arg)
self.tracefile = arg
continue
# check/get options
if option == "limit":
value = float(value)
if value > 0.0 and value < 100.0:
self.limit = float(value)
else:
self.help(name, "invalid limit percentage", arg)
elif option == "emph-limit":
value = float(value)
if value >= 1.0 and value < 100.0:
self.emph_limit = float(value)
else:
self.help(name, "invalid emph-limit percentage", arg)
elif option == "own-alloc-emph-limit":
value = float(value)
if value > 1.0 and value < 10.0:
self.own_allocs = float(value)
else:
self.help(name, "invalid own-alloc-emph-limit factor", arg)
elif option == "size-range":
try:
(mins, maxs) = value.split('-')
if mins:
self.size_min = int(mins)
if maxs:
self.size_max = int(maxs)
except ValueError:
self.help(name, "invalid --size-range range/values", arg)
elif option == "depth":
value = int(value)
if value >= 1 and value < 100:
self.depth = int(value)
else:
self.help(name, "invalid percentage limit", arg)
elif option == "join":
self.joins.append(value)
elif option == "ignore":
self.ignore.append(value)
elif option == "ignore-middle":
self.ignore_middle.append(value)
elif option == "no-ignore":
self.noignore.append(value)
elif option == "ignore-font-allocs":
if not self.get_bool(name, arg, value):
continue
self.ignore += [
"FT_Realloc", "FT_Alloc", "FT_QAlloc", "FcStrCopy",
"ft_alloc", "ft_realloc", "ft_mem_"
]
elif option == "ignore-libc-allocs":
if not self.get_bool(name, arg, value):
continue
self.ignore += [
"malloc", "realloc", "calloc", "operator new",
"memdup", "strdup", "strndup"
]
elif option == "ignore-glib-internals":
if not self.get_bool(name, arg, value):
continue
self.ignore += [
"libglib-2.0", "libgobject-2.0",
"g_type_", "g_object_", "g_signal_",
"g_enum_", "g_quark_", "g_param_", "g_pattern_spec_",
"g_hash_table_", "g_slist_", "g_list_", "g_array_",
"g_malloc", "g_realloc", "g_try_", "g_mem_", "g_slice_",
"g_strdup", "g_strndup", "g_memdup",
"g_main_context_", "g_main_dispatch",
"g_closure_", "g_cclosure_"
]
# "tsearch", "dcgettext"]
elif option == "ignore-gtk-internals":
if not self.get_bool(name, arg, value):
continue
self.ignore += [
"gtk_binding_", "_gtk_", "gtk_container_", "png_zalloc"
]
# "gtk_widget_size_request",
# "gtk_widget_map", "gtk_widget_realize",
elif option == "ignore-qt-internals":
if not self.get_bool(name, arg, value):
continue
self.ignore += [
"libQt", "qMalloc", "qRealloc",
"QString::", "QByteArray::", "QMutex",
"QHashData::", "QListData::", "QMapData::",
"QVectorData::", "QString in ", "QEventLoop::",
"QMetaObject::", "QObjectPrivate::", "QObject::",
"QMetaCallEvent::", "QMetaMethod::",
"qDBusAddSpyHook", "qt_plugin_instance"
]
# these can cause complication calls/loops
# and QVariant is a template class
self.ignore_middle += [
"::event", "::notify", "::sendPostedEvents",
"::qt_metacall", " QVariant::"
]
elif option == "stop":
self.stop.append(value)
elif option == "node":
self.node.append(value)
elif option == "cluster":
self.cluster.append(value)
elif option == "include-only":
self.include_funs.append(value)
elif option == "include-only-args":
self.include_args.append(value)
elif option == "show-below-limit":
self.show_below_limit = self.get_bool(name, arg, value)
elif option == "keep-filenames":
self.keep_filenames = self.get_bool(name, arg, value)
elif option == "keep-paths":
self.keep_paths = self.get_bool(name, arg, value)
elif option == "keep-types":
self.keep_types = self.get_bool(name, arg, value)
elif option == "show-references":
self.references = self.get_bool(name, arg, value)
elif option == "show-args":
self.show_args = self.get_bool(name, arg, value)
elif option == "intermediate":
self.intermediate = self.get_bool(name, arg, value)
elif option == "leafs":
self.leafs = self.get_bool(name, arg, value)
elif option == "progress":
self.progress = self.get_bool(name, arg, value)
elif option == "verbose":
self.verbose = self.get_bool(name, arg, value)
elif option == "type":
if value in ("count", "graph", "topmost"):
if value == "count":
self.only_counts = True
else:
self.only_counts = False
self.report = value
else:
self.help(name, "unknown report type", arg)
elif option == "direction":
if value in ("tb", "bt"):
self.direction = value
else:
self.help(name, "unknown direction", arg)
else:
self.help(name, "unknown option", arg)
self.validate_input(name)
debug(" ".join(argv))
if self.verbose:
self.show_options()
def show_options(self):
debug("file = '%s' (rtrace leak file to parse)" % self.tracefile)
debug("report = '%s' (report type)" % self.report)
debug("direction = '%s' (report traversal direction)" % self.direction)
debug("depth = %d (number of topmost graphs to show)" % self.depth)
debug("limit = %.1f (show only nodes with amount of allocs >= given %% of total)" % self.limit)
debug("emph_limit = %.1f (emphatize nodes with amount of allocs >= given %% of total)" % self.emph_limit)
debug("size-range = %d-%d (which sizes allocs to incluce)" % (self.size_min, self.size_max))
debug("leafs = %s (include leaf nodes?)" % self.bool_str(self.leafs))
debug("intermediate = %s (include intermediate nodes?)" % self.bool_str(self.intermediate))
debug("show_args = %s (show also trace function args?)" % self.bool_str(self.show_args))
debug("show_below_limit = %s (show how many nodes were below limit?)" % self.bool_str(self.show_below_limit))
debug("keep_filenames = %s (show filenames in graphs?)" % self.bool_str(self.keep_filenames))
debug("keep_paths = %s (show full file paths in graphs?)" % self.bool_str(self.keep_paths))
debug("keep_types = %s (show C++ method signature args?)" % self.bool_str(self.keep_types))
self.show_list(self.cluster, "Cluster functions within given sources/libs together")
self.show_list(self.include_funs, "Include only backtraces having node(s) which names start with these prefixes")
self.show_list(self.include_args, "Include only backtraces which traced functions contain these args")
self.show_list(self.ignore, "Ignore in backtraces nodes which names have these prefixes")
self.show_list(self.ignore_middle, "Ignore in backtraces nodes which names have these substrings")
self.show_list(self.noignore, "However, do not ignore nodes which names have these prefixes")
self.show_list(self.stop, "Stop backtrace parsing to nodes which names have these prefixes")
self.show_list(self.joins, "Merge nodes for these functions/source files/libraries together")
self.show_list(self.node, "Nodes about which to show debug information")
debug("references = %s (whether to show debug node references)" % self.bool_str(self.references))
debug("verbose = %s" % self.bool_str(self.verbose))
def show_list(self, items, title):
if items:
debug("%s:" % title)
for name in items:
debug("- %s" % name)
def validate_input(self, app):
if not os.path.isfile(self.tracefile):
self.help(app, "resource trace report '%s' missing" % self.tracefile)
fi = open(self.tracefile, "r")
if fi.read(8) != "version=":
self.help(app, "'%s' isn't an ASCII resource trace file" % self.tracefile)
fi.close()
def bool_str(self, value):
if value: return "yes"
else: return "no"
def get_bool(self, name, arg, value):
"check whether value is yes/no and returns corresponding bool"
if value in ["yes", "no"]:
return (value == "yes")
else:
self.help(name, "value should be either 'yes' or 'no'", arg)
def help(self, name, errstr, arg=None):
"show command line help and exit"
debug(__doc__)
debug("usage: %s [options] <rtrace-report> > report.dot" % name)
debug("""
Options for callgraph output type and looks:
--type=value, value is one of following report types:
graph -- output a callgraph for largest allocation amounts (default)
count -- output a callgraph for largest allocation counts
topmost -- output a callgraph for allocations with highest addresses
--direction=value, value is one of:
tb -- top to bottom i.e. from main() towards allocs (default)
bt -- bottom to top i.e. from allocs towards main()
Default is "tb".
--depth=value, value is an integer (1-99) used in following reports:
topmost -- how many of the traces to highest allocations are shown
Default is 3.
--emph-limit=value, value (1.0-100) is used with "graph"
report type to indicate that functions responsible for more
than the given percentage of allocations should be emphasized.
Default is 60.0.
--own-alloc-limit=value, value is a factor (1.1 - 10) used with
"graph" report type to indicate how much single node needs to do
more allocations than all of its children together for it to be
emphatized. Only nodes whose allocations go over emph-limit
are checked for this. Default is 1.2.
--cluster=value, value is the name for the library or code file
which nodes should be clustered together. Note: this option
is useless with recursive backtraces, for UI applications you
should use also --ignore-glib/gtk/qt-internals=yes.
Additional options for filtering which kind of allocation information
is included to the callgraph:
--size-range=[smallest]-[largest], values are sizes of the smallest
and largest allocations that will be included to the graph.
Additional options for reducing the number of nodes (backtrace items)
included to the callgraph:
--limit=value, value (0.1-100) is used with "graph" and
"html" report types to limit the number of functions included
into the output. The given value is a percentage of how much
of the allocation come through that function. Default is 10.0.
--show-below-limit=no, do not show nodes telling how many called
nodes were below the specified limit.
--stop=value, value is a string prefix of node name where
each backtrace parsing stops. Using something like
"--stop=g_cclosure_marshal --stop=_gtk_marshal" could for
example make Gtk memory allocation graphs more readable.
--include-only=value, value is a string prefix for node names
for backtraces that should be included into report. All backtraces
NOT containing a node name starting with these substrings are ignored.
--include-only-args=value, value is a substring for traced function
argument values for which backtraces should be included into report.
All backtraces NOT having an argument value that contains this
substring are ignored. NOTE: this needs also --show-args=yes option
and it cannot work properly for merged backtraces.
--ignore=value, value is a string prefix for node names which should be
ignored in the backtraces. Useful for making graphs more readable.
After enabling the shortcut options below, use --verbose to see what
actually is ignored by them.
--ignore-middle=value, like --ignore, but given substring is for any
part of the function names (slower, but useful e.g. with template
functions as their names are prefixed with a changing return type).
--no-ignore=value, value is a string prefix for node names that should
be kept in the backtraces although they were matched by --ignore
or --ignore-middle.
--ignore-font-allocs=yes, shortcut to ignore lower level freetype
and fontconfig allocation functions.
--ignore-libc-allocs=yes, shortcut to ignore lower level C and
C++ allocation functions to make graphs more readable.
--ignore-glib-internals=yes, shortcut to ignore several lowest level
allocation, standard data structure handling and recursed functions
called by Glib programs. You might want to try also these options:
--intermediate=no --leafs=no --show-below-limit=no
--ignore=dcgettext --ignore=tsearch
--join=ld-<version>.so
--ignore-gtk-internals=yes, shortcut to ignore several recursed
functions called by Gtk applications. You might want to try
also these options:
--ignore-libc-allocs --ignore-font-allocs
--ignore=gtk_widget_ --no-ignore=gtk_widget_show
--ignore=gdk_window_ --ignore=_gdk_window_
--join=fontconfig
--ignore-qt-internals=yes, shortcut to ignore lower level
Qt allocation, standard data structure handling and
marshalling/recursed functions called by Qt applications.
You might want to try also these options:
--ignore-glib-internals=yes
--join=value, value is the name for the function, library or source
code file which nodes should be represented as a single node.
--intermediate=no, leave out nodes with only one parent and child.
--leafs=no, leaves out nodes which have either:
- one parent and no children, or
- one child and no parents
--show-args=yes, show trace functions arguments in addition to trace
functions. This can be used e.g. to get callgraphs to individual
file names in file descriptor traces. To limit this set, use
--include-only-args option.
Options affecting what information is shown in graph nodes:
--keep-types=yes, whether to show overloaded C++ method signature
arguments. These are removed by default as they can be very long.
--keep-paths=yes, whether to show full paths in file names.
By default path is stripped out.
--keep-filenames=no, whether to show source filenames (+ line numbers).
Shown by default.
Debug options:
--node=value -- shows internal information about given node.
'*' will show information about all nodes.
--show-references=yes -- show what other nodes refer nodes
specified with the above option.
--progress=yes -- show running parsed alloc/free record index number.
--verbose=yes -- show more information about internal state.
Following options can be given as many times as you wish and they apply
to all values you gave (of other options, only the last value applies):
--stop
--ignore
--include-only
--include-only-args
--cluster
--join
--node
Typical first usage looks like this:
rtrace-calltree --limit=4 app.rtrace.txt > app.dot
Graphs are output in 'dot' format. You will need GraphViz package to
generate PostScript, SVG, PNG etc. formats out of them, or use e.g.
"xdot.py" tool to view them.""")
if errstr:
if arg:
errstr += " (for '%s')" % arg
error_exit(errstr)
sys.exit(1)
# ----------------------- main ------------------------
if __name__ == "__main__":
options = Options(sys.argv)
if options.only_counts:
graph = Graph(0)
else:
# assumed per allocation overhead 8 bytes
graph = Graph(8)
debug("\nParsing...")
leaks = Rtrace(options.tracefile)
(graph, ignores, nonames) = leaks.parse(graph, options)
debug(graph.stats_oneliner())
debug("Ignored %d backtrace lines with just hex addresses." % nonames)
debug("Ignored %d other backtrace lines, as requested." % ignores)
debug("Finalizing backtrace graph and stats...")
stats = graph.complete()
# debug output for selected nodes
graph.show_nodes(options.node, options.references, {})
# if requested, remove unwanted nodes
graph.trim(options)
debug("Writing report...")
if options.report == "topmost":
report = TopmostGraphReport(graph, options)
else:
report = AllocGraphReport(graph, options)
if options.direction == "tb":
report.set_direction_tb()
else:
report.set_direction_bt()
report.output()