This repository has been archived by the owner on Apr 23, 2024. It is now read-only.

Speed up the sampling profiler
Summary:
The sampling profiler is designed to grab 250 samples per second, but it wasn’t
going nearly that fast, and the sampling times weren’t continuous (it would
sometimes grab many samples in succession, then wait hundreds of milliseconds
before the next sample). This slowness was due to time spent in
traceback.extract_stack, which I believe had highly variable performance because
of its use of the linecache. The fix (inspired by
https://github.com/bdarnell/plop ) was to instead grab the stack information
directly. This means that we no longer have the text of the lines at each stack
frame, but that's probably ok (it only showed up in the tooltip anyway, and it's
not provided by the instrumented profiler).
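
For illustration, the direct stack walk described above can be sketched roughly as follows. This is a minimal standalone sketch, not the code from this commit: walk_stack, sample_thread, and example_function are hypothetical names chosen for the example. Only sys._current_frames() and the frame attributes f_code, f_lineno, and f_back come from the standard library.

```python
import sys
import threading


def walk_stack(active_frame):
    """Collect (filename, line_num, function_name) triples, innermost first.

    Hypothetical helper: it reads only attributes already present on the
    frame and code objects, so it never opens source files the way
    traceback.extract_stack does through linecache.
    """
    stack_trace = []
    frame = active_frame
    while frame is not None:
        code = frame.f_code
        stack_trace.append((code.co_filename, frame.f_lineno, code.co_name))
        frame = frame.f_back
    return stack_trace


def sample_thread(thread_id):
    """Return the current stack of the given thread, or None if not found."""
    active_frame = sys._current_frames().get(thread_id)
    if active_frame is None:
        return None
    # Walk immediately: the frame keeps changing as the thread runs.
    return walk_stack(active_frame)


def example_function():
    # Sample the calling thread itself, purely for demonstration.
    return sample_thread(threading.current_thread().ident)


stack = example_function()
innermost_names = [entry[2] for entry in stack[:2]]
```

Here innermost_names lists the two innermost function names of the sampled stack. Each sample costs only a few attribute reads per frame, which is presumably why its timing is more predictable than a call that may read source files.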

With this change, the number of samples that we get on the homepage in
dev_appserver increased from about 70 to about 300. With the “time.sleep” line
removed, the new sampler is about 100x faster than the old one, rather than 4x.

Test Plan:
Load the logged-in homepage in dev_appserver with sampling enabled. Make sure
the render_learning_dashboard frames add up to something close to 100%.
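
One way to check the "close to 100%" expectation is to express each aggregated frame key as a share of all samples. The following is a rough sketch under assumptions: aggregate_percentages and example_samples are hypothetical, and only the key format mirrors the one used in results() in the diff.

```python
from collections import defaultdict


def aggregate_percentages(samples):
    """Map each "filename:line (function)" key to the % of samples containing it.

    `samples` is assumed to be a list of stack traces, each a list of
    (filename, line_num, function_name) triples.
    """
    counts = defaultdict(int)
    for stack_trace in samples:
        for filename, line_num, function_name in stack_trace:
            # Like the aggregation in the diff, this counts every occurrence,
            # so a key repeated within one sample is counted more than once.
            counts["%s:%s (%s)" % (filename, line_num, function_name)] += 1

    total_samples = len(samples)
    if total_samples == 0:
        return {}
    return dict((key, 100.0 * count / total_samples)
                for key, count in counts.items())


# Made-up sample data, innermost frame first.
example_samples = [
    [("a.py", 1, "f"), ("a.py", 10, "main")],
    [("a.py", 2, "g"), ("a.py", 10, "main")],
]
percentages = aggregate_percentages(example_samples)
```

With this made-up data, the outermost frame appears in both samples and the two inner frames in one sample each.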

Reviewers: ben, chris

Reviewed By: ben

CC: alpert

Differential Revision: http://phabricator.khanacademy.org/D6646
Alan Pierce committed Feb 12, 2014
1 parent 664176e commit 3454b06
Showing 1 changed file with 29 additions and 9 deletions.
38 changes: 29 additions & 9 deletions sampling_profiler.py
@@ -23,7 +23,6 @@
 import sys
 import time
 import threading
-import traceback

 from . import util

@@ -49,9 +48,9 @@ def run(self):
         This will run, periodically inspecting and then sleeping, until
         manually stopped via stop()."""

         # Keep sampling until this thread is explicitly stopped.
         while not self.should_stop():

             # Take a sample of the main request thread's frame stack...
             self.profile.take_sample()

@@ -61,8 +60,28 @@ def run(self):

 class ProfileSample(object):
     """Single stack trace sample gathered during a periodic inspection."""
-    def __init__(self, stack):
-        self.stack_trace = traceback.extract_stack(stack)
+    def __init__(self, stack_trace):
+        # stack_trace should be a list of (filename, line_num, function_name)
+        # triples.
+        self.stack_trace = stack_trace
+
+    @staticmethod
+    def from_frame(active_frame):
+        """Creates a profile from the current frame of a particular thread.
+        The "active_frame" parameter should be the current frame from some
+        thread, as returned by sys._current_frames(). Note that we must walk
+        the stack trace up-front at sampling time, since it will change out
+        from under us if we wait to access it."""
+        stack_trace = []
+        frame = active_frame
+        while frame is not None:
+            code = frame.f_code
+            stack_trace.append(
+                (code.co_filename, frame.f_lineno, code.co_name))
+            frame = frame.f_back
+
+        return ProfileSample(stack_trace)


 class Profile(object):
@@ -83,9 +102,9 @@ def results(self):
         total_samples = len(self.samples)

         for sample in self.samples:
-            for filename, line_num, function_name, src in sample.stack_trace:
-                aggregated_calls["%s\n\n%s:%s (%s)" %
-                        (src, filename, line_num, function_name)] += 1
+            for filename, line_num, function_name in sample.stack_trace:
+                aggregated_calls["%s:%s (%s)" %
+                        (filename, line_num, function_name)] += 1

         # Turn aggregated call samples into dictionary of results
         calls = [{
@@ -108,17 +127,18 @@ def results(self):
     def take_sample(self):
         # Look at stacks of all existing threads...
         # See http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
-        for thread_id, stack in sys._current_frames().items():
+        for thread_id, active_frame in sys._current_frames().items():
             # ...but only sample from the main request thread.
             # TODO(kamens): this profiler will need work if we ever
             # actually use multiple threads in a single request and want to
             # profile more than one of them.
             if thread_id == self.current_request_thread_id:
                 # Grab a sample of this thread's current stack
-                self.samples.append(ProfileSample(stack))
+                self.samples.append(ProfileSample.from_frame(active_frame))

     def run(self, fxn):
         """Run function with samping profiler enabled, saving results."""
+
         if not hasattr(threading, "current_thread"):
             # Sampling profiler is not supported in Python2.5
             logging.warn("The sampling profiler is not supported in Python2.5")