more precise structural summarization instead of NLP #105

mircealungu · 2018-05-03T14:52:26Z

@bogdanp05 , one idea that we should think about is the following:

instead of summarizing stuff with NLP maybe we can take inspiration from the python profiler, which has the same task to achieve as the one we have: summarize multiple stack traces.

the profiler, afaik, wakes up multiple times, observes the stack trace, and then it summarizes everything at the end with a view like the following one:

maybe we could do something similar, since after all, we also have a bunch of stack traces that we want to summarize!
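To make the "wake up and observe the stack" idea concrete, here is a minimal sketch of taking one sample of every running thread's stack with the standard library. `snapshot_stacks` and `handle_request` are hypothetical names used only for illustration, and a real sampler would presumably call the snapshot function periodically from a separate thread rather than from the request itself.

```python
import sys
import threading
import traceback


def snapshot_stacks():
    """Return {thread_id: [function names, outermost call first]}."""
    stacks = {}
    # sys._current_frames() maps each thread id to that thread's topmost frame.
    for thread_id, frame in sys._current_frames().items():
        summary = traceback.extract_stack(frame)  # oldest call first
        stacks[thread_id] = [entry.name for entry in summary]
    return stacks


def handle_request():
    # Hypothetical endpoint body, standing in for the code being observed.
    return snapshot_stacks()


sample = handle_request()
my_stack = sample[threading.get_ident()]
print(my_stack[-2:])  # the two innermost calls of the current thread
```

Each sample is just a list of function names per thread, which is the raw material for any of the summaries discussed here.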

sure, the profiler estimates the time spent in a given method, whereas we need to count the number of times an outlier was found in that method when called from the previous method, but it should be a similar thing.
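One possible reading of "the number of times the outlier was found in that method, when called from the previous method" is a count of (caller, callee) pairs over all traces. The sketch below assumes each trace is already a list of function names, outermost first; the function names in the example data are made up.

```python
from collections import Counter


def count_call_pairs(traces):
    """Count how many traces contain each (caller, callee) pair."""
    pairs = Counter()
    for trace in traces:
        # A set makes a pair that repeats inside one trace (e.g. recursion)
        # count only once for that trace.
        pairs.update(set(zip(trace, trace[1:])))
    return pairs


# Hypothetical traces of two outlier requests.
traces = [
    ["index", "get_users", "query_db"],
    ["index", "get_users", "render"],
]
pairs = count_call_pairs(traces)
for (caller, callee), n in sorted(pairs.items()):
    print(f"{caller} -> {callee}: {n}")
```

Unlike a profiler's time estimate, the number here is a plain frequency: how many outliers were observed inside `callee` while it was called from `caller`.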

think about this, and let's discuss it the next time we meet!

bogdanp05 · 2018-05-03T16:03:17Z

Ok, this approach looks like it could provide way more useful results than NLP.
It also means changing the way we collect outliers, right? Because right now we don't have any duration info, just the stack trace.

mircealungu · 2018-05-03T20:54:24Z

i'm not sure exactly how these time-sampling profilers work.

i'm imagining something like this:

stack trace 1
a
- b
-- c
--- d

stack trace 2
a
- b
-- c
--- e

could in theory be summarized like this:

a --> 2
- b --> 2
-- c --> 2
--- d --> 1
--- e --> 1

i think that exploring something like this might be one way of summarizing many traces, right?
what do you think?

mircealungu · 2018-05-03T20:56:24Z

basically, what i think i'm saying is that every stack trace is a graph (directed, acyclic, actually a graph degenerated into a list)

thus, we could summarize them with a prefix tree where every node has the count of paths that pass through it. or something like this, it's late now :)
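A minimal sketch of such a prefix tree, assuming every stack trace is a list of function names ordered from the outermost call to the innermost one. `TraceNode` is a hypothetical name, not something taken from the dashboard's code.

```python
class TraceNode:
    """A function in the prefix tree plus the number of traces through it."""

    def __init__(self, name=None):
        self.name = name
        self.count = 0
        self.children = {}  # function name -> TraceNode

    def add(self, trace):
        """Insert a trace (list of names, outermost first) below this node."""
        node = self
        for name in trace:
            if name not in node.children:
                node.children[name] = TraceNode(name)
            node = node.children[name]
            node.count += 1  # one more path passes through this node

    def render(self, depth=0):
        """Return the summary as lines, one dash per nesting level."""
        lines = []
        for child in self.children.values():
            prefix = "-" * depth + (" " if depth else "")
            lines.append(f"{prefix}{child.name} --> {child.count}")
            lines.extend(child.render(depth + 1))
        return lines


root = TraceNode()
root.add(["a", "b", "c", "d"])
root.add(["a", "b", "c", "e"])
summary = root.render()
print("\n".join(summary))
```

Shared prefixes are stored once, so the tree stays small when many outliers follow the same call path and only differ in the last few frames.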

bogdanp05 · 2018-05-04T08:08:18Z

Ok, I think I see what you mean. I will look into this.

bogdanp05 · 2018-05-16T12:11:32Z

I worked on this issue and the script I have so far is here.
First, I parsed the stack traces as they appeared in the db and tried to represent them in a unified way (one stack trace element per line).
Then, I represented each element in the stack trace as a unique 4-element tuple:
(file_name, line_number, function, text of line).
What's left now is to actually visualize one list of tuples (i.e. a stack trace) as a tree, and then merge such trees together.
There are still 2 points that should be discussed here:

1. Over different versions, an endpoint can change its code significantly and this means that also the stack traces will be different. Thus, it might be better to visualize the stack traces of outliers per endpoint per version.
2. A stack trace of an outlier contains all the individual stack traces of the running threads. Visualizing one such stack trace could be done by simply having one tree per thread. But how could we merge these resulting trees for multiple outliers?
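The parsing step (one stack trace element per line, each turned into a 4-element tuple) could look roughly like the sketch below. It assumes the stored text uses the layout produced by Python's `traceback.format_stack` (a `File "...", line N, in func` line followed by the source line); the actual format in the db may differ. `StackElement` and `parse_stack_trace` are hypothetical names.

```python
import re
from typing import List, NamedTuple

# Assumed layout of one frame header: File "<path>", line <n>, in <function>
FRAME_RE = re.compile(r'^\s*File "(?P<file>.+)", line (?P<line>\d+), in (?P<func>.+)$')


class StackElement(NamedTuple):
    file_name: str
    line_number: int
    function: str
    text: str


def parse_stack_trace(raw: str) -> List[StackElement]:
    """Turn raw traceback text into a list of 4-element tuples."""
    elements = []
    lines = raw.splitlines()
    i = 0
    while i < len(lines):
        match = FRAME_RE.match(lines[i])
        if match:
            text = ""
            # The source line, when present, directly follows the header line.
            if i + 1 < len(lines) and not FRAME_RE.match(lines[i + 1]):
                text = lines[i + 1].strip()
                i += 1
            elements.append(
                StackElement(match["file"], int(match["line"]), match["func"], text)
            )
        i += 1
    return elements


# Made-up example input.
raw = '''  File "app.py", line 10, in index
    return get_users()
  File "db.py", line 42, in get_users
    rows = query()
'''
parsed = parse_stack_trace(raw)
print(parsed)
```

Because the tuples are hashable, they can be used directly as dictionary keys when the traces are later merged into a tree.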

mircealungu · 2018-05-17T22:05:46Z

1. agreed with visualizing per version
2. interesting. can't we just create a big graph that contains all the stack traces, w/o worrying about the individual thread where the action happens?
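A rough sketch of that idea: treat every thread's trace from every outlier as just another path and count how often each call-path prefix occurs. It assumes an outlier is available as a mapping from thread name to a list of function names (outermost first); `merge_outliers` and all names in the sample data are hypothetical.

```python
from collections import Counter


def merge_outliers(outliers):
    """Count, for every call-path prefix, how many thread traces contain it."""
    prefix_counts = Counter()
    for threads in outliers:
        # Thread identity is deliberately ignored: each trace is one more path.
        for trace in threads.values():
            for depth in range(1, len(trace) + 1):
                prefix_counts[tuple(trace[:depth])] += 1
    return prefix_counts


# Two made-up outliers, each with the traces of two threads.
outliers = [
    {"MainThread": ["run", "index", "query_db"], "Thread-1": ["run", "collect_stats"]},
    {"MainThread": ["run", "index", "render"], "Thread-1": ["run", "collect_stats"]},
]
counts = merge_outliers(outliers)
# Sorting the prefixes lists every path right after its parent path.
for path, n in sorted(counts.items()):
    print("-" * (len(path) - 1), path[-1], "-->", n)
```

The trade-off is that the merged counts no longer say which thread a frame came from, which seems acceptable as a first version.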

bogdanp05 · 2018-05-18T11:09:36Z

We'll probably go for that, at least in the beginning.

bogdanp05 · 2018-06-11T14:53:49Z

Implemented in #164

bogdanp05 self-assigned this May 3, 2018

FlyingBird95 added this to To do in Flask-MonitoringDashboard May 4, 2018

bogdanp05 closed this as completed Jun 11, 2018

Flask-MonitoringDashboard automation moved this from To do to Done Jun 11, 2018
