Merge eugeneia/snabb:timeline-raptorjit into Vita #65

Open
wants to merge 43 commits into
base: master
from

eugeneia commented Dec 7, 2018

This builds on @lukego’s work on the timeline log, a probabilistic flight recorder for Snabb, see: snabbco#849 snabbco#873 snabbco#916 snabbco#973 snabbco#1011 snabbco#1098 snabbco#1112

My current working branch for the timeline is at eugeneia/snabb:timeline-raptorjit. This branch merges that feature into Vita and adds some application-specific events (user events).

See #58 for some example plots.

lukego and others added some commits Nov 8, 2016

dasm_x86.lua: Add support for RDTSCP instruction
This is a very useful instruction for self-benchmarking programs that
want to read the CPU timestamp counter efficiently.

See Intel whitepaper for details:
http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf
core.timeline: Switch to double-float on disk
Use 'double' instead of 'uint64_t' for values in the timeline file.

This change is motivated by making timeline files easier to process in
R. In the future we may switch back to uint64_t for the TSC counter
and/or argument values for improved precision. The major_version file
header field can be used to avoid confusion.

The obvious downside to using doubles is that the TSC value will lose
precision as the server uptime increases (the TSC starts at zero and
increases at the base frequency of the CPU, e.g. 2GHz). The impact seems
to be modest though. For example, a 2GHz CPU would start rounding TSC
values to the nearest 128 (likely quite acceptable in practice) after
approximately 9 years of operation (2^59 cycles; with a 53-bit
significand, adjacent doubles are 2^7 apart from that point on).

So - storing the TSC as a double-float is definitely a kludge - but
unlikely to cause any real harm and expedient for the short-term goal of
putting this code to use without getting blocked due to e.g. my lack of
sophistication as an R hacker.
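
The precision trade-off can be checked numerically. The following is a minimal sketch, not code from core.timeline: `double_spacing` and `years_until` are hypothetical helpers, and the 2GHz base frequency is the same assumption used in the example above. It prints the gap between adjacent doubles once the counter reaches a given cycle count, and how long a 2GHz TSC takes to get there.

```lua
-- Hypothetical helper (not part of core.timeline): spacing between
-- adjacent doubles at magnitude x, for finite x >= 1.
local function double_spacing(x)
  local e, p = 0, 1.0
  while p * 2 <= x do
    p = p * 2
    e = e + 1
  end
  -- Now 2^e <= x < 2^(e+1); a double stores 52 fraction bits.
  return 2^(e - 52)
end

-- Years of uptime before the TSC reaches a given cycle count.
local function years_until(cycles, hz)
  return cycles / hz / (365.25 * 24 * 3600)
end

local HZ = 2e9  -- assumed base frequency
for _, cycles in ipairs({2^53, 2^56, 2^59}) do
  print(string.format("after %.2f years: spacing between doubles is %.0f",
                      years_until(cycles, HZ), double_spacing(cycles)))
end
```

Up to 2^53 cycles (about 52 days at 2GHz) every TSC value is still represented exactly; past that point the spacing doubles each time the uptime doubles.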
Merge snabbco/next into timeline-redux-next
Resolved conflict in app.lua between adding timeline events and the new
breath topological-sort machinery.
engine: Randomize timeline log level with math
Simplify the code and eliminate unwanted branches from the engine loop
by drawing a random timeline level from a log-uniform distribution that
mathematically favors higher log levels over lower ones.

Plucked log5() out of the air, i.e. each log level should be enabled for
5x more breaths than the one below.

Here is how the distribution of log level choice looks in practice using
this algorithm:

    > t = {0,0,0,0,0,0,0,0,0}
    > for i = 1, 1e8 do
         local n = math.max(1,math.ceil(math.log(math.random(5^9))/math.log(5)))
         t[n] = t[n]+1
      end
    > for i,n in ipairs(t) do print(i,n) end
    1       560
    2       2151
    3       10886
    4       55149
    5       273376
    6       1367410
    7       6844261
    8       34228143
    9       171120244

Note: Lua provides only natural logarithm functions but it is easy to
derive other bases from this (google "log change of base formula").
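
As a rough cross-check of the table above, the ideal share of breaths per level can be computed directly. This is a sketch under stated assumptions: `log_base`, `level_for`, and `level_probability` are illustrative names, not functions from the engine, and the ideal shares ignore floating-point rounding, which can move a draw that is an exact power of 5 to the neighbouring level.

```lua
local BASE, LEVELS = 5, 9

-- Change of base: log_b(x) = ln(x) / ln(b).
local function log_base(x, base)
  return math.log(x) / math.log(base)
end

-- Map a draw r in [1, BASE^LEVELS] to a level in [1, LEVELS],
-- mirroring the expression used in the snippet above.
local function level_for(r)
  return math.max(1, math.ceil(log_base(r, BASE)))
end

-- Ideal share of draws that land on level n: level n covers the
-- draws in (BASE^(n-1), BASE^n], and level 1 also absorbs r = 1.
local function level_probability(n)
  local lo = (n == 1) and 0 or BASE^(n - 1)
  return (BASE^n - lo) / BASE^LEVELS
end

for n = 1, LEVELS do
  print(n, string.format("%.7f", level_probability(n)))
end
```

Level 9 should receive 80% of the draws and each lower level one fifth of the level above it, down to level 2. That is consistent with the counts shown above.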
engine: Remove timeline packet payload sampling
I suspect that it is a misfeature for the timeline to sample the
contents of packets. Do we really want user data potentially appearing
in debug logs? Removed for now.
Merge remote-tracking branch 'snabbco/raptorjit' into timeline-raptorjit
Clean up the timeline integration in core.app a little along with the merge.
core.app: set timeline log level at the very end of breathe loop
This fixes a bug where the timeline log level was rerolled after the end of a
breath but before the post-breath events, causing sampling to affect the event
lag of the polled_timers event.
core.timeline: decouple log level from event rate
Changes the syntax of event specs to

  <level>,<rate>|<eventname>: ...

The previous level digit becomes the event’s "rate" and retains its semantics
with regard to the logging frequency of the specified event. The "stack depth"
of the event is now decoupled as the new, leading level digit and specified
independently. The new level semantics are as follows:

 - level ranges from 0-9 (10 levels in total)
 - 0 is the topmost level while 9 is the lowest
 - levels 0-4 are reserved for use by the engine
 - user applications can use levels 5-9 to create hierarchy in their events

Caveat: users should avoid defining events with a higher level and a lower
event rate than an enclosed event if the higher level event is supposed to
serve as a latency anchor for the lower level event.

  RIGHT                  WRONG
  5,3|op_start:          5,2|op_start:
    6,2|op_iter:           6,3|op_iter:
  5,3|op_end:            5,2|op_end:

In the WRONG example, the anchor of the op_iter event depends on the log rate
at the time of sampling.
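
To make the caveat concrete, here is a minimal sketch that parses the header of an event spec and checks the rule illustrated by the RIGHT/WRONG table. Both `parse_event_spec` and `can_anchor` are hypothetical helpers written for this illustration, not part of core.timeline, and the pattern only covers the `<level>,<rate>|<eventname>:` prefix shown above.

```lua
-- Hypothetical parser for the header of an event spec,
-- "<level>,<rate>|<eventname>: ...".
local function parse_event_spec(line)
  local level, rate, name = line:match("^%s*(%d),(%d)|([%w_]+):")
  if not level then
    return nil, "malformed event spec: " .. line
  end
  return { level = tonumber(level), rate = tonumber(rate), name = name }
end

-- Rule read off the RIGHT/WRONG table: an enclosing event (higher
-- level, i.e. numerically smaller) can serve as a latency anchor for
-- an enclosed event only if its rate is not lower.
local function can_anchor(outer, inner)
  return outer.level < inner.level and outer.rate >= inner.rate
end

local right = can_anchor(parse_event_spec("5,3|op_start:"),
                         parse_event_spec("6,2|op_iter:"))
local wrong = can_anchor(parse_event_spec("5,2|op_start:"),
                         parse_event_spec("6,3|op_iter:"))
print(right, wrong)
```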