New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fiber.top() #2694
Comments
kostja
added
feature
A new functionality
and removed
feature
A new functionality
labels
Feb 13, 2018
sergepetrenko
pushed a commit
that referenced
this issue
Sep 15, 2019
sergepetrenko
added a commit
that referenced
this issue
Sep 15, 2019
sergepetrenko
added a commit
that referenced
this issue
Sep 20, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694
sergepetrenko
added a commit
that referenced
this issue
Oct 3, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 1: cpu average (%): 10.779696493982 cpu instant (%): 10.435256168573 115: cpu average (%): 5.4571279061075 cpu instant (%): 5.9653973440576 120: cpu average (%): 21.944382148464 cpu instant (%): 23.849021825646 116: cpu average (%): 8.6603872318158 cpu instant (%): 9.6812031335093 119: cpu average (%): 21.933168871944 cpu instant (%): 20.007540530351 cpu misses: 0 118: cpu average (%): 19.342901995963 cpu instant (%): 16.932679820703 117: cpu average (%): 11.549674814981 cpu instant (%): 13.128901177161 ... ``` In the table above keys are fiber id's (and a single 'cpu misses' key which indicates the amount of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) cpu instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) cpu average (per cent), which is calculated as an exponential moving average of `cpu instant` values over all previous event loop iterations.
sergepetrenko
added a commit
that referenced
this issue
Oct 8, 2019
Add a wrapper function around swim test passed to swim_run_test which breaks from default cord event loop. Otherwise we will hang indefinitely in ev_run in case there is a started watcher. Found during work on #2694
sergepetrenko
added a commit
that referenced
this issue
Oct 8, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 1: cpu average (%): 10.779696493982 cpu instant (%): 10.435256168573 115: cpu average (%): 5.4571279061075 cpu instant (%): 5.9653973440576 120: cpu average (%): 21.944382148464 cpu instant (%): 23.849021825646 116: cpu average (%): 8.6603872318158 cpu instant (%): 9.6812031335093 119: cpu average (%): 21.933168871944 cpu instant (%): 20.007540530351 cpu misses: 0 118: cpu average (%): 19.342901995963 cpu instant (%): 16.932679820703 117: cpu average (%): 11.549674814981 cpu instant (%): 13.128901177161 ... ``` In the table above keys are fiber id's (and a single 'cpu misses' key which indicates the amount of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) cpu instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) cpu average (per cent), which is calculated as an exponential moving average of `cpu instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the lust measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results.
sergepetrenko
added a commit
that referenced
this issue
Oct 17, 2019
Add a wrapper function around swim test passed to swim_run_test which breaks from default cord event loop. Otherwise we will hang indefinitely in ev_run in case there is a started watcher. Found during work on #2694
sergepetrenko
added a commit
that referenced
this issue
Oct 17, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 1: cpu average (%): 10.779696493982 cpu instant (%): 10.435256168573 115: cpu average (%): 5.4571279061075 cpu instant (%): 5.9653973440576 120: cpu average (%): 21.944382148464 cpu instant (%): 23.849021825646 116: cpu average (%): 8.6603872318158 cpu instant (%): 9.6812031335093 119: cpu average (%): 21.933168871944 cpu instant (%): 20.007540530351 cpu misses: 0 118: cpu average (%): 19.342901995963 cpu instant (%): 16.932679820703 117: cpu average (%): 11.549674814981 cpu instant (%): 13.128901177161 ... ``` In the table above keys are fiber id's (and a single 'cpu misses' key which indicates the amount of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) cpu instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) cpu average (per cent), which is calculated as an exponential moving average of `cpu instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the lust measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment.
Performance measurements with and without fiber.top(): The results are averaged between 5 runs.
File used to test fiber.top() version: fiber = require('fiber')
num_fibers = arg[1]
delta_time = 1
fiber_func = function()
while true do
fiber.yield()
end
end
-- create fibers but do not let them run yet
for i = 1, num_fibers do
fiber.new(fiber_func)
end
fiber.top_enable()
t_start = fiber.time()
fiber.sleep(delta_time)
t_end = fiber.time()
fiber.top_disable();
tbl = fiber.info()
csw = 0
for k ,v in pairs(tbl) do
csw = csw + v["csw"]
end
print("delta time:", t_end - t_start)
print("csw/sec:", csw / (t_end - t_start))
File used to test default version: fiber = require('fiber')
num_fibers = arg[1]
delta_time = 1
fiber_func = function()
while true do
fiber.yield()
end
end
-- create fibers but do not let them run yet
for i = 1, num_fibers do
fiber.new(fiber_func)
end
t_start = fiber.time()
fiber.sleep(delta_time)
t_end = fiber.time()
tbl = fiber.info()
csw = 0
for k ,v in pairs(tbl) do
csw = csw + v["csw"]
end
print("delta time:", t_end - t_start)
print("csw/sec:", csw / (t_end - t_start)) |
sergepetrenko
added a commit
that referenced
this issue
Oct 25, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - sched: average: 10.779696493982 instant: 10.435256168573 115: average: 5.4571279061075 instant: 5.9653973440576 120: average: 21.944382148464 instant: 23.849021825646 116: average: 8.6603872318158 instant: 9.6812031335093 119: average: 21.933168871944 instant: 20.007540530351 cpu misses: 0 118: average: 19.342901995963 instant: 16.932679820703 117: average: 11.549674814981 instant: 13.128901177161 ... ``` In the table above keys are fiber id's (the only exceptions are `sched` fiber, which is indexed by mane and a single 'cpu misses' key which indicates the amount of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. fiber.top() is disabled by default, to enable it you need to issue `fiber.top_enable()`. You can disable it back later using `fiber.top_disable()`.
sergepetrenko
added a commit
that referenced
this issue
Oct 25, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - sched: average: 10.779696493982 instant: 10.435256168573 115: average: 5.4571279061075 instant: 5.9653973440576 120: average: 21.944382148464 instant: 23.849021825646 116: average: 8.6603872318158 instant: 9.6812031335093 119: average: 21.933168871944 instant: 20.007540530351 cpu misses: 0 118: average: 19.342901995963 instant: 16.932679820703 117: average: 11.549674814981 instant: 13.128901177161 ... ``` In the table above keys are fiber id's (the only exceptions are `sched` fiber, which is indexed by mane and a single 'cpu misses' key which indicates the amount of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. fiber.top() is disabled by default, to enable it you need to issue `fiber.top_enable()`. You can disable it back later using `fiber.top_disable()`.
sergepetrenko
added a commit
that referenced
this issue
Oct 28, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 9.8578003709556 average: 7.1213095296968 103/lua: instant: 10.242495019578 average: 7.2647247910588 107/lua: instant: 33.337913031531 average: 34.602640818951 101/on_shutdown: instant: 0 average: 0 105/lua: instant: 10.558494195233 average: 7.4526482369814 106/lua: instant: 21.19255341073 average: 23.668463478562 102/interactive: instant: 0 average: 0 cpu misses: 0 1/sched: instant: 14.810743971972 average: 19.494584837545 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. fiber.top() is disabled by default, to enable it you need to issue `fiber.top_enable()`. You can disable it back later using `fiber.top_disable()`.
sergepetrenko
added a commit
that referenced
this issue
Oct 29, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 9.8578003709556 average: 7.1213095296968 103/lua: instant: 10.242495019578 average: 7.2647247910588 107/lua: instant: 33.337913031531 average: 34.602640818951 101/on_shutdown: instant: 0 average: 0 105/lua: instant: 10.558494195233 average: 7.4526482369814 106/lua: instant: 21.19255341073 average: 23.668463478562 102/interactive: instant: 0 average: 0 cpu misses: 0 1/sched: instant: 14.810743971972 average: 19.494584837545 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The two metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`.
sergepetrenko
added a commit
that referenced
this issue
Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 18.433514726042 time: 0.677505865 average: 21.98826143184 103/lua: instant: 19.131392015951 time: 0.689521917 average: 20.807772656431 107/lua: instant: 18.624600174469 time: 0.681585168 average: 17.78194117452 101/on_shutdown: instant: 0 time: 0 average: 0 105/lua: instant: 18.562289702156 time: 0.682085309 average: 15.513811055476 106/lua: instant: 18.441822789017 time: 0.677320271 average: 15.427595583115 102/interactive: instant: 0 time: 0.000367182 average: 0 cpu misses: 0 1/sched: instant: 6.8063805923649 time: 0.253035056 average: 8.3479789103691 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`.
sergepetrenko
added a commit
that referenced
this issue
Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 18.433514726042 time: 0.677505865 average: 21.98826143184 103/lua: instant: 19.131392015951 time: 0.689521917 average: 20.807772656431 107/lua: instant: 18.624600174469 time: 0.681585168 average: 17.78194117452 101/on_shutdown: instant: 0 time: 0 average: 0 105/lua: instant: 18.562289702156 time: 0.682085309 average: 15.513811055476 106/lua: instant: 18.441822789017 time: 0.677320271 average: 15.427595583115 102/interactive: instant: 0 time: 0.000367182 average: 0 cpu misses: 0 1/sched: instant: 6.8063805923649 time: 0.253035056 average: 8.3479789103691 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`. "Time" entry is also added to each fibers output in fiber.info() (it duplicates "time" entry from fiber.top()). Note, that "time" is only counted while fiber.top is enabled.
sergepetrenko
added a commit
that referenced
this issue
Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has two fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, and current cpu consumption, which shows fiber's cpu usage over the last event loop iteration. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 18.433514726042 time: 0.677505865 average: 21.98826143184 103/lua: instant: 19.131392015951 time: 0.689521917 average: 20.807772656431 107/lua: instant: 18.624600174469 time: 0.681585168 average: 17.78194117452 101/on_shutdown: instant: 0 time: 0 average: 0 105/lua: instant: 18.562289702156 time: 0.682085309 average: 15.513811055476 106/lua: instant: 18.441822789017 time: 0.677320271 average: 15.427595583115 102/interactive: instant: 0 time: 0.000367182 average: 0 cpu misses: 0 1/sched: instant: 6.8063805923649 time: 0.253035056 average: 8.3479789103691 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`. "Time" entry is also added to each fibers output in fiber.info() (it duplicates "time" entry from fiber.top()). Note, that "time" is only counted while fiber.top is enabled.
sergepetrenko
added a commit
that referenced
this issue
Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table of alive fibers (including the scheduler). Each table entry has three fields: average cpu consumption, which is calculated with exponential moving average over event loop iterations, current cpu consumption, which shows fiber's cpu usage over the last event loop iteration and cpu time the fiber has used up during its lifespan. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - 104/lua: instant: 18.433514726042 time: 0.677505865 average: 21.98826143184 103/lua: instant: 19.131392015951 time: 0.689521917 average: 20.807772656431 107/lua: instant: 18.624600174469 time: 0.681585168 average: 17.78194117452 101/on_shutdown: instant: 0 time: 0 average: 0 105/lua: instant: 18.562289702156 time: 0.682085309 average: 15.513811055476 106/lua: instant: 18.441822789017 time: 0.677320271 average: 15.427595583115 102/interactive: instant: 0 time: 0.000367182 average: 0 cpu misses: 0 1/sched: instant: 6.8063805923649 time: 0.253035056 average: 8.3479789103691 ... ``` In the table above keys are strings containing fiber ids and names (the only exception is a single 'cpu misses' key which indicates the number of times tx thread was rescheduled on a different cpu core. More on that later). The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu misses` field returned by `fiber.top()`: `cpu misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`. "Time" entry is also added to each fibers output in fiber.info() (it duplicates "time" entry from fiber.top()). Note, that "time" is only counted while fiber.top is enabled.
sergepetrenko
added a commit
that referenced
this issue
Nov 5, 2019
Implement a new function in Lua fiber library: top(). It returns a table containing fiber cpu usage stats. The table has two entries: "cpu_misses" and "cpu". "cpu" itself is a table listing all the alive fibers and their cpu consumtion. The patch relies on CPU timestamp counter to measure each fiber's time share. Closes #2694 @TarantoolBot document Title: fiber: new function `fiber.top()` `fiber.top()` returns a table of all alive fibers and lists their cpu consumption. Let's take a look at the example: ``` tarantool> fiber.top() --- - cpu: 107/lua: instant: 30.967324490456 time: 0.351821993 average: 25.582738345233 104/lua: instant: 9.6473633128437 time: 0.110869897 average: 7.9693406131877 101/on_shutdown: instant: 0 time: 0 average: 0 103/lua: instant: 9.8026528631511 time: 0.112641118 average: 18.138387232255 106/lua: instant: 20.071174377224 time: 0.226901357 average: 17.077908441831 102/interactive: instant: 0 time: 9.6858e-05 average: 0 105/lua: instant: 9.2461986412164 time: 0.10657528 average: 7.7068458630827 1/sched: instant: 20.265286315108 time: 0.237095335 average: 23.141537169257 cpu_misses: 0 ... ``` The two entries in a table returned by `fiber.top()` are `cpu_misses` and `cpu`. `cpu` itself is a table whose keys are strings containing fiber ids and names. The three metrics available for each fiber are: 1) instant (per cent), which indicates the share of time fiber was executing during the previous event loop iteration 2) average (per cent), which is calculated as an exponential moving average of `instant` values over all previous event loop iterations. 3) time (seconds), which estimates how much cpu time each fiber spent processing during its lifetime. More info on `cpu_misses` field returned by `fiber.top()`: `cpu_misses` indicates the amount of times tx thread detected it was rescheduled on a different cpu core during the last event loop iteration. fiber.top() uses cpu timestamp counter to measure each fiber's execution time. However, each cpu core may have its own counter value (you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even get negative). When tx thread is rescheduled to a different cpu core, tarantool just assumes cpu delta was zero for the latest measurement. This loweres precision of our computations, so the bigger `cpu misses` value the lower the precision of fiber.top() results. Fiber.top() doesn't work on arm architecture at the moment. Please note, that enabling fiber.top() slows down fiber switching by about 15 per cent, so it is disabled by default. To enable it you need to issue `fiber.top_enable()`. You can disable it back after you finished debugging using `fiber.top_disable()`. "Time" entry is also added to each fibers output in fiber.info() (it duplicates "time" entry from fiber.top().cpu per fiber). Note, that "time" is only counted while fiber.top is enabled.
sergepetrenko
added a commit
that referenced
this issue
Nov 12, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() may contain 'inf's and 'nan's in fiber cpu usage averages because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 12, 2019
sergepetrenko
added a commit
that referenced
this issue
Nov 12, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() may contain 'inf's and 'nan's in fiber cpu usage averages because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 12, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() may contain 'inf's and 'nan's in fiber cpu usage averages because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 14, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() may contain 'inf's and 'nan's in fiber cpu usage averages because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 15, 2019
We didn't refresh last remembered clock on fiber.top_enable() This means that the fiber issuing fiber.top_enable() would get a huge chunk of cpu time on its first yield. Fix this. Also reset clock_delta and cpu_miss_count. If fiber.top() is issued on the same ev loop iteration as fiber.top_enable(), clock_delta_last is 0 for both cord and all the fibers, so report "instant" and "average" stats per last iteration as 0 instead of NaN. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 15, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() may contain 'inf's and 'nan's in fiber cpu usage averages because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 15, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 15, 2019
We didn't refresh last remembered clock on fiber.top_enable() This means that the fiber issuing fiber.top_enable() would get a huge chunk of cpu time on its first yield. Fix this. Also reset clock_delta and cpu_miss_count. If fiber.top() is issued on the same ev loop iteration as fiber.top_enable(), clock_delta_last is 0 for both cord and all the fibers, so report "instant" and "average" stats per last iteration as 0 instead of NaN. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 15, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 18, 2019
Unify all the members related to fiber's clock statistics into struct clock_stat and all the members related to cord's knowledge of cpu state and clocks to struct cpu_stat. Reset stats of all alive fibers on fiber.top_enable(). Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 18, 2019
When fiber EMA is 0 and first non-zero observation is added to it, we assumed that EMA should be equal to this observation (i.e. average value should be the same as the observed one). This breaks the following invariant: sum of clock EMAs of all fibers equals clock EMA of the thread. If one of the fibers is just spawned and has a big clock delta, it will assign this delta to its EMA, while the thread will calculate the new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation when fiber EMA is greater than cord EMA. This caused occasional test failures: ``` [001] Test failed! Result content mismatch: [001] --- app/fiber.result Mon Nov 18 17:00:48 2019 [001] +++ app/fiber.reject Mon Nov 18 17:33:10 2019 [001] @@ -1511,7 +1511,7 @@ [001] -- not exact due to accumulated integer division errors [001] sum_avg > 99 and sum_avg < 101 or sum_avg [001] --- [001] -- true [001] +- 187.59585601717 [001] ... [001] tbl = nil [001] --- ``` Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 18, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
Gerold103
pushed a commit
that referenced
this issue
Nov 18, 2019
When fiber EMA is 0 and first non-zero observation is added to it, we assumed that EMA should be equal to this observation (i.e. average value should be the same as the observed one). This breaks the following invariant: sum of clock EMAs of all fibers equals clock EMA of the thread. If one of the fibers is just spawned and has a big clock delta, it will assign this delta to its EMA, while the thread will calculate the new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation when fiber EMA is greater than cord EMA. This caused occasional test failures: ``` [001] Test failed! Result content mismatch: [001] --- app/fiber.result Mon Nov 18 17:00:48 2019 [001] +++ app/fiber.reject Mon Nov 18 17:33:10 2019 [001] @@ -1511,7 +1511,7 @@ [001] -- not exact due to accumulated integer division errors [001] sum_avg > 99 and sum_avg < 101 or sum_avg [001] --- [001] -- true [001] +- 187.59585601717 [001] ... [001] tbl = nil [001] --- ``` Follow-up #2694
Gerold103
pushed a commit
that referenced
this issue
Nov 18, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 19, 2019
Unify all the members related to fiber's clock statistics into struct clock_stat and all the members related to cord's knowledge of cpu state and clocks to struct cpu_stat. Reset stats of all alive fibers on fiber.top_enable(). Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 19, 2019
When fiber EMA is 0 and first non-zero observation is added to it, we assumed that EMA should be equal to this observation (i.e. average value should be the same as the observed one). This breaks the following invariant: sum of clock EMAs of all fibers equals clock EMA of the thread. If one of the fibers is just spawned and has a big clock delta, it will assign this delta to its EMA, while the thread will calculate the new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation when fiber EMA is greater than cord EMA. This caused occasional test failures: ``` [001] Test failed! Result content mismatch: [001] --- app/fiber.result Mon Nov 18 17:00:48 2019 [001] +++ app/fiber.reject Mon Nov 18 17:33:10 2019 [001] @@ -1511,7 +1511,7 @@ [001] -- not exact due to accumulated integer division errors [001] sum_avg > 99 and sum_avg < 101 or sum_avg [001] --- [001] -- true [001] +- 187.59585601717 [001] ... [001] tbl = nil [001] --- ``` Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 19, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
sergepetrenko
added a commit
that referenced
this issue
Nov 20, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
kyukhin
pushed a commit
that referenced
this issue
Nov 21, 2019
Unify all the members related to fiber's clock statistics into struct clock_stat and all the members related to cord's knowledge of cpu state and clocks to struct cpu_stat. Reset stats of all alive fibers on fiber.top_enable(). Follow-up #2694
kyukhin
pushed a commit
that referenced
this issue
Nov 21, 2019
When fiber EMA is 0 and first non-zero observation is added to it, we assumed that EMA should be equal to this observation (i.e. average value should be the same as the observed one). This breaks the following invariant: sum of clock EMAs of all fibers equals clock EMA of the thread. If one of the fibers is just spawned and has a big clock delta, it will assign this delta to its EMA, while the thread will calculate the new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation when fiber EMA is greater than cord EMA. This caused occasional test failures: ``` [001] Test failed! Result content mismatch: [001] --- app/fiber.result Mon Nov 18 17:00:48 2019 [001] +++ app/fiber.reject Mon Nov 18 17:33:10 2019 [001] @@ -1511,7 +1511,7 @@ [001] -- not exact due to accumulated integer division errors [001] sum_avg > 99 and sum_avg < 101 or sum_avg [001] --- [001] -- true [001] +- 187.59585601717 [001] ... [001] tbl = nil [001] --- ``` Follow-up #2694
kyukhin
pushed a commit
that referenced
this issue
Nov 21, 2019
fiber.top() fills in statistics every event loop iteration, so if it was just enabled, fiber.top() returns zero in fiber cpu usage statistics because total time consumed by the main thread was not yet accounted for. Same stands for viewing top() results for a freshly created fiber: its metrics will be zero since it hasn't lived a full ev loop iteration yet. Fix this by delaying the test till top() results are meaningful and add minor refactoring. Follow-up #2694
Closed
Gumix
added a commit
to Gumix/tarantool
that referenced
this issue
Jun 24, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 8% for num_fibers = 100/1000. Closes tarantool#5869 [1] tarantool#2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua>
Gumix
added a commit
to Gumix/tarantool
that referenced
this issue
Jun 27, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Closes tarantool#5869 [1] tarantool#2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua>
locker
pushed a commit
that referenced
this issue
Jun 29, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Closes #5869 [1] #2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua>
locker
pushed a commit
that referenced
this issue
Jun 29, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Closes #5869 [1] #2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua> (cherry picked from commit 2fb7ffc)
mkokryashkin
pushed a commit
to mkokryashkin/tarantool
that referenced
this issue
Sep 9, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Closes tarantool#5869 [1] tarantool#2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua>
Gumix
added a commit
to Gumix/tarantool
that referenced
this issue
Feb 20, 2023
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Backport of tarantool#5869 [1] tarantool#2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua> (cherry picked from commit 2fb7ffc)
locker
pushed a commit
that referenced
this issue
Feb 20, 2023
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction on x86 an has the following advantages over the raw instruction: * It checks for RDTSCP availability in CPUID. If RDTSCP is not supported, it switches to RDTSC. * Linux guarantee that clock is monotonic, hence, the CPU miss detection is not needed. * It works on ARM. As for disadvantage, this function is about 2x slower compared to a single RDTSCP instruction. Performance degradation measured by the fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000. Backport of #5869 [1] #2694 (comment) NO_DOC=bugfix NO_TEST=<Tested in test/app/fiber.test.lua> (cherry picked from commit 2fb7ffc)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Implement fiber.top() which returns fiber list ordered by average cpu usage in the last second
Arguments:
The text was updated successfully, but these errors were encountered: