
fiber.top() #2694

Closed
kostja opened this issue Aug 17, 2017 · 1 comment
Labels: feature A new functionality
@kostja
Contributor

kostja commented Aug 17, 2017

Implement fiber.top(), which returns a list of fibers ordered by average cpu usage over the last second.

Arguments:

  • seconds
  • number of fibers
@kostja kostja added this to the wishlist milestone Oct 27, 2017
@kostja kostja added feature A new functionality and removed feature A new functionality labels Feb 13, 2018
@kyukhin kyukhin modified the milestones: wishlist, 2.3.1 Aug 21, 2019
sergepetrenko added a commit that referenced this issue Sep 20, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694
sergepetrenko added a commit that referenced this issue Oct 3, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 1:
    cpu average (%): 10.779696493982
    cpu instant (%): 10.435256168573
  115:
    cpu average (%): 5.4571279061075
    cpu instant (%): 5.9653973440576
  120:
    cpu average (%): 21.944382148464
    cpu instant (%): 23.849021825646
  116:
    cpu average (%): 8.6603872318158
    cpu instant (%): 9.6812031335093
  119:
    cpu average (%): 21.933168871944
    cpu instant (%): 20.007540530351
  cpu misses: 0
  118:
    cpu average (%): 19.342901995963
    cpu instant (%): 16.932679820703
  117:
    cpu average (%): 11.549674814981
    cpu instant (%): 13.128901177161
...
```
In the table above, keys are fiber ids (plus a single 'cpu misses' key,
which indicates the number of times the tx thread was rescheduled on a
different cpu core; more on that later).
The two metrics available for each fiber are:
1) cpu instant (per cent), which indicates the share of time the fiber
was executing during the previous event loop iteration;
2) cpu average (per cent), which is calculated as an exponential moving
average of `cpu instant` values over all previous event loop iterations.
sergepetrenko added a commit that referenced this issue Oct 8, 2019
Add a wrapper around the swim test function passed to swim_run_test
which breaks out of the default cord event loop. Otherwise we would hang
indefinitely in ev_run whenever there is a started watcher.

Found during work on #2694
sergepetrenko added a commit that referenced this issue Oct 8, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 1:
    cpu average (%): 10.779696493982
    cpu instant (%): 10.435256168573
  115:
    cpu average (%): 5.4571279061075
    cpu instant (%): 5.9653973440576
  120:
    cpu average (%): 21.944382148464
    cpu instant (%): 23.849021825646
  116:
    cpu average (%): 8.6603872318158
    cpu instant (%): 9.6812031335093
  119:
    cpu average (%): 21.933168871944
    cpu instant (%): 20.007540530351
  cpu misses: 0
  118:
    cpu average (%): 19.342901995963
    cpu instant (%): 16.932679820703
  117:
    cpu average (%): 11.549674814981
    cpu instant (%): 13.128901177161
...
```
In the table above, keys are fiber ids (plus a single 'cpu misses' key,
which indicates the number of times the tx thread was rescheduled on a
different cpu core; more on that later).
The two metrics available for each fiber are:
1) cpu instant (per cent), which indicates the share of time the fiber
was executing during the previous event loop iteration;
2) cpu average (per cent), which is calculated as an exponential moving
average of `cpu instant` values over all previous event loop iterations.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the last measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.
sergepetrenko added a commit that referenced this issue Oct 17, 2019
sergepetrenko added a commit that referenced this issue Oct 17, 2019
@sergepetrenko
Collaborator

sergepetrenko commented Oct 25, 2019

Performance measurements with and without fiber.top():
Invoked with `tarantool test.lua num_fibers`.
The test measures context switches per second with `num_fibers` fibers, each running a loop `while true do fiber.yield() end`.

The results are averaged over 5 runs.

| num_fibers | default version (csw/sec) | fiber.top() version (csw/sec) |
|-----------:|--------------------------:|------------------------------:|
| 1 | 484 266 | 429 150 (- 11.3 %) |
| 10 | 2 006 201 | 1 728 187 (- 13.9 %) |
| 100 | 2 950 172 | 2 467 650 (- 16.4 %) |
| 1000 | 2 400 037 | 2 020 417 (- 15.8 %) |
| 10000 | 8 824 | 8 910 (+ 1.0 %) |

File used to test fiber.top() version:

```lua
fiber = require('fiber')
num_fibers = tonumber(arg[1])
delta_time = 1

fiber_func = function()
    while true do
        fiber.yield()
    end
end

-- create fibers but do not let them run yet
for i = 1, num_fibers do
    fiber.new(fiber_func)
end

fiber.top_enable()

t_start = fiber.time()
fiber.sleep(delta_time)
t_end = fiber.time()

fiber.top_disable()

tbl = fiber.info()
csw = 0
for k, v in pairs(tbl) do
    csw = csw + v["csw"]
end

print("delta time:", t_end - t_start)
print("csw/sec:", csw / (t_end - t_start))
```

File used to test default version:

```lua
fiber = require('fiber')
num_fibers = tonumber(arg[1])
delta_time = 1

fiber_func = function()
    while true do
        fiber.yield()
    end
end

-- create fibers but do not let them run yet
for i = 1, num_fibers do
    fiber.new(fiber_func)
end

t_start = fiber.time()
fiber.sleep(delta_time)
t_end = fiber.time()

tbl = fiber.info()
csw = 0
for k, v in pairs(tbl) do
    csw = csw + v["csw"]
end

print("delta time:", t_end - t_start)
print("csw/sec:", csw / (t_end - t_start))
```

sergepetrenko added a commit that referenced this issue Oct 25, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- sched:
    average: 10.779696493982
    instant: 10.435256168573
  115:
    average: 5.4571279061075
    instant: 5.9653973440576
  120:
    average: 21.944382148464
    instant: 23.849021825646
  116:
    average: 8.6603872318158
    instant: 9.6812031335093
  119:
    average: 21.933168871944
    instant: 20.007540530351
  cpu misses: 0
  118:
    average: 19.342901995963
    instant: 16.932679820703
  117:
    average: 11.549674814981
    instant: 13.128901177161
...
```
In the table above, keys are fiber ids (the only exceptions are the
`sched` fiber, which is indexed by name, and a single 'cpu misses' key,
which indicates the number of times the tx thread was rescheduled on a
different cpu core; more on that later).
The two metrics available for each fiber are:
1) instant (per cent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (per cent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.

At the moment, fiber.top() doesn't work on the ARM architecture.

fiber.top() is disabled by default. To enable it, issue
`fiber.top_enable()`. You can disable it again later using
`fiber.top_disable()`.
sergepetrenko added a commit that referenced this issue Oct 25, 2019
sergepetrenko added a commit that referenced this issue Oct 28, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 104/lua:
    instant: 9.8578003709556
    average: 7.1213095296968
  103/lua:
    instant: 10.242495019578
    average: 7.2647247910588
  107/lua:
    instant: 33.337913031531
    average: 34.602640818951
  101/on_shutdown:
    instant: 0
    average: 0
  105/lua:
    instant: 10.558494195233
    average: 7.4526482369814
  106/lua:
    instant: 21.19255341073
    average: 23.668463478562
  102/interactive:
    instant: 0
    average: 0
  cpu misses: 0
  1/sched:
    instant: 14.810743971972
    average: 19.494584837545
...

```
In the table above, keys are strings containing fiber ids and names
(the only exception is a single 'cpu misses' key, which indicates the
number of times the tx thread was rescheduled on a different cpu core;
more on that later).
The two metrics available for each fiber are:
1) instant (per cent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (per cent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.

At the moment, fiber.top() doesn't work on the ARM architecture.

fiber.top() is disabled by default. To enable it, issue
`fiber.top_enable()`. You can disable it again later using
`fiber.top_disable()`.
sergepetrenko added a commit that referenced this issue Oct 29, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 104/lua:
    instant: 9.8578003709556
    average: 7.1213095296968
  103/lua:
    instant: 10.242495019578
    average: 7.2647247910588
  107/lua:
    instant: 33.337913031531
    average: 34.602640818951
  101/on_shutdown:
    instant: 0
    average: 0
  105/lua:
    instant: 10.558494195233
    average: 7.4526482369814
  106/lua:
    instant: 21.19255341073
    average: 23.668463478562
  102/interactive:
    instant: 0
    average: 0
  cpu misses: 0
  1/sched:
    instant: 14.810743971972
    average: 19.494584837545
...

```
In the table above, keys are strings containing fiber ids and names
(the only exception is a single 'cpu misses' key, which indicates the
number of times the tx thread was rescheduled on a different cpu core;
more on that later).
The two metrics available for each fiber are:
1) instant (per cent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (per cent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.

At the moment, fiber.top() doesn't work on the ARM architecture.

Please note that enabling fiber.top() slows down fiber switching by
about 15 per cent, so it is disabled by default.
To enable it, issue `fiber.top_enable()`.
Once you have finished debugging, you can disable it again using
`fiber.top_disable()`.
sergepetrenko added a commit that referenced this issue Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 104/lua:
    instant: 18.433514726042
    time: 0.677505865
    average: 21.98826143184
  103/lua:
    instant: 19.131392015951
    time: 0.689521917
    average: 20.807772656431
  107/lua:
    instant: 18.624600174469
    time: 0.681585168
    average: 17.78194117452
  101/on_shutdown:
    instant: 0
    time: 0
    average: 0
  105/lua:
    instant: 18.562289702156
    time: 0.682085309
    average: 15.513811055476
  106/lua:
    instant: 18.441822789017
    time: 0.677320271
    average: 15.427595583115
  102/interactive:
    instant: 0
    time: 0.000367182
    average: 0
  cpu misses: 0
  1/sched:
    instant: 6.8063805923649
    time: 0.253035056
    average: 8.3479789103691
...

```
In the table above, keys are strings containing fiber ids and names
(the only exception is a single 'cpu misses' key, which indicates the
number of times the tx thread was rescheduled on a different cpu core;
more on that later).
The three metrics available for each fiber are:
1) instant (per cent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (per cent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations;
3) time (seconds), which estimates how much cpu time the fiber has spent
processing during its lifetime.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.

At the moment, fiber.top() doesn't work on the ARM architecture.

Please note that enabling fiber.top() slows down fiber switching by
about 15 per cent, so it is disabled by default.
To enable it, issue `fiber.top_enable()`.
Once you have finished debugging, you can disable it again using
`fiber.top_disable()`.
sergepetrenko added a commit that referenced this issue Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has two
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, and current cpu consumption,
which shows fiber's cpu usage over the last event loop iteration.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 104/lua:
    instant: 18.433514726042
    time: 0.677505865
    average: 21.98826143184
  103/lua:
    instant: 19.131392015951
    time: 0.689521917
    average: 20.807772656431
  107/lua:
    instant: 18.624600174469
    time: 0.681585168
    average: 17.78194117452
  101/on_shutdown:
    instant: 0
    time: 0
    average: 0
  105/lua:
    instant: 18.562289702156
    time: 0.682085309
    average: 15.513811055476
  106/lua:
    instant: 18.441822789017
    time: 0.677320271
    average: 15.427595583115
  102/interactive:
    instant: 0
    time: 0.000367182
    average: 0
  cpu misses: 0
  1/sched:
    instant: 6.8063805923649
    time: 0.253035056
    average: 8.3479789103691
...

```
In the table above, keys are strings containing fiber ids and names
(the only exception is a single 'cpu misses' key, which indicates the
number of times the tx thread was rescheduled on a different cpu core;
more on that later).
The three metrics available for each fiber are:
1) instant (per cent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (per cent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations;
3) time (seconds), which estimates how much cpu time the fiber has spent
processing during its lifetime.

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even turn out negative).
When the tx thread is rescheduled to a different cpu core, tarantool
simply assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the `cpu misses`
value, the lower the precision of fiber.top() results.

At the moment, fiber.top() doesn't work on the ARM architecture.

Please note that enabling fiber.top() slows down fiber switching by
about 15 per cent, so it is disabled by default.
To enable it, issue `fiber.top_enable()`.
Once you have finished debugging, you can disable it again using
`fiber.top_disable()`.
A "time" entry is also added to each fiber's output in fiber.info()
(it duplicates the "time" entry from fiber.top()).
Note that "time" is only counted while fiber.top is enabled.
sergepetrenko added a commit that referenced this issue Nov 1, 2019
sergepetrenko added a commit that referenced this issue Nov 1, 2019
Implement a new function in Lua fiber library: top(). It returns a table
of alive fibers (including the scheduler). Each table entry has three
fields: average cpu consumption, which is calculated with exponential
moving average over event loop iterations, current cpu consumption,
which shows the fiber's cpu usage over the last event loop iteration,
and the cpu time the fiber has used up during its lifespan.
The patch relies on CPU timestamp counter to measure each fiber's time
share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at the example:
```
tarantool> fiber.top()
---
- 104/lua:
    instant: 18.433514726042
    time: 0.677505865
    average: 21.98826143184
  103/lua:
    instant: 19.131392015951
    time: 0.689521917
    average: 20.807772656431
  107/lua:
    instant: 18.624600174469
    time: 0.681585168
    average: 17.78194117452
  101/on_shutdown:
    instant: 0
    time: 0
    average: 0
  105/lua:
    instant: 18.562289702156
    time: 0.682085309
    average: 15.513811055476
  106/lua:
    instant: 18.441822789017
    time: 0.677320271
    average: 15.427595583115
  102/interactive:
    instant: 0
    time: 0.000367182
    average: 0
  cpu misses: 0
  1/sched:
    instant: 6.8063805923649
    time: 0.253035056
    average: 8.3479789103691
...

```
In the table above, keys are strings containing fiber ids and names
(the only exception is the single 'cpu misses' key, which indicates the
number of times the tx thread was rescheduled on a different cpu core;
more on that later).
The three metrics available for each fiber are:
1) instant (percent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (percent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations;
3) time (seconds), which estimates how much cpu time the fiber spent
processing during its lifetime.
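How the three metrics relate can be sketched in illustrative Python (the `FiberStat` class and its parameters are hypothetical; Tarantool does this accounting in C with TSC deltas, and the 15/16 smoothing factor matches the EMA formula it uses):

```python
# Sketch of the per-fiber metrics: "instant" is the share of the last
# event loop iteration, "average" is an exponential moving average of
# per-iteration clocks, "time" is the accumulated cpu time.
class FiberStat:
    def __init__(self):
        self.clock_delta = 0   # clocks consumed this iteration
        self.clock_ema = 0.0   # smoothed clocks per iteration
        self.time = 0.0        # total cpu seconds

    def end_of_loop_iteration(self, cord_delta, cord_ema, clocks_per_sec):
        """Fold this iteration's delta into the stats; return (instant, average)."""
        self.time += self.clock_delta / clocks_per_sec
        # Same smoothing as the thread-wide EMA: 15/16 old + 1/16 new.
        self.clock_ema = self.clock_ema * 15 / 16 + self.clock_delta / 16
        instant = 100.0 * self.clock_delta / cord_delta if cord_delta else 0.0
        average = 100.0 * self.clock_ema / cord_ema if cord_ema else 0.0
        self.clock_delta = 0
        return instant, average

s = FiberStat()
s.clock_delta = 50
instant, average = s.end_of_loop_iteration(cord_delta=100, cord_ema=100.0,
                                           clocks_per_sec=1000.0)
assert instant == 50.0            # half of the last iteration
```

Dividing by the cord-wide totals is what turns raw clock counts into the percentages shown by `fiber.top()`.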

More info on the `cpu misses` field returned by `fiber.top()`:
`cpu misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even be negative).
When the tx thread is rescheduled to a different cpu core, tarantool
just assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the
`cpu misses` value, the lower the precision of fiber.top() results.

fiber.top() doesn't work on the ARM architecture at the moment.

Please note that enabling fiber.top() slows down fiber switching by
about 15 percent, so it is disabled by default.
To enable it, issue `fiber.top_enable()`.
You can disable it again after you have finished debugging using
`fiber.top_disable()`.
A "time" entry is also added to each fiber's output in fiber.info()
(it duplicates the "time" entry from fiber.top()).
Note that "time" is only counted while fiber.top is enabled.
sergepetrenko added a commit that referenced this issue Nov 5, 2019
Implement a new function in the Lua fiber library: top(). It returns a
table containing fiber cpu usage stats. The table has two entries:
"cpu_misses" and "cpu". "cpu" itself is a table listing all the alive
fibers and their cpu consumption.
The patch relies on the CPU timestamp counter to measure each fiber's
time share.

Closes #2694

@TarantoolBot document
Title: fiber: new function `fiber.top()`

`fiber.top()` returns a table of all alive fibers and lists their cpu
consumption. Let's take a look at an example:
```
tarantool> fiber.top()
---
- cpu:
    107/lua:
      instant: 30.967324490456
      time: 0.351821993
      average: 25.582738345233
    104/lua:
      instant: 9.6473633128437
      time: 0.110869897
      average: 7.9693406131877
    101/on_shutdown:
      instant: 0
      time: 0
      average: 0
    103/lua:
      instant: 9.8026528631511
      time: 0.112641118
      average: 18.138387232255
    106/lua:
      instant: 20.071174377224
      time: 0.226901357
      average: 17.077908441831
    102/interactive:
      instant: 0
      time: 9.6858e-05
      average: 0
    105/lua:
      instant: 9.2461986412164
      time: 0.10657528
      average: 7.7068458630827
    1/sched:
      instant: 20.265286315108
      time: 0.237095335
      average: 23.141537169257
  cpu_misses: 0
...

```
The two entries in a table returned by `fiber.top()` are
`cpu_misses` and `cpu`.

`cpu` itself is a table whose keys are strings containing fiber ids and
names.
The three metrics available for each fiber are:
1) instant (percent), which indicates the share of time the fiber was
executing during the previous event loop iteration;
2) average (percent), which is calculated as an exponential moving
average of `instant` values over all previous event loop iterations;
3) time (seconds), which estimates how much cpu time the fiber spent
processing during its lifetime.

More info on the `cpu_misses` field returned by `fiber.top()`:
`cpu_misses` indicates the number of times the tx thread detected it was
rescheduled on a different cpu core during the last event loop
iteration.
fiber.top() uses the cpu timestamp counter to measure each fiber's
execution time. However, each cpu core may have its own counter value
(you can only rely on counter deltas if both measurements were taken on
the same core; otherwise the delta may even be negative).
When the tx thread is rescheduled to a different cpu core, tarantool
just assumes the cpu delta was zero for the latest measurement. This
lowers the precision of the computations, so the bigger the
`cpu_misses` value, the lower the precision of fiber.top() results.

fiber.top() doesn't work on the ARM architecture at the moment.

Please note that enabling fiber.top() slows down fiber switching by
about 15 percent, so it is disabled by default.
To enable it, issue `fiber.top_enable()`.
You can disable it again after you have finished debugging using
`fiber.top_disable()`.
A "time" entry is also added to each fiber's output in fiber.info()
(it duplicates the "time" entry from fiber.top().cpu for each fiber).
Note that "time" is only counted while fiber.top is enabled.
@kyukhin kyukhin closed this as completed in 77fa45b Nov 9, 2019
sergepetrenko added a commit that referenced this issue Nov 12, 2019
fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() may contain 'inf's and
'nan's in fiber cpu usage averages because the total time consumed by
the main thread has not yet been accounted for.
The same goes for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful, and
apply minor refactoring.

Follow-up #2694
sergepetrenko added a commit that referenced this issue Nov 15, 2019
We didn't refresh the last remembered clock on fiber.top_enable().
This means that the fiber issuing fiber.top_enable() would get a huge
chunk of cpu time on its first yield. Fix this.
Also reset clock_delta and cpu_miss_count.

If fiber.top() is issued on the same ev loop iteration as
fiber.top_enable(), clock_delta_last is 0 for both the cord and all the
fibers, so report the "instant" and "average" stats for the last
iteration as 0 instead of NaN.

Follow-up #2694
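A minimal sketch of this fix in illustrative Python (the `Cord` fields and `top_enable` signature are hypothetical stand-ins for Tarantool's C struct members):

```python
# Sketch of the fix: fiber.top_enable() must refresh the remembered
# clock stamp and reset accumulated counters, otherwise the enabling
# fiber inherits the whole gap since the last (stale) stamp.
class Cord:
    def __init__(self):
        self.top_enabled = False
        self.last_tsc = 0
        self.clock_delta = 0
        self.cpu_miss_count = 0

def top_enable(cord, now_tsc):
    cord.top_enabled = True
    cord.last_tsc = now_tsc   # refresh instead of keeping a stale stamp
    cord.clock_delta = 0      # drop anything accumulated while disabled
    cord.cpu_miss_count = 0

cord = Cord()
cord.last_tsc = 100           # stale stamp from long before enabling
top_enable(cord, now_tsc=1_000_000)
assert cord.last_tsc == 1_000_000 and cord.clock_delta == 0
```

Without the refresh, the first yield after enabling would be charged the entire `now_tsc - 100` gap.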
sergepetrenko added a commit that referenced this issue Nov 15, 2019
fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() may contain 'inf's and
'nan's in fiber cpu usage averages because total time consumed by
the main thread was not yet accounted for.
Same stands for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful and add
minor refactoring.

Follow-up #2694
sergepetrenko added a commit that referenced this issue Nov 15, 2019
fiber.top() fills in statistics every event loop iteration,
so if it was just enabled, fiber.top() returns zero in fiber cpu
usage statistics because the total time consumed by the main thread has
not yet been accounted for.
The same goes for viewing top() results for a freshly created fiber:
its metrics will be zero since it hasn't lived a full ev loop iteration
yet.
Fix this by delaying the test till top() results are meaningful, and
apply minor refactoring.

Follow-up #2694
sergepetrenko added a commit that referenced this issue Nov 18, 2019
Unify all the members related to a fiber's clock statistics into struct
clock_stat, and all the members related to the cord's knowledge of cpu
state and clocks into struct cpu_stat.
Reset the stats of all alive fibers on fiber.top_enable().

Follow-up #2694
sergepetrenko added a commit that referenced this issue Nov 18, 2019
When the fiber EMA is 0 and the first non-zero observation is added to
it, we assumed that the EMA should be equal to this observation (i.e.
the average value should be the same as the observed one). This breaks
the following invariant: the sum of clock EMAs of all fibers equals the
clock EMA of the thread.
If one of the fibers is just spawned and has a big clock delta, it
will assign this delta to its EMA, while the thread will calculate the
new EMA as 15 * EMA / 16 + delta / 16, which may lead to a situation
where the fiber EMA is greater than the cord EMA.

This caused occasional test failures:
```
[001] Test failed! Result content mismatch:
[001] --- app/fiber.result	Mon Nov 18 17:00:48 2019
[001] +++ app/fiber.reject	Mon Nov 18 17:33:10 2019
[001] @@ -1511,7 +1511,7 @@
[001]  -- not exact due to accumulated integer division errors
[001]  sum_avg > 99 and sum_avg < 101 or sum_avg
[001]  ---
[001] -- true
[001] +- 187.59585601717
[001]  ...
[001]  tbl = nil
[001]  ---

```

Follow-up #2694
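The invariant and the bug can be demonstrated with a short sketch (illustrative Python; the delta value is arbitrary):

```python
# EMA update used by the thread (cord): 15/16 old value + 1/16 new delta.
def ema_update(ema, delta):
    return ema * 15 / 16 + delta / 16

# Buggy initialization: a freshly spawned fiber takes its first delta
# as-is, so the sum of fiber EMAs can exceed the cord EMA.
cord_ema = ema_update(0.0, 1600.0)         # -> 100.0
buggy_fiber_ema = 1600.0                   # first observation taken as-is
fixed_fiber_ema = ema_update(0.0, 1600.0)  # -> 100.0

assert buggy_fiber_ema > cord_ema          # invariant broken
assert fixed_fiber_ema == cord_ema         # invariant holds
```

Applying the same update formula to fibers and the cord keeps the per-fiber averages summing to the thread-wide average.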
Gumix added a commit to Gumix/tarantool that referenced this issue Jun 24, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction
on x86 and has the following advantages over the raw instruction:

* It checks for RDTSCP availability in CPUID.
  If RDTSCP is not supported, it switches to RDTSC.
* Linux guarantees that the clock is monotonic, hence the CPU miss
  detection is not needed.
* It works on ARM.

As for the disadvantage, this function is about 2x slower compared to a
single RDTSCP instruction. The performance degradation measured by the
fiber switch benchmark [1] is about 8% for num_fibers = 100/1000.

Closes tarantool#5869

[1] tarantool#2694 (comment)

NO_DOC=bugfix
NO_TEST=<Tested in test/app/fiber.test.lua>
Gumix added a commit to Gumix/tarantool that referenced this issue Jun 27, 2022
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction
on x86 and has the following advantages over the raw instruction:

* It checks for RDTSCP availability in CPUID.
  If RDTSCP is not supported, it switches to RDTSC.
* Linux guarantees that the clock is monotonic, hence the CPU miss
  detection is not needed.
* It works on ARM.

As for the disadvantage, this function is about 2x slower compared to a
single RDTSCP instruction. The performance degradation measured by the
fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000.

Closes tarantool#5869

[1] tarantool#2694 (comment)

NO_DOC=bugfix
NO_TEST=<Tested in test/app/fiber.test.lua>
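The change can be sketched with a Python analogue (`time.monotonic_ns()` wraps `clock_gettime(CLOCK_MONOTONIC)` on Linux; Tarantool itself makes the call from C):

```python
import time

# Monotonic clock deltas are comparable across cores and never
# negative, so no cpu-miss detection is needed. Within a single-thread
# scheduler a fiber runs uninterrupted for its slice, so the monotonic
# delta approximates the fiber's cpu time for that slice.
def slice_ns(fn):
    start = time.monotonic_ns()
    fn()
    return time.monotonic_ns() - start

delta = slice_ns(lambda: sum(range(10_000)))
assert delta >= 0   # guaranteed by monotonicity
```

The trade-off noted above is cost per call: a library call is heavier than one RDTSCP instruction, but it is portable and core-safe.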
Gumix added a commit to Gumix/tarantool that referenced this issue Feb 20, 2023
clock_gettime(CLOCK_MONOTONIC) is implemented via the RDTSCP instruction
on x86 and has the following advantages over the raw instruction:

* It checks for RDTSCP availability in CPUID.
  If RDTSCP is not supported, it switches to RDTSC.
* Linux guarantees that the clock is monotonic, hence the CPU miss
  detection is not needed.
* It works on ARM.

As for the disadvantage, this function is about 2x slower compared to a
single RDTSCP instruction. The performance degradation measured by the
fiber switch benchmark [1] is about 3-7% for num_fibers == 10-1000.

Backport of tarantool#5869

[1] tarantool#2694 (comment)

NO_DOC=bugfix
NO_TEST=<Tested in test/app/fiber.test.lua>

(cherry picked from commit 2fb7ffc)