Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler:sample/0 reports inconsistent values from child processes #5425

Closed
unflxw opened this issue Nov 19, 2021 · 5 comments
Closed

scheduler:sample/0 reports inconsistent values from child processes #5425

unflxw opened this issue Nov 19, 2021 · 5 comments
Assignees
Labels
bug Issue is reported as a bug in progress team:VM Assigned to OTP team VM

Comments

@unflxw
Copy link

unflxw commented Nov 19, 2021

Describe the bug

When calling scheduler:sample/0 from a spawned process, time values for the normal schedulers will only be reported starting from the moment the child process was started. This makes values reported from different processes inconsistent with each other. When passing samples from different child processes to scheduler:utilization, the resulting utilization values will be meaningless.

However, this behaviour only happens if the parent process has not itself called scheduler:sample/0 before -- after the parent process calls scheduler:sample/0, the values reported from the child processes will be consistent with each other.

[Note: I'm describing this in terms of scheduler:sample/0, but the same issue can be observed by calling erlang:statistics(scheduler_wall_time) directly]

To Reproduce

To simplify the resulting output, we'll call element(4, hd(element(2, scheduler:sample()))), obtaining the total time utilization of the first normal scheduler. Note that the behaviour can be observed for the total and active time values of all schedulers.

From an ERL shell, without any previous calls to scheduler:sample():

spawn(fun() -> erlang:display(element(4, hd(element(2, scheduler:sample())))) end).
timer:sleep(1000).
spawn(fun() -> erlang:display(element(4, hd(element(2, scheduler:sample())))) end).

Expected behavior

The time values returned by these two functions should be at least 1000000 microseconds apart, so that calculating the utilization from them returns meaningful values.

Alternatively, if the value is meant to have a per-process scope, then it should be documented as such, and then the values of parent processes should not leak to child processes. Note that the following does correctly return values from the spawned processes that are at least 1000000 microseconds apart:

scheduler:sample().
spawn(fun() -> erlang:display(element(4, hd(element(2, scheduler:sample())))) end).
timer:sleep(1000).
spawn(fun() -> erlang:display(element(4, hd(element(2, scheduler:sample())))) end).

That is, the moment you call scheduler:sample/0 on the parent process, values obtained by calling it from its child processes have time values that are consistent with the parent process. My hunch is that calling scheduler:sample/0 from the parent process initialises some scheduler time counters that get passed to the child process.

EDIT: it seems that calling erlang:system_flag(scheduler_wall_time, true) from the parent process has the same effect. Calling scheduler:sample() implicitly sets that flag, but when it is called in a child process, the flag is not propagated to the parent.

Affected versions
Tested on OTP 24. Older versions may be affected as well.

Additional context

I've found this issue when fetching samples from independent Elixir.Task processes and calculating the utilization.

@unflxw unflxw added the bug Issue is reported as a bug label Nov 19, 2021
@unflxw unflxw changed the title scheduler:sample/0 reports inconsistent values from spawn/1 scheduler:sample/0 reports inconsistent values from child processes Nov 19, 2021
unflxw added a commit to appsignal/appsignal-elixir that referenced this issue Nov 19, 2021
When `:scheduler.sample/0` is called from a process for the first
time, _if and only if none of its parent processes have themselves
called `:scheduler.sample/0`_, the values obtained from the call
will refer to the time the child process has been alive. As such,
calculating the scheduler utilization from two calls obtained from
different child processes returns meaningless values. This has
been [reported as a bug in the OTP repository][bug].

[bug]: erlang/otp#5425

As a workaround, this commit calls `:scheduler.sample/0` from
`Appsignal`, right before the Erlang probe is initialised, ensuring
that the Task child processes receive coherent values.
unflxw added a commit to appsignal/appsignal-elixir that referenced this issue Nov 19, 2021
When `:scheduler.sample/0` is called from a process for the first
time, _if and only if none of its parent processes have themselves
called `:scheduler.sample/0`_, the values obtained from the call
will refer to the time the child process has been alive. As such,
calculating the scheduler utilization from two calls obtained from
different child processes returns meaningless values. This has
been [reported as a bug in the OTP repository][bug].

[bug]: erlang/otp#5425

As a workaround, this commit calls `:scheduler.sample/0` from
`Appsignal`, right before the Erlang probe is initialised, ensuring
that the Task child processes receive coherent values.
unflxw added a commit to appsignal/appsignal-elixir that referenced this issue Nov 19, 2021
When `:scheduler.sample/0` is called from a process for the first
time, _if and only if none of its parent processes have themselves
called `:scheduler.sample/0`_, the values obtained from the call
will refer to the time the child process has been alive. As such,
calculating the scheduler utilization from two calls obtained from
different child processes returns meaningless values. This has
been [reported as a bug in the OTP repository][bug].

[bug]: erlang/otp#5425

As a workaround, this commit calls `:scheduler.sample/0` from
`Appsignal`, right before the Erlang probe is initialised, ensuring
that the Task child processes receive coherent values.
unflxw added a commit to appsignal/appsignal-elixir that referenced this issue Nov 19, 2021
When `:scheduler.sample/0` is called from a process for the first
time, _if and only if none of its parent processes have themselves
called `:scheduler.sample/0`_, the values obtained from the call
will refer to the time the child process has been alive. As such,
calculating the scheduler utilization from two calls obtained from
different child processes returns meaningless values. This has
been [reported as a bug in the OTP repository][bug].

[bug]: erlang/otp#5425

As a workaround, this commit calls `:scheduler.sample/0` from
`Appsignal`, right before the Erlang probe is initialised, ensuring
that the Task child processes receive coherent values.
@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Nov 19, 2021
unflxw added a commit to appsignal/appsignal-elixir that referenced this issue Nov 19, 2021
When `:scheduler.sample/0` is called from a process for the first
time, _if and only if none of its parent processes have themselves
called `:scheduler.sample/0`_, the values obtained from the call
will refer to the time the child process has been alive. As such,
calculating the scheduler utilization from two calls obtained from
different child processes returns meaningless values. This has
been [reported as a bug in the OTP repository][bug].

[bug]: erlang/otp#5425

As a workaround, this commit calls `:scheduler.sample/0` from
`Appsignal`, right before the Erlang probe is initialised, ensuring
that the Task child processes receive coherent values.
@sverker
Copy link
Contributor

sverker commented Nov 19, 2021

I think this is a matter of lacking documentation. In particular the documentation of erlang:system_flag(scheduler_wall_time, true) not stating that the setting of the flag is "process bound". If a process sets the flag and then terminates without clearing it, then the flag will be cleared automatically. However, the node global state of the flag will not be reset to false until all processes that have set the flag have cleared it or terminated.

And when the node global scheduler_wall_time flag is reset to false, the time counters are reset to zero.

So to get correct values the process that did the first call to scheduler:sample, which in turn called erlang:system_flag(scheduler_wall_time, true), must be kept alive.

@unflxw
Copy link
Author

unflxw commented Nov 22, 2021

@sverker I agree that, if this is the intended behaviour, it should be documented as such.

There's a "leaky abstraction" of sorts in having scheduler:sample turn on that flag for you. The flag is not mentioned in the scheduler module documentation (although it does mention erlang:statistics(scheduler_wall_time), which in turn mentions the flag on its documentation) but getting coherent values from calls to it in child processes requires the parent process to turn this flag on (by calling scheduler:sample or manually) and stay alive.

@sverker
Copy link
Contributor

sverker commented Nov 22, 2021

I agree.
Except that process parent-child relationships are irrelevant here. The process that turned on the flag must be kept alive, regardless of its family situation.

@sverker
Copy link
Contributor

sverker commented Nov 24, 2021

@unflxw What do you think about #5444?

@sverker
Copy link
Contributor

sverker commented Jan 11, 2022

@unflxw Thanks for the report and feedback.

@sverker sverker closed this as completed Jan 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug in progress team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

3 participants