[erts] Expose parent process via process_info(_, parent) #5768


Merged: 2 commits merged into erlang:master from max-au/process-info-parent on Mar 28, 2022

Conversation

@max-au (Contributor) commented Mar 5, 2022

Parent process information is available in the VM process structure,
but is not exposed via process_info. Sometimes it is very convenient
to know who spawned processes that do not implement any OTP
behaviour (for OTP behaviours, this information is stored in the process dictionary).

In addition to counting reductions, it may also be important to keep
track of the total number of messages sent to a process. This enables various
monitoring scenarios. For example, when message_queue_len is draining slowly,
is it because the process is slow, or because it keeps getting more messages
enqueued? Having both message_queue_len and messages_enqueued
provides a way to calculate the queue fill rate (or drain rate) over time.
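
As a rough illustration of the intended use, here is a minimal sketch. The module and function names are made up, and messages_enqueued is the item proposed in this PR, not an existing process_info item:

```erlang
%% Sketch only: 'messages_enqueued' is the proposed item, not an existing BIF argument.
-module(queue_rate).
-export([parent_of/1, fill_rate/2]).

%% Who spawned Pid (the 'parent' item proposed in this PR).
parent_of(Pid) ->
    {parent, Parent} = erlang:process_info(Pid, parent),
    Parent.

%% Sample twice, IntervalMs apart, and compute enqueue/drain rates per second.
fill_rate(Pid, IntervalMs) ->
    {Len0, Enq0} = sample(Pid),
    timer:sleep(IntervalMs),
    {Len1, Enq1} = sample(Pid),
    EnqueueRate = (Enq1 - Enq0) * 1000 / IntervalMs,
    %% Messages dequeued = messages enqueued minus queue growth.
    DrainRate = EnqueueRate - (Len1 - Len0) * 1000 / IntervalMs,
    {EnqueueRate, DrainRate}.

sample(Pid) ->
    [{message_queue_len, Len}, {messages_enqueued, Enq}] =
        erlang:process_info(Pid, [message_queue_len, messages_enqueued]),
    {Len, Enq}.
```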

@github-actions bot (Contributor) commented Mar 5, 2022

CT Test Results

3 files, 125 suites, 36m 40s ⏱️
1 435 tests: 1 389 ✔️ passed, 43 💤 skipped, 3 ❌ failed
1 733 runs: 1 668 ✔️ passed, 60 💤 skipped, 5 ❌ failed

For more details on these failures, see this check.

Results for commit df95635.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.

Artifacts

// Erlang/OTP Github Action Bot

@max-au max-au force-pushed the max-au/process-info-parent branch 2 times, most recently from e1c1496 to 3775bb3 on March 6, 2022 01:32
@max-au max-au changed the title from "[erts] Expose parent process via process_info(_, parent)" to "[erts] Expose parent and number of enqueued messages" on Mar 6, 2022
@potatosalad (Contributor)

@max-au erl_process.c currently supports having parent set to NIL, although right now it's admittedly abnormal for the majority of processes (maybe just the init process today?).

However, this "orphaned" process creation is used as part of RFC: Erlang Dist Filtering, for the Erlang processes spawned to accept or reject the incoming dist command. There may be a different way to accomplish this, but I thought it worth mentioning that parent may not always be a pid().

@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Mar 7, 2022
@rickard-green rickard-green self-assigned this Mar 7, 2022
@max-au max-au force-pushed the max-au/process-info-parent branch from 1d5e097 to aa3e157 on March 7, 2022 17:49
@max-au (Contributor, Author) commented Mar 7, 2022

Fixed a bunch of typos and updated documentation.

I see two ways to solve the problem with init (or other parent-less processes):

  • return the Pid of the process itself (as in, init is self-spawned). This is what Unix does and I like it more, so I updated the patch to do exactly this
  • return the undefined atom

@rickard-green (Contributor)

@max-au sorry for the late answer. It has been really busy lately...

Regarding process_info(Pid, messages_enqueued):

  • I don't like to introduce things that break the signal order. One could perhaps get around that by documenting the value as approximate. That way it can be considered as not breaking the signal order.
  • I suspect that a lot of systems will never use it. All processes in all systems will have to carry the cost of the extra word in the process structure, though. It is very easy to add just another word, since it is just a word, but we could end up with a large amount of normally unused words in all process structures, so we need to be a bit restrictive when it comes to adding stuff to the process structure. We have process specific data (search for PSD in erl_process.h) which is intended to be used for rarely used features. When used, an array, currently 10 words large, is allocated to hold such rarely used values. Once allocated for a process, it will stay there until the process terminates. If you were to add this word to PSD and check messages_enqueued on all processes, though, all processes in your system would allocate 11 more words. Here it would have been nice with something in between PSD and putting it into the process structure...
  • I also suspect that one would actually want a completely different approach than polling processes for information. If I understand correctly, you poll processes repeatedly to fetch this information and then make decisions based on that. Frequently fetching information from processes this way is quite a heavy task. It is also just a snapshot, so you might miss peaks, for example in message queue length (although in this case the value is monotonically increasing). I suspect that functionality like this would fit much better in erlang:system_monitor(). It won't fit "as is", but, for example, by making it possible to set a limit in system monitor for fill rate and/or drain rate. I guess this would be much more efficient when looking at the total amount of work performed for the whole operation.

I don't see any major issues with process_info(Pid, parent). One thing, though, is that processes that have a remote process as parent currently do not have it saved in the parent field. Saving the remote pid would add a couple of words to the heap of such a process, but the majority of processes on a system won't be remotely spawned, and I think it could be useful information. A bit funny that I argue both for not using more memory and for using more memory in the same comment :-)

@okeuday (Contributor) commented Mar 21, 2022

@max-au If a remote parent is not set, it seems best to have the parent in that case be undefined instead of self()

@rickard-green (Contributor)

@max-au If a remote parent is not set, it seems best to have the parent in that case be undefined instead of self()

Yes, I agree

@max-au max-au force-pushed the max-au/process-info-parent branch from aa3e157 to f2560cf on March 21, 2022 21:15
@max-au max-au changed the title from "[erts] Expose parent and number of enqueued messages" to "[erts] Expose parent process via process_info(_, parent)" on Mar 21, 2022
Parent process information is available in the VM process structure,
but is not exposed via process_info. Sometimes it is very convenient
to know who spawned processes that do not implement any OTP
behaviour (for OTP behaviours, this information is stored in the process dictionary).

The parent process is returned only for locally started processes. Remotely
started processes, or the 'init' process that has no parent, return 'undefined'.

@max-au (Contributor, Author) commented Mar 21, 2022

I updated this PR:

  • removed messages_enqueued in favour of a separate PR for a more focused discussion
  • rebased to master branch
  • changed to return undefined instead of self() for processes that have no local parent.

I explained my thought process for self() in the comment above (this is what Unix does). It also keeps the return spec simpler (there is always a pid() returned). But as I outlined above, I'm not against returning undefined either.
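
For illustration, a sketch of the behaviour described in the updated commit message, as a shell session (the pids shown are made up):

```erlang
1> Pid = spawn(fun() -> receive stop -> ok end end).
<0.90.0>
2> process_info(Pid, parent).
{parent,<0.88.0>}            % the shell process that called spawn/1
3> process_info(whereis(init), parent).
{parent,undefined}           % init has no parent
```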

@max-au max-au force-pushed the max-au/process-info-parent branch from f2560cf to d0e45b4 on March 21, 2022 21:28
@rickard-green (Contributor)

I explained my thought process for self() in the comment above (this is what Unix does). It also keeps the return spec simpler (there is always a pid() returned). But as I outlined above, I'm not against returning undefined either.

Yes, I think the self() approach is nice, but I feel the undefined approach is more consistent with how we usually do it.

I've pushed a commit which adds support for remote parents.

We are about to release 25-rc2 really soon. This won't make it into rc2, but I'll put this into testing as soon as rc2 is out.

@rickard-green rickard-green force-pushed the max-au/process-info-parent branch from a6d3d61 to df95635 on March 22, 2022 02:42
@max-au (Contributor, Author) commented Mar 22, 2022

Thanks!
I don't have a good benchmark for remote spawn/erpc, but just looking at the code, it should not cause a significant regression.

@max-au (Contributor, Author) commented Mar 22, 2022

I suspect that functionality like this would fit much better in erlang:system_monitor().

I happened to have this PR in the works as well. It introduces a long_queue monitoring event (similar to large_heap, but triggered by total queue size). However, the actual functionality we need (and internally still use) is this: max-au@a3d7f60

Every N seconds we iterate over processes to find the queue fill/drain rate. This way we find processes that are building up queues over time. It could also be implemented as a system_monitor flag (something like {queue_fill_rate, 200, 1000}, meaning "send a monitoring message if the process message queue grows by 200 messages in 1000 milliseconds"), but that would be even more expensive for all ERTS users. We also monitor process message queues (enqueued/dequeued messages) to spot irregularities in system behaviour. It may be a good solution to make this optional (controlled by a process flag), but that is also extra complexity in ERTS I was hoping to avoid. I should probably make a post on the Erlang forums for a wider discussion, but I'd first appreciate it if @rickard-green already has something to suggest.
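
A minimal sketch of such a periodic scan, assuming only the existing message_queue_len item (so it tracks queue growth rather than true fill/drain rates); the module and function names are made up:

```erlang
-module(queue_scan).
-export([growing/2]).

%% Return processes whose message queue grew by more than Threshold
%% messages over IntervalMs.
growing(Threshold, IntervalMs) ->
    Before = snapshot(),
    timer:sleep(IntervalMs),
    After = snapshot(),
    [{Pid, Growth}
     || {Pid, Len1} <- maps:to_list(After),
        Growth <- [Len1 - maps:get(Pid, Before, Len1)],  % growth is 0 for new pids
        Growth > Threshold].

%% Map of Pid => message_queue_len; processes that exit mid-scan are skipped.
snapshot() ->
    maps:from_list(
      [{Pid, Len}
       || Pid <- erlang:processes(),
          {message_queue_len, Len} <- [erlang:process_info(Pid, message_queue_len)]]).
```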

@rickard-green (Contributor)

I suspect that functionality like this would fit much better in erlang:system_monitor().

I happened to have this PR in the works as well. It introduces a long_queue monitoring event (similar to large_heap, but triggered by total queue size). However, the actual functionality we need (and internally still use) is this: max-au@a3d7f60

I guess that by "total queue size" you mean the value in the sig_qs.len field?

It is okay to base such functionality on this value as long as it does not claim it is measuring the message queue length. Calling it signal queue length is okay, though.

Every N seconds we iterate over processes to find the queue fill/drain rate. This way we find processes that are building up queues over time. It can also be implemented as part of system_monitor flag (something like {queue_fill_rate, 200, 1000} meaning "send a monitoring message if process message queue grows by 200 messages in 1000 milliseconds"), but it would be even more expensive for all ERTS users.

What do you mean by "would be even more expensive for all users"? Repeated process_info() calls vs the system monitor, or signal queue length vs queue fill rate?

We also monitor process message queues (enqueued/dequeued messages) to see irregularities in system behaviour. It may be a good solution to have this optional (controlled by process flag), but that is also extra complexity in ERTS I was hoping to avoid. I should probably make a post on Erlang forums for a wider discussion, but I'd first appreciate if @rickard-green already has something to suggest.

I don't see it as acceptable that users not interested in this should pay anything other than a negligible performance/memory cost for it. If extra complexity is needed in order to achieve this, I see that as unavoidable.

Using PSD for holding data related to this type of monitoring would also not be as costly memory-wise if it is only enabled on a subset of processes.

@rickard-green (Contributor)

Exactly what type of system monitor limit one would want is, I think, good to discuss on the Erlang forums. My guess is that a system monitor limit on signal queue length would be appreciated by many and would not require any extra process-specific memory usage. This is probably something that should be added regardless of whether even "fancier" system monitor limits are added or not.

@rickard-green (Contributor)

I suspect that functionality like this would fit much better in erlang:system_monitor().

I happened to have this PR in the works as well. It introduces a long_queue monitoring event (similar to large_heap, but triggered by total queue size). However, the actual functionality we need (and internally still use) is this: max-au@a3d7f60

I guess that by "total queue size" you mean the value in the sig_qs.len field?

It is okay to base such functionality on this value as long as it does not claim it is measuring the message queue length. Calling it signal queue length is okay, though.

@max-au I spoke without thinking... This is not the signal queue length either, so one cannot document it as such. It actually is not possible to give a good description of what it represents except under certain very specific conditions. Its value is somewhere between the message queue length and some definition of signal queue length. So by using it one will get some sloppy value of something that is a message-signal queueish lengthish yuck... :-( At the moment I'm too tired to think of alternatives, but I'll try to figure something out later...

@max-au (Contributor, Author) commented Mar 23, 2022

I am thinking of it as a "total amount of messages" sent to the process (both in the queue and in flight).

The caveat I see is that signals are also counted towards sig_qs.len and it would be expensive to keep track of non-message signal count (and in fact NM-signals are also affecting process performance/reductions).

@KennethL (Contributor)

My understanding is that both signals and messages must be handled by the process, so in practice they all add to the backlog of things the process has to handle. And it is processes beginning to build up too big a backlog that we want to catch with the system_monitor.

@rickard-green (Contributor)

@max-au wrote:

I am thinking of it as a "total amount of messages" sent to the process (both in the queue and in flight).

The caveat I see is that signals are also counted towards sig_qs.len and it would be expensive to keep track of non-message signal count (and in fact NM-signals are also affecting process performance/reductions).

Only message signals are included in sig_qs.len, but it also includes message signals that have not yet been received. Not all such unreceived message signals are included, though: there might be a large amount of message signals in sig_inq as well, especially in situations where we have a large sig_qs, since we only fetch from sig_inq when we have run out of things to handle in sig_qs. You, however, really do not want to mess with sig_inq. It is not only protected by locking, but might nowadays also consist of up to 64 separate queues.

When having a look at the code I also realized that if there is a process-info request signal in the queue, updates of sig_qs.len are delayed as well. That is, reading it at an arbitrary time may give you a value significantly smaller than even the message queue length.

In other words, it is more or less impossible to describe what the sig_qs.len value represents at an arbitrary point in time...

@KennethL wrote:

My understanding is that both signals and messages must be handled by the process, so in practice they all add to the backlog of things the process has to handle. And it is processes beginning to build up too big a backlog that we want to catch with the system_monitor.

Yes, such a system monitor limit is preferably based on some definition of signal queue length. A gen_server call implies 3 signals (monitor, message, demonitor) from the client, so message signals are even in the minority for this operation from the server's point of view. Such a value is, however, currently not easily accessible. If one restricts it to the signal queue length of the private signal queue, and accepts increased memory consumption and decreased performance, it can, however, easily be implemented. By adding two 64-bit words to each process, a couple of additions when fetching, and a lot of subtractions (at every removal of a signal), one would get a value representing the signal queue length private to the process. There could, however, still be an unlimited amount of signals in transit to the process that won't be accounted for. If one tried to account for signals in transit it would be a much greater performance and scalability loss, so I don't see that as a viable option at all.
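
To illustrate where the three signals mentioned above come from, here is a deliberately simplified sketch of the client side of a gen_server call; the real gen:call/4 in OTP additionally handles monitor aliases, node-down reasons, and so on:

```erlang
-module(call_sketch).
-export([call/3]).

call(Server, Request, Timeout) ->
    MRef = erlang:monitor(process, Server),           %% signal 1: monitor
    Server ! {'$gen_call', {self(), MRef}, Request},  %% signal 2: the request message
    receive
        {MRef, Reply} ->
            erlang:demonitor(MRef, [flush]),          %% signal 3: demonitor
            Reply;
        {'DOWN', MRef, process, Server, Reason} ->
            exit(Reason)
    after Timeout ->
            erlang:demonitor(MRef, [flush]),
            exit(timeout)
    end.
```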

@rickard-green (Contributor) commented Mar 23, 2022

If using sig_qs.len for a system monitor, one needs to be really fuzzy about what it is. A system monitor limit called {incoming_message, Low, High} could perhaps be documented as:

If the system detects an amount of incoming messages higher than High, it will send you a monitor message, where "incoming messages" is defined as messages in the message queue and in transit. No more monitor messages will be sent until the system has interpreted the amount of incoming messages as lower than Low. Note that the system might misinterpret the situation.
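
As a purely hypothetical sketch of what such a limit could look like next to an existing option: large_heap is a real erlang:system_monitor/2 limit, while the {incoming_message, Low, High} tuple below does not exist in OTP and is only the proposal discussed above.

```erlang
%% A throwaway monitor process that just prints whatever it receives.
Mon = spawn(fun Loop() -> receive Msg -> io:format("~p~n", [Msg]) end, Loop() end),
erlang:system_monitor(Mon, [{large_heap, 8 * 1024 * 1024}]),   %% existing limit
%% The proposed limit might then be enabled the same way, e.g.:
%%   erlang:system_monitor(Mon, [{incoming_message, 1000, 10000}]),  %% {Low, High} -- hypothetical
%% with the monitor process receiving a message once High is exceeded, as described above.
ok.
```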

@max-au (Contributor, Author) commented Mar 23, 2022

@rickard-green
I think that most usages of a "long message queue" monitoring event would be to implement a "circuit breaker". For example, calling pg:leave for a process that has accumulated a long message queue, to provide overload protection (or changing message_queue_data to off_heap).

I wonder how long it might take for sig_inq to be moved to sig_qs. My reading suggests that it may not happen for quite some time. If that is true, it would be useful to have sig_inq taken into account.

An implementation that does not take sig_inq into account is cheap. It only affects how sig_inq gets moved to sig_qs (updating sig_qs.len). Before OTP 25, the inner queue could not contain any non-message signals, so it was easy to document this as "message queue length". With receive markers in the inner queue, that explanation is no longer valid.
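
A minimal sketch of the circuit-breaker idea mentioned above, assuming the caller polls the queue length itself; the module name and threshold handling are made up, while pg:leave/2 is the real pg API (default scope):

```erlang
-module(queue_breaker).
-export([maybe_shed_load/3]).

%% Leave the pg group (stop receiving new group-routed work) if Pid's
%% message queue has grown past MaxLen; the caller can rejoin the group
%% once the backlog has drained.
maybe_shed_load(Group, Pid, MaxLen) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, Len} when Len > MaxLen ->
            pg:leave(Group, Pid),
            {shed, Len};
        {message_queue_len, Len} ->
            {ok, Len};
        undefined ->
            {error, noproc}   %% Pid is no longer alive
    end.
```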

@rickard-green (Contributor)

@rickard-green I think that most usages of a "long message queue" monitoring event would be to implement a "circuit breaker". For example, calling pg:leave for a process that has accumulated a long message queue, to provide overload protection (or changing message_queue_data to off_heap).

I wonder how long it might take for sig_inq to be moved to sig_qs.

It can take a very long time and it can contain a huge amount of signals. There is no hard upper limit.

My reading suggests that it may not happen for quite some time. If that is true, it would be useful to have sig_inq taken into account.

I don't see it as viable to take sig_inq into account, both because it would already today be a scalability and performance bottleneck, and because it could prevent future optimizations if introduced. The fact that it could prevent future optimizations is, to me, even worse than the fact that it would be a bottleneck.

An implementation that does not take sig_inq into account is cheap. It only affects how sig_inq gets moved to sig_qs (updating sig_qs.len). Before OTP 25, the inner queue could not contain any non-message signals, so it was easy to document this as "message queue length". With receive markers in the inner queue, that explanation is no longer valid.

No, it cannot be described as message queue length. Not even before receive markers. It is a mixture of the message queue length and the amount of some messages in transit, but with hiccups which can give you values less than the message queue length. If one were to use it, one needs to document it as something extremely fuzzy, like in my comment above. Otherwise, someone will want a bug fix on it some day.

@rickard-green rickard-green added the testing currently being tested, tag is used by OTP internal CI label Mar 24, 2022
@rickard-green rickard-green merged commit 06b8cee into erlang:master Mar 28, 2022
@max-au max-au deleted the max-au/process-info-parent branch September 30, 2022 20:12