New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent application from crashing if child process crashes - Closes #3142 #3168
Prevent application from crashing if child process crashes - Closes #3142 #3168
Conversation
It prevents the parent process to crash when child process exits
axon and axon-rpc are not maintained anymore and contain bugs that were fixed by pm2 fork.
@@ -127,6 +130,7 @@ class Controller { | |||
'lisk', | |||
['ready'], | |||
{ | |||
isLiskReady: () => this.liskReady, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we don't need this action? We must not create actions preferably if the it can be sorted through events. And lisk:ready
event is the right solution for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this when reloading the module
await this.bus.unregisterChannel(moduleAlias); | ||
|
||
// Reload the module | ||
this._loadChildProcessModule(alias, Klass, options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once child process is exit
, we must not load it again at least for this release. We can develop more dedicated reload strategy later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was discussed and decided to reload the faulty module. It can be improved in next versions but it works pretty well as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No its not the right solution, it can lead into infinite reloading and exhaust the system memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For current release its good if parent process exit wit proper logs if child process is crashed. Later we can develop a strategy, how to reload modules, or even have configurable options with module load retry etc. This topic needed to have more thought process and a refined solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can improve the solution later on, but the resilience should be there in 1.6.0.
|
||
channel.once('lisk:ready', () => { | ||
if (isLiskReady) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the right way to do it, lisk:ready
event is triggered after load
is called. So it must be listened as of previous implementation. I suspect you changed this because of module reload, if yes then as I told earlier, we should develop that strategy as an enhancement later. And leave that implication as it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was introduced to allow modules to be reloaded in case it crashes and I don't see the reason of reverting this implementation if it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If child process cause some memory leaks crash, then this infinite loading will exhaust the resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and we plan to improve this already. This is the first implementation of child process and wont be the final one. Knowing that this implementation can be unstable, I've created a property in the config file to completely disable it if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is not related to stability. If there is an issue and any of code crash the whole application exited. That's exactly the default behaviour we have from start and we should keep it consistent in this release. In next release we develop resilience with a detailed and extensive approach.
…tion-crashing-with-child
Requested changes are addressed. |
What was the problem?
Application was crashing if a child process exited
How did I fix it?
I've implemented a reload mechanism that reload child process when it crashes without impacting the rest of application
How to test it?
Run the application with
httpApi
andchain
modules as child process:NODE_ENV=test LISK_CHILD_PROCESS_MODULES=httpApi,chain node src/index.js -n devnet
Kill
httpApi
process (it should reload the module and keep running normally):pgrep -f http_api | xargs kill -9
Kill
chain
process (it should reload the module and keep running normally):pgrep -f chain | xargs kill -9
Review checklist