Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use init:stop to make sure the zotonic node shuts down cleanly #2957

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mmzeeman
Copy link
Member

@mmzeeman mmzeeman commented May 5, 2022

Make sure started applications are stopped before shutting down the node

Please describe here what the PR does.

erlang:halt(0) immediately stops the node, without waiting for applications to stop
cleanly. Applications which need to do teardown work before stopping will not do this. The init application, which coordinates system startup, has a stop function which cleanly handles shutting down the node, and manage heart when needed. See: https://www.erlang.org/doc/man/init.html#stop-1

Checklist

  • documentation updated
  • tests added
  • no BC breaks

@mmzeeman mmzeeman requested a review from mworrell May 5, 2022 08:01
@mworrell
Copy link
Member

mworrell commented May 5, 2022

We added the rather brutal kill as the init:stop() tries to unload beam files, which takes very long on some systems. This seems to be fixed in OTP25 (as you mentioned).

Maybe we can stop all applications instead?

@mmzeeman
Copy link
Member Author

mmzeeman commented May 5, 2022

Something like?

[ application:stop(App) || {App, _, _} <- application:which_applications() ],

@mworrell
Copy link
Member

mworrell commented May 5, 2022

Maybe, I guess we don't have a dependency graph for a clean shutdown?

@mmzeeman
Copy link
Member Author

mmzeeman commented May 5, 2022

Here is the original issue regarding the unloading behaviour. I guess we had this problem even on faster computers because zotonic loads a lot of modules. erlang/otp#5031

@mworrell
Copy link
Member

mworrell commented May 5, 2022

Yes, I got some complaints from people, especially on non MacBook Pro machines ...

@mmzeeman
Copy link
Member Author

mmzeeman commented May 5, 2022

I don't know if the application module maintains some sort of order. It could be that they just return the applications in reverse order they where started. That would be good I guess.

@mworrell
Copy link
Member

mworrell commented May 5, 2022

Looking at the application list it seems to be that way.

@mworrell mworrell added this to the 1.0 milestone May 6, 2022
application:stop(mnesia),
application:stop(epgsql),

[ application:stop(A) || {A, _, _} <- application:which_applications() ],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we first stop the other applications and then the base apps like exometer, jobs and mnesia?

It might be that the other apps want to save something in the databases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes you are right. I noticed the order is important. When it is wrong it can lead to unexpected hangs. Maybe we should just use init:stop/0. OTP 25 is just around the corner for anybody who complains about slowness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is really slow.

We might also use init:stop/0 if the OTP version is >= 25.

And the other shutdown on older versions.

@mworrell
Copy link
Member

@mmzeeman can you check my latest change?

@mmzeeman
Copy link
Member Author

It is reporting a lot of errors.

(zotonic@Tukkal)2> zotonic:stop().
2022-05-23 10:50:05 ERROR <0.1917.0> [:] ▸ text="Error in process <0.1917.0> on node zotonic@Tukkal with exit value:\
{badarg,[{ets,lookup,\
              ['m:z_depcache$site_cafe',333],\
              [{error_info,#{cause => id,module => erl_stdlib_errors}}]},\
         {depcache,get_concurrent,5,\
                   [{file,\"/Users/mmzeeman/Work/git/zotonic/master/zotonic/_build/default/lib/depcache/src/depcache.erl\"},\
                    {line,630}]},\
         {depcache,get_ets,2,\
                   [{file,\"/Users/mmzeeman/Work/git/zotonic/master/zotonic/_build/default/lib/depcache/src/depcache.erl\"},\
                    {line,350}]},\
         {depcache,get,3,\
                   [{file,\"/Users/mmzeeman/Work/git/zotonic/master/zotonic/_build/default/lib/depcache/src/depcache.erl\"},\
                    {line,265}]},\
...

and stopping again, after it has already stopped.

(zotonic@Tukkal)3> zotonic:stop().
** exception error: bad argument
     in function  erlang:send/2
        called as erlang:send(heart,{<0.1993.0>,set_cmd,"echo ok"})
        *** argument 1: invalid destination
     in call from heart:set_cmd/1 (heart.erl, line 109)
     in call from zotonic:stop/0 (/Users/mmzeeman/Work/git/zotonic/master/zotonic/apps/zotonic_launcher/src/zotonic.erl, line 120)

Will look at it later. The order in which things are taken down is probably wrong.

@mmzeeman
Copy link
Member Author

Hmm... strange, now it won't stop at all.

> ./bin/zotonic stop
Stopping zotonic zotonic@Tukkal ................................. Still not stopped after 30 seconds

@mworrell
Copy link
Member

Maybe we should just do the brute force stop on OTP <= 24, and the nice stop on OTP 25+

@mworrell mworrell modified the milestones: 1.0, 1.1 Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants