How do we see the details if the service shows the state "error" #92

Closed
lacivert opened this Issue Dec 27, 2016 · 25 comments

5 participants

lacivert commented Dec 27, 2016

I installed mysql via brew, and yesterday the list command showed that it was working:

brew services list

But today I see that the mysql service has the "error" state, even though it appears to work without a problem.

How can we find out why brew services list reports an error, and how do we see the details of this "error" state?

lacivert commented Dec 27, 2016

$ brew services list
Name           Status  User  Plist
docker-machine stopped       
mysql          error   yasin /Users/yasin/Library/LaunchAgents/homebrew.mxcl.mysql.plist
postgresql     stopped 

Contributor

ilovezfs commented Dec 27, 2016

You can enable logging of stdout and stderr to files by adding the StandardOutPath and StandardErrorPath keys to the plist, which should shed some light on what's happening.

lacivert commented Dec 28, 2016

Thanks @ilovezfs. Sorry, but I don't understand what your comment refers to. How exactly can I do that (or what should I search for on Google to learn what these mean)?

Contributor

ilovezfs commented Dec 28, 2016

--- homebrew.mxcl.mysql.plist.orig	2016-12-28 04:11:24.000000000 -0800
+++ homebrew.mxcl.mysql.plist	2016-12-28 04:13:54.000000000 -0800
@@ -16,5 +16,9 @@
   <true/>
   <key>WorkingDirectory</key>
   <string>/usr/local/var/mysql</string>
+  <key>StandardOutPath</key>
+  <string>/tmp/homebrew.mxcl.mysql.stdout.log</string>
+  <key>StandardErrorPath</key>
+  <string>/tmp/homebrew.mxcl.mysql.stderr.log</string>
 </dict>
 </plist>

That diff to the plist should work.
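
For anyone who would rather not hand-edit the XML, the same two keys can probably be added with macOS's PlistBuddy. This is only a sketch: the function name print_logging_commands is made up for illustration, the log paths follow the pattern used in the diff above, and the function only prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper (not part of brew services): prints the PlistBuddy
# commands that would add the two logging keys to a launchd plist.
# Nothing is executed; review the output and run it yourself.
print_logging_commands() {
  plist="$1"                         # path to the service's plist
  label=$(basename "$plist" .plist)  # e.g. homebrew.mxcl.mysql
  for stream in stdout stderr; do
    case "$stream" in
      stdout) key=StandardOutPath ;;
      stderr) key=StandardErrorPath ;;
    esac
    printf '/usr/libexec/PlistBuddy -c "Add :%s string /tmp/%s.%s.log" "%s"\n' \
      "$key" "$label" "$stream" "$plist"
  done
}
```

For example, print_logging_commands ~/Library/LaunchAgents/homebrew.mxcl.mysql.plist prints two commands, one per key. After running them, restarting the service (for instance with brew services restart mysql) should make launchd pick up the new keys.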

Member

MikeMcQuaid commented Dec 29, 2016

If a service shows "error", that means launchd has detected an error since the process was started. The process may still be running on your machine; the status only means that launchd has detected an issue. If it's working fine you can ignore it.

CC @antstorm for additional thoughts/corrections and how we can perhaps adjust this to improve the usability.

Contributor

antstorm commented Dec 29, 2016

@lacivert so the error status means one of two things: either launchd doesn't have the PID of the process, or the exit code was non-zero. Both mean that there's something wrong with the service and launchd won't be able to handle it properly.

Can you please copy the output of launchctl list | grep mysql and ps aux | grep [m]ysql?
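
The rule described above can be sketched as a small shell function. This is an illustration only, not the actual brew services code: the function name classify_launchctl_line is hypothetical, and it assumes each line of launchctl list output has three whitespace-separated columns (PID, last exit status, label), as in the outputs quoted in this thread.

```shell
#!/bin/sh
# Sketch of the rule from the comment above; NOT the real brew services logic.
# Input: one line of `launchctl list` output (PID, last exit status, label).
# Prints "error" if the PID is missing ("-") or the exit status is non-zero,
# otherwise prints "started".
classify_launchctl_line() {
  printf '%s\n' "$1" | {
    # read splits on tabs and spaces alike
    read -r pid status label
    if [ "$pid" = "-" ] || [ "$status" != "0" ]; then
      echo error
    else
      echo started
    fi
  }
}
```

It could be fed from the real command, for example: launchctl list | grep mysql | while IFS= read -r line; do classify_launchctl_line "$line"; done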

lacivert commented Dec 29, 2016

@antstorm, un(!)fortunately mysql now shows as "started" without the error state, but I'll respond if I get the same issue again.

Contributor

antstorm commented Dec 29, 2016

@lacivert oh, I see. Well in case that happens it's worth looking at launchctl list to see what exactly is going on.

Member

MikeMcQuaid commented Dec 30, 2016

Closing this as we've got instructions on what to do here.

@antstorm Thoughts about how to expose or document this better? Put an explanation in the README? If so, is that something you could open a PR for? Thanks!

Contributor

antstorm commented Dec 31, 2016

@MikeMcQuaid yeah, I'll open a PR. Apart from the README, we could use the unknown status (which is slightly gentler than error) when the exit code was 0 but the pid is missing.

Member

MikeMcQuaid commented Dec 31, 2016

> @MikeMcQuaid yeah, I'll open a PR. Apart from the README, we could use the unknown status (which is slightly gentler than error) when the exit code was 0 but the pid is missing.

@antstorm Thanks! I reckon a missing pid should probably also just be shown as started.

Contributor

antstorm commented Jan 2, 2017

@MikeMcQuaid not really; a very common case here is redis. It has daemonize yes in its default settings, which somehow conflicts with launchd, resulting in it not running and:

$ launchctl list | grep redis
-	0	homebrew.mxcl.redis

launchd might also try, unsuccessfully, to restart the service every 10 seconds, depending on the settings in the plist.
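
A quick way to check a config file for the setting mentioned above might look like this. It is a rough sketch: the function name daemonizes is made up, and the location of redis.conf is not given in this thread, so any path passed to it is an assumption about the local install.

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit status 0) if the given config file has
# an active, uncommented "daemonize yes" line, and fails otherwise.
daemonizes() {
  grep -Eq '^[[:space:]]*daemonize[[:space:]]+yes([[:space:]]|$)' "$1"
}
```

Usage could be something like: daemonizes /usr/local/etc/redis.conf && echo "redis is set to daemonize itself", where the path is only a guess at a typical Homebrew layout.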

Member

MikeMcQuaid commented Jan 2, 2017

@antstorm Thanks for fixing up redis. My view is that we should reserve the non-started statuses for cases where we know 100% that the service has failed. That way we improve on the existing brew services interface without causing confusion when the service is working fine (and just the plist has been misconfigured). I'd agree with showing unknown if HOMEBREW_DEVELOPER is set in the environment, though, as those are the folks who might know enough to fix it.

Contributor

antstorm commented Jan 2, 2017

@MikeMcQuaid I think both a missing pid and a non-zero exit code are good indicators that there's something wrong with the service. error might be a bit too harsh, but started is just as confusing, because probably nothing's running.

I wonder if exposing the pid would make a difference in reporting what's happening with the service. It's not for me to decide, but it would be really great to have all the info needed to figure out what's happening in brew services list, without having to run ps aux, launchctl list and so on…

Member

MikeMcQuaid commented Jan 2, 2017

> I think both a missing pid and a non-zero exit code are good indicators that there's something wrong with the service. error might be a bit too harsh, but started is just as confusing, because probably nothing's running.

Sure. I think the difference is that one indicates a failure in starting and one may just indicate that the Homebrew plist has been misconfigured. Given that in the latter case we don't have 100% certainty that the service isn't running, I think we should err on the side of maintaining the previous interface for non-Homebrew developers.

> I wonder if exposing the pid would make a difference in reporting what's happening with the service. It's not for me to decide, but it would be really great to have all the info needed to figure out what's happening in brew services list, without having to run ps aux, launchctl list and so on…

More subcommands that wrap launchctl would certainly be useful 👍

Contributor

antstorm commented Jan 2, 2017

@MikeMcQuaid

> Sure. I think the difference is that one indicates a failure in starting and one may just indicate that the Homebrew plist has been misconfigured. Given that in the latter case we don't have 100% certainty that the service isn't running, I think we should err on the side of maintaining the previous interface for non-Homebrew developers.

Not really, if the service simply decided to stop executing (shouldn't normally happen, but still very possible). A missing pid also means that stopping the service with brew services stop SERVICE won't actually do anything apart from removing the plist. What I'm getting at is that I'd rather know there's something wrong with the service than have to check manually just to find out it's not actually running.

But then again, it's only my opinion… 😄 Perfectly happy to agree to disagree on this one and do whatever you think is better for the project and its users.

Member

MikeMcQuaid commented Jan 2, 2017

> But then again, it's only my opinion… 😄 Perfectly happy to agree to disagree on this one and do whatever you think is better for the project and its users.

Yeah, I think having it be an optional flag for now would be best 👍

lacivert commented Jan 3, 2017

My 5 pennies: if the state is "unknown" to brew, wouldn't it be better to show "unknown" when we run "brew services list"?

uicosp commented Jan 9, 2017

I had the same issue today.
After updating (brew itself and the formulae), my dnsmasq and nginx show the error status, though everything works well.

⋊> ~ brew services list                                                 16:57:59
Name    Status  User Plist
dnsmasq error root /Library/LaunchDaemons/homebrew.mxcl.dnsmasq.plist
nginx   error root /Library/LaunchDaemons/homebrew.mxcl.nginx.plist

I debugged as @ilovezfs suggested, but there was nothing in stdout.log or stderr.log.

Then I read the introduction of A launchd Tutorial and learned that there are several locations where a job (.plist file) can be defined. It also says:

> launchd differentiates between agents and daemons. The main difference is that an agent is run on behalf of the logged in user while a daemon runs on behalf of the root user or any user you specify with the UserName key.

So I checked those locations and found extra dnsmasq and nginx .plist files in ~/Library/LaunchAgents.

So the problem is that the dnsmasq and nginx .plist files in /Library/LaunchDaemons were loaded and started successfully when the system (root) booted. Then the files in ~/Library/LaunchAgents were loaded on behalf of the current user, which tried to start the services again. Since they were already running, a non-zero exit code was returned, which is why we see the error status when running brew services list.

Running sudo launchctl list | grep nginx, I got this:

⋊> ~ sudo launchctl list | grep nginx                                   17:33:09
52	0	homebrew.mxcl.nginx

while running launchctl list | grep nginx gave:

⋊> ~ launchctl list | grep nginx                                   17:33:39
-	2	homebrew.mxcl.nginx

How to fix
Simply delete the extra files in ~/Library/LaunchAgents, and restart your computer.
You will see the status is started now!
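
To spot this situation before deleting anything, a check along these lines might help. It is a sketch under assumptions: the function name find_duplicate_plists is made up, and the default directories are simply the two locations mentioned in this comment.

```shell
#!/bin/sh
# Hypothetical helper: prints the names of Homebrew plists that exist in both
# a per-user agents directory and the system daemons directory.
find_duplicate_plists() {
  agents="${1:-$HOME/Library/LaunchAgents}"
  daemons="${2:-/Library/LaunchDaemons}"
  for plist in "$agents"/homebrew.mxcl.*.plist; do
    [ -e "$plist" ] || continue      # the glob matched nothing
    name=$(basename "$plist")
    if [ -e "$daemons/$name" ]; then
      echo "$name"
    fi
  done
}
```

Called with no arguments it compares the two default locations; any name it prints is defined both as a user agent and as a system daemon and is a candidate for the cleanup described above.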

Contributor

antstorm commented Jan 9, 2017

@uicosp I was able to replicate that behaviour by running a service (nginx) as root and then as another user: it failed to start the service the second time, but the plist file was created anyway, hence the error.

I wonder if brew services should prevent starting a service if it's already being run by a different user…

Member

MikeMcQuaid commented Jan 9, 2017

> I wonder if brew services should prevent starting a service if it's already being run by a different user…

@antstorm If it's possible to query that as a non-root user: it seems like a good idea.

Contributor

antstorm commented Jan 9, 2017

@MikeMcQuaid should be possible, because we already do that to show services started as root. That would only rely on the plist file's presence though, but still might be worth implementing. WDYT?

Member

MikeMcQuaid commented Jan 9, 2017

@antstorm I agree it's worth implementing. Want to give it a shot?

Contributor

antstorm commented Jan 9, 2017

@MikeMcQuaid yeah, absolutely

Member

MikeMcQuaid commented Jan 11, 2017

Going to close this because I consider it improved in a few ways.
