Program crashed #19

Closed
dananick opened this issue Jun 20, 2013 · 24 comments

@dananick

Where are the logs stored, so I can find out why groundcontrol suddenly stopped working after a few days?

@jondot
Owner

jondot commented Jun 20, 2013

Hello,

Ground Control logs to STDOUT by default, so wherever you've redirected it is where your logs should be (e.g. groundcontrol > /var/log/your-log.log).
If you're not sure how to customize an init script to do that, let me know and I can gist an example for you.
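
A minimal sketch of the redirection mentioned above. The `fake_groundcontrol` function is a hypothetical stand-in for the real binary so the snippet runs anywhere, and the temporary directory stands in for /var/log. It also captures standard error, on the assumption that crash output is of interest: Go programs write runtime panics to stderr, which a plain `>` does not catch.

```shell
#!/bin/sh
# Hypothetical stand-in for the real binary: one line on stdout, one on stderr.
fake_groundcontrol() {
    echo "Launching Control"
    echo "panic: example" >&2
}

logdir="$(mktemp -d)"

# stdout only, truncating the file on every start.
# The stderr line is not written to the file.
fake_groundcontrol > "$logdir/stdout-only.log"

# stdout and stderr, appended, so earlier runs and crash output survive.
fake_groundcontrol >> "$logdir/groundcontrol.log" 2>&1
fake_groundcontrol >> "$logdir/groundcontrol.log" 2>&1

echo "logs written to $logdir"
```

With the real binary the second form would look like `groundcontrol -config=<your conf file> >> /var/log/groundcontrol.log 2>&1`.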

@dananick
Author

I looked it up and I am not sure; I am not an expert in Linux... I'd love an example. Will the log keep appending? Eventually it would get pretty big, wouldn't it? I am all about headless and touchless with my Pi.

Perhaps this could be a feature request: log the output to a file for x days or something?

@jondot
Owner

jondot commented Jun 20, 2013

Yes, logs on the Pi can do 2 bad things: (1) grow without attention, (2) trash the SD card with needless I/O.

So to solve (1) there's what's called 'log rotating'. Everyone likes to set their own policies, and there are also built-in Linux tools to do that. Additionally, each distribution may decide on its own how to do log rotating.

A good option might be to log to syslog, assuming most if not all Pi distros already logrotate that.
I think I could test that solution in the coming days.

For now as long as Ground Control is under an init script, even if it crashes due to some exceptional case, init should bring it back up.
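
To illustrate the log rotating idea, here is a sketch of a logrotate policy that keeps roughly a week of logs. The log path and the 7-day retention are assumptions for illustration, not something Ground Control ships. `copytruncate` is used because a process whose output was redirected by the shell keeps writing to the same open file.

```shell
#!/bin/sh
# Write the example policy to a temporary file so it can be inspected safely.
conf="$(mktemp)"

cat > "$conf" <<'EOF'
/var/log/groundcontrol.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
EOF

echo "example policy written to $conf"

# Dry run (reports what would be done without touching the log):
#   logrotate -d "$conf"
# To enable it, the file would be copied to /etc/logrotate.d/ as root.
```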

@dananick
Author

Yup, I got it going again by restarting the init script. Just curious about what brought it down.

@dananick
Author

OK, it keeps going down after a couple of days. I think I need some help redirecting output to /var/log, please?

@dananick dananick reopened this Jun 24, 2013
@spieiga

spieiga commented Jun 26, 2013

Yeah, mine also crashed after running for a day. Not sure how to see why. Could I just change the command in init.d to this?

command="${gc_bin} -- -config ${gc_conf} > /var/log/groundcontrol.log"

and then run /etc/init.d/groundcontrol restart? Then the next time it crashes, there should be some output in the log.

@jondot
Owner

jondot commented Jun 26, 2013

Hey guys.

It can also be very simple if you choose not to use init.d, just for the sake of debugging. Linux provides nohup, which ensures that your program keeps running even after you log out. So:

$ nohup groundcontrol -config=<your conf file> &

Note that we're sending the process to the background with &. Output will be logged to nohup.out.
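
A small variation on the nohup command, sketched under the assumption that you want the output in a file of your choosing rather than nohup.out. When standard output is redirected, nohup does not create nohup.out. The `sh -c ...` command is a hypothetical stand-in for groundcontrol so the snippet can be run as is.

```shell
#!/bin/sh
log="$(mktemp)"

# `sh -c ...` stands in for: groundcontrol -config=<your conf file>
nohup sh -c 'echo "Launching Control"; echo "panic: example" >&2' >> "$log" 2>&1 &
pid=$!
echo "started with PID $pid, logging to $log"

# Only needed in this demo, so the log can be inspected right away.
wait "$pid"
cat "$log"
```

Keeping the PID around makes it easy to check later whether the process is still alive, for example with `kill -0 "$pid"`.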

@Candunc

Candunc commented Jun 27, 2013

You're not alone, my groundcontrol goes down every once in a while. Since my Pi has been taken down because of a corrupted SD card, I'll send you the error next time it occurs on my other computer.

@jondot
Owner

jondot commented Jun 27, 2013

Hey Candunc, which distro are you using for your RPi?

@Candunc

Candunc commented Jun 28, 2013

Raspbian. Groundcontrol has been stable today, I don't see a crash. A simple sudo reboot and it died (although groundcontrol reported an 85 C spike on the last working boot).

Crashes occur more often on the non-Raspberry Pi versions than on the Pi, although it has crashed on the Pi in the past.

@jondot
Owner

jondot commented Jun 28, 2013

I had a ton of SD corruption with Raspbmc (like almost every other day), but since moving to Xbian (for my streamer Pi), I haven't had any for months.

I suspect the 85 C spike has something to do with the Pi's sensor. I've had it a couple of times as well, so I think it's a false reading.

I've been monitoring Groundcontrol closely since this issue and haven't gotten a crash like the one described yet. Hoping for a log.

@spieiga

spieiga commented Jul 1, 2013

Left it running over the weekend and it crashed. In my init.d file, I have
command="${gc_bin} -- -config ${gc_conf} > /var/log/groundcontrol.log"

but when I check in /var/log there is no groundcontrol.log so maybe it just didn't write anything to stdout.

I ran
nohup ./groundcontrol -config=/etc/groundcontrol.json
instead now ... going to try with nohup to see if it catches anything the next time it crashes.
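
One possible explanation for the missing log file, assuming the init script expands `${command}` as plain arguments to a launcher (the thread does not show that part of the script): a `>` stored inside a variable is not treated as a redirection when the variable is expanded. The sketch below demonstrates the effect with `echo` as a stand-in command and a temporary path instead of /var/log.

```shell
#!/bin/sh
dir="$(mktemp -d)"
log="$dir/groundcontrol.log"

# The redirect is part of the string, as in the init.d snippet above.
command="echo hello > $log"

# Plain expansion: ">" and the path are passed to echo as ordinary arguments.
$command
[ -e "$log" ] || echo "no log file was created"

# Letting a shell parse the string turns ">" into a real redirection.
sh -c "$command"
cat "$log"
```

If this is what happened, the redirection would need to be applied where the process is actually started, or the command wrapped in `sh -c` as shown.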

@spieiga

spieiga commented Jul 2, 2013

Crashed today ... here is my nohup.out

2013/07/02 11:36:52 Reporters: TempoDB OK.
2013/07/02 11:36:52 Reporters: No Librato credentials, skipping.
2013/07/02 11:36:52 Lauching Health
2013/07/02 11:36:52 Launching Control
2013/07/02 11:37:24 Reporters: TempoDB OK.
2013/07/02 11:37:24 Reporters: No Librato credentials, skipping.
2013/07/02 11:37:24 Lauching Health
2013/07/02 11:37:24 Launching Control
2013/07/02 11:39:53 series [.....shortened.....]
2013/07/02 13:22:07 series [.....shortened.....]
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x8 pc=0x158e4]

goroutine 7 [running]:
main.(*TempoDBReporter).ReportHealth(0x10455500, 0x1046aac0)
/Users/dotan/projects/groundcontrol/tempodb_reporter.go:53 +0x127c
main.report(0x10475230, 0x104554f0)
/Users/dotan/projects/groundcontrol/main.go:105 +0x15c
main.func·001()
/Users/dotan/projects/groundcontrol/main.go:82 +0xb0
created by main.main
/Users/dotan/projects/groundcontrol/main.go:84 +0xc90

goroutine 1 [IO wait]:
net.runtime_pollWait(0xb69a6ec4, 0x72, 0x0)
/usr/local/Cellar/go/1.1rc1/src/pkg/runtime/znetpoll_linux_arm.c:118 +0x64
net.(*pollDesc).WaitRead(0x10475dbc, 0xb, 0x10469480)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_poll_runtime.go:75 +0x34
net.(*netFD).accept(0x10475d70, 0x27e9ec, 0x0, 0x10469480, 0xb, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_unix.go:385 +0x2f4
net.(*TCPListener).AcceptTCP(0x104afda0, 0xb6aa8cbc, 0x6b318, 0x4)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/tcpsock_posix.go:229 +0x4c
net.(*TCPListener).Accept(0x104afda0, 0x10473500, 0x10500380, 0x104944b0, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/tcpsock_posix.go:239 +0x28
net/http.(*Server).Serve(0x1049fe70, 0x104cccc0, 0x104afda0, 0x0, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/server.go:1542 +0x8c
net/http.(*Server).ListenAndServe(0x1049fe70, 0x1049fe70, 0xc)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/server.go:1532 +0xa8
net/http.ListenAndServe(0x104d5140, 0xc, 0x0, 0x0, 0x2, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/server.go:1597 +0x70
main.main()
/Users/dotan/projects/groundcontrol/main.go:94 +0x10e4

goroutine 1404 [select]:
net/http.(*persistConn).writeLoop(0x1051ddc0)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:774 +0x228
created by net/http.(*Transport).dialConn
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:512 +0x590

goroutine 1407 [select]:
net/http.(*persistConn).writeLoop(0x104f0f50)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:774 +0x228
created by net/http.(*Transport).dialConn
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:512 +0x590

goroutine 1406 [IO wait]:
net.runtime_pollWait(0xb69a6d80, 0x72, 0x0)
/usr/local/Cellar/go/1.1rc1/src/pkg/runtime/znetpoll_linux_arm.c:118 +0x64
net.(*pollDesc).WaitRead(0x104f0f4c, 0xb, 0x10469480)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_poll_runtime.go:75 +0x34
net.(*netFD).Read(0x104f0f00, 0x105c4000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_unix.go:195 +0x2fc
net.(*conn).Read(0x1050c100, 0x105c4000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/net.go:123 +0xc8
net.(*TCPConn).Read(0x1050c100, 0x105c4000, 0x1000, 0x1000, 0x401a0000, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/cgo_stub.go:0 +0x40
crypto/tls.(*block).readFromUntil(0x10548020, 0x1049eb40, 0x1050c100, 0x5, 0x1050c100, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:401 +0xcc
crypto/tls.(*Conn).readRecord(0x10546780, 0x17, 0x0, 0x0)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:481 +0x11c
crypto/tls.(*Conn).Read(0x10546780, 0x10513000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:796 +0x118
bufio.(*Reader).fill(0x1046d4e0)
/usr/local/Cellar/go/1.1rc1/src/pkg/bufio/bufio.go:79 +0x144
bufio.(*Reader).Peek(0x1046d4e0, 0x1, 0x0, 0x0, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/bufio/bufio.go:107 +0xbc
net/http.(*persistConn).readLoop(0x104f0f50)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:670 +0xc4
created by net/http.(*Transport).dialConn
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:511 +0x568

goroutine 1403 [IO wait]:
net.runtime_pollWait(0xb69a6c3c, 0x72, 0x0)
/usr/local/Cellar/go/1.1rc1/src/pkg/runtime/znetpoll_linux_arm.c:118 +0x64
net.(*pollDesc).WaitRead(0x1051ddbc, 0xb, 0x10469480)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_poll_runtime.go:75 +0x34
net.(*netFD).Read(0x1051dd70, 0x10528000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/fd_unix.go:195 +0x2fc
net.(*conn).Read(0x10547898, 0x10528000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/net.go:123 +0xc8
net.(*TCPConn).Read(0x10547898, 0x10528000, 0x1000, 0x1000, 0x401a0000, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/cgo_stub.go:0 +0x40
crypto/tls.(*block).readFromUntil(0x10563780, 0x1049eb40, 0x10547898, 0x5, 0x10547898, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:401 +0xcc
crypto/tls.(*Conn).readRecord(0x10499780, 0x17, 0x0, 0x0)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:481 +0x11c
crypto/tls.(*Conn).Read(0x10499780, 0x10516000, 0x1000, 0x1000, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/crypto/tls/conn.go:796 +0x118
bufio.(*Reader).fill(0x104fce70)
/usr/local/Cellar/go/1.1rc1/src/pkg/bufio/bufio.go:79 +0x144
bufio.(*Reader).Peek(0x104fce70, 0x1, 0x0, 0x0, 0x0, ...)
/usr/local/Cellar/go/1.1rc1/src/pkg/bufio/bufio.go:107 +0xbc
net/http.(*persistConn).readLoop(0x1051ddc0)
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:670 +0xc4
created by net/http.(*Transport).dialConn
/usr/local/Cellar/go/1.1rc1/src/pkg/net/http/transport.go:511 +0x568

@jondot
Owner

jondot commented Jul 2, 2013

Thanks, that's a very very useful trace. I'll be working on it right away.

@jondot
Owner

jondot commented Jul 2, 2013

OK, I think I got to the root of the problem. This may be due to a network interruption or the TempoDB API being momentarily unavailable.

I'll be pushing a fix in the next few minutes, as well as a new binary build.

Thanks for the help guys! 👍

@jondot jondot closed this as completed in ae49be4 Jul 2, 2013
@spieiga

spieiga commented Jul 2, 2013

woohoo thanks jondot!! you're the man 👍

@jondot
Owner

jondot commented Jul 2, 2013

Binary is now up. Here's a direct link - http://jondot.github.io/groundcontrol/groundcontrol-0.0.3.tar.gz

Very happy we were able to solve this :)

@spieiga

spieiga commented Jul 2, 2013

also, don't forget to update init.d required start & stop ... I had to set it to $all for it to start on bootup properly.
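
For readers unfamiliar with the required start and stop fields mentioned above: on Debian-style systems they live in an LSB comment header at the top of the init script. Below is a sketch of such a header using `$all` as described. The `Provides` name, the runlevels, and the description are assumptions for illustration and are not copied from the project's script.

```shell
#!/bin/sh
### BEGIN INIT INFO
# Provides:          groundcontrol
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Ground Control monitoring daemon (illustrative)
### END INIT INFO
```

On Debian-based systems the boot ordering is usually regenerated from this header with `update-rc.d groundcontrol defaults`.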

@jondot
Owner

jondot commented Jul 2, 2013

Yup, it's already packed in the new tarball, thanks. I ended up taking whatever preconditions sickbeard was using:

https://github.com/midgetspy/Sick-Beard/blob/master/init.ubuntu

And here's the new commit:

https://github.com/jondot/groundcontrol/blob/master/support/init.d/groundcontrol

@spieiga

spieiga commented Jul 2, 2013

oh I see it now ... didn't realize it was in the tarball. thanks!

@jondot
Owner

jondot commented Jul 2, 2013

Yep, I remembered your other issue just a few minutes before building the new tar so I managed to sneak it in :)

@dananick
Author

dananick commented Jul 2, 2013

Question: I just moved the new 0.0.3 version to opt/groundcontrol/groundcontrol, and ran chmod and chown appropriately... do I need to update my init.d as well?

@jondot
Owner

jondot commented Jul 2, 2013

Nope. If it was working before, then after the rename you did it looks like you're good to go.

@dananick
Author

dananick commented Jul 5, 2013

Happy to report that groundcontrol has been running smoothly for 3 days now, no issues 👍
