Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print job frozen mid print. Octoprint server offline. Hotend and bed remain heated. #2647

Open
devsfan1830 opened this Issue May 16, 2018 · 54 comments

Comments

Projects
None yet
@devsfan1830
Copy link

devsfan1830 commented May 16, 2018

What were you doing?

Printing a model

Run a print

What did you expect to happen?

The print to finish.

What happened instead?

The print was hung mid print, hotend and bed fully heated. Touch UI on my Pi 7" touchscreen reported server offline. Web access via any other computer failed. Access via Octoprint anywhere and Printoid also failed.

This is the same as previously reported and closed issue here: #2118

Only difference is that the print did not finish and had only been active for a couple hours.
I followed the advice in that thread and posted the results there. In case that cannot be reopened, this is what I posted there:

I can reach the Pi via SSH. But web ui via any computer and the pi touchscreen wont respond. Says the server is offline. , I use Octoprint Anywhere for remote access and it will not connect. Printoid I use for my phone access while home since the printer is now in my basement. That too will not connect.

So, via SSH i get the following:
"ps -ef | grep -i octoprint" outputs this:
pi 15441 14056 0 21:54 pts/0 00:00:00 grep --color=auto -i octoprint

"ps -ef | grep haproxy" outputs the following:
root 925 1 0 12:10 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy 927 925 0 12:10 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy 928 927 0 12:10 ? 00:00:06 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
pi 19468 14056 0 22:05 pts/0 00:00:00 grep --color=auto haproxy

"ps -ef | grep -i octoprint | grep -i python" returns nothing

-Worth noting I do not have haproxy configured anymore. This is a few months old reflash of octopi and i since closed off public internet access to my network, Hence Octoprint Anywhere just for basic remote view and i can kill a failed job.

Using "sudo service octoprint restart" did indeed restart octoprint without a full reboot.

Did the same happen when running OctoPrint in safe mode?

Not tested in safe mode. The issue is extremely intermittent and seemingly random. Last case of the issue was a few months ago.

Version of OctoPrint

1.3.8

Operating System running OctoPrint

OctoPi 0.14.0

Printer model & used firmware incl. version

Prusa i3 MK2S, Firmware 3.1.0

Browser and version of browser, operating system running browser

Google Chrome 66.0.3359.139, Windows 10 64-bit

Link to octoprint.log

octoprint.log

Link to contents of terminal tab or serial.log

unavailable. serial.log not enabled.

Link to contents of Javascript console in the browser

not a UI issue

Screenshot(s)/video(s) showing the problem:

I have read the FAQ.
Yes

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented May 17, 2018

So, this looks like a server crash, and one that doesn't even log any kind of exceptions to the log anymore. Notice how the heartbeat just stops, without any kind of errors being logged.

The problem is that since it doesn't even log anything, this makes it tricky to determine what might be the cause here. There's another issue about the Python process in which OctoPrint runs simple segfaulting, see #2424. That also contains steps on how to figure out what's making it crash - please try that and when it happens next provide the resulting log here.

If at all possible, running a lengthy period in safe mode would also help narrow this down further. It might be pure coincidence, but I notice a lot of people running into weird crash issues the past couple of months all had the Anywhere plugin installed and had it run into backoff log messages. Maybe those log messages have the same cause as the crashes or something like that.

@JabawokJayUK

This comment has been minimized.

Copy link

JabawokJayUK commented May 20, 2018

Same happening for me.

has happened on multiple prints, both in safe mode and normal.

Syslog would indicate that the serial driver is dropping:
May 20 18:53:05 octopi kernel: [194758.424374] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32 May 20 18:53:05 octopi kernel: [194758.424479] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32 May 20 18:53:05 octopi kernel: [194758.562635] usb 1-1.3: USB disconnect, device number 9 May 20 18:53:05 octopi kernel: [194758.562930] ftdi_sio ttyUSB0: error from flowcontrol urb May 20 18:53:05 octopi kernel: [194758.563302] ftdi_sio ttyUSB0: FTDI USB Serial Device converter now disconnected from ttyUSB0 May 20 18:53:05 octopi kernel: [194758.563339] ftdi_sio 1-1.3:1.0: device disconnected May 20 18:53:05 octopi kernel: [194758.855333] usb 1-1.3: new full-speed USB device number 10 using dwc_otg May 20 18:53:05 octopi kernel: [194759.013986] usb 1-1.3: New USB device found, idVendor=0403, idProduct=6001 May 20 18:53:05 octopi kernel: [194759.013995] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3 May 20 18:53:05 octopi kernel: [194759.013999] usb 1-1.3: Product: FT232R USB UART May 20 18:53:05 octopi kernel: [194759.014002] usb 1-1.3: Manufacturer: FTDI May 20 18:53:05 octopi kernel: [194759.014006] usb 1-1.3: SerialNumber: A907S16M May 20 18:53:05 octopi kernel: [194759.022120] ftdi_sio 1-1.3:1.0: FTDI USB Serial Device converter detected May 20 18:53:05 octopi kernel: [194759.022310] usb 1-1.3: Detected FT232RL May 20 18:53:05 octopi kernel: [194759.023110] usb 1-1.3: FTDI USB Serial Device converter now attached to ttyUSB0

This is also supported by the serial.log (taken in safe mode):

018-05-20 18:53:04,982 - Send: N114890 G1 X222.788 Y103.174 E0.2691*110 2018-05-20 18:53:04,993 - Recv: ok 2018-05-20 18:53:04,997 - Send: N114891 G1 X222.657 Y103.318 E0.2769*98 2018-05-20 18:53:05,163 - Unexpected error while reading serial port, please consult octoprint.log for details: SerialException: 'device reports readiness to read but returned no data (device disconnected or multiple access on port?)' @ comm.py:_readline:2417 2018-05-20 18:53:05,165 - Please see https://faq.octoprint.org/serialerror for possible reasons of this. 2018-05-20 18:53:05,177 - Changing monitoring state from "Printing" to "Offline (Error: SerialException: 'device reports readiness to read but returned no data (device disconnected or multiple access on port?)' @ comm.py:_readline:2417)" 2018-05-20 18:53:05,189 - Connection closed, closing down monitor
This behavior has only started since the most recent update. The server seems to remain up though so a different situation than previously diagnosed with the same behavior.

@cassianoleal

This comment has been minimized.

Copy link

cassianoleal commented May 21, 2018

I'm seeing a similar issue but the server stays up and running. I was running it on a tmux session and there's no segfault or anything weird apart from the serial error.

This is octoprint's output:

2018-05-21 20:55:23,141 - octoprint.util.comm - INFO - Changing monitoring state from "Offline" to "Opening serial port"
2018-05-21 20:55:23,145 - octoprint.util.comm - INFO - Changing monitoring state from "Opening serial port" to "Connecting"
2018-05-21 20:55:23,166 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-05-21 20:55:25,011 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-05-21 20:55:25,572 - octoprint.util.comm - INFO - Changing monitoring state from "Connecting" to "Operational"
2018-05-21 20:55:25,577 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-05-21 20:55:25,635 - octoprint.util.comm - INFO - Printer reports firmware name "Marlin TH3D U1.R1.8c"
2018-05-21 20:55:33,465 - octoprint.filemanager.analysis - INFO - Analysis of entry local:E3Dv6_EZABL5.5.gcode finished, needed 21.42s
2018-05-21 20:55:34,768 - octoprint.filemanager.analysis - INFO - Starting analysis of local:_E3Dv6_EZABL5.5.gcode
2018-05-21 20:55:34,776 - octoprint.filemanager.analysis - INFO - Invoking analysis command: /home/pi/oprint/bin/python2 -m octoprint analysis gcode --speed-x=12000 --speed-y=12000 --max-t=10 --throttle=0.0 --throttle-lines=100 /home/pi/.octoprint/uploads/_E3Dv6_EZABL5.5.gcode
2018-05-21 20:55:35,840 - octoprint.filemanager.analysis - INFO - Analysis of entry local:_E3Dv6_EZABL5.5.gcode finished, needed 1.07s
2018-05-21 20:56:27,759 - octoprint.util.comm - INFO - Externally triggered heatup detected
2018-05-21 21:01:26,651 - octoprint.util.comm - INFO - Externally triggered heatup detected
2018-05-21 21:01:59,873 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 21:06:15,600 - octoprint.util.comm - INFO - Changing monitoring state from "Operational" to "Printing"
2018-05-21 21:06:15,647 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-05-21 21:07:26,057 - octoprint.util.comm - INFO - Printer seems to support the busy protocol, telling it to set the busy interval to our "communicationBusy" timeout - 1s = 2s
2018-05-21 21:16:59,876 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 21:31:59,878 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 21:46:59,879 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 21:56:10,107 - octoprint.server.util.sockjs - INFO - Client connection closed: ::ffff:192.168.192.24
2018-05-21 21:56:19,885 - octoprint.server.util.sockjs - INFO - New connection from client: ::ffff:192.168.192.24
2018-05-21 21:56:26,259 - octoprint.util.pip - INFO - Using "/home/pi/oprint/bin/python2 -m pip" as command to invoke pip
2018-05-21 21:56:26,449 - octoprint.util.pip - INFO - Using "/home/pi/oprint/bin/python2 -m pip" as command to invoke pip
2018-05-21 21:56:26,484 - octoprint.util.pip - INFO - Using "/home/pi/oprint/bin/python2 -m pip" as command to invoke pip
2018-05-21 21:56:30,811 - octoprint.util.pip - INFO - pip installs to /home/pi/oprint/lib/python2.7/site-packages/ (writable -> yes), --user flag needed -> no, virtual env -> yes
2018-05-21 21:56:30,813 - octoprint.util.pip - INFO - ==> pip ok -> yes
2018-05-21 21:56:31,042 - octoprint.plugins.softwareupdate - INFO - Saved version cache to disk
2018-05-21 22:01:59,881 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 22:16:59,883 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 22:31:59,885 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 22:46:59,887 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-05-21 22:48:58,264 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-05-21 22:49:01,271 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-05-21 22:49:04,274 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-05-21 22:49:07,280 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-05-21 22:49:10,287 - octoprint.util.comm - INFO - Communication timeout while printing, trying to trigger response from printer.
2018-05-21 22:49:13,295 - octoprint.util.comm - INFO - No response from printer after 6 consecutive communication timeouts, considering it dead.
2018-05-21 22:49:13,323 - octoprint.util.comm - INFO - Changing monitoring state from "Printing" to "Offline (Error: Too many consecutive timeouts, printer still connected and alive?)"

This is /var/log/syslog:

[12138.110707] usb 1-1.1.3.1: Product: FT232R USB UART
[12138.110716] usb 1-1.1.3.1: Manufacturer: FTDI
[12138.121349] ftdi_sio 1-1.1.3.1:1.0: FTDI USB Serial Device converter detected
[12138.121512] usb 1-1.1.3.1: Detected FT232RL
[12138.122306] usb 1-1.1.3.1: FTDI USB Serial Device converter now attached to ttyUSB0
[19130.659012] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32
[19130.659107] ftdi_sio ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped: -32

There's nothing of more value in dmesg or other log files. I didn't have serial.log enabled unfortunately.

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented May 22, 2018

@JabawokJayUK @cassianoleal if the server is staying up for you but you are seeing serial disconnects, it is not the same issue as reported by @devsfan1830. The fact that the serial line is vanishing on the OS level is something that OctoPrint can't do anything about. As always with such problems, make sure you've read this and this FAQ entry and that your power supply is decent. It could be that the update from 1.3.6 to 1.3.8 put just that much more resource consumption on your setup that a struggling PSU is now overwhelmed and causing issues like this.

This ticket is explicitly about the server going fully offline due to crashing and possibly a duplicate of the aforementioned #2424, but that won't be possible to determine until @devsfan1830 has tried what was suggested in that ticket in order to narrow things down.

@devsfan1830

This comment has been minimized.

Copy link
Author

devsfan1830 commented May 22, 2018

Still working on it. So far ive run some 6 hour prints in safe mode or with Octoprint Anywhere disabled only and they all finished. I will re-enable OA and continue long prints until failure and use what was suggested to provide more info. This seems to be extremely intermittent. Almost wonder if its OA poorly handling a sporadic failed connection to its cloud service.

@devsfan1830

This comment has been minimized.

Copy link
Author

devsfan1830 commented May 22, 2018

I think I'm going to remove the Octoprint Anywhere plugin. I'm now observing odd behavior where the 2.5GHz wifi that the Pi and two other devices that are on it goes "wonky". I have a garage door and an ecobee themostat on the 2.4ghz band of my tri band router. They both lose their cloud connections. Octoprint becomes unreachable from my laptop, desktop and phone. I can still interact with the local 7" touchscreen. Oddly enough, the Octoprint Anywhere site still gets its data. I reboot my router and it all comes back. The octoprint.log shows what seems to be ALOT of frequent check ins with the OA servers. My current theory is that its overwhelming my router and either congesting or locking up the 2.5GHz channel. Could explain the octoprint server crashing too somehow. I figure better safe than sorry and just remove it. Go back to a Haproxy based remote access. Thus way nothing is actively pinging the internet other than the connectivity check. I've posted my latest octoprint.local to see if you agree with my assessment of the plugin. That is, if it shows the needed info.

@devsfan1830

This comment has been minimized.

Copy link
Author

devsfan1830 commented May 22, 2018

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented May 23, 2018

Certainly looks a bit spammy based on the errors it logs, but can't see if during normal (error free) operation it would also be this ping-heavy.

If your server runs stable without that plugin, I guess this should be raised with the plugin's author as well.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented May 23, 2018

@devsfan1830 I'm the author of OctoPrint Anywhere. When there is an active print job OA will consume ~3mbps uplink bandwidth (assuming your webcam is 640x480. it can be much higher if the webcam is higher resolution). Depending on your ISP there is a possibility that your router will be overwhelmed.

I have released a version a few months ago to reduce the bandwidth consumption to about 5% when the printer is idle. I have not heard many reports on router issue since then.

Your octoprint.log indicates a strong possibility that the router is overwhelmed. OA is quiet when there are no connectivity issues.

If your ISP can't support the uplink bandwidth it requires, it's a good idea to uninstall OA.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented May 23, 2018

Oh OA is pure Python so it's high unlikely that it'd cause any seg fault.

@devsfan1830

This comment has been minimized.

Copy link
Author

devsfan1830 commented May 23, 2018

@kennethjiang Thanks for checking in. Yeah I'm actually using a logitch C920 @ 1080p. The 2.4GHz wifi does indeed fail once the print starts. So it seems yea, the router and possibly the pi is getting overwhelmed with the uplink traffic. ISP has nothing to do with things in this case. Its gigabit fios (900down/800up) but its my own router connected to the FIOS hub via ethernet. Unless theres a way to reduce video quality for JUST OA then I think yeah OA is no good for my use. Since I'd prefer 1080p for my Octolapse timelapses.

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented May 23, 2018

OA is pure Python so it's high unlikely that it'd cause any seg fault.

Its dependencies too? OctoPrint itself is also 100% pure Python, but I can't vouch for the third party dependencies that I rely on.

I'm also not saying your plugin is causing issues with regards to stability, I just noticed that it is installed in a lot of recent crash reports, and that has me worried there might be something up here.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented May 23, 2018

The dependencies of OA plugin are simple enough that I almost want to vouch for them. ;)

@Winfieldmd

This comment has been minimized.

Copy link

Winfieldmd commented Jun 9, 2018

My apologies as I am a newb but Octoprint anywhere is my saving grace to monitor prints as port fowarding has been challenging however I have done 4 print since installing OA and two of them failed in exactly this way - server suddenly offline with no access from any device using any program, bed and hotend remain hot with print head just sitting there hung midprint. I am unable to restart the print, give up and resetting my pi (by unplugging) restores everything to normal. I tried upgrading my Pi power supply to 3.5a and plugged in the ethernet cord directly to router after the first time it happened just to try something but it failed again (but the cord is plugged into a router extender). My laptop wifi to the same router extension next to my printer is getting 76 download and 12Mbps upload if that matters or helps. I am not saying it is OA and I need and love OA, but with my sudden hangs and a few months of no problems with octoprint or my pi and this thread, it is suspicious. I am too much of a newb to begin to know how to get the logs etc. but I can try but I thought maybe my experience might add insight. Maybe OA works too well! It is amazing the videos of my prints instantly appearing on my cell phone on LTE. Oh my camera is the logitech c270 with basic settings (have not changed anything).

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Jun 9, 2018

It does sound concerning @Winfieldmd . What is the make and model of your printer? Do you mind sharing the octoprint.log and serial.log files that include the times when the crashes happened?

Another thing that you can help me with - next time when you are printing and OA is active, can you ssh to the Pi and check the CPU % using top? The very first version of OA was very CPU-demanding (>80%) and it took quite a bit of optimization to bring the CPU % under 15% (it shares the same process with OctoPrint so look for the python process) on my Pi 3 B+. But depending on your hardware (Pi + webcam) it may be different for you.

@Winfieldmd

This comment has been minimized.

Copy link

Winfieldmd commented Jun 10, 2018

Ok let me try to help. I am using a Creality CR-10s and slicing with Simplify3D. I think I found the logs but its a bit confusing to decipher the ones I need. The last print that failed was started friday night 6/8 pretty late and failed in the night a few hours later which would be on 6/9. I see 1 octoprint log and 2 serial log files around then so I will try and attach them all I suppose but I cannot open them on my computer and dragging them here gives me a message that the file type is not supported. I am using Win 10. How do I open or convert them? As far as SSh into my Pi that will take some time and work as I know zero about doing that other than the one time I did it to set up my Pi using youtube video tutorials lol. Sorry you are dealing with a newb. Sorry for the late reply, I was out of town. Thanks for your help.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Jun 10, 2018

Can you put all of them into a zip file and upload the zip file here? @Winfieldmd

@Winfieldmd

This comment has been minimized.

Copy link

Winfieldmd commented Jun 11, 2018

@kennethjiang see if this works.

octoprint.log.zip

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Jun 12, 2018

Thank you @Winfieldmd for getting the log files to me. And sorry about the delay.

From the log files, it looks like you rebooted OctoPi (did the previous print job crashed just before that?) at 22:32 on 2018-06-08. And the print job lasted until about 02:34 on 2018-06-09 when OctoPi crashed. Does the timeline sound right to you?

For this period (22:32 on 2018-06-08 - 02:34 on 2018-06-09), OA reported twice that the connection to cloud timed out. So it had trouble connecting to the backend in the cloud for about 6 minutes during the period of 4 hours. It does seem a bit strange since you said that OctoPi was wired to your router so the connection should have been solid. But it doesn't seem too alarming either, especially because the connectivity issues were at least 2 hours before the crash.

I suspect the crash has to do with the serial communication between OctoPrint and the 3D printer. Please turn on serial logging by going to Settings > Serial Connection > Log communication so that we have a bit more info next time when crash happens again.

@Winfieldmd

This comment has been minimized.

Copy link

Winfieldmd commented Jun 13, 2018

@kennethjiang yes that looks like the correct timeline. I will turn on serial logging and start a print as soon as I can. I will have to be a bit more careful as it crashes with everything staying hot which makes me a bit nervous. I will let you know the results. Thanks.

@Winfieldmd

This comment has been minimized.

Copy link

Winfieldmd commented Jun 14, 2018

@kennethjiang I went ahead a reprinted the failed print above. It completed with no problems this time with no changes other than activating serial logging. I don't know if it will be helpful but here are the log files with serial logging activated on this print. Not that I would know what to do anyway yet, but I unfortunately I have not been able to log into my Pi to try and check the CPU utilization as for some crazy reason it will not accept my username/password which makes no sense. I may have to do re-image of octopi. Greatly appreciate your help.

octoprint log files.zip

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Jun 14, 2018

@Winfieldmd This log file looks normal, which is not surprising.

Don't try to figure out a way to log into your Pi, or re-image it. If the CPU % is high, the print head movement will become jerky, which doesn't seem to be the case.

Just keep an eye on this. In case it crashes again, please upload the octoprint.log and serial.log.

Thanks for providing all these info to debug, and your kind words! :)

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Jun 14, 2018

The same crash happened to me today too!

I attached all the logs that seem relevant, including the files in /var/log.

I googled it and it seems like many similar issues are related to insufficient power supply. @foosel A question for you - in the cases of insufficient power supply, would the whole Raspberry Pi reboot? In my case the Pi itself stayed on the whole time.

This is the 1st time such crash happened to me. I highly doubt it has to do with OA since, obviously, I have been using OA from the very beginning (about 1 year now).

octoprintlogs 2.zip

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented Jul 5, 2018

To repeat myself from an earlier comment, it would seriously help if anyone running into this could please run these steps I already linked to in an earlier comment to figure out why the server is crashing. So far it's still just "it's crashing and nobody has any logs of that" - please help change that.

Just to repeat that here:

It might also be worth a shot to have OctoPrint run directly from within a screen instance instead of as a service:

sudo apt-get update && sudo apt-get install screen
sudo service octoprint stop
screen
/home/pi/oprint/bin/octoprint serve

That way everything printed to stdout (even a segfault of the underlying python process) will be recoverable by simply reattaching to the screen session after a crash (screen -x).

@CoderFX

This comment has been minimized.

Copy link

CoderFX commented Jul 16, 2018

Got the same crash
octoprint2.log

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented Jul 17, 2018

@CoderFX please read my comment from earlier. octoprint.log doesn't help here, the linked steps might.

@AndKe

This comment has been minimized.

Copy link

AndKe commented Jul 24, 2018

to all affected: for safety, use M85 in your custom gcode...

@reclaimer5146

This comment has been minimized.

Copy link

reclaimer5146 commented Jul 27, 2018

@foosel

I ran into this issue ever since I blew out my sd card and moved to 1.3.8. I didn't have this issue on prior releases, which I believe I had stopped on 1.3.6
I pulled dmesg and all of the syslogs. I've had the issue multiple times, but never collected the dmesg. I collected it for the most recent crash.

I've also started octoprint in a screen, so if I run into it I'll post it up as well.

Please let me know if you need anything else.
logs.tar.gz

@reclaimer5146

This comment has been minimized.

Copy link

reclaimer5146 commented Jul 28, 2018

@foosel

I experienced another crash today. This time I used M85 to terminate it after 5 minutes. It was happening every 2-3 prints, but now its happening every print. I had screen running, but in my infinite wisdom, I terminated the detached screen. I did save the output of it from MobaXterm, but as Rich Text, so use with a Rich Text editor. Here is the last few lines of what I see:
2018-07-28 08:09:18,213 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-07-28 08:09:30,820 - octoprint.util.connectivity_checker - INFO - Connectivity changed from online to offline
2018-07-28 08:09:30,829 - octoprint.plugins.ipOnConnect - INFO - ipOnConnectPlugin: 192.168.50.97
2018-07-28 08:24:18,215 - octoprint.server.heartbeat - INFO - Server heartbeat <3
*** stack smashing detected ***: /home/pi/oprint/bin/python2 terminated
Aborted
pi@octoprintCR10:~/logs $

In an effort to be more organized, I've resorted to naming the folders with the date of the crash. Then creating an archive of said folders. I included the output of the screen session in the 2018-07-28 folder. Going forward, each date will have the associated set of logs included.

Please let me know if there is anything else you need.
2018-07-28_python_crash.tar.gz

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 1, 2018

I do also encounter this problem and I think I found out why on my side (I have no log to display right now, I will try to reproduce later...)

First, I had a rPI B (yes the good old rPI B V1) that had nothing to do at home so I installed OctoPi and tried it out. At first, the printer was pausing mid-print making lots of plastic blob everywhere(#2771 ?). Then I made tests monitoring with TOP. I know the rPi B is not the best device for that, but having it ready to use with a 32gb card to store files was just perfect for me...

At first I reniced octoprint to be nice -19 (sudo crontab -e : */5 * * * * renice -19 $(pidof -x octoprint)). This made things faster and usable. I was able to print the test file I first upload without any pause.

So I uploaded my next print and started it right away. It started and few minutes later, the rpi ssh disconnected, the printer stopped and the temperatures settings remained. I disconnected and reconnected the rPI power supply to regain access. That is when I noticed a python script running even while not printing...

For unknown reason, even with niceness sets in favour of octoprint, if the python script that analyze the gcode to gives the various estimation is runing while printing it freeze and bring everything down. Without niceness, it seems to be the same script that makes OctoPrint pause while printing.

When I am done doing my projects I will makes more tests if nobody else has been able to replicate it.

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 4, 2018

A long printing job (4days) just died at 32%... Lots of material going to the dumpster. Anyway here are some helping point I guess...

`
ps -ef | grep -i octoprint ==> Returns nothing. Octoprint is well dead.

ps x ==> Camera stream is still active however.
`

USB device attached to the pi : CR-10 3D printer & APC UPS data cable that gives status of the UPS.

The only thing going on was the printing and I had only my laptop monitoring the print. No action of any kind as been done other than viewing the camera display or the temp chart.

Good point is that for once, SSH remained accessible... So logs are available! Both files are TXT,

DMESG: https://paste.jmvien.net/?f959a6d839782521#xDbw+kvHWKTfpfAvspCbKJJNZ1CD1LLklhQ7qmELy9A=
Syslog: https://paste.jmvien.net/?f837dd15853516ea#XNcO2MImjkeBXJHUZba8a6jIzY+iQyIgqTWtBPIuRvE=

Next step is to restart the print from zero :/ but this time using screen.

EDIT: Around 0200 I've seen the printer working. It happened between 0200 and 0230.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Aug 4, 2018

It sounds like painful! Sorry to hear that @xJMV. Did you have OctoPrint Anywhere plugin installed and active when the printer died?

@foosel I was looking at the code of my plugin and I started to wonder if thread-safety may be the culprit here. Like I said my plugin is simple but it spawns up a few threads. I use Python synchronized Queue to communicate between these threads but I realized that these threads also use shared variables, such as _printer. I don't "write" (calling functions that obviously mutate internal states) to these variables. Just query status and settings. But I'm wondering:

  • if Python guarantees thread-safety for 1-write + n-reads access model.
  • if other threads from OctoPrint core or other plugins may write to some of these variables.
@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 4, 2018

@kennethjiang negative. I have a VPN if I ever need to check things out when not on my home WIFI... Always using the Octoprint web guy.

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 5, 2018

Here is one... I stopped OctoPrint, flushed the output file and restarted it. I let it stand by cause I prefered to be around when starting the print... So I got this error while on stand by.

pi@redcap:~ $ /home/pi/oprint/bin/octoprint serve > octoprint_output.txt Exception in thread Thread-7: Traceback (most recent call last): File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner self.run() File "/usr/lib/python2.7/threading.py", line 754, in run self.__target(*self.__args, **self.__kwargs) File "/home/pi/oprint/local/lib/python2.7/site-packages/octoprint/filemanager/analysis.py", line 227, in _work self._analyze(entry, high_priority=high_priority) File "/home/pi/oprint/local/lib/python2.7/site-packages/octoprint/filemanager/analysis.py", line 261, in _analyze result = self._do_analysis(high_priority=high_priority) File "/home/pi/oprint/local/lib/python2.7/site-packages/octoprint/filemanager/analysis.py", line 401, in _do_analysis raise RuntimeError("No analysis result found") RuntimeError: No analysis result found

After I restarted the server once again and let it stand by while I was doing something else, I came back to this:

2018-08-05 16:59:53,961 - octoprint.server.heartbeat - INFO - Server heartbeat <3 2018-08-05 17:14:53,969 - octoprint.server.heartbeat - INFO - Server heartbeat <3 2018-08-05 17:29:53,977 - octoprint.server.heartbeat - INFO - Server heartbeat <3 2018-08-05 17:40:27,335 - octoprint.filemanager.analysis - INFO - Analysis of entry local:Raspberry_Pi/rpi3-top_25mm-fan-4x-PLA.gcode finished, needed 3304.54s 2018-08-05 17:40:28,397 - octoprint.filemanager.analysis - INFO - Starting analysis of local:Address_Plate_-_Black.gcode 2018-08-05 17:40:28,419 - octoprint.filemanager.analysis - INFO - Invoking analysis command: /home/pi/oprint/bin/python2 -m octoprint analysis gcode --speed-x=6000 --speed-y=6000 --max-t=10 --throttle=0.01 --throttle-lines=100 /home/pi/.octoprint/uploads/Address_Plate_-_Black.gcode 2018-08-05 17:44:53,991 - octoprint.server.heartbeat - INFO - Server heartbeat <3 2018-08-05 17:46:11,720 - octoprint.filemanager.analysis - INFO - Analysis of entry local:Address_Plate_-_Black.gcode finished, needed 343.32s
Those files analyzed have been analyzed and printed already so it should no have happened again. I really feel that the analysis script is the culprit in that...

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 5, 2018

On mobile. While printing there was a segmentation fault without any other informations. The process output file do not show anything not even the seg fault. The screen output just display heartbeat then : Segmentation Fault.

Now home, Here is the result from screen:
pi@redcap:~ $ /home/pi/oprint/bin/octoprint serve > octoprint_output.txt
::ffff:127.0.0.1 - - [05/Aug/2018 16:44:55] "GET /sockjs/info?t=1533487494056 HTTP/1.1" 404 -
::ffff:127.0.0.1 - - [05/Aug/2018 16:45:03] "GET /sockjs/info?t=1533487502092 HTTP/1.1" 404 -
::ffff:127.0.0.1 - - [05/Aug/2018 16:45:15] "GET /sockjs/info?t=1533487514109 HTTP/1.1" 404 -
Segmentation fault

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 7, 2018

I don't know if this can be related... Actually printing, all going well, the print did not pause or anything...

2018-08-07 18:16:08,130 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-08-07 18:17:57,418 - tornado.general - ERROR - Uncaught exception, closing connection.
Traceback (most recent call last):
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/iostream.py", line 523, in _handle_events
self._handle_write()
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/iostream.py", line 872, in _handle_write
del self._write_buffer[:self._write_buffer_pos]
BufferError: Existing exports of data: object cannot be re-sized
2018-08-07 18:17:57,422 - tornado.application - ERROR - Exception in callback None
Traceback (most recent call last):
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/iostream.py", line 523, in _handle_events
self._handle_write()
File "/home/pi/oprint/local/lib/python2.7/site-packages/tornado/iostream.py", line 872, in _handle_write
del self._write_buffer[:self._write_buffer_pos]
BufferError: Existing exports of data: object cannot be re-sized
2018-08-07 18:17:57,440 - octoprint.server.util.sockjs - INFO - Client connection closed: ::ffff:192.168.0.100
2018-08-07 18:17:57,455 - octoprint.server.util.sockjs - WARNING - Could not send message to client None: 'NoneType' object does not support item deletion
2018-08-07 18:18:13,944 - octoprint.server.util.sockjs - INFO - New connection from client: ::ffff:192.168.0.100
2018-08-07 18:31:08,139 - octoprint.server.heartbeat - INFO - Server heartbeat <3

@xJMV

This comment has been minimized.

Copy link

xJMV commented Aug 13, 2018

Got it again, on purpose this time, and twice!

uploaded the gcode file via Slic3r and started the printing right away without waiting for it to be fully evaluated. It seems to stop and bring OctoPrint service down every time.

Here are the second attempt log output (part of it, because there is nothing interesting to display...) The file analysis beginning appears twice in the log because I made a small update and re-uploaded it. First time I did not and the print stopped just like this one.

Can anybody reproduce this ?

It feels like having the service reading for printing and having the evaluation script reading the file too cause a problem...

^Cpi@redcap:~ $ /home/pi/oprint/bin/octoprint serve > octoprint_output.txt
Segmentation fault

pi@redcap:~ $ tail octoprint_output.txt
2018-08-13 22:16:15,415 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-08-13 22:26:49,715 - octoprint.filemanager.analysis - INFO - Starting analysis of local:vertical_spool_holder.gcode
2018-08-13 22:26:49,809 - octoprint.filemanager.analysis - INFO - Invoking analysis command: /home/pi/oprint/bin/python2 -m octoprint analysis gcode --speed-x=6000 --speed-y=6000 --max-t=10 --throttle=0.0 --throttle-lines=100 /home/pi/.octoprint/uploads/vertical_spool_holder.gcode
2018-08-13 22:27:17,010 - octoprint.filemanager.analysis - INFO - Starting analysis of local:vertical_spool_holder.gcode
2018-08-13 22:27:17,266 - octoprint.filemanager.analysis - INFO - Invoking analysis command: /home/pi/oprint/bin/python2 -m octoprint analysis gcode --speed-x=6000 --speed-y=6000 --max-t=10 --throttle=0.0 --throttle-lines=100 /home/pi/.octoprint/uploads/vertical_spool_holder.gcode
2018-08-13 22:29:17,531 - octoprint.util.comm - INFO - Changing monitoring state from "Operational" to "Printing"
2018-08-13 22:29:17,564 - octoprint.util.comm - INFO - M110 detected, setting current line number to 0
2018-08-13 22:31:15,428 - octoprint.server.heartbeat - INFO - Server heartbeat <3
2018-08-13 22:44:51,114 - octoprint.server.util.sockjs - INFO - New connection from client: ::ffff:192.168.0.190
2018-08-13 22:46:15,442 - octoprint.server.heartbeat - INFO - Server heartbeat <3

pi@redcap:~/.octoprint/logs $ tail serial.log
2018-08-13 22:47:08,814 - Recv: ok
2018-08-13 22:47:08,820 - Send: N2382 M10528
2018-08-13 22:47:08,878 - Recv: ok T:235.8 /235.0 B:40.4 /40.0 T0:235.8 /235.0 @:65 B@:0
2018-08-13 22:47:08,886 - Send: N2383 G1 X211.070 Y163.574 E21.56021
108
2018-08-13 22:47:08,909 - Recv: ok
2018-08-13 22:47:08,918 - Send: N2384 G1 X209.696 Y164.397 E21.64392106
2018-08-13 22:47:08,926 - Recv: ok
2018-08-13 22:47:08,935 - Send: N2385 G1 X208.247 Y165.083 E21.72765
110
2018-08-13 22:47:08,974 - Recv: ok
2018-08-13 22:47:08,982 - Send: N2386 G1 X206.739 Y165.622 E21.81133*107

pi@redcap:~/.octoprint/logs $ tail octoprint.log
| plugins:
| octopi_support:
| model: 1B
| revision: 000e
| version: 0.15.1
| python:
| pip: 9.0.3
| version: 2.7.13
| virtualenv: /home/pi/oprint
2018-08-13 22:46:15,598 - octoprint.server - INFO - ------------------------------------------------------------------------------

@M00nPatrolz

This comment has been minimized.

Copy link

M00nPatrolz commented Sep 29, 2018

@kennethjiang
log.zip
Just had the same issue. Print froze midway through. Could only ssh and reboot. The only changes I have made are Octopi Anywhere and Octolapse. Although the logs are semi foreign to me, after inspecting them it appears I have been having some network related drops since I installed OA although I never lost a print or even noticed until today. Yesterday I did change the resolution and fps which could explain the extra load that caused things to finally fail. I was also heavily monitoring the print today. Logs attached.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Sep 29, 2018

@M00nPatrolz thank you for sending the log files. In the log files there are lot of error messages about OA being unable to connect to the cloud server.

It looks like the error started to occur around 2018-09-25 15:25:37 UTC. Did you change webcam's resolution before or after that time?

What's your webcam's current resolution? Do you always have a page open that shows OctoPrint Anywhere dashboard? The reason why I ask this question is when the printer is not watched on OA's dashboard, its bandwidth consumption is reduced by 90% and is usually low enough to not cause any problem.

@Balduc

This comment has been minimized.

Copy link

Balduc commented Oct 6, 2018

Hi everyone,

I've just experienced the same issue.

3d printer: cr-10 mini 12v
Hardware for Octopi: raspberry pi 3
connection printer/raspberry: usb wire from cr-10
Version of Octopi and octoprint: The most up to date as this hour ( 06 october 2018, 3h02 PM)
model print time: 22 hours total, 86% done...
Slicer: Simplify 3d 4.0.0.0
Connection service via Wi-fi

Something new? yes: Octoprint Anywhere. But, since i'm using octoprint for 2 weeks, it's tricky to say that OA have something to do with it...

I've connected octopi to avoid the issue we all experienced with a PC: serial error that stop the print in the middle or more. Ironically, the same thing happend. Coincidence?

So for 7 days no issue without OA.

With OA, for 3 print, i have 3 different ends: One fine and print perfectly, one evolve badly ( i've lost connection with the octoserver in the middle and there were a lot of blobs, lines not aligned, etc, I was thinking that i did a bad slice, now i think it's the pi) and the last one just stop, heated bed and nozzle still hot.

I did not power off the pi, but i did power off the printer. Need to print this piece ASAP so I switch to a SD card print. No, i will not use the pi again to monitor/service my printer. Too risky.

But my pi is still up and running since the crash, and i can ssh into it. So if you need me to do something to help you, i'm waiting instructions.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Oct 7, 2018

Can you upload octoprint.log file that includes the periods when OctoPrint crashed?

@Balduc

This comment has been minimized.

Copy link

Balduc commented Oct 7, 2018

I've tried to. But Octo did not find the log. And the python script was off. I've tried any kind of manipulation Foosel ask, but nothing works. So, the bad news is: I redownload the image, format the sd card and write a new image on it, I shouldn't turn the printer off in the first place maybe...

While i'm writing this, i'm printing off a sd card with a new Octopi/octoprint up and running to monitor the print while doing something else, without any plug-ins. I was thinking to forward the octoprint server port using a virtual static-ip service to monitor the print if i'm out of range of my network. I will, step by step, get close to OA (without it) to see if something change.

By the way, i'm using an ten+ years old 2 mp logitech usb webcam. And to put more data, the server stop each time after i first monitor the print using OA, using my android smartphone, for the 2 prints that failed. Did not change any fps variable. Did not use the low bandwith option.

Hope it helps, even just a little.

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Oct 8, 2018

@Balduc When you open OA dashboard on your smartphone, the OA plugin will start to stream webcam at 5x faster rate, to make it a more responsive webcam feed. This probably explains why you see difference when you open OA on your phone.

2mp resolution is not very low. Official Pi cam has default resolution to be 640x480 (~300kp).

What is your internet upload bandwidth?

@Balduc

This comment has been minimized.

Copy link

Balduc commented Oct 8, 2018

@kennethjiang

This comment has been minimized.

Copy link

kennethjiang commented Oct 8, 2018

Your network speed is really good. I'm puzzled why there seems to be a network-related issue. Can you send me email k@getanywhere.io so that we can coordinate a time for phone call or google hangout? @Balduc

@xJMV

This comment has been minimized.

Copy link

xJMV commented Nov 12, 2018

Ok, so for sometimes now I changed my printing routine and did not encounter that bug again...

At first, I was leaving OctoPrint tab open 24/7 on my project laptop so I could monitor from my office.

Now I open OctoPrint in my browser to start the printing and then close the browser tab. I monitor it once in a while but always clause the tab when done.

No frozen prints since then...

@Kraligor

This comment has been minimized.

Copy link

Kraligor commented Dec 10, 2018

Hi all,

had the same issue yesterday, about 4 hours in the print. Anycubic i3 Mega, Octoprint running on OrangePi Prime/Armbian. Printer stopped moving, but temperatures (hotend and bed) didn't drop. When I realized and checked, the Octoprint webinterface was up, but no longer connected to the printer. Clicking "connect", it connected immediately.

I can provide you with logs in the evening.

It stopped printing after I walked by the printer, so maybe it's just a faulty USB cable that lost contact.

Should I raise a new ticket (because OPi/Armbian)?

Cheers
Oli

@Kraligor

This comment has been minimized.

Copy link

Kraligor commented Dec 10, 2018

Here's the log; disconnect happened at 2018-12-09 22:14:37,332

octoprint.log.2018-12-09.txt

@foosel

This comment has been minimized.

Copy link
Owner

foosel commented Dec 12, 2018

@Budge72

This comment has been minimized.

Copy link

Budge72 commented Jan 21, 2019

Hi. I'm using a RPi 3B+ with an Ender 3 Pro and a C270 webcam. I've had 2 stopped prints now, where the print just stops, the heaters are still running, and Octoprint is unresponsive. The first time it happened when I was using a questionable power supply. I noticed an undervoltage warning the day before but didn't look into it. Totally my fault. I replaced it with a 2A charger (now I see it should be 3A for B+), and it happened again last night about 24 hours into a 60+ hour print job. It might have been the power supply, but what logs should I check to see if it was Serial Exception? I'm including the Octoprint log files. Serial log was not running.
Octoprint Log from 2019-01-20.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.