Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial serial connection needs a connect/disconnect/connect cycle #1770

Closed
skorokithakis opened this issue Feb 13, 2017 · 16 comments

Comments

Projects
None yet
4 participants
@skorokithakis
Copy link

commented Feb 13, 2017

I (and many, many other people I know) have the following problem on a Wanhao i3:

When connecting to the printer, after pressing Connect, the connection takes forever (it never completes). Pressing Disconnect and the Connect again connects to the printer right away.

I went to Octoprint thinking "this'll be easy to reproduce endlessly", as it happens every single time, but I can no longer reproduce this with 1.3.1, it connects immediately the first time. I'm going to file this issue anyway, because it's a widespread problem, but keep in mind that it might be fixed with 1.3.1.

@GitIssueBot

This comment has been minimized.

Copy link
Collaborator

commented Feb 13, 2017

Hi @skorokithakis,

It looks like there is some information missing from your bug report that will be needed in order to solve the problem. Read the Contribution Guidelines which will provide you with a template to fill out here so that your bug report is ready to be investigated (I promise I'll go away then too!).

If you did not intend to report a bug but wanted to request a feature or brain storm about some kind of development, please take special note of the title format to use as described in the Contribution Guidelines.

Please do not abuse the bug tracker as a support forum - if you have a question or otherwise need some kind of help or support refer to the Mailinglist or the G+ Community instead of here.

Also make sure you are at the right place - this is the bug tracker of the official version of OctoPrint, not the Raspberry Pi image OctoPi nor any unbundled third party OctoPrint plugins or unofficial versions. Make sure too that you have read through the Frequently Asked Questions and searched the existing tickets for your problem - try multiple search terms please.

I'm marking this one now as needing some more information. Please understand that if you do not provide that information within the next two weeks (until 2017-02-27 11:10 UTC) I'll close this ticket so it doesn't clutter the bug tracker. This is nothing personal, so please just be considerate and help the maintainers solve this problem quickly by following the guidelines linked above. Remember, the less time the devs have to spend running after information on tickets, the more time they have to actually solve problems and add awesome new features. Thank you!

Best regards,
~ Your friendly GitIssueBot

PS: I'm just an automated script, not a human being, so don't expect any replies from me :) Your ticket is read by humans too, I'm just not one of them.

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 13, 2017

Hmm, here's something odd, which I believe might have something to do with the issue. I reproduced the problem, but when trying to copy the log, I can't copy past some point:

Changing monitoring state from 'Opening serial port' to 'Connecting'
Send: N0 M110 N0*125
Recv: 

However, this is what I see:

selection_001

I think the printer might be sending some odd character that confuses OctoPrint? I'm not sure if the "Recv" lines are strictly what the printer is sending (i.e. if the printer is sending "wait" verbatim) or if that's OctoPrint's logging, but I think I should print hex chars instead so we can figure it out.

If you can tell me which file handles logging to that window, I can go change it to output hex instead, and report back.

@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 13, 2017

Please provide a fully filled out ticket. As stated in the other ticket I need logs, not excerpts from logs, information on versions, hardware, firmware etc etc :) You'll also need to enable serial.log and provide that from a connection attempt. And also octoprint.log. All information on where to find and enable the requested logs are in the linked Contribution Guidelines. Please read them and provide a fully filled out ticket.

Don't take this the wrong way, but it's really really frustrating getting "Hey, something doesn't work here" issues all the time but without the information I desperately need to be able to help in any way and then having to run after that. Especially when it's a case of "and many others". It's exhausting and it sometimes makes me lash out and I don't like that :(

@ntoff

This comment has been minimized.

Copy link
Contributor

commented Feb 19, 2017

If you can tell me which file handles logging to that window, I can go change it to output hex instead, and report back.

@skorokithakis settings -> logs -> serial.log but first you need to enable it in the "serial connection" section.

You shouldn't need to change anything to output as hex. Enable the serial log, reproduce your issue, then go into logs and download the log. It should download to your own computer which you can then just attach here.

Do you have anything set to auto? If you do, try setting the port and baud rate manually and see if it connects ok.

edit: @foosel Since this sounds a bit like the same issue I had ages ago, have at my logs https://drive.google.com/open?id=0B6INQmg_kjkhTW8ybkRMWjQ2Mjg (sorry for google drive, there's multiple serial.log's and github won't let me attach a .zip)

This happens with port set to auto and "wait for start on connect". Port set to manual and no waiting for start, printer connects just fine. Firmware is repetier which I think may also be what the wanhao machines use (though I may be wrong). Ignore the temperature errors, I haven't got any thermistors plugged in to this board however repetier firmware doesn't care and will still "print" ok (motors run, it just disables all the heater functions) and does NOT block communications.

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 19, 2017

I have attached two logfiles. One, failed-serial.log is the first attempt (that never completes), and serial.log is the second, successful attempt. You can see in failed-serial.log that it contains the following line:

2017-02-19 13:43:37,835 - Recv: \00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00ok

There's a long string of received null bytes before the OK, which confuses OctoPrint, I think.

serials.zip

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 19, 2017

Removing the null bytes in _readline() does not seem to fix this, for what it's worth.

@ntoff

This comment has been minimized.

Copy link
Contributor

commented Feb 19, 2017

I wonder if repetier firmware is assuming binary communication?

https://github.com/repetier/Repetier-Firmware/blob/master/repetier%20communication%20protocol.txt

@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 20, 2017

As far as I understand that protocol and its implementation, it's only for transfering GCODE to the printer, the responses are unmodified plain ASCII. Might be mistaken though.

In any case, that null byte prefix is indeed the issue because OctoPrint checks for ok on the start of a line (otherwise something like SD card ok or Opened file file_ok.gco or similar would cause issues). It also looks like the controller is not resetting here, so the usual fallback of "If you see start, try the handshake again" doesn't trigger.

I've pushed a commit to maintenance that will a) strip null bytes and b) also make sure to interpret wait during connection as a case of "oops, apparently we missed an ok", so that even if we get something entirely messed up like \0\0\0o\r\n\0\0\0k\r\n as acknowledgement to our very first M110 it should hopefully still get things running. I tested this against the built-in virtual printer which I also adjusted to behave like what we saw in the serial log file up there and in that case it fixes things, but it would be great to get a report back @skorokithakis if it also solves the problem with an actual printer.

As a small side-note, that whole adjustment is only to get around any kind of weird garbage on the line during initial connect though (and I actually remember seeing something similar on early versions of the Hephestos 2 firmware, where the bootloader was doing some odd things, so that's not an isolated case even though it still strikes me as odd).

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 20, 2017

I tried stripping null bytes already, I'm afraid it made no difference...

foosel added a commit that referenced this issue Feb 20, 2017

Interpret wait during connect fully as ok
Otherwise we might switch to connected state but lack an
acknowledgement to send the next lines and stay in a wait cycle.

Also related to #1770
@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 20, 2017

It definitely made here, but as mentioned, I also did an additional step to even recover from the case that stripping the null bytes isn't sufficient (and I just pushed another commit for that since I forgot something in that additional step)

Here's what the connection to the virtual printer with simulated garbage now looks like:

Changing monitoring state from 'Opening serial port' to 'Connecting'
Send: N0 M110 N0*125
Recv: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ok
Changing monitoring state from 'Connecting' to 'Operational'
Send: N1 M105*38
Recv: ok
Send: N0 M110 N0*125
Recv: T:0.00 /0.00 B:1.00 /1.00 @:64
Recv: ok

and here's what now happens even without stripping the null bytes thanks to the second fallback workaround:

Changing monitoring state from 'Opening serial port' to 'Connecting'
Send: N0 M110 N0*125
Recv: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ok
Recv: wait
Changing monitoring state from 'Connecting' to 'Operational'
Send: N1 M105*38
Recv: ok
Send: N0 M110 N0*125
Recv: T:0.00 /0.00 B:1.00 /1.00 @:64
Recv: ok

(simulated Repetier Firmware behaviour, hence the ok comes before the command output).

For reference, this is without the changes:

Changing monitoring state from 'Opening serial port' to 'Connecting'
Send: N0 M110 N0*125
Recv: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ok
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Recv: wait
Connection closed, closing down monitor
Changing monitoring state from 'Connecting' to 'Offline'
@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 20, 2017

Ah, alright. Can you give me the list of commits to apply to my branch so I can test this? I want to just patch the files so I can revert them later, as I use Octoprint for all my printing and wouldn't want to be running HEAD, for obvious reasons.

EDIT: Ah, it seems there are too many changes. I'll see if I can just pull and then revert.

@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 20, 2017

You should just check out the maintenance branch, install it and restart your server (should display something like Version: 1.3.2.dev45+g9f96369 (maintenance branch) then, or a higher number and different commit hash after the dev), then when you are done testing check out master again, install it and restart the server. If it says "1.3.1" again it's exactly what you had before. For reference: master is ONLY updated on new releases (or hotfixes, that has only happened once so far), maintenance is what is going to be the next stable release and only gets fixes and improvements. devel is what's actually bleeding edge.

Since there can always be some weird kind of merge or code conflict when manually applying commits on a different base I'd prefer you test that way since that would actually verify that what is going to be the next stable version 1.3.2 solves your problem (which is the goal here), not that some "Frankenstein version" that will never actually exist like that does or doesn't solve it.

But if you insist: Related commits are the two cross-referenced above plus 7f0eb8d for escaping unprintable characters in the terminal tab for better debugging. If you apply those manually and it works, yay, if you do apply those and it doesn't I guess we'll have to wait until 1.3.2 is released to verify if you don't trust maintenance for whatever reason.

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 20, 2017

My problem is that I use Octopi, which does some weird things with packaging to get the latest version, and I don't want to mess that up. I'm just going to run the standalone server now, though, it should work fine. Thank you!

@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 20, 2017

On OctoPi the only weird thing is the virtual environment maybe, but other than that it's straight forward:

source ~/oprint/bin/activate
cd ~/OctoPrint
git checkout maintenance
python setup.py clean
python setup.py install
sudo service octoprint restart

Perform tests. Then to get back to stable:

source ~/oprint/bin/activate
cd ~/OctoPrint
git checkout master
python setup.py clean
python setup.py install
sudo service octoprint restart

I do this pretty much all the time on my local test RPis, jumping around multiple branches like that. If you run into any problems there, that should be investigated.

@skorokithakis

This comment has been minimized.

Copy link
Author

commented Feb 20, 2017

I tried three times, and it looks like this is fixed! I used to get the problem every single time before, and I didn't it at all now, so I'm fairly confident it's good, thank you!

@foosel

This comment has been minimized.

Copy link
Owner

commented Feb 20, 2017

🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.