New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Repetier Resend Request #1015
Conversation
Alright after testing this 2 prints went through without problems. |
Can you provide an example log output of what Repetier sends in case of a Resend request? I'm a bit puzzled that it didn't get caught by the existing mechanism. The processing in place so far should match all of
unless I'm mistaken in reading the code, so I'm a bit surprised that apparently that wasn't enough. Also, what Format error? |
Repetier send "Resend:12345" without a space between : and 12345 and it also is not caught by the if "rs" in line |
Well as I understand the Repetier code right it tries 3 times to get the command resended if this fails it says the command has a Format Error. |
The code should turn the
So a Something in that line apparently doesn't work like it should when encountering a resend request from your repetier version and hence necessitates your change. I'd prefer to first understand what is really happening here, I fear it might lead to other problems otherwise. If you can get around to reproducing a resend error somehow (I know that can be tricky, stuff never fails when you need it to ;)) and add the log here that would probably help to get to the root cause. Let us discuss the Format Error stuff in another issue/on IRC on Monday and tackle this resend business here first, one problem at a time ;) From the sounds of it however I'd expect the firmware to reset its internal "I'm not receiving that line properly" counter once the resend succeeds, so the error should only occur when stuff really runs amiss, in which case a print abort with all consequences is valid behaviour. |
Hmm you are right this line should capture it... could it be because of the -1 instead of 1? |
|
Okay I checked the source the full Resend line "Resend:12345\r\n" Will try that line with the parsing in the python interpreter later this day. |
Already tried:
Something's not right here. We need a log I fear. |
Think so too, tried too because I didn't saw your response... Will try to capture a Resend request gimme a moment |
Not sure if this is important here, but Repetier always requests resends like this:
ok is without line number here even if line numbers are enabled for ok. |
@repetier thanks, that looks exactly like what I would expect it to look like and what should cause no problems to parse. OctoPrint will react to the |
... great I have forced a Resend request (Wrong checksums are great hehe), Well the Resend parsing works great so my "fix" is a piece of shit... But that doesn't explain why the Resend failed for me here... |
Aww come on. It isn't, it just is fixing the wrong problem. It probably makes sense to log any exceptions that occur while trying to parse the resend request, at least if the rs branch isn't hit, my guess is that is what happened for you. Some hiccup on the line causing an exception to be raised here which was then silently thrown away. So that should in fact be looked into. |
I'm already looking into because this bug crashed 3 of my prints xD It's something personal now hehehe |
Worst case as always can reproduce the bug with serial logging enabled... I hate sporadic bugs... But I believe it could also be the firmware trying to investigate further... |
Alright I got a truncated serial.log now, sorry for that but I had to disable kommunikation logging to force this bug: https://gist.github.com/Salandora/4b1daa80f091e2e4c826 Summary:
For me it now looks like a Firmware bug, it might be I'm sending to fast for Repetier to handle it. @foosel, @repetier what are your opinions on this? Could it be a Firmware bug? |
First repetier uses error prefix to tell something is wrong, not to stop communication. A error is a message a user normally want's to see in log so he knows why things go wrong. That is also the way my server and host handle errors. If firmware needs to stop it can call kill to refuse new commands or reset causing a new "start" line to appear which also should reset ongoing prints. A resend request can get lost by transmission error making it "Reend:23344" or something similar. For that reason it is repeated after a time (0.2s I think it is) so it can make sure that host always will react correctly. With bad timing it can of course get tricky if you resend a line and get a resend request in parallel. |
That is a fairly individual interpretation of the "protocol" though. Once upon a time an So, your
So you are basically spamming the host and leave it up to it to weed through the garbage :/ How should a host distinguish between the repeated Resends caused by one misreceived line or a misreceiving of the same line multiple times (which can happen just as easily)? E.g.
My guess is that you always prefix your "new" resends with additional I see the point about the Finally, I understand the reasoning behind these decisions, but reinterpreting the "protocol" in such a way is a bit tricky if you don't also document it along side your firmware. If you care about interoperability that is ;) You are basically creating your custom sub-protocol here that is incompatible with the "established standard" (however idiotic that might be at places). |
Closed because of new PR #1032 |
Well an error is something that went wrong. It does not say you need to disconnect per definition. And regarding standards you know good enough that there is no real common standard. When I started with the firmware there was no Marlin and no echo: at all and so repetier copied that behavior. Later Marlin started to prefix it with echo: but is it a standard if Marlin changes behavior? Disconnecting is a bit hard I find, especially since firmware continues to work. I agree that stopping a running print for most errors is normally a good thing since we can expect that the print has failed. Regarding your whitelist I'd put these recoverable com errors into the whitelist if not already done: Regarding resend logic, I use the following for repetier firmware (I connect with M115 to detect firmware type or let user define it in case of server): |
I was not referring to Marlin in my statement. I can't find where I read that though at the moment, but it was a quite succinct definition along the lines of "Error only if broken beyond repair". Might have been the source of Sprinter or Teacup, or something else. Considering the above stated problems though it made sense and I still think there needs to be some kind of definition with regards to what is recoverable and what isn't, and that a (user extendable) white list is the only "safe" approach until something like that gets established. I agree though that the disconnect should be reconsidered. @Salandora pushed a commit (see above) that adds the format error to the whitelist. Anything with the word "checksum" in it was already whitelisted.
For what string in the output of that one can I be absolutely certain I'll always be able to detect your firmware? I've had bad experiences with the reliability of the response of that one, which is why I'm asking. If you gave me something dependable to work here, that would help a lot. |
You are right that M115 is not that safe. Host uses the word "repetier" in the lowercased string. But of course I already had companies renaming the M115 string into printer name and wondering why it does not get detected. So for the server I now let user enter the firmware type. This also has some advantages since I have a xml firmware definition file for each firmware. So I have less strings to compare and can adjust to different behavior and can also use command definitions in cases where it is different for different firmwares. Makes the whole thing much more portable once you have a processor that converts such a xml file into parser and sender. I agree that a distinction would be great to have. Didn't java have log level fatal for not recoverable errors? But as always we can not fix old stuff, so we need to take care of that in newer software. Until now I was letting prints continue even with errors knowing that not all errors are fatal ones, and now you make me think. I really would not disconnect since many errors are not that harmful like moves outside print area catched by firmware. This lets firmware continue but of course aborting a print might be a good idea if this happens. Errors about failed heaters also ruin the print and should aborted. Here if I remember right marlin ignores anything but M999 while repetier-firmware continuous without extrusion. And I also saw kill being called on Marlin and do reset as well in very severe states, so disconnect is a bit of overcompensation, but that is your software so your decision. And again you are right that blacklisting will never work. We can not predict future additions even if we would try to find all existing error messages. So I guess if I start stopping on errors I will also use a whitelist as protection. |
Ok, that's something then.
I feared as much. If only use it as a guess basically, overridable by the user.
Hah. Just the other day I was talking with @Salandora on IRC about adding something like that to OctoPrint, for redefining white lists and such stuff in a portable way. Didn't think about xml though, I don't like the overhead of that so much to be honest. I was more thinking into the direction of YAML.
Yeah. I would lean more towards error codes though. Something like the http approach maybe. And going with two letter prefixes (ok, !!, rs) instead. Man... I really need to find the time again to get back on that reprapcode definition stuff. The last month's have been nuts.
If so, I can tell you that OctoPrint currently has these words on it: "line number", "expected line", "checksum", "format error" (those are usually followed by a resend request) and "volume.init", "openroot", "workdir", "error writing to file". That lost can definitely reduced further when only prints are cancelled as a consequence instead of performing a disconnect to include only the first (communication problem indicating) block |
I used XML as it is a bit better readable with attributes (which YAML doe snot have) that can later be extended. Never used YAML myself so it might be more of being more used to it. Concerning the parser you are fully right. Compared with JSON which I use for communication, xml is a beast I also like to avoid and for simple xml files I have nice classes to simply read them out. If you outsource do not forget to put some things like "does ok after resend" into it, so you can reduce config options obsolete once you know the firmware. |
I noticed that the Resend Request of Repetier weren'T handled correctly after a short look I noticed that only the new resend way was getting recognized but not parsed correctly.
I added the missing Resend parsing.