Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Issues with big packets #26
I'm sorry for the long delay in my response. I hadn't had the time to test everything until this week. Fortunately, after a lot of testing, I can say it is very stable now when sending short messages. I haven't noticed any more problems so far. I added two more bytes to my messages (a total of 4 bytes now) to be more stable in general, though.
After the problems with small packets, however, I have now problems with big packets (e.g. 50 bytes or more) :-) The problem is not the message's size itself, though. There must be another bug in the receive function. The problem might be that the CRC could be correct although the message isn't.
1.) When PACKET_MAX_LENGTH (e.g. = 50) was lower than the actual size of the received package (e.g. = 60), the first strange thing happened. The sender tried to send the message over and over again but failed with 256 (FAIL). The receiver, however, received the message and instead of declining it, this->_receiver() was called with the data of the last correct message (e.g. the last short message). Long story short, with every FAIL at the sender, I got the same old message at the receiver.
2.) I have to send about 30 packets with a size of about 250 bytes each. As this didn't work very well, I added an id to the messages to see if the order was correct and if all packets were received correctly. I was very surprised when I checked the ids in my receiver function:
0, 0, 1, 2, 3, 4, 5, 5, 8, 9, 10, ...
So, in some cases I recevied the same package twice (while the sender failed with 256). In other cases, I didn't receive the package at all (while the sender reported ACK). It was definitely better after I changed
My assumption is that there are cases where the loop breaks after one or two bytes (because package_length is 1 or 2 or something) but the CRC is still correct. Then it calls
I have no problem with the fact that big packets might cause a lot of trouble (several FAILs before ACK, interference, etc.) but the fact that the library seems to behave unexpected is the main problem in this case. It works if I send the data splitted into very small packets but that makes it slower. Any idea?
P.S.: As promised I will also reply to your e-mail in the next few days.
Hei! Thank you again for your support.
I think a solution might be at line 409 https://github.com/gioblu/PJON/blob/master/PJON.cpp#L409
So every time a length is saved if less than 4 return FAIL.
Hi domonetic. You mean this changes in particular of you are suggesting to make a new release for every changes made?
I any case I am working on this problem, but seems the code I proposed is a solution.
added a commit
Feb 27, 2016
So I tested over and over again, and I have obtained similar results. As described filtering away possible packets with a length from 0 to 3 (that in any case should not be considered) the problem seems solved. By my side I agree with @TeeTrizZz .
To give a fix to the second issue I added the
I will now make a new release as @domonetic correctly suggested.
Thank you for the tests and the fix gioblu! I can't test it with large messages right now. I had a few more problems with short messages in the last few days, though. I used six Arduino Nanos and after starting them, the communication between the devices usually worked flawlessly for a while. But after a few seconds, minutes, or whatever, it broke. It felt like the communication was stuck somehow. I had to restart the Nanos to get it back working. As this was on a trade fair, I had no time to debug the system. I will do that on Monday. I assume that all these problems have the same reason.
I just wrote a litte console application to "simulate" your library and all of the messages our system currently uses. This tool basically checks what happens when an Arduino starts to read the data stream at an unknown time. It calculates the original CRC and compares it to the CRC of a message where the first byte, the first two bytes, the first three bytes, etc. are missing. It outputs all messages where the CRC was correct although one or more bytes were missing or are part of the next message (i.e. all messages which would lead to a faulty ACK). A faulty ACK also means that two messages are sent over the channel at the same time if the sender is still sending or is already sending the next message.
I will post the results later this day!
So, @TeeTrizZz if you are ok with it and you perceive this problem fixed I would close this issue and open anotherone for the "freezing after a while" problem. I am sorry if this affected your showcase, but I am in any case sure we will identify and fix the problem soon. Thank you again for your support and I am so curious to see your console application.
Ok, I started a test with only two devices on the bus, I think it will not crash, and this will be a good indication of where the bug could be located in the codebase.
You want me to create the issue or you are doing so (only want to avoid duplication :) )?
Thank you again!
Unfortunately - or fortunately :-) -, I have no relevant results yet. Only two of the 3900 possible messages might be problematic every now and then. While this is actually pretty good, the CRC respectively the checksum could be a bit more reliable, though. Maybe it should take the position of a byte into account or something.
Anyway, I guess something else was the reason for the frozen communication. A buffer overflow could've been one of the problems. I have changed MAX_PACKETS but I'm not sure if I did this for every device. I also forgot to handle the MEMORY_FULL error. However, this shouldn't break everything and so this can't be the only reason, though. I'm going to think about a test program which runs directly on the Arduinos. I will test this on Monday when our six devices are back from the trade fair. I will open an issue as soon as I know more!
P.S.: I like the CONTENT_TOO_LONG error. This is pretty useful!