Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

message CRC checksum is wrong when a mavlink_message_t is converted to a sending buffer #237

Open
WeifengY opened this issue Nov 7, 2018 · 26 comments

Comments

@WeifengY
Copy link

WeifengY commented Nov 7, 2018

issue: we may receive valid mavlink2 messages that contains "0" bytes
       in the message payload which are not generated by this library.
       for such a messages, we can succussfully convert it to a
       mavlink_message_t. we may decide to forward it to its next stop.
       when "mavlink_msg_to_send_buffer()" is called to convert the
       mavlink_message_t to another sending buffer, the "0"s in the
       playload are trimmed, but the message crc checksum is not updated
       accordingly. as a result, on the next receiving end, the buffer
       can no longer be assembled back to a mavlink_message_t.
cause: the message crc checksum doesn't match the payload content.
fix:   recaculate the crc checksum if message payload is changed.

patch is attached.
recalculate_crc_if_payload_changes.zip

@julianoes
Copy link
Contributor

This is the patch:

diff --git a/generator/C/include_v2.0/mavlink_helpers.h b/generator/C/include_v2.0/mavlink_helpers.h
index 19a76e0..cc9b385 100644
--- a/generator/C/include_v2.0/mavlink_helpers.h
+++ b/generator/C/include_v2.0/mavlink_helpers.h
@@ -15,6 +15,7 @@
 #ifdef MAVLINK_USE_CXX_NAMESPACE
 namespace mavlink {
 #endif
+MAVLINK_HELPER uint8_t mavlink_get_crc_extra(const mavlink_message_t *msg);
 
 /*
  * Internal function to give access to the channel status for each channel
@@ -447,8 +448,17 @@ MAVLINK_HELPER uint16_t mavlink_msg_to_send_buffer(uint8_t *buf, const mavlink_m
 		ck = buf + header_len + 1 + (uint16_t)length;
 		signature_len = (msg->incompat_flags & MAVLINK_IFLAG_SIGNED)?MAVLINK_SIGNATURE_BLOCK_LEN:0;
 	}
-	ck[0] = (uint8_t)(msg->checksum & 0xFF);
-	ck[1] = (uint8_t)(msg->checksum >> 8);
+	if (length != msg->len) {
+		uint8_t crc_extra = mavlink_get_crc_extra(msg);
+		uint16_t checksum = crc_calculate((const uint8_t*)&buf[1], header_len);
+		crc_accumulate_buffer(&checksum, _MAV_PAYLOAD(msg), length);
+		crc_accumulate(crc_extra, &checksum);
+		ck[0] = (uint8_t)(checksum & 0xFF);
+		ck[1] = (uint8_t)(checksum >> 8);
+	} else {
+		ck[0] = (uint8_t)(msg->checksum & 0xFF);
+		ck[1] = (uint8_t)(msg->checksum >> 8);
+	}
 	if (signature_len > 0) {
 		memcpy(&ck[2], msg->signature, signature_len);
 	}

@julianoes
Copy link
Contributor

julianoes commented Nov 7, 2018

Something to check in the next mavlink call. @hamishwillee can you table this or add the tag please? Thx.

@auturgy
Copy link
Contributor

auturgy commented Nov 7, 2018 via email

@hamishwillee
Copy link
Contributor

Perhaps better to regenerate this as a PR so that it can be merged if approved?

Possibly I'm missing something, but I don't understand why there is an expectation that the mavlink_msg_to_send_buffer() will recalculate the checksum. Also, why it would need to - just populate the mavlink_message_t with the checksum from the incoming message (it was calculated when originally sent for a message that has no zeros).

@julianoes What is it that you want to discuss about this? I have added to the 20181114 but is intention just getting review or determining whether this is even a good idea? Or to put it another way, why not follow normal PR process?

As an aside:

  • The routing rules state that messages should be routed unchanged. Is this the right mechanism to do so?
  • FYI there is a possibly related comment here on the incoming parser for bad crcs). Possibly not :-)

@WeifengY
Copy link
Author

WeifengY commented Nov 8, 2018

the use case is like this:
A
|
|
V
B -----> C
|
|
V
D
the connection diagram is like above.
The key of the problem is that A is coded with some other Mavlink libraries, which don't trim the tailing zeros in the payload of its outcoming messages. But doubtlessly, the messages conform to Mavlink2 protocol.
A sends a message to B, and B forwards the message to C or D based on the routing information (message id, target component id, etc) included in the message.
On receiving the message, B converts the receiving buffer to a mavlink_message_t with function "mavlink_parse_char()". the crc checksum in this mavlink_message_t matches its payload (including the tailing zeros), so it's OK. now B decides to forward it to the next stop and calls "mavlink_msg_to_send_buffer()" to convert the mavlink_message_t to a sending buffer. As the mavlink_message_t contains tailing zeros, "mavlink_msg_to_send_buffer()" trims the zeros but doesn't change the crc checksum accordingly. B sends this sending buffer to C or D.
As a result, the message received on C or D end fails to assemble the buffer back to a mavlink_message_t due to crc check failure.

@hamishwillee
Copy link
Contributor

@WeifengY Thank you for the more detailed explanation. Some thoughts:

  1. I'm not sure a message can conform to MAVLink 2 if it doesn't strip the zero's (ie it is not "optional"). What systems are generating these messages?
  2. Messages are supposed to be forwarded unchanged. Unfortunately I don't see a mechanism for getting the original message as a buffer and pushing it out to another channel ....

Interested to hear what others think about this. It may be that we should make the code robust enough to handle this case even if it is "off spec" (and I'm not saying it definitely is!)

@yaoyz-yuneec
Copy link

yaoyz-yuneec commented Nov 9, 2018

@hamishwillee @julianoes
Thanks very much for the help.
It is meaningful for the transmission bandwidth to cut off 0 of the tail, but the corresponding CRC should also be synchronized.
The case(buf->msg->buf) mentioned by @WeifengY will be often used by some complex systems and products.

@hamishwillee
Copy link
Contributor

Hi @yaoyz-yuneec

I believe I do understand what you are saying. The incoming message comes in "untrimmed" and with a CRC calculated for the untrimmed string. You can't use the original CRC from the incoming message because mavlink_msg_to_send_buffer will trim the message before sending (so incoming CRC no longer matches).

My issue with this is not what is happening, but what the right response should be given that the incoming message is not compliant with the spec. After all, you're adding a CRC calculation to every single time this method is used - even though it should not be needed if the incoming message was correct.

@julianoes @auturgy Do you have an opinion?

@WeifengY
Copy link
Author

@hamishwillee ,
We are not recalculating the CRC for every single call of mavlink_msg_to_send_buffer . recalculation only happens when there are some "0"s are trimmed. Please refer to @julianoes ' comment: #237 (comment) or the patch file posted in this thread.

@haiyangyuneec
Copy link

There are two options:

  1. We define the spec to say with zero at end of message is INVALID
    In such case,
    a. The mavlink parse should NOT parse message or accept it, since message with zero at end of itself is INVALID.
    b. The trim in mavlink function should be removed, since we needn't such trim for INVALID message which should never be there. We are spending CPU for unnecessary cases.

  2. We define the spec to say with zero at end of message is VALID
    In such case, the trim logic makes sense to decrease the message payload, and the CRC recalculate trim is required.

In any case, trim a message without recalculating CRC is incorrect.

@julianoes
Copy link
Contributor

@haiyangyuneec @WeifengY we discussed this in the mavlink coordination call trying to understand it and come to a solution.

We wondered again which implementation sends untrimmed mavlink 2? Is it one of the open-source ones?

We agreed that the spec is not clear on whether you have to trim the zeros or not, so it probably makes sense that messages with zeros get correctly parsed.

It was brought up in the call that you should be using the _finalize method to make sure the CRC of the message is set, however, this means that you need to calculate the CRC for all messages (trimmed or not) and you waste CPU.

I also found the function _mavlink_resend_uart (https://github.com/ArduPilot/pymavlink/blob/master/generator/C/include_v2.0/mavlink_helpers.h#L363-L411) which might work for re-sending it and does not trim the message.

My takeaway after reading all this and the notes at the meeting is that your change makes sense, or at least is a valid workaround for this edge case that exists because the spec is not clear.

I suggest that you make a pull request with the change, and hopefully we can get it reviewed.

@WeifengY
Copy link
Author

Hi @julianoes
It's great to hear your update. I had a try to commit a PR to the pymavlink project but failed due to lack of write permission before I opening this issue.
I'm glad to have another attempt if permission is granted.

@julianoes
Copy link
Contributor

@WeifengY oh got it. You need to make a fork and make the pull request from there:
https://help.github.com/articles/creating-a-pull-request-from-a-fork/

@hamishwillee
Copy link
Contributor

@haiyangyuneec @WeifengY See ...

We wondered again which implementation sends untrimmed mavlink 2? Is it one of the open-source ones?

@WeifengY
Copy link
Author

I'll find some people to get the answer of this question. It may take some time.

@WeifengY
Copy link
Author

@julianoes . Well, your instruction helps. I made the pull request. #241

@WeifengY
Copy link
Author

Hi @julianoes, @hamishwillee ,
The implementation who sends untrimmed mavlink messages is not from an open source project.

@hamishwillee
Copy link
Contributor

hamishwillee commented Nov 18, 2018

Hi @WeifengY

Thank you.

The implementation who sends untrimmed mavlink messages is not from an open source project.

Even though we are probably going to add support for handling this kind of message, it is non compliant. I have added a PR to make this clearer in the docs: mavlink/mavlink-devguide#135

Can you forward this information to the implementation and let them know that they should change this?

@haiyangyuneec
Copy link

Again, if untrimmed message is invalid, please check these,

a. The mavlink parse should NOT parse message or accept it, since message with zero at end of itself is INVALID.
b. The trim in mavlink function should be removed, since we needn't such trim for INVALID message which should never be there. We are spending CPU for unnecessary cases.

@hamishwillee
Copy link
Contributor

@haiyangyuneec
What mavlink/mavlink-devguide#135 says is that it is invalid to send an untrimmed message, but not invalid to receive one.

The spec has always said that messages should be trimmed, but that was not forceful enough.

The spec mentioned did not mention the case of receiving untrimmed messages because we assumed there would be no one sending such messages. However now that we have a non compliant sender we have a choice to accept the messages or not. This update to the spec says that we will. If you disagree, then please add your reasons as a response to mavlink/mavlink-devguide#135

@haiyangyuneec
Copy link

I am quite confused. Why accept invalid message and the parser can get invalid message out of the stream? Even accept the invalid message, the trim logic is definitely wrong without recalculating crc after trimming.

Everything is clear and it is up to you to fix such bug or not.

@WeifengY
Copy link
Author

Hi @hamishwillee ,
Thanks a lot for accepting it even though the message may be not strictly comply with the spec. I can inform the message sender of the risk of the possible incompatibility with the spec.
However, let's make one step backward. I assume there are lots of programmers around the world who are interested in playing with drones, some of them are professionals but many are merely amateurs. So you cannot force all of them not to send out "untrimmed" messages, if such messages can be well parsed and accepted. So the best thing the library can do is either to reject this kind of messages OR make them fully compatible. It's not a good idea to make it "partially" compatible, that is accepting the messages on the receiving end but failing to resending them on the forwarding port.

@hamishwillee
Copy link
Contributor

hamishwillee commented Nov 19, 2018

Hi @WeifengY @haiyangyuneec

Note that neither the refinement to the spec or the PR have been accepted. We don't have a test case for this so even though allows this is easy for C we need to have a look at the other language bindings too. A test case is being written.
If all the other language bindings fail, we may have to revisit this.

BUT, if it is possible then ...

Why accept invalid message and the parser can get invalid message out of the stream?

Because we wanted to make things easier for you, and because it is easy to do with our current implementation.

Even though we will effectively allow non-compliant messages to be sent by letting systems handle them, we can still go to vendors who are sending untrimmed zeros and say "you're not compliant". This means we can encourage the right behaviour without breaking you.

So the best thing the library can do is either to reject this kind of messages OR make them fully compatible. It's not a good idea to make it "partially" compatible, that is accepting the messages on the receiving end but failing to resending them on the forwarding port.

Just to be clear, the proposed refinement to the spec will require the non-compliant messages to be resent. I just changed mavlink/mavlink-devguide#135 to make that clear.

But again, it might not happen if the effort to update the other implementations is too great. As you say, in some ways it is better to just reject the invalid messages and remove any confusion or doubt.

@magicrub
Copy link
Contributor

magicrub commented Nov 8, 2019

any movement on this?

@hamishwillee
Copy link
Contributor

@magicrub No. This is waiting on "someone" to write a test case for mavlink 2 untrimmed messages. FWIW IMO we'd have been better saying they are not allowed.

@iottrends
Copy link

Hi,
I am also facing bad crc error, I am using a transparent datalink with 250 byte pkt limit...
any fix that handles partial Mavlink messages... As stats show link quality as ~50%
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants