Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recalculate crc checksum if zeros in payload are trimmed #241

Closed
wants to merge 1 commit into from
Closed

recalculate crc checksum if zeros in payload are trimmed #241

wants to merge 1 commit into from

Conversation

WeifengY
Copy link

issue: we may receive valid mavlink2 messages that contains "0" bytes
in the message payload which are not generated by this library.
for such a messages, we can succussfully convert it to a
mavlink_message_t. we may decide to forward it to its next stop.
when "mavlink_msg_to_send_buffer()" is called to convert the
mavlink_message_t to another sending buffer, the "0"s in the
playload are trimmed, but the message crc checksum is not updated
accordingly. as a result, on the next receiving end, the buffer
can no longer be assembled back to a mavlink_message_t.
cause: the message crc checksum doesn't match the payload content.
fix: recaculate the crc checksum if message payload is changed.

issue: we may receive valid mavlink2 messages that contains "0" bytes
       in the message payload which are not generated by this library.
       for such a messages, we can succussfully convert it to a
       mavlink_message_t. we may decide to forward it to its next stop.
       when "mavlink_msg_to_send_buffer()" is called to convert the
       mavlink_message_t to another sending buffer, the "0"s in the
       playload are trimmed, but the message crc checksum is not updated
       accordingly. as a result, on the next receiving end, the buffer
       can no longer be assembled back to a mavlink_message_t.
cause: the message crc checksum doesn't match the payload content.
fix:   recaculate the crc checksum if message payload is changed.
@julianoes
Copy link
Contributor

For more discussion check #237.

@julianoes
Copy link
Contributor

@tridge It would be great if you could check this and comment.

@OXINARF OXINARF requested a review from tridge November 14, 2018 13:21
@hamishwillee
Copy link
Contributor

FYI only, further discussion in #237 (comment)

@hamishwillee
Copy link
Contributor

FYI @tridge I have also updated the spec so that it makes it clear that trimming is expected, but we're allowing/recommending that implementations support processing/routing of non-compliant messages: mavlink/mavlink-devguide#135

@tridge
Copy link
Contributor

tridge commented Nov 19, 2018

I'd like to see this case tested in the test suite. I just tried modifying the test data so that all the generated fields are zero, and the new code never triggered. I'm not quite sure why that is.
I think this is an odd enough corner of the protocol that we do need it to be part of the test suite triggered in test_generator.sh before we make any changes in the implementation

@hamishwillee
Copy link
Contributor

@tridge YOu're absolutely right, but what should we be aiming to show from the test (i.e. what is a fail)?
Ie you could argue that the incoming untrimmed packet is

  • non-compliant and should just be dumped.
  • non-compliant but we should still allow it to be parsed and forwarded
  • compliant, and systems can choose to send with or without, and must handle incoming.

The dev meeting decided that the second is the best solution, but it occurs to me that this very much should depend on a) what all our implementation do (not just C) and b) "the right thing".

@auturgy
Copy link
Contributor

auturgy commented Nov 19, 2018 via email

@hamishwillee
Copy link
Contributor

@auturgy Thanks for the correction. I really don't think it is too late for us to decide how this "should" be. The original version strongly implied that the trimming should be used. Close enough that I'd be willing to say that a non-trimming implementation is non compliant if that is the best thing for the standard.

@tridge
Copy link
Contributor

tridge commented Dec 29, 2018

@WeifengY as I mentioned above, this needs a test. Can you please add a test to generator/C/test/posix/testmav.c to test the change. The test should fail without your patch and pass with it

@WeifengY
Copy link
Author

Hi @tridge ,
Looking into generator/C/test/posix/testmav.c, I have yet understood how test cases get run. I'll try to figure it out and add a case if possible.

@potaito
Copy link

potaito commented Feb 14, 2019

So to summarize

  • @tridge would be willing to merge if there is a unit-test showing failure without this PR and a pass with the fix proposed by this PR.
  • @auturgy does not think that zero-length mavlink messages are specs compliant in the first place
  • @WeifengY couln't figure out how to add new tests to generator/C/test/posix/testmav.c

I believe this is an actual bug. Regardless of the specs, zero-length messages should be handled properly on a transport protocol level. It would be a shame to just ignore the bug and corresponding fix proposed in this PR because zero-length mavlink messages are not expected. So before this PR becomes stale and disappears, is there anyone who can help @WeifengY write a test to prove the existence of the bug? Please? :)

@hamishwillee
Copy link
Contributor

@potaito Just to be very pedantic, @auturgy is saying that we'd like to say "zero trimmed MAVLink 2 messages are not allowed" because that is implied by the spec, but that we can't because the spec is not precise enough.
I am of the opinion that the spec is close enough to this, and I was hoping @tridge would agree so we could "make it so". At that point we would have a different PR than this and a different test.

But yes, this is too important to let go stale.

I don't understand the test system so can't help with that.

@WeifengY
Copy link
Author

Actually, I captured the UDP packets with 'winshark' and wrote a test program to verify it some weeks ago. However, the outcome of parsing was too weird and inconsistent with the result we got from the running drones.

here is my test program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <ctype.h>
#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>
#include <pthread.h>
#include "mavlink_types.h"
#include "mavlink.h"
#include "checksum.h"

#define GIMBAL_DEBUG_CRC_EXTRA 224
#if 0
unsigned char gimbal_debug_buf[] = {
    0xfd, 0x21, 0x00, 0x00, 0x01, 0x01, 0x00, 0x8a, 0x13, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xb0, 0x6b,
};
#else
unsigned char gimbal_debug_buf[] = {
    0xfd, 0x21, 0x00, 0x00, 0x01, 0x01, 0x00, 0x8a, 0x13, 0x00, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xb0, 0x6b
};
#endif

#define HEATBEAT_CRC_EXTRA 50
unsigned char heartbeat_buf[] = {
    0xfd, 0x09, 0x00, 0x00, 0x36, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06, 0x08,
    0xc0, 0x04, 0x03, 0xc9, 0x5b,
};

uint16_t calc_check(const uint8_t *buf, int len, uint8_t extra)
{
    uint16_t ck = 0;
	ck = crc_calculate(buf, len);
	crc_accumulate(extra, &ck);

    printf("\ncrc checksum: 0x%4x\n", (ck & 0xffff));

    return ck;
}

int parse_mavlink_buffer(unsigned char *databuf, int datalen, uint8_t crcext)
{
    mavlink_message_t msg;
    mavlink_status_t status;
    int i = 0;
    uint16_t ck = calc_check((const uint8_t *)&databuf[1], datalen - 3, crcext);

    printf("try to parse a mavlink message buffer (checksum 0x%04x):\n", 0xffff & ck);
    for (i = 0; i < datalen; )
    {
        int ret = mavlink_parse_char(MAVLINK_COMM_0, databuf[i], &msg, &status);
        printf("0x%02x ", (databuf[i] & 0xFF));
        if (++i % 16 == 0) {
            printf("\n");
        }
        if (ret == MAVLINK_FRAMING_OK) {
            printf("\n****\n");
            return 1;
        }
    }

    printf("\n^^^^\n");
    return 0;
}

int main(int argc, char * argv[]) {
    //unsigned char buf[256];
    //int  len;

    if (parse_mavlink_buffer(gimbal_debug_buf, sizeof(gimbal_debug_buf), GIMBAL_DEBUG_CRC_EXTRA)) {
        printf("\"gimbal_debug_buf\" is a valid mavlink2 message!.\n");
    } else {
        printf("\"gimbal_debug_buf\" is NOT a valid mavlink2 message!.\n");
    }

    if (parse_mavlink_buffer(heartbeat_buf, sizeof(heartbeat_buf), HEATBEAT_CRC_EXTRA)) {
        printf("\"heartbeat_buf\" is a valid mavlink2 message!.\n");
    } else {
        printf("\"heartbeat_buf\" is NOT a valid mavlink2 message!.\n");
    }

    return 0;
}

I ran it in both my ubuntu box and the drone console. the output came the same:

crc checksum: 0x6bb0
try to parse a mavlink message buffer (checksum 0x6bb0):

^^^^
"gimbal_debug_buf" is NOT a valid mavlink2 message!.

crc checksum: 0x5bc9
try to parse a mavlink message buffer (checksum 0x5bc9):


"heartbeat_buf" is a valid mavlink2 message!.

the result showed that the message buffer with zeros within failed to be reassembled to a Mavlink message, which is NOT the result of our drone firmware.
I didn't further dig in and had no willing to do that. I couldn't understand why the same mavlink library (I compiled the test program with the same mavlink library used in our truck) behaved in different manners. Only guessed that probably there was some compile options that mattered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants