Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forwarding of unknown messages in C library not supported #1092

Closed
MaEtUgR opened this issue Feb 28, 2019 · 13 comments
Closed

Forwarding of unknown messages in C library not supported #1092

MaEtUgR opened this issue Feb 28, 2019 · 13 comments

Comments

@MaEtUgR
Copy link
Contributor

MaEtUgR commented Feb 28, 2019

I had this problem together with @simonegu and asked @julianoes about the matter and he mentioned it's totally worth an issue:

Problem
Forwarding messages without parsing the payload seems to be not supported by the mavlink C library.

There are always "dumb" components which are not interested in the payload of the message but only have the responsibility to forward messages to the correct destination comparable to a network router or hub. Examples for such components commonly used with MAVLink are:

  • ESP8266 WiFi module shipped with the Pixracer board that forwards packets from the serial port of the Pixracer to the Wifi network devices.
  • The simulator in the PX4 HITL setup which connects to the drone over serial and forwards packets to the ground control station via UDP.

These components are often not upgraded for every single change in the MAVLink payload definitions. I'm also speaking here for companies that might want to adjust their custom MAVLink definitions more frequently than the common ones and they are forced to update every single component that's only relaying for every new message.

Cause

  • As far as I know you would forward MAVLink packets by calling the C library like this:
if (mavlink_parse_char(MAVLINK_COMM_0, c, &msg, &status) == MAVLINK_FRAMING_OK)
{
_mavlink_resend_uart(..., msg);
}

You receive a char c from your UART driver, you call the parse char to process it and it will return framing ok when its state machine ended up with a CRC checked, known packet that is ready for decoding or resending.

But when you're just resending the packet you don't care if the payload definition is known to the relaying component. You want to parse the MAVLink packet to know where it's supposed to go from the header and likely verify the data integrity with CRC to not forward any broken packets.

The part of the parsing that requires knowledge of the message definition is https://github.com/ArduPilot/pymavlink/blob/f92b861b34c06cbb2820709c5712ad4f4f01b350/generator/C/include_v2.0/mavlink_helpers.h#L714-L716

Solution?
The part that requires knowledge about the message definition is for the CRC_EXTRA calculation which is there to ensure that sender and receiver share the details of the message definition. Being able to check this property is useful but why is it not independent of the CRC? Isn't the CRC there to ensure the integrity of the packet and not the shared understanding about the payload definition.

If I read it correctly and the CRC check is bound to the payload definition could we at least add an option to parse a mavlink message worst case without the CRC check or known payload defintion to be able to just forward it?

@hamishwillee
Copy link
Collaborator

@MaEtUgR You are 100% correct. The other key case here is that if you're forwarding over a network that already has its own packet checking the CRC test is redundant - e.g. IP networks. So you want the ability to not do the message integrity check, the message signing check, or the message supported check.
This could probably be achieved pretty easily using a #define that is default disabled - thereby not breaking any existing APIs.

Can you create a PR in ArduPilot/MAVLink for the C library that solves this problem?

@MaEtUgR
Copy link
Contributor Author

MaEtUgR commented Mar 1, 2019

Can you create a PR in ArduPilot/MAVLink for the C library that solves this problem?

You mean https://github.com/ArduPilot/pymavlink right? To be honest that doesn't sound like peanuts to me because it looks easy to break functionality. I hoped there's a MAVLink expert who would be quicker. But I can try to convince myself to solve it.

@hamishwillee
Copy link
Collaborator

hamishwillee commented Mar 3, 2019

@MaEtUgR Yes, I mean https://github.com/ArduPilot/pymavlink

There may be such a MAVLink expert, but from what I've seen, the only changes that happen in any kind of timely fashion are those that people PR themselves.

Yes, this may not be easy. On further thinking:

  • forwarding has to be done by channel, because not every message can/should automatically be forwarded.
  • In addition to ignoring and CRC, CRC_EXTRA you also need to be able to forward even if the message is incorrectly signed with respect to the current node.

@hamishwillee
Copy link
Collaborator

But if you can't convince yourself to do this, I'd still post as an issue on https://github.com/ArduPilot/pymavlink

@hamishwillee
Copy link
Collaborator

@peterbarker I was wondering if you could offer advice on this. Essentially the question is "what is the best way to parse a message fully, ignoring any errors, so that you can forward it over another interface. This is needed so we can forward messages where the current library does not "understand" the message (not built to same dialect) and where it does not have the message signature. Ideally you would additionally like to be able to conditionally not-do the CRC check either, for example if routing over an IP network that does its own CRC.

The easiest way to do this would be to have new method like mavlink_parse_char() that always returns the final message object even if there are parsing errors. I even think you could reasonably argue that this is how the method should be - though it is a break in behaviour because the current version assumes a bad signature or bad CRC are not successful decoding.

I think you can just call mavlink_frame_char_buffer() ignoring all errors until either it returns msg_received of framing complete OR the parse state returns to MAVLINK_PARSE_STATE_IDLE . Possible problem -> I'm not certain that the signature gets copied in the case where there was already a CRC error.

MAVLINK_HELPER uint8_t mavlink_parse_char_new(uint8_t chan, uint8_t c, mavlink_message_t* r_message, mavlink_status_t* r_mavlink_status)
{
    uint8_t msg_received = mavlink_frame_char(chan, c, r_message, r_mavlink_status);
    if (msg_received == MAVLINK_FRAMING_BAD_CRC ||
	msg_received == MAVLINK_FRAMING_BAD_SIGNATURE) {
           if (status->parse_state == MAVLINK_PARSE_STATE_IDLE) {

	    // we got a bad CRC or a bad signature AND the parser has returned back to idle, indicating that we have got the signature. 
            //Do we need to copy our final value -  appears to happen in mavlink_frame_char_buffer. 
           // Probably should still  copy the error 

	    if (c == MAVLINK_STX)
	    {
		    status->parse_state = MAVLINK_PARSE_STATE_GOT_STX;
		    rxmsg->len = 0;
		    mavlink_start_checksum(rxmsg);
	    }
	    return 0;

           }
    }
    return msg_received;
}

Thoughts?

@MaEtUgR
Copy link
Contributor Author

MaEtUgR commented Mar 6, 2019

MAVLink call: @hamishwillee has ideas for a solution and James needs to be mentioned.

@hamishwillee
Copy link
Collaborator

@auturgy Here is summary of problem statement and possible solutions for @tridge.

The MAVLink specification states that:

  • systems must only locally process messages that they understand. So the message must:
    • have a valid CRC, indicating that the message is not corrupted AND that the library was built with a compatible message definition (CRC_EXTRA).
    • have a valid signature that matches the stored key.
    • have no incompatibility bits that are not understood
  • systems should route/forward messages according to another set of rules, irrespective of whether the system understands or trusts the message (as above).

The current MAVLink C library supports the first case (local processing) but not the second (forwarding). The reason is that:

  • mavlink_parse_char() never reports a complete message if there is a CRC, signing error or incompatiblity (depending on error, may just set parsing state to idle).
  • Under the hood mavlink_frame_char_buffer does not even separately check the CRC_EXTRA.
  • Incompatibility flags being detected sets parse state to idle - ie rest of message is skipped until next STX detected.

Anecdotally forwarding systems currently just build the system for as many dialects as possible, and allow all messages to be forwarded irrespective of signature. There is nothing they can do about incompatibility flags. None of which works very well or is very "safe".

What is needed is for compatible libraries to be able to:

  • fully parse messages with an invalid CRC_EXTRA, invalid signature, incompatibility bits, other errors
  • Report error/s so that the local system can choose whether to process the message locally (and forward)
  • IDEALLY, conditionally omit CRC check altogether for efficiency where message is being routed over a transport that does its own CRC.

Possible solution is:

  • Update mavlink_frame_char and mavlink_frame_char_buffer to separately report errors on all the fail cases - CRC, CRC_EXTRA, signature, incompatibility, (anything else? untrimmed mavlink 2 messages?)
  • A new version of mavlink_parse_char() that parses the whole message to completion (rather than stopping on errors) and returns an error object containing all the known errors.
    • users would call message until it returns complete in same way as mavlink_parse_char(). Unlike for mavlink_parse_char() they could not assume that a complete message is valid but must check the errors to determine whether to process locally.
    • They could then forward irrespective of errors, or based on specific errors (e.g. you might not forward bad CRC)
    • Could implement mavlink_parse_char in terms of this new method for compatibility.

It would be nice to be able to just rewrite mavlink_parse_char as above, but this is probably too great a compatibility break.

@hamishwillee hamishwillee changed the title Forwarding of unkown messages in C library not supported Forwarding of unknown messages in C library not supported Mar 6, 2019
@hamishwillee
Copy link
Collaborator

OK, I had a discussion on this with @tridge. The summary of which is:

  1. An approach like this might work but has the cost that it is likely to result in more lost data in lossy environments.
    • The reason being that right now when there is an error the parser immediately returns to the idle state and starts looking for the magic byte (STX). If we were to always process these to the end then missing packets in the stream could result in the next STX being missed before erroring, and we'd miss a whole packet (apparently there are other issues).
  2. This approach won't help deal with other types of packets interleaved on the interface
  3. There are various workarounds but they all cost memory. That would be fine on a companion or linux bridge computer perhaps, but not on FC.
  4. General direction for users is to roll out your own solution for forwarding. There is an example in the ardupilot fork of ESP8266 that does some clever/complicated parsing to forward packets, including data that is not understood.
    • This requires fairly deep knowledge of the parser state.

A solution is to buffer bytes since the last successful packet collected. In case of an error then callback with the current buffer (which may be forwarded) when the next STX is detected.

  • The buffer could be the maximum size of a message+signature by default
  • This should be enabled with a define so that low memory systems don't need to support it.
  • This would cope with any other type of non-parsable data too.
  • not sure what it should do with large amounts of non-parsable data - ie if the buffer over-runs should we just dump it? My first thought is that if we're in idle and we get a non-STX packet we should callback immediately.

@olliw42
Copy link
Contributor

olliw42 commented Apr 26, 2019

I think you guys forget about target sysid and compid
without having the dialect (i.e. MAVLINK_MESSAGE_CRCS) there is no way to know them and thus no chance to forward them (as specified in https://mavlink.io/en/guide/routing.html)
a major design flaw of mavlink (could have been easily corrected when going to v2, but no one listens, right)

:)

@hamishwillee
Copy link
Collaborator

I think you guys forget about target sysid and compid
without having the dialect (i.e. MAVLINK_MESSAGE_CRCS) there is no way to know them and thus no chance to forward them (as specified in https://mavlink.io/en/guide/routing.html)
a major design flaw of mavlink (could have been easily corrected when going to v2, but no one listens, right)

@olliw42 I hadn't forgotten about these. If the message target is unknown (ie it can't be read because we don't understand the message) it can be assumed to be broadcast - and so should be forwarded on all interfaces. We could state that in the routing docs, but I think something people would reasonably assume.

@olliw42
Copy link
Contributor

olliw42 commented Apr 29, 2019

well, I couldn't
thx for the clarification anyways

I think it MUST be added!
On the one hand you claim MAVLink to be a safety critical part and be thus soo concerned and soo careful and soo professional and on the other hand leave the specification to the people's guesses ...

It's also a matter of exposing a logically consistent protocol.

The conclusion of the forwarding of the unknowns then MUST be that essentially every piece of junk which starts with a STX is broadcasted to any other device. I'm not sure - and am not convinced - if that is what one would reasonably expect a protocol to do. I note that essentially any other protocol does this as expected, and one could have learned from the existing.

So, it seems we at least agree in that MAVLink is not good at keeping the channels free of junk. :)

@hamishwillee
Copy link
Collaborator

hamishwillee commented Apr 30, 2019

@olliw42 Fair enough - if it is ambiguous or unclear it should be fixed. Done in mavlink/mavlink-devguide#179. Note that anyone can make fixes to the spec and we'll review.

The conclusion of the forwarding of the unknowns then MUST be that essentially every piece of junk which starts with a STX is broadcasted to any other device. I'm not sure - and am not convinced - if that is what one would reasonably expect a protocol to do. I note that essentially any other protocol does this as expected, and one could have learned from the existing.
So, it seems we at least agree in that MAVLink is not good at keeping the channels free of junk. :)

If you don't do it this way then you need to ensure that every routing component is kept up to date with the latest version of the library, and it has to know every message type that you might want to send on the network in advance. That is quite a burden on users and manufacturers of MAVLink bridge components.

Further, it is quite possible that other things are running on the same network and you might not want to swallow/clean up their packets either. The ESP8266 wifi bridge for example also forwards any packet it gets whether MAVLink or not.

Yes, we can certainly learn from other protocols in many areas (e.g. discoverability and id allocation), but I'm not convinced that this particular case is one of them.

@LorenzMeier
Copy link
Member

Closing this in favor of #1166.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants