Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add helper function for routing #1166

Open
LorenzMeier opened this issue Jun 26, 2019 · 11 comments
Open

Add helper function for routing #1166

LorenzMeier opened this issue Jun 26, 2019 · 11 comments
Labels

Comments

@LorenzMeier
Copy link
Member

LorenzMeier commented Jun 26, 2019

We need a new parser that allows parsing a message with a failed CRC match such that we can show how routing should be done.

The beauty of the approach laid out below is that it is completely sound for links that have signing enabled and reasonably robust for links without signing.

@hamishwillee Came up with this rough proposal:
#1092

The new parser code is below. The state machine would look like this:

Assumptions:

  • The router sees data from an IP or wireless link which has packet-level checksums (true for all known instances)
  • The router might also see data from a serial link and has a sufficiently large RX buffer that the buffer doesn't overflow
  • We infer packet boundaries by looking at STX + len and checking that the following byte is STX (for a FIFO) or the packet is complete.

If any of the above assumptions are violated, the parsing could lose sync while it sees unknown messages and would only re-gain sync on parsing a known message. In pretty much all "real" implementations this seems like an acceptable known limitation.

Flow:

  • Have a buffer per connected link
  • Parse all bytes incoming from this link
  • Look in first priority for a valid signature to verify the packet
  • If not signed: Look for packet end markers (wireless links / UDP) or STX + len followed by another STX
  • Forward packet
for (int i = 0; i < len; i++) {
if (i = len - 1 && source == udp) {
next_c == STX;
} else if {
// Provide next character from buffer
}

mavlink_parse_char_new(uint8_t chan, uint8_t c, uint8_t next_c, mavlink_message_t* r_message, mavlink_status_t* r_mavlink_status);
}

MAVLINK_HELPER uint8_t mavlink_parse_char_new(uint8_t chan, uint8_t c, uint8_t next_c, mavlink_message_t* r_message, mavlink_status_t* r_mavlink_status)
{
uint8_t msg_received = mavlink_frame_char(chan, c, r_message, r_mavlink_status);
if (msg_received == MAVLINK_FRAMING_BAD_CRC ||
msg_received == MAVLINK_FRAMING_BAD_SIGNATURE) {
if (status->parse_state == MAVLINK_PARSE_STATE_IDLE && next_c == MAVLINK_STX) {

// we got a bad CRC or a bad signature AND the parser has returned back to idle, indicating that we have got the signature.
//Do we need to copy our final value - appears to happen in mavlink_frame_char_buffer.
// Probably should still copy the error

if (c == MAVLINK_STX)
{
status->parse_state = MAVLINK_PARSE_STATE_GOT_STX;
rxmsg->len = 0;
mavlink_start_checksum(rxmsg);
}
return 0;

}
}
return msg_received;
}
@olliw42
Copy link
Contributor

olliw42 commented Jun 27, 2019

We infer packet boundaries by looking at STX + len and checking that the following byte is STX (for a FIFO) or the packet is complete.

which implies that if there is a time gap to the next message the parser and thus the whole mavlink handling gets stuck until a next message comes in.

So this proposal, as is, is based on the additional assumption that there is a pretty intense and continuous communication on each channel with sufficiently small time gaps between the messages. One should at least list this.

Since this is the parser, this also would affect end-point components, not only routing components, and it here could have potentially dire consequences.

A remedy I would see on the quick is to modify the parser such that it behaves as before if there is no next byte in the buffer, or a short timeout is set. In which case the change suggested by the proposal is largely a none change. One also could offer two parsers, for routers and endpoints, maybe via a #define.

The proposal also doesn't address the issue of not being able to identify the target system. It's "funny" how inconsistent the arguments are used. One time messages with a target system should not be broadcast, the other time its accepted as default (like here).

This is beauty and not a hack to work around a flaw?

I would suggest to grab the issue at it's horns. There are probably more potential solutions, and probably better ones than the following, it's meant just as an example to give the spirit and direction and not a ready-to-go solution. (the above proposal isn't ready to go either, right)

The real issue is that the mavlink format entangles what could be called protocol and data, and I thus would suggest to find ways to disentangle. The entanglement is largely due to two points, the extra crc and the unknown location of the target values. The proper solution to the target thing is obvious, they should be placed at a fixed position in the data stream, e.g. after the message id (and a flag should/could indicate it's presence/nonpresence if one doesn't want them to be there always). There is no obvious unique solution to the extra crc thing, all options I considered at the end always led to the addition of one additional byte in the data stream. If one were to take this as a given, a simple solution would be to simply place it at the end of the data, before the crc. Now the data stream has been modified, let's call it Mavlink 2.1, so we need to ensure backwards compatibility. One could for instance use some of the free bits in the header to indicate that. The parsing and the routing for these messages would now be totally canonical, and one also wouldn't have to do the search for the target anymore, which would be a performance and resource gain. Since the free bits were not used so far, setting them would not affect existing implementations, for them things would be just as they are. The penetration into the routing components also would not be slower than with the above proposal, since they need to be updated either way. The penetration into GCSes should be no problem, since they are updated frequently and users - I presume - follow the updates frequently. The penetration into components might be slower however, and this would be the disadvantage in comparison to the above proposal. However, the substantial advantage, namely that one would set the train properly on the track and would finally set a path to converge to something robust, reliable, performance efficient, and to what is just a reasonable protocol should - IMHO - more than outweight the one drawback.

beautiful hacks are great, but not all hacks are beautiful ;)

@hamishwillee
Copy link
Collaborator

hamishwillee commented Jun 28, 2019

@LorenzMeier how would this new method be used? The current version of mavlink_parse_char() essentially always swallows errors from mavlink_frame_char() by returning 0 for these cases - and otherwise returns 0 (incomplete framing) until you have fully parsed message.
I think that this code example doesn't swallow the errors because the code to return 0 is only executed when you're in idle/next thing is STX. So this will complete as soon as you have any error. Could be wrong.

Is the intention that when the next STX is received that is when you return msg_received which will have the whole message decoded (and errors)? So you would then check the error and decide whether to forward or not. Would be good to have an example.

I think @olliw42 has a point in that mavlink_message_t still doesn't make it easy to work out the destination for forwarding the message - so still non-trivial to work out whether message should be forwarded and where (according to spec if it is addressed then you should only forward to an interface if the target has been seen on that link).

I'll leave you to discuss the rest of @olliw42 post as I'm not sure I fully understand it.

@julianoes julianoes added the Dev Call Issues to be discussed during the Dev Call label Jul 1, 2019
@stale
Copy link

stale bot commented Apr 23, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Apr 23, 2020
@stale
Copy link

stale bot commented Jun 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale bot added the stale label Jun 22, 2020
@hamishwillee hamishwillee removed stale wontfix Dev Call Issues to be discussed during the Dev Call labels Jun 22, 2020
@hamishwillee
Copy link
Collaborator

I've un-staled this. It is stale, but it would also be good to finish.

@eminakgun
Copy link

What is current plan here?
I am implementing new messages to communicate GCS with companion computer over autopilot routing. Now I need to rebuild autopilot since it ignores the messages that it does not know. And autopilot has nothing to do with that messages, just routing.

@hamishwillee
Copy link
Collaborator

This is desirable but not a priority. Feel free to create a PR with a solution that suits you and we will review.

You might reverse your connection so that the companion routes the Autopilot traffic. You'd still need to keep things updated, but presumably easier to do that in your companion than the autopilot.

@stale
Copy link

stale bot commented Sep 20, 2020

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale bot added the stale label Sep 20, 2020
@stale
Copy link

stale bot commented Nov 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale bot added the stale label Nov 22, 2020
@hamishwillee
Copy link
Collaborator

FYI only MAVLink forwarding is done for the ESP8266 in this way: https://github.com/dogmaphobic/mavesp8266/blob/master/src/mavesp8266_gcs.cpp#L131-L137

@stale stale bot removed the stale label Feb 4, 2021
@stale
Copy link

stale bot commented Jun 20, 2021

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale bot added the stale label Jun 20, 2021
asvol added a commit to asv-soft/asv-mavlink that referenced this issue Apr 22, 2023
Devices in network could not understand unknown messages (cause mavlink/mavlink#1166) and will not forwarding it.
This flag indicates that the message is not standard and we need to wrap this message into a V2_EXTENSION message for the routers to successfully transmit over the network.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants