Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MISSION_ITEM_REACHED not sent in SITL #8245

Closed
lamping7 opened this issue Nov 6, 2017 · 16 comments
Closed

MISSION_ITEM_REACHED not sent in SITL #8245

lamping7 opened this issue Nov 6, 2017 · 16 comments
Labels

Comments

@lamping7
Copy link
Member

lamping7 commented Nov 6, 2017

Bug Report

MISSION_ITEM_REACHED is not being sent for every WP reached. Verified using MAVLink Inspector in QGC, then set mavlink verbose on.

Console output:

INFO  [mavlink] WPM: Send MISSION_COUNT 2 to ID 0, mission type=0
WARN  [navigator] offboard mission updated: dataman_id=0, count=2, current_seq=0
INFO  [mavlink] WPM: got mission result, new current_seq: 0
INFO  [mavlink] WPM: got mission result, new current_seq: 0
INFO  [tone_alarm] neutral
INFO  [mavlink] WPM: got mission result, new current_seq: 1
INFO  [mavlink] WPM: got mission result, new current_seq: 1
INFO  [mavlink] WPM: got mission result, new current_seq: 1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
INFO  [mavlink] WPM: got mission result, new current_seq: 1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
[ INFO] [1509999513.762043763, 231.870000000]: WP: reached #1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
[ INFO] [1509999513.789132445, 231.898000000]: WP: reached #1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
[ INFO] [1509999513.889499192, 231.998000000]: WP: reached #1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1
[ INFO] [1509999513.989851483, 232.099000000]: WP: reached #1
INFO  [mavlink] WPM: Send MISSION_ITEM_REACHED reached_seq 1

I loaded a mission with 2 waypoints. Expected to receive a reached message for seq 0 and didn't. Instead, got multiple reached messages for seq 1 once it was reached.

Related to something around here? https://github.com/PX4/Firmware/blob/3c18be387cc47c9875137745b171e7ccec901f2e/src/modules/mavlink/mavlink_mission.cpp#L502

Ubuntu 16.04, ROS Kinetic, PX4 aa699cf (today's master)

@lamping7 lamping7 changed the title MISSION_ITEM_REACHED not sent in STIL MISSION_ITEM_REACHED not sent in SITL Nov 6, 2017
@dagar dagar added the bug label Nov 6, 2017
@dagar
Copy link
Member

dagar commented Nov 6, 2017

I can look at this, but when is MISSION_ITEM_REACHED (http://mavlink.org/messages/common#MISSION_ITEM_REACHED) actually needed? Isn't knowing the current waypoint you're on (MISISON_CURRENT) usually enough?

@lamping7
Copy link
Member Author

lamping7 commented Nov 6, 2017

I suppose you can use /mavros/mission/waypoints and check the sequence number, but in the case that you one have one waypoint at a time, this strategy can break. Mavros recently added /mavros/mission/reached that I'd like to use.

See mavlink/mavros#830 as well

@dagar
Copy link
Member

dagar commented Nov 6, 2017

What were the 2 mission items above?

@lamping7
Copy link
Member Author

lamping7 commented Nov 6, 2017

CMD 16, NAV_WAPOINT. What I did: Take off, send mission, and put FCU in AUTO.MISSION.

@lamping7
Copy link
Member Author

lamping7 commented Nov 16, 2017

@dagar From a quick look, I can tell that navigator is calling the function to set the item as reached here https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/mission.cpp#L209

which is defined here
https://github.com/PX4/Firmware/blob/4416c4ddb3ef84a958b023f50b066c0c3f1d87d7/src/modules/navigator/mission.cpp#L1407

But, when the mavlink module checks the mission item at the line I referenced above, it is false. Perhaps the reached flag is getting reset somewhere?

@dagar
Copy link
Member

dagar commented Nov 17, 2017

You're definitely on the right track. I'm wondering if it's the takeoff logic (mission started from landed?).

https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/mission.cpp#L209

@lamping7
Copy link
Member Author

lamping7 commented Nov 17, 2017

@dagar Nope. It isn't related to takeoff.

The problem is that it's being reset to false by Mission::set_current_offboard_mission_item() before mission_result is published here
https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/navigator_main.cpp#L722
or that MavlinkMissionManager::send(const hrt_abstime now) isn't called in time here
https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/mavlink/mavlink_receiver.cpp#L2498
to relay the mission_result after getting updated by this call
https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/mission.cpp#L209

If I comment out this line (in Mission::set_current_offboard_mission_item())
https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/mission.cpp#L1418
I get the MISSION_ITEM_REACHED message as expected, but I'm not aware of all of the side effects using that as a solution to this issue even though the reached flag is already being reset when the message is published here
https://github.com/PX4/Firmware/blob/e9e663432b397e939db0a0e20a7f9fbf242dbf06/src/modules/navigator/navigator_main.cpp#L1133

Also, be aware that I've only been looking at this with SITL. HITL testing is probably necessary as there are time driven events happening also.

So, if you open up QCG connected to a SITL environment (multicopter), put a few WPs out there and drag the slider to start the mission from the ground, you will get a message for each WP reached following the takeoff with the line mentioned earlier commented out. As it is now, you only get one message, once the mission is finished (seen by mission_result.finished being set also), because there are no more mission items causing a call to set_current_offboard_mission_item() that is resetting the flag in the race condition explained above. In an ideal world, you'd want to make sure each unique message is published, rather than fields getting modified in multiple places and recipients missing certain info, but I'm not sure how this is possible here. It would be easiest to force a publish in Mission::set_mission_item_reached(), but publish_mission_result is a private function. Thoughts? With some direction on how this should be fixed, I can create the PR if you'd like.

Lastly, you'll see that multiple duplicate reached messages are seen due to the condition here
https://github.com/PX4/Firmware/blob/3c18be387cc47c9875137745b171e7ccec901f2e/src/modules/mavlink/mavlink_mission.cpp#L520
but I assume that is due to time issues with the simulation.

@dagar
Copy link
Member

dagar commented Nov 17, 2017

Yes, the handling with mission_result is a bit of a mess and needs to be sorted out. It's been the source of several other issues.

Continuously publishing the reached bool and sequence number doesn't really make sense to me. When should we switch over to "no longer reached" for the next waypoint? It seems like navigator should be broadcasting once for each waypoint reached (and the seq number), but otherwise nothing.

For mavlink it's necessary to continue streaming the messages periodically. These messages aren't acknowledged by the recipient.

Great work looking through the code in detail.

@lamping7
Copy link
Member Author

I'd expect to get a message with seq_reached == seq_current and reached = 1, then reset reached and advance seq_current. The problem with this is that the software is not really sequential and is asynchronous to some degree, relying on the navigator module to publish info to the mavlink module who periodically sends mavlink messages out. Without refactoring the whole thing you can force navigator to publish the info (no longer make the publish function mentioned above private) so mavlink can see the update or can the line mentioned (commented out) be removed? I can't think of a situation where is it required, considering the publish function resets mission_result.reached. How can this be validated?

I guess you could find a different way/place to call send_mission_item_reached((uint16_t)mission_result.seq_reached); that is deterministic, guaranteeing that the MISSION_ITEM_REACHED mavlink message is sent when an item is reached. This option would take more thought and digging into.

We could also make a comparison with the seq_reached and seq_current to determine when to send the mavlink message. The only issue I see with seq_reached alone is that it starts at 0, which is a valid mission item and isn't necessarily reached, which is why we rely on the reached flag to trigger send_mission_item_reached((uint16_t)mission_result.seq_reached);. This is kind-of the same reason why you can't use the /mavros/mission/waypoints topic and why I'd like this to work.

@dagar
Copy link
Member

dagar commented Nov 17, 2017

Comparing seq_reached and seq_current in mavlink is another (potentially easy) option I was thinking about, but when should mavlink stop broadcasting the last waypoint reached?

How about...

  • anytime we're in mission mode and seq_reached is valid (between 0 and the max seq) mavlink would publish
  • switching out of mission or sending a new mission would reset seq_reached (-1)
  • mavlink broadcasts seq_reached while executing a mission
    • changes are broadcast immediately, but otherwise can fallback to 1Hz or something

Does that cover everything?

@lamping7
Copy link
Member Author

I'm assuming that by mavlink broadcasting you're referring to the MISSION_ITEM_REACHED message. Correct me if I'm wrong.

In reality this isn't continuously sent (at least not in SITL -- because _last_reached >= 0 is never satisfied (starts at -1) due to mission_result.reached always being overwritten by Mission::set_current_offboard_mission_item())

I'm not sure if this still works as intended here:
#5486
or if threading fixes affected this:
#7226

I'm with you in that constantly sending this MISSION_ITEM_REACHED message seems to be unnecessary, but may be the best solution and operationally can work as you describe. I'd like to think about this more, but a move of send_mission_item_reached((uint16_t)mission_result.seq_reached); to constantly publish based on the conditions you wrote above would be the fix?

I also think the MISSION_ITEM_REACHED message definition is flawed as there is no timestamp or other field to compare.

From a quick look (just now) at ArduPilot, they just send the mavlink message once, upon reaching the item (essentially make navigator publish the mavlink message directly I think). But I did see some open issues relating to the issue.

Again, can't we just remove the line in Mission::set_current_offboard_mission_item() to avoid the race condition?

@dagar
Copy link
Member

dagar commented Nov 17, 2017

It's not even a race condition (navigator is a single thread), just dumb code.
I think we can effectively fix this pretty easily.

With mavlink 2 you could add new fields, but this quickly gets into a larger discussion for mission protocol improvements. One thing I've wanted is a mission hash to ensure the GCS and vehicle are actually in sync.

@dagar
Copy link
Member

dagar commented Nov 17, 2017

Could you open a PR for the mission reached change? Otherwise I can do it after the v1.7.0 release is out (hopefully next week).

@lamping7
Copy link
Member Author

@dagar According to the Waypoint protocol: http://qgroundcontrol.org/mavlink/waypoint_protocol#waypoint_reached_status_message_from_mav

This message shouldn't be sent continuously. It should only be sent when the WP is reached. I recommend making it adhere to the protocol as defined, just like ArduPilot has.

So, after the MISSION_ITEM_REACHED message is sent (via send_mission_item_reached((uint16_t)mission_result.seq_reached);, the mission_result.reached value should be reset.

I'm not sure of the motivation in #5486 that made this change. The current mission is sent continuously with the current seq #. It makes not sense to continuously publish the last seq # reached when you have the current seq #. @LorenzMeier @julianoes thoughts?

I called it a race condition because there are multiple places modifying the value of reached, changing the desired output before the result is read.

@lamping7
Copy link
Member Author

I've submitted a PR adhering to my comments above. See #8332

@lamping7
Copy link
Member Author

Per the merged PR, we're now sending the mavlink message when the item is reached, then another 10 messages, at the rate of broadcast for other mavlink mission items (currently 10Hz). The duplicate messages are sent because there is no ack and we wanted to reduce the chances of a missed message due to poor telemetry links. Closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants