Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add descriptive Plane failsafe strings #20129

Closed

Conversation

peterbarker
Copy link
Contributor

This is not without significant flash costs - 1200-odd bytes.

There's also the fact it puts us into future-debt as this list is unlikely to get shorter.

@timtuxworth
Copy link
Contributor

This is an awesome idea, the failsafe messages are incredibly unhelpful. Not to mention the challenge of finding the "reason code" in the documentation, this is something that needs urgent attention when it happens. Who has time to go do a google search for "Arduplane failsafe reason = 3" when you are flying a plane and need to know how to react right now!

Also - device technology is advancing in leaps and bounds. Just imagine what your desktop PC experience would be if Windows developers were still trying to get Windows 11 to run in 640k because there are still some IBM XTs around? The capabilities of the software should not be limited by legacy hardware.

Copy link
Contributor

@timtuxworth timtuxworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible existing clients depend on the numeric codes? e.g. a mavproxy consumer or something like mavexplorer might be expecting the numeric codes. Suggest adding the text version of the codes at the end of message.

Copy link
Contributor

@timtuxworth timtuxworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't "Throttle Failsafe" (in radio.cpp) and "Radio Failsafe" (in ModeReason.cpp) the same thing? I find this confusing. the documentation seems to use them interchangeably so I'm not 100% sure. If they are the same thing then both references should say the same thing (perhaps "Throttle/Radio Failsafe"?). If they are different, how can you turn "Throttle Failsafe" on/off, but have no reason code for Throttle failsafe?

Copy link
Contributor

@timtuxworth timtuxworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "Fixedwing" a word? Perhaps Fixed wing? (ModeReason.cpp line 64)

Copy link
Contributor

@timtuxworth timtuxworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same change also needs to be made to failsafe_long_on_event() which raises the question about what the "type" parameter is for. It seems that currently failsafe_short_on_event() is only ever called with the parameter FAILSAFE_SHORT, Based on the code you see here you could get a message that reads;

Failsafe. Short Event on: type=Long/Reason=x

Perhaps they should both use the same format string? e.g.:
"%s event on type = %u %s/reason = %u/%s"

@Hwurzburg
Copy link
Collaborator

Hwurzburg commented Feb 20, 2022

Aren't "Throttle Failsafe" (in radio.cpp) and "Radio Failsafe" (in ModeReason.cpp) the same thing? I find this confusing. the documentation seems to use them interchangeably so I'm not 100% sure. If they are the same thing then both references should say the same thing (perhaps "Throttle/Radio Failsafe"?). If they are different, how can you turn "Throttle Failsafe" on/off, but have no reason code for Throttle failsafe?

they are not the same thing.....radio failsafe can occur with the loss of pulese, with the failsafe bit in the SBUS or CRSF streams becoming active, if pulses are out of bounds, etc... throttle failsafe is the detection of throttle PWM below a certain value

@Hwurzburg
Copy link
Collaborator

personally, I think this is a terrible waste of code space....99.9% of failsafes will be radio failsafes unless intentionally triggered by throttle (I use this for RTL/QRTL mode force to avoid using a mode switch position)....if its a GCS failsafe, that will usually be obvious also...unexpected failsafes can be analyzed from the log messages and the wiki.....

@IamPete1
Copy link
Member

We don't have to provide a full list, some could still return the number.

@timtuxworth
Copy link
Contributor

We don't have to provide a full list, some could still return the number.

That's an awesome idea @IamPete1 we could have two different tables based on an #ifdef. For low flash devices most of the text strings could be "" except for a few important ones which could be heavily abbreviated. For higher (2 meg and above?) flash devices we could include more descriptive reasons. What do you think @peterbarker ?

Copy link
Contributor

@timtuxworth timtuxworth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ap_mode_reason_string() should return "" rather than nullptr in the not used/#else case.

Also, as written I think the reason (not even the number) will not be displayed at all for small boards, I don't think this is good.

Or how about this:
gcs().send_text(MAV_SEVERITY_WARNING, "Failsafe. Short Event on: type=%u/reason=%u %s", fstype, static_cast(reason), ap_mode_reason_string(reason));

And change the #else case for ap_mode_reason_string() to return "";.

Also needs this change in failsafe_long_event _on()

#include <AP_HAL/AP_HAL_Boards.h>

#ifndef AP_MODE_REASON_STRINGS
#define AP_MODE_REASON_STRINGS (BOARD_FLASH_SIZE > 1024)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be >= 1024? I'm thinking it would be ok on my PixHawk1-1M for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really not :-) Check how much space you don't have on that thing...

It's the sort of thing that you could drop into your own hwdef, 'though - which is why it's gated in AP_MODE_REASON_STRINGS

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really not :-) Check how much space you don't have on that thing...

It's the sort of thing that you could drop into your own hwdef, 'though - which is why it's gated in AP_MODE_REASON_STRINGS

I have > 13k if I build your PR. ;-)

Copy link
Contributor

@timtuxworth timtuxworth Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUILD SUMMARY
Build directory: /Users/tim/ArduPilot/ardupilot/build/Pixhawk1-1M
Target         Text (B)  Data (B)  BSS (B)  Total Flash Used (B)  Free Flash (B)
--------------------------------------------------------------------------------
bin/arduplane   1017868       856   195948               1018724           13452

@Autonomost
Copy link
Contributor

Why not do the work on the GCS side? Translate these error codes into meaningful data to someone monitoring the system?

@timtuxworth
Copy link
Contributor

timtuxworth commented Feb 23, 2022

Why not do the work on the GCS side? Translate these error codes into meaningful data to someone monitoring the system?

That's an interesting idea, but isn't it a significant architectural change which would fundamentally change what GCS software like Mission Planner, QGroundControl, and the mavproxy tools expect? It would have a benefit of allowing for localization (different languages) too. I'm thinking this is not how the system, with all messages, not just these ones, is currently designed fundamentally.

@@ -280,7 +280,7 @@ void Plane::control_failsafe()
// throttle has dropped below the mark
failsafe.throttle_counter++;
if (failsafe.throttle_counter == 10) {
gcs().send_text(MAV_SEVERITY_WARNING, "Throttle failsafe on");
gcs().send_text(MAV_SEVERITY_WARNING, "Throttle failsafe %s", "on");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do this? just wastes flash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it doesn't. It saves a whole 12 bytes. I did measure - but here, I'll do it again :-)

before: bin/arduplane    963308      2024   129140                965332           17696
 after: bin/arduplane    963296      2024   129136                965320           17712

common substring combination thingy...

Now I'm still kind of ambivalent about the patch, it certainly doesn't make the code easier to read.

Copy link
Contributor

@timtuxworth timtuxworth Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that with the changes to failsafe_short_on_event() and failsafe_long_on_event(), the messages seem redundant. You get:

AP: Throttle failsafe on
AP: Failsafe. Short event ON
AP: Failsafe reason=1 (Throttle below limit)

Could we not just remove the messages from radio.cpp?

ArduPlane/radio.cpp Show resolved Hide resolved
break;
}

gcs().send_text(MAV_SEVERITY_WARNING, "Failsafe. Short event on: type=%s/reason=%s", type_string, ap_mode_reason_string(reason));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

far too wide for both mission planner and OSD

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@timtuxworth timtuxworth Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

far too wide for both mission planner and OSD

How about splitting into two messages then? I've made a proposal which displays this:

CIRCLE> AP: Throttle failsafe on
AP: Failsafe. Short event ON
AP: Failsafe reason=1 (Radio Failsafe)
AP: Flight mode = CIRCLE
Mode CIRCLE
RTL> Mode RTL
AP: Failsafe. Long event ON
AP: Failsafe reason=2 (Radio Failsafe)
AP: Flight mode = RTL

The pronunciation is also much nicer in QGC when the text reads like this.

the code is here: https://github.com/timtuxworth/ardupilot/tree/pr/20129

@tridge tridge removed the DevCallEU label Feb 23, 2022
@peterbarker
Copy link
Contributor Author

Why not do the work on the GCS side? Translate these error codes into meaningful data to someone monitoring the system?

Yes, the events API which has been proposed in MAVLink would allow for this sort of thing.

However, it's a whole chunk of bytes to support the events API. I've got a branch....

Further, the events API wouldn't help when displaying on an old-skool OSD which only understands statustexts, not a fancy new API.

@Hwurzburg
Copy link
Collaborator

Hwurzburg commented Feb 23, 2022

this really opens several other issues:

  1. Short failsafe can occur ONLY for radio failsafe, either real (RC issues..fs bit, invalid pulses for x time) or forced with throttle failsafe (which has its own message)...so it really needs no expansion or reasons and could be actually shortened to just " Short Failsafe On/Off".
  2. AIUI, most of ModeReasons are for Mode changes as part of normal operations and not failsafes....and almost all are used only for internal communication, not user communication.....and many are not Plane applicable....
  3. Long Failsafes are either radio or gcs....battery failsafe is handled in the battery monitor code....it could be simply either of those (again forced radio fs via throttle has it own message to differentiate loss/bad rc vs throttle fs)

Sorry, but I think this whole approach is wrong

@timtuxworth
Copy link
Contributor

this really opens several other issues:

  1. Short failsafe can occur ONLY for radio failsafe, either real (RC issues..fs bit, invalid pulses for x time) or forced with throttle failsafe (which has its own message)...so it really needs no expansion or reasons and could be actually shortened to just " Short Failsafe On/Off".
  2. AIUI, most of ModeReasons are for Mode changes as part of normal operations and not failsafes....and almost all are used only for internal communication, not user communication.....and many are not Plane applicable....
  3. Long Failsafes are either radio or gcs....battery failsafe is handled in the battery monitor code....it could be simply either of those (again forced radio fs via throttle has it own message to differentiate loss/bad rc vs throttle fs)

Sorry, but I think this whole approach is wrong

This literally happened to me yesterday out at the field with a new build, PixHawk1-1M running 4.1.7 with my MacBook Air running Mission Planner and QGC. My plane wouldn't arm and kept throwing these cryptic messages. If I wasn't working with @peterbarker on this, I would have been completely bamboozled. As it was I was only a little bamboozled.

What I needed, out at the field with no internet, was clear concise information about what the problem was and what I should do about it. Even knowing that reason=3 means Throttle/Radio triggered short and then long event, I ended up having to call it and bring the plane back home to fix it. Turns out my Failsafe throttle limit was set to 1000 while the radio was sending 988 on throttle down. Now that would have been good to know at the field.

This brings me to the battery failsafe. I had just noticed that there is no similar message in the battery failsafe code - doesn't it need something too? I certainly would want to know what's going on if the plane unexpectedly starts doing "something" due to battery failsafe.

If the approach is wrong @Hwurzburg, is the question "what are we trying to achieve?". My answer would be as a pilot, I want actionable information about the current situation in my GCS or OSD, not something I can look up after the plane has crashed. Is there a better way to get this?

@Hwurzburg
Copy link
Collaborator

Hwurzburg commented Feb 23, 2022

first reading the wiki before flying is always a good thing....the First Time Setup section and First Flight section are must reads
throttle failsafe would not have been unexpected if https://ardupilot.org/plane/docs/apms-failsafe-function.html was read and understood
there is no possible way to provide corrective information via GCS messages....in the case of failsafes, there is nothing that we can provide to explain the exact cause AND remedy at the field, much less in flight....only the cause
an understanding and implementation of the basic setup as outlined in the wiki is the only way issues can be minimized at the field...the code can give messages that hint at where to investigate further

@Hwurzburg
Copy link
Collaborator

Battery failsafe has its own message and will display the batt stats and action level when triggered

@Hwurzburg
Copy link
Collaborator

Hwurzburg commented Feb 23, 2022

btw, we have many, many pre-arm checks that try to prevent a user from attempting flight with incorrect setup....most require study of the wiki to actually determine cause when they occur....and even then it can be elusive..BAD AHRS being an infamous one..or ROLL/PITCH inconsistent...etc.

..on a new setup, I always try to go thru initialization and arming at HOME before travelling to the field....I can do that inside on the bench since I am lucky enough to get good GPS reception, but at worst, it would mean taking the vehicle outside on the driveway and going thru arming...

@timtuxworth
Copy link
Contributor

btw, we have many, many pre-arm checks that try to prevent a user from attempting flight with incorrect setup....most require study of the wiki to actually determine cause when they occur....and even then it can be elusive..BAD AHRS being an infamous one..or ROLL/PITCH inconsistent...etc.

..on a new setup, I always try to go thru initialization and arming at HOME before travelling to the field....I can do that inside on the bench since I am lucky enough to get good GPS reception, but at worst, it would mean taking the vehicle outside on the driveway and going thru arming...

That's what happened to me yesterday @Hwurzburg - I live in an apartment surrounded by hi-rises and no driveway. No GPS fix until I drive out of my underground parking garage and head to the field. So first thing I did at the field was compass calibration and then enabled safety checks (I had to turn them off inside because of no GPS).

@timtuxworth
Copy link
Contributor

first reading the wiki before flying is always a good thing....the First Time Setup section and First Flight section are must reads
throttle failsafe would not have been unexpected if https://ardupilot.org/plane/docs/apms-failsafe-function.html was read and understood
there is no possible way to provide corrective information via GCS messages....in the case of failsafes, there is nothing that we can provide to explain the exact cause AND remedy at the field, much less in flight....only the cause
an understanding and implementation of the basic setup as outlined in the wiki is the only way issues can be minimized at the field...the code can give messages that hint at where to investigate further

Thanks @Hwurzburg I've proposed a couple of changes to the wiki page to reference the GCS messages, so it's clearer that the messages you see are related to the Failsafe event triggering. My PR for this is here ArduPilot/ardupilot_wiki#4150 . I just used the web to make the changes, I still haven't figured out how to edit reStructuredText on my Mac.

@timtuxworth
Copy link
Contributor

timtuxworth commented Feb 23, 2022

Aren't "Throttle Failsafe" (in radio.cpp) and "Radio Failsafe" (in ModeReason.cpp) the same thing? I find this confusing. the documentation seems to use them interchangeably so I'm not 100% sure. If they are the same thing then both references should say the same thing (perhaps "Throttle/Radio Failsafe"?). If they are different, how can you turn "Throttle Failsafe" on/off, but have no reason code for Throttle failsafe?

they are not the same thing.....radio failsafe can occur with the loss of pulese, with the failsafe bit in the SBUS or CRSF streams becoming active, if pulses are out of bounds, etc... throttle failsafe is the detection of throttle PWM below a certain value

So if they are not the same thing, should there not be a different "ModeReason" for Throttle Failsafe vs Radio Failsafe?

there could be, I suppose, but what would it be used for? as I said, its clear which is which already from the messages....

a possible change is as I suggested....change the short failsafe message to only Short FS on or off, Throttle FS is already announced as cause vs RC issues
and change long failsafe message to be either Long FS Radio or Long FS GSC...the only possibilities in Plane

here is a PR #20156 that does that....only costs 40 bytes...if Peter's string thing was added for the Throttle Failsafe message, which saves 12 bytes, then the cost would only be 28 bytes....I think its clearer without too much flash increase for small boards...

@timtuxworth
Copy link
Contributor

I split out THROTTLE_FAILSAFE from RADIO_FAILSAFE as a new reason code. As @Hwurzburg explained, they are different things. I've made changes to ModeReason.h, ModeReason.cpp and system.cpp to implement this. It means that the "reason" for either radio or throttle has to change. I chose throttle, but it could just as easily be radio. Either way there is a new "reason=x", but with text messages it's easier to deal with (IMHO). The code is in https://github.com/timtuxworth/ardupilot/blob/pr/20129

@peterbarker
Copy link
Contributor Author

We merged the alternative.

@peterbarker peterbarker closed this Mar 1, 2022
@peterbarker peterbarker deleted the pr/plane-failsafe-strings branch March 1, 2022 00:06
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants