-
Notifications
You must be signed in to change notification settings - Fork 17.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AP_BattMonitor: Cell voltages and Temperature #6020
AP_BattMonitor: Cell voltages and Temperature #6020
Conversation
libraries/GCS_MAVLink/GCS_Common.cpp
Outdated
@@ -233,6 +233,30 @@ void GCS_MAVLINK::send_power_status(void) | |||
hal.analogin->power_status_flags()); | |||
} | |||
|
|||
void GCS_MAVLINK::send_battery_status(const AP_BattMonitor &battery, const uint8_t instance) const | |||
{ | |||
GCS_MAVLINK::send_statustext_all(MAV_SEVERITY_INFO, "1: %d 2: %d 3: %d 4: %d 5: %d 6: %d", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the debug for PR ...
f0c9ef5
to
f594c87
Compare
@@ -480,6 +480,8 @@ struct PACKED log_Current { | |||
float battery_voltage; | |||
float current_amps; | |||
float current_total; | |||
int16_t temperature; // degrees C * 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about just recording the temperature in C in a signed byte? It's not the size so much as the slightly awkward C*100 scaling. I'll leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay the comment might actually be more misleading... We're setting the FMT_LABEL to have the unit * 100 already, so this will result in your graping program automatically dividing this by 100 again for you. The reason I selected this resolution is that our only current source of temperature information is only good to 0.1 C, so this will convey that level of accuracy, and I didn't want to use a 4 byte storage like a float. Because you will automatically see this divided again I think this is the correct approach.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right. makes sense, thanks!
I'm not a huge fan of the scaling of the temperature in the log message but it can be argued either way so LGTM. |
libraries/GCS_MAVLink/GCS_Common.cpp
Outdated
mavlink_msg_battery_status_send(chan, | ||
instance, // id | ||
MAV_BATTERY_FUNCTION_UNKNOWN, // function | ||
MAV_BATTERY_TYPE_UNKNOWN, // type | ||
0x7fff, // temperature. INT16_MAX if unknown | ||
got_temperature? (temp * 100) : 0x7fff, // temperature. INT16_MAX if unknown |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miss a cast here as temp is float and the msg variable is int16_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a couple of comments. It's mainly about the mixing of MAVLink, battery monitor and dataflash cell field size - each have their own number. You might want to define a macro in battery monitor header so it can be used in dataflash, that way it will always log the correct number of cells - you'd still need to convert that into the MAVLink field size (unless you use the MAVLink macro everywhere, it's fine to me).
Also temperature is saved as float in C, but always used as int16_t in C * 100. Any reason for not saving it as the latter?
@@ -116,6 +119,10 @@ class AP_BattMonitor | |||
bool overpower_detected() const; | |||
bool overpower_detected(uint8_t instance) const; | |||
|
|||
// cell voltages | |||
const cell_voltages_t & get_cell_voltages() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unimplemented
@@ -333,3 +336,13 @@ bool AP_BattMonitor::overpower_detected(uint8_t instance) const | |||
return false; | |||
#endif | |||
} | |||
|
|||
// return the current cell voltages, returns the first cell if the instance is out of range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first cell -> first monitor cells
// returns true if there is a temperature reading | ||
bool AP_BattMonitor::get_temperature(float &temperature, const uint8_t instance) const | ||
{ | ||
temperature = state[instance].temperature; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like the cell voltages, this should be protected against a bad instance number.
|
||
// read temperature | ||
if (read_word(BATTMONITOR_SMBUS_MAXELL_TEMP, data)) { | ||
_state.has_temperature = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it change the has_temperature status if it can't read the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. It's possible this is just a bus error, but we should probably flag it anyways, otherwise we will have a bad stale temperature without knowing it....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this, but it appears to be a bad decision, I've attatched an image from a Solo battery, and it appears this part of the communication is really unreliable. The better approach I think will involve a timer that sets it to false if we haven't received a new temperature update for a period of time (5 seconds comes to mind as the number already used elsewhere in the drivers). Temp isn't a high priority information, but it would be nice to not be oscillating like this.
libraries/DataFlash/LogFile.cpp
Outdated
struct log_Current pkt = { | ||
LOG_PACKET_HEADER_INIT(LOG_CURRENT_MSG), | ||
time_us : AP_HAL::micros64(), | ||
battery_voltage : battery.voltage(0), | ||
current_amps : battery.current_amps(0), | ||
current_total : battery.current_total_mah(0), | ||
temperature : (int16_t)((has_temp ? temp : 0) * 100), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confusing, to me, at least. I would prefer (int16_t) (has_temp ? temp * 100 : 0)
libraries/DataFlash/LogFile.cpp
Outdated
}; | ||
for (uint8_t i = 0; i < MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN; i++) { | ||
pkt.cell_voltages[i] = battery.get_cell_voltages(0)[i] + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please get the cell voltages before the cycle instead of calling it 10 times. Also, why use the MAVLink field size here? Not only it doesn't need to match, but cell_voltages_t isn't defined with the MAVLink size. It should check it has a maximum of sizeof(pkt.cell_voltages)
.
What's the +1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the 10 times thing was a bit of a hack to see if it made graphing nice, will swap it over.
The +1 was to make log review saner... Basically we internally stored voltages with the MAVLink representation of invalid values for a cell (UINT16_MAX) which makes sending the mavlink message handy, and avoids the fact that a dead cell in the middle of a pack actually would be reported as 0. However for the DF log it has the downside of really messing up scaling when you try and graph the cell voltages. Nudging the value by 1 mV means that a unknown cell shows up as 0 when graphed which doesn't destroy the ranges nearly as badly, but also that an actually dead cell would be a distinct value from an unknown cell.
As far as the MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES, that is what we are trying to hang the voltages array off of for length, but when I include the GCS_MAVLink/GCS.h header in AP_BattMonitor, I get dependency problems. I think I can peak directly at the msg_battery_status header though. I'll give that a shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that including the message header directly isn't a problem. We want to get a field from it, not exactly from GCS.h
libraries/DataFlash/LogFile.cpp
Outdated
WriteBlock(&pkt, sizeof(pkt)); | ||
} | ||
|
||
if (battery.num_instances() >= 2) { | ||
float temp; | ||
bool has_temp = battery.get_temperature(temp, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 -> 1
// clear out the cell voltages | ||
memset(&state[instance].cells, 0xFF, sizeof(cell_voltages_t)); | ||
// clear out the temperature | ||
state[instance].has_temperature = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size reduction squad: shouldn't be needed, it is false on construction.
libraries/GCS_MAVLink/GCS_Common.cpp
Outdated
MAV_BATTERY_FUNCTION_UNKNOWN, // function | ||
MAV_BATTERY_TYPE_UNKNOWN, // type | ||
got_temperature? (temp * 100) : 0x7fff, // temperature. INT16_MAX if unknown | ||
battery.get_cell_voltages(instance), // cell voltages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_cell_voltages returns cell_voltages_t. I'm not a fan of that typedef, I can live with it if it has some use, which apparently isn't the case as this is relying on it being an array of 10 uint16_t. We should probably have some protection here to make sure the returned array has a size equal or bigger than 10.
10d74cd
to
863b7a7
Compare
Alright, I spent about 4 hours on rebasing problems/type problems so please carefully review this. And @jmachuca77 I hate to ask but can you retest against a solo battery again? (or anyone else can, unless your GCS understands the BATTERY_STATUS message you will need to look at the DF log and at the CURR.Temp and CURR.V1 through CURR.V4 fields). @OXINARF I unfortunately can't include GCS_MAVLink/include/mavlink/v2.0/common/mavlink_msg_battery_status.h inside of AP_BattMonitor.h without missing a ton of definitions/linker assistance, so I ended up leaving the cells array as a hard define of 10, I'm not enthused about it, but I don't see a better way at the moment. The typedef was an attempt to allow the compiler to ensure that I never violate access to outside the array length, however that didn't really allow storing the function result (or at least I couldn't get the compiler to be happy, and I regressed to the point of just trying every combination of */& to get a local handle on it :) @tridge suggested passing a struct around instead, which while looking weird has the merits of just being a nice direct access and compiles well. This was a painful lesson in doing PR's based upon other PR's... Either that or I need to strongly improve my git foo so I spend far less time editing/fixing rebase/merge conflicts/getting things wrong. |
Since my build environment is still not working, can someone send/link me the px4 file for this and I will test it tonight on my Solo? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any issue with the structure, it looks cleaner that always writing 10 everywhere. But, like I wrote in the comments to code, we should look for the size of it.
If it isn't urgent to get this in, I'll see if I can understand why we can't include the MAVLink header.
libraries/GCS_MAVLink/GCS_Common.cpp
Outdated
mavlink_msg_battery_status_send(chan, | ||
instance, // id | ||
MAV_BATTERY_FUNCTION_UNKNOWN, // function | ||
MAV_BATTERY_TYPE_UNKNOWN, // type | ||
INT16_MAX, // temperature. INT16_MAX if unknown | ||
voltages, | ||
battery.get_cell_voltages(instance).cells, // cell voltages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should check that sizeof(cells) >= 10
, otherwise this is going to be a problem if someone changes the number in AP_BatteryMonitor library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed with static_assert above this.
libraries/DataFlash/LogFile.cpp
Outdated
}; | ||
AP_BattMonitor::cells cells = battery.get_cell_voltages(0); | ||
for (uint8_t i = 0; i < MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we aren't using the MAVLink field size in the library can we change this to MIN(ARRAY_SIZE(cells.cells), ARRAY_SIZE(pkt.cell_voltages))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks reasonable to me, but then this happens:
../../libraries/DataFlash/LogFile.cpp:1813:77: error: cannot bind packed field ‘pkt.log_Current::cell_voltages’ to ‘short unsigned int (&)[10]’
for (uint8_t i = 0; i < MIN(ARRAY_SIZE(cells.cells), ARRAY_SIZE(pkt.cell_voltages)); i++) {
~~~~^
../../libraries/AP_Common/AP_Common.h:69:52: note: in definition of macro ‘ARRAY_SIZE’
#define ARRAY_SIZE(_arr) sizeof(_ARRAY_SIZE_HELPER(_arr))
^~~~
Lazy version would be to simply make the DF log length always match the AP_BattMonitor::cells.cells length?
863b7a7
to
14fcd2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to do something better for the sizes tomorrow, if I can't, I think we should put it in as is.
libraries/GCS_MAVLink/GCS_Common.cpp
Outdated
uint16_t voltages[MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN]; | ||
memset(&voltages, 0xff, sizeof(uint16_t)*MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN); // fill with UINT16_MAX for unknown cells | ||
// catch the battery backend not supporting the required number of cells | ||
static_assert(sizeof(AP_BattMonitor::cells) >= (sizeof(uint16_t) * MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to include a message as the last parameter to static_assert.
Argh, static_assert appears to be compiler dependent as it was happy on my machine, will try adding the two arg version. |
Tested tonight. Cell voltage looks GREAT! Battery temperature looks like it is accurately reporting what the battery told it. But it looks like a sawtooth jumping from 0 to 29 to 0 to 29 repeatedly. Like the battery is not consistently reporting the temperature. Perhaps this should skip a temperature reading that is 0 degrees. |
@WickedShell In C++11 static_assert has two arguments, starting with C++14 there is the one with only one argument - at least according to Internet documentation, I didn't check the standard 😀 |
Yeah I think I swapped em :P Will get that pushed up in a bit. I'm converting the has_temp to be false only if we fail to get a temperature for >5 seconds. I actually suspect that if you were to throw a probe on it that the solo battery doesn't acknowledge a lot of requests (or has has flaky I2C connection). |
@WickedShell, shoot me the px4 file when you're ready and I'll test it again. |
Existing Solo battery code (without any of these temp/cell voltages stuff) only has a 27% chance of succeeding with any given query. (This is just on pack voltage, current, button status pack capacity). |
7b77228
to
b5dbff7
Compare
Alright, pending a review and some testing I think this is a reasonable compromise. Temperatures will be considered valid if we received one within the last 5 seconds. Because Solo is so lossy I removed the request for pack voltage and instead set the pack voltage based on adding up the voltages from all the cells. Based on logs from @Pedals2Paddles this looks to be entirely reasonable, and allows us to cut down the number of transactions from 6 to 5 every loop. This is all within just the last commit to help make it easier to review. We can squash it as needed, although I admit at this point I'm a bit tired of continually fixing merge conflicts and would rather not :) |
@Pedals2Paddles tested the changes and they were well behaved on his solo. I'm good with this at this point. |
Concur. Looks great and will be very useful data for both flight and post-flight. |
@WickedShell Please check master...OXINARF:review/6020, specially the last two commits that I will squash into yours if you agree with it. I've squash and shuffled your commits, they look good to me, but let me know if there is something wrong or that you don't agree with. I've also separated your Solo change into 7bec664 for @rmackay9 to review. |
@OXINARF if you've already managed it I won't rebase here, but there are merge conflicts showing on the PR with libraries/DataFlash/LogStructure.h, I was going to rebase before the dev call, but I don't want to break your change set. @jmachuca77 and I have determined master (not including any of this patch) appears to be much worse at talking to the Solo battery, we are still working through the proper resolution for that, but I think that will be a separate PR later to address it. |
@WickedShell Yes, I've rebased it, it's a very simple conflict. I'll squash my "fixes" into your commits since you agree with them (everything is CI green 😀). I'm just not sure if we should merge the last Solo one or not, specially if you are working on a proper fix. I'm not sure if I'll be able to attend the dev call, if I'm not then let me know what comes up from the discussion. |
Also fixs a buffer overflow when reading the ManufacturerData
@OXINARF @Pedals2Paddles the poor performance of Solo SMBus I2C has been identified, it's unrelated tot his patch set and changes are coming for it. |
Minimizes the number of transactions on the bus, and reduces the amount of noise we have to consider
b5dbff7
to
d7cbd43
Compare
This LGTM. nicely done, really looking forward to having this on my development copter! |
Splendid work. Really appreciate the work getting the Solo battery data working on this. |
Merged, thanks! |
This builds upon the work presented in #6015 and fetches the per cell voltages and temperatures from both the Maxell and Solo SMBus drivers. I added 10 cells of support to AP_BattMonitor, as that is what the MAVLink spec calls for and gives us a transport mechanism for, while I think supporting 12 cells makes sense in the future, until we have someway to actually get more then 6 cells I didn't allocate that many.
There are a couple of debug status text messages that improve the testing ability that will have to be removed before this can be merged. (As well as getting #6015 merged).
This also corrects a buffer overflow that happens within the SMBus_Solo driver, it hasn't had a noticable impact to date because it tends to overflow into memory we are already done using. (The ManufacturerData was 6 bytes long, but read into a 4 byte destination).
Within Dataflash I pulled the current definitions out into a pair of common defines.
@tatsuy tested the Maxell driver last night, and found a small error with the temperature which I've since fixed, but otherwise functioned correctly. The solo implementation hasn't been tested by anyone yet, and was written against the data sheet as I don't have the hardware. The logging wasn't present when Tatsuy tested, so hopefully whoever tests the solo battery can also confirm that their DF log looks correct? (The cell voltages are in millivolts, 65535 for any cell that is unknown).