Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_BattMonitor: Cell voltages and Temperature #6020

Merged
merged 9 commits into from
Apr 19, 2017

Conversation

WickedShell
Copy link
Contributor

This builds upon the work presented in #6015 and fetches the per cell voltages and temperatures from both the Maxell and Solo SMBus drivers. I added 10 cells of support to AP_BattMonitor, as that is what the MAVLink spec calls for and gives us a transport mechanism for, while I think supporting 12 cells makes sense in the future, until we have someway to actually get more then 6 cells I didn't allocate that many.

There are a couple of debug status text messages that improve the testing ability that will have to be removed before this can be merged. (As well as getting #6015 merged).

This also corrects a buffer overflow that happens within the SMBus_Solo driver, it hasn't had a noticable impact to date because it tends to overflow into memory we are already done using. (The ManufacturerData was 6 bytes long, but read into a 4 byte destination).

Within Dataflash I pulled the current definitions out into a pair of common defines.

@tatsuy tested the Maxell driver last night, and found a small error with the temperature which I've since fixed, but otherwise functioned correctly. The solo implementation hasn't been tested by anyone yet, and was written against the data sheet as I don't have the hardware. The logging wasn't present when Tatsuy tested, so hopefully whoever tests the solo battery can also confirm that their DF log looks correct? (The cell voltages are in millivolts, 65535 for any cell that is unknown).

@@ -233,6 +233,30 @@ void GCS_MAVLINK::send_power_status(void)
hal.analogin->power_status_flags());
}

void GCS_MAVLINK::send_battery_status(const AP_BattMonitor &battery, const uint8_t instance) const
{
GCS_MAVLINK::send_statustext_all(MAV_SEVERITY_INFO, "1: %d 2: %d 3: %d 4: %d 5: %d 6: %d",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the debug for PR ...

@WickedShell WickedShell force-pushed the wickedshell/battery-voltages branch 2 times, most recently from f0c9ef5 to f594c87 Compare April 11, 2017 08:02
@@ -480,6 +480,8 @@ struct PACKED log_Current {
float battery_voltage;
float current_amps;
float current_total;
int16_t temperature; // degrees C * 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about just recording the temperature in C in a signed byte? It's not the size so much as the slightly awkward C*100 scaling. I'll leave it up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay the comment might actually be more misleading... We're setting the FMT_LABEL to have the unit * 100 already, so this will result in your graping program automatically dividing this by 100 again for you. The reason I selected this resolution is that our only current source of temperature information is only good to 0.1 C, so this will convey that level of accuracy, and I didn't want to use a 4 byte storage like a float. Because you will automatically see this divided again I think this is the correct approach.

Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah right. makes sense, thanks!

@rmackay9
Copy link
Contributor

I'm not a huge fan of the scaling of the temperature in the log message but it can be argued either way so LGTM.

mavlink_msg_battery_status_send(chan,
instance, // id
MAV_BATTERY_FUNCTION_UNKNOWN, // function
MAV_BATTERY_TYPE_UNKNOWN, // type
0x7fff, // temperature. INT16_MAX if unknown
got_temperature? (temp * 100) : 0x7fff, // temperature. INT16_MAX if unknown
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miss a cast here as temp is float and the msg variable is int16_t

Copy link
Member

@OXINARF OXINARF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple of comments. It's mainly about the mixing of MAVLink, battery monitor and dataflash cell field size - each have their own number. You might want to define a macro in battery monitor header so it can be used in dataflash, that way it will always log the correct number of cells - you'd still need to convert that into the MAVLink field size (unless you use the MAVLink macro everywhere, it's fine to me).

Also temperature is saved as float in C, but always used as int16_t in C * 100. Any reason for not saving it as the latter?

@@ -116,6 +119,10 @@ class AP_BattMonitor
bool overpower_detected() const;
bool overpower_detected(uint8_t instance) const;

// cell voltages
const cell_voltages_t & get_cell_voltages() const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unimplemented

@@ -333,3 +336,13 @@ bool AP_BattMonitor::overpower_detected(uint8_t instance) const
return false;
#endif
}

// return the current cell voltages, returns the first cell if the instance is out of range
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first cell -> first monitor cells

// returns true if there is a temperature reading
bool AP_BattMonitor::get_temperature(float &temperature, const uint8_t instance) const
{
temperature = state[instance].temperature;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like the cell voltages, this should be protected against a bad instance number.


// read temperature
if (read_word(BATTMONITOR_SMBUS_MAXELL_TEMP, data)) {
_state.has_temperature = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it change the has_temperature status if it can't read the value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It's possible this is just a bus error, but we should probably flag it anyways, otherwise we will have a bad stale temperature without knowing it....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this, but it appears to be a bad decision, I've attatched an image from a Solo battery, and it appears this part of the communication is really unreliable. The better approach I think will involve a timer that sets it to false if we haven't received a new temperature update for a period of time (5 seconds comes to mind as the number already used elsewhere in the drivers). Temp isn't a high priority information, but it would be nice to not be oscillating like this.
figure_1-2

struct log_Current pkt = {
LOG_PACKET_HEADER_INIT(LOG_CURRENT_MSG),
time_us : AP_HAL::micros64(),
battery_voltage : battery.voltage(0),
current_amps : battery.current_amps(0),
current_total : battery.current_total_mah(0),
temperature : (int16_t)((has_temp ? temp : 0) * 100),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusing, to me, at least. I would prefer (int16_t) (has_temp ? temp * 100 : 0)

};
for (uint8_t i = 0; i < MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN; i++) {
pkt.cell_voltages[i] = battery.get_cell_voltages(0)[i] + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please get the cell voltages before the cycle instead of calling it 10 times. Also, why use the MAVLink field size here? Not only it doesn't need to match, but cell_voltages_t isn't defined with the MAVLink size. It should check it has a maximum of sizeof(pkt.cell_voltages).

What's the +1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the 10 times thing was a bit of a hack to see if it made graphing nice, will swap it over.

The +1 was to make log review saner... Basically we internally stored voltages with the MAVLink representation of invalid values for a cell (UINT16_MAX) which makes sending the mavlink message handy, and avoids the fact that a dead cell in the middle of a pack actually would be reported as 0. However for the DF log it has the downside of really messing up scaling when you try and graph the cell voltages. Nudging the value by 1 mV means that a unknown cell shows up as 0 when graphed which doesn't destroy the ranges nearly as badly, but also that an actually dead cell would be a distinct value from an unknown cell.

As far as the MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES, that is what we are trying to hang the voltages array off of for length, but when I include the GCS_MAVLink/GCS.h header in AP_BattMonitor, I get dependency problems. I think I can peak directly at the msg_battery_status header though. I'll give that a shot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that including the message header directly isn't a problem. We want to get a field from it, not exactly from GCS.h

WriteBlock(&pkt, sizeof(pkt));
}

if (battery.num_instances() >= 2) {
float temp;
bool has_temp = battery.get_temperature(temp, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 -> 1

// clear out the cell voltages
memset(&state[instance].cells, 0xFF, sizeof(cell_voltages_t));
// clear out the temperature
state[instance].has_temperature = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size reduction squad: shouldn't be needed, it is false on construction.

MAV_BATTERY_FUNCTION_UNKNOWN, // function
MAV_BATTERY_TYPE_UNKNOWN, // type
got_temperature? (temp * 100) : 0x7fff, // temperature. INT16_MAX if unknown
battery.get_cell_voltages(instance), // cell voltages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_cell_voltages returns cell_voltages_t. I'm not a fan of that typedef, I can live with it if it has some use, which apparently isn't the case as this is relying on it being an array of 10 uint16_t. We should probably have some protection here to make sure the returned array has a size equal or bigger than 10.

@WickedShell WickedShell force-pushed the wickedshell/battery-voltages branch 2 times, most recently from 10d74cd to 863b7a7 Compare April 12, 2017 08:34
@WickedShell
Copy link
Contributor Author

Alright, I spent about 4 hours on rebasing problems/type problems so please carefully review this. And @jmachuca77 I hate to ask but can you retest against a solo battery again? (or anyone else can, unless your GCS understands the BATTERY_STATUS message you will need to look at the DF log and at the CURR.Temp and CURR.V1 through CURR.V4 fields).

@OXINARF I unfortunately can't include GCS_MAVLink/include/mavlink/v2.0/common/mavlink_msg_battery_status.h inside of AP_BattMonitor.h without missing a ton of definitions/linker assistance, so I ended up leaving the cells array as a hard define of 10, I'm not enthused about it, but I don't see a better way at the moment. The typedef was an attempt to allow the compiler to ensure that I never violate access to outside the array length, however that didn't really allow storing the function result (or at least I couldn't get the compiler to be happy, and I regressed to the point of just trying every combination of */& to get a local handle on it :) @tridge suggested passing a struct around instead, which while looking weird has the merits of just being a nice direct access and compiles well.

This was a painful lesson in doing PR's based upon other PR's... Either that or I need to strongly improve my git foo so I spend far less time editing/fixing rebase/merge conflicts/getting things wrong.

@Pedals2Paddles
Copy link
Contributor

Since my build environment is still not working, can someone send/link me the px4 file for this and I will test it tonight on my Solo?

Copy link
Member

@OXINARF OXINARF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any issue with the structure, it looks cleaner that always writing 10 everywhere. But, like I wrote in the comments to code, we should look for the size of it.
If it isn't urgent to get this in, I'll see if I can understand why we can't include the MAVLink header.

mavlink_msg_battery_status_send(chan,
instance, // id
MAV_BATTERY_FUNCTION_UNKNOWN, // function
MAV_BATTERY_TYPE_UNKNOWN, // type
INT16_MAX, // temperature. INT16_MAX if unknown
voltages,
battery.get_cell_voltages(instance).cells, // cell voltages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should check that sizeof(cells) >= 10, otherwise this is going to be a problem if someone changes the number in AP_BatteryMonitor library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with static_assert above this.

};
AP_BattMonitor::cells cells = battery.get_cell_voltages(0);
for (uint8_t i = 0; i < MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't using the MAVLink field size in the library can we change this to MIN(ARRAY_SIZE(cells.cells), ARRAY_SIZE(pkt.cell_voltages))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks reasonable to me, but then this happens:

../../libraries/DataFlash/LogFile.cpp:1813:77: error: cannot bind packed field ‘pkt.log_Current::cell_voltages’ to ‘short unsigned int (&)[10]’
         for (uint8_t i = 0; i < MIN(ARRAY_SIZE(cells.cells), ARRAY_SIZE(pkt.cell_voltages)); i++) {
                                                                         ~~~~^
../../libraries/AP_Common/AP_Common.h:69:52: note: in definition of macro ‘ARRAY_SIZE’
 #define ARRAY_SIZE(_arr) sizeof(_ARRAY_SIZE_HELPER(_arr))
                                                    ^~~~

Lazy version would be to simply make the DF log length always match the AP_BattMonitor::cells.cells length?

Copy link
Member

@OXINARF OXINARF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to do something better for the sizes tomorrow, if I can't, I think we should put it in as is.

uint16_t voltages[MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN];
memset(&voltages, 0xff, sizeof(uint16_t)*MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN); // fill with UINT16_MAX for unknown cells
// catch the battery backend not supporting the required number of cells
static_assert(sizeof(AP_BattMonitor::cells) >= (sizeof(uint16_t) * MAVLINK_MSG_BATTERY_STATUS_FIELD_VOLTAGES_LEN));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to include a message as the last parameter to static_assert.

@WickedShell
Copy link
Contributor Author

Argh, static_assert appears to be compiler dependent as it was happy on my machine, will try adding the two arg version.

@Pedals2Paddles
Copy link
Contributor

Tested tonight. Cell voltage looks GREAT! Battery temperature looks like it is accurately reporting what the battery told it. But it looks like a sawtooth jumping from 0 to 29 to 0 to 29 repeatedly. Like the battery is not consistently reporting the temperature. Perhaps this should skip a temperature reading that is 0 degrees.

@OXINARF
Copy link
Member

OXINARF commented Apr 13, 2017

@WickedShell In C++11 static_assert has two arguments, starting with C++14 there is the one with only one argument - at least according to Internet documentation, I didn't check the standard 😀

@WickedShell
Copy link
Contributor Author

Yeah I think I swapped em :P Will get that pushed up in a bit. I'm converting the has_temp to be false only if we fail to get a temperature for >5 seconds. I actually suspect that if you were to throw a probe on it that the solo battery doesn't acknowledge a lot of requests (or has has flaky I2C connection).

@Pedals2Paddles
Copy link
Contributor

@WickedShell, shoot me the px4 file when you're ready and I'll test it again.

@WickedShell
Copy link
Contributor Author

Existing Solo battery code (without any of these temp/cell voltages stuff) only has a 27% chance of succeeding with any given query. (This is just on pack voltage, current, button status pack capacity).

@WickedShell WickedShell force-pushed the wickedshell/battery-voltages branch 2 times, most recently from 7b77228 to b5dbff7 Compare April 14, 2017 01:39
@WickedShell
Copy link
Contributor Author

Alright, pending a review and some testing I think this is a reasonable compromise. Temperatures will be considered valid if we received one within the last 5 seconds. Because Solo is so lossy I removed the request for pack voltage and instead set the pack voltage based on adding up the voltages from all the cells. Based on logs from @Pedals2Paddles this looks to be entirely reasonable, and allows us to cut down the number of transactions from 6 to 5 every loop.

This is all within just the last commit to help make it easier to review. We can squash it as needed, although I admit at this point I'm a bit tired of continually fixing merge conflicts and would rather not :)

@WickedShell
Copy link
Contributor Author

@Pedals2Paddles tested the changes and they were well behaved on his solo. I'm good with this at this point.

@Pedals2Paddles
Copy link
Contributor

Concur. Looks great and will be very useful data for both flight and post-flight.

@OXINARF
Copy link
Member

OXINARF commented Apr 17, 2017

@WickedShell Please check master...OXINARF:review/6020, specially the last two commits that I will squash into yours if you agree with it. I've squash and shuffled your commits, they look good to me, but let me know if there is something wrong or that you don't agree with. I've also separated your Solo change into 7bec664 for @rmackay9 to review.

@WickedShell
Copy link
Contributor Author

@OXINARF if you've already managed it I won't rebase here, but there are merge conflicts showing on the PR with libraries/DataFlash/LogStructure.h, I was going to rebase before the dev call, but I don't want to break your change set.

@jmachuca77 and I have determined master (not including any of this patch) appears to be much worse at talking to the Solo battery, we are still working through the proper resolution for that, but I think that will be a separate PR later to address it.

@OXINARF
Copy link
Member

OXINARF commented Apr 17, 2017

@WickedShell Yes, I've rebased it, it's a very simple conflict. I'll squash my "fixes" into your commits since you agree with them (everything is CI green 😀). I'm just not sure if we should merge the last Solo one or not, specially if you are working on a proper fix. I'm not sure if I'll be able to attend the dev call, if I'm not then let me know what comes up from the discussion.

@WickedShell
Copy link
Contributor Author

@OXINARF @Pedals2Paddles the poor performance of Solo SMBus I2C has been identified, it's unrelated tot his patch set and changes are coming for it.

Minimizes the number of transactions on the bus, and reduces the amount of noise we have to consider
@rmackay9
Copy link
Contributor

This LGTM. nicely done, really looking forward to having this on my development copter!

@Pedals2Paddles
Copy link
Contributor

Splendid work. Really appreciate the work getting the Solo battery data working on this.

@OXINARF OXINARF merged commit a96772e into ArduPilot:master Apr 19, 2017
@OXINARF
Copy link
Member

OXINARF commented Apr 19, 2017

Merged, thanks!

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants