Self documenting of log data #5864

magicrub · 2017-03-14T10:35:43Z

We need a self maintaining mechanism for keeping the log data documented. Something like the params scheme where both the wiki and a GCS is up to date by developer comments.

Georacer · 2017-03-14T10:42:30Z

Are you referring to the MAVLink dialect or the Dataflash logs?

magicrub · 2017-03-14T14:34:31Z

I was referring to DataFlash, specifically stuff like this which are sprinkled throughout the codebase:
https://github.com/ArduPilot/ardupilot/blob/master/libraries/DataFlash/LogStructure.h#L780

Georacer · 2017-03-14T14:49:36Z

For background and reference, as @OXINARF told me here that Dataflash, logs aren't exactly stable.
So, yes, it makes sense to have dynamically generated documentation, to avoid double work; one for coding and one for documenting.

Comparing with the parameter documentation, each parameter is declared in the code along with its description and default values. That means that the author must provide that description, it's not parsed automagically from the code, which is reasonable.

For the same reason, it makes sense that each DF log message type should be an object with accompanying information. Tthe author of each DF log message should document it with supplemental information, at minimum with a description and units.

It's not clear to me right now how the parameter description code is parsed to generate the documentation, but I know where to find that information (I think).

What is important IMO, because old logs from previous versions are often re-visited, is to be able to generate documentation from older versions. This could be problematic for previous versions, but from now on it shouldn't be a problem. The missing old information perhaps could be added by hand somehow.

hunt0r · 2017-03-14T16:10:07Z

A small comment: In addition to the documentation for usability (units, log message field explanations, etc. I've actually started a document to this effect for my own learning) we could begin with dev-level documentation of anything that's not-open-to-change about the logs.

For example: At present the first FMT message defines the "FMT" message type, including it's length. But there's a "catch-22": How would a log-processor, starting "from scratch" know the length of the first FMT message, before it's read the contents of the message? (which may require knowing the read-length)
Idea 1) The FMT length is fixed, and never going to change. (I think it's 89 now.) So the log reader can always read the first message of a log at that length, which allows it to learn other lengths as it goes.
Idea 2) The first log message will always be a FMT message, defining the FMT message type. Thus a log reader finds the first length byte (currently 2-byte header + 1-byte msg type + 1-byte new-msg type implies the 5th byte is the first length in the log)

Also, the comments in the code suggest the header is subject to change and/or elimination... is this still intended? Or should the header be fixed as something that's not subject to change?

Georacer · 2017-03-14T16:22:35Z

@hunt0r Actually, I don't think the message specification itself should be an issue for this specific ticket.

For comparison, check how new parameters are declared here, specifically in step 3.

Notice how (even though it is discouraged), all of the code defining and specifying any parameter can (and often does) change.
However, this change never needs to be transcribed manually, because there is automatic code which generates documentation based on the parameter metadata which exists embedded in the code.

Getting back to the case of DF messages, if each type was given metadata in a similar case, then automatic code generation could produce all of the format specification, without needing to actually parse the FMT messasges of the log.
Of course, extra human readable information should be also added, as discussed before (description, units etc).

magicrub · 2017-03-14T20:25:02Z

The self describing bit of the DF should stay the way it is, that ensures it is interpreted and displayed correctly in a gui. However, the description text should not go into the log because it will add 10k of text every time it starts a new log. That withdraw be nice, but unnecessary. The GUI can just get it from the internet like it does with params.

OXINARF · 2017-03-15T00:34:05Z

There was already an issue for this (#5460). @magicrub please search first next time.
There is a proposal there from Don that is interesting - basically generate log code and documentation from an XML file, like MAVLink. A lot of work would be needed to implement that though.

@Georacer What you are proposing isn't very feasible because it would imply searching the entire codebase for log documentation. Parameters are different - we look at each vehicle's Parameters.cpp file and from there we get all the information needed, including additional files to be parsed.

peterbarker · 2017-03-15T01:45:19Z

As opposed to MAVLink - which requires an external document to understand, and is a rather formal specification, dataflash log files by design intent are expected to be self-describing. This was my intent with the units branch - to extend the extent to which they are self-describing!

https://github.com/ArduPilot/ardupilot/compare/master...peterbarker:units?expand=1

That patch (apart from adding sanity checking) adds units and multipliers to the metadata the log carries. This information is really useful to log analysis tools as they can e.g. convert radians to degrees where appropriate. For that reason I think we should push ahead with that units patch.

It would also be wonderful to get textual descriptions of the fields into the logs. Sadly, from a practical standpoint, that's not possible on stm32-class flight controllers; we can't carry all of the extra text for that (as 2magicrub points out).

We could pre-process the code like we do for parameters - but pre-processing sucks. Or we could conditionally-compile and include the textual descriptions for platforms that can afford the extra text (Linux!) A compiled tool could then be run to generate an XML document to be used when the log does not include the extra text. The XML document would obviously suffer from the same versioning problem we have for parameter files (but we'll solve it at the same time we get around to solving the parameter one).

@OXINARF It's interesting you bring up Don's efforts there. We recently moved further away from that sort of system when we got Log_Write() - because setting all of the data structures up was too much like hard work for devs who wanted their DF message right now!

Georacer · 2017-03-15T07:17:15Z

@OXINARF I might just be dense, but I don't see why parsing the codebase for DF msg definitions wouldn't be very feasible.
Yes, I understand that parameters are more localized in the codebase and can be parsed quasi-independently of the rest of the project, but how worse could the situation with DF msgs be?

Would it be a deal breaker if a tool needed, say, 5 minutes to parse the entire codebase? Or am I completely deluded?

OXINARF · 2017-03-16T15:22:08Z

dataflash log files by design intent are expected to be self-describing

I always thought of it being a self-described binary format, not necessarily having a description of the information. As you said, that looks impossible to do with a PX4 board. I don't oppose having units and multipliers, but I think that without further information it is quite useless - you won't graph a variable if you don't know what it is and units aren't going to change that.

It's interesting you bring up Don's efforts there. We recently moved further away from that sort of system when we got Log_Write() - because setting all of the data structures up was too much like hard work for devs who wanted their DF message right now!

And you know I love Log_Write! But I think it wouldn't be needed if you just wrote the message format in, for example, a XML document and then you would know you had a method with a specific name format ready to be called. At least for me, I love Log_Write because I don't have to write code in several places just to have write a log message - but two places would be acceptable.

Would it be a deal breaker if a tool needed, say, 5 minutes to parse the entire codebase? Or am I completely deluded?

If we want to enforce it, then it needs to run on Travis so that the build fails if the documentation is incorrect or absent. And 5 minutes per job in Travis is indeed a lot of time.

Georacer · 2017-03-16T18:49:11Z

I don't oppose having units and multipliers, but I think that without further information it is quite useless - you won't graph a variable if you don't know what it is and units aren't going to change that.

+1

amilcarlucas · 2017-04-11T22:01:27Z

I like the units PR. it is at least a step forward. It will however break my code, that converts DF logs into DCM ( https://www.etas.com/download-center-files/products_ASCET_Software_Products/TechNote_DCM_File_Formats.pdf ) but hey... that's worth it.
BTW @magicrub and @peterbarker the last two commits on ArduPilot/mavlink#39 are also doing the same stuff on Mavlink. There it is easier because it is centralized XML

jaxxzer · 2018-05-30T18:55:31Z

In the past year or so.. Has there been any further effort, discussion or decisions on how to proceed with this issue?

amilcarlucas · 2018-05-31T10:42:39Z

I think the easiest way to get this done is to add doxygen documentation blocks above each Log-Write function call.
Doxygen would then bundle all those scattered documentation blocks into a single .html webpage.

What do you guys think about it ? Does Doxygen already run at out server?

peterbarker · 2019-07-10T00:33:23Z

I'm playing with documenting AP_Logger messages in a similar way to the way we document parameters in this branch:

#11765

@amilcarlucas whatever we do has to also produce a machine-readable document - probably XML. Do you think Doxygen would be a reasonable alternative given that requirement?

lvale · 2019-07-10T08:15:54Z

@peterbarker Yes it is

http://www.doxygen.nl/manual/output.html

amilcarlucas · 2019-07-18T11:47:39Z

yes, doxygen is a good generator.

peterbarker · 2020-07-17T22:40:13Z

We now have this.

magicrub added the Docs label Mar 14, 2017

OXINARF mentioned this issue Mar 15, 2017

Publish log item meaning. #5460

Closed

Georacer mentioned this issue Apr 4, 2017

Separate Units structure Georacer/ardupilog#38

Closed

OXINARF added Logging Enhancement labels Jun 4, 2017

jaxxzer mentioned this issue Jul 9, 2019

Sub: Auto-document DataFlash messages #9802

Closed

peterbarker closed this as completed Jul 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Self documenting of log data #5864

Self documenting of log data #5864

magicrub commented Mar 14, 2017

Georacer commented Mar 14, 2017

magicrub commented Mar 14, 2017

Georacer commented Mar 14, 2017

hunt0r commented Mar 14, 2017

Georacer commented Mar 14, 2017

magicrub commented Mar 14, 2017

OXINARF commented Mar 15, 2017

peterbarker commented Mar 15, 2017

Georacer commented Mar 15, 2017

OXINARF commented Mar 16, 2017

Georacer commented Mar 16, 2017

amilcarlucas commented Apr 11, 2017

jaxxzer commented May 30, 2018 •

edited

Loading

amilcarlucas commented May 31, 2018

peterbarker commented Jul 10, 2019

lvale commented Jul 10, 2019

amilcarlucas commented Jul 18, 2019

peterbarker commented Jul 17, 2020

Self documenting of log data #5864

Self documenting of log data #5864

Comments

magicrub commented Mar 14, 2017

Georacer commented Mar 14, 2017

magicrub commented Mar 14, 2017

Georacer commented Mar 14, 2017

hunt0r commented Mar 14, 2017

Georacer commented Mar 14, 2017

magicrub commented Mar 14, 2017

OXINARF commented Mar 15, 2017

peterbarker commented Mar 15, 2017

Georacer commented Mar 15, 2017

OXINARF commented Mar 16, 2017

Georacer commented Mar 16, 2017

amilcarlucas commented Apr 11, 2017

jaxxzer commented May 30, 2018 • edited Loading

amilcarlucas commented May 31, 2018

peterbarker commented Jul 10, 2019

lvale commented Jul 10, 2019

amilcarlucas commented Jul 18, 2019

peterbarker commented Jul 17, 2020

jaxxzer commented May 30, 2018 •

edited

Loading