Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Initial draft proposal of the new standard data type set for UAVCAN v1.0 #47

Merged
merged 49 commits into from
Oct 23, 2018

Conversation

pavel-kirienko
Copy link
Member

@pavel-kirienko pavel-kirienko commented Oct 11, 2018

This is my vision of the new data type set design built in accordance with our new approaches. One will notice that this new set is completely devoid of any equipment-specific data types, which is fully intentional. The old approach that we experimented with in v0 turned out to be deeply flawed because it forces one to design their applications in a very particular way, rendering the protocol unusable for use cases that were not foreseen when the set of data types was originally designed. More background info on the subject is available in the forum thread linked above.

The new approach is to provide a set of very generic data types that do not make any assumptions about the nature of the target application. The resulting type system is far less efficient, but the upside is that it is much more flexible and repurposable. Rather than trying to cater to every sensible application out of the box, we are now encouraging equipment vendors to provide custom vendor-specific message definitions for their equipment when higher efficiency or advanced functionality is desired; whereas simpler applications can be handled by using the standard set alone. I expect that there will be equipment that can be interfaced with using both generic data types (in a universal but not-so-efficient way) and vendor-provided data types (in a very vendor-specific but very efficient way). Similar patterns can be found in other systems, e.g. CANopen or USB.

I think it might be sensible to encourage vendors to publish their vendor-specific data types under very permissive licenses permitting other vendors to re-use their data type definitions to reduce fragmentation of the ecosystem. We could also come up with a set of simple domain-specific definitions to be used as a middle ground between the generic standard set and vendor-specific custom messages, although that should belong in a completely separate namespace outside of uavcan.*.

I want everyone involved to evaluate the new data type set carefully and pay particular attention to the following parts:

uavcan.node.GetPorts, uavcan.node.GetPortInfo, uavcan.node.GetPortStatistics - these data types provide a simple yet very powerful API that allows one to reconstruct the computational graph, collect the main communication metrics (input/output/errors) in real time, and verify that all nodes use compatible data types for communication with each other. Eventually, when the new GUI tool is around, it will be able to build things like this, showing the traffic intensity and transport errors in real time:

12
cv_graph

(regarding the term "port", see https://forum.uavcan.org/t/a-generic-term-for-either-subject-or-service/182)

The dynamic node ID allocation message was moved to a different namespace (named uavcan.pnp, short for plug-and-play). Now we have two versions of it: one legacy for CAN 2.0, and a new one for CAN FD and eventually other transports. The key difference is that the new message allows for much, much simpler allocation logic because it doesn't require the server and the client to engage in complex stateful multi-stage exchanges.

A new namespace uavcan.primitive was added, which contains primitive data types (integers, reals, arrays of those, etc). I expect it will be handy for prototyping, although I don't want to encourage reliance on it because such basic types lack semantic information making them hard to build robust interfaces with. Also, these are needed for the next feature:

The parameter API is gone. Meet its successor: uavcan.register. This is a pair of very simple services that allow the caller to manipulate the configuration parameters on the remote node, request its read-only named values such as calibration data and whatever else defined by the vendor, and trigger actions by writing values into special-function registers. The same interface can also be used to implement direct reads and writes of the target node memory remotely over the CAN bus. We use this approach (although with a completely different protocol over USB/RS485) for high-speed data collection and configuration parameter management elsewhere and it works quite well.

The data type uavcan.node.ExecuteCommand replaces overly specific definitions like RestartNode and BeginFirmwareUpdate. It also allows vendors to define custom commands there. This move should simplify implementation on resource-constrained nodes as one can get by with just one server where three used to be needed.

Finally, the brand new namespace uavcan.si, which is the main innovation here. Just a bunch of physical quantities. I personally think that the concept is solid but its implementation could use some refinement. For example, the data types may require renaming, some of them look clumsy (e.g., uavcan.si.length.WideVector3TS could certainly use a better name). Most types use float32 underneath, but some have to rely on wide integers or float64 because of insufficient precision/resolution/range of float32; such inconsistency is undesirable. As I expect this to be the most critical part of the new specification, I'd like everyone to subject these definitions to maximum scrutiny.

Paging @kjetilkjeka @thirtytwobits @EShamaev @antoinealb @mhkabir @uxvrob @dagar

@pavel-kirienko pavel-kirienko added this to the 1.0 milestone Oct 11, 2018
@pavel-kirienko pavel-kirienko self-assigned this Oct 11, 2018
@pavel-kirienko
Copy link
Member Author

The CI is broken because the current Python DSDL compiler doesn't understand our new data types yet.

@pavel-kirienko pavel-kirienko requested review from a team October 11, 2018 20:45
Copy link

@antoinealb antoinealb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm the register API is for me a bit of a regression. The parameter one was simpler and more standardized (standard ways to write config to flash for example), which made the experience with the GUI too nice. Here I think it will be harder to make a nice general purpose UI for node configuration.

@pavel-kirienko
Copy link
Member Author

@antoinealb check out this video here please: https://forum.zubax.com/uploads/default/original/1X/3f8b3b136d03e9035ef95ca7928f1ba4a4f746fa.webm (you can skip to 1:20). It demonstrates the user experience that can be built in a very generic way (no assumptions made about the application) using this approach. If you have any specific concerns about generality/UI please share them, perhaps the documentation should be clarified?

@antoinealb
Copy link

yes for parameter enumeration I am quite confident this can be done. I am more wondering about the "save to persistent storage" functionality that was there before. I don't see anything comparable to that in your new protocol ?

@pavel-kirienko
Copy link
Member Author

@antoinealb yes indeed, it is assumed (should be made explicit?) that persistence will be managed by the node automatically rather than externally. If manual persistence management is desired, it can be implemented via uavcan.node.ExecuteCommand (the command could be standard or vendor-specific).

@antoinealb
Copy link

Hmmm then I would suggest putting a standard command for this. Nodes could still manage it automatically but it would be cool to have standards commands for:

  1. Saving config to storage
  2. Loading config from storage
  3. Erasing stored config (factory reset)

@pavel-kirienko
Copy link
Member Author

@antoinealb added a new standard command to support manual persistence management. I decided to avoid defining the load command as the use cases for that are not yet perfectly clear; interested vendors should rely on vendor-specific commands for now.

@pavel-kirienko
Copy link
Member Author

How do we represent covariance information in general-purpose standard interfaces using the new data type model? Our options are, as I see it:

  • Use uavcan.primitive.ArrayOfReal16 or uavcan.primitive.ArrayOfReal32. These lack semantic information and timestamps.
  • Define timestamped primitives, e.g. uavcan.primitive.ArrayOfReal16TS.
  • Any other options?

Copy link
Contributor

@kjetilkjeka kjetilkjeka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to make this PR in several steps.

  • We should probably discuss the identifier ranges and namespaces (README.md) in a separate PR.
  • We cannot realistically stabilize ~100 datatypes in 5mins. Most of these are untested and should be in v0.
  • We should have a PR for each namespace, this makes it easier to discuss without things becoming a mess.
  • Once we have most of the picture in place and done some testing we can start upping the most important types to v1.0

README.md Outdated
--------|-----------|------------------------------------------------
0 | 0 | Not a valid ID
1 | 32767 | Dynamic
61440 | 62803 | Vendor-specific public static
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant by Vendor specific public static? Are these statically set by vendors? If so, how do we avoid collisions? Hand out a few IDs on request together with namespaces?

Should there be an applications static space?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this what Dynamic is though @kjetilkjeka ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this further in https://forum.uavcan.org/t/on-subject-service-id-ranges/195

We should call it something like "Application Specific (Dynamic)" to not give the wrongful illusion that using it with static (application specific) IDs are not OK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Application Specific (Dynamic)

Surely you mean "Application-specific static". Application-specific dynamic are in the lower range starting from 0. More info in the linked thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The vendor specific public static" is fine. It's the "dynamic" I have a problem with (as it's not always dynamic). We should change that one to something like "Application specific".

@pavel-kirienko pavel-kirienko changed the title Initial draft proposal of the new standard data type set for UAVCAN v1.0 [WIP] Initial draft proposal of the new standard data type set for UAVCAN v1.0 Oct 14, 2018
@pavel-kirienko
Copy link
Member Author

Some definitions have been moved back to v0. I have posted my answers to the concerns raised by @kjetilkjeka here: https://forum.uavcan.org/t/on-subject-service-id-ranges/195/2

Copy link
Contributor

@kjetilkjeka kjetilkjeka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a lot of comments.

We should also see if we can get rid of the primitive namespace and just create a subject with a primitive type instead.

Thesi namespace doesn't really look ergonomic either. Perhaps it is not quite ready to be stabilized directly from this PR?

uavcan/diagnostic/65520.Text.1.0.uavcan Outdated Show resolved Hide resolved
uavcan/diagnostic/65520.Text.1.0.uavcan Outdated Show resolved Hide resolved
uavcan/diagnostic/Severity.1.0.uavcan Outdated Show resolved Hide resolved
uavcan/file/405.GetInfo.1.0.uavcan Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@@ -0,0 +1,8 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a high priority message to stabilize. Let's make it v0.1 or v0.0 to lower the work burden of initial stabilization?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Demote to 0 for now.

uavcan/node/SemanticVersion.1.0.uavcan Outdated Show resolved Hide resolved
@@ -0,0 +1,99 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it not possible to unite node ID allocation for CAN2.0 and other?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anonymous multi-frame transfers are not possible, they can be only single-frame. The maximum payload length of a CAN 2.0 single-frame transfer is limited to 7 bytes. In order to allocate a node ID, we have to communicate the unique ID of the local node (which is 16 bytes large) to the allocator. Seeing as it won't fit in one CAN frame, the allocatee and the allocator have to engage in a very complex multi-stage exchange process, which also has certain known hard-to-fix pitfalls.

The CAN FD-based protocol is vastly simpler: the allocatee can just send the request in one message, and the receive the response immediately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Can we keep it in v0.1 just until the node allocator is tested and works with CAN-FD?

@@ -0,0 +1,5 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it so we don't need these silly definitions? Can we, for instance, instead allow primitive types (and arrays) to be sent on a subject ID as well as composite definitions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I beg to differ. The definitions may look silly, but they are architecturally sound, unless I am mistaken somewhere. They explicitly define the set of basic types that can be exchanged over the bus, avoiding possible excessive fragmentation if one were to use oddly sized types like uint51. They operate within the framework of existing type system, not needing additional provisions from the specification, keeping it simple. They would be required anyway if we were to add timestamped primitives.

I don't see a compelling reason to add more complexity to the specification.

uavcan/time/62804.Synchronization.1.0.uavcan Show resolved Hide resolved
@pavel-kirienko
Copy link
Member Author

I have temporarily disabled Travis here because we can't really fix it until the DSDL compiler is updated.

The si namespace doesn't really look ergonomic either. Perhaps it is not quite ready to be stabilized directly from this PR?

I would really like to have all namespaces reviewed jointly, as some of them are interdependent (e.g. file and node, register and primitive). However, you are probably right in that the SI namespace is a rather special one as it is the major piece of the new design and I admitted above that it might need refinement. I am open to removing it from this PR, to re-introduce it again in the very next PR.

@pavel-kirienko
Copy link
Member Author

pavel-kirienko commented Oct 19, 2018

I would prefer the "Dynamic" range to be changed to an "Application-specific" range before merging though.

Done.

Can we also clarify in the README.md that this is currently a draft as well?

Not done -- I'm worried we might forget to remove that before merging to master. The facts that the spec v1.0 is not yet released and that this is neither master nor a release branch should be indicative enough.

I would like @thirtytwobits and @antoinealb to confirm that they accept the changes as well before merging. After that we will continue with uavcan.diagnostic.

uavcan/internet/udp/500.HandleIncomingPacket.0.1.uavcan Outdated Show resolved Hide resolved
uavcan/internet/README.md Outdated Show resolved Hide resolved
uavcan/internet/udp/500.HandleIncomingPacket.0.1.uavcan Outdated Show resolved Hide resolved
uavcan/internet/udp/500.HandleIncomingPacket.0.1.uavcan Outdated Show resolved Hide resolved
uavcan/internet/udp/500.HandleIncomingPacket.0.1.uavcan Outdated Show resolved Hide resolved
uavcan/node/430.GetInfo.0.1.uavcan Show resolved Hide resolved
README.md Show resolved Hide resolved
@@ -0,0 +1,8 @@
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Demote to 0 for now.

uavcan/time/Point.1.0.uavcan Show resolved Hide resolved
thirtytwobits and others added 9 commits October 20, 2018 15:40
Co-Authored-By: pavel-kirienko <pavel.kirienko@gmail.com>
…ely used, to the end of the message for easier manual serialization
Co-Authored-By: pavel-kirienko <pavel.kirienko@gmail.com>
Co-Authored-By: pavel-kirienko <pavel.kirienko@gmail.com>
Co-Authored-By: pavel-kirienko <pavel.kirienko@gmail.com>
Co-Authored-By: pavel-kirienko <pavel.kirienko@gmail.com>
@pavel-kirienko
Copy link
Member Author

Some thoughts about the file API (just food for thought. You're the architect here so I trust your final judgment): I'm wondering if we have an API that is weird? That is, we could have used something like TFTP over CAN for greater simplicity and standardization. On the other hand, or we could have provided something more complete like HTTP over CAN (e.g. using HTTP verbs for basic operations and RFC7233 range requests for more fine grained file access). With that in mind, is what we have here a happy median or an odd bastard?

@thirtytwobits I would say that what we have here is more of a happy bastard. I seriously dislike conventional file transfer protocols because of their heavy reliance on shared state which makes them extremely fragile; TFTP is a good example of that. Instead, we should be leaning towards client-driven protocols that ideally do not require the server to keep any context whatsoever (although this is probably not possible or not practical to obtain, it is something we should strive towards, like a motivational poster on the wall). Let's discuss that in detail in a separate PR.

# For example, this field can be used for reporting the short git commit hash of the current
# software revision.
# Set to zero if not used.
uint32 software_vcs_revision_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint32 software_vcs_revision_id
uint64 software_vcs_revision_id

for future-proofing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that to make sense, there should be either:

  • More than 2^32 commits.
  • The number of commits should be large enough for a short hash collision to be likely. Consider this: assuming that the short hash is perfect, you need ~3000 commits to reach the collision probability of 0.1% with a 32-bit hash. 3k commits is not much (for reference, there are 28k commits in the PX4 Firmware repository), but you also have the major and the minor version numbers to further narrow down the exact revision, so your real constraint is that there shouldn't be more than a couple thousand of commits sharing the same minor version number. Which seems sensible.

Do we really want 64 bits here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @thirtytwobits here, I would prefer 64 bits of vcs commit it, especially since since GetInfo is not going to be called very often anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antoinealb what is the rationale?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I computed correctly, at 28k commit, you have about 9% chance of a collision in the 32 bit git hash. Of course like you said it can be narrowed down using major and minor, but that's a hassle that is really easy to avoid. If we use 64 bits instead for example, the probability drops to about 10e-11.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use 64 bits instead for example, the probability drops to about 10e-11.

Okay. That is probably an overkill, how about we used 40-bit here? For 40-bit hash:

  • ~0.1% at 50k commits
  • ~24% at 784k commits (that's how many there are in the Linux kernel currently)

For 48-bit hash:

  • ~0.0004% at 50k commits
  • ~0.1% at 784k commits

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

40 or 48 are ok for me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, increased to 40 bit.

uavcan/node/435.ExecuteCommand.0.1.uavcan Outdated Show resolved Hide resolved
#

# Standard pre-defined commands are at the top of the range (defined below).
# Vendors can define arbitrary vendor-specific commands in the bottom part of the range (starting from zero).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the term "operation" for SystemCall

uavcan/node/435.ExecuteCommand.0.1.uavcan Show resolved Hide resolved
uavcan/node/435.ExecuteCommand.0.1.uavcan Show resolved Hide resolved
uavcan/node/435.ExecuteCommand.0.1.uavcan Outdated Show resolved Hide resolved
uavcan/node/435.ExecuteCommand.0.1.uavcan Show resolved Hide resolved
uavcan/node/435.ExecuteCommand.0.1.uavcan Show resolved Hide resolved
uavcan/node/62805.Heartbeat.1.0.uavcan Show resolved Hide resolved
@pavel-kirienko pavel-kirienko merged commit e4180bb into uavcan-v1.0 Oct 23, 2018
@pavel-kirienko pavel-kirienko deleted the v1-standard-data-types branch October 23, 2018 18:26
This was referenced Dec 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants