Question: Is there a standard signal serialization format? #627
Comments
There is a |
Thank you for the pointer, @erikbosch My understanding of the FMPOV it would be helpful to introduce fixed identifiers for the VSS Data Entries which remain constant over time and cannot be reused. You mentioned the use of UUIDs being discussed for that matter. That could be one way of doing it. Another option with a smaller footprint might be to use a scheme similar to the one used in e.g. SNMP, where you assign a simple integer (counting from 1 up) to each node/leaf in each (sub-)tree. These identifiers could then be used as the property IDs in the protobuf message definitions and could also be used as a more compact identifier in other serialization formats. |
@adobekan - what is the status of your ideas to refactor UUID handling? Is it still just a thought? It seems to be related to the comment from Kai above. |
Proto could be used, but the problem would be how you exchange and manage the schema between integration points. What we were thinking of is related to what @sophokles73 is mentioning: a short UUID element, e.g. 3 bytes (1 byte for version/layers/source, e.g. what is public vs. private; 2 bytes for a fixed id of each element, which stays with the leaf after creation). @erikbosch Still on my ToDo list; we will start working on this soon. |
Here's an example of what I have in mind:
Now define the VehicleIdentification subtree
Now the Vehicle.VehicleIdentification.Year data entry could also be referred to by In a protobuf definition, this could also be used:
The IDs used in the message definitions are the values of the corresponding Data Entry definitions' id properties. These cannot be changed over time and if a new property is being added, a new id value is being defined in the vspec. Similarly, if a property is being removed, its id value will not be reused. This way, it should be quite simple to make sure that protobuf message definitions generated from the vspec files remain backward compatible. The ids could also be used in other serialization formats like JSON in order to increase the payload vs. meta data ratio. It makes a big difference if I use |
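To make the payload vs. meta data point concrete, here is a minimal sketch that encodes the same value as JSON once with the full path and once with a numeric id. The id value 17 and the `PATH_TO_ID` table are made up for illustration; they are not part of any vspec.

```python
import json

# Hypothetical mapping from VSS path to a fixed numeric id (value is made up).
PATH_TO_ID = {
    "Vehicle.Powertrain.CombustionEngine.DieselExhaustFluid.Level": 17,
}


def encode_verbose(path, value):
    """Serialize using the full VSS path as the JSON key."""
    return json.dumps({path: value}, separators=(",", ":")).encode("utf-8")


def encode_compact(path, value):
    """Serialize using the numeric id (as a string key) instead of the path."""
    key = str(PATH_TO_ID[path])
    return json.dumps({key: value}, separators=(",", ":")).encode("utf-8")


path = "Vehicle.Powertrain.CombustionEngine.DieselExhaustFluid.Level"
verbose = encode_verbose(path, 80)
compact = encode_compact(path, 80)
print(len(verbose), "bytes with path,", len(compact), "bytes with id")
```

The receiver needs the same id table to map the key back to a path, which is why the ids must never be reused.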
I like the idea, but some thoughts:
|
I like your proposal with the id tag, but I think it has to be a bit more unique. I would say that each leaf needs a short unique number that will stay with that leaf and will even allow us to trace the leaf, plus a value for overlays. Then we can identify whether a leaf is coming from the main repo, or is a new concept layer, or a private modification. Where I see challenges:
|
I would prefer that we go for a hex value of at least 4 bytes: byte 0 for layer concepts and bytes 1-3 for a generated id should be enough to cover us for the next few decades. :) |
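A minimal sketch of such a 4-byte identifier, assuming big-endian byte order so that byte 0 carries the layer and bytes 1-3 carry the generated id. The function names and the byte order are assumptions for illustration, not an agreed format.

```python
def pack_id(layer, leaf_id):
    """Pack a layer (1 byte) and a generated leaf id (3 bytes) into 4 bytes."""
    if not 0 <= layer <= 0xFF:
        raise ValueError("layer must fit in one byte")
    if not 0 <= leaf_id <= 0xFFFFFF:
        raise ValueError("leaf id must fit in three bytes")
    return ((layer << 24) | leaf_id).to_bytes(4, "big")


def unpack_id(raw):
    """Split a 4-byte identifier back into (layer, leaf_id)."""
    if len(raw) != 4:
        raise ValueError("identifier must be exactly 4 bytes")
    value = int.from_bytes(raw, "big")
    return value >> 24, value & 0xFFFFFF


# Layer 1 and leaf id 42 are arbitrary example values.
raw = pack_id(1, 42)
print(raw.hex(), unpack_id(raw))
```

Three bytes give room for about 16.7 million leaf ids per layer.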
When making incompatible changes like deleting/renaming a Data Entry or changing its type in an incompatible way, then we will need to create a new major version of the VSS spec, won't we? An application that was built using, say, VSS version 3 can (in general) not be expected to work with VSS version 4 without any alterations, right? Consequently, I would assume that it would be ok to change the numeric identifiers in between major version changes in an incompatible way as well, or am I mistaken? IMHO this means that we can only uniquely identify a Data Entry by means of the combination of the VSS (major) version and the path identifier (e.g.
IMHO this will be analogous to how much you can change before you need to assign a new name. (and thus need to do it in a new major version). |
@sophokles73 - for transport purposes I believe you are correct, but if we want to use the identifier also for backend purposes it might be relevant, for example if a server either supports multiple VSS versions or needs to migrate stored historical data from version X to version Y. If signal X.Y changes type (and meaning) from bool to int, then old historical values do not make sense; in the backend database they must be treated as "different signals". On the other hand, if we move/rename "Vehicle.Speed" to "Vehicle.Status.Speed" we could theoretically reuse/keep/migrate the old values, if their meaning has not changed. |
I think one major question is: do you want identifiers to save bytes/processing for serialisation, or is it important that they also represent the underlying model?
The first case is easy, and may be all that is needed for many applications: you just hash the path name with a robust hash and use however many bytes you are comfortable with (with a static model you can even check for collisions, so very few bytes are ok). If doing so the identifier for
The other extreme is some "Merkle-style" hashing where you also hash all the VSS metadata and all children. That way the same hash on a given branch means exactly the same model beneath. That would be good to see that the "hey, Vehicle.ADAS.*" model really is 100% the same, but for practical purposes just adding one signal below destroys similarity all the way up. Maybe more practical is just doing it on a leaf basis: VSS metadata and path in a hash.
Tracking "movements" of data in the tree, however, is really hard with this. I am not sure there is a better option for that than really having a kind of "id database" created, that you ship with the spec, and where you could manually do such stuff, if you really want. I don't see a good way to do this automatically, because obviously going the hashing way to track moved data you cannot include the path, but then you also do not want the id to change if you e.g. fix a typo in a comment or description. If you leave all those volatile things out, suddenly the system would determine that everything that is "uint8 with min 0 max 100" is really "the same". But as pointed out already, maybe there is also not a real use case for that, because if the model is changed that way, you up the version number/mark it as different.
I think there is no silver bullet here, but for the OP's request of "using it for more efficient serialisation/addressing", I feel hashing paths and making sure via deployment/tech stack that both sides are on the "same" VSS model is best.
That would even be more robust than the "numbering" scheme in cases where there are composite models, where stuff is added via e.g. overlays, or left out, because as long as you don't CHANGE the metadata of a datapoint, it can still be reliably referenced. |
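The two hashing variants described above (path only, and path plus metadata per leaf) could look roughly like the following sketch. SHA-256, the 3-byte truncation, the function names and the metadata keys are all assumptions chosen for illustration; the comment does not prescribe a hash or a length.

```python
import hashlib
import json


def path_id(path, num_bytes=3):
    """Identifier derived from the path only: a truncated SHA-256 digest."""
    return hashlib.sha256(path.encode("utf-8")).digest()[:num_bytes]


def leaf_id(path, metadata, num_bytes=3):
    """Identifier derived from the path plus selected metadata of one leaf."""
    canonical = json.dumps({"path": path, "metadata": metadata}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).digest()[:num_bytes]


def build_id_table(paths, num_bytes=3):
    """Map identifiers to paths and fail loudly on a collision.

    With a static model this check can run at build time, which is what
    makes very short identifiers acceptable.
    """
    table = {}
    for path in paths:
        ident = path_id(path, num_bytes)
        if ident in table and table[ident] != path:
            raise ValueError(f"collision between {table[ident]!r} and {path!r}")
        table[ident] = path
    return table


paths = [
    "Vehicle.Speed",
    "Vehicle.Status.Speed",
    "Vehicle.VehicleIdentification.Year",
]
table = build_id_table(paths)
for ident, path in table.items():
    print(ident.hex(), path)
```

Note that "Vehicle.Speed" and "Vehicle.Status.Speed" get unrelated identifiers, which illustrates why tracking a moved signal is hard with path-based hashing.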
My original concern in this issue was:
I do not really understand how the discussion about moving signals across the tree is related to this problem as FMPOV doing something like that will always result in a breaking change which would result in a major version change. So I wonder if automagical migration of data across major version changes actually is a use case/requirement? So far I haven't read anything about that in the context of VSS ... However, the problem I have stated above is a real world issue/concern that I ran into as soon as I started transmitting any VSS data between components that have not been implemented as part of the same project/system. |
Warning - very long comment! Feedback if this would be a reasonable approach is welcome! I came up with a possible idea for managing unique identifiers and handling version control.
If a new signal is added to the standard catalog the list needs to be extended.
If a signal is renamed but the semantic meaning and hash remain, then you can just add a new line with the same
That would practically mean that On the other hand, if the meaning of a signal changes, for example a new unit or a new description, the
But if the change affects hash but semantics are the same we could just add the new hash but with the old id
This would work for instances as well, like PassengerSide/DriverSide example.
One could even think of id-ranges so that any custom signals added must have ID>0xFFFF to avoid possible collision with future VSS standard signals. A file like this could potentially be useful also in cases where you do not need the |
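The rules above can be sketched as an append-only list of (id, path, hash) rows. The class name, the row layout, the hash strings and the assumption that standard ids live in 1..0xFFFF are all hypothetical; only the rule that custom signals use ids above 0xFFFF comes from the comment.

```python
STANDARD_MAX_ID = 0xFFFF  # custom signals must use ids above this value


class IdRegistry:
    """Append-only list of (id, path, hash) rows; rows are never removed."""

    def __init__(self):
        self.rows = []

    def add(self, signal_id, path, signal_hash, custom=False):
        if custom and signal_id <= STANDARD_MAX_ID:
            raise ValueError("custom signals must use ids above 0xFFFF")
        if not custom and not 0 < signal_id <= STANDARD_MAX_ID:
            # Assumption: standard ids occupy the range 1..0xFFFF.
            raise ValueError("standard signals must use ids in 1..0xFFFF")
        self.rows.append((signal_id, path, signal_hash))

    def lookup(self, path, signal_hash):
        """Return the id for a (path, hash) pair, or None if unknown."""
        for signal_id, row_path, row_hash in self.rows:
            if row_path == path and row_hash == signal_hash:
                return signal_id
        return None


registry = IdRegistry()
registry.add(1, "Vehicle.Speed", "hash-a")
# Renamed, same semantics and same hash: new row, same id.
registry.add(1, "Vehicle.Status.Speed", "hash-a")
# Hash changed but semantics unchanged: new hash, old id.
registry.add(1, "Vehicle.Status.Speed", "hash-b")
# Custom signal outside the standard range (path is made up).
registry.add(0x10000, "Vehicle.Private.MySignal", "hash-c", custom=True)

print(registry.lookup("Vehicle.Speed", "hash-a"))
print(registry.lookup("Vehicle.Status.Speed", "hash-b"))
```

Because old rows stay in the list, data recorded under an earlier path or hash can still be resolved to the same id.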
I started scribbling something similar. I wanted to use yaml here, and then attach IDs to the tree with an overlay. In this case even instances would not be too complicated to handle. As you mention, an additional check is needed when a leaf is moved but the ID does not fit, or when the datatype changed. We can check that in the tooling. Also, in a yaml structure it would be easy to append leaf changes and comments. |
Another alternative that is implemented at the VISSv2 reference implementation as an experimental compression is to create an array of all leaf node paths in the tree, and then sort it. The index into the array can then be used to uniquely represent the path of each leaf node. This can be extended to include all nodes, not only the leaf nodes. Encoding/decoding is quite efficient. The hashing operation proposed in other alternatives is here instead a sorting operation. A uint16, two bytes, is sufficient for trees with max 65535 leaf nodes. |
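The sorted-array idea can be sketched as follows. This is not the VISSv2 reference implementation, just an illustration of the technique; the function names and the big-endian uint16 encoding are assumptions.

```python
import struct


def build_index(leaf_paths):
    """Return the sorted list of unique leaf paths; the position is the id."""
    ordered = sorted(set(leaf_paths))
    if len(ordered) > 65535:
        raise ValueError("too many leaf nodes for a uint16 index")
    return ordered


def encode_path(ordered, path):
    """Encode a path as a 2-byte big-endian index into the sorted list."""
    return struct.pack(">H", ordered.index(path))


def decode_path(ordered, raw):
    """Decode a 2-byte index back into the path."""
    (index,) = struct.unpack(">H", raw)
    return ordered[index]


ordered = build_index([
    "Vehicle.Speed",
    "Vehicle.VehicleIdentification.Year",
    "Vehicle.Powertrain.CombustionEngine.DieselExhaustFluid.Level",
])
raw = encode_path(ordered, "Vehicle.Speed")
print(raw.hex(), decode_path(ordered, raw))
```

Both end points must build the list from exactly the same set of paths, otherwise the indices do not line up. For large trees a dictionary from path to index would replace the linear `ordered.index` lookup.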
Could you please provide a link or an example? If I try to follow the explanation, would this not already cause issues if vehicles are not configured with exactly the same number of leaves with the same names? I agree that 2 bytes would be enough, especially if you combine it with layer mapping. |
What about adding a new signal to an existing node? This represents a backward compatible change to the VSS tree but would most likely screw up the array index, wouldn't it? If we were using the array index as the property IDs in the protobuf file this would lead to a non-backward compatible protobuf definition, wouldn't it? |
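The concern can be illustrated with a small sketch: adding one leaf to a sorted list changes the index of every path that sorts after it. The helper name `index_map` and the two example trees are assumptions for illustration.

```python
def index_map(leaf_paths):
    """Map each path to its position in the sorted list of unique paths."""
    return {path: i for i, path in enumerate(sorted(set(leaf_paths)))}


tree_v1 = [
    "Vehicle.Speed",
    "Vehicle.VehicleIdentification.Year",
]
# A backward compatible change to the tree: one new leaf is added.
tree_v2 = tree_v1 + [
    "Vehicle.Powertrain.CombustionEngine.DieselExhaustFluid.Level",
]

old = index_map(tree_v1)
new = index_map(tree_v2)

# Paths present in both trees whose index changed.
shifted = {p: (old[p], new[p]) for p in old if old[p] != new[p]}
print(shifted)
```

In this example both existing leaves get a new index, so a protobuf definition that used the index as a field number would not stay backward compatible.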
@adobekan Regarding an example, the client on this link implements it, in the protobuf compression among a few different compression experiments. |
@sophokles73 If a new node is added to an existing tree, the tree should also have a version update. Assuming that this new version of the tree is accessible by both end points, then a version synchronization like described above should fix it. |
Looking at the proto file, you basically have something like a hashmap, but you are not using the benefits of protobuf when it comes to reducing message size. You are still using the path as identifier, and the payload is always a string, which can be quite dangerous on version changes. I would suggest, at least in this approach, defining Value as a oneof, which is supported in proto. The other challenge when it comes to arrays, sorting and compression is related to the number of leaves: we cannot assume that each vehicle will support all leaves. This is not related to the version of VSS. SeatHeating status might not exist in every single vehicle in the fleet; it might just not be there as a feature, and then this might cause additional challenges. Of course one can always think about ways to handle this, keep 10k different variations for 30 million vehicles and involve some handshake process. This is why I would prefer to have small 2-3 byte static IDs assigned to leaves, not directly in the vspec files. Then you can get close to the numbers of static binary serialization when it comes to message size, but also keep historical tracking of each leaf. |
If you look at the DataPackages message below, which is what is sent back in the response, path can there be an int32. message DataPackages {
} |
I was wondering if there is a standard mechanism/format for serializing VSS data to a byte array/stream.
This could be used to transfer signal data from a vehicle to a back end application using e.g. MQTT and or HTTP.
Ideally, such a format would not be overly verbose. In particular, the VSS data entry names like
Vehicle.Powertrain.CombustionEngine.DieselExhaustFluid.Level
could increase the payload size dramatically, so some form of meta data based serialization like protobuf comes to mind. However, I haven't (yet) found a corresponding protobuf definition, or am I mistaken?