New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POSITIONING: Added tooling for CSV support #131
Conversation
Below are two excerpts from the YAML together with the CSV that it generates. Another issue is shown by the following CSV rows: A third issue is found in the second example below, in the CSV row addressing the node
|
The instance is the one for discussion here. I just thought it's easier talking about it with an example. Thanks for the bug reports. I'll fix them. I'd propose to go in two steps. First let's discuss the specification in #130 and then on the tooling and representation, to eventually close #81 :) |
I think the hard one is the resulting tree, not the YAML, so I think they need to be discussed together, or start with the Tool result. |
This is a bug, I'll fix. Thinking of dynamic registration it should anyway only be allowed on branch nodes I suppose. I'd see the instances as solely extensions on the physical tree. I changed the code at some point, because I had duplications in the output and then I introduced this bug. Please don't consider it as part of the spec. |
Good. With that node physically present in the tree I think the new node type "instance" is Ok. The properties in an instance node I think should only be the minimum mandatory - name, type, descr, as it inherits the properties of its "parent leaf node".
You are right.
They should be generated, but I think they are a good example of that properties from parent leaf should not be copied to them. This is logically equivalent to a branch node, to then have datatype and unit properties in it is confusing. |
@UlfBj: I changed now the code and fixed the bugs. We should still discuss the instance representation. Two things in that regard:
|
I think it should not be present. Only the VSS mandatory properties should be present: name, nodetyp, and description (the latter being something like "Instance of 'node name' where node name is the "leaf" node name it instantiates.
I think the instantiated nodes shall be addressable, so yes. A client should not have to handle them differently than any other node. |
Makes sense. And the protocol decides how they are served then. I'll change the PR then. |
In the latest CSV it looks like below
Should the instances have the same uuid "5057" as its parent leaf? |
I browsed again through the positioning thread and maybe the dot notation isn't the best way to use for the instances, when I think of addressing them. If we add a field to the CSV including the instance information, like you described here, that would leave the addressing to the implementation. |
Could you please exemplify? |
Of course :) That's what I thought:
|
In my understanding of the example above, there is room for many interpretations. For the last twoquestions, the answer could be that it is left to the implementation. |
The current data model with its parent-child relationship will have to be extended with the "instanceOf" relationship, which I think "costs more than it tastes". |
I think I agree with Ulf. Overall, we started with what seemed to me a fairly simple extension/instance mechanism in #81. And primarily it was there to reduce duplication. We then found that the devil was in the details. Of course it can be useful to draw the instance relationship (like the picture you showed) just for our shared understanding, so that's fine. But overall, my wish for this is that we don't complicate VSS itself and therefore we in particular don't "highlight" a lot of complex computer-science theory unless truly required for the model definition. I mean to avoid introducing these concepts too much into the actual model explanation, and into its usage. It's so easy to over-engineer things and difficult to keep things simple. VSS should still be expressive enough, so we need just enough, but don't fall into the trap of growing the number of concepts too much. (I'm in fact already a little skeptical about the many branch types and on, but that is a separate issue). I also feel that "equivalence" is key here. I thought it should not matter much if something is an instance or not? As Ulf implies, an instance just becomes a child. Isn't it that simple? It's just a way to efficiently avoid duplication in the definition of the tree? Would you agree first that it should be a free choice of the data model designer, and the end result should be equivalent:
should yield nodes:
each of the nodes having basically identical/duplicated metadata (except the instances of course), that is the following:
and I think this should be equivalent, in usage, to if the data model designer chose to write this in YAML instead:
With that said, I don't immediately like to focus on the CSV format that much. I suppose maybe in communication between humans and as documentation it can be nice to bring it into spreadsheet. But mostly I see this just as a way to generate to see what effect the instantiation had? Is that also your intention with the PR, Daniel? Or what do we think is really the intention of the CSV format, strictly speaking. The tool has been there for a while, but why? Just because? If that CSV format makes a note of that a node was created as an instantiation instead of being listed in the original YAML (as done in Daniel's example) I can of course see how this information can be useful to have - just to know it came from an instance. But the rest should show equivalence. In your example however:
...here you are writing "instance" in the same column as the type "sensor" and thereby suggesting it is a new type of node. No, I don't like it really... The type of this instance is surely still "sensor" isn't it? As already mentioned, separate nodes with the same UUID feels also not right. Each one must be separately addressable and if we allow for UUID addressing it must be unique. It's not that you could not generate a CSV like this - you could generate like 10 different formats if you wish (again, for understanding, documentation, whatever the user feels like) but if we are speaking of a definition of the model and we use it to explain VSS to adopters then I do not like it if it adds new concepts and more complexity. The most typical expansion of the YAML file into its instances (and I guess therefore the most typical TXT or CSV format file) should in my opinion list 4 lines in which the full name is fully expanded:
In fact, I thought this has been very clear in examples from both me and Ulf in #81? Stepping back from this to the big picture, we must focus on simplicity -- not to become the most complex and expressive model ever. I don't think VSS will ever become a full fledged object model. For a perfect and fully formal model we would have to use or define a much more capable modelling language, and that becomes too much here, IMHO. Also don't forget that in the grander scheme of things, VSS is complemented by things like Franca IDL (which covers functions/methods, interface hierarchies, and arbitrarily complex structs / data types as well), Franca Plus (which covers components and architectures, in principle replacing parts of UML even), and therefore I think VSS should aim first to be simple but capable data/signal model that is directly usable with minimum problems. |
Sorry for the wall of text above, just spinning off of this into principles and thinking :)
This is the key I think. We need to consider how protocols and data model are related, it is fairly unaviodable in my opinion. I want us to envision that there is no new and different way to address instances, because it would then not have the equivalence I mentioned above. If we agreed that instances are quite simply children, then for every protocol, the way to address them is already defined (the same way as you address a child node). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(as above)
@UlfBj comments suggest that this...
...is fortunately not what the tool is actually generating. Maybe I missed some of the details on the discussion in between. Anyway, my comments was on Daniel's most recent entry so they should be valid.
Here we get to the crux of some different way of thinking I think. All in all, I care about what is written in the YAML. That is the formal spec and everything else is kind of just extra. As a parallel to explain how I think about this, in the internal representation of the data model inside of an actual program, it would be seem useful if instantiated nodes simply "have" all of their meta data (so you do not have to traverse a tree to find it) but that is of course completely an implementation detail. Both ways of doing it is possible and it does not affect external behavior. So that part has nothing to do with the spec definition. For any other generated format, such as CSV, as I mentioned I think they are just superfluous formats and have very little to do with a definition of the data model. Useful for discussion, documentation, etc. but not the formal files. So if we decide to duplicate the metadata in such a generated file, I don't see a problem with it. Actually in my view of instantiation that would be the "correct" definition of a generated/instantiated node. But that gets us to your actual opposition which is branch vs. leaf. I think details depend on which proposal we go for (in #81) but in general I see no problem with changing that metadata only, as needed (i.e. let a leaf become a branch if it gets instances below it), but I know this is where we differed in our thinking and proposals before. In any case, the key in my opinion is that some particular metadata (in this case node type only) is required to change to be consistent, just let it change. It does not mean to me that all other meta data (datatype, unit, whatever) cannot or should not be copied. (Again I'm not 100% comfortable with the word "copied"... because in some implementations this might be a non-issue even - it might make use of the YAML directly + semantic rules of how to interpret that YAML). So it all depends on what output format we are actually talking about as I see it. But I could imagine tooling that reads the YAML and outputs a CSV file to show the instances with all of their relevant data that they "inherited" from their parent being copied therein, and branch/leaf just defined to whatever it should be according to the model we agree upon. If you don't agree with this, I think I need further understanding of what you meant by "not present". |
I am talking about the Tool output. |
Yes, that's it! Further, the dot notation shows how it should be addressed
The idea with the instances came from the fact that on a protocol level you should be able to decide to just address all left tires, or in opposite all of row1, but I think you are right with the over-engineering.
This looks good and logical to me. But I second to have the meta-data there. @UlfBj, can you agree to this. I only would like to have a column in the csv, where it's mentioned that it's an instance, please :) |
I am soon at a level where I agree to anything and everything:) |
We could list the differences between our views, and then vote on them separately.
|
Yeah I thought that might be the thinking behind. But in my opinion the query syntax, for example simple wildcards (part of protocol definition, not VSS definition) already solve this problem. |
It's a good start for a list, Ulf. But this one... I wonder just to understand what you envision on this one:
I'm still not sure how you mean the alternative can work. I believe an instance must copy all characteristics of the defining node, otherwise they are not really instances? The scary thing here is that we get into complexity because this suggests in my mind the following question: Shall overriding be allowed? E.g. by defining an explicit node in YAML that overlaps an instance and redefines some property. If you cannot add or change the non-mandatory properties somehow, then they would simply be missing if they are not copied. In that case I imagine getting a full copy of the "parent" is really the right way (maybe exception is branch/leaf as indicated before) Otherwise I feel like the net effect is that instantiation is only useful for nodes that do not have any additional/proprietary/non-mandatory properties? Sorry if I missed it but is there anyone (you) proposing the copying of mandatory properties only? Or was it just a possibility you thought of now. |
@danielwilms wrote
Yes, after some rambling this is also what I found to be OK. I think the type should be equal to the "parent" (wherever the instances are defined), but indicating that the node is created as a result of an instance definition is just good additional information. |
These words are the key. In my world view there is nothing redundant because the YAML is king. Honestly in any auto-generated information (whether it is CSV, source code, or the binary data structure that a particular implementation chooses to store the tree) I don't think we should care whether there is redundancy. There might be... There will be... And there's really no controlling that if you look at the entire chain all the way to a compiled program. What are the main reasons to avoid redundancy in the definition of computer systems? In my opinion the following: A) Avoid any risk of inconsistent or incorrect results Then there are two points to highlight with regard to generated results, IMHO:
Both 1) and 2) independently eliminate or make non-applicable the concerns of A) and B), as I see it. |
I agree, only the resulting dot notation helps to interpret the specification correctly by visualising it, especially after instantiation.
I do focus on it more, because that's what developers see and look at, because this is what they have to use at the end. If I talk about it, people ask immediately: show me the tree. So, the serialization is used for communication, the yaml to formally describe it. |
Yes, I also do. Besides the reasons given by @danielwilms above, another argument for keeping redundancy low in the Tool output tree is that it is transmitted to the client as the response to a getmetadata request, so it reduces traffic load.
Whether you copy the metadata into the instance or not, logically the instance inherits it from its (YAML) parent node. So if it is not copied, the server parsing the instance node needs to go to the parent node for the metadata. Logically no difference, implementation wise a small difference. Shall we try to list the differences, or how can we come to a final resolution? |
165dfaf
to
e94c019
Compare
After some thinking and discussions I hope I can close the two PRs with this proposal. I tried to solve especially Ulfs concerns about a leaf not being the one carrying the data, but rather acting as a branch. As base I take the spec from #130. I this PR I introduced a new field "complex":
At the end you create with the instances a kind of typed list of items. |
+1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to me as well
@danielwilms Asked me to have a look at this PR (and #130). It seems like you have it covered. My only question is why instances are implemented in That said, since this has been a long run, we could approve this PR and then do a second run at migrating the code over to |
@magnusfeuer: you're right. Will open an issue after it's merged and move it. |
- introduced extensions - removed duplication from the vspec file - tooling separated for discussion, on how it should look like
* Previously a branch input was introduced to avoid adding instances to leafs. Didn't make sense and is now corrected * 'extensions' didn't fit. Renamed it to 'instances'
First attempt on how the tooling could look like with the positions as instances. Dot notation might not be the best form. But as discussion item for now to conclude COVESA#81.
* not all instances had been named before * some nodes didn't appear * some nodes too much (e.g. row[x] listed as separate instances)
(cherry picked from commit ed0d3edc305187073b6a51bbf878d68d8a0ae0c3)
In order to map the instances to the csv structure, complex types are introduced. This means, that a leaf with attached instances is a typed list and that instances have to be anticipated. (cherry picked from commit 7bdf3f1789abd0edba9e72acdf04015e4b69a63f)
Looks OK to me. +1 on putting this in the common module of course. Your branch is now very outdated and rebasing would help. |
e94c019
to
57e7135
Compare
Oh, I just pushed the rebased results. I forgot I can do that on the GENIVI/ repositories |
@gunnarx, Thanks for the rebase. @magnusfeuer: I created #138 With approvals from @klotzbenjamin, @UlfBj, @tguild, @magnusfeuer and @gunnarx I'll merge. |
First attempt on how the tooling could look like
with the positions as instances. Dot notation might
not be the best form. But as discussion item for now
to conclude #81.