Serialization: Creation Information #80

Closed
davaya opened this issue Feb 10, 2023 · 5 comments
Labels
serialization Something about the representation of data in bytes

@davaya
Contributor

davaya commented Feb 10, 2023

Some time ago we moved creation-related properties from Element to the CreationInformation class because 1) they are related in purpose and 2) it makes Element easier on the eyes. But that raises the question of default values - if Element has a creationInfo property of type CreationInformation, then its value is treated as a single unit, not five or six separate property values. That won't work because the requirement is for each property to have an individual default value that doesn't need to be sent even if another property is not defaulted.

Two potential solutions:

  1. Move the individual creation-related properties back into Element
  2. Define a "macro" modeling notation that substitutes one value for another, e.g.:

        Element
        + SPDXID: IRI
        + name: String [0..1]
        + summary: String [0..1]
        + description: String [0..1]
        + comment: String [0..1]
        #include CreationInformation
        + verifiedUsing: IntegrityMethod [0..1]
        ...

where the #include macro substitutes the group of properties called CreationInformation into Element, and there is no actual CreationInformation class.
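A minimal sketch of how such a macro expansion might work, assuming a line-oriented model notation like the listing above. The `expand_includes` helper and the `created` property are hypothetical and used only for illustration; `creationComment` is the de-duplicated name mentioned under the cons of option 2.

```python
# Hypothetical property groups. The members listed here are placeholders
# for illustration, not the actual SPDX CreationInformation definition.
GROUPS = {
    "CreationInformation": [
        "+ created: DateTime",
        "+ creationComment: String [0..1]",
    ],
}


def expand_includes(lines, groups):
    """Replace each '#include Name' line with the properties of that group."""
    expanded = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#include "):
            group_name = stripped.split(maxsplit=1)[1]
            expanded.extend(groups[group_name])
        else:
            expanded.append(stripped)
    return expanded


ELEMENT = [
    "+ SPDXID: IRI",
    "+ name: String [0..1]",
    "#include CreationInformation",
    "+ verifiedUsing: IntegrityMethod [0..1]",
]

# After expansion Element is flat: there is no CreationInformation class,
# so each substituted property can carry its own default value.
flat_element = expand_includes(ELEMENT, GROUPS)
```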

If we did that, I'd shorten Element even more by defining a DescriptiveInformation macro with the summary, description, and comment properties.

Pros of #1: doesn't need any new modeling conventions or tooling support
Cons of #1: Element is bloated
Pros of #2: it's cool
Cons of #2: needs work to invent, macro properties must be de-duplicated (e.g. creationComment)

@goneall
Member

goneall commented Feb 10, 2023

> That won't work because the requirement is for each property to have an individual default value that doesn't need to be sent even if another property is not defaulted.

I'm of the opinion that individual default values do not make sense. One of the main purposes of a default value is to avoid repeating the same information for Elements created at the same time by the same creator, etc., in which case all of them would share the same CreationInfo defaults.

This opens up another solution where you have a default CreationInfo, and if you want a different value for any of the CreationInfo fields, you just create a separate CreationInfo for that element.
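A rough sketch of that idea, assuming a whole-unit default: an element either carries its own CreationInfo or falls back to the shared one. The class and field names (`created`, `created_by`, `comment`) and the example values are assumptions for illustration, not the SPDX model.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CreationInfo:
    # Field names are illustrative assumptions.
    created: str
    created_by: str
    comment: Optional[str] = None


@dataclass
class Element:
    spdx_id: str
    # None means "use the default CreationInfo as a single unit".
    creation_info: Optional[CreationInfo] = None


def effective_creation_info(element, default):
    """Return the element's own CreationInfo, or the shared default."""
    if element.creation_info is not None:
        return element.creation_info
    return default


default_info = CreationInfo(created="2023-02-10T00:00:00Z",
                            created_by="urn:example:agent-1")

a = Element("urn:example:a")
# An element that differs in one field gets a separate, complete CreationInfo.
b = Element("urn:example:b",
            creation_info=replace(default_info, comment="imported"))
```

There is no per-property merging here: the override is a complete CreationInfo, which is the point of treating the default as a single unit.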

@iamwillbar
Contributor

Capturing what I said during today's tech meeting:

At a logical level, every element has a creation information, with all the properties containing their final value.

At a serialization level, a serializer can do any optimization it desires (including none), if once deserialized every element has a creation information, with all the properties containing their final value (i.e. it complies with the logical model).

For example, a serializer may optimize for space by storing duplicative creation information once and using a pointer in the element. On deserialization it would resolve the pointer and attach the creation information to the element (it could retain the pointer if it wanted to minimize differences when serializing that content back out).
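A small sketch of the pointer optimization, assuming elements are plain dictionaries serialized as JSON. The `creationInfos` table, the `_:ci0`-style keys, and the property names are made up for illustration and are not a defined SPDX format.

```python
import json


def serialize_with_pointers(elements):
    """Store each distinct creation info once; elements refer to it by key."""
    key_by_canonical = {}  # canonical JSON text -> pointer key
    infos = {}             # pointer key -> creation info
    out = []
    for element in elements:
        canonical = json.dumps(element["creationInfo"], sort_keys=True)
        if canonical not in key_by_canonical:
            key = f"_:ci{len(key_by_canonical)}"
            key_by_canonical[canonical] = key
            infos[key] = element["creationInfo"]
        out.append({**element, "creationInfo": key_by_canonical[canonical]})
    return json.dumps({"creationInfos": infos, "elements": out})


def deserialize_with_pointers(text):
    """Resolve every pointer so each element has a full creation info again."""
    doc = json.loads(text)
    return [
        {**element,
         "creationInfo": doc["creationInfos"][element["creationInfo"]]}
        for element in doc["elements"]
    ]


shared = {"created": "2023-02-10T00:00:00Z",
          "createdBy": ["urn:example:agent-1"]}
elements = [
    {"spdxId": "urn:example:a", "creationInfo": shared},
    {"spdxId": "urn:example:b", "creationInfo": dict(shared)},
]
text = serialize_with_pointers(elements)
restored = deserialize_with_pointers(text)
```

After the round trip every element again holds a complete creation information, so the result complies with the logical model regardless of the optimization.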

As another example, a serializer that uses hierarchies may optimize for space by supporting per-property inheritance, storing common values at the highest level of the hierarchy. On deserialization (or even on access) it would walk the hierarchy to retrieve the value of any property in the creation information.
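The per-property inheritance case could look roughly like this, assuming a simple parent-linked tree. The `Node` class and the property names are hypothetical.

```python
class Node:
    """A node in a serialization hierarchy holding only locally set values."""

    def __init__(self, name, parent=None, **creation_props):
        self.name = name
        self.parent = parent
        self.creation_props = creation_props

    def creation_value(self, prop):
        """Walk up the hierarchy until some level defines the property."""
        node = self
        while node is not None:
            if prop in node.creation_props:
                return node.creation_props[prop]
            node = node.parent
        raise KeyError(prop)


# Common values live at the top; lower levels store only what differs.
document = Node("document",
                created="2023-02-10T00:00:00Z",
                createdBy="urn:example:agent-1")
package = Node("package", parent=document, createdBy="urn:example:agent-2")
file_node = Node("file", parent=package)
```

Here `file_node` stores nothing itself: `created` resolves from the document and `createdBy` from the package, which is the nearest level that sets it.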

@davaya
Contributor Author

davaya commented Feb 19, 2023

At the logical level, every element has one creationInfo property of type CreationInformation.

> At a serialization level, a serializer can do any optimization it desires (including none), if once deserialized every element has a creation information, with all the properties containing their final value (i.e. it complies with the logical model).

True, but we are going to define serialization in a way that supports interworking. "A serializer" doesn't mean whatever a developer chooses, it means whatever SPDX documents as the standard. Whatever we document must support serialization of a single element, and the optimization algorithm for serializing multiple elements must be defined.

As I noted in https://github.com/davaya/spdx-3-model/tree/serialization/serialization, the conceptually simplest serialization would just concatenate the individual elements and then compress them (with gzip or whatever). When we define a serialized format it shouldn't be conceptually harder to understand than that.
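For comparison, the "concatenate and compress" baseline can be sketched in a few lines, assuming each element is a self-contained JSON object written one per line. The function names and element fields are illustrative only and are not taken from the linked branch.

```python
import gzip
import json


def serialize_concat(elements):
    """Write each complete element on its own line, then gzip the result."""
    lines = "\n".join(json.dumps(e, sort_keys=True) for e in elements)
    return gzip.compress(lines.encode("utf-8"))


def deserialize_concat(blob):
    """Decompress and parse one element per line; no pointers to resolve."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


creation_info = {"created": "2023-02-10T00:00:00Z",
                 "createdBy": ["urn:example:agent-1"]}
many_elements = [
    {"spdxId": f"urn:example:element-{i}", "creationInfo": creation_info}
    for i in range(100)
]
blob = serialize_concat(many_elements)
round_tripped = deserialize_concat(blob)
```

Every element is serialized in full, so a single element can be serialized the same way, and the repetition is left to the general-purpose compressor.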

@davaya
Contributor Author

davaya commented Feb 24, 2023

As William said, the logical model defines element properties, one of which is creationInfo. Every serialization must preserve the logical value of every element. The most effective approach to satisfying that requirement will be discussed by the Serialization (Thursday) and Canonicalization (Friday) teams.

@davaya davaya changed the title Creation Information: struct or macro? Serialization: Creation Information Feb 24, 2023
@maxhbr maxhbr added the serialization Something about the representation of data in bytes label Mar 24, 2023
@kestewart kestewart added this to the 3.0 milestone May 5, 2023
@goneall
Member

goneall commented Jul 28, 2023

I believe this has now been agreed to and resolved for JSON-LD.

Closing this issue.

@goneall goneall closed this as completed Jul 28, 2023