Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize JSON (De)serialization (Final Types et al) #482

Open
dselman opened this issue Aug 9, 2022 · 5 comments · May be fixed by #861
Open

Optimize JSON (De)serialization (Final Types et al) #482

dselman opened this issue Aug 9, 2022 · 5 comments · May be fixed by #861
Labels
Difficulty: Medium Type: Enhancement ✨ Improvement to process or efficiency

Comments

@dselman
Copy link
Sponsor Contributor

dselman commented Aug 9, 2022

Feature Request 🛍️

Use Case

Allow modellers to declare that types are "final" and cannot be extended. This allows for optimisation of the serialised JSON, as an array of final types can no longer be polymorphic.

The $class is required in serialised instances because concepts can be extended. We could support the addition of a final keyword which would indicate that a concept could not be extended, meaning that the type of an object can be inferred by introspecting the model.

Possible Solution

namespace acme

final concept Person {
   o String name
}

final concept Company {
   o String name
   o Person[] employees
}

const company = 
{
   "name" : "Clause",
   "employees" : [
      {"name" : "Dan"}, {"name", "Simon"}
   ]
}

// deserialize, passing in the name of the type explicitly
// during deserialization the model is introspected to know that company.employee is an Employee type (and cannot be a subclass).
serialiser.fromJSON( company, ''acme.Company' );

Context

Detailed Description

@dselman dselman added Difficulty: Medium Type: Enhancement ✨ Improvement to process or efficiency labels Aug 9, 2022
@dselman
Copy link
Sponsor Contributor Author

dselman commented Aug 26, 2022

Note that if $class is missing we could assume that we need to do a type lookup - if the type found is abstract then we fail with an error.

@dselman dselman changed the title Final Types Optimize JSON Serialization (Final Types et al) Nov 1, 2022
@dselman dselman changed the title Optimize JSON Serialization (Final Types et al) Optimize JSON (De)serialization (Final Types et al) Nov 1, 2022
@DS-AdamMilazzo
Copy link

DS-AdamMilazzo commented Nov 2, 2022

I would suggest that the "final" feature is not needed for this, and you could get the desired savings in more situations via a more general mechanism. Since objects are naturally processed and validated recursively, you should already know the parent's type when you encounter a field value. From the parent's type definition, you already know the declared type of the field. $class could simply be omitted wherever it matches the declared field type. Also, when only the type name within the $class differs from the declared field type, the namespace and version could be omitted, leaving an abbreviated $class field containing only the type name. For example:

abstract concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }

concept Thing {
  o Owner owner
}

The $class in the owner could simply be "HumanOwner" or "CorporateOwner". The namespace and version would be assumed to match that of the Owner type in that case.

I opened #542 to represent this broader mechanism, as this issue was still specific to the "final" feature last I looked.

@dselman
Copy link
Sponsor Contributor Author

dselman commented Feb 22, 2023

declared

Note that type determination based on the type of the property only works if you assume that the model doesn't change between when the instance was serialised and when it was deserialised. For example, if version 1 of the model is:

concept Owner { ... }

concept Thing {
  o Owner owner
}

It is assumed that no $class is required, because the owner property is of an (non-abstract) type Owner.

Version 2 of the model is:

concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }

concept Thing {
  o Owner owner
}

A $class is now required as a type discriminator to deserialise the Thing.owner property due to the potential usage of HumanOwner and CorporateOwner

The advantage of the final keyword (or sometimes called sealed) is that it makes it much clearer to the modeller than making a final type non-final is a breaking change - and that change occurs in the namespace of the type that is being declared "unfinal". Without final keyword any introduction of a derived type, in any namespace, is potentially a breaking change for deserialisation.

Reading #542 I note your suggestion that if $class is not present we assume that the instance if of the type of the property of the model used during deserialisation (not of the model used during serialiazation). That sounds like an interesting optimisation but does have interesting model evolution implications!

@DS-AdamMilazzo
Copy link

DS-AdamMilazzo commented Feb 22, 2023

Note that type determination based on the type of the property only works if you assume that the model doesn't change between when the instance was serialised and when it was deserialised.

I don't think that's true. It should work regardless. The $class of the top-level object - which is always required - declares the exact version of the top-level object that you're dealing with. If it's v1 then you know you're in the first case. If it's v2 then you know you're in the second case. There's no ambiguity or assumptions about the model not changing.

(Furthermore, I don't see why a $class would be required even in your v2 model, at least according to my proposed rule. Given that Owner is not abstract, a $class-less owner would still be legal and unambiguous. It would only become required if Owner became abstract.)

As for your final comment, isn't the whole system based on the assumption that specific model versions are immutable? A specific model version M@1.0.0 should be identical during serialization and deserialization. If people are mutating models without changing the version then all kinds of things will break.

I assume there is some fundamental misunderstanding here. Can you post an example model and data that is ambiguous or interesting according to the rules in #542 ?

@DS-AdamMilazzo
Copy link

DS-AdamMilazzo commented Sep 1, 2023

It seems that your understanding of my proposal is that we allow $class to be omitted when and only when there's exactly one concrete type that a field could possibly be, so that adding more possible types suddenly makes $class required where it wasn't before. Something like this:

abstract concept A { }
concept C {
  o A a // $class can be omitted as long as there's only one concrete subtype of A in existence anywhere
}

I agree that would be a bad idea, but it's not my proposal. My proposal is much simpler and more generally applicable. It allows $class to be omitted whenever a field type is not abstract, and when omitted, it is understood to be the exact concrete type the field is declared as. To revisit your v2 example above:

concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }

concept Thing {
  o Owner owner
}
  • Can $class be omitted for the "owner" field? Yes, because the field type, Owner, is not abstract.
  • What type is inferred when $class is omitted? It is only the exact field type, M@2.0.0.Owner, and none other! If you want to use HumanOwner or CorporateOwner, or even to use a different version of Owner (if one exists), then you need a $class.

I don't believe there is any form of model evolution for which this rule causes problems, assuming you're not mutating existing models without changing their versions, which can break all kinds of things. But if you have an example, I'd be glad to see it.

As for making such changes (to whether $class is required) obvious to the modeler, in my example the modeler would have to add or remove the abstract keyword - a breaking change - just like in your proposal they'd have to add or remove the final keyword (or, in either case, change the field type - another breaking change). There are no invisible/external or nonbreaking changes that can change the interpretation of the data. (In fact, since the models are versioned and immutable, there is no way to change the interpretation of the data, period.)

@dselman dselman linked a pull request Jun 15, 2024 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Difficulty: Medium Type: Enhancement ✨ Improvement to process or efficiency
Projects
Status: Next 12 Months
Development

Successfully merging a pull request may close this issue.

2 participants