feat(serializer): add inferClass option #861

Draft
wants to merge 7 commits into base: main

Conversation

dselman
Sponsor Contributor

@dselman dselman commented Jun 15, 2024

Closes #482 #542

Adds the option inferClass to the Serializer (false by default). When this option is true, $class is only included in the JSON created for Resource objects in the following circumstances:

  1. The object is a root object. Root objects always have a $class attribute so that they are self-describing.
  2. A nested object is not of the same type as the type of its field in the model.

In addition, when a $class attribute is included, it is shortened (the namespace is removed) if the namespace of the Resource is the same as the namespace of the property's type (see examples).
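
The rules above can be sketched as a single decision function. This is a minimal illustration only: `splitFqn` and `classForJson` are hypothetical names, not part of the Concerto API, and the sketch assumes that a fully qualified name is a namespace and a short name joined by the last dot.

```javascript
// Hypothetical helpers for illustration; not the Concerto API.

// Split a fully qualified name such as 'org.acme.zoo@1.0.0.Dog'
// into its namespace and short name at the last dot.
function splitFqn(fqn) {
    const i = fqn.lastIndexOf('.');
    return { namespace: fqn.slice(0, i), name: fqn.slice(i + 1) };
}

// Decide what to emit for $class when inferClass is true.
// Returns undefined when $class can be omitted.
function classForJson(actualFqn, declaredFqn, isRoot) {
    if (isRoot) {
        return actualFqn; // root objects stay self-describing
    }
    if (actualFqn === declaredFqn) {
        return undefined; // same as the field type, so it can be inferred
    }
    const actual = splitFqn(actualFqn);
    const declared = splitFqn(declaredFqn);
    // Same namespace as the field type: the short name is enough.
    return actual.namespace === declared.namespace ? actual.name : actualFqn;
}

// Owner stored in a Person field, same namespace: short name.
console.log(classForJson('org.acme.zoo@1.0.0.Owner', 'org.acme.zoo@1.0.0.Person', false)); // Owner
// Cat stored in an Animal field, different namespace: full name.
console.log(classForJson('org.acme.cat@1.0.0.Cat', 'org.acme.zoo@1.0.0.Animal', false)); // org.acme.cat@1.0.0.Cat
// Address stored in an Address field: omitted.
console.log(classForJson('org.acme.zoo@1.0.0.Address', 'org.acme.zoo@1.0.0.Address', false)); // undefined
```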

Changes

  • JSONGenerator updated to remove $class from JSON when it can be inferred
  • JSONPopulator updated to infer $class FQN for objects when it is short or missing
  • Unit tests

Flags

  • Do we want to only support inferClass when model manager is in strict mode?

Examples

Using these two model files:

            namespace org.acme.zoo@1.0.0

            abstract concept Animal {
               o String name
            }

            concept Address {
               o String line1
               o String line2 optional
               o String city
               o String state
               o String country
            }

            // a type that extends Animal, in the same ns as Animal
            concept Dog extends Animal{}

            abstract concept Person {
               o Address address optional // can be inferred from model
               o String name
            }

            concept Owner extends Person {
               o Integer age
            }

            concept Zoo {
               o Person person // $class cannot be inferred from model, as `Person` is abstract
               o Animal[] animals // $class cannot be inferred from model
            }

And:

            namespace org.acme.cat@1.0.0
            import org.acme.zoo@1.0.0.{Animal}
            // a type that extends Animal in a different namespace
            concept Cat extends Animal{}

Example 1

This example will deserialize with inferClass=true.

  1. The person property has the $class value Owner because its value is not of the field's type (Person) but of type org.acme.zoo@1.0.0.Owner. That type is in the same namespace as the field type, so $class is shortened to just Owner.
  2. The first entry in the animals array has a $class of Dog. A $class is required because the type of the property is Animal and there are two types that specialise Animal: Cat and Dog, so a discriminator is required.
  3. The $class is Dog, not org.acme.zoo@1.0.0.Dog, because the types Dog and Animal are defined in the same namespace. During deserialisation the serializer assumes that $class short names are defined in the same namespace as the type of their property in the model.
  4. No $class is required for person.address because the type of the field, org.acme.zoo@1.0.0.Address, is the same as the type of the resource.

            {
                $class: 'org.acme.zoo@1.0.0.Zoo',
                person: {
                    $class: 'Owner',
                    name: 'Dan',
                    age: 42,
                    address: {
                        line1: '1 Main Street',
                        city: 'Boston',
                        state: 'MA',
                        country: 'USA'
                    }
                },
                animals: [
                    { $class: 'Dog', name: 'fido' }
                ]
            }
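
On the deserialisation side, the resolution described in this example could look roughly like the following. It is a sketch, not the JSONPopulator code: `resolveClass` is a hypothetical name, and treating any value that contains a dot as already fully qualified is an assumption made for the illustration.

```javascript
// Hypothetical sketch of $class resolution during deserialisation.
// rawClass is the $class value found in the JSON (possibly undefined);
// fieldTypeFqn is the fully qualified type of the field in the model.
function resolveClass(rawClass, fieldTypeFqn) {
    if (rawClass === undefined) {
        return fieldTypeFqn; // missing: assume the type of the field
    }
    if (rawClass.includes('.')) {
        return rawClass; // assumed to be fully qualified already
    }
    // Short name: assume the namespace of the field's type.
    const namespace = fieldTypeFqn.slice(0, fieldTypeFqn.lastIndexOf('.'));
    return `${namespace}.${rawClass}`;
}

console.log(resolveClass('Owner', 'org.acme.zoo@1.0.0.Person')); // org.acme.zoo@1.0.0.Owner
console.log(resolveClass(undefined, 'org.acme.zoo@1.0.0.Address')); // org.acme.zoo@1.0.0.Address
```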

Example 2

This example will deserialize with inferClass=true.

  1. The $class is org.acme.cat@1.0.0.Cat because the type Cat and the type Animal are not defined in the same namespace.

            {
                $class: 'org.acme.zoo@1.0.0.Zoo',
                person: {
                    $class: 'Owner',
                    name: 'Dan',
                    age: 42
                },
                animals: [
                    {
                        $class: 'org.acme.cat@1.0.0.Cat',
                        name: 'tiddles'
                    }
                ]
            }

Example 3

This example will deserialize with inferClass=true.

  1. This is for backwards compatibility and shows that a fully qualified $class can still be explicitly provided for the person property and for the first element in the animals array.

            {
                $class: 'org.acme.zoo@1.0.0.Zoo',
                person: {
                    $class: 'org.acme.zoo@1.0.0.Owner',
                    name: 'Dan',
                    age: 42
                },
                animals: [
                    {
                        $class: 'org.acme.zoo@1.0.0.Dog',
                        name: 'fido'
                    }
                ]
            }

Example 4

Here is a meta model instance with inferClass=true:

{
  "$class": "concerto.metamodel@1.0.0.Model",
  "namespace": "Lorem nisi enim enim.",
  "sourceUri": "Cupidatat officia laborum sunt incididunt.",
  "concertoVersion": "Veniam.",
  "imports": [
    {
      "$class": "ImportAll",
      "namespace": "Duis sint.",
      "uri": "Ea sunt reprehenderit."
    }
  ],
  "declarations": [
    {
      "$class": "MapDeclaration",
      "key": {
        "$class": "StringMapKeyType",
        "decorators": [
          {
            "name": "Incididunt excepteur nostrud enim.",
            "arguments": [
              {
                "$class": "DecoratorString",
                "value": "Cillum occaecat aute aute.",
                "location": {
                  "start": {
                    "line": 38511,
                    "column": 43930,
                    "offset": 23033
                  },
                  "end": {
                    "line": 36502,
                    "column": 39816,
                    "offset": 19486
                  },
                  "source": "Labore tempor et aliquip mollit."
                }
              }
            ],
            "location": {
              "start": {
                "line": 42391,
                "column": 3627,
                "offset": 33958
              },
              "end": {
                "line": 983,
                "column": 40090,
                "offset": 25069
              },
              "source": "Excepteur tempor veniam pariatur."
            }
          }
        ],
        "location": {
          "start": {
            "line": 61309,
            "column": 47612,
            "offset": 47848
          },
          "end": {
            "line": 4788,
            "column": 48475,
            "offset": 41651
          },
          "source": "Non anim nisi ipsum occaecat."
        }
      },
      "value": {
        "$class": "BooleanMapValueType",
        "decorators": [
          {
            "name": "Commodo ut tempor eiusmod.",
            "arguments": [
              {
                "$class": "DecoratorString",
                "value": "Nulla ea sit pariatur incididunt.",
                "location": {
                  "start": {
                    "line": 32231,
                    "column": 39463,
                    "offset": 11115
                  },
                  "end": {
                    "line": 5812,
                    "column": 46231,
                    "offset": 50449
                  },
                  "source": "Ut cupidatat nisi duis elit."
                }
              }
            ],
            "location": {
              "start": {
                "line": 44063,
                "column": 49727,
                "offset": 33761
              },
              "end": {
                "line": 51897,
                "column": 32889,
                "offset": 39320
              },
              "source": "Pariatur laboris adipisicing."
            }
          }
        ],
        "location": {
          "start": {
            "line": 37476,
            "column": 16394,
            "offset": 12273
          },
          "end": {
            "line": 50689,
            "column": 63232,
            "offset": 29302
          },
          "source": "In veniam anim ut."
        }
      },
      "name": "Foo",
      "decorators": [
        {
          "name": "Labore cillum.",
          "arguments": [
            {
              "$class": "DecoratorString",
              "value": "Qui.",
              "location": {
                "start": {
                  "line": 8643,
                  "column": 22128,
                  "offset": 37375
                },
                "end": {
                  "line": 35202,
                  "column": 14646,
                  "offset": 36723
                },
                "source": "Sint eu tempor sint."
              }
            }
          ],
          "location": {
            "start": {
              "line": 63155,
              "column": 29327,
              "offset": 52746
            },
            "end": {
              "line": 58986,
              "column": 61943,
              "offset": 29968
            },
            "source": "Sint excepteur sunt elit."
          }
        }
      ],
      "location": {
        "start": {
          "line": 16561,
          "column": 54945,
          "offset": 30256
        },
        "end": {
          "line": 8976,
          "column": 9505,
          "offset": 43034
        },
        "source": "Labore."
      }
    }
  ],
  "decorators": [
    {
      "name": "Pariatur aute voluptate id.",
      "arguments": [
        {
          "$class": "DecoratorString",
          "value": "Sit.",
          "location": {
            "start": {
              "line": 11878,
              "column": 38094,
              "offset": 16933
            },
            "end": {
              "line": 12942,
              "column": 61905,
              "offset": 17270
            },
            "source": "Anim occaecat proident."
          }
        }
      ],
      "location": {
        "start": {
          "line": 26284,
          "column": 63143,
          "offset": 13162
        },
        "end": {
          "line": 28658,
          "column": 38157,
          "offset": 63188
        },
        "source": "Id nisi."
      }
    }
  ]
}

Related Issues

  • Issue #
  • Pull Request #

Author Checklist

  • Ensure you provide a DCO sign-off for your commits using the --signoff option of git commit.
  • Vital features and changes captured in unit and/or integration tests
  • Commits messages follow AP format
  • Extend the documentation, if necessary
  • Merging to main from fork:branchname

Signed-off-by: Dan Selman <danscode@selman.org>
@dselman dselman self-assigned this Jun 15, 2024
@dselman dselman marked this pull request as draft June 15, 2024 12:16
@dselman dselman added labels Type: Feature Request 🛍️ (New feature or request), Difficulty: Medium, Type: Enhancement ✨ (Improvement to process or efficiency) Jun 15, 2024
Signed-off-by: Dan Selman <danscode@selman.org>
@mttrbrts
Member

> Do we want to only support inferClass when model manager is in strict mode?

Yes, I say so

> Do we want to be stricter in fromJSON with inferClass=false to ensure that $class is always present and is FQN, or is it ok to attempt type inference in all cases?

We need to be backwards compatible. So I say, no for v3, but this sounds reasonable for v4

@mttrbrts
Member

This change has the side effect of making some types implicitly final, if they weren't in scope of the model manager when the serialisation occurred.

For example, it's possible that a namespace is unambiguous to one client, and ambiguous to another.
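
The concern can be illustrated with a toy type registry. This is a sketch under an assumed inference strategy (omit $class whenever exactly one concrete type fits the field). The registries and function names are hypothetical and are not Concerto's implementation.

```javascript
// Toy type registry: short type name -> { superType, isAbstract }.
const clientA = {
    Animal: { superType: null, isAbstract: true },
    Dog: { superType: 'Animal', isAbstract: false },
};
// Client B has loaded one more model file, which adds Cat.
const clientB = {
    ...clientA,
    Cat: { superType: 'Animal', isAbstract: false },
};

// True if `name` is `target` or descends from it.
function isAssignable(name, target, types) {
    for (let t = name; t !== null; t = types[t].superType) {
        if (t === target) {
            return true;
        }
    }
    return false;
}

// All concrete types that could appear in a field declared as `declared`.
function concreteCandidates(declared, types) {
    return Object.keys(types)
        .filter((n) => !types[n].isAbstract && isAssignable(n, declared, types))
        .sort();
}

// Infer the type only when exactly one candidate exists.
function inferOrUndefined(declared, types) {
    const candidates = concreteCandidates(declared, types);
    return candidates.length === 1 ? candidates[0] : undefined;
}

console.log(inferOrUndefined('Animal', clientA)); // Dog
console.log(inferOrUndefined('Animal', clientB)); // undefined
```

JSON written by client A without a $class for an Animal field is unambiguous to A but cannot be resolved by B, which is the sense in which Dog becomes implicitly final.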

@dselman
Sponsor Contributor Author

dselman commented Jun 15, 2024

> This change has the side effect of making some types implicitly final, if they weren't in scope of the model manager when the serialisation occurred.
>
> For example, it's possible that a namespace is unambiguous to one client, and ambiguous to another.

Yes, this one is worrying. I don't see a good solution to this, insofar as adding a new model file to the model manager could have a side effect on the serialisation of types, other than adding an explicit final, but that would seriously limit the scope of the optimisation. So the assumption is that the model manager at T0, when the JSON was generated by toJSON, contains the same model files as at T1, when fromJSON is called.

One solution could be to store the unambiguous "final types" on the root node when we call toJSON. During the call to fromJSON we would use those stored types, assuming that any now ambiguous types were instances of those... Not pretty.

I added a test case for this...

Signed-off-by: Dan Selman <danscode@selman.org>
Signed-off-by: Dan Selman <danscode@selman.org>
@mttrbrts
Member

mttrbrts commented Jun 16, 2024

In the real world, I see this optimization as hugely beneficial for fixed-scope scenarios, such as the following:

  1. Serialisation of metamodel instances
  2. Serialisation of decorator command sets
  3. Serialisation of models in a known closed domain (say, statically compiled apps)

In these cases, it's a fair assumption that the namespace domain is stable (e.g. it only contains the metamodel definitions), so the ambiguous cases should not arise.

Use of an explicit final keyword would also likely have most benefit in these same scenarios.

Serialisation of userland models is more dangerous unless the system can also persist the namespace domain and make an assumption like you suggest.

Making everything implicitly final (and introducing an explicit extensible modifier) would be safer and give us more scope for optimisation, but that's a big breaking change.

@mttrbrts
Member

Ad hoc testing shows a file size reduction of about 1/3 for a 29 MB model definition and a 73 MB Decorator Command Set file.

Member

@mttrbrts mttrbrts left a comment


We'll likely need similar changes to the metamodel utils and decorator manager to allow loading and printing of model file definitions and decorator command sets (DCSs) with inferred classes.

packages/concerto-core/lib/serializer/jsongenerator.js (outdated, resolved)
Signed-off-by: Dan Selman <danscode@selman.org>
@dselman dselman marked this pull request as ready for review June 19, 2024 16:49
@dselman
Sponsor Contributor Author

dselman commented Jun 19, 2024

Simplified inference logic to assume the type of a field, if $class is missing.

Signed-off-by: Dan Selman <danscode@selman.org>
@dselman dselman marked this pull request as draft June 19, 2024 17:06
@dselman
Sponsor Contributor Author

dselman commented Jun 19, 2024

Before we can load AST created with inferClass=true we will have to update code like this:

Which does not use Resource.

@mttrbrts
Member

Nicely done. The size reduction is still around 1/3 for metamodel and DCS, which is good news!

Are you going to patch up the modelfile definition (and similar) in this PR too?

@dselman
Sponsor Contributor Author

dselman commented Jun 21, 2024

> Nicely done. The size reduction is still around 1/3 for metamodel and DCS, which is good news!
>
> Are you going to patch up the modelfile definition (and similar) in this PR too?

Yes. I will add a test that does the full roundtrip with inferClass true/false on a meta model...
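
A round-trip check of this kind could be sketched as follows. Everything here is a toy stand-in: the `model` table, `toJSON` and `fromJSON` are hypothetical and only mimic the behaviour described in this PR. They are not the Concerto Serializer, and arrays are left out to keep the sketch short.

```javascript
// Toy model: concrete type FQN -> { fieldName: declared field type }.
// Abstract types (such as Person) only appear as declared field types.
const NS = 'org.acme.zoo@1.0.0';
const model = {
    [`${NS}.Zoo`]: { person: `${NS}.Person` },
    [`${NS}.Owner`]: { name: 'String', age: 'Integer', address: `${NS}.Address` },
    [`${NS}.Address`]: { city: 'String' },
};

const nsOf = (fqn) => fqn.slice(0, fqn.lastIndexOf('.'));
const nameOf = (fqn) => fqn.slice(fqn.lastIndexOf('.') + 1);

// In-memory objects always carry a fully qualified $class.
function toJSON(obj, inferClass, declaredFqn = null) {
    const fqn = obj.$class;
    const out = {};
    if (!inferClass || declaredFqn === null) {
        out.$class = fqn; // root object, or inference switched off
    } else if (fqn !== declaredFqn) {
        out.$class = nsOf(fqn) === nsOf(declaredFqn) ? nameOf(fqn) : fqn;
    }
    for (const [field, type] of Object.entries(model[fqn])) {
        const value = obj[field];
        if (value === undefined) continue;
        out[field] = typeof value === 'object' ? toJSON(value, inferClass, type) : value;
    }
    return out;
}

// The root JSON object must carry a fully qualified $class.
function fromJSON(json, declaredFqn = null) {
    let fqn = json.$class;
    if (fqn === undefined) {
        fqn = declaredFqn; // missing: assume the field type
    } else if (!fqn.includes('.')) {
        fqn = `${nsOf(declaredFqn)}.${fqn}`; // short: field type's namespace
    }
    const out = { $class: fqn };
    for (const [field, type] of Object.entries(model[fqn])) {
        const value = json[field];
        if (value === undefined) continue;
        out[field] = typeof value === 'object' ? fromJSON(value, type) : value;
    }
    return out;
}

const zoo = {
    $class: `${NS}.Zoo`,
    person: {
        $class: `${NS}.Owner`,
        name: 'Dan',
        age: 42,
        address: { $class: `${NS}.Address`, city: 'Boston' },
    },
};

const compact = toJSON(zoo, true);
const verbose = toJSON(zoo, false);
// Both forms should deserialise back to the same in-memory object.
console.log(JSON.stringify(fromJSON(compact)) === JSON.stringify(zoo)); // true
console.log(JSON.stringify(fromJSON(verbose)) === JSON.stringify(zoo)); // true
```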

Labels
Difficulty: Medium Type: Enhancement ✨ Improvement to process or efficiency Type: Feature Request 🛍️ New feature or request
Development

Successfully merging this pull request may close these issues.

Optimize JSON (De)serialization (Final Types et al)
3 participants