Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record type to express key/value pairs #447

Open
KyleBastien opened this issue Jun 21, 2022 · 19 comments
Open

Record type to express key/value pairs #447

KyleBastien opened this issue Jun 21, 2022 · 19 comments

Comments

@KyleBastien
Copy link
Contributor

KyleBastien commented Jun 21, 2022

Feature Request 🛍️

This proposes the addition of a Record type into the Concerto syntax to be able to express key/value pairs in an object. For example a Record<String, Foo> would express a key/value pair where the keys of the objects are String's and the values of this object are Foo objects.

You can look more at TypeScript's Record type documentation here: https://www.typescriptlang.org/docs/handbook/utility-types.html#recordkeys-type, which I think is good inspiration for what should be possible here.

EDIT: The Design Document for this feature is in the Wiki

Use Case

Here are a variety of different use cases that I think we should support:

Example 1:

concept Foo {
  o String propName
}

concept Example {
  o Record<String, Foo> fooDictionary
}

// An Example object could look like:
// {
//   fooDictionary: {
//     key: {
//       propName: 'Hello World'
//     }
//   }
// }

Example 2:

enum Keys {
  Some,
  Enum,
  Values
}

concept Foo {
  o String propName
}

concept Example {
  o Record<Keys, Foo> fooDictionary
}

// An Example object could look like:
// {
//   fooDictionary: {
//     Some: {
//       propName: 'Hello World'
//     },
//     Enum: {
//       propName: 'Other Hello World'
//     }
//   }
// }

Example 3:

concept Example {
  o Record<String, Boolean> fooDictionary
}

// An Example object could look like:
// {
//   fooDictionary: {
//     key: true
//   }
// }

Possible Solution

Not sure about implementation here, I would assume that we would at least need to be able to support the concept of generics within the language as a pre-requisite to this.

I also think it would be good to constrain the Record type to only support String or String derivates (e.g., Enum) as keys for a Record type. This is similar to what TypeScript does, and translates well to JSON. This does limit Record's functionality in terms of what is possible in other languages like Java or C# which could use classes as key types, but I think this constraint is desirable.

Context

Currently describing key/values pairs where keys are user generated content is currently not possible in Concerto. A Record type would help solve that problem. We see this when trying to describe variables for example, where variable names are defined by the user.

In these scenarios it's still very useful to developers at development time to have a representation of these Key/Value pairs, even if they aren't fully type safe, because a developer needs to understand what is possibly on this property of a concept.

@mttrbrts
Copy link
Sponsor Member

For the record, Kyle and I discussed this requirement and discussed the following motivations for not adding this feature:

  • Increased complexity for business users by introducing Generics and a second way to capture an object-style type. There is some value in giving users fewer options to make the language more accessible, at the cost of reduced expressiveness.
  • Making the language less well-typed. Not requiring the full structure of the type to be known in advance makes it more difficult for clients to implement tools that traverse the type (e.g. to build a UI). Clients need to be more dynamic in order to interpret the metamodel to generate or validate data.
  • Some target languages for CodeGen may not support Record Types, e.g. GraphQL

Up until this point, users have satisfied similar requirements with either:

  1. Explicit modelling, often dynamically via the concerto API.
  2. Approximating records with an array of pairs, e.g.
concept Pair {
  o String key
  o Foo value
}

concept MyConcept {
  o Pair[] variables
}

However, this changes the performance characteristics (i.e. lookup is now O(n), rather than O(1)).

Similarly, we've also considered supporting JSON objects as a Scalar type in the past, however, this was abandoned because it encouraged untyped properties which hinders introspection by clients.

In TypeScript, note that Record<A, B> is equivalent to { [key: A]: B; }.

However, Kyle's requirement strikes a compromise between the extremes of untypedness and strict typing.

  • ✅ It allows code-generation (e.g. to TypeScript and C#) that can be statically bundled with applications, with enough hints for runtime clients to understand the meaning of the declaration.
  • ✅ It avoids the messy dynamic model creation and introspection & special cases in client code to handle Dictionary use-cases.
  • 🙈 The more advanced nature of this typing construct goes some way to dissuading less-technical users from adopting it.

@mttrbrts
Copy link
Sponsor Member

mttrbrts commented Jun 22, 2022

constrain the Record type to only support String or String derivates (e.g., Enum) as keys

I support this.

We should also discuss the naming and syntax. Do we prefer Record, Map, Dictionary, Dict etc.?

Is it possible to adopt a different syntax which is less technical for the Generics requirement? Particularly given the constraint to only allow String types for keys. Perhaps ...

o Dictionary<Address> address

@dselman
Copy link
Sponsor Contributor

dselman commented Jun 22, 2022

Thinking aloud...

namespace concerto

concept Dictionary {
   o Concept[] values
}

namespace test

concept Person {
   @ValueType("Address")
   o Dictionary address
}

@KyleBastien
Copy link
Contributor Author

KyleBastien commented Jun 22, 2022

I don't think we should just limit to having the user only pass in the ValueType, I think that's very limiting for possible future scenarios where a user can pass in other String derivative types (e.g. String literals comes to mind), as keys. I also don't think it allows for passing in an Enum as a key type which I think is a requirement.

@KyleBastien
Copy link
Contributor Author

Also I am not sure that Dictionary would be the right choice as the type name here, I think given the limitations of Strings or String derivates only as the Key type, it might be confusing to those who have used Dictionary's from C# before?

@DS-AdamMilazzo
Copy link

I agree with Matt's proposed simplification to only annotate the value type and not the key type. I could be wrong but I personally don't see the enum key type as a very compelling case. The necessity of the Record type comes from the need to express unknown keys, which is currently impossible in Concerto. Expressing a set of known keys (as from an enum) can already be done efficiently by declaring them as optional fields of a concept. For example, the following seem equivalent to me:

enum E {
  o one
  o two
  o three
}

concept C {
  o Record<E, Integer> r
}

and

concept R {
  o Integer one optional
  o Integer two optional
  o Integer three optional
}

concept C {
  o R r
}

@KyleBastien
Copy link
Contributor Author

But what (if in the future) you had other String derivate types other than Enums? For example, String Literals come to mind. How then would you express the String Literal being the Key of a Record type?

That's why I proposed we should express the Key, Value pair for future flexibility. I think this also aligns it with practically every typed programming language that exists which expresses the type of a Record using Key/Value pair.

I also don't consider the two examples you showed to be strictly equivalent. You shouldn't allow for optionality generally with a Record type. You can see the different in this TypeScript playground here. I think what you're trying to express is a Partial type of the Record.

@DS-AdamMilazzo
Copy link

DS-AdamMilazzo commented Dec 11, 2022

I also don't consider the two examples you showed to be strictly equivalent. You shouldn't allow for optionality generally with a Record type.

In that case, isn't it equivalent to this?

concept R {
  o Integer one
  o Integer two
  o Integer three
}

concept C {
  o R r
}

To me it seems rather strange to me to imply that Record<K, V> should "generally" require all possible values of K to be specified. DateTime is a string derivate, but obviously for Record<DateTime, T> they shouldn't have to specify every possible DateTime key. For Record<String, T>, the key space is infinite. Surely the general case is that the user isn't required to specify every possible key value. Requiring all possible values when the key is an enum, then, would be inconsistent with the general case. (I do observe that typescript treats enums differently than strings and other types when used as a record key.)

But what (if in the future) you had other String derivate types other than Enums? For example, String Literals come to mind. How then would you express the String Literal being the Key of a Record type? That's why I proposed we should express the Key, Value pair for future flexibility. I think this also aligns it with practically every typed programming language that exists which expresses the type of a Record using Key/Value pair.

I agree that it's more useful to be able to express both the key and value. The question is more about whether the Concerto team wants to do the additional work. (I just don't think the enum example is all that compelling, given that it seems like it can already be expressed in other ways.)

To the previous point, I think "practically every typed programming language that exists which expresses the type of a Record using Key/Value pair" doesn't require every key to be specified, though, even when the key is an enum type. Either that, or the number of such languages must be very small.

(To my mind "string literal" means a string specified literally as opposed to being the value of some other expression, but I guess you have a different meaning in mind. I don't know quite what you mean, but I'm guessing it sounds like an enum.)

@KyleBastien
Copy link
Contributor Author

In that case, isn't it equivalent to this?

I'm not sure how you would access a key in your example based on the enum key. For example, how would you do R.one? I think this is kind of like saying, why have enums at all, if we can just express it as a concept? I think there is clear value in an Enum, and there is clear value in using one as a Record key, which is why it's done constantly in other typed languages to express a constrained set of keys that you want to access without using some magic string everywhere that then needs to be updated whenever you change it.

Surely the general case is that the user isn't required to specify every possible key value. Requiring all possible values when the key is an enum, then, would be inconsistent with the general case. (I do observe that typescript treats enums differently than strings and other types when used as a record key.)

Agreed! Which is why I think passing a Key for a Record is a necessity, because types need to be context aware based on what they are configured for.

...doesn't require every key to be specified, though, even when the key is an enum type. Either that, or the number of such languages must be very small.

I think this is because very few languages express a true Record type, while many express a Map type or Dictionary type. This is because of the historical tie between JSON serialization and how JavaScript/TypeScript express Records/objects. For example, even in TypeScript a Map doesn't require every Key to specified when using an Enum. You can see that in this example. Since Concerto is trying to express models that will be serialized, a Record type is clearly the correct choice, you can't serialize a Map.

(To my mind "string literal" means a string specified literally as opposed to being the value of some other expression, but I guess you have a different meaning in mind. I don't know quite what you mean, but I'm guessing it sounds like an enum.)

You can see the documentation on this here and here. A literal type is Enum like when combined with union types but can be expanded with template literals to be very powerful. You can see an example of this here. Just to be clear I'm not proposing we add string literals into Concerto here, we can discuss that in another issue. But I do think we want the flexibility to take advantage of these types of features with Records in the future so we can express truly type safe models.

@KyleBastien
Copy link
Contributor Author

Just to add to this, since the release of Scalars in the Concerto language I think this increases the need for a Record type to be able to specify both a Key and a Value type.

I can think of lots of examples where wanting to do something like this could make sense:

scalar GUID extends String default="00000000-0000-0000-0000-000000000000" regex=/^[{]?[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}[}]?$/

concept Person identified by id {
  o GUID id
  o String givenName
}

concept Example {
  o Record<GUID, Person> persons
}

@DS-AdamMilazzo
Copy link

I agree with that.

@mttrbrts
Copy link
Sponsor Member

mttrbrts commented Feb 6, 2023

Syntax Alternatives

We should choose one of the following variants.

// Variant 1: Type Variable syntax
// Inspired by TypeScript, C# etc.
concept MyConcept {
  o Record<GUID, Person> people
}
-----------------------------------
// Variant 2: Modifiers
// "Concerto-like"
concept MyConcept {
  o Person[] people index GUID
}
-----------------------------------
// Variant 3: map keyword
// Inspired by Smithy
// https://smithy.io/2.0/spec/aggregate-types.html#map
map AddressBook {
  o GUID key
  o Person
}

set MyFruit {
  o Fruit
}

// Individual or Corporation or Charity
// Individual | Corporation | Charity
union Party {
  o Individual
  o Corporation
  o Charity
}

concept MyConcept {
  o AddressBook people
}
-----------------------------------
// Variant 4: map keyword
map AddressBook of Person // Implicit String type variable for keys
map AddressBook from PhoneNumber to Person

concept MyConcept {
  o AddressBook people
}
-----------------------------------
// Variant 5: dictionary & record keywords
dictionary AddressBook for Person // Implicit String type variable for keys
record AddressBook of Person keyed by GUID

concept MyConcept {
  o AddressBook people
}

All options represent the same conceptual structure, e.g.

{
  "123-34-12-1234": { "$class": "Person", ... },
  "234-45-56-1234": { "$class": "Person", ... }
}

Definitions

concept A {}

concept B identified {}

enum E {
  o ONE
  o TWO
}

scalar StringScalar extends String
scalar BooleanScalar extends Boolean

Spec for Record Types

Examples

These examples use the type variable syntax (e.g. with implicit generic type declaration of Record<S, T>. Although they all can be represented in all other syntax alternatives too.

concept Concept {
  o Record<String, String> record1
  o Record<String, Boolean> record2
  o Record<DateTime, Boolean> record2b
  o Record<String, A> record3
  o Record<StringScalar, String> record4
  o Record<E, String> record5
  o Record<E, String>[] record6
  o Record<String, String> record7 optional

  // FOR DISCUSSION

  // New `Relationship` type
  o Record<String, Relationship<B>> recordOfRelationships
  --> Record<String, B> recordOfRelationships2

  // New `Array` type to make aggregate operations consistent
  o A[] listOfAs
  o Array<A> listOfAs2
    
  o Record<B, String> recordWithIdentifiedConceptKeys
  
  o Record<String, Record<String, String>> nestedRecord

  o Record<Integer, String> integerKeyedRecord

  // NOT SUPPORTED
  // o Record<A, String> record
  // o Record<BooleanScalar, String> record
  // --> Record<String, String> record // Relationship target is not identified 
  // --> Record<String, A> record // Relationship target is not identified 
  // o Record<String, String> record1 default={"a":"A","b":"B"}
}

// FOR DISCUSSION

// Suppose that there's an existing concept in an old model called "Record". Then should the following be valid
concept Record identified{} // A legacy concept called "Record"

concept MyHeadHurts {
  o Record foo
  o Record<String, Record> bar
}

// Using scalars to declare type aliases
scalar Dictionary extends Record<String, String>

concept Library {
  o Dictionary[] dictionaries
}

Future Extensions

To check that the design is extensible to other potential aggregate types in future

Examples

concept SetConcept {
  o Set<String> stringSet
  o Set<Boolean> booleanSet // Although why?
  o Set<Integer> intSet
  o Set<Double> doubleSet
  o Set<Long> longSet
  o Set<E> enumSet
  o Set<B> conceptSet
  o Array<Set<String>> stringSetArray
  o Set<StringScalar> stringScalarSet
  o Set<BooleanScalar> booleanScalarSet

  // NOT SUPPORTED
  // o Set<A> record // Concept is not identified
 
}

concept UnionConcept {
  o Union<A,B> ab
  o Union<A,B, C> abc
}

@KyleBastien
Copy link
Contributor Author

We should choose one of the following variants.

At least to me I think the "Modifers" syntax looks a little confusing just because I would expect that to be a Person array and not a Record.

Between the other two options I don't have a really strong opinion, but have a preference towards the first option, only because that aligns well with other languages. The map keyword is also just "more" code to represent the same thing, but that's not necessarily a bad thing per say.

scalar BooleanScalar extends String

I think you mean extends Boolean here yes?

// Suppose that there's an existing concept in an old model called "Record". Then should the following be valid

I'd personally say no, adding keywords and needing to modifying legacy code to not utilize those keywords I don't think is a foreign concept and I think would make for the cleanest break here.

@dselman
Copy link
Sponsor Contributor

dselman commented Feb 7, 2023

Some quick notes, as I won't be able to attend the meeting tomorrow. I'd be in favour of:

  1. Consistency in syntax across arrays, maps and sets
  2. Clear semantics around ordering of entries and examples of serialisation. E.g. would we serialise a set as an object?
  3. Try to keep the CTO syntax business accessible, as far as possible. I.e. this scares me Record<String, Relationship<B>> recordOfRelationships :-)
  4. Ability and clear syntax to distinguish between sets of Things and set of relationships to Things etc. Both are useful and we shouldn't let relationships fall off the wagon or become second class citizens
  5. We need to be very careful with type identity for these new constructs. Today all types are identified by namespace@version.TypeName — changing/breaking that will have far reaching consequences, I suspect. This also includes how relationships are encoded as URIs, when we serialise instances of relationships. This also includes import statements where a type in an external namespace must be unambiguously identified.
  6. Let's not forget that we will need to update all code gen targets to support these new metamodel constructs. It would be good to see in the design what code we propose to generate for each target, or do we "cross the rubicon" and fail code generation if you use some of these constructs with certain targets?

@mttrbrts
Copy link
Sponsor Member

mttrbrts commented Feb 8, 2023

Based on the feedback above and the discussion today on the Working Group call. I've started drafting a specification for this feature.

https://github.com/accordproject/concerto/wiki/Aggregate-Types-Design

@KyleBastien
Copy link
Contributor Author

@mttrbrts thanks for putting that together.

I think what you have in that spec generally looks good to me. A few pieces of feedback:

The other property indicates the type of the value in the map.

I feel like we might want to be explicit about what properties specifically is the value of the map something like:

map Dictionary {
  o String key
  o String value
}

I think this better aligns with how other concepts are written in Concerto in the sense that they always have a type and a name, in this case the names are just key words. It just would read odd to me to not have a name for a property.

Maps should be implemented using the JavaScript Map object. We inherit the semantics of that implementation here.

When you say implemented here, do you mean that internally to Concerto it might represent this as a Map object, or that if this is expressed in a concrete language like TypeScript it would actually be a Map object in that language? If it's the later, then I think this is a big departure from being able to express Records and wouldn't fit with the original requirements of this issue, because we need to be able to express a Record object that explicitly doesn't have things like .set, .delete, etc. If this is just how Concerto internally wants to represent this, I think this is fine, as long as in the concrete language (i.e., TypeScript) it is not a Map object but a Record object.

If it is also a requirement to be able to express a true Map object in the concrete language, I think then we ought to have two different types here, record and map to differentiate.

@mttrbrts
Copy link
Sponsor Member

mttrbrts commented Feb 9, 2023

Thanks @KyleBastien

I feel like we might want to be explicit about what properties specifically is the value of the map...

I don't have a strong view on this. However, it's worth comparing this to our syntax for Enums, and what is proposed for Union Types

Maps should be implemented using the JavaScript Map object. We inherit the semantics of that implementation here.

I'm only referring to the internal representation, I'll add the requirements for TypeScript to the Code Gen section

If it is also a requirement to be able to express a true Map object in the concrete language, I think then we ought to have two different types here, record and map to differentiate.

We don't have this requirement, AFAIK.

@KyleBastien
Copy link
Contributor Author

I don't have a strong view on this. However, it's worth comparing this to our syntax for Enums, and what is proposed for Union Types

Agreed I don't have a really strong feeling about this, it just read a bit odd to me to mix the two in the same concept.

I'm only referring to the internal representation, I'll add the requirements for TypeScript to the Code Gen section

Sounds good, adding the code gen requirements would be appreciated. From what I see that you updated it looks good so far.

We don't have this requirement, AFAIK.

Sounds good, we might want to consider if we will have this requirement in the future but given that it's quite hard to serialize true Map, Set, etc. objects from JavaScript/TypeScript to JSON, we might just want to make the call that we won't support that.

@mttrbrts
Copy link
Sponsor Member

it just read a bit odd to me to mix the two in the same concept.

Good point, I've updated the spec to remove the modifiers entirely and to rely on the order instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Development

Successfully merging a pull request may close this issue.

5 participants