Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ids and schema #42

Closed
ahacking opened this issue Jul 30, 2014 · 5 comments
Closed

Ids and schema #42

ahacking opened this issue Jul 30, 2014 · 5 comments

Comments

@ahacking
Copy link
Contributor

I've noticed that schema only allows one to specify the id handling globally for all models.

I have models that use UUID's however there are others that have natural keys and I can't say that globally for all instances of all models that the id attribute can be handled uniformly.

If I have API's from a number of sources, there isn't any particular global rule that can be used, for some models a UUID may be fine and in other cases the api might use type/small-integer-id or slug, in some models the remote id attribute name may vary.

I can see that the IdMap segregates model types, to keep their key spaces distinct, and handles the attribute type names too, however the schema model definition limits what can be expressed.

I believe the schema definition needs to be enhanced to:

  • allow configuring the remote id attribute(s) individually per model
  • allow configuration of a model specific id generator (since different API's might use different schemes for client assigned ids)
  • also indicate if the generated id can be used both locally and remotely.

In relation to my other issue #40 regarding embedded models/links, I am exploring if schema would be better described treating everything (ids, attrs, links) as just model property definitions.

Example:

 planet: {
       properties: {
         id: {type: 'id', generator: V4UUIDGenerator },
         name: {type: 'string'},
         classification: {type: 'string'},
         moons: {type: 'hasMany', model: 'moon', inverse: 'planet'},
         features: {type: 'hasMany', model: 'feature', embedded: true}
       }
     },
     moon: {
       properties: {
         id: {type: 'id', local: '_id', generator: OrbitDefaultGenerator },
         name: {type: 'string'},
         planet: {type: 'hasOne', model: 'planet', inverse: 'moons'},
         features: {type: 'hasMany', model: 'feature', embedded: true}
       }
     }
     feature: {
       properties: {
         // note absence of id field
         kind: {type: 'string'},
         description: {type: 'string'}
       }
     }

I think its important to keep things under a 'properties' key in-case there are future model/schema concerns that are not property specific.

Let me know your thoughts on your preferred way forward. I am trying things out with Orbit so I am willing to work up a PR with my changes if you're open to taking this approach or an alternative that provides similar capability. I am also aware that such a schema definition change also requires changes to Ember-Orbit.

@dgeb
Copy link
Member

dgeb commented Jul 30, 2014

@ahacking I can get behind all of these goals:

  • allow configuring the remote id attribute(s) individually per model
  • allow configuration of a model specific id generator (since different API's might use different schemes for client assigned ids)
  • also indicate if the generated id can be used both locally and remotely.

As I see it, here are the constraints on ids / remoteIds:

  • Every model REQUIRES an idField so that Orbit can track unique records by type and id
  • Every model MAY have a single remoteIdField that is used by remote servers to track unique records by type and id
  • A model's idField and remoteIdField MAY be the same
  • A model SHOULD be able to be assigned a custom generateId method to generate ids. This will only generate remote ids when idField === remoteIdField (and therefore should be a UUID generator).
  • If idField and remoteIdField differ, Orbit MUST maintain a mapping between the two. This is necessary so that ids can always be matched with remoteIds.
  • A schema can define default values for idField, remoteIdField and generateId that will be used when not overridden in a particular model.

Let me know your thoughts on your preferred way forward.

I am still puzzling over this and will follow up soon.

@ahacking
Copy link
Contributor Author

Thanks for that, your rules above confirm my understanding of the code and are exactly what I have been basing my modification's on.

Some questions:

  • Is there really a need to configure the local id to something other than _id or perhaps _cid?
  • Should there always be an _id/_cid attribute, and in the case where client and remote ids are the same (eg uuids) it just uses the value of the remote id. Thinking this might simplify logic in clients as _id/_cid is always present and can be depended on.
  • Presently the semantics for generateId seems to be intended for client id generation than for server ids as the default generator uses time and not uuids. Maybe the default generator should be changed and the semantics of what the generator is used for documented as dual purpose. There's a good fast uuid generation function I've been using which could be made the default unless you have other reasons for the time based one.

I've been playing with schema definition putting all attributes links and ids into a properties object. I like defining schema that way as when I think of embedded objects I don't think of them as links, it also aligns with how you define models in Ember. But registering a model requires building attribites and links as thats what the rest of the code expects.

It's late here (almost 2am) so I'll pick this up again in the morning. Thanks for your speedy response!

@dgeb
Copy link
Member

dgeb commented Jul 31, 2014

Is there really a need to configure the local id to something other than _id or perhaps _cid?
Should there always be an _id/_cid attribute, and in the case where client and remote ids are the same (eg uuids) it just uses the value of the remote id. Thinking this might simplify logic in clients as _id/_cid is always present and can be depended on.

Some DBs (MongoDB perhaps?) use _id as a default ID field, so that's not a good option. This is the reason I went with __id by default. I might be open to using __cid as well, but I think I prefer __id.

However, in the case of UUIDs, we would need to duplicate the value of remoteIdField (say, just id) into __id. This is a legitimate option, but it's extra work and takes up slightly more memory. On the other hand, it guarantees that __id will be present on every record and means fewer lookups. I've been on the fence re: this tradeoff since day one.

There's a good fast uuid generation function I've been using which could be made the default unless you have other reasons for the time based one.

I'm open to just using a decent v4 UUID generator instead of the hokey timestamp + counter that's a default. I want to be sure of licensing issues (Orbit is MIT) and performance before including this.

@ahacking
Copy link
Contributor Author

If you don't need to copy the value and merely reference it then no additional memory is required aside from the reference itself which is required on both sides of the fence.

What are your thoughts on composite keys? Being able to support an id based on some combination of attributes might be required for those with APIs that don't use surrogate keys.

@dgeb
Copy link
Member

dgeb commented Aug 15, 2014

Closed through the merge of #45

@dgeb dgeb closed this as completed Aug 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants