This repository has been archived by the owner on Oct 29, 2021. It is now read-only.

UUIDs #14

Closed
chasm opened this issue Sep 1, 2010 · 18 comments

Comments

@chasm

chasm commented Sep 1, 2010

Is there any interest (besides me) in allowing UUIDs to be used as primary keys in Circumflex ORM? Ideally, I'd love to have it set up so I could pass a UUID, or, if I didn't, one would be randomly generated. Ability to swap out UUID libraries would be ecstatically wonderful. Don't really care how it's stored -- string is fine. I haven't seen much advantage to e.g., PostgreSQL's UUID datatype.

Currently, I'm working around this with a "code" attribute:

import java.util.UUID

class Expectation extends Record[Expectation] {
  // Auxiliary constructor for callers that supply their own UUID.
  def this(code: String) = {
    this()
    this.code := code
  }

  val code = "code".TEXT.UNIQUE.DEFAULT(UUID.randomUUID.toString)
}

Then I just ignore the automatically generated id.

@chasm
Author

chasm commented Sep 1, 2010

BTW, yes, I did see this:

"We only support auto-generated BIGINT columns as primary keys for a couple of reasons. Sorry."

Just curious as to the reasons and whether this is a permanent policy, plus any hints on how to make the above code better. I'd also like to extend Record to something like "BaseEntity," modify a few things, and then have all my record classes extend BaseEntity, but I'm so type-stupid that I can't figure out how to make this work. Is it possible and, if so, can you give me a hint?

Thanks!

@actsasflinn
Contributor

I'm interested in alternate primary key types, specifically for use with Rails applications, so that I don't need to modify my existing schema. I've looked into making the modifications, but I suspect some of the internals are tied to the BIGINT type. Also, there are no tests/specs to verify I haven't broken anything, so that holds me back.

@inca
Owner

inca commented Sep 4, 2010

That's right, some of our internals (cache, projections, eager fetching and joins) are tied to Long values. As we are reorganizing almost all these concepts, we'll take a look at how to enable support for custom id types.

P.S. Supporting composite primary keys still seems challenging -- I can't assure you that we will include this support in the next release.

@inca
Owner

inca commented Sep 6, 2010

BTW, this decision did not come easily. We spent quite some time finding out that BIGINT is the most versatile type for primary keys, since indices on BIGINT columns are faster; so if you have lots of joins in your business logic (I'm sure you do), then a BIGINT column is the best option. If you need UUIDs, simply introduce another column, just like you did in your example. Of course, this does not cover interop issues with an already existing database schema.

@chasm
Author

chasm commented Sep 6, 2010

Actually, the code example above is working beautifully. I don't really care what the DB uses internally. As long as I can do CRUD using just the UUID, I'm happy.

Any hints on how to extend Record so I don't have to copy that "code" snippet (and the creation/update timestamp code) into every class?

@inca
Owner

inca commented Sep 7, 2010

Oh that should be fairly easy:

abstract class GenericRecord[R <: GenericRecord[R]]
    extends Record[R] { this: R =>
  val code = "code".TEXT.UNIQUE
  val createdAt = "created_at".TIMESTAMP.DEFAULT("current_timestamp")
  val updatedAt = "updated_at".TIMESTAMP.NULLABLE

  code := UUID.randomUUID.toString
}

You can also create a GenericRelation class where you can add find-by-uuid methods and add events to update the updatedAt field before updates (though you might consider automatically generating triggers for such tables to make the behavior more consistent).

@inca
Owner

inca commented Sep 7, 2010

Ooops, sorry, my mistake. Edited the code above, check it out.

@chasm
Author

chasm commented Sep 7, 2010

Ah! Nice. That works beautifully. Now I see what I was doing wrongly. Thanks.

@inca
Owner

inca commented Sep 7, 2010

You are welcome, Chas!

In the meantime, supporting custom-type primary keys implies introducing an additional type parameter into Record, so that client code would look like this:

class Country extends Record[Country, Long] {
  override def primaryKey = id
  val id = "id".BIGINT 
}

This, in turn, implies rewriting all the code that references records (which is 80% of the ORM) and introducing a little more verbosity into client code. Yep, associations and inverses will also receive this additional parameter -- this will help us make sure that data types are consistent between a foreign key and the other side's primary key.

There is, however, another way: the primary key could accept any field (Field[_]). This slightly reduces the verbosity, but it reduces type safety too.

So, which solution do you prefer?
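
The trade-off between the two options can be sketched in plain Scala. This is only an illustration under assumed names: Rec, Ref and LooseRef are hypothetical stand-ins invented for this sketch, not Circumflex types.

```scala
import java.util.UUID

object TypedKeys {
  // Hypothetical stand-in for a record that exposes its primary key type.
  trait Rec[PK] { def id: PK }

  // A foreign key that carries the key type of the record it points to.
  final case class Ref[PK, R <: Rec[PK]](key: PK)

  final case class Country(id: Long, name: String) extends Rec[Long]
  final case class City(id: UUID, name: String, country: Ref[Long, Country])
      extends Rec[UUID]

  val ch = Country(1L, "Switzerland")
  val bern = City(UUID.randomUUID(), "Bern", Ref[Long, Country](ch.id))
  // Ref[Long, Country](bern.id) would not compile: a UUID is not a Long.

  // The untyped alternative accepts any key, so a mismatch is only
  // discovered at runtime (or by the database).
  final case class LooseRef(key: Any)
  val loose = LooseRef(bern.id) // compiles even though Country keys are Long
}
```

With the extra type parameter the compiler rejects a foreign key of the wrong type; with the untyped variant the same mistake compiles.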

@chasm
Author

chasm commented Sep 7, 2010

Well, I'm only one person, but in my mind type safety is usually paramount. It may be a little more verbose to write, but I suspect a lot less trouble to debug. I'm a big fan of the compiler. It does half my work for me. But if I'm reading this correctly, the added verbosity comes only if you want to use a custom-type primary key.

Then again, you have to prioritize your time and if it's a lot of extra work...

To explain my interest in UUIDs, I build a lot of REST web services. (Most of what I do comes down, at least in part, to CRUD. Story of my cruddy life.) A common method for doing CRUD with REST is to use POST for new item creation and PUT for item update.

The problem for me is that POST is not idempotent, and this creates regular problems. On the other end of my REST interface is a web form, and a user. If the user does the same POST twice, he gets two copies. To avoid this requires some trickery.

A better solution IMO is to use PUT for both creation and update. The back end doesn't care whether the item exists or not. It PUTs the new item. If the item already exists, it is simply overwritten (assuming the new item passes validation, of course). So the user can resubmit the same data 100 times and he gets the same result.

To make this work, of course, I have to send the ID of the item in the PUT. If I didn't and let the database assign the ID, then it would just assign a new one on each PUT and we're back to square one. So I either have to poll the database to get the next ID in the sequence (many of which will be thrown away when the form is never submitted), or I need a UUID.

That brings up one issue with your wonderful abstract class above. It assumes that the UUID will be randomly generated by the object itself. This might be true part of the time (e.g., for automated object creation behind the scenes), but most of the time I need to pass this UUID to the object in the constructor. So I need two constructors -- one with and one without the UUID. I tried the same trick I used in my first message above with the GenericRecord you provided, but it didn't work. Any idea how to make it work? No sweat if you're too busy.
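
The difference between POST and PUT described above can be sketched with an in-memory map standing in for the table. This is a minimal illustration in plain Scala, not Circumflex or HTTP code; the Item class and the put/post helpers are names assumed for this sketch.

```scala
import java.util.UUID
import scala.collection.mutable

object IdempotentPut {
  final case class Item(code: UUID, name: String)

  // In-memory stand-in for the table, keyed by the client-supplied UUID.
  private val store = mutable.Map.empty[UUID, Item]

  // PUT semantics: create or overwrite under the key the caller provides.
  def put(item: Item): Unit = store.update(item.code, item)

  // POST semantics: the server picks a fresh key on every call,
  // so a repeated submission creates a second row.
  def post(name: String): Item = {
    val item = Item(UUID.randomUUID(), name)
    store.update(item.code, item)
    item
  }

  def size: Int = store.size
}
```

Calling put twice with the same Item leaves a single entry in the store, whereas calling post twice with the same name leaves two.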

@inca
Owner

inca commented Sep 8, 2010

I would rather stick with one more method in the abstract class:

abstract class GenericRecord[R <: GenericRecord[R]]
    extends Record[R] { this: R =>
  ...
  def withCode(c: String): this.type = {
    this.code := c
    this
  }
}

so that you can easily create a record and set its UUID in just one statement:

val c = new Country().withCode(myUUID)

Note that this approach also eliminates the need to declare additional constructors in subclasses.

@chasm
Author

chasm commented Sep 9, 2010

That works beautifully. As for triggers to set the updatedAt value, I'd prefer not to rely on db-specific stuff to the extent possible. Here is the code I'm using (I've updated this to reflect the information below).

abstract class BaseRecord[R <: BaseRecord[R]] extends Record[R] {
    this: R =>
  val code = "code".TEXT.UNIQUE
  val createdAt = "created_at".TIMESTAMP.DEFAULT("current_timestamp")
  val updatedAt = "updated_at".TIMESTAMP.NULLABLE

  code := UUID.randomUUID.toString

  def withCode(c: String): this.type = {
    this.code := c
    this
  }
}

abstract class BaseTable[R <: BaseRecord[R]] extends Table[R] {
  afterUpdate(_.updatedAt := new Date())
}

@inca
Owner

inca commented Sep 9, 2010

Validators aren't meant for this; use afterUpdate instead:

abstract class BaseTable[R <: BaseRecord[R]] extends Table[R] {
  afterUpdate(_.updatedAt := new Date())
}

@chasm
Author

chasm commented Sep 9, 2010

Ah! I was looking for that but didn't see it in the documentation or the source code. Validation was a workaround. Glad I asked.

@inca
Owner

inca commented Sep 14, 2010

Okay, since I'm in the middle of a serious reorganization, there are a few things I should know before introducing support for custom primary key types in Circumflex ORM. When inserting a row, we rely on the database to generate the primary key and then re-select the row (in case it was modified by a trigger) using the last generated id. As far as I know, this only works with sequential numeric data types (with either an AUTOINCREMENT expression or a sequence-generated value). How should I do the same with custom data types?

@chasm
Author

chasm commented Sep 14, 2010

Hmm. That's a good question. I presume that you do this in a transaction so that another row isn't inserted in the meantime.

I would say off the top of my head that you allow two methods. In one, the database selects the ID as now, returns it, and then you requery for the row. In the other, the ID is provided in the original insert and stored in the caller, then the caller makes the select using that ID. But I'm not sure whether these two calls -- the insert and select -- are made in the same caller or if they are somehow separate. But either way I am assuming that the ID is created outside the database (possibly by polling the database) and that it must be stored temporarily or passed to the select method as a parameter.

Does this make sense? I will try to look at the source code later and will see if I have a clearer idea. But the above seems the only way at first glance.

@inca
Owner

inca commented Sep 15, 2010

That's right. The main concern is that most identifier generation strategies (except application-assigned ids, UUIDs for instance) expect the primary key to be of integer type. I'm currently thinking of implementing four different identifier generation strategies:

  • application-assigned identifiers will be the default: the application must assign an identifier manually before inserting, and an exception is thrown if an insert is performed with null in the primary key; the only way to implement the save method is to perform a DELETE with the specified identifier and then an INSERT;
  • the identity strategy will use the ability of most database vendors to generate an identifier automatically and to access the last generated identifier (LAST_INSERT_ID or something like that); this will only work with the BIGINT data type;
  • the sequence strategy will poll the database for the next sequence value and then use this value to perform the insert;
  • the uuid strategy is almost identical to the first one, except that the application will generate UUIDs automatically.
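
The four strategies above could be modelled roughly as follows. This is a sketch in plain Scala under stated assumptions, not the Circumflex implementation: IdStrategy, beforeInsert and the strategy names are hypothetical, an AtomicLong stands in for a database sequence, and letting the uuid strategy keep a key that was already supplied is an assumption of this sketch.

```scala
import java.util.UUID
import java.util.concurrent.atomic.AtomicLong

object IdStrategies {
  // Hypothetical interface: decides which key the INSERT statement carries.
  // Some(key) means the key is known before the insert; None means the
  // database generates it and it must be read back afterwards.
  sealed trait IdStrategy[PK] {
    def beforeInsert(assigned: Option[PK]): Option[PK]
  }

  // 1. Application-assigned: refuse to insert without a key.
  final class Assigned[PK] extends IdStrategy[PK] {
    def beforeInsert(assigned: Option[PK]): Option[PK] =
      Some(assigned.getOrElse(
        throw new IllegalStateException("primary key must be assigned before insert")))
  }

  // 2. Identity: the database generates the key during the insert.
  object Identity extends IdStrategy[Long] {
    def beforeInsert(assigned: Option[Long]): Option[Long] = None
  }

  // 3. Sequence: fetch the next value first, then insert with it.
  final class Sequence(start: Long) extends IdStrategy[Long] {
    private val next = new AtomicLong(start) // stand-in for a database sequence
    def beforeInsert(assigned: Option[Long]): Option[Long] =
      Some(next.getAndIncrement())
  }

  // 4. UUID: like Assigned, but generate a random UUID when none was given
  //    (keeping a supplied key is an assumption of this sketch).
  object Uuid extends IdStrategy[UUID] {
    def beforeInsert(assigned: Option[UUID]): Option[UUID] =
      assigned.orElse(Some(UUID.randomUUID()))
  }
}
```

Only the identity strategy needs the "re-select by last generated id" step; in the other three the key is known to the caller before the INSERT is issued.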

@chasm
Author

chasm commented Sep 15, 2010

This should be flexible enough for pretty much everyone, I would think. I'd like to be able to implement strategy 1 where I assign a UUID, but with strategy 4 as a fallback. Sort of the way I have it now with the Item() or Item().withCode(code) where the former creates a random UUID and the latter uses the UUID I assigned.

This issue was closed.