newtype interferes with unique constraints in MySQL #100

Open
ezyang opened this Issue Oct 29, 2012 · 11 comments

Projects

None yet

4 participants

@ezyang
ezyang commented Oct 29, 2012

Text fields can have unique constraints over them.

However, when a Text field is newtype'd, for extra type safety, this facility appears to break:

In Model.hs with GeneralizedNewtypeDeriving:

newtype Hash = Hash { unHash :: Text }
    deriving (Show, Read, Eq, PathPiece, PersistField)

In config/models:

Profile
  hash Hash
  UniqueHash hash

On initial database migration with mysql, we see:

Migrating: ALTER TABLE `profile` ADD CONSTRAINT `unique_hash` UNIQUE(`hash`)
hpd3js: ConnectionError {errFunction = "query", errNumber = 1170, errMessage = "BLOB/TEXT column 'hash' used in key specification without a key length"}

Uncaught instance of overlapping instances floating around, perhaps? (Am I missing a deriving statement?)

@meteficha
Member

@ezyang, this is an unfortunate, known-but-undocumented gotcha with persistent-mysql due to yours truly. MySQL can't create indices for fields of types BLOB or TEXT, it can only create indices for fields of types VARBINARY(x) or VARCHAR(x), where x is the maximum length of the field. OTOH, Text and ByteString data types have no fixed, maximum length, so any default value we may choose for x would be incorrect. The current workaround is to always use maxlen for fields that internally are Text or ByteString, e.g.:

Profile
  hash Hash maxlen=20
  UniqueHash hash

I don't know if there's a better way of designing the library around this gotcha. But I'll happily accept documentation pull requests instead as well =).

@ezyang
ezyang commented Oct 29, 2012

meteficha: Here is the really weird thing: TEXT is arbitrary length; however, when I don't use a newtype wrapper, the unique constraint gets created successfully! This is why I find this so confusing (and why I've added all the bits about newtype...)

@meteficha
Member

https://github.com/yesodweb/persistent/blob/master/persistent-mysql/Database/Persist/MySQL.hs#L611

Text, String and ByteString are all special cased =P. Like I said, it is hacky. Ideas?

@meteficha
Member

BTW, the code I've linked above puts the length constraint on the index instead of putting it on the data type. However, there currently isn't any support for that at all besides the hack. OTOH, it's a good idea performance-wise to use maxlen anyway since TEXT/BLOB are not stored on the row itself. So messy...

@ezyang
ezyang commented Oct 29, 2012

Seriously?! shakes head There goes parametricity...

Ignoring BC for a moment, I would recommend getting rid of the hack, and documenting how to specify maxlen and requiring it for the case of any key fields. You could even newtype a VARCHAR type which is Text but will fail to typecheck if it doesn't have a length restriction associated with it (and update validators to check this length parameter. With type literals, we might even be able to reflect the length in the type system.)

For another approach, the implicit length constraint should be reflected as a typeclass (the classic way for Haskell functions to violate parametricity), not an ad-hoc pattern match, so that when I define a new data type I can specify a default length parameter for it. You might need overlapping instances to make this work in general, which I'd not be too happy about.

@meteficha
Member

It's actually even worse! Try this using type Hash = Text instead of a newtype ;-). Or even importing Data.Text qualified.

Hmmm, AFAICS the type class approach would need (many?) changes to both persistent and persistent-mysql. As you can see, persistent-mysql only gets a textual references to the types (check this).

In order to improve consistency, perhaps we should add SqlType to AddUniqueConstraint. Instead of adding the implicit (200) index length based on the type name, we'd add it based on whether the SqlType was SqlString/SqlBlob or something else. This change would not break any current code, but would make code like yours work out-of-the-box. What do you think?

@ezyang
ezyang commented Oct 29, 2012

I'd feel better about it if the user was forced to specify it when they add a unique constraint, but that seems basically OK.

@meteficha
Member

I'll be honest and say that I won't be able to implement this in the next few weeks at least. I'll gladly review and merge pull requests if you're interested, though =). It should be easy, I guess, just wiring things up.

@ezyang
ezyang commented Oct 29, 2012

That's fine; you've told me how to work around it and I'm not blocking on the bug. Just hope it gets fixed eventually.

@gregwebs gregwebs added the MySQL label Aug 4, 2014
@snoyberg
Member

Closing out old issues, please reopen if still relevant

@snoyberg snoyberg closed this Jul 19, 2015
@gregwebs
Member

this shortcoming should be documented

@gregwebs gregwebs reopened this Jul 20, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment