Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL on duplicate key update #674

Merged
merged 10 commits into from
Jun 19, 2017

Conversation

parsonsmatt
Copy link
Collaborator

MySQL has a native upsert in INSERT ... ON DUPLICATE KEY UPDATE .... This PR introduces that functionality to persistent-mysql.

It also introduces a bulkInsertOnDuplicateKeyUpdate function, which has an API I'd like feedback on.

=> [record] -- ^ A list of the records you want to insert, or update
-> [SomeField record] -- ^ A list of the fields you want to copy over.
-> [Update record] -- ^ A list of the updates to apply that aren't dependent on the record being inserted.
-> SqlPersistT m ()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function would be used like:

bulkInsertOnDuplicateKeyUpdate
  [ {- a big ol' list of records you want to insert/update -} ]
  [ SomeField UserName, SomeField UserEmail ] -- this copies the values that are being inserted to existing records
  [ UserModified =. now, UserEncounted +=. 1 ] -- this update is performed for any field that matches the inserted records

@parsonsmatt parsonsmatt changed the title [WIP] MySQL on duplicate key update MySQL on duplicate key update May 31, 2017
@gregwebs
Copy link
Member

gregwebs commented Jun 1, 2017

Travis CI is pointint to some import issues on older versions

@parsonsmatt parsonsmatt changed the title MySQL on duplicate key update [WIP] MySQL on duplicate key update Jun 1, 2017
@parsonsmatt parsonsmatt changed the title [WIP] MySQL on duplicate key update MySQL on duplicate key update Jun 1, 2017
@parsonsmatt
Copy link
Collaborator Author

@gregwebs Alright, CI is building now 😄 Any comments on the API or implementation?

@psibi
Copy link
Member

psibi commented Jun 2, 2017

Minor comment: I would prefer the name inserManyOnDuplicateKeyUpdate instead of the current one as that is roughly equivalent to the current naming convention if I'm not wrong.

@parsonsmatt
Copy link
Collaborator Author

👍 Good call @psibi. I've implemented that change.

--
-- The third parameter is a list of updates to perform that are independent of
-- the value that is provided. You can use this to increment a counter value.
-- These updates only occur if the original record is present in the database.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for the third parameter is kind of confusing to me

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little difficult to explain. If you pass something like

insertManyOnDuplicateKeyUpdate _ [SomeField UserName] [UserAge +=. 1]

then it generates the following clause in the Update:

INSERT INTO ... VALUES ...
ON DUPLICATE KEY UPDATE ... 
  `user`.`name` = VALUES(`user`.`name`)
  `user`.`age` = `user`.`age` + 1

So the name gets copied from whatever record was being inserted and had a key collision, and the age gets incremented by 1.

Since the Update field requires a specific value, that means that we can't make it dependent on the record that we're updating. MySQL would support something like

ON DUPLICATE KEY UPDATE
  user.age = VALUES(user.age) + 2

which would add 2 to the record that hit the duplicate key. There's not a good way to make the update have more complicated values, unfortunately, so this feature would be difficult to incorporate as-is.

Multiply -> T.concat [n, "=", n, "*?"]
Divide -> T.concat [n, "=", n, "/?"]
BackendSpecificUpdate up ->
error . T.unpack $ "BackendSpecificUpdate" <> up <> "not supported"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a space after BackendSpecificUpdate and before not supported?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably! It's vendored from part of the update function in persistent. I could push it further upstream and refactor that code to use it as well.

commaSeparated :: [Text] -> Text
commaSeparated = T.intercalate ", "

parenWrapped :: Text -> Text
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions make the code much more readable, nice idea

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I can use them in the other instances of these if you'd like.

commaSeparated = T.intercalate ", "

parenWrapped :: Text -> Text
parenWrapped = ("(" <>) . (<> ")")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it's a huge deal since this function is so small, but is point free really better here? It's longer than just parenWrapped t = "(" <> t <> ")"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever I get the chance, I try to make smiley faces in code. Tbh T.concat ["(", t, ")"] is the tiniest bit cleaner and probably more efficient.

@@ -720,6 +724,7 @@ showColumn (Column n nu t def _defConstraintName maxLen ref) = concat
Just s -> -- Avoid DEFAULT NULL, since it is always unnecessary, and is an error for text/blob fields
if T.toUpper s == "NULL" then ""
else " DEFAULT " ++ T.unpack s
{-# LANGUAGE GADTs #-}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentionally here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, no, that is definitely a mistake

@MaxGabriel
Copy link
Member

LGTM

@gregwebs
Copy link
Member

gregwebs commented Jun 4, 2017

awesome! Can this be used to fill out upsert in the typeclass?

@parsonsmatt
Copy link
Collaborator Author

Unfortunately not, as this doesn't return the updated records. We'd need something like upsert_ :: record -. [Update record] -> SqlPersistT m () for this to make it into the class.

@gregwebs
Copy link
Member

gregwebs commented Jun 4, 2017

That could make sense to add upsert_ then. Could you also add a SELECT in the same transaction to get the upsert behavior though?

@parsonsmatt
Copy link
Collaborator Author

Ooh, I think that would work. I'm not sure it'd be any more efficient than the current default implementation which does a SELECT and then an UPDATE if present and INSERT if missing -- still two queries.

@psibi
Copy link
Member

psibi commented Jun 5, 2017

@parsonsmatt I just looked on the code. Is there any reason why native upsert for mysql has been done like this instead of going via the typeclass route: https://www.stackage.org/haddock/lts-8.17/persistent-2.6.1/Database-Persist-Sql.html#t:SqlBackend

The function connUpsertSql there has been specifically created for implementing native upsert feature for backends. In fact, Postgres right now uses that. Sorry for commenting about this so late.

@parsonsmatt
Copy link
Collaborator Author

@psibi The type class function upsert returns the updated or inserted record. MySQL's ON DUPLICATE KEY UPDATE doesn't have any way of returning a value, so we can't do that. If the type class had upsert_ :: rec -> [Update rec] -> SqlPersistT m (), then this would fit that signature.

@gregwebs
Copy link
Member

gregwebs commented Jun 5, 2017

@psibi I am in favor of adding upsert_.

@psibi
Copy link
Member

psibi commented Jun 7, 2017

Yeah, upsert_ would be good. It can be probably added as a typeclass method and a custom function be passed as a part of SqlBackend type which will generate the appropriate query for the specific backend. For those backends, for which the function hasn't been passed, upsert_ can simply be defined as upsert >> return ().

@gregwebs
Copy link
Member

@psibi can you merge and release this as is?

@psibi psibi merged commit 3d7ff00 into yesodweb:master Jun 19, 2017
@psibi
Copy link
Member

psibi commented Jun 19, 2017

persistent-mysql-2.6.1 has been released with the relevant CHANGELOG in Hackage.

naushadh added a commit to naushadh/persistent that referenced this pull request Aug 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants