Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[persistent-mysql] Use an "INSERT IGNORE" query if the updates are empty. #702

Merged
merged 6 commits into from
Sep 21, 2017

Conversation

parsonsmatt
Copy link
Collaborator

The current behavior of insertManyOnDuplicateKeyUpdate will perform an insertMany_ if there are no updates to perform. insertMany_ will then throw an error if the key already exists in the database.

This is surprising -- typically, when you're doing a bulk upsert like this, the desired behavior is "Try to insert these rows into the database. If there's a key collision, then perform the set of updates." Given an empty set of updates, I would expect nothing to be done to the preexisting record, not throwing an exception and rolling back the entire insert.

MySQL's INSERT IGNORE is a little unfortunate: docs

If you use the IGNORE modifier, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.
IGNORE has a similar effect on inserts into partitioned tables where no partition matching a given value is found. Without IGNORE, such INSERT statements are aborted with an error. When INSERT IGNORE is used, the insert operation fails silently for rows containing the unmatched value, but inserts rows that are matched. For an example, see Section 19.2.2, “LIST Partitioning”.
Data conversions that would trigger errors abort the statement if IGNORE is not specified. With IGNORE, invalid values are adjusted to the closest values and inserted; warnings are produced but the statement does not abort. You can determine with the mysql_info() C API function how many rows were actually inserted into the table.

I believe the unfortunate bits of this API are hidden behind the guarantees that Persistent can make about the query we're generating.

@parsonsmatt
Copy link
Collaborator Author

This PR is a subset of #701 and can be closed without merging if that PR is merged. PR #701 needed to incorporate these changes in order to unentangle the PersistEntityBackend record ~ BaseBackend backend constraint arising from insertMany_, which would block the use in persistent-typed-db.

If you choose not to merge #701, then merging this PR and Esqueleto merging the BackendCompatible class will work to get persistent-typed-db fully compatible.

@parsonsmatt
Copy link
Collaborator Author

I'm getting a weird test failure on the MySQL tests in an unrelated module.

 src/DataTypeTest.hs:66: 
  1) data type specs handles all types
       expected: ("time",14:19:40)
        but got: ("time",14:19:41)

The test handles MySQL specially:

roundTime :: TimeOfDay -> TimeOfDay
#ifdef WITH_MYSQL
roundTime (TimeOfDay h m s) = TimeOfDay h m (fromIntegral (truncate s :: Integer))
#else
roundTime = id
#endif

Seems like updating the Travis image might have changed an expectation on how MySQL does times.

When inserting values into the database, we use truncateTimeOfDay, but then we use roundTime when we compare. I suspect that this might be causing some of the difference.

@paul-rouse
Copy link
Contributor

paul-rouse commented Sep 16, 2017

The mysql test seems to have been broken by Travis recently upgrading their default Ubuntu images to "trusty". As a result, MySQL will have been upgraded from 5.5.X to 5.7.X, assuming travis uses the standard versions for the distro.

Before MySQL 5.6.4, fractional seconds were not supported, and, if I understand correctly, any fractional part in a literal was simply ignored, thereby truncating to the whole second below - documented here.

Since 5.6.4 the default is to use whole-second precision, but, crucially, excess precision in literals is now rounded, as described in the 5.7 documentation here.

So, MySQL itself has made a totally incompatible change. The persistent test did the right thing for the old beviour, but now fails in the 50% of cases when the rounding goes upward. Great:-(

I'll see if I can come up with a way of fixing the test...

@parsonsmatt
Copy link
Collaborator Author

MySQL, you bring me so much joy 😡

Thanks for posting the rationale there. I figured it was a Travis change.

I replaced truncate with round in my other branch, which appears to have fixed the test failure for MySQL.

@paul-rouse
Copy link
Contributor

I realise #701 includes this change, but I feel more competent to comment on the MySQL part of it for now.

INSERT IGNORE is indeed a can of worms, and one thing about it does bother me: it suppresses errors you would expect in strict mode when invalid values are inserted. Persistent does not protect against all of these. A nasty example is trying to insert data which is too long for the default text or blob fields (which are only 64k) - I was bitten by that particular one in the days before the Yesod scaffolding added the code to set the MySQL session mode to STRICT_ALL_TABLES.

It is a little worrying that a behind-the-scenes use of INSERT IGNORE could nullify an explicit strict-mode setting!

Is it possible to use INSERT ... ON DUPLICATE KEY UPDATE instead? You have to supply an update, but it can be a no-op. For example, if the table foo has a column bar you can say INSERT INTO foo ... ON DUPLICATE KEY UPDATE bar = bar.

BTW I don't understand the statement in the docs about InnoDB tables and auto-increment columns - I have just tried it and I don't see that behaviour!

@parsonsmatt
Copy link
Collaborator Author

That's a fantastic point that I hadn't considered. Thanks! I'll modify the PR to use ON DUPLICATE KEY UPDATE with a dummy field set.

@parsonsmatt
Copy link
Collaborator Author

Hooray for spurious CI failures 😂

This is passing all tests now, using the dummy update.

@paul-rouse
Copy link
Contributor

Thanks for including the changelog note as well! I suggest I merge this now, and then if you could rebase #701 on this, we can look at the new type class separately, and ask one of the others for comments (I can't see any objections myself).

@paul-rouse paul-rouse merged commit bd65fcb into yesodweb:master Sep 21, 2017
@parsonsmatt
Copy link
Collaborator Author

Thanks! Will do :)

naushadh added a commit to naushadh/persistent that referenced this pull request Oct 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants