Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[persistent-mysql] Add support for conditional copying #693

Merged
merged 17 commits into from
Sep 20, 2017

Conversation

parsonsmatt
Copy link
Collaborator

This PR extends the insertManyOnDuplicateKeyUpdate function to allow conditional copying of data from the input record.

Use case: Suppose you're importing a report that sometimes has partial data, and you want to copy the report into the database. The result record may have many Nothing fields. If we copy this using the current SomeField, then we will overwrite the existing data with NULLs. Oops.

The new constructor and functions permit us to avoid copying bad data, including NULL and empty strings, into the database, potentially overriding already present data.

@parsonsmatt parsonsmatt changed the title [persistent-mysql] Add support for conditional copying [persistent-mysql] [WIP] Add support for conditional copying Sep 1, 2017
@parsonsmatt parsonsmatt changed the title [persistent-mysql] [WIP] Add support for conditional copying [persistent-mysql] Add support for conditional copying Sep 1, 2017
@psibi
Copy link
Member

psibi commented Sep 1, 2017

Sorry, but can you actually expand the use case you have explained with an example ? I'm having a difficult time following the problem this solves.

Also in the code, you seem to have deleted the existing haddock comment for the type SomeField. Has it become outdated or changed with this feature you have added ?

@parsonsmatt
Copy link
Collaborator Author

Sorry, but can you actually expand the use case you have explained with an example ? I'm having a difficult time following the problem this solves.

Sure thing!

Let's say you've got a CSV row that maps to a database column. The CSV only ever contains changes to the value set, instead of the full values.

{- 
in a data definition file:

Row
  productName Text
  productPrice Double Maybe
  productDescription Text Maybe
  productQuantity Int Maybe

  Primary productName
-}
csvRowHeader = "productName,productPrice,productDescription,productQuantity"
csvRowExample = "bestStuff,3.99,,3"

csvRowParsed :: Row
csvRowParsed = Row "bestStuff" (Just 3.99) Nothing (Just 3)

If we did a bulk insert on these sorts of rows, then we'd be copying NULL values into the database, overwriting whatever useful values were there before. This isn't great!

So instead, we can do insertManyOnDuplicateKeyUpdate rows [copyUnlessNull RowProductPrice, copyUnlessNull RowProductDescription, copyUnlessNull RowProductQuantity]. This will:

  1. Check if the row is present. If it is not, just do an insert. Otherwise:
  2. Check if the value in the VALUES to be inserted is equal to NULL. If it is, then it doesn't alter it.
  3. If the new value to be inserted is non-null, then we copy the value into the existing database table.

Also in the code, you seem to have deleted the existing haddock comment for the type SomeField. Has it become outdated or changed with this feature you have added ?

I feel like it better reflects what the type is actually for. I'd be happy to make it clearer or add examples.

CopyUnlessEq :: PersistField typ => EntityField record typ -> typ -> SomeField record
-- ^ Only copy the field if it is not equal to the provided value.

-- | Copy the field into the database only if the value in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you bump the cabal version, update changelog and add the @since haddock syntax for all the functions you have added here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! Is adding a constructor like this a minor version bump?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to bump it to 2.6.2. More info here: http://www.snoyman.com/blog/2017/06/how-to-send-me-a-pull-request

-- corresponding record is non-empty, where "empty" means the Monoid
-- definition for 'mempty'. Useful for 'Text', 'String', 'ByteString', etc.
copyUnlessEmpty :: (Monoid typ, PersistField typ) => EntityField record typ -> SomeField record
copyUnlessEmpty field = CopyUnlessEq field mempty

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of the introduced functions are only seem to be applicable for insertManyOnDuplicateKeyUpdate. Can you mention that in the haddock comments? Also can you add the example you mentioned here: #693 (comment) and explain it's usecase in a similar way we have done it here: persistent doc

@psibi
Copy link
Member

psibi commented Sep 1, 2017

Thanks for the explanation, that was helpful.

  1. If the new value to be inserted is non-null, then we copy the value into the existing database table.

You mean update the existing row ?

I have left my other comments in the code review.

@parsonsmatt
Copy link
Collaborator Author

You mean update the existing row ?

Yes, that's correct.

I've added the documentation requested along with version bump. Thanks!

@psibi
Copy link
Member

psibi commented Sep 3, 2017

LGTM. I will merge this once it is reviewed by one more person. CC: @gregwebs @snoyberg and others.

@parsonsmatt
Copy link
Collaborator Author

The travis build is failing with

persistent-test: ConnectionError {errFunction = "connect", errNumber = 1044, errMessage = "Access denied for user 'travis'@'%' to database 'persistent'"}
Process exited with ExitFailure 1: /home/travis/build/yesodweb/persistent/.stack-work/install/x86_64-linux/lts-2.22/7.8.4/bin/persistent-test

Any idea what the fix for that might be?

@paul-rouse
Copy link
Contributor

About travis: the last successful run seems to have been on precise, but they have recently finished changing the default to trusty: https://blog.travis-ci.com/2017-08-31-trusty-as-default-status.

I'm guessing we are suffering a trysty/mysql problem along the lines of travis-ci/travis-ci#8331. The images are being updated on Wednesday - https://docs.travis-ci.com/user/build-environment-updates/2017-09-06/ - so perhaps it will work then!

Copy link
Member

@snoyberg snoyberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed this, and don't see any big problems, but I'm also not that familiar with either the code or the functionality in MySQL. I'd strongly recommend including some integration tests to ensure this is working as expected.

-- ^ Copy the field directly from the record.
CopyUnlessEq :: PersistField typ => EntityField record typ -> typ -> SomeField record
-- ^ Only copy the field if it is not equal to the provided value.
-- /Since 2.6.2/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend using the @since 2.6.2 syntax

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL, thanks!

@parsonsmatt
Copy link
Collaborator Author

I'll implement the documentation example as a test.

@parsonsmatt
Copy link
Collaborator Author

I made a change to the implementation. I removed the CopyUnlessEq data constructor from the export list and am exporting a function instead.

I'm pretty sure I'm going to want to expand this to copyUnlessIn :: PersistField rec typ -> [typ] -> SomeField rec in the near future, and then CopyUnlessEq becomes a special case of that.

Copy link
Contributor

@paul-rouse paul-rouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think this test change is wrong. The point about truncateTimeOfDay is that most databases handle microseconds, so we just truncate any picoseconds. roundTime truncates to whole seconds for a completely different purpose, namely as a hack for MySQL, and this has now become a load more complicated.

@parsonsmatt
Copy link
Collaborator Author

Good call, I didn't understand exactly what was going on there. I've reverted that change and used round instead of truncate, which should give us the right answers, as MySQL appears to be doing a round instead of truncate.

I can also revert the unrelated test changes and leave that for another PR, if you're OK with an unrelated red test.

@parsonsmatt
Copy link
Collaborator Author

et tu, sqlite? 😛

Well, MySQL passes, and SQLite is just dying for no reason now.

@paul-rouse
Copy link
Contributor

I should leave the tests as they were, and we'll live with them failing for now. I'd like a little while to think about persistent-mysql itself, now that fractional seconds could be supported. If we (optionally?) changed the mapping to SQL types so as to include the microseconds, the problem would go away. It might cause too much breakage, though, and that's what I would like to think through - any comments are welcome, of course!

@parsonsmatt
Copy link
Collaborator Author

I've reverted the test changes.

@parsonsmatt
Copy link
Collaborator Author

Is there anything I can do to help get this merged? :)

@paul-rouse
Copy link
Contributor

OK by me - are you happy too @psibi ? We know why the test fails - for unrelated reasons - and I am addressing that separately.

@psibi
Copy link
Member

psibi commented Sep 20, 2017

@paul-rouse LGTM. Please go ahead and merge.

@paul-rouse paul-rouse merged commit b192b73 into yesodweb:master Sep 20, 2017
@paul-rouse
Copy link
Contributor

Thanks both of you!

naushadh added a commit to naushadh/persistent that referenced this pull request Oct 8, 2017
Added minor README enchancement.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants