Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(postgres): enabling the updateOnDuplicate BulkCreate option #11163

Merged
merged 7 commits into from
Jul 27, 2019

Conversation

thaddeusmt
Copy link
Contributor

@thaddeusmt thaddeusmt commented Jul 8, 2019

Pull Request check-list

Please make sure to review and check all of these items:

  • Does npm run test or npm run test-DIALECT pass with this change (including linting)?
  • Does the description below contain a link to an existing issue (Closes #[issue]) or a description of the issue you are solving?
  • Have you added new tests to prevent regressions?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Did you follow the commit message conventions explained in CONTRIBUTING.md?

Description of change

I have attempted to enable the BulkCreate() updateOnDuplicate option for the Postgres dialect using the INSERT ... ON CONFLICT DO NOTHING/UPDATE syntax added in Postgres 9.5. This essentially enables "upsert" functionality on "BulkCreate" operations (as is currently provided for the MySql and Mariadb dialects with the ON DUPLICATE KEY UPDATE functionality).

NOTE: The Postgres upsert() functionality still uses the INSERT EXCEPTION WHEN method, I did not refactor that to use ON CONFLICT.

The Postgres ON CONFLICT command requires either listing out the columns to key conflicts on, or using an actual unique constraint by name. This easiest way I could see to support both primary key and "unique" constraints in the model definitions was to add some logic in the bulkInsertQuery() method that looks up those columns, and then add to the ON CONFLICT command directly. By default it uses the primary key columns, but if there is a separate unique constraint in the model definition it will use that instead.

I added an integration test called should support the updateOnDuplicate option with primary keys to test the primary key case, since the existing should support the updateOnDuplicate option test only checked unique constraints. The unit and integration tests are passing for Postgres.

I am only set up to run the tests with the Postgres dialect right now, so hopefully I didn't break any other dialects...

I think this might fix issues #4132 and #6325 and #3354 and #4324.

It appears this outstanding PR #6325 was attempting to do this as well.

This is my first attempt at a PR on a project like this, please advise on how I can improve my submission, thank you.

@codecov
Copy link

codecov bot commented Jul 8, 2019

Codecov Report

Merging #11163 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #11163      +/-   ##
==========================================
+ Coverage   96.34%   96.34%   +<.01%     
==========================================
  Files          94       94              
  Lines        9023     9029       +6     
==========================================
+ Hits         8693     8699       +6     
  Misses        330      330
Impacted Files Coverage Δ
lib/dialects/mysql/index.js 100% <ø> (ø) ⬆️
lib/dialects/postgres/index.js 100% <ø> (ø) ⬆️
lib/dialects/mariadb/index.js 100% <ø> (ø) ⬆️
lib/model.js 96.69% <100%> (ø) ⬆️
lib/dialects/abstract/query-generator.js 97.6% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d6fa05...066b5f0. Read the comment docs.

if (this._dialect.supports.inserts.updateOnDuplicate == ' ON CONFLICT DO UPDATE SET') { // postgres
// If no conflict target columns were specified, use the primary key names from options.upsertKeys
onDuplicateKeyUpdate = ` ON CONFLICT (${options.upsertKeys.map(attr => {
return `"${attr}"`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one can be inlined eg. => `...`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also use this.quoteIdentifier(<attr>) instead of "<attr>"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit 8381105 I updated with quoteIdentifier() (oops).

I am not sure how best to inline this though? Especially since I added the quoteIdentifier() call? My JS-fu is weak, there is probably some classic syntax I am not familiar with. I was following the existing code example which looks like this:

onDuplicateKeyUpdate = `${this._dialect.supports.inserts.updateOnDuplicate} ${options.updateOnDuplicate.map(attr => {
  const key = this.quoteIdentifier(attr);
  return `${key}=VALUES(${key})`;
}).join(',')}`;

@@ -28,7 +28,8 @@ MariadbDialect.prototype.supports = _.merge(
schemas: true,
inserts: {
ignoreDuplicates: ' IGNORE',
updateOnDuplicate: true
updateOnDuplicate: ' ON DUPLICATE KEY UPDATE'
//updateOnDuplicate: true // now using a string instead of boolean, like ignoreDuplicates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed and written in the changelog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these in commit 8381105. Any guidance on how to update the changelog? Is there a manually generated changelog in the docs somewhere? Thanks

@thaddeusmt
Copy link
Contributor Author

What's the next step? Do I just need to update the Documentation? Or is there anything else I should improve? Thanks

onConflictDoNothing: ' ON CONFLICT DO NOTHING'
onConflictDoNothing: ' ON CONFLICT DO NOTHING',
updateOnDuplicate: ' ON CONFLICT DO UPDATE SET'
// updateOnDuplicate: true // this is why mysql does
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment

@papb
Copy link
Member

papb commented Jul 19, 2019

@thaddeusmt Hello! Thanks for the PR! Please update the docs: here and here.

@bmeriaux
Copy link
Contributor

the same refactor on the upsert method and it will be a major improvement for postgres insert performance in sequelize, nice work!

@papb papb added status: awaiting response For issues and PRs. OP must respond (or change something, if it is a PR). Maintainers have no action type: feature For issues and PRs. For new features. Never breaking changes. labels Jul 24, 2019
@papb papb mentioned this pull request Jul 26, 2019
7 tasks
@papb papb added the dialect: postgres For issues and PRs. Things that involve PostgreSQL (and do not involve all dialects). label Jul 26, 2019
@thaddeusmt
Copy link
Contributor Author

thaddeusmt commented Jul 26, 2019

@thaddeusmt Hello! Thanks for the PR! Please update the docs: here and here.

I missed those ESDoc comments, thanks. I updated them to add that Postgres 9.5 is supported.

@thaddeusmt
Copy link
Contributor Author

the same refactor on the upsert method and it will be a major improvement for postgres insert performance in sequelize, nice work!

I have thought about replacing the Postgres upsert implementation to use ON CONFLICT instead of INSERT EXCEPTION WHEN - if I do that I think I'll open a new PR, so if it takes me a long time to motivate this one isn't blocked.

@papb
Copy link
Member

papb commented Jul 26, 2019

@thaddeusmt Hi, thanks for the changes. I have one more request. I am bothered by the specific, explicit mention of Postgres 9.5. This makes it seem that no other version whatsoever is supported. Is this true? Is there a version of Postgres that is not supported?

@sushantdhiman
Copy link
Contributor

ON CONFLICT is not supported by versions below 9.5

@papb
Copy link
Member

papb commented Jul 27, 2019

@sushantdhiman Ah, so by 9.5 it is meant "9.5 and above"? I think that deserves clarification

@sushantdhiman
Copy link
Contributor

Ah, so by 9.5 it is meant "9.5 and above"? I think that deserves clarification

Docs updated

@papb papb self-requested a review July 27, 2019 05:44
@sushantdhiman
Copy link
Contributor

🎉 This PR is included in version 5.11.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@prachikhadke
Copy link

Been trying to use this but keep getting an error Unhandled rejection SequelizeDatabaseError: syntax error at or near ";" DO UPDATE SET ;. Any idea what could be the issue?

@papb
Copy link
Member

papb commented Aug 16, 2019

@prachikhadke You should open a new issue with a SSCCE/MCVE/reprex so we can help you 👍

@thaddeusmt
Copy link
Contributor Author

@prachikhadke Link me to (or tag me in) an Issue with some details for reproducing and I'll take a look. I'm curious:

  • What does the Sequelize model init/creation look like (how are you defining Primary and Unique keys)
  • What does your BulkCreate() call look like?

Thanks

@prachikhadke
Copy link

prachikhadke commented Aug 19, 2019

ObjectModel.bulkCreate(objectsToInsertOrUpdate, { transaction, returning: false })

Where ObjectModel defines the object model. There is only one primary key field - an id field. No other unique keys.

@thaddeusmt
Copy link
Contributor Author

@prachikhadke The first thing that jumps out is that you are not setting updateOnDuplicate: true on the bulkCreate() options? That enables the feature in this pull request. But perhaps this bug is that I broke the default non-update functionality?

It might still be useful to see how the model is created. Is the primary key id automatically defined, or is it defined with primaryKey: true in the column definition? There could be some case I missed where primary key is defined in a way that it doesn't appear when I query model.primaryKeys (and model.uniqueKeys) to build the ON CONFLICT SQL clause.

@prachikhadke
Copy link

@thaddeusmt I did try setting the updateOnDuplicate: true option after upgrading to the latest sequelize version. That's when I got the error. The id field does have the primaryKey: true. The primary key is not being set using a sequence but is set as part of the record.

@papb
Copy link
Member

papb commented Aug 20, 2019

@prachikhadke Please open a new bug report for this, instead of discussing here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dialect: postgres For issues and PRs. Things that involve PostgreSQL (and do not involve all dialects). released status: awaiting response For issues and PRs. OP must respond (or change something, if it is a PR). Maintainers have no action type: feature For issues and PRs. For new features. Never breaking changes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants