Add insert_many to ActiveRecord models #35077

Merged
merged 1 commit into from Mar 5, 2019

Conversation

@boblail
Contributor

@boblail boblail commented Jan 29, 2019

Options

  • [:returning]
    (Postgres-only) An array of attributes that should be returned for all successfully inserted records. For databases that support INSERT ... RETURNING, this will default to returning the primary keys of the successfully inserted records. Pass returning: %w[ id name ] to return the id and name of every successfully inserted record or pass returning: false to omit the clause.

  • [:unique_by]
    (Postgres and SQLite only) In a table with more than one unique constraint or index, new records may be considered duplicates according to different criteria. For MySQL, an upsert will take place if a new record violates any unique constraint. For Postgres and SQLite, new rows will replace existing rows when the new row has the same primary key as the existing row. By defining :unique_by, you can supply a different key for matching new records to existing ones than the primary key.

    (For example, if you have a unique index on the ISBN column and use that as the :unique_by, a new record with the same ISBN as an existing record will replace the existing record but a new record with the same primary key as an existing record will raise ActiveRecord::RecordNotUnique.)

    Indexes can be identified by an array of columns:

    unique_by: { columns: %w[ isbn ] }

    Partial indexes can be identified by an array of columns and a :where condition:

    unique_by: { columns: %w[ isbn ], where: "published_on IS NOT NULL" }
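
As a rough illustration of the two forms above, here is a minimal sketch of how a :unique_by hash of this shape could be turned into the text that follows ON CONFLICT in Postgres or SQLite. The helper name `conflict_target_sql` and its quoting are assumptions made for this example, not code from this PR.

```ruby
# Hypothetical helper: turns a unique_by hash into a conflict target.
# Illustrative only; not part of Active Record.
def conflict_target_sql(unique_by)
  columns = Array(unique_by.fetch(:columns)).map { |name| %("#{name}") }
  sql = "(#{columns.join(', ')})"
  # A partial index is matched by repeating its predicate after the columns.
  sql += " WHERE #{unique_by[:where]}" if unique_by[:where]
  sql
end

puts conflict_target_sql(columns: %w[ isbn ])
puts conflict_target_sql(columns: %w[ isbn ], where: "published_on IS NOT NULL")
```

The :where string is passed through untouched, so in this sketch it would have to match the predicate of the partial index for the database to pick that index.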

Examples

# Insert multiple records, performing an upsert when records have duplicate ISBNs
# ('Eloquent Ruby' will overwrite 'Rework' because its ISBN is duplicate)
Book.upsert_all([
  { title: 'Rework', author: 'David', isbn: '1' },
  { title: 'Eloquent Ruby', author: 'Russ', isbn: '1' }
], unique_by: { columns: %w[ isbn ] })
@boblail boblail force-pushed the boblail:insert_many branch 2 times, most recently from acb4250 to 261dde9 Jan 29, 2019
Member

@eileencodes eileencodes left a comment

I've always wanted bulk inserts - not sure if there's a reason we haven't written this feature in the past.

I left some comments for changes I think this needs. Can you also run a benchmark for inserting multiple records with many insert calls vs insert_many?

@boblail boblail force-pushed the boblail:insert_many branch from 261dde9 to 5b9a2ea Feb 1, 2019
@dhh
Member

@dhh dhh commented Feb 1, 2019

A few notes on the API:

  1. I don't think #insert_all should take anything but an array. If we want a raw insert method for a single record, it should have its own method. Could just be #insert.
  2. I'd rather break out the on_conflict options into a few different things. First, we can do skip_duplicates: true for that part. Second, we can have an explicit method called #upsert/upsert_all to handle upserts. Still need an api for how to declare the index logic, but it'll be less nested.
@boblail boblail force-pushed the boblail:insert_many branch from 5b9a2ea to 12ca9e9 Feb 2, 2019
@boblail
Contributor Author

@boblail boblail commented Feb 2, 2019

David, thanks so much for the code review! Great feedback!


@dhh and @eileencodes, I reworked this and pushed up a new commit. Here’s a summary of my changes:

  • renamed the method insert_all
  • clarified docs and behavior of the returning option
  • added more tests
  • split :on_conflict into :on_duplicate and :conflict_target (more below)

:on_duplicate and :conflict_target

New usage example
Book.insert_all books, on_duplicate: :raise  # default behavior — plain ol' INSERT
Book.insert_all books                        # same as above
Book.insert_all books, on_duplicate: :skip   # skips duplicates
Book.insert_all books, on_duplicate: :update # upsert

# conflict_target specifies an index to handle conflicts for. This is like saying:
#   - raise if we're inserting a duplicate primary key
#   - skip if we're inserting a duplicate ISBN
Book.insert_all books,
  on_duplicate: :skip,
  conflict_target: { columns: %w{ isbn }, where: "published_on IS NOT NULL" }
Notes
  • I really like splitting :on_conflict into :on_duplicate and :conflict_target for two reasons:
    1. It's easier to express the default strategy with a value, :raise
    2. It's easier to express what a database adapter supports (MySQL supports the :skip and :update strategies using different syntax; but it doesn't support specifying a :conflict_target)
  • I picked the name :conflict_target because it's the name Postgres's docs use.
  • :conflict_target is still a nested hash, but it feels more similar to the options you can pass to :index or :foreign_key in TableDefinition#column
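
To make the three :on_duplicate strategies concrete, here is a toy in-memory model that uses a plain Hash keyed by primary key in place of a database table. The method `insert_rows` and the `DuplicateKey` error are hypothetical names for this sketch and only mimic the intended semantics.

```ruby
# Toy stand-in for ActiveRecord::RecordNotUnique.
class DuplicateKey < StandardError; end

# Inserts rows into a Hash keyed by :id, handling duplicates per strategy.
# Unlike a real INSERT statement, rows processed before a failure are kept.
def insert_rows(table, rows, on_duplicate: :raise)
  rows.each do |row|
    id = row.fetch(:id)
    if table.key?(id)
      case on_duplicate
      when :raise  then raise DuplicateKey, "duplicate id #{id}"
      when :skip   then next
      when :update then table[id] = table[id].merge(row)
      else raise ArgumentError, "unknown strategy #{on_duplicate.inspect}"
      end
    else
      table[id] = row
    end
  end
  table
end

books = { 1 => { id: 1, title: "Rework" } }
incoming = [{ id: 1, title: "Refactoring" }, { id: 2, title: "Eloquent Ruby" }]

puts insert_rows(books.dup, incoming, on_duplicate: :skip)[1][:title]
puts insert_rows(books.dup, incoming, on_duplicate: :update)[1][:title]
```

With :skip the existing title survives, with :update the incoming row wins, and the default :raise stops at the first duplicate.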

Command Object

I'm totally open to extracting DatabaseStatements#insert_all to a Command Object. I tried two other refactors first (consolidating input validation and extracting prepare_keys_and_values_for_insert). Now all but the last 3 lines of the method have one responsibility, validating/normalizing input. I'm curious what your thoughts are about that!

Separate methods

There are three strategies supported here: vanilla-insert, insert-skip-duplicates, and upsert. I found, when trying to tease them apart, that they share 90% of the same concerns. Most of the heavy lifting is just constructing a bulk INSERT statement; but even :conflict_target is a concern shared by both the insert-skip-duplicates and upsert strategies. I think it makes sense to have a common insert_all method (at least on the connection). Would you like me to add three shorthand methods to Persistence? I'm thinking:

# See <tt>ActiveRecord::Persistence#insert_all</tt> for documentation.
def upsert(attributes, options = {})
  insert(attributes, options.merge(on_duplicate: :update))
end

# See <tt>ActiveRecord::Persistence#insert_all</tt> for documentation.
def upsert_all(attributes, options = {})
  insert_all(attributes, options.merge(on_duplicate: :update))
end

# See <tt>ActiveRecord::Persistence#insert_all</tt> for documentation.
def insert(attributes, options = {})
  insert_all([ attributes ], options)
end

👆 In particular, is that an OK way of documenting them?

@boblail
Contributor Author

@boblail boblail commented Feb 2, 2019

Performance

Can you also run a benchmark for inserting multiple records with many insert calls vs insert_many?

I ran this test code:

def test_insert_performance
  books = 1_000.times.map { { name: "Rework" } }
  Benchmark.bmbm do |x|
    x.report("create") { books.each { |book| Book.create!(book) } }
    x.report("insert") { books.each { |book| Book.insert_all([book]) } }
    x.report("insert_all") { Book.insert_all(books) }
  end
end

and got these results:

mysql2
--------------------------------------------------------
                 user     system      total        real
create       0.591878   0.074504   0.666382 (  0.775443)
insert       0.157255   0.025772   0.183027 (  0.251659)
insert_all   0.014531   0.000114   0.014645 (  0.030236)

sqlite3
--------------------------------------------------------
                 user     system      total        real
create       0.619300   0.036571   0.655871 (  0.656279)
insert       0.174633   0.012864   0.187497 (  0.187530)
insert_all   0.018640   0.000192   0.018832 (  0.018902)

postgresql
--------------------------------------------------------
                 user     system      total        real
create       0.642238   0.076162   0.718400 (  0.941469)
insert       0.189128   0.026410   0.215538 (  0.376370)
insert_all   0.016598   0.000107   0.016705 (  0.033562)

I also ran it with 100 books and the ratios were about the same.
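
For readers who want the ratios spelled out, this small script recomputes them from the "real" column of the results above. The constant and method names are made up for the example; only the timings come from the benchmark output.

```ruby
# Wall-clock ("real") seconds copied from the benchmark results above.
REAL_SECONDS = {
  "mysql2"     => { create: 0.775443, insert: 0.251659, insert_all: 0.030236 },
  "sqlite3"    => { create: 0.656279, insert: 0.187530, insert_all: 0.018902 },
  "postgresql" => { create: 0.941469, insert: 0.376370, insert_all: 0.033562 }
}.freeze

# How many times faster one bulk insert_all is than the two per-row loops.
def speedups(times)
  {
    vs_create: (times[:create] / times[:insert_all]).round(1),
    vs_insert: (times[:insert] / times[:insert_all]).round(1)
  }
end

REAL_SECONDS.each do |adapter, times|
  ratio = speedups(times)
  puts "#{adapter}: #{ratio[:vs_create]}x vs create!, #{ratio[:vs_insert]}x vs per-row insert_all"
end
```

On these numbers a single insert_all call is roughly 25 to 35 times faster than looping over create!, and roughly 8 to 11 times faster than calling insert_all once per row.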

@dhh
Member

@dhh dhh commented Feb 3, 2019

Bob, I think if you split out a command object, then you won’t have to route all the insert/upsert methods through the public insert_all method, and therefore don’t need to have the insert_all method carry all these options that we’re delegating to specific methods.

I’d also like to skip having the on_duplicates option entirely, actually. We can get there with upsert and bang methods. So insert!/insert_all! will raise on dupes, the non-bang methods won’t.

The conflict_target API isn’t quite my cup’o’tea either. Have you actually used this in anger in a real app? I’m even less enthused about it since it’s a pgsql specific set of operations. I’d prefer to move forward with the rest of this without a pgsql specific route. Then treat that as a concern to deal with afterwards.

@boblail
Contributor Author

@boblail boblail commented Feb 4, 2019

David, I really appreciate your feedback!

👍 I'll refactor to extract a command object.

Conflict Target

I think I haven't quite gotten the API to express when :conflict_target is required...

In Postgres and SQLite (SQLite supports the same set of ops as Postgres), conflict target is optional in the expression ON CONFLICT DO NOTHING but required in the expression ON CONFLICT (*something*) DO UPDATE ....

In my PR, I default :conflict_target to the primary key (which maybe isn't the right default for ON CONFLICT DO NOTHING); but you'd need to set :conflict_target any time you've added a unique index other than the primary key. That was an important use case for me (I extracted this PR from a product): I used :conflict_target heavily in import logic.

My scenario
Customers import frequently, not just once; and imports have to be fast and idempotent.

Bang Methods

🤔 I like the simplicity of the bang v. non-bang methods — but I have a couple of questions about what it would express:

  1. Would it be misleading if the bang in insert_all/insert_all! controls a different exception than the bang in save/save!, create/create!, and update_attributes/update_attributes!?
    • In the existing methods, the bang methods raise ActiveRecord::RecordInvalid, but neither insert_all nor insert_all! would raise ActiveRecord::RecordInvalid.
    • In the existing methods, even the non-bang methods raise ActiveRecord::RecordNotUnique, but insert_all wouldn't.
  2. Should the difference between vanilla-insert and insert-skip-duplicates be more obvious? save does the same thing as save! (and fails under the same circumstances), it just returns false instead of raising; but the proposed insert_all and insert_all! would actually do different things to the database. Should that be made more explicit?
@boblail
Contributor Author

@boblail boblail commented Feb 4, 2019

Thinking about it just a little more ... would :index or :unique_index be a better name than :conflict_target?

And if we split insert_all from upsert_all, they would have simpler options apiece since insert_all wouldn't care about "conflict target" and upsert_all wouldn't need a strategy argument.

If you're still good with it, I think your original suggestion of skip_duplicates: true plays a little better than bang/no-bang versions of insert_all.

👆 I'll push something tomorrow.

@dhh
Member

@dhh dhh commented Feb 4, 2019

Bang vs non-bang is just about setting expectations of what's going to happen. With AR, we've set the expectation that bang-methods will raise an exception. There's no guarantee on what kind will be raised. So fine raising a different kind of exception. So I'd like to stick with that.

That means insert/insert_all will just skip dupes, no errors. insert!/insert_all! will raise an appropriate exception on dupes.

So conflict target is only necessary for upsert, yeah? Trying to understand the feature in full. You specify a conflict_target when you want upsert to raise an exception given a unique key violation?

@boblail
Contributor Author

@boblail boblail commented Feb 4, 2019

So conflict target is only necessary for upsert, yeah? Trying to understand the feature in full. You specify a conflict_target when you want upsert to raise an exception given a unique key violation?

👍

It's necessary for upsert, optional for skip-duplicates.

You specify conflict target to say when to do an upsert — i.e. CT answers "do an upsert when a new record is in conflict with which unique index?"


Example:

Given a table with two unique indexes (one on id, one on [author, title])

create_table :books, id: :integer, force: true do |t|
  t.column :title, :string
  t.column :author, :string
  t.index [:author, :title], unique: true
end

Without specifying a conflict target, INSERT...DO NOTHING will skip a record that violates any unique index.

# Given
Book.create! id: 1, title: "Rework", author: "David"

# violation of index on id, skipped
Book.insert_all [{ id: 1, title: "Refactoring", author: "Martin" }],
  on_duplicate: :skip

# violation of index on author+title, skipped
Book.insert_all [{ id: 2, title: "Rework", author: "David" }],
  on_duplicate: :skip

Comment: Skipping inserts when they violate any unique index seems like a sensible default for INSERT ... DO NOTHING

If you specify a conflict target, INSERT will skip records that violate only the specified unique index (and raise if your record violates a different index):

# Given
Book.create! id: 1, title: "Rework", author: "David"

# violation of index on author+title, skipped
Book.insert_all [{ id: 2, title: "Rework", author: "David" }],
  on_duplicate: :skip,
  conflict_target: %w{ author title }

# violation of index on id, raises ActiveRecord::RecordNotUnique
Book.insert_all [{ id: 1, title: "Refactoring", author: "Martin" }],
  on_duplicate: :skip,
  conflict_target: %w{ author title }

Comment: I have wanted to specify "conflict target" in an INSERT ... DO NOTHING when I was working on an app with UUID ids, which could be generated on the client, and I'd want to raise-not-skip in the unlikely event the client passed a given UUID twice.
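
The skip-versus-raise behavior described here can be modeled in plain Ruby. The sketch below assumes a table with the same two unique indexes (one on id, one on author and title); `insert_skipping`, `violated_index`, and the `RecordNotUnique` class are toy stand-ins, not Active Record code.

```ruby
# Toy stand-in for ActiveRecord::RecordNotUnique.
class RecordNotUnique < StandardError; end

# The two unique indexes from the example table.
UNIQUE_INDEXES = [[:id], [:author, :title]].freeze

# Returns the first unique index that `row` violates against `existing`, or nil.
def violated_index(existing, row)
  UNIQUE_INDEXES.find do |columns|
    existing.any? { |other| columns.all? { |column| other[column] == row[column] } }
  end
end

# Skip-duplicates insert. With no conflict_target, any violation is skipped;
# with one, only violations of that index are skipped and others raise.
def insert_skipping(existing, row, conflict_target: nil)
  index = violated_index(existing, row)
  return existing + [row] if index.nil?
  return existing if conflict_target.nil? || index == conflict_target
  raise RecordNotUnique, "violates unique index on #{index.join(', ')}"
end

books = [{ id: 1, title: "Rework", author: "David" }]
same_title = { id: 2, title: "Rework", author: "David" }
same_id    = { id: 1, title: "Refactoring", author: "Martin" }

puts insert_skipping(books, same_title, conflict_target: [:author, :title]).size
begin
  insert_skipping(books, same_id, conflict_target: [:author, :title])
rescue RecordNotUnique => e
  puts e.message
end
```

The model only reports the first violated index, so a row that violates both indexes at once is not handled faithfully.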

For upsert you must specify a conflict target, so you can only UPSERT on violations of one unique index (and it'll raise if your record violates a different index)

# Given
Book.create! id: 1, title: "Rework", author: "David"

# violation of index on id, Refactoring overwrites Rework
Book.insert_all [{ id: 1, title: "Refactoring", author: "Martin" }],
  on_duplicate: :update,
  conflict_target: { columns: %w{ id } }

# violation of index on author+title, raises ActiveRecord::RecordNotUnique
Book.insert_all [{ id: 2, title: "Refactoring", author: "Martin" }],
  on_duplicate: :update,
  conflict_target: { columns: %w{ id } }

Comment: I find that I almost always want to specify a conflict target when I do INSERT ... DO UPDATE (because I seldom include id in the list of attributes I'm inserting; and upsert only makes sense if we've got some other column(s) acting as a unique identifier for a record).

@boblail boblail force-pushed the boblail:insert_many branch 2 times, most recently from c688fef to 58359e4 Feb 6, 2019
@boblail
Contributor Author

@boblail boblail commented Feb 6, 2019

I just pushed a commit that:

  • Extracts a command object
  • Splits the API into six methods
    • insert! / insert_all!
    • insert / insert_all (skip duplicates)
    • upsert / upsert_all
@boblail boblail force-pushed the boblail:insert_many branch 3 times, most recently from 80eeca8 to 99d56f4 Feb 6, 2019
@boblail
Contributor Author

@boblail boblail commented Feb 11, 2019

@dhh,

I pushed these changes (good call on all of them 👍) as separate commits (if that's easier to review):

  • stack assignments on one line
  • prefer %w[ over %w{
  • replace guard clause
  • inline delegates (except :connection — it's used 12 times)
  • move prepare_keys_and_values_for_insert vs. throwing away _keys
  • extract conflict_columns
  • remove unnecessary memoization
  • gather argument-checking to InsertAll
  • improve documentation
  • use keyword args for options
  • sort methods in Table-of-Contents order

"conflict target"

I'm on board with departing from the Postgres/SQLite docs on that awkward name 😄.

So this option is doing something a bit like the block you pass to #uniq_by or #index_by:

books.index_by(&:isbn)
books.uniq_by { |book| [ book.author_id, book.title ] }

How about taking inspiration from those method names? (unique_by or distinct_by?)

Book.insert_all([
  { id: 1, title: 'Rework', author: 'David' },
  { id: 1, title: 'Eloquent Ruby', author: 'Russ' }
], unique_by: %i[ author_id title ])

Alternately, with this key, we're telling the database how to identify existing records. Maybe the word identity is the key:

Book.insert_all([
  { id: 1, title: 'Rework', author: 'David' },
  { id: 1, title: 'Eloquent Ruby', author: 'Russ' }
], identity: %i[ author_id title ])

Or, in the docs — and in our discussion — we say this is useful if you have more than one unique index on a table, so maybe we double down on unique index:

Book.insert_all([
  { id: 1, title: 'Rework', author: 'David' },
  { id: 1, title: 'Eloquent Ruby', author: 'Russ' }
], unique_index: %i[ author_id title ])

What do you think of these options?

Query Builders

Do you mind sharing just a little more about what you're picturing here?

  • Two new objects? (A generic builder and a MySQL builder?)
  • Would #connection still respond to build_insert_all_sql (its implementation extracted) or would it have a factory method for returning the right builder (insert_all_builder?)
@boblail boblail force-pushed the boblail:insert_many branch from 99d56f4 to 31dfe0c Feb 11, 2019
@dhh
Member

@dhh dhh commented Feb 11, 2019

Excellent, Bob.

I'm thinking something like ActiveRecord::InsertAll::SqlBuilder and ActiveRecord::InsertAll::SqlBuilder::MySQL < ActiveRecord::InsertAll::SqlBuilder, and a way to lookup which builder you need based on the connection.
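
A minimal sketch of that shape, assuming a generic builder that emits Postgres/SQLite-style clauses and a MySQL subclass that overrides them. The module name `InsertAllSketch`, the method names, the `BUILDERS` lookup keyed by adapter name, and the exact SQL fragments are all assumptions for illustration; they are not the classes that were eventually merged.

```ruby
module InsertAllSketch
  # Generic builder: Postgres/SQLite-style ON CONFLICT clauses.
  class SqlBuilder
    def initialize(conflict_columns)
      @conflict_columns = conflict_columns
    end

    def skip_duplicates_clause
      "ON CONFLICT DO NOTHING"
    end

    def upsert_clause(update_columns)
      target = @conflict_columns.map { |c| %("#{c}") }.join(",")
      sets = update_columns.map { |c| %("#{c}"=excluded."#{c}") }.join(",")
      "ON CONFLICT (#{target}) DO UPDATE SET #{sets}"
    end

    # MySQL has no conflict target, so both clauses look different.
    class MySQL < SqlBuilder
      # A no-op update is one possible way to skip duplicates.
      def skip_duplicates_clause
        first = @conflict_columns.first
        "ON DUPLICATE KEY UPDATE `#{first}`=`#{first}`"
      end

      def upsert_clause(update_columns)
        sets = update_columns.map { |c| "`#{c}`=VALUES(`#{c}`)" }.join(",")
        "ON DUPLICATE KEY UPDATE #{sets}"
      end
    end
  end

  # Assumed lookup table keyed by adapter name.
  BUILDERS = { "Mysql2" => SqlBuilder::MySQL }.freeze

  # Picks a builder class for the adapter, defaulting to the generic one.
  def self.builder_for(adapter_name, conflict_columns)
    BUILDERS.fetch(adapter_name, SqlBuilder).new(conflict_columns)
  end
end

puts InsertAllSketch.builder_for("PostgreSQL", %w[ isbn ]).upsert_clause(%w[ title ])
puts InsertAllSketch.builder_for("Mysql2", %w[ isbn ]).upsert_clause(%w[ title ])
```

Keeping the lookup in one place means the calling code never branches on the adapter; it asks for a builder and appends whatever clause comes back.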

I like unique_by for the parameter 👍

@boblail boblail force-pushed the boblail:insert_many branch from 31dfe0c to ec34815 Feb 12, 2019
@boblail
Contributor Author

@boblail boblail commented Feb 12, 2019

Made both changes, @dhh!

I like how the SqlBuilders turned out — good call on that. Pulling out objects revealed opportunities I wasn't expecting to extract little methods which, in turn, gave names to more of what was going on.

Thanks for all the feedback!

@brandoncc
Contributor

@brandoncc brandoncc commented Mar 17, 2019

I'm really excited to see this be built into Rails, nice job @boblail! I currently use https://github.com/zdennis/activerecord-import in a couple of projects and have needed to provide raw sql for the update logic. I second @palkan's idea for that ability.

@simi
Contributor

@simi simi commented Mar 17, 2019

ryohashimoto added a commit to ryohashimoto/rails that referenced this pull request Apr 8, 2019
…#upsert etc. methods

In rails#35077, `#insert_all` / `#upsert_all` / `#insert` / `#upsert` etc. methods are added. But Active Record logs only “Bulk Insert” log messages when they are invoked.

This commit improves the log messages to use the correct words for how they were invoked.
@palkan
Contributor

@palkan palkan commented Apr 25, 2019

Just started porting the existing code to #insert_all :)

Found one interesting case: the order of the columns in unique_by does matter, which confused me a bit, since PostgreSQL (not sure about others) does not care about the order of the columns in conflict_target (from the docs):

All table_name unique indexes that, without regard to order, contain exactly the conflict_target-specified columns/expressions are inferred (chosen) as arbiter indexes.

Should we also be order-independent then?
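
One way an order-independent lookup could work is to compare sorted column names. In this sketch the index metadata is a plain array of hashes and `find_unique_index` is a hypothetical helper; it does not reflect how the merged code looks up indexes.

```ruby
# Hypothetical helper: find a unique index whose columns match unique_by
# regardless of the order in which the columns are listed.
def find_unique_index(indexes, unique_by)
  wanted = Array(unique_by).map(&:to_s).sort
  indexes.find do |index|
    index[:unique] && index[:columns].map(&:to_s).sort == wanted
  end
end

indexes = [
  { name: "index_books_on_author_and_title", columns: %w[ author title ], unique: true },
  { name: "index_books_on_isbn", columns: %w[ isbn ], unique: false }
]

match = find_unique_index(indexes, %i[ title author ])
puts match[:name]
```

Symbols and strings are normalized before comparing, so `%i[ title author ]` and `%w[ author title ]` resolve to the same index.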

# { title: 'Eloquent Ruby', author: 'Russ' }
# ])
#
# # raises ActiveRecord::RecordNotUnique because 'Eloquent Ruby'

#
# See <tt>ActiveRecord::Persistence#insert_all</tt> for documentation.
def insert(attributes, returning: nil, unique_by: nil)
insert_all([ attributes ], returning: returning, unique_by: unique_by)

@bf4

bf4 Jun 25, 2019
Contributor

would be super-cool if create/save could also take a returning argument so that I could tell Rails the db is going to calculate some additional values it should set on the instance. Does that seem like a reasonable feature request? I could work on it. I wouldn't normally bring it up this way, but it seems like this PR paves the way for something like this.

suketa added a commit to suketa/rails_sandbox that referenced this pull request Sep 8, 2019
Add insert_many to ActiveRecord models
rails/rails#35077
@schneems
Member

@schneems schneems commented Oct 29, 2019

I'm curious how you all get around autogenerated fields that are populated from Ruby and not inside the database. If I try to run this:

Book.upsert_all([
  { title: 'Rework', author: 'David', isbn: '1' },
  { title: 'Eloquent Ruby', author: 'Russ', isbn: '1' }
], unique_by: { columns: %w[ isbn ] })

And I have nothing in the database then I get 2 books with null created_at columns, or since my created_at were specified as not-nullable I get an error.

@simi
Contributor

@simi simi commented Oct 29, 2019

@schneems if I remember correctly, we considered this a more advanced API, similar to update_all, which doesn't handle updated_at either.

You can add updated_at and created_at to your hashes to get them stored.
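
A minimal sketch of that suggestion: stamp every row hash with both timestamps before handing the array to upsert_all. The helper name `with_timestamps` is made up for this example, and the final call to Book.upsert_all is left as a comment because it needs a database.

```ruby
# Adds created_at/updated_at to each row unless the caller already set them.
# Every row ends up with the same set of keys.
def with_timestamps(rows, now: Time.now)
  rows.map { |row| { created_at: now, updated_at: now }.merge(row) }
end

rows = with_timestamps([
  { title: "Rework", author: "David", isbn: "1" },
  { title: "Eloquent Ruby", author: "Russ", isbn: "2" }
])

puts rows.first.keys.inspect
# The stamped rows could then be passed on, for example:
#   Book.upsert_all(rows, unique_by: { columns: %w[ isbn ] })
```

Because the defaults are merged first, a row that already carries its own created_at keeps it.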

@schneems
Member

@schneems schneems commented Oct 29, 2019

You can add updated_at and created_at to your hashes to get them stored.

Thanks. I was already sending over updated_at, since I can correctly calculate that value. But sending over created_at as well means any existing records in the DB will have that value overwritten, which effectively makes created_at the same as updated_at for all records. For my case it's not that bad, but it's effectively one set of data that I'm losing while at the same time having to pay to store the duplicate info.

Thanks for the reply, I'll move forward with the duplicate timestamps.

@schneems
Member

@schneems schneems commented Oct 29, 2019

I had the thought that I could add a default value to the created_at column. While this runs, it seems these two options are a no-op:

      change_column_default :issues, :created_at, from: nil, to: 'NOW()'

and

    change_column :issues, :created_at, :datetime, :default => "NOW()"
schneems added a commit to codetriage/CodeTriage that referenced this pull request Oct 29, 2019
@schneems
Member

@schneems schneems commented Oct 30, 2019

Here's the query one of my upsert_all calls generates:

INSERT INTO "issues"("repo_id","title","url","state","html_url","number","pr_attached","last_touched_at","updated_at","created_at") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10), ($11, $12, $13, $14, $15, $16, $17, $18, $19, $20), ($21, $22, $23, $24, $25, $26, $27, $28, $29, $30), ($31, $32, $33, $34, $35, $36, $37, $38, $39, $40), ($41, $42, $43, $44, $45, $46, $47, $48, $49, $50), ($51, $52, $53, $54, $55, $56, $57, $58, $59, $60), ($61, $62, $63, $64, $65, $66, $67, $68, $69, $70), ($71, $72, $73, $74, $75, $76, $77, $78, $79, $80), ($81, $82, $83, $84, $85, $86, $87, $88, $89, $90), ($91, $92, $93, $94, $95, $96, $97, $98, $99, $100), ($101, $102, $103, $104, $105, $106, $107, $108, $109, $110), ($111, $112, $113, $114, $115, $116, $117, $118, $119, $120), ($121, $122, $123, $124, $125, $126, $127, $128, $129, $130), ($131, $132, $133, $134, $135, $136, $137, $138, $139, $140), ($141, $142, $143, $144, $145, $146, $147, $148, $149, $150), ($151, $152, $153, $154, $155, $156, $157, $158, $159, $160), ($161, $162, $163, $164, $165, $166, $167, $168, $169, $170), ($171, $172, $173, $174, $175, $176, $177, $178, $179, $180), ($181, $182, $183, $184, $185, $186, $187, $188, $189, $190), ($191, $192, $193, $194, $195, $196, $197, $198, $199, $200), ($201, $202, $203, $204, $205, $206, $207, $208, $209, $210), ($211, $212, $213, $214, $215, $216, $217, $218, $219, $220), ($221, $222, $223, $224, $225, $226, $227, $228, $229, $230), ($231, $232, $233, $234, $235, $236, $237, $238, $239, $240), ($241, $242, $243, $244, $245, $246, $247, $248, $249, $250), ($251, $252, $253, $254, $255, $256, $257, $258, $259, $260), ($261, $262, $263, $264, $265, $266, $267, $268, $269, $270), ($271, $272, $273, $274, $275, $276, $277, $278, $279, $280), ($281, $282, $283, $284, $285, $286, $287, $288, $289, $290), ($291, $292, $293, $294, $295, $296, $297, $298, $299, $300) ON CONFLICT ("number","repo_id") DO UPDATE SET 
"title"=excluded."title","url"=excluded."url","state"=excluded."state","html_url"=excluded."html_url","pr_attached"=excluded."pr_attached","last_touched_at"=excluded."last_touched_at","updated_at"=excluded."updated_at","created_at"=excluded."created_at" RETURNING "id"

I think that what I would ideally like is to be able to say something like:

Issue.upsert_all(upsert_mega_array, unique_by: [:number, :repo_id], on_conflict_skip: [:created_at])

So that way I could specify that my new records should get the created_at value, but existing records should keep their values.
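
To show what such an option would change in the generated SQL, here is a sketch that builds the DO UPDATE SET list while leaving out the conflict columns and any columns that existing rows should keep. The helper `update_assignments` and its `keep_existing:` keyword are hypothetical, as is the on_conflict_skip option itself.

```ruby
# Hypothetical: builds the assignment list for ON CONFLICT ... DO UPDATE SET,
# omitting the unique_by columns and the columns existing rows should keep.
def update_assignments(columns, unique_by:, keep_existing: [])
  skipped = (unique_by + keep_existing).map(&:to_s)
  (columns.map(&:to_s) - skipped).map { |c| %("#{c}"=excluded."#{c}") }.join(",")
end

columns = %w[ repo_id title url state number updated_at created_at ]
puts update_assignments(columns, unique_by: %i[ number repo_id ], keep_existing: %i[ created_at ])
```

New rows would still receive created_at from the VALUES list; only the update branch would stop overwriting it.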

@simi
Contributor

@simi simi commented Oct 30, 2019

@schneems I was trying to build something to cover this kind of common problems, but I wasn't successful.

see #35635 and #35631

I will be more than happy to work more on this feature, but I think we were missing a decision where to move this. I can try to revive my work on relations and also create_with support.

@boblail
Contributor Author

@boblail boblail commented Oct 31, 2019

@schneems, what if you try wrapping the default value in a lambda?

change_column_default :issues, :created_at, from: nil, to: ->() { 'NOW()' }

(that's always worked for me)

@md5
Contributor

@md5 md5 commented Nov 19, 2019

It's great that RETURNING is now supported in the bulk insert case! Any chance of revisiting #34237 to get it wired more naturally into models?
