Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bulk import #168

Merged
merged 24 commits into from
Apr 26, 2018
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
814de94
Start of bulk-import method
Blacksmoke16 Mar 30, 2018
d1b2134
Use parameterized statement
Blacksmoke16 Mar 30, 2018
93b6e67
Handle duplicate key IGNORE/UPDATE
Blacksmoke16 Mar 31, 2018
6435d63
Add commas between update values
Blacksmoke16 Mar 31, 2018
85c36a0
merge in the fields unity code
Blacksmoke16 Mar 31, 2018
73e7de5
Remove now undeeded primary_auto variable
Blacksmoke16 Mar 31, 2018
1a03145
Add import for pg adapter
Blacksmoke16 Mar 31, 2018
54dd5b7
Also quote other column key
Blacksmoke16 Mar 31, 2018
c74d126
Add adapter for sqlite
Blacksmoke16 Mar 31, 2018
e0e1be4
Merge branch 'master' into bulk-import
faustinoaq Apr 4, 2018
84364d9
Fix spec
Blacksmoke16 Apr 7, 2018
2d15769
Add more tests, fix sqlite adapter
Blacksmoke16 Apr 7, 2018
bbd2f7f
Update readme
Blacksmoke16 Apr 7, 2018
2425ad2
Fix spacing issue in sqlite adapater
Blacksmoke16 Apr 7, 2018
87009d5
Fix indentation
Blacksmoke16 Apr 7, 2018
acc4a6e
Fix typos in readme examples
Blacksmoke16 Apr 7, 2018
4b10e27
Merge branch 'master' into bulk-import
faustinoaq Apr 7, 2018
87268a0
Fix indentations
Blacksmoke16 Apr 7, 2018
f6dcf2a
Add overload methods for update/ignore_on_duplicate, update readme
Blacksmoke16 Apr 8, 2018
e3e525a
Add = sign to model arrays in readme
Blacksmoke16 Apr 12, 2018
371d0b4
Merge branch 'master' into bulk-import
drujensen Apr 22, 2018
9fe4bcb
Specify imports do not trigger callbacks yet
Blacksmoke16 Apr 25, 2018
5b31b80
proper grammar
Blacksmoke16 Apr 25, 2018
027be38
Merge branch 'master' into bulk-import
drujensen Apr 26, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion spec/granite_orm/fields/timestamps_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ require "../../spec_helper"
module {{adapter.capitalize.id}}
{%
avoid_macro_bug = 1 # https://github.com/crystal-lang/crystal/issues/5724

# TODO mysql timestamp support should work better
if adapter == "pg"
time_kind_on_read = "Time::Kind::Utc".id
Expand Down
2 changes: 1 addition & 1 deletion spec/granite_orm/querying/find_spec.cr
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ module {{adapter.capitalize.id}}
it "returns nil or raises if no result" do
found = Parent.find? 0
found.should be_nil

expect_raises(Granite::ORM::Querying::NotFound, /Couldn't find .*Parent.* with id=0/) do
Parent.find 0
end
Expand Down
17 changes: 17 additions & 0 deletions spec/granite_orm/transactions/import_spec.cr
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
require "../../spec_helper"

{% for adapter in GraniteExample::ADAPTERS %}
module {{adapter.capitalize.id}}
describe "{{ adapter.id }} .import" do
it "should import 3 new objects" do
to_import = [
Parent.new(name: "ImportParent1"),
Parent.new(name: "ImportParent2"),
Parent.new(name: "ImportParent3"),
]
Parent.import(to_import)
Parent.all("WHERE name LIKE ?", ["Import%"]).size.should eq 3
end
end
end
{% end %}
2 changes: 1 addition & 1 deletion spec/spec_helper.cr
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
require "spec"

module GraniteExample
ADAPTERS = ["pg","mysql","sqlite"]
ADAPTERS = ["pg", "mysql", "sqlite"]
end

require "../src/granite_orm"
Expand Down
3 changes: 3 additions & 0 deletions src/adapter/base.cr
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,9 @@ abstract class Granite::Adapter::Base
# This will insert a row in the database and return the id generated.
abstract def insert(table_name, fields, params, lastval) : Int64

# This will insert an array of models as one insert statement
abstract def import(table_name : String, primary_name : String, fields, model_array, **options)

# This will update a row in the database.
abstract def update(table_name, primary_name, fields, params)

Expand Down
38 changes: 38 additions & 0 deletions src/adapter/mysql.cr
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,44 @@ class Granite::Adapter::Mysql < Granite::Adapter::Base
end
end

def import(table_name : String, primary_name : String, fields, model_array, **options)
params = [] of DB::Any
now = Time.now.to_utc
fields.reject! { |field| field === "id" } if primary_name === "id"

statement = String.build do |stmt|
stmt << "INSERT"
stmt << " IGNORE" if options["on_duplicate_key_ignore"]?
stmt << " INTO #{quote(table_name)} ("
stmt << fields.map { |field| quote(field) }.join(", ")
stmt << ") VALUES "

model_array.each do |model|
model.updated_at = now if model.responds_to? :updated_at
model.created_at = now if model.responds_to? :created_at
next unless model.valid?
stmt << '('
stmt << Array.new(fields.size, '?').join(',')
Copy link
Contributor Author

@Blacksmoke16 Blacksmoke16 Mar 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unless primary_auto references above/below here can be removed once #169 gets merged, since fields will cover it by default.

Probably also be able to remove the primary_auto param since that is only reason i needed it.

params.concat fields.map { |field| model.to_h[field] }
stmt << "),"
end
end.chomp(',')

if update_keys = options["on_duplicate_key_update"]?
statement += " ON DUPLICATE KEY UPDATE "
update_keys.each do |key|
statement += "#{quote(key)}=VALUES(#{quote(key)}), "
end
statement = statement.chomp(", ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does mysql not have an ON DUPLICATE IGNORE option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end

log statement, params

open do |db|
db.exec statement, params
end
end

private def last_val
return "SELECT LAST_INSERT_ID()"
end
Expand Down
41 changes: 41 additions & 0 deletions src/adapter/pg.cr
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,47 @@ class Granite::Adapter::Pg < Granite::Adapter::Base
end
end

def import(table_name : String, primary_name : String, fields, model_array, **options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your **options can be a little cleaner and have a similar effect:

def import(table_name : String, primary_name : String, fields, model_array : Array(self), *, update_on_duplicate_key = false)

Copy link
Contributor Author

@Blacksmoke16 Blacksmoke16 Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would mean i would need to have a defined param for each options right?

As of now there is on_duplicate_key_ignore: Bool, on_duplicate_key_update: Array(Strings) to be updated, and possibly in future also validate: true, as giving option to ignore validation might speed up large imports. I don't have strong feelings one way or another.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably have the best of both worlds using a crystal overloaded method signature. If you need, here are the docs on overloading

Copy link
Member

@robacarp robacarp Apr 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t have strong opinions on this either. I just noticed a rubyism that might be able to be cleaned up with some crystal.

params = [] of DB::Any
now = Time.now.to_utc
fields.reject! { |field| field === "id" } if primary_name === "id"
index = 0

statement = String.build do |stmt|
stmt << "INSERT"
stmt << " INTO #{quote(table_name)} ("
stmt << fields.map { |field| quote(field) }.join(", ")
stmt << ") VALUES "

model_array.each do |model|
model.updated_at = now if model.responds_to? :updated_at
model.created_at = now if model.responds_to? :created_at
next unless model.valid?
stmt << '('
stmt << fields.map_with_index { |_f, idx| "$#{index + idx + 1}" }.join(',')
params.concat fields.map { |field| model.to_h[field] }
stmt << "),"
index += fields.size
end
end.chomp(',')

if update_keys = options["on_duplicate_key_update"]?
statement += " ON CONFLICT (#{quote(primary_name)}) DO UPDATE SET "
Copy link
Member

@robacarp robacarp Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems right, but would you mind explaining the behavior in a terse comment for future maintainers? (And me...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I plan on updating the READ me with info on how to use everything. But basic this was part of Postgres' 9.5 version:

https://www.postgresql.org/docs/9.5/static/sql-insert.html

It's basic the PostGres version of MySQL ON DUPLICATE KEY UPDATE

From the docs:

INSERT INTO distributors (did, dname)
    VALUES (5, 'Gizmo Transglobal'), (6, 'Associated Computing, Inc')
    ON CONFLICT (did) DO UPDATE SET dname = EXCLUDED.dname;

Is basic saying if there is a conflict on the did value, then dname should be updated to the new value that would have otherwise been excluded.

update_keys.each do |key|
statement += "#{quote(key)}=EXCLUDED.#{quote(key)}, "
end
statement = statement.chomp(", ")
elsif options["on_duplicate_key_ignore"]?
statement += " ON CONFLICT DO NOTHING"
end

log statement, params

open do |db|
db.exec statement, params
end
end

private def last_val
return "SELECT LASTVAL()"
end
Expand Down
38 changes: 38 additions & 0 deletions src/adapter/sqlite.cr
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,44 @@ class Granite::Adapter::Sqlite < Granite::Adapter::Base
end
end

def import(table_name : String, primary_name : String, fields, model_array, **options)
params = [] of DB::Any
now = Time.now.to_utc
fields.reject! { |field| field === "id" } if primary_name === "id"

statement = String.build do |stmt|
stmt << "INSERT"
stmt << " IGNORE" if options["on_duplicate_key_ignore"]?
stmt << " INTO #{quote(table_name)} ("
stmt << fields.map { |field| quote(field) }.join(", ")
stmt << ") VALUES "

model_array.each do |model|
next unless model.valid?
model.updated_at = now if model.responds_to? :updated_at
model.created_at = now if model.responds_to? :created_at
stmt << '('
stmt << Array.new(fields.size, '?').join(',')
params.concat fields.map { |field| model.to_h[field] }
stmt << "),"
end
end.chomp(',')

if update_keys = options["on_duplicate_key_update"]?
statement += " ON DUPLICATE KEY UPDATE "
update_keys.each do |key|
statement += "#{quote(key)}=VALUES(#{quote(key)}), "
end
statement = statement.chomp(", ")
end

log statement, params

open do |db|
db.exec statement, params
end
end

private def last_val
return "SELECT LAST_INSERT_ROWID()"
end
Expand Down
2 changes: 1 addition & 1 deletion src/granite_orm/associations.cr
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ module Granite::ORM::Associations
{{children_class}}.all(query, id)
end
end

# define getter for related children
macro has_many(children_table, through)
def {{children_table.id}}
Expand Down
2 changes: 1 addition & 1 deletion src/granite_orm/querying.cr
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
module Granite::ORM::Querying
class NotFound < Exception
end
end

macro extended
macro __process_querying
Expand Down
15 changes: 13 additions & 2 deletions src/granite_orm/transactions.cr
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,20 @@ module Granite::ORM::Transactions
@updated_at : Time?
@created_at : Time?

# The import class method will run a batch INSERT statement for each model in the array
# the array must contain only one model class
# invalid model records will be skipped
def self.import(model_array : Array(self), **options)
begin
@@adapter.import(table_name, primary_name, fields, model_array, **options)
rescue err
raise DB::Error.new(err.message)
end
end

# The save method will check to see if the primary exists yet. If it does it
# will call the update method, otherwise it will call the create method.
# This will update the timestamps apropriately.
# This will update the timestamps appropriately.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

def save
return false unless valid?

Expand Down Expand Up @@ -91,7 +102,7 @@ module Granite::ORM::Transactions
end
end

module ClassMethods
module ClassMethods
def create(**args)
create(args.to_h)
end
Expand Down
2 changes: 1 addition & 1 deletion src/granite_orm/validators.cr
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ require "./error"
# !user.name.to_s.blank?
# end
#
# validate :name, "can't be blank", -> (user : User) do
# validate :name, "can't be blank", ->(user : User) do
# !user.name.to_s.blank?
# end
#
Expand Down