Skip to content

Add placeholder option to insert_all#290

Merged
josevalim merged 19 commits intoelixir-ecto:masterfrom
Un3qual:add_placeholders_initial
Jan 19, 2021
Merged

Add placeholder option to insert_all#290
josevalim merged 19 commits intoelixir-ecto:masterfrom
Un3qual:add_placeholders_initial

Conversation

@Un3qual
Copy link
Copy Markdown
Contributor

@Un3qual Un3qual commented Dec 17, 2020

A few days ago, I created this post on Elixir Forum. This PR adds support for "placeholders" in the insert_all operation. It can save bandwidth by allowing you to send less data over the wire. It may also increase performance by decreasing how much you have to typecheck before you insert.

Example

Before

now = Datetime.utc_now
posts = [%{title: "Foo", inserted_at: now}, %{title: "bar", inserted_at: now}, %{title: "Baz", inserted_at: now}]
Repo.insert_all(Post, posts)

results in the SQL:

INSERT INTO "posts" ("title", "inserted_at") VALUES ($1, $2), ($3, $4), ($5, $6)

with these args passed in:

["Foo", ~U[2020-12-17 22:21:10.932918Z], "Bar", ~U[2020-12-17 22:21:10.932918Z], "Baz", ~U[2020-12-17 22:21:10.932918Z]]

After

posts = [%{title: "Foo", inserted_at: {:placeholder, :timestamp}, %{title: "bar", inserted_at:  {:placeholder, :timestamp}}, %{title: "Baz", inserted_at:  {:placeholder, :timestamp}}]
Repo.insert_all(Post, posts, placeholders: %{timestamp: Datetime.utc_now})

results in the SQL:

INSERT INTO "posts" ("title", "inserted_at") VALUES ($2, $1), ($3, $1), ($4, $1)

with these args passed in:

[~U[2020-12-17 22:21:10.932918Z], "Foo", "Bar", "Baz"]

Implementation

So far, I have only implemented this for postgres, but doing so for tds and myxql shouldn't be too difficult. I just want to get the implementation details hashed out first. This could also be extended to be used for update_all in addition to insert_all, and could be useful elsewhere as well. TDD is not my strong point, so I've just written three simple tests so far.
When fetching from the map of placeholders, I use Map.fetch!/2 right now. I'm not sure what function is preferred for stuff like this.

👍s

  • It's very simple for the end user
  • It doesn't interfere with normal use (just an additional option)
  • I don't think there is much of a performance penalty

👎s

  • Ecto.Adapters.SQL.unzip_inserts/2 is considerably more complicated than before

Other possibilities

This could be implemented in other ways. There could just be a dedup option that can be provided to insert_all and update_all, where the list of fields is deduped automatically, but that seems computationally expensive especially in the cases where this feature is most useful (large bulk inserts).

Changes to Ecto

This feature requires changes to elixir-ecto/ecto as well for tests to pass. The PR for that is elixir-ecto/ecto#3515. I'm not sure how to link two seperate PRs though without creating a circular dependency issue (this needs the ecto changes for its tests to pass). If someone has some input on that, I would appreciate it.

@josevalim
Copy link
Copy Markdown
Member

Hi @Un3qual! I have added some comments on the ecto PR which I think will simplify this PR. Some other notes:

  1. You should add the tests to the ecto PR and not ecto_sql PR. There is a bunch of integration tests in the ecto project that you run from ecto_sql, see the ecto README for instructions.

  2. Feel free to update this branch to point to your git branch and we will change it to point to master once we change the ecto PR.

  3. Note that at least MySQL won't support placeholders as it uses ? instead of $1.

{%Ecto.Query{} = query, params_counter}, counter ->
{[?(, all(query), ?)], counter + params_counter}

{:placeholder, placeholder_key}, counter ->
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On unzip_inserts, make it so this is passed as {:placeholder, placeholder_index}, this way you don't need to pass the placeholder_index_map to this function.

@Un3qual
Copy link
Copy Markdown
Contributor Author

Un3qual commented Dec 20, 2020

@josevalim I've made the suggested improvements and a few others, namely ensuring when a placeholder value is dumped as one type, its key can't be used with a column of another type because it can cause unexpected behavior. I'm just stuck on this last MyXQL test: https://github.com/elixir-ecto/ecto_sql/pull/290/checks?check_run_id=1583183874. The first failure looks like it might be an issue with the decimal library (4.00000000000 should be equal to 4.0 right?). The second and third I have no idea. I don't think my changes should be changing the values of anything that isn't a placeholder. Any ideas?

end

def insert(_prefix, _table, _header, _rows, _on_conflict, _returning, _counter_start) do
error!(nil, "The :placeholders is not supported by MySQL")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
error!(nil, "The :placeholders is not supported by MySQL")
error!(nil, ":placeholders is not supported by MySQL")


@impl true
def insert(prefix, table, header, rows, on_conflict, returning) do
def insert(prefix, table, header, rows, on_conflict, returning, counter_start \\ 1) do
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need the \\. We should remove it and make sure everything passes it as necessary. :)

|> Enum.map(fn {ph_key, ph_idx} ->
{ph_idx, Map.fetch!(placeholder_map, ph_key)}
end)
|> List.keysort(0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum.sort should be more efficient and enough since the first tuple element will be the first to sort on:

Suggested change
|> List.keysort(0)
|> Enum.sort()

|> List.keysort(0)
|> Enum.map(&elem(&1, 1))

all_params = placeholder_values ++ Enum.reverse(params) ++ conflict_params
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small optimization we didn't apply before:

Suggested change
all_params = placeholder_values ++ Enum.reverse(params) ++ conflict_params
all_params = placeholder_values ++ Enum.reverse(params, conflict_params)

{{query, length(query_params)}, Enum.reverse(query_params) ++ acc}
{
{query, length(query_params)},
{Enum.reverse(query_params) ++ values_acc, placeholder_idx_acc, counter}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{Enum.reverse(query_params) ++ values_acc, placeholder_idx_acc, counter}
{Enum.reverse(query_params, values_acc), placeholder_idx_acc, counter}

end
end


Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@Un3qual
Copy link
Copy Markdown
Contributor Author

Un3qual commented Dec 21, 2020

@josevalim Got all of the tests (except for mysql 8.0) passing and all of your suggestions implemented.


defp insert(prefx, table, header, rows, on_conflict, returning) do
IO.iodata_to_binary SQL.insert(prefx, table, header, rows, on_conflict, returning)
IO.iodata_to_binary SQL.insert(prefx, table, header, rows, on_conflict, returning, [])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add that checks the proper references are used and counted correctly for placeholders!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both in postgres and tds suites!

@josevalim
Copy link
Copy Markdown
Member

Thank you, this looks great and we are almost there, just some tests. Also please update mix.exs to point to ecto master git repo as the other PR has been merged. :) happy new year!

@josevalim josevalim merged commit c38acca into elixir-ecto:master Jan 19, 2021
@josevalim
Copy link
Copy Markdown
Member

💚 💙 💜 💛 ❤️

@josevalim
Copy link
Copy Markdown
Member

Merged and I will push the pending tests, thank you.

@Un3qual
Copy link
Copy Markdown
Contributor Author

Un3qual commented Jan 29, 2021

@josevalim thanks for getting this merged. Sorry i've been MIA. I did some benchmarks with bulk inserts and placeholders actually ended up being slightly slower, but they were very unscientific. The slight slowdown might be worth the saved bandwidth though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants