Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map access in PostgreSQL do not allow index usage #330

Closed
hauleth opened this issue Jun 11, 2021 · 11 comments · Fixed by elixir-ecto/ecto#3698
Closed

Map access in PostgreSQL do not allow index usage #330

hauleth opened this issue Jun 11, 2021 · 11 comments · Fixed by elixir-ecto/ecto#3698

Comments

@hauleth
Copy link

hauleth commented Jun 11, 2021

Environment

  • Elixir version (elixir -v): doesn't matter
  • Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL 12, but applies to any PostgreSQL version
  • Ecto version (mix deps): 3.6.2
  • Database adapter and version (mix deps): postgrex 0.15.9
  • Operating system: macOS 11

Current behavior

Assuming that we have table like:

create table(:foos) do
  add :map, :jsonb
end

create index(:foos, [:map], using: :gin)

And query like:

from f in "foos",
  where: f.map["val"] == 1,
  select: f.id

Then the foos_map_index will not be used due to fact that Ecto will use #> operator which is not part of GIN supported operator classes. This mean that this query will be less performant than less idiomatic:

from f in "foos",
  where: fragment("? @> ?", f.map, %{val: 1}),
  select: f.id

Expected behavior

This gotcha should be either documented or we should detect situations like above and translate them to the @> operator if possible (but that still can be surprising sometimes that it doesn't work as expected).

@josevalim
Copy link
Member

We will document it but perhaps, as per your recent comment in the mailing list, this should be reported to PGSQL too, because if we could do the rewrite, they certainly could too. :)

@hauleth
Copy link
Author

hauleth commented Jun 11, 2021

I do not think that it would be so easy to do rewrite in this case. It is much easier to do in $1 = ANY(q.column) than it is to reason about the (q.column #> $1) = $2, especially as you do not know what is in $1. This mean that planner cannot be ran without knowing what is in $1, which mean that "prepared statement" is no longer "prepared".

@josevalim
Copy link
Member

Couldn't

(q.column #> $1) = $2

be rewritten to:

(q.column @> json_build_object($1, $2))

?

In any case, I think our biggest issue is that we support both regular json and jsonb fields, and @> is DB specific only. And we don't know the underlying DB field type when emitting the query.

@hauleth
Copy link
Author

hauleth commented Jun 11, 2021

@josevalim that is why I think it could be done in Ecto.Adapters.Postgres to do the rewrite to @>. However I do not know if such "magic" would be welcome in the adapters.

@> is supported by both json and jsonb, the difference is that json field cannot be indexed at all, so there is no problem there. And both #> and @> are DB specific operators, so there is not much difference there.

And, no, it couldn't as it wouldn't work for nested objects. There is no way to transform $1 = '{a,b,c}' and $2 = 1 into '{"a":{"b":{"c":1}}}'::jsonb without doing some complex parsing of $1 and $2.

@josevalim
Copy link
Member

I don't think it is an issue to rewrite, after all, it is the job of the adapter to match on Elixir expressions and convert them to SQL. The important is to keep the semantics.

And, no, it couldn't as it wouldn't work for nested objects. There is no way to transform $1 = '{a,b,c}' and $2 = 1 into '{"a":{"b":{"c":1}}}'::jsonb without doing some complex parsing of $1 and $2.

I don't think it is a concern, we will never have to parse $1. In our case, we always have a static list of keys, so we do have a good idea of the shape. The issue is that keys can be dynamic. So my previous example:

(q.column #> $1) = $2

The complication is rather $1 needing to work as an integer or a string.

@wojtekmach
Copy link
Member

I don't know if we have a precedent for different behaviour per server version, but fwiw pg 14 adds a subscript syntax:

SELECT ('{ "this": { "now": { "works": "in postgres 14!" }}}'::jsonb)['this']['now']['works'];

(https://www.postgresql.org/about/news/postgresql-14-beta-1-released-2213/)

and I'd expect it to be handled by indexes too which may or may not help here.

@hauleth
Copy link
Author

hauleth commented Jun 11, 2021

we will never have to parse $1

Yes, I was telling if we would like to push that to the PostgreSQL planner. Planner/query builder would need to parse $1 to decide how to work with that.

Of course, the arrays in JSON would be pretty hard, as q.column[1] == 1 cannot be translated to q.column @> '[1]'::jsonb as it has different behaviour. The same goes for q.column["foo"] == [1, 2]. So as long as left hand is an array or we use integer index from array then this rewrite cannot be done.

@wojtekmach I am afraid it will not work either as per documentation it will work only if the operator is applied directly on column:

However, the index could not be used for queries like the following, because though the operator ? is indexable, it is not applied directly to the indexed column jdoc:

-- Find documents in which the key "tags" contains key or array element "qui"
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui';

@hauleth
Copy link
Author

hauleth commented Jun 11, 2021

It also could be mitigated with better description how to declare indices on JSONB fields, as IIRC:

create index(:foo, ["map #> '{val}'"])

Would allow usage of index in the above case. However I do not know if the DB would be smart enough to see that map->'val' would be the same as map #> '{val}'.

EDIT: Tested it, and index(:foos, ["(map->'val')"]) do not work with map["val"], one need to use index(:foos, ["(map#>'{val}')"]).

@josevalim
Copy link
Member

@hauleth PR for the docs are always welcome!

@josevalim
Copy link
Member

@hauleth Ping. :)

@hauleth
Copy link
Author

hauleth commented Jul 25, 2021

@josevalim only one question - where it should be documented? In Ecto.Query, in Ecto.Migration, Ecto.Adapter.Postgres?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants