Map access in PostgreSQL do not allow index usage #330

hauleth · 2021-06-11T11:24:20Z

Environment

Elixir version (elixir -v): doesn't matter
Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL 12, but applies to any PostgreSQL version
Ecto version (mix deps): 3.6.2
Database adapter and version (mix deps): postgrex 0.15.9
Operating system: macOS 11

Current behavior

Assuming that we have table like:

create table(:foos) do
  add :map, :jsonb
end

create index(:foos, [:map], using: :gin)

And query like:

from f in "foos",
  where: f.map["val"] == 1,
  select: f.id

Then the foos_map_index will not be used due to fact that Ecto will use #> operator which is not part of GIN supported operator classes. This mean that this query will be less performant than less idiomatic:

from f in "foos",
  where: fragment("? @> ?", f.map, %{val: 1}),
  select: f.id

Expected behavior

This gotcha should be either documented or we should detect situations like above and translate them to the @> operator if possible (but that still can be surprising sometimes that it doesn't work as expected).

The text was updated successfully, but these errors were encountered:

josevalim · 2021-06-11T11:55:18Z

We will document it but perhaps, as per your recent comment in the mailing list, this should be reported to PGSQL too, because if we could do the rewrite, they certainly could too. :)

hauleth · 2021-06-11T11:58:33Z

I do not think that it would be so easy to do rewrite in this case. It is much easier to do in $1 = ANY(q.column) than it is to reason about the (q.column #> $1) = $2, especially as you do not know what is in $1. This mean that planner cannot be ran without knowing what is in $1, which mean that "prepared statement" is no longer "prepared".

josevalim · 2021-06-11T12:03:40Z

Couldn't

(q.column #> $1) = $2

be rewritten to:

(q.column @> json_build_object($1, $2))

?

In any case, I think our biggest issue is that we support both regular json and jsonb fields, and @> is DB specific only. And we don't know the underlying DB field type when emitting the query.

hauleth · 2021-06-11T12:13:59Z

@josevalim that is why I think it could be done in Ecto.Adapters.Postgres to do the rewrite to @>. However I do not know if such "magic" would be welcome in the adapters.

@> is supported by both json and jsonb, the difference is that json field cannot be indexed at all, so there is no problem there. And both #> and @> are DB specific operators, so there is not much difference there.

And, no, it couldn't as it wouldn't work for nested objects. There is no way to transform $1 = '{a,b,c}' and $2 = 1 into '{"a":{"b":{"c":1}}}'::jsonb without doing some complex parsing of $1 and $2.

josevalim · 2021-06-11T12:46:53Z

I don't think it is an issue to rewrite, after all, it is the job of the adapter to match on Elixir expressions and convert them to SQL. The important is to keep the semantics.

And, no, it couldn't as it wouldn't work for nested objects. There is no way to transform $1 = '{a,b,c}' and $2 = 1 into '{"a":{"b":{"c":1}}}'::jsonb without doing some complex parsing of $1 and $2.

I don't think it is a concern, we will never have to parse $1. In our case, we always have a static list of keys, so we do have a good idea of the shape. The issue is that keys can be dynamic. So my previous example:

(q.column #> $1) = $2

The complication is rather $1 needing to work as an integer or a string.

wojtekmach · 2021-06-11T12:49:06Z

I don't know if we have a precedent for different behaviour per server version, but fwiw pg 14 adds a subscript syntax:

SELECT ('{ "this": { "now": { "works": "in postgres 14!" }}}'::jsonb)['this']['now']['works'];

(https://www.postgresql.org/about/news/postgresql-14-beta-1-released-2213/)

and I'd expect it to be handled by indexes too which may or may not help here.

hauleth · 2021-06-11T13:05:07Z

we will never have to parse $1

Yes, I was telling if we would like to push that to the PostgreSQL planner. Planner/query builder would need to parse $1 to decide how to work with that.

Of course, the arrays in JSON would be pretty hard, as q.column[1] == 1 cannot be translated to q.column @> '[1]'::jsonb as it has different behaviour. The same goes for q.column["foo"] == [1, 2]. So as long as left hand is an array or we use integer index from array then this rewrite cannot be done.

@wojtekmach I am afraid it will not work either as per documentation it will work only if the operator is applied directly on column:

However, the index could not be used for queries like the following, because though the operator ? is indexable, it is not applied directly to the indexed column jdoc:
-- Find documents in which the key "tags" contains key or array element "qui"
SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui';

hauleth · 2021-06-11T13:08:32Z

It also could be mitigated with better description how to declare indices on JSONB fields, as IIRC:

create index(:foo, ["map #> '{val}'"])

Would allow usage of index in the above case. However I do not know if the DB would be smart enough to see that map->'val' would be the same as map #> '{val}'.

EDIT: Tested it, and index(:foos, ["(map->'val')"]) do not work with map["val"], one need to use index(:foos, ["(map#>'{val}')"]).

josevalim · 2021-06-24T10:03:06Z

@hauleth PR for the docs are always welcome!

josevalim · 2021-07-25T08:17:16Z

@hauleth Ping. :)

hauleth · 2021-07-25T20:35:37Z

@josevalim only one question - where it should be documented? In Ecto.Query, in Ecto.Migration, Ecto.Adapter.Postgres?

princemaple referenced this issue in elixir-ecto/ecto Jun 11, 2021

Add warning on GIN/jsonb, closes #330

9cab2d8

hauleth mentioned this issue Jul 25, 2021

Add note about index usage when using JSON access API elixir-ecto/ecto#3698

Merged

josevalim closed this as completed in elixir-ecto/ecto#3698 Jul 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Map access in PostgreSQL do not allow index usage #330

Map access in PostgreSQL do not allow index usage #330

hauleth commented Jun 11, 2021 •

edited

josevalim commented Jun 11, 2021

hauleth commented Jun 11, 2021

josevalim commented Jun 11, 2021

hauleth commented Jun 11, 2021

josevalim commented Jun 11, 2021

wojtekmach commented Jun 11, 2021

hauleth commented Jun 11, 2021

hauleth commented Jun 11, 2021 •

edited

josevalim commented Jun 24, 2021

josevalim commented Jul 25, 2021

hauleth commented Jul 25, 2021

Map access in PostgreSQL do not allow index usage #330

Map access in PostgreSQL do not allow index usage #330

Comments

hauleth commented Jun 11, 2021 • edited

Environment

Current behavior

Expected behavior

josevalim commented Jun 11, 2021

hauleth commented Jun 11, 2021

josevalim commented Jun 11, 2021

hauleth commented Jun 11, 2021

josevalim commented Jun 11, 2021

wojtekmach commented Jun 11, 2021

hauleth commented Jun 11, 2021

hauleth commented Jun 11, 2021 • edited

josevalim commented Jun 24, 2021

josevalim commented Jul 25, 2021

hauleth commented Jul 25, 2021

hauleth commented Jun 11, 2021 •

edited

hauleth commented Jun 11, 2021 •

edited