Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Containment where clauses #1345

Closed
KevinRecuerda opened this issue Aug 27, 2019 · 4 comments
Closed

Containment where clauses #1345

KevinRecuerda opened this issue Aug 27, 2019 · 4 comments

Comments

@KevinRecuerda
Copy link

Hello,

It seems that generated containment where clauses are different and don't use same index.


Given

public class Issue
{
    public string Id { get; set; }
    public string[] Tags { get; set; }
}

// request 1
session.Query<Issue>().Where(x => x.Tags.Contains("requested tag")).Explain();

// request 2
session.Query<Issue>().Where(x => x.Tags.Any(tag => new [] {"tag1", "tag2"}.Contains(tag))).Explain();

Generated queries

-- request 1
explain ( format json) select d.data, d.id, d.mt_version from docs.mt_doc_issue as d where d.data @> :arg0
with arg0 = {"Tags":["requested tag"]}

where d.data @> {"Tags":["requested tag"]}
--> Index used: GIN (data)

-- request 2
explain ( format json) select d.data, d.id, d.mt_version from docs.mt_doc_issue as d where CAST(d.data ->> 'Tags' as jsonb) ?| :arg0
with arg0 = array["tag1", "tag2"]

where CAST(d.data ->> 'Tags' as jsonb) ?| array["tag1", "tag2"]
--> Index used: GIN (data->'Tags')

Request 1 uses jsonb path op (and so index on data).
Request 2 uses jsonb array op (and so index on data->Tags).


Is there any reason to not generate queries as following?
(using jsonb array op (and so index on data->Tags))

-- request 1
where d.data->'Tags' ? 'requested tag'
--> Index used: GIN (data->'Tags')

-- request 2
where d.data->'Tags' ?| array["tag1", "tag2"]
--> Index used: GIN (data->'Tags')

Thanks & Regards

@ericgreenmix
Copy link
Contributor

According to the Postgres docs, using jsonb_path_ops is more performant than using jsonb_ops. So it makes sense to me that it is using the @> operator.

Although the jsonb_path_ops operator class supports only queries with the @> operator, 
it has notable performance advantages over the default operator class jsonb_ops. 
A jsonb_path_ops index is usually much smaller than a jsonb_ops index over the same data, 
and the specificity of searches is better, 
particularly when queries contain keys that appear frequently in the data. 
Therefore search operations typically perform better than with the default operator class.

from https://www.postgresql.org/docs/11/datatype-json.html

@KevinRecuerda
Copy link
Author

Indeed, jsonb_path_ops is more performant than using jsonb_ops, assuming using same index path.

But this is not the case here: GIN (data) vs GIN (data->'Tags')

From same doc:

A simple GIN index on the jdoc column can support this query. But note that such an index will store copies of every key and value in the jdoc column, whereas the expression index of the previous example stores only data found under the tags key. While the simple-index approach is far more flexible (since it supports queries about any key), targeted expression indexes are likely to be smaller and faster to search than a simple index.

According to the doc, GIN (data->'Tags') seems to be more efficient than GIN (data).

@ericgreenmix
Copy link
Contributor

What you are referencing is targeted jsonb_ops (GIN (data->'Tags')) vs non-targeted jsonb_ops (GIN(data)).

When using the @> as I am talking about it uses the jsonb_path_ops, which is more efficient for searching. GIN ("data" jsonb_path_ops)

@KevinRecuerda
Copy link
Author

Indeed, my mistake.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants