Performance hit from `in` vs `=` on indexed properties #6668

jollygreenlaser · 2024-01-05T04:37:06Z

EdgeDB Version: 4.3
EdgeDB CLI Version: EdgeDB CLI 4.0.2+500be79
OS Version: OSX Sonoma 14.1.1

Steps to Reproduce:

Add index to .thing
Query using .thing in { option }
Query using .thing = option

Using edgedb analyze --expand it shows the index is never used for in but can be used for = when the filter creates an appropriately small group.

It seems that in is forcing the query planner to not account for indices on the property.

The text was updated successfully, but these errors were encountered:

jollygreenlaser · 2024-01-07T02:17:04Z

I submitted a database dump and exact queries to Devon via Discord.

I tested on postgres and confirmed that postgres will use the index with in.

I can use and / or and string build my queries for now. However, this is both more annoying to code and leaves a lot of room for mistakes with injection. Would be nice to go back to array unpack.

themajashurka · 2024-03-10T14:54:49Z

+1 on this. Actually (on a bit different scenario), using an array literal, and an array_unpack around it uses the index:

SELECT A
FILTER .id in array_unpack(
  [
    <uuid>"042029c6-dee5-11ee-9d1f-377ec3aea48d",
    <uuid>"04202a3e-dee5-11ee-9d1f-0722bfa9b90a",
    <uuid>"04202e08-dee5-11ee-9d1f-87413e89e885", 
    <uuid>"04206e40-dee5-11ee-9d1f-af4f29f49a8f",
    <uuid>"0420cb6a-dee5-11ee-9d1f-33a575eb1f6f"
  ]
);
#0.053ms

────────────────────────────────── Fine-grained Query Plan ──────────────────────────────────
 │ Time  Cost Loops Rows Width │ Plan Info
 │  0.0 44.44   1.0  5.0    32 │ ➊ IndexOnlyScan relation_name=A, scan_direction=Forward,
 │                             │ index_name=constraint 'std::exclusive' of property 'id' of o
 │                             │ bject type 'default::A'

Whereas:

SELECT A
FILTER .id in {
   <uuid>"042029c6-dee5-11ee-9d1f-377ec3aea48d",
   <uuid>"04202a3e-dee5-11ee-9d1f-0722bfa9b90a",
   <uuid>"04202e08-dee5-11ee-9d1f-87413e89e885", 
   <uuid>"04206e40-dee5-11ee-9d1f-af4f29f49a8f",
   <uuid>"0420cb6a-dee5-11ee-9d1f-33a575eb1f6f"
};
#120ms

─────────────────── Fine-grained Query Plan ───────────────────
   │  Time     Cost Loops Rows Width │ Plan Info
┬─ │ 118.2 21104.78   1.0  5.0    32 │ ➊ SeqScan relation_name=A
╰┬ │   0.0     0.08   1.0  5.0    16 │ Append
 ├ │   0.0     0.01   1.0  1.0    16 │ Result
 ├ │   0.0     0.01   1.0  1.0    16 │ Result
 ├ │   0.0     0.01   1.0  1.0    16 │ Result
 ├ │   0.0     0.01   1.0  1.0    16 │ Result
 ╰ │   0.0     0.01   1.0  1.0    16 │ Result

themajashurka · 2024-03-10T15:17:21Z

I made a separate issue on this with a few numbers: #7027

jollygreenlaser · 2024-10-03T14:43:47Z

Another similar example:

type A {
  name: str {
     constraint exclusive;
  }
}

with
name := <str>(
  with s := json_array_unpack(to_json(<str>$0))
  select s
  filter <int16>s['rank'] = <int16>$1
)['team'],
select A
filter .name = name

This will not hit that index. But if I slap assert_single on name it will.

raddevon added the performance label Jan 5, 2024

jollygreenlaser mentioned this issue Mar 21, 2024

Performance issue with in / = on indexed property #7027

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance hit from `in` vs `=` on indexed properties #6668

Performance hit from `in` vs `=` on indexed properties #6668

jollygreenlaser commented Jan 5, 2024

jollygreenlaser commented Jan 7, 2024

themajashurka commented Mar 10, 2024

themajashurka commented Mar 10, 2024

jollygreenlaser commented Oct 3, 2024

Performance hit from in vs = on indexed properties #6668

Performance hit from in vs = on indexed properties #6668

Comments

jollygreenlaser commented Jan 5, 2024

jollygreenlaser commented Jan 7, 2024

themajashurka commented Mar 10, 2024

themajashurka commented Mar 10, 2024

jollygreenlaser commented Oct 3, 2024

Performance hit from `in` vs `=` on indexed properties #6668

Performance hit from `in` vs `=` on indexed properties #6668