Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PRIMARY KEY lookup with additional filters #10298

Closed
chaudum opened this issue Jul 31, 2020 · 4 comments · Fixed by #10708
Closed

PRIMARY KEY lookup with additional filters #10298

chaudum opened this issue Jul 31, 2020 · 4 comments · Fixed by #10708

Comments

@chaudum
Copy link
Contributor

chaudum commented Jul 31, 2020

Use case:

The SQLAlchemy ORM loads expired session objects by their primary key identity:

session = Session()
o = Model(id=uuid.uuid4().hex, deleted=False)
session.add(o)
session.commit()  # will expire the object `o`

print(f"{o.id=}")  # will perform do a PK lookup of the object `o`

Since documents in CrateDB are immediately available through PK lookup, this works great. So far so good.

However, we have the case that a bunch of SQLAlchemy models have an deleted field, which defines whether the object is "visible". In order to transparently hide the deleted = FALSE filter from queries, we make use of the pattern described here: https://docs.sqlalchemy.org/en/13/orm/events.html#sqlalchemy.orm.events.QueryEvents.before_compile
Compiling the query for loading on primary key identity however, is also going through this event hook, causing the alleged primary key lookups to be converted to regular search queries. These search queries then of course do not find the object with the given primary key.

Example:

$ export PK=$(cat /proc/sys/kernel/random/uuid); cr8 run-crate 4.2.x -- @crash <<EOF <<<
CREATE TABLE t1 (id TEXT PRIMARY KEY, deleted BOOLEAN);
INSERT INTO t1 (id, deleted) values ('$PK', FALSE);
SELECT * FROM t1 WHERE id = '$PK';
SELECT * FROM t1 WHERE id = '$PK' AND deleted = FALSE;
EOF
# run-crate
===========

Skipping download, tarball alrady extracted at /home/christian/.cache/cr8/crates/crate-4.2.2
Starting Crate process
CrateDB launching:
    PID: 1503697
    Logs: /home/christian/.cache/cr8/crates/crate-4.2.2/logs/cr8-crate-run670707194.log
    Data: /tmp/tmp4d1k9jbh (removed on stop)

Psql      : 127.0.0.1:5432
Http      : 127.0.0.1:4200
Transport : 127.0.0.1:4300
Cluster ready to process requests


# @crash
========

CONNECT OK
CREATE OK, 1 row affected  (0.413 sec)
INSERT OK, 1 row affected  (0.097 sec)
+--------------------------------------+---------+
| id                                   | deleted |
+--------------------------------------+---------+
| 1c02ffaa-8f08-4106-9ea7-3f609ac99518 | FALSE   |
+--------------------------------------+---------+
SELECT 1 row in set (0.105 sec)
+----+---------+
| id | deleted |
+----+---------+
+----+---------+
SELECT 0 rows in set (0.044 sec)

Feature description:

Would it somehow be possible to make use of primary key lookups if a query contains the full primary key even if there are additional filters? CrateDB should be able to find the document immediately, right?

@chaudum chaudum added the triage An issue that needs to be triaged by a maintainer label Jul 31, 2020
@mfussenegger
Copy link
Member

Would it somehow be possible to make use of primary key lookups if a query contains the full primary key even if there are additional filters?

Yes, if there are no OR clauses but only AND it should be possible to do.

@mfussenegger mfussenegger added complexity: no estimate feature feature: performance and removed triage An issue that needs to be triaged by a maintainer labels Jul 31, 2020
@mfussenegger mfussenegger added this to To Do in 4.3 via automation Jul 31, 2020
@mfussenegger mfussenegger added feature: integration Better integration with other systems client: crate sqlalchemy labels Jul 31, 2020
@chaudum
Copy link
Contributor Author

chaudum commented Jul 31, 2020

Would it somehow be possible to make use of primary key lookups if a query contains the full primary key even if there are additional filters?

Yes, if there are no OR clauses but only AND it should be possible to do.

So WHERE id = ? AND (deleted = FALSE OR deleted IS NULL) would not work?

I guess that is ok, since we can rely on deleted being NOT NULL.

@mfussenegger
Copy link
Member

So WHERE id = ? AND (deleted = FALSE OR deleted IS NULL) would not work?

I think that should also work, problematic is a OR on the same level as the primary key column. Something like id = ? OR x = 10

@mfussenegger mfussenegger moved this from Candidate to Could in 4.3 Sep 2, 2020
@mfussenegger mfussenegger added this to Candidate in 4.4 via automation Sep 11, 2020
@mfussenegger mfussenegger removed this from Could in 4.3 Sep 11, 2020
@mfussenegger mfussenegger moved this from Candidate to Must in 4.4 Sep 11, 2020
@seut seut self-assigned this Oct 23, 2020
seut added a commit that referenced this issue Oct 27, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimzed PK lookup plan is used now.

Closes #10298.
seut added a commit that referenced this issue Oct 27, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimized PK lookup plan is used now.

Closes #10298.
seut added a commit that referenced this issue Oct 29, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimized PK lookup plan is used now.

Closes #10298.
seut added a commit that referenced this issue Oct 29, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimized PK lookup plan is used now.

Closes #10298.
seut added a commit that referenced this issue Nov 2, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimized PK lookup plan is used now.

Closes #10298.
@mergify mergify bot closed this as completed in #10708 Nov 2, 2020
4.4 automation moved this from Must to Done Nov 2, 2020
mergify bot pushed a commit that referenced this issue Nov 2, 2020
If filters are combined by AND operators in addition to filters
on primary key columns, the optimized PK lookup plan is used now.

Closes #10298.
@seut
Copy link
Member

seut commented Nov 2, 2020

We've implemented support for additional filters on primary-key SELECTqueries when combined with an AND operator on the same level by #10708.
It will be part of our next feature 4.4 feature release.
Thanks @chaudum for bringing this up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
4.4
  
Done
Development

Successfully merging a pull request may close this issue.

3 participants