Deletion vector with partitioned table returns incorrect results in Delta Lake connector #21737

Closed
ebyhr opened this issue Apr 29, 2024 · 0 comments · Fixed by #21738
Labels: bug (Something isn't working), correctness, delta-lake (Delta Lake connector)

ebyhr (Member) commented Apr 29, 2024

Steps to create a table on Spark:

CREATE TABLE test (id int, part int) using delta partitioned by (part) TBLPROPERTIES ('delta.enableDeletionVectors' = true);
INSERT INTO test VALUES (1,1), (2,1);
DELETE FROM test WHERE id = 1;
SELECT * FROM test;
2	1

Read the table on Trino:

TABLE delta.default.test;
 id | part
----+------
(0 rows)

It may also throw an exception in some situations:

io.trino.spi.TrinoException: Index 1 out of bounds for length 1
at io.trino.plugin.deltalake.DeltaLakePageSource.getNextPage(DeltaLakePageSource.java:206)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:291)
at io.trino.operator.Driver.processInternal(Driver.java:395)
at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
at io.trino.operator.Driver.tryWithLock(Driver.java:701)
at io.trino.operator.Driver.process(Driver.java:290)
at io.trino.operator.Driver.processForDuration(Driver.java:261)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
at io.trino.execution.executor.timesharing.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.timesharing.TimeSharingTaskExecutor$TaskRunner.run(TimeSharingTaskExecutor.java:565)
at io.trino.$gen.Trino_xxx____20240417_154334_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
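
For reference, here is a minimal sketch of the result the Trino read is expected to produce, matching the Spark output above. It only illustrates deletion vector semantics (rows at deleted positions are filtered out). It is not Trino's or Delta Lake's actual code, and the function name `apply_deletion_vector` and the assumption that the deleted row sits at position 0 in a single data file are hypothetical.

```python
# Hypothetical illustration of deletion vector semantics, not Trino's implementation.
def apply_deletion_vector(rows, deleted_positions):
    """Return the rows whose position is not marked as deleted."""
    deleted = set(deleted_positions)
    return [row for position, row in enumerate(rows) if position not in deleted]


# Logical rows (id, part) written by: INSERT INTO test VALUES (1,1), (2,1)
rows = [(1, 1), (2, 1)]

# Assumption: both rows landed in one data file and
# DELETE FROM test WHERE id = 1 marked row position 0 as deleted.
deleted_positions = [0]

visible = apply_deletion_vector(rows, deleted_positions)
print(visible)  # [(2, 1)]
```

Under these assumptions the only visible row is `(2, 1)`, which is what Spark returns and what Trino should return instead of 0 rows.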
ebyhr added the bug, correctness, and delta-lake labels on Apr 29, 2024
ebyhr self-assigned this on Apr 29, 2024
ebyhr changed the title from "Deletion vector with partitioned table returns incorrect results" to "Deletion vector with partitioned table returns incorrect results in Delta Lake connector" on Apr 29, 2024