Iceberg optimize fails when sorted_by UUID columns #18136

Closed
joshuarobinson opened this issue Jul 5, 2023 · 1 comment · Fixed by #18154
Labels
bug Something isn't working

Comments

@joshuarobinson

I have an Iceberg table that I cannot optimize.

When I run:
alter table locations execute optimize;

I get the following error:
Query 20230705_100203_00039_3b66q failed: Unsupported Hive type: uuid

The table (locations) is configured with sorted_by, and the sort column is of type UUID.

I only started hitting this error recently, as the table grew. I think it only happens when the sort operation needs to spill to temporary files.

stacktrace:
io.trino.spi.TrinoException: Unsupported Hive type: uuid
    at io.trino.orc.metadata.OrcType.toOrcType(OrcType.java:259)
    at io.trino.orc.metadata.OrcType.createOrcRowType(OrcType.java:307)
    at io.trino.orc.metadata.OrcType.createRootOrcType(OrcType.java:297)
    at io.trino.orc.metadata.OrcType.createRootOrcType(OrcType.java:291)
    at io.trino.plugin.hive.util.TempFileWriter.createOrcFileWriter(TempFileWriter.java:79)
    at io.trino.plugin.hive.util.TempFileWriter.<init>(TempFileWriter.java:44)
    at io.trino.plugin.hive.SortingFileWriter.writeTempFile(SortingFileWriter.java:258)
    at io.trino.plugin.hive.SortingFileWriter.flushToTempFile(SortingFileWriter.java:201)
    at io.trino.plugin.hive.SortingFileWriter.appendRows(SortingFileWriter.java:125)
    at io.trino.plugin.iceberg.IcebergSortingFileWriter.appendRows(IcebergSortingFileWriter.java:88)
    at io.trino.plugin.iceberg.IcebergPageSink.writePage(IcebergPageSink.java:308)
    at io.trino.plugin.iceberg.IcebergPageSink.doAppend(IcebergPageSink.java:260)
    at io.trino.plugin.iceberg.IcebergPageSink.appendPage(IcebergPageSink.java:213)
    at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.appendPage(ClassLoaderSafeConnectorPageSink.java:68)
    at io.trino.operator.TableWriterOperator.addInput(TableWriterOperator.java:255)
    at io.trino.operator.Driver.processInternal(Driver.java:401)
    at io.trino.operator.Driver.lambda$process$8(Driver.java:299)
    at io.trino.operator.Driver.tryWithLock(Driver.java:695)
    at io.trino.operator.Driver.process(Driver.java:291)
    at io.trino.operator.Driver.processForDuration(Driver.java:262)
    at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
    at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
    at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:556)
    at io.trino.$gen.Trino_420____20230702_183201_2.run(Unknown Source)
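
A reproduction of the report above might look roughly like the following sketch. Only the table name, the `sorted_by` property, and the UUID sort column come from the report; the column names (`id`, `name`) and the row count are hypothetical. Whether the sort actually spills to temporary files depends on the writer's buffer settings, so the row count may need adjusting.

```sql
-- Hypothetical schema: only the uuid sort column and sorted_by are from the report.
CREATE TABLE locations (
    id uuid,
    name varchar
)
WITH (sorted_by = ARRAY['id']);

-- Generate rows with random UUIDs. sequence() is limited to 10000 entries
-- per call, so two sequences are cross joined to reach a larger row count.
-- A single large INSERT goes through the same sorting writer, so the error
-- may already surface here rather than at the optimize step.
INSERT INTO locations
SELECT uuid(), CAST(a.n * 1000 + b.n AS varchar)
FROM UNNEST(sequence(1, 10000)) AS a(n)
CROSS JOIN UNNEST(sequence(1, 1000)) AS b(n);

-- The statement from the report.
ALTER TABLE locations EXECUTE optimize;
```

If the data volume is too small for the sort to spill, the statements are expected to succeed, which would match the observation that the error only appeared once the table had grown.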

@rchukh

rchukh commented Jul 5, 2023

I hit the same issue some time ago and could reproduce it consistently with more than a trivial amount of data.
In my case, it was triggered by CTAS rather than optimize.
It is indeed related to the temp files, and it affects not only UUID but also some of the timestamp types.

I have had this fixed locally for a while, but I still need to figure out a better way to generate more data in the existing tests (the non-test part is done and seems to work fine).

I'll try to find some time this week to finish the changes and submit the PR.


3 participants