
Some INSERT statements are ignored with Iceberg #20092

Closed
ricardopereira33 opened this issue Dec 13, 2023 · 7 comments · Fixed by #20207
Labels
bug Something isn't working correctness iceberg Iceberg connector RELEASE-BLOCKER

Comments


ricardopereira33 commented Dec 13, 2023

We noticed a strange issue with Trino version 434 (the latest). Some of our users run many `INSERT INTO ...` statements against an Iceberg table with Trino. After we upgraded from version 428 to 434, it seems that some inserts are silently ignored. The insert actually writes the data file to object storage and commits a new snapshot reporting that it added X records (the stats in the Trino UI show the same numbers). However, when we query the table, none of those records are returned. We also tried querying the Iceberg table with Spark, and it returns nothing either.

[Three screenshots from 2023-12-11 showing the Trino UI stats for the affected inserts]

We analysed the metadata and manifest files and found the following:

Up to sequence number 15934, everything was fine. Then two inserts ran almost at the same time and committed two new snapshots:

29819-ccca636b-3279-4fe2-a733-bc3eed379312.metadata.json  03:56:15
29819-982db761-e158-42ee-8b5e-2c2ac92681ac.metadata.json  03:56:12

Normally the metadata file names follow a pattern where the numeric prefix increments with every new snapshot, but in this case both files share the same prefix (we are not sure whether that is expected). In any case, if we open both metadata files, we can see the references to the manifest lists (Avro files), and the later metadata file includes the snapshot from the earlier one, so the concurrent commits appear to have been reconciled.
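To illustrate the observation above (a hypothetical helper, not Trino or Iceberg code): extracting the numeric prefix from the two metadata file names shows that both carry version 29819, which suggests the two commits were produced against the same base metadata version.

```python
def metadata_version(filename: str) -> int:
    """Extract the leading version number from an Iceberg metadata file name
    of the form '<version>-<uuid>.metadata.json'."""
    return int(filename.split("-", 1)[0])

files = [
    "29819-ccca636b-3279-4fe2-a733-bc3eed379312.metadata.json",
    "29819-982db761-e158-42ee-8b5e-2c2ac92681ac.metadata.json",
]

versions = [metadata_version(f) for f in files]
print(versions)            # [29819, 29819]
print(len(set(versions)))  # 1 -> both commits share the same base version
```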

  • Last snapshot entry in 29819-982db761-e158-42ee-8b5e-2c2ac92681ac.metadata.json:
 {
    "sequence-number" : 15935,
    "snapshot-id" : 2349488274696194790,
    "parent-snapshot-id" : 4184787990992439919,
    "timestamp-ms" : 1702007762898,
    "summary" : {
      "operation" : "append",
      "trino_query_id" : "20231208_035545_04129_jywup",
      "added-data-files" : "1",
      "added-records" : "14",
      "added-files-size" : "14047",
      "changed-partition-count" : "1",
      "total-records" : "2670050571",
      "total-files-size" : "211155960909",
      "total-data-files" : "36273",
      "total-delete-files" : "1616",
      "total-position-deletes" : "5429420",
      "total-equality-deletes" : "0"
    },
    "manifest-list" : "gs://...b2a6894c12fb.avro",
    "schema-id" : 3
  }
  • Last two snapshot entries in 29819-ccca636b-3279-4fe2-a733-bc3eed379312.metadata.json:
{
    "sequence-number" : 15935,
    "snapshot-id" : 2349488274696194790,
    "parent-snapshot-id" : 4184787990992439919,
    "timestamp-ms" : 1702007762898,
    "summary" : {
      "operation" : "append",
      "trino_query_id" : "20231208_035545_04129_jywup",
      "added-data-files" : "1",
      "added-records" : "14",
      "added-files-size" : "14047",
      "changed-partition-count" : "1",
      "total-records" : "2670050571",
      "total-files-size" : "211155960909",
      "total-data-files" : "36273",
      "total-delete-files" : "1616",
      "total-position-deletes" : "5429420",
      "total-equality-deletes" : "0"
    },
    "manifest-list" : "gs://...b2a6894c12fb.avro",
    "schema-id" : 3
},
{
    "sequence-number" : 15936,
    "snapshot-id" : 6897514136146417222,
    "parent-snapshot-id" : 2349488274696194790,
    "timestamp-ms" : 1702007773563,
    "summary" : {
      "operation" : "append",
      "trino_query_id" : "20231208_035556_04136_jywup",
      "added-data-files" : "1",
      "added-records" : "6",
      "added-files-size" : "14039",
      "changed-partition-count" : "1",
      "total-records" : "2670050577",
      "total-files-size" : "211155974948",
      "total-data-files" : "36274",
      "total-delete-files" : "1616",
      "total-position-deletes" : "5429420",
      "total-equality-deletes" : "0"
    },
    "manifest-list" : "gs://...5cfc3f50c6c9.avro",
    "schema-id" : 3
  } ],
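As a side note, the snapshot summaries quoted above are internally consistent: the running total grows by exactly the number of added records, so the metadata itself claims the insert succeeded. A small consistency check (a sketch over the numbers above, not Iceberg code):

```python
# Snapshot summaries copied from the metadata excerpts above.
snapshots = [
    {"sequence-number": 15935, "added-records": 14, "total-records": 2_670_050_571},
    {"sequence-number": 15936, "added-records": 6,  "total-records": 2_670_050_577},
]

# Each snapshot's total-records should grow by exactly its added-records.
for prev, curr in zip(snapshots, snapshots[1:]):
    delta = curr["total-records"] - prev["total-records"]
    assert delta == curr["added-records"], (prev, curr)
    print(f"snapshot {curr['sequence-number']}: total grew by {delta}, as reported")
```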

When we inspected the Avro manifest-list files, we could see the problem. The commit before these two inserts has the following information for sequence number 15934:

{'avro.schema': ..., 
'avro.codec': b'deflate', 
'snapshot-id': b'4184787990992439919', 
'format-version': b'2', 
'sequence-number': b'15934', 
'iceberg.schema': ...
'parent-snapshot-id': b'7860648970688177621'}
{'manifest_path': 'gs://...954ea5f5a0a0-m0.avro', 'manifest_length': 13879, 'sequence_number': 15934, 'min_sequence_number': 15934, 'added_snapshot_id': 4184787990992439919, ...}

However, for the two new commits, each Avro manifest list contains this:

  • gs://...b2a6894c12fb.avro:
{'avro.schema': ..., 
'avro.codec': b'deflate', 
'snapshot-id': b'2349488274696194790', 
'format-version': b'2', 
'sequence-number': b'15935', 
'iceberg.schema': ..., 
'parent-snapshot-id': b'4184787990992439919'}
{'manifest_path': 'gs://...954ea5f5a0a0-m0.avro', 'manifest_length': 13879, 'sequence_number': 15934, 'min_sequence_number': 15934, 'added_snapshot_id': 4184787990992439919, ...}
  • gs://...5cfc3f50c6c9.avro:
{'avro.schema': ..., 
    'avro.codec': b'deflate', 
    'snapshot-id': b'6897514136146417222', 
    'format-version': b'2', 
    'sequence-number': b'15936', 
    'iceberg.schema': ..., 
    'parent-snapshot-id': b'2349488274696194790'
}
{'manifest_path': 'gs://...954ea5f5a0a0-m0.avro', 'manifest_length': 13879, 'sequence_number': 15934, 'min_sequence_number': 15934, 'added_snapshot_id': 4184787990992439919, ...}

For some reason, no new -mX.avro manifest file is referenced for these inserts…
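The difference between a healthy append and what we observed can be sketched as a toy model (pure illustration, no Iceberg APIs; the new manifest name is hypothetical): on a correct append, the new snapshot's manifest list should contain the parent's manifests plus one new -mX.avro manifest for the freshly written data file, but here both new manifest lists contain only the parent's manifest.

```python
# Manifest list of the parent snapshot (sequence 15934), as quoted above.
parent_manifests = ["gs://...954ea5f5a0a0-m0.avro"]

def expected_append(parent: list, new_manifest: str) -> list:
    """On a correct append, the child's manifest list is the parent's
    manifests plus one new manifest for the appended data file."""
    return parent + [new_manifest]

# What a healthy append for sequence 15935 would look like
# (the new manifest name is hypothetical):
expected = expected_append(parent_manifests, "gs://...<new-uuid>-m0.avro")

# What we actually found in both gs://...b2a6894c12fb.avro and
# gs://...5cfc3f50c6c9.avro: only the parent's manifest, nothing new.
observed = ["gs://...954ea5f5a0a0-m0.avro"]

print(len(expected) - len(observed))  # 1 -> one manifest (and its data file) is lost
```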

@findinpath findinpath added bug Something isn't working correctness iceberg Iceberg connector labels Dec 13, 2023
@findinpath
Contributor

Slack discussion:

https://trinodb.slack.com/archives/CJ6UC075E/p1702376393926019

Contributor

findinpath commented Dec 13, 2023

Snapshot IDs sequence for the table:
7860648970688177621 -> 4184787990992439919 -> 2349488274696194790 -> 6897514136146417222

The snapshots 2349488274696194790 and 6897514136146417222 have:

  • its own, monotonically ascending sequence number ✔️
  • its own manifest list file ✔️
  • nearly identical manifest list contents ✖️

Only the header changed, carrying the new sequence number and snapshot IDs:

{'avro.schema': ..., 
'avro.codec': b'deflate', 
'snapshot-id': b'2349488274696194790', 
'format-version': b'2', 
'sequence-number': b'15935', 
'iceberg.schema': ..., 
'parent-snapshot-id': b'4184787990992439919'}

Relevant Trino code:

AppendFiles appendFiles = isMergeManifestsOnWrite(session) ? transaction.newAppend() : transaction.newFastAppend();
ImmutableSet.Builder<String> writtenFiles = ImmutableSet.builder();
for (CommitTaskData task : commitTasks) {
    DataFiles.Builder builder = DataFiles.builder(icebergTable.spec())
            .withPath(task.getPath())
            .withFileSizeInBytes(task.getFileSizeInBytes())
            .withFormat(table.getFileFormat().toIceberg())
            .withMetrics(task.getMetrics().metrics());
    if (!icebergTable.spec().fields().isEmpty()) {
        String partitionDataJson = task.getPartitionDataJson()
                .orElseThrow(() -> new VerifyException("No partition data for partitioned table"));
        builder.withPartition(PartitionData.fromJson(partitionDataJson, partitionColumnTypes));
    }
    appendFiles.appendFile(builder.build());
    writtenFiles.add(task.getPath());
}
// try to leave as little garbage as possible behind
if (table.getRetryMode() != NO_RETRIES) {
    cleanExtraOutputFiles(session, writtenFiles.build());
}
commit(appendFiles, session);
transaction.commitTransaction();

Relevant Iceberg code:
https://github.com/apache/iceberg/blob/3112ec91617ef080604f47ac92c255b517458522/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L226-L244

A relevant piece of information: for the INSERT whose data file is not reflected in the table, we see:

      "added-data-files" : "1",
      "added-records" : "6",

which means that there actually were newDataFiles in the snapshot producer.

https://github.com/apache/iceberg/blob/7240752e18278b3454be141c2e4b121079004d8b/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L933-L935

However, the manifest file corresponding to the new data file is missing 🙄


yoniiny commented Dec 14, 2023

@findinpath I think it's probably a regression introduced in Iceberg 1.4. If it's the same issue, here's the relevant PR that should fix it: apache/iceberg#9230

Member

findepi commented Dec 18, 2023

@alexjo2144 posted a temporary fix for Trino -- #20159
thank you Alex!

Member

findepi commented Dec 19, 2023

It's disturbing that io.trino.testing.BaseConnectorTest#testInsertRowConcurrently didn't catch the problem.
I would expect it to catch it, assuming no special setup is needed for a repro.
@yoniiny any idea what we were missing in this test?

Member

findepi commented Dec 20, 2023

#20159 would fix the problem, but the current plan is to have a better fix (#20159 (comment)).

@findinpath
Contributor

#20207 should address the reported issue.

4 participants