When iceberg stream reads table data, the data of update and delete operations will not be read out #7835

@luckyQing

Apache Iceberg version

1.2.1 (latest release)

Query engine

None

Please describe the bug 🐞

I use the `TABLE_SCAN_THEN_INCREMENTAL` starting strategy to read Iceberg table data in real time with the Flink `IcebergSource`. When a DELETE or UPDATE statement is executed in Trino, the following code does not read the changed data. It only picks up rows written by INSERT statements.

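```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.RowData;
import org.apache.flink.util.CloseableIterator;
import org.apache.iceberg.flink.source.IcebergSource;
import org.apache.iceberg.flink.source.StreamingStartingStrategy;
import org.apache.iceberg.flink.source.assigner.SimpleSplitAssignerFactory;

// Streaming source: scan the whole table first, then poll for new snapshots every 10 seconds.
IcebergSource<RowData> icebergSource = IcebergSource.forRowData()
        .table(sourceTable)
        .tableLoader(sourceTableLoader)
        .assignerFactory(new SimpleSplitAssignerFactory())
        .streaming(true)
        .streamingStartingStrategy(StreamingStartingStrategy.TABLE_SCAN_THEN_INCREMENTAL)
        .monitorInterval(Duration.ofSeconds(10))
        .build();

StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(buildConfiguration());
DataStream<RowData> stream =
        env.fromSource(icebergSource, WatermarkStrategy.noWatermarks(), "icebergSource", TypeInformation.of(RowData.class));
// Print every row the source emits.
try (CloseableIterator<RowData> iterator = stream.executeAndCollect()) {
    while (iterator.hasNext()) {
        System.out.println(iterator.next());
    }
}
```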
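
One possible explanation, stated here as an assumption rather than a confirmed diagnosis: the incremental phase of the streaming read may only consider snapshots whose operation is `append`, while row-level DELETE and UPDATE commits are recorded with a different operation such as `overwrite` or `delete`. The sketch below models that filter in plain Java with no Iceberg dependency. `SnapshotInfo`, `sampleHistory`, and `visibleToAppendOnlyRead` are hypothetical names made up for illustration, and the mapping from SQL statement to snapshot operation is assumed, not taken from Trino or Iceberg source code.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendOnlyIncrementalModel {

    // Hypothetical stand-in for a table snapshot: an id plus its commit operation.
    static final class SnapshotInfo {
        final long id;
        final String operation;

        SnapshotInfo(long id, String operation) {
            this.id = id;
            this.operation = operation;
        }
    }

    // Returns the ids of the snapshots that an append-only incremental read would surface.
    static List<Long> visibleToAppendOnlyRead(List<SnapshotInfo> history) {
        List<Long> visible = new ArrayList<>();
        for (SnapshotInfo snapshot : history) {
            if ("append".equals(snapshot.operation)) {
                visible.add(snapshot.id);
            }
        }
        return visible;
    }

    // Illustrative commit history; the statement-to-operation mapping is an assumption.
    static List<SnapshotInfo> sampleHistory() {
        List<SnapshotInfo> history = new ArrayList<>();
        history.add(new SnapshotInfo(1L, "append"));     // INSERT
        history.add(new SnapshotInfo(2L, "overwrite"));  // UPDATE (assumed)
        history.add(new SnapshotInfo(3L, "delete"));     // DELETE (assumed)
        history.add(new SnapshotInfo(4L, "append"));     // INSERT
        return history;
    }

    public static void main(String[] args) {
        // Only the two INSERT commits pass the filter.
        System.out.println(visibleToAppendOnlyRead(sampleHistory())); // prints [1, 4]
    }
}
```

If this model matches the real behavior, listing the operation of each snapshot on the actual table after running the Trino statements would show whether the missing changes sit in non-append snapshots.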
