v3 initial-default rows are silently dropped when a query filters on the defaulted column #16690

### Apache Iceberg version

1.11.0 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

When a column is added by schema evolution with an `initial-default`, rows from files written before
the column existed are correctly backfilled on a **full scan**, but are **silently dropped from results
whenever a query filters on that column**. No exception is raised; the query simply returns wrong
results (missing rows).

Reproduced on releases `1.11.0` and `1.10.1`, and on `main` @ `c00669fde`, across **Spark 3.5, 4.0,
and 4.1** (identical result on all three), Parquet, format-version 3. The root cause is in shared `core`/`parquet` read
code, so it is not Spark-version-specific; the same per-file record filter is used by the Flink and
generic readers as well.

The defaulted column has to be *added to an existing table* for it to be absent from older files. Today
that is only reachable through the schema-evolution API, because Spark SQL
`ALTER TABLE ... ADD COLUMN ... DEFAULT` is currently rejected (`UnsupportedOperationException: setting
default values in Spark is currently unsupported`). That is part of why this filtered-read case is
untested.

#### Repro

`id=1` is written before column `c` exists (so `c` is physically absent from that file); `c` is then
added with `initial-default 'US'`; `id=2` is written with `c='US'`.

```java
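// spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/sql/TestDefaultFilteredRead.java
package org.apache.iceberg.spark.sql;

import static org.assertj.core.api.Assertions.assertThat;

import org.apache.iceberg.Parameters;
import org.apache.iceberg.Table;
import org.apache.iceberg.expressions.Expressions;
import org.apache.iceberg.spark.CatalogTestBase;
import org.apache.iceberg.spark.SparkCatalogConfig;
import org.apache.iceberg.types.Types;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.TestTemplate;

public class TestDefaultFilteredRead extends CatalogTestBase {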

  @Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
  protected static Object[][] parameters() {
    return new Object[][] {
      {SparkCatalogConfig.HADOOP.catalogName(),
       SparkCatalogConfig.HADOOP.implementation(),
       SparkCatalogConfig.HADOOP.properties()}
    };
  }

  @AfterEach
  public void dropTable() {
    sql("DROP TABLE IF EXISTS %s", tableName);
  }

  @TestTemplate
  public void filteredReadOverAbsentDefaultColumn() {
    sql("CREATE TABLE %s (id bigint, name string) USING iceberg "
        + "TBLPROPERTIES ('format-version'='3','write.format.default'='parquet')", tableName);
    sql("INSERT INTO %s VALUES (1, 'Alice')", tableName);          // F_old: column c absent

    Table table = validationCatalog.loadTable(tableIdent);
    table.updateSchema().addColumn("c", Types.StringType.get(), Expressions.lit("US")).commit();
    sql("REFRESH TABLE %s", tableName);
    sql("INSERT INTO %s VALUES (2, 'Bob', 'US')", tableName);      // F_new: c = 'US'

    // full scan is correct -> the default is materialized for id=1
    assertThat(sql("SELECT id FROM %s ORDER BY id", tableName))
        .containsExactly(row(1L), row(2L));
    // BUG: filtering on c drops id=1 (the backfilled row). Expected [1, 2], actual [2].
    assertThat(sql("SELECT id FROM %s WHERE c = 'US' ORDER BY id", tableName))
        .containsExactly(row(1L), row(2L));        // fails today: only [2]
    assertThat(sql("SELECT id FROM %s WHERE c IS NOT NULL ORDER BY id", tableName))
        .containsExactly(row(1L), row(2L));        // fails today: only [2]
  }
}
```

#### Expected behavior

| Query | Expected |
|---|---|
| `SELECT id, c` | `(1,US) (2,US)` |
| `WHERE c = 'US'` | `[1, 2]` |
| `WHERE upper(c) = 'US'` | `[1, 2]` |
| `WHERE c IS NOT NULL` | `[1, 2]` |
| `WHERE c IS NULL` | `[]` |

#### Actual behavior

| Query | Actual |
|---|---|
| `SELECT id, c` | `(1,US) (2,US)` ✅ |
| `WHERE c = 'US'` | `[2]` ❌ |
| `WHERE upper(c) = 'US'` | `[2]` ❌ |
| `WHERE c IS NOT NULL` | `[2]` ❌ |
| `WHERE c IS NULL` | `[]` ✅ |

The full scan proves the default is materialized for `id=1`; the filtered queries drop that row
whenever the predicate references `c`.

#### Root cause

The default is injected in the per-format reader (`BaseParquetReaders`, via
`NestedField.initialDefault()`) **after** record-level filtering. The read applies the residual as a
record filter (`Parquet.ReadBuilder.filterRecords=true` → `.useRecordFilter(...)`, `.filter(residual)`
in `BaseRowReader`). For a file physically missing `c`, the record filter reads `c` as null, so
`c = 'US'` matches nothing and every record is dropped *before* the default is applied. The same
happens to the `IsNotNull(c)` that Spark infers for any null-intolerant predicate, which is why even
the un-pushable `upper(c) = 'US'` drops rows. Partition columns are immune by contrast: those are
folded out of the per-file residual by `ResidualEvaluator`; `initial-default` columns are not.

Manifest pruning is not the cause (a column with no file metrics returns `ROWS_MIGHT_MATCH`, so the
file is kept). The drop is the reader-side record filter.
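
To make the ordering problem concrete, here is a minimal toy model of the two evaluation orders. It does not use any Iceberg classes: `FilterOrderSketch`, `oldFile`, `filterThenDefault`, and `defaultThenFilter` are hypothetical names chosen for illustration, and a `Map` without a `c` key stands in for the older Parquet file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

// Toy model only: none of these names are Iceberg classes.
class FilterOrderSketch {
  // Stand-in for the initial-default of column c.
  static final String DEFAULT_C = "US";

  // Stand-in for the old data file: id=1 was written before column c existed.
  static List<Map<String, Object>> oldFile() {
    Map<String, Object> row = new HashMap<>();
    row.put("id", 1L);
    row.put("name", "Alice");
    return List.of(row);
  }

  // Order described above: the record filter sees the physical value (null).
  // Filling the default afterwards cannot help, the row is already gone.
  static List<Long> filterThenDefault(Predicate<Object> onC) {
    List<Long> ids = new ArrayList<>();
    for (Map<String, Object> row : oldFile()) {
      Object physicalC = row.get("c"); // null, because the column is absent
      if (onC.test(physicalC)) {
        ids.add((Long) row.get("id"));
      }
    }
    return ids;
  }

  // Order the reader would need: materialize the default, then filter.
  static List<Long> defaultThenFilter(Predicate<Object> onC) {
    List<Long> ids = new ArrayList<>();
    for (Map<String, Object> row : oldFile()) {
      Object c = row.containsKey("c") ? row.get("c") : DEFAULT_C;
      if (onC.test(c)) {
        ids.add((Long) row.get("id"));
      }
    }
    return ids;
  }

  public static void main(String[] args) {
    Predicate<Object> equalsUs = c -> Objects.equals(c, "US");
    Predicate<Object> isNotNull = Objects::nonNull;
    System.out.println(filterThenDefault(equalsUs));  // []
    System.out.println(filterThenDefault(isNotNull)); // []
    System.out.println(defaultThenFilter(equalsUs));  // [1]
    System.out.println(defaultThenFilter(isNotNull)); // [1]
  }
}
```

The empty results from `filterThenDefault` correspond to the missing `id=1` row in the table above; only the order of the two steps differs between the methods.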

#### Environment

- Iceberg: `1.11.0` and `1.10.1` releases, and `main` @ `c00669fde`
- Spark: 3.5, 4.0, 4.1 (all reproduce), Parquet, Hadoop catalog, format-version 3

#### Possible fix

Fold an absent-with-default column out of the per-file residual the way partition constants already
are: substitute the `initialDefault` literal for any field absent from the file's physical schema and
constant-fold (`c='US'`→true, `c='CA'`→false, `IsNotNull(c)`→true, `IsNull(c)`→false). Done in the
format read builders (where the file schema is known), this is engine-agnostic and preserves manifest
pruning for files that contain the column. Happy to put up the PR.
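
As a rough illustration of the folding step, the sketch below evaluates leaf predicates against the default when the column is missing from the file. It is not the proposed patch and does not touch Iceberg's expression classes: `ResidualFoldSketch`, `Pred`, `Op`, and `fold` are hypothetical names, and a real residual would also need `And`/`Or`/`Not` handling that is omitted here.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: Pred, Op and fold are illustrative, not Iceberg APIs.
class ResidualFoldSketch {
  enum Op { EQ, IS_NULL, NOT_NULL, TRUE, FALSE }

  // A single leaf predicate such as c = 'US' or c IS NOT NULL.
  static final class Pred {
    final Op op;
    final String column;
    final Object literal;

    Pred(Op op, String column, Object literal) {
      this.op = op;
      this.column = column;
      this.literal = literal;
    }

    static Pred constant(boolean value) {
      return new Pred(value ? Op.TRUE : Op.FALSE, null, null);
    }
  }

  // Fold a predicate to a constant when its column is absent from the file
  // but has an initial default; otherwise return it unchanged.
  static Pred fold(Pred p, Set<String> fileColumns, Map<String, Object> initialDefaults) {
    if (p.op == Op.TRUE || p.op == Op.FALSE || fileColumns.contains(p.column)) {
      return p; // column is physically present: keep the record filter as is
    }
    if (!initialDefaults.containsKey(p.column)) {
      return p; // no default: the absent column still reads as null
    }
    Object value = initialDefaults.get(p.column);
    switch (p.op) {
      case EQ:
        return Pred.constant(Objects.equals(value, p.literal));
      case IS_NULL:
        return Pred.constant(value == null);
      case NOT_NULL:
        return Pred.constant(value != null);
      default:
        return p;
    }
  }

  public static void main(String[] args) {
    Set<String> oldFileColumns = Set.of("id", "name"); // c is absent
    Map<String, Object> defaults = Map.of("c", "US");
    System.out.println(fold(new Pred(Op.EQ, "c", "US"), oldFileColumns, defaults).op);       // TRUE
    System.out.println(fold(new Pred(Op.EQ, "c", "CA"), oldFileColumns, defaults).op);       // FALSE
    System.out.println(fold(new Pred(Op.NOT_NULL, "c", null), oldFileColumns, defaults).op); // TRUE
    System.out.println(fold(new Pred(Op.IS_NULL, "c", null), oldFileColumns, defaults).op);  // FALSE
    System.out.println(fold(new Pred(Op.EQ, "name", "Bob"), oldFileColumns, defaults).op);   // EQ
  }
}
```

Predicates on columns that exist in the file (`name` in the last call) pass through untouched, so files written after the column was added would keep their normal record filter.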

### Willingness to contribute

- [x] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
