Problem
Iceberg is the dominant open table format adjacent to every existing Spark/Databricks target in daco, but there is no translator for it. Iceberg uses its own JSON schema serialization with explicit, mandatory field IDs (monotonic, deterministic) and v3-only types (variant, geometry, geography, timestamp_ns, timestamptz_ns, unknown) — none of which are emitted by the existing databrickssql / sparksql / databrickspyspark translators.
A user authoring an OpenDPI port today cannot generate an Iceberg schema; they have to translate to Spark SQL DDL and lose Iceberg-specific information (field IDs, v3 types).
Proposed change
New package internal/translate/iceberg/ following the avro pattern (resolver + JSON marshal in Translate, no text/template — Iceberg schemas are structured JSON):
translator.go — implements translate.Translator. FileExtension returns .json. Translate calls translate.Prepare(...) then marshals to the Iceberg schema JSON shape, assigning field IDs sequentially in property order (Prepare already preserves that order, so output is deterministic across runs).
resolver.go — implements translate.TypeResolver:
PrimitiveType: string→string, integer→long (narrowed in EnrichField via Constraints.Minimum/Maximum to int where it fits), number→double (or decimal(P,S) when Constraints.MultipleOf is a decimal fraction), boolean→boolean, format:date→date, format:date-time→timestamptz, format:time→time, format:uuid→uuid.
ArrayType(elem) → list<elem> (marker form, materialized in Translate).
MapType(k,v) → map<k,v> (marker form).
RefType/FormatDefName → PascalCase(defName), must agree (per .claude/rules/translators.md).
EnrichField: integer narrowing (lift inferIntegerType from databrickspyspark/resolver.go into shared internal/translate so it isn't duplicated); decimal precision/scale from MultipleOf (lift computeDecimalScale/computeDecimalPrecision similarly).
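The primitive mapping above can be sketched as a standalone helper. This is a hypothetical illustration only: `icebergPrimitive` and its signature are invented for this sketch; the real resolver implements translate.TypeResolver and also consults Constraints in EnrichField.

```go
package main

import "fmt"

// icebergPrimitive is a hypothetical helper showing the intended
// JSON Schema → Iceberg v2 primitive mapping. Format only applies
// to string-typed properties; everything else maps by type alone.
func icebergPrimitive(jsonType, format string) string {
	if jsonType == "string" {
		switch format {
		case "date":
			return "date"
		case "date-time":
			return "timestamptz"
		case "time":
			return "time"
		case "uuid":
			return "uuid"
		}
		return "string"
	}
	switch jsonType {
	case "integer":
		return "long" // may be narrowed to "int" later in EnrichField
	case "number":
		return "double" // or decimal(P,S) when multipleOf is a decimal fraction
	case "boolean":
		return "boolean"
	}
	return ""
}

func main() {
	fmt.Println(icebergPrimitive("string", "date-time")) // timestamptz
	fmt.Println(icebergPrimitive("integer", ""))         // long
}
```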
- Field IDs assigned in Translate via a counter threaded through marshal — IDs go in data.Extra if needed but are simplest computed inline at marshal time. Prepare/SchemaData shape unchanged.
- Register in cmd/daco/internal/app.go registerTranslators as iceberg.
V3-only types (variant, geometry, geography, timestamp_ns) are out of scope for the initial PR — JSON Schema doesn't natively express them, so they need a daco-side hint mechanism that should be designed separately. The translator should emit v2-compatible output by default.
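The counter-threaded ID assignment can be sketched as follows. The struct shapes here (`icebergField`, `icebergStruct`, `assignIDs`) are stand-ins invented for this sketch, not the real translate.SchemaData types produced by Prepare; they only show how sequential IDs fall out of deterministic property order.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical shapes mirroring the Iceberg schema JSON serialization;
// the real translator marshals from the data Prepare produces.
type icebergField struct {
	ID       int    `json:"id"`
	Name     string `json:"name"`
	Required bool   `json:"required"`
	Type     any    `json:"type"`
}

type icebergStruct struct {
	Type     string         `json:"type"`
	SchemaID int            `json:"schema-id"`
	Fields   []icebergField `json:"fields"`
}

// assignIDs threads a counter through marshalling so field IDs are
// sequential in property order; since Prepare preserves that order,
// the output is deterministic across runs.
func assignIDs(names, types []string, next *int) []icebergField {
	fields := make([]icebergField, 0, len(names))
	for i, n := range names {
		(*next)++
		fields = append(fields, icebergField{ID: *next, Name: n, Type: types[i]})
	}
	return fields
}

func main() {
	next := 0
	s := icebergStruct{Type: "struct", SchemaID: 0,
		Fields: assignIDs([]string{"name", "age"}, []string{"string", "long"}, &next)}
	out, _ := json.Marshal(s)
	fmt.Println(string(out))
}
```

A nested struct would keep threading the same `next` pointer, which is how a referenced def's field IDs continue the global counter.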
References
- .claude/rules/translators.md in this repo — package layout, EnrichField mutation order, RefType/FormatDefName symmetry rule
Test cases
Following the shape in internal/translate/pyspark/translator_test.go:
- Simple object — sequential field IDs and root naming
Input: {type:object, properties:{name:{type:string}, age:{type:integer}}}
Expected (substring asserts):
{
"type": "struct",
"schema-id": 0,
"fields": [
{"id": 1, "name": "name", "required": false, "type": "string"},
{"id": 2, "name": "age", "required": false, "type": "long"}
]
}
- Required vs optional — required: ["name"] → "required": true for name, false for age.
- Decimal from multipleOf — {type:number, multipleOf:0.01, minimum:0, maximum:99999.99} → "type": "decimal(7, 2)".
- Integer narrowing — {type:integer, minimum:-128, maximum:127} → "type": "int" (Iceberg has no smaller int; narrows from long).
- Date/time/uuid formats — format:date → "type": "date"; format:date-time → "type": "timestamptz"; format:uuid → "type": "uuid".
- Arrays — {type:array, items:{type:string}} → "type": {"type": "list", "element-id": N, "element": "string", "element-required": ...}.
- $ref + $defs — verifies RefType/FormatDefName agreement: a referenced def is emitted as a nested struct with its own field IDs continuing the global counter.
- Inline nested object — auto-extracted by Prepare to a synthetic def named after the field in PascalCase; emitted as a nested struct with continued IDs.
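The decimal and narrowing test expectations can be checked with a sketch of the arithmetic involved. These helpers are hypothetical stand-ins for the computeDecimalScale/computeDecimalPrecision and inferIntegerType logic this proposal lifts into shared internal/translate; names and signatures are assumptions.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// decimalScale counts fractional digits of multipleOf, e.g. 0.01 → 2.
func decimalScale(multipleOf float64) int {
	s := strconv.FormatFloat(multipleOf, 'f', -1, 64)
	if i := strings.IndexByte(s, '.'); i >= 0 {
		return len(s) - i - 1
	}
	return 0
}

// decimalPrecision is the digit count left of the point in max,
// plus the scale: max 99999.99 with scale 2 → 5 + 2 = 7.
func decimalPrecision(max float64, scale int) int {
	intDigits := 1
	if abs := math.Abs(max); abs >= 1 {
		intDigits = int(math.Floor(math.Log10(abs))) + 1
	}
	return intDigits + scale
}

// narrowInteger picks "int" when the bounds fit in 32 bits, else "long".
func narrowInteger(min, max int64) string {
	if min >= math.MinInt32 && max <= math.MaxInt32 {
		return "int"
	}
	return "long"
}

func main() {
	scale := decimalScale(0.01)
	fmt.Printf("decimal(%d, %d)\n", decimalPrecision(99999.99, scale), scale) // decimal(7, 2)
	fmt.Println(narrowInteger(-128, 127))                                     // int
}
```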