@andygrove commented Jan 23, 2026

Documentation Updates

User Guide

  • Iceberg Guide (docs/source/user-guide/latest/iceberg.md):

    • Added "Supported features" section documenting what the native Iceberg reader supports:
      • Table spec versions (v1, v2)
      • Schema and data types (primitives, UUID, complex types, schema evolution)
      • Time travel (VERSION AS OF, branch reads)
      • MOR delete handling (positional, equality, mixed)
      • Filter pushdown operations
      • Partitioning with various transforms
      • Storage backends (local, HDFS, S3)
    • Added REST catalog configuration example
    • Improved "Current limitations" section with clearer explanations
  • Data Sources (docs/source/user-guide/latest/datasources.md):

    • Updated CSV section to document experimental native CSV scan support (spark.comet.scan.csv.v2.enabled)
  • Expressions (docs/source/user-guide/latest/expressions.md):

    • Added missing expressions to the supported expressions list

Contributor Guide

  • Parquet Scans (docs/source/contributor-guide/parquet_scans.md):

    • Updated fallback behavior description (auto mode falls back to Spark, not native_comet)
    • Fixed incomplete sentence in S3 Support section
  • Roadmap (docs/source/contributor-guide/roadmap.md):

    • Updated Iceberg Integration section to reflect that auto mode now uses native_iceberg_compat
    • Updated native_comet removal section to indicate the milestone has been achieved
    • Fixed typo ("originally" → "original")

andygrove and others added 2 commits January 23, 2026 10:12
Add documentation for expressions that were implemented but not
documented in the supported expressions list:

- Left (string function)
- DateDiff, DateFormat, LastDay, UnixDate, UnixTimestamp (date/time)
- Sha1 (hashing)
- JsonToStructs (struct)

Also fixes TruncTimestamp SQL from `trunc_date` to `date_trunc`.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
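
For readers unfamiliar with how these expression names map to SQL, here is a minimal sketch. The SQL function names are Spark's built-in names as I understand them, not text copied from this PR's diff, and `sample_query` is a hypothetical helper used only for illustration.

```python
# Spark expression class -> SQL function name (illustrative mapping,
# based on Spark's built-in function names, not on the PR diff).
EXPRESSION_SQL_NAMES = {
    "Left": "left",
    "DateDiff": "datediff",
    "DateFormat": "date_format",
    "LastDay": "last_day",
    "UnixDate": "unix_date",
    "UnixTimestamp": "unix_timestamp",
    "Sha1": "sha1",
    "JsonToStructs": "from_json",
    # The commit corrects the documented SQL name from `trunc_date`.
    "TruncTimestamp": "date_trunc",
}


def sample_query(expression: str, args: str) -> str:
    """Render a one-line SELECT that calls the SQL function for `expression`."""
    return f"SELECT {EXPRESSION_SQL_NAMES[expression]}({args})"


if __name__ == "__main__":
    print(sample_query("TruncTimestamp", "'HOUR', TIMESTAMP '2026-01-23 18:06:00'"))
```

The strings can be passed to `spark.sql(...)` in a session with Comet enabled to check whether a given expression runs natively.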
@andygrove marked this pull request as draft January 23, 2026 18:06
@andygrove changed the title from "docs: add missing expressions to user guide" to "docs: various documentation updates in preparation for next release" Jan 23, 2026
andygrove and others added 6 commits January 23, 2026 11:06
…to native_comet

Updates contributor guide documentation following the change in c9af2c6:
- parquet_scans.md: clarify that auto mode falls back to Spark's native scan, not native_comet
- roadmap.md: reflect that the switch to native_iceberg_compat for auto mode is complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds documentation showing how to configure Comet's native Iceberg scan
with a REST catalog, including example Spark configuration and sample
queries demonstrating namespace creation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
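
A rough sketch of what such a configuration might look like, not the example added in the PR. The catalog name `rest_cat`, the URI, and the namespace `db` are placeholders. The `spark.sql.catalog.*` keys follow Iceberg's Spark catalog conventions; `spark.comet.scan.icebergNative.enabled` is my assumption for the switch that enables the native Iceberg scan and should be checked against the Iceberg guide.

```python
def rest_catalog_conf(catalog: str, uri: str) -> dict[str, str]:
    """Build Spark settings for an Iceberg REST catalog with Comet enabled."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        "spark.plugins": "org.apache.spark.CometPlugin",
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
        "spark.comet.enabled": "true",
        "spark.comet.exec.enabled": "true",
        # Assumed key for the native Iceberg scan; verify in the Iceberg guide.
        "spark.comet.scan.icebergNative.enabled": "true",
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
    }


def to_cli_flags(conf: dict[str, str]) -> list[str]:
    """Render settings as `--conf key=value` flags for spark-shell or spark-submit."""
    return [f"--conf {key}={value}" for key, value in conf.items()]


# Placeholder catalog name and endpoint, for illustration only.
CONF = rest_catalog_conf("rest_cat", "http://localhost:8181")

# Sample statements of the kind the commit describes (namespace creation).
SAMPLE_QUERIES = [
    "CREATE NAMESPACE IF NOT EXISTS rest_cat.db",
    "SELECT * FROM rest_cat.db.events LIMIT 10",  # hypothetical table
]

if __name__ == "__main__":
    print(" \\\n  ".join(to_cli_flags(CONF)))
```

Keeping the settings in a dictionary makes it easy to reuse them either as command-line flags or through `SparkSession.builder.config(key, value)`.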
Adds comprehensive documentation of native Iceberg reader capabilities:
- Table spec versions (v1, v2 supported; v3 falls back)
- Schema and data types including complex types and schema evolution
- Time travel and branch reads
- MOR table delete handling (positional and equality deletes)
- Filter pushdown operations
- Partitioning with various transform types
- Storage backends (local, HDFS, S3)

Also improves the limitations section with clearer explanations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
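
As an illustration of the time travel and branch reads mentioned above, the sketch below builds the two query shapes using Iceberg's `VERSION AS OF` syntax for Spark SQL. The table name, snapshot ID, and branch name are made up, and the helper functions are hypothetical.

```python
def snapshot_read(table: str, snapshot_id: int) -> str:
    """Query a table as of a specific snapshot ID."""
    return f"SELECT * FROM {table} VERSION AS OF {snapshot_id}"


def branch_read(table: str, branch: str) -> str:
    """Query the head of a named branch; the branch name is a quoted string."""
    return f"SELECT * FROM {table} VERSION AS OF '{branch}'"


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(snapshot_read("rest_cat.db.events", 1234567890))
    print(branch_read("rest_cat.db.events", "audit"))
```

Whether a given query is actually served by the native reader can be checked in the physical plan, for example with `EXPLAIN`.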
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
datasources.md previously stated that Comet does not provide a native
CSV scan, but experimental native CSV support was added in commit f538424.
Updated the page to document the new spark.comet.scan.csv.v2.enabled option.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
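
A minimal sketch of toggling the option named in this commit. Only `spark.comet.scan.csv.v2.enabled` comes from the PR; the surrounding Comet settings are the usual plugin switches, and whether further Spark settings are needed for the experimental CSV scan is not stated here. `csv_scan_conf` is a hypothetical helper.

```python
CSV_V2_FLAG = "spark.comet.scan.csv.v2.enabled"  # option named in the commit


def csv_scan_conf(enable_native_csv: bool) -> dict[str, str]:
    """Build Spark settings with the experimental native CSV scan on or off."""
    return {
        "spark.plugins": "org.apache.spark.CometPlugin",
        "spark.comet.enabled": "true",
        # Spark configuration values are strings, so render the bool as text.
        CSV_V2_FLAG: str(enable_native_csv).lower(),
    }


if __name__ == "__main__":
    for key, value in csv_scan_conf(True).items():
        print(f"--conf {key}={value}")
```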
- parquet_scans.md: complete the truncated sentence in S3 Support section
- roadmap.md: update native_comet removal section to reflect that auto
  mode now uses native_iceberg_compat (milestone achieved)
- roadmap.md: fix typo "originally" -> "original"

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@andygrove marked this pull request as ready for review January 23, 2026 18:31
@mbutrovich self-requested a review January 23, 2026 19:20

@mbutrovich left a comment

LGTM, thanks @andygrove!

@mbutrovich merged commit 64ebfcc into apache:main Jan 23, 2026
3 checks passed
@andygrove deleted the docs-update-expressions-list branch January 23, 2026 19:43