Support StringSplitSQL to enable split_part #4561

@andygrove

Description

Background

split_part(str, delimiter, partNum) currently falls back to Spark. The expression is RuntimeReplaceable and lowers to element_at(StringSplitSQL(str, delimiter), partNum). Comet already supports element_at, but it does not support the inner StringSplitSQL expression, so the whole function falls back with the message "stringsplitsql is not supported".

StringSplitSQL differs from StringSplit (the split function) in that it splits on a literal string rather than a regex.
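
To illustrate why the literal/regex distinction matters, here is a minimal Python sketch. It is not Comet or Spark code. The helper names split_literal and split_regex are hypothetical. Returning the unsplit string for an empty delimiter is an assumption based on Spark's documented split_part behavior and should be checked against Spark.

```python
import re


def split_literal(s, delimiter):
    """Split s on delimiter taken as a literal string (hypothetical helper)."""
    if s is None or delimiter is None:
        return None  # null input gives null output
    if delimiter == "":
        # Assumption: an empty delimiter leaves the string unsplit.
        return [s]
    # str.split with an explicit separator keeps empty and trailing parts.
    return s.split(delimiter)


def split_regex(s, pattern):
    """Split s on a regular expression, as a regex-based split would."""
    return re.split(pattern, s)


# "." is a plain character for the literal split ...
literal_parts = split_literal("a.b.c", ".")
# ... but matches any character when interpreted as a regex.
regex_parts = split_regex("a.b.c", ".")
```

With the input "a.b.c" and delimiter ".", the literal split yields the three parts a, b, c, while the regex split matches every character and yields only empty strings. This is why a serde for StringSplit cannot simply be reused for StringSplitSQL.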

Proposal

Add a serde for StringSplitSQL (a native string split on a literal delimiter). This would enable split_part to run natively, since element_at over the resulting array is already supported.
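
The composition described above can be sketched in Python as a reference model for test expectations. This is a sketch under stated assumptions, not the Spark or Comet implementation. All function names are hypothetical. The handling of partNum (1-based, negative counts from the end, 0 is an error, out of range gives an empty string) follows my reading of Spark's split_part documentation and should be verified against Spark.

```python
def split_literal(s, delimiter):
    """Literal-delimiter split standing in for StringSplitSQL (hypothetical)."""
    if delimiter == "":
        return [s]  # assumption: empty delimiter leaves the string unsplit
    return s.split(delimiter)


def element_at_or_empty(parts, index):
    """1-based lookup standing in for element_at as used by split_part."""
    if index == 0:
        # Assumption: Spark raises an error for index 0.
        raise ValueError("index 0 is invalid; indexes are 1-based")
    if abs(index) > len(parts):
        return ""  # assumption: out-of-range part yields an empty string
    # Positive indexes count from the start, negative ones from the end.
    return parts[index - 1] if index > 0 else parts[index]


def split_part(s, delimiter, part_num):
    """Model of split_part as element_at over a literal split."""
    if s is None or delimiter is None or part_num is None:
        return None  # any null input gives null
    return element_at_or_empty(split_literal(s, delimiter), part_num)
```

For example, split_part("11.12.13", ".", 3) and split_part("11.12.13", ".", -1) both select the last part, and split_part("11.12.13", ".", 5) returns an empty string under the assumptions above.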

Acceptance criteria

  • StringSplitSQL executes natively and matches Spark.
  • split_part no longer falls back; add SQL file test coverage.
