Description
What is the problem the feature request solves?
Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.
Comet does not currently support Spark's to_timestamp function (the ParseToTimestamp expression), causing queries that use it to fall back to Spark's JVM execution instead of running natively on DataFusion.
ParseToTimestamp is a Spark Catalyst expression that converts string, date, timestamp, or numeric values to a timestamp data type. It supports optional format specifications for parsing string inputs and provides timezone-aware conversion capabilities with configurable error handling behavior.
Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.
Describe the potential solution
Spark Specification
Syntax:
```sql
to_timestamp(timestamp_str[, format])
```

```scala
// DataFrame API usage
df.select(to_timestamp($"timestamp_column"))
df.select(to_timestamp($"timestamp_column", "yyyy-MM-dd HH:mm:ss"))
```

Arguments:
| Argument | Type | Description |
|---|---|---|
| left | Expression | The input expression to convert to timestamp |
| format | Option[Expression] | Optional format string for parsing input |
| dataType | DataType | Target timestamp data type |
| timeZoneId | Option[String] | Optional timezone identifier for conversion |
| failOnError | Boolean | Whether to fail on conversion errors (defaults to ANSI mode setting) |
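For reference, the arguments above map onto the Catalyst constructor. A minimal sketch of constructing the expression directly, assuming the five-argument signature shown in the table (internal API; argument order and defaults vary across Spark versions):

```scala
import org.apache.spark.sql.catalyst.expressions.{Literal, ParseToTimestamp}
import org.apache.spark.sql.types.TimestampType

// Normally the analyzer builds this from a to_timestamp(...) call;
// direct construction is shown only to illustrate the argument table.
val expr = ParseToTimestamp(
  left = Literal("12/31/2016 00:00:00"),         // input expression
  format = Some(Literal("MM/dd/yyyy HH:mm:ss")), // optional format string
  dataType = TimestampType,                      // target timestamp type
  timeZoneId = Some("UTC"),                      // optional timezone id
  failOnError = false)                           // follows ANSI mode by default
```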
Return Type: Returns a timestamp data type as specified by the dataType parameter, typically TimestampType or TimestampNTZType.
Supported Data Types:
- StringType with collation support (including trim collation)
- DateType
- TimestampType
- TimestampNTZType
- NumericType (only when target dataType is TimestampType)
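The input types above can be exercised from the DataFrame API. A minimal sketch, assuming a spark-shell session (so spark and the $ interpolator are in scope) and illustrative column names:

```scala
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(
  ("2016-12-31 00:00:00", java.sql.Date.valueOf("2016-12-31"), 1483142400L))
  .toDF("str_col", "date_col", "epoch_col")

df.select(
  to_timestamp($"str_col"),  // StringType input
  to_timestamp($"date_col"), // DateType input
  to_timestamp($"epoch_col") // NumericType input: seconds since epoch
).show(truncate = false)
```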
Edge Cases:
- Null inputs are handled gracefully and typically return null outputs
- Invalid format strings cause runtime errors when failOnError is true
- Behavior for unparseable timestamp strings depends on ANSI mode settings
- Numeric inputs are interpreted as seconds since the epoch when converting to TimestampType
- Timezone conversion edge cases (e.g., DST transitions) are handled according to Java timezone rules
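The null and ANSI-mode cases can be demonstrated directly. A short sketch, again assuming a spark-shell session; note that spark.sql.ansi.enabled defaults to false in Spark 3.x:

```scala
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(Some("2016-12-31 00:00:00"), Some("not-a-timestamp"), None)
  .toDF("ts_str")

// With ANSI mode off, both the unparseable string and the null yield null.
spark.conf.set("spark.sql.ansi.enabled", "false")
df.select(to_timestamp($"ts_str")).show()

// With ANSI mode on, the unparseable row raises a runtime error instead.
spark.conf.set("spark.sql.ansi.enabled", "true")
// df.select(to_timestamp($"ts_str")).show()  // throws on 'not-a-timestamp'
```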
Examples:
```sql
-- Basic timestamp parsing
SELECT to_timestamp('2016-12-31 00:00:00');

-- With custom format
SELECT to_timestamp('12/31/2016 00:00:00', 'MM/dd/yyyy HH:mm:ss');

-- Converting date to timestamp
SELECT to_timestamp(current_date());
```

```scala
// DataFrame API usage
import org.apache.spark.sql.functions._

// Basic conversion
df.select(to_timestamp($"timestamp_str"))

// With format specification
df.select(to_timestamp($"date_str", "yyyy-MM-dd"))

// Converting numeric epoch seconds
df.select(to_timestamp($"epoch_seconds"))
```

Implementation Approach
See the Comet guide on adding new expressions for detailed instructions.
- Scala Serde: Add an expression handler in spark/src/main/scala/org/apache/comet/serde/ (a hedged sketch follows this list)
- Register: Add it to the appropriate map in QueryPlanSerde.scala
- Protobuf: Add a message type in native/proto/src/proto/expr.proto if needed
- Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
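To make the first step concrete, here is a rough, hypothetical sketch of a Scala serde handler. The object layout, the ExprOuterClass.Expr type, and the exprToProtoInternal helper are assumptions inferred from the repository paths above, not verified API; follow the Comet guide and existing handlers for the real pattern. Because ParseToTimestamp is RuntimeReplaceable, one low-cost option is to serialize its replacement expression (GetTimestamp or Cast, see Related below) and reuse Comet's existing support for those:

```scala
// Hypothetical sketch only -- names and signatures here are assumptions,
// not Comet's actual API.
import org.apache.spark.sql.catalyst.expressions.{Attribute, ParseToTimestamp}

object CometParseToTimestamp {
  // ParseToTimestamp is RuntimeReplaceable: its replacement is GetTimestamp
  // (formatted parsing) or Cast (unformatted conversion). Delegating to the
  // replacement avoids defining a new protobuf message at all.
  def convert(
      expr: ParseToTimestamp,
      inputs: Seq[Attribute],
      binding: Boolean): Option[ExprOuterClass.Expr] = {
    // exprToProtoInternal: assumed to be the existing recursive
    // serialization entry point in QueryPlanSerde.scala.
    exprToProtoInternal(expr.replacement, inputs, binding)
  }
}
```

If the replacement-based approach proves insufficient (for example, if native format handling diverges from Spark's), a dedicated protobuf message and Rust implementation (steps 3 and 4) would be needed instead.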
Additional context
Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.ParseToTimestamp
Related:
- GetTimestamp - Underlying expression for formatted parsing
- Cast - Underlying expression for unformatted conversion
- ParseToDate - Similar expression for date parsing
- UnixTimestamp - Converts to a Unix timestamp
This issue was auto-generated from Spark reference documentation.