Adding Postgres JSON binary operators for nested data structures #18210

### Is your feature request related to a problem or challenge?

Hello community,

I have been thinking about adding [Postgres-style JSON operators](https://www.postgresql.org/docs/18/functions-json.html#FUNCTIONS-JSON) for nested data structures, mostly for `Struct` and `List`. These operators include:

- `->`, `->>`, `#>>`: Access data field by index/key
- `@>`, `<@`, `?`, `?|`, `?&`: Containment testing
- `||`, `-`, `#-`: Data structure manipulation
- `@?`, `@@`: Predicate testing
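
To make the intended semantics concrete, here is a minimal std-only sketch of `->` and `@>` over a toy nested value. The `Value` and `Key` types and the `arrow_get` and `contains` functions are hypothetical names for illustration, not arrow-rs or DataFusion APIs, and the containment rule is simplified (it skips the Postgres exception that lets an array contain a bare primitive).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for nested values; not an arrow-rs or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int(i64),
    Str(String),
    List(Vec<Value>),
    Struct(BTreeMap<String, Value>),
}

/// Right-hand side of `->`: a field name or a list position.
pub enum Key<'a> {
    Field(&'a str),
    Index(i64),
}

/// `lhs -> key`: field access on structs, positional access on lists.
/// Negative indexes count from the end, as in Postgres.
/// A missing field, an out-of-range index, or a mismatched key kind yields Null.
pub fn arrow_get(lhs: &Value, key: &Key) -> Value {
    match (lhs, key) {
        (Value::Struct(fields), Key::Field(name)) => {
            fields.get(*name).cloned().unwrap_or(Value::Null)
        }
        (Value::List(items), Key::Index(i)) => {
            let len = items.len() as i64;
            let pos = if *i < 0 { len + *i } else { *i };
            if pos >= 0 && pos < len {
                items[pos as usize].clone()
            } else {
                Value::Null
            }
        }
        _ => Value::Null,
    }
}

/// `lhs @> rhs` (simplified): every field or element of `rhs` must be
/// contained somewhere in the matching part of `lhs`.
pub fn contains(lhs: &Value, rhs: &Value) -> bool {
    match (lhs, rhs) {
        (Value::Struct(l), Value::Struct(r)) => r
            .iter()
            .all(|(k, rv)| l.get(k).map_or(false, |lv| contains(lv, rv))),
        (Value::List(l), Value::List(r)) => {
            r.iter().all(|rv| l.iter().any(|lv| contains(lv, rv)))
        }
        // Scalars (and mismatched shapes) fall back to plain equality.
        _ => lhs == rhs,
    }
}
```

Under this model, `{"a": [1, 2]} -> 'a'` gives the list `[1, 2]`, and `{"a": [1, 2], "b": "x"} @> {"a": [2]}` is true. The real kernels would operate on whole Arrow arrays rather than single values.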

### Describe the solution you'd like

I want to make sure I'm heading in the right direction.

1. I assume DataFusion won't have a built-in `json` type, so these operators will be implemented directly on `Struct`, `List`, and other JSON-like types, following Postgres semantics. I noticed that VARIANT is coming to arrow/DataFusion; will there be a new `DataType` for `VARIANT`? If so, it would be a good option for the input and return types of these operators.
2. At the moment, we don't support operators between a nested data structure and a primitive. If the left input is nested, we assume the right array is nested too and apply comparison operators recursively: https://github.com/apache/datafusion/blob/531af8e43ae3563da2a3c5ef35b2241d3ce2d621/datafusion/physical-expr/src/expressions/binary.rs#L254-L259 I will need to change this behavior.
3. Some of the operators may produce dynamically typed results if the right input is an array. For example, if the right array of `->` is `["a", "b", "c"]`, the result set is expected to contain 3 different data types, which breaks our type system. So for the `->` family of operators, I'm going to support the scalar version only.
4. I also expect to need a less strict check at https://github.com/apache/datafusion/blob/531af8e43ae3563da2a3c5ef35b2241d3ce2d621/datafusion/physical-expr/src/expressions/binary.rs#L285 because the `->` family of operators produces dynamic return types.
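
The typing problem in item 3 can be illustrated with a small sketch. It assumes a simplified `Ty` enum standing in for Arrow's `DataType`; `field_type` and `column_key_type` are hypothetical helpers, not existing DataFusion functions. With a scalar key the return type is known at planning time; with a column of keys, a single output type exists only if every key resolves to the same type.

```rust
/// Hypothetical, simplified data types standing in for Arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Int64,
    Utf8,
    Boolean,
    Struct(Vec<(String, Ty)>),
}

/// Return type of `lhs -> key` when `key` is a scalar known at planning time.
/// Returns None if `lhs` is not a struct or has no such field.
pub fn field_type(lhs: &Ty, key: &str) -> Option<Ty> {
    match lhs {
        Ty::Struct(fields) => fields
            .iter()
            .find(|(name, _)| name.as_str() == key)
            .map(|(_, ty)| ty.clone()),
        _ => None,
    }
}

/// Return type when the right side is a column of keys: all keys must
/// resolve to one type, otherwise no single output column type exists.
pub fn column_key_type(lhs: &Ty, keys: &[&str]) -> Result<Ty, String> {
    let mut types = Vec::with_capacity(keys.len());
    for key in keys {
        let ty = field_type(lhs, key).ok_or_else(|| format!("no field named {key}"))?;
        types.push(ty);
    }
    let first = types
        .first()
        .cloned()
        .ok_or_else(|| "no keys given".to_string())?;
    match types.iter().find(|ty| **ty != first) {
        Some(other) => Err(format!("mixed result types: {first:?} and {other:?}")),
        None => Ok(first),
    }
}
```

For a struct with fields `a: Int64`, `b: Utf8`, `c: Boolean`, the scalar key `"b"` resolves to `Utf8`, while the key column `["a", "b", "c"]` is rejected because the three rows would need three different output types.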

Some of the kernels are going to be implemented in `arrow-rs` first and then integrated into DataFusion.

Let me know if these changes make sense and align with any previous plans. I will then start sending pull requests to both repos.

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

Code referenced in item 2 (`binary.rs#L254-L259`):

    if left_data_type.is_nested() {
        if !left_data_type.equals_datatype(&right_data_type) {
            return internal_err!("Cannot evaluate binary expression because of type mismatch: left {}, right {} ", left_data_type, right_data_type);
        }
        return apply_cmp_for_nested(self.op, &lhs, &rhs);
    }
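
One possible way to relax the check quoted above is sketched here under stated assumptions: `Op`, `Path`, `is_json_op`, and `choose_path` are hypothetical names that model only the dispatch decision, not DataFusion's actual `Operator` enum or `BinaryExpr::evaluate`. The idea is that JSON-style operators skip the type-equality requirement, while comparison operators keep the existing behavior.

```rust
/// Hypothetical operator subset; DataFusion's real `Operator` enum is larger.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Eq,
    Lt,
    Arrow,    // `->`
    AtArrow,  // `@>`
    Question, // `?`
}

/// Which evaluation path a binary expression would take.
#[derive(Debug, PartialEq)]
pub enum Path {
    /// Existing recursive comparison for nested types.
    NestedCompare,
    /// Assumed new path for JSON-style operators.
    JsonKernel,
    /// Left side is not nested: continue with the regular logic.
    Fallthrough,
}

pub fn is_json_op(op: Op) -> bool {
    matches!(op, Op::Arrow | Op::AtArrow | Op::Question)
}

/// Decide the path from the operator, whether the left type is nested,
/// and whether both sides have the same data type.
pub fn choose_path(op: Op, left_nested: bool, types_equal: bool) -> Result<Path, String> {
    if !left_nested {
        return Ok(Path::Fallthrough);
    }
    if is_json_op(op) {
        // e.g. `struct -> 'key'` pairs a nested left side with a scalar
        // right side, so the equality check must not apply here.
        return Ok(Path::JsonKernel);
    }
    if !types_equal {
        return Err("type mismatch between left and right".to_string());
    }
    Ok(Path::NestedCompare)
}
```

With this shape, `=` on two structs of the same type still goes through the nested comparison, `=` on mismatched nested types still errors, and `->` with a nested left side and a string key is routed to the new kernels.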
