Implement native function of overlay(70) #1539

@slfan1989

Title

Implement Native Vectorized overlay Function for Replacing Substrings in Strings

Abstract

Introduce a native, vectorized overlay function that replaces a specified portion of a string with a given substring, following Apache Spark SQL semantics. The function should support the common string types and is intended to improve performance on large datasets.
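To make the intended behavior concrete, here is a minimal scalar sketch of the replacement rule in Rust. It is not the vectorized kernel this issue proposes. The helper name `overlay` and its signature are hypothetical, and the sketch assumes a 1-based `pos` of at least 1, lengths counted in characters rather than bytes, and a default `len` equal to the character length of `replace`, which is my understanding of Spark's behavior. Zero or negative positions and NULL handling are deliberately left out.

```rust
/// Hypothetical scalar helper illustrating overlay semantics.
/// Assumes `pos` is 1-based and >= 1; `len = None` means
/// "use the character length of `replace`".
fn overlay(input: &str, replace: &str, pos: usize, len: Option<usize>) -> String {
    // Work on characters so multi-byte UTF-8 input is not split mid-character.
    let chars: Vec<char> = input.chars().collect();
    let len = len.unwrap_or_else(|| replace.chars().count());

    // Convert the 1-based position to a 0-based index, clamped to the input.
    let start = pos.saturating_sub(1).min(chars.len());
    // End of the replaced span, also clamped to the input.
    let end = start.saturating_add(len).min(chars.len());

    let mut out = String::with_capacity(input.len() + replace.len());
    out.extend(chars[..start].iter()); // prefix before `pos`
    out.push_str(replace);             // the overlaid substring
    out.extend(chars[end..].iter());   // suffix after the replaced span
    out
}
```

Under these assumptions, `overlay("Spark SQL", "_", 6, None)` yields `Spark_SQL`, and `overlay("Spark SQL", "ANSI ", 7, Some(0))` yields `Spark ANSI SQL`, because a length of 0 inserts without removing anything. A native implementation would apply the same rule per row over Arrow string arrays instead of allocating a `Vec<char>` for each value.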

Background and Motivation

overlay is a commonly used Spark SQL string function that replaces the portion of a string starting at a specified position. It appears frequently in text processing and data cleaning workloads. Implementing it natively and in vectorized form on top of DataFusion should improve performance and reduce resource consumption for Spark queries that use it.
