KSQL has some basic scalar functions to handle string manipulation like LCASE, SUBSTR, SLICE, SPLIT etc.. but doesn’t currently have an implementation for capturing the power of regular expressions.
- Oracle: REGEXP_SUBSTR, REGEXP_INSTR, and REGEXP_REPLACE
- MySQL: has a number of regular expression functions
- Postgres: has a number of regular expression functions
These UDFs could support the following implementations (heavily borrowed from MySQL):
REGEXP
Return BOOLEAN if the pattern matches
REGEXP(
string - string to match,
pattern - regular expression pattern,
[position - position in the string to start the search,
occurrence - which occurrence to match, default 1st,
match_type - how to perform matching:
c: case sensitive,
i: case insensitive
]
)
REGEXP_SUBSTR
Return the substring matching the regular expression
REGEXP_SUBSTR(
string - string to match,
pattern - regular expression pattern,
[position - the position in the string to start the search (default 0),
occurrence - which occurrence to match (defualt 1st),
match_type - how to perform matching:
c: case sensitive,
i: case insensitive,
]
)
REGEXP_INSTR
Returns starting index of substring matching the index
REGEXP_REPLACE
Replace substrings matching regular expression