
Regex SPARQL filter

skomlaebri edited this page Feb 6, 2014 · 13 revisions

Ontop supports the SPARQL regex filter (SPARQL follows the XPath/XQuery regular expression syntax) by translating it into a similar filter expression in SQL. Because regular expression filtering is not part of standard SQL, support for the regex filter depends on the underlying database. (Note that the SQL LIKE filter supports only a limited form of pattern matching, in which % and _ are wildcards.) Most notably, we have not found a proper translation for DB2 and MS SQL Server, so on those databases the regex filter works only for matching single words. For the other databases, regular expression matching is supported, including (an approximation of) case-insensitive matching.
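
To make the idea of a per-database translation concrete, here is a minimal sketch in Python. It is not Ontop's actual code: the function name, the dialect keys, and the parameter names are all hypothetical, and only the SQL operators themselves (~ and ~* in Postgres, REGEXP in MySQL and H2, REGEXP_LIKE in Oracle, LIKE elsewhere) are real. MySQL case handling is left out on purpose.

```python
# Hypothetical sketch: none of these names come from the Ontop code base.
def regex_filter_to_sql(dialect, column, pattern, case_insensitive=False):
    """Build a SQL condition that approximates SPARQL regex(?x, pattern)."""
    escaped = pattern.replace("'", "''")  # double single quotes for SQL literals
    if dialect == "postgres":
        # POSIX operators: ~ is case-sensitive, ~* is case-insensitive
        op = "~*" if case_insensitive else "~"
        return f"{column} {op} '{escaped}'"
    if dialect == "oracle":
        # match parameter: 'i' = case-insensitive, 'c' = case-sensitive
        flag = "i" if case_insensitive else "c"
        return f"REGEXP_LIKE({column}, '{escaped}', '{flag}')"
    if dialect == "h2":
        # Java regex rules: an inline (?i) prefix switches off case sensitivity
        prefix = "(?i)" if case_insensitive else ""
        return f"{column} REGEXP '{prefix}{escaped}'"
    if dialect == "mysql":
        # case handling depends on collation and is not modelled in this sketch
        return f"{column} REGEXP '{escaped}'"
    if dialect in ("db2", "sqlserver"):
        # no regex support: fall back to a substring match with LIKE
        return f"{column} LIKE '%{escaped}%'"
    raise ValueError(f"unknown dialect: {dialect}")


print(regex_filter_to_sql("postgres", "name", "^A", case_insensitive=True))
print(regex_filter_to_sql("db2", "name", "Smith"))
```

A real translator would bind the pattern as a query parameter rather than splicing it into the SQL string; the string form is used here only so the generated condition is easy to read.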

  • MySQL: Case-insensitive matching is probably more efficient than the SPARQL default of case-sensitive matching. Multiline mode is not supported.

  • H2: Follows Java language regular expression rules. Putting (?i) at the start of the regex makes it case insensitive, (?m) is the equivalent of Pattern.MULTILINE, (?s) equals Pattern.DOTALL.

  • Postgres: Uses the POSIX regular expression operators. Multiline mode is supported through ARE embedded-option letters, as in Tcl (w for multiline, p for dotall); see the Postgres manual for details.

  • Oracle: Multiline (flag m), dotall (flag n), and case-sensitive or case-insensitive (flags c and i) search are supported through REGEXP_LIKE.

  • DB2 and SQL Server: The database engine has no built-in support for regular expression filtering. The regex SPARQL filter is translated to LIKE, so the filter only works properly for single words. We add the wildcard % at the beginning and at the end of the pattern so that the filter finds strings that contain the pattern, not only strings that equal it.
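
The limits of the LIKE fallback can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for DB2 or SQL Server (an assumption made so the example runs anywhere); the table, the data, and the helper names regex_to_like and matching_names are hypothetical and are not taken from Ontop.

```python
import sqlite3


def regex_to_like(pattern):
    """Wrap the pattern in % wildcards so LIKE performs a 'contains' match."""
    return "%" + pattern + "%"


def matching_names(pattern):
    """Run the LIKE fallback against a tiny in-memory table (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?)",
        [("Alice Smith",), ("Bob Jones",)],
    )
    rows = conn.execute(
        "SELECT name FROM person WHERE name LIKE ? ORDER BY name",
        (regex_to_like(pattern),),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(matching_names("Smith"))   # a single word is found as a substring
print(matching_names("Sm.th"))   # '.' is a literal dot for LIKE, so no match
print(matching_names("^Alice"))  # '^' is a literal character too, so no match
```

The first call returns the row containing "Smith", while the two patterns that use regex metacharacters return nothing, because LIKE treats every character other than % and _ literally.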
