String matching using wildcards #85

Open
afs opened this issue May 3, 2019 · 9 comments
Labels
function More custom function improvements or inbuilds

Comments

@afs
Collaborator

afs commented May 3, 2019

Regular expressions can be complex. Strings with wildcards are simpler.

Proposed solution

Provide string matching using wildcards as an additional alternative to regular
expressions by adding a new function. The pattern is anchored: it must match the whole string.

Examples:

MATCH(?string, "abc*")

MATCH(?string, "*abc*", "i") # Case insensitive.

MATCH(?string, "a?c")
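
To make the intended semantics concrete, here is a minimal sketch in Python of how such an anchored wildcard match could behave. It assumes `*` matches any sequence of characters (including none), `?` matches exactly one character, and an `"i"` flag requests case-insensitive matching. The function names `wildcard_to_regex` and `wildcard_match` are hypothetical and only illustrate the proposal; they are not part of any SPARQL implementation.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a wildcard pattern into a regular expression body.

    Assumption for this sketch: '*' means any sequence of characters,
    '?' means exactly one character, everything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Escape so regex metacharacters such as '.' stay literal.
            parts.append(re.escape(ch))
    return "".join(parts)


def wildcard_match(string: str, pattern: str, flags: str = "") -> bool:
    """Anchored match: the pattern must cover the whole string."""
    re_flags = re.DOTALL  # let wildcards also match newlines
    if "i" in flags:
        re_flags |= re.IGNORECASE
    return re.fullmatch(wildcard_to_regex(pattern), string, re_flags) is not None
```

Because the match is anchored, `wildcard_match("xabc", "abc*")` is false, while a leading `*` in the pattern would allow the prefix.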

Previous work

Glob patterns
SQL LIKE
Lucene wildcard searches

Considerations for backward compatibility

None.

@afs afs added enhancement New feature or request function More custom function improvements or inbuilds query Extends the Query spec and removed enhancement New feature or request query Extends the Query spec labels May 3, 2019
@lisp
Contributor

lisp commented May 5, 2019

why?

@ktk

ktk commented May 8, 2019

@lisp that is a very common scenario in the real world, and right now I have to look up how to do it with regex every time. I also teach SPARQL on a regular basis, and this would definitely make simple string matches easier for users.

@cygri

cygri commented May 8, 2019

If this were added, it should be a different name. When explaining SPARQL, one constantly has to talk about matching—and it usually means matching graph patterns against triples. Having a function called “match” that uses the word in a different sense does not help.

Some possible other names:

FILTER wildcard(?title, "*sparql*", "i")
FILTER like(?title, "*sparql*", "i")

This could also be combined with #34:

?doc :title ~"*sparql*"i.

@dbooth-boston
Collaborator

dbooth-boston commented May 8, 2019

It is called glob in several other languages.

@lisp
Contributor

lisp commented May 9, 2019

if the goal is succinctness, it makes sense to go all the way to something like

?doc :title ~"*sparql*"i.

but

  • how would this sort of syntax permit bindings?
  • how would something similar apply to other structured value domains, such as those for temporal values?

@VladimirAlexiev
Contributor

ShEx has a similar construct called Stem.
It works for strings, IRIs (also prefixed) and lang tags.

@afs
Collaborator Author

afs commented Dec 10, 2022

ShEx Stem is fn:starts-with / STRSTARTS.

@afs
Collaborator Author

afs commented Dec 10, 2022

I agree "match" is already used for graph patterns. It is also valuable as a keyword.

LIKE is good depending on the SQL implications (SQL uses % and _ for what is commonly * and ? in shells and filename matching); in some SQL dialects, LIKE also has character classes and negated character classes.

Filename matching uses glob matching, where * matches any characters except the component separator, and some systems (e.g. git) add ** to mean "filename, any depth".

Possibilities:

  • STRMATCH
  • LIKE
  • GLOB
  • WILDCARD
  • . . .

The other choice is which matching language to use.

SQL LIKE can be rewritten to a regular expression, and there are code examples for that online.
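
As an illustration of that rewrite, the following Python sketch converts a LIKE pattern to a regular expression. It assumes only the two standard wildcards (`%` for any sequence, `_` for one character) plus an optional escape character in the spirit of SQL's `ESCAPE` clause; dialect-specific character classes are not handled. The names `like_to_regex` and `sql_like` are hypothetical.

```python
import re
from typing import Optional


def like_to_regex(pattern: str, escape: Optional[str] = None) -> str:
    """Rewrite a SQL LIKE pattern as a regular expression body.

    Assumption for this sketch: '%' is any sequence, '_' is one character,
    and a character following `escape` (if given) is taken literally.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            # Escaped character: emit it as a literal and skip both.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def sql_like(string: str, pattern: str, escape: Optional[str] = None) -> bool:
    """LIKE matches the whole string, so use an anchored full match."""
    regex = like_to_regex(pattern, escape)
    return re.fullmatch(regex, string, re.DOTALL) is not None
```

The same structure works for `*` and `?`; only the two wildcard characters change.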

In Java, there are a few open source direct implementations with * and ?, but not [ ] character classes. (The JDK supports glob on filenames, not directly on strings.)
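
For comparison, a direct matcher for `*` and `?` needs no regex translation at all. The sketch below is written in Python rather than Java for brevity and is not taken from any of those libraries; the function name `glob_match` is hypothetical. It uses the common two-pointer approach: remember the most recent `*` and, on a mismatch, let that `*` absorb one more character.

```python
def glob_match(string: str, pattern: str) -> bool:
    """Anchored wildcard match supporting only '*' and '?'.

    No character classes and no escaping: every other character is literal.
    """
    s = 0       # current position in string
    p = 0       # current position in pattern
    star = -1   # index of the most recent '*' in pattern, -1 if none yet
    mark = 0    # string position where that '*' began matching

    while s < len(string):
        if p < len(pattern) and pattern[p] == "*":
            # Record the star; first try letting it match nothing.
            star = p
            mark = s
            p += 1
        elif p < len(pattern) and (pattern[p] == "?" or pattern[p] == string[s]):
            s += 1
            p += 1
        elif star != -1:
            # Mismatch: backtrack so the last '*' absorbs one more character.
            p = star + 1
            mark += 1
            s = mark
        else:
            return False

    # Any trailing '*' in the pattern can match the empty string.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)
```

Checking for `*` before comparing literals matters: otherwise a `*` in the pattern would be consumed as a literal whenever the string happens to contain a `*` at the same position.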

@ktk

ktk commented Dec 11, 2022

I think being close to SQL is not a bad idea, so I like the LIKE idea. If I did not know about that, I would go for STRMATCH as it resembles other functions in SPARQL, but then again it might add to the confusion. LIKE is unique in that sense.

And while I like the GLOB idea I have to agree that I mainly know it for file matching.

6 participants