We need a way to optimize case-insensitive regex filters such as FILTER regex(?var, "astr", "i"), which should match Astronaut.
This could be solved by always doing a case-insensitive sort, but I think that might be against the spec (e.g. this SO answer suggests the sort should be case-sensitive). Then again, looking at the spec and the linked XPath function, it seems to me that we are free to use any default collation strategy, which would allow us to do a case-insensitive sort.
This is now supported by building a case-insensitive index with #209. It remains an open issue to be able to use both case-sensitive and case-insensitive prefix search with the same index.
I have been giving this some thought:
What you are missing is easy to implement, because the matches of the case-sensitive prefix filter are a subset of the matches of the case-insensitive prefix filter (provided that values which are equal when ignoring case are sorted according to their casing; every useful collation strategy should ensure this).
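To illustrate the subset argument, here is a minimal sketch and not the actual implementation. It assumes an ASCII-only vocabulary and uses hypothetical helper names (foldCase, caseInsensitiveLess, prefixRangeIgnoreCase, prefixMatchesCaseSensitive). The vocabulary is sorted by case-folded value with the original bytes as a tie-breaker, the case-insensitive prefix range is found by binary search, and the case-sensitive matches are then filtered out of that range only.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// ASCII-only case folding (an assumption of this sketch; a real
// implementation would need a Unicode-aware collation).
std::string foldCase(const std::string& s) {
  std::string out = s;
  std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return out;
}

// Sort order: first by the case-folded value, ties broken by the original
// bytes, so that words differing only in case end up next to each other.
bool caseInsensitiveLess(const std::string& a, const std::string& b) {
  const std::string fa = foldCase(a);
  const std::string fb = foldCase(b);
  if (fa != fb) return fa < fb;
  return a < b;
}

// Returns the half-open index range [first, second) of all words in
// `sortedVocab` that start with `prefix` when case is ignored.
// `sortedVocab` must be sorted with caseInsensitiveLess.
std::pair<std::size_t, std::size_t> prefixRangeIgnoreCase(
    const std::vector<std::string>& sortedVocab, const std::string& prefix) {
  const std::string p = foldCase(prefix);
  auto lower = std::partition_point(
      sortedVocab.begin(), sortedVocab.end(), [&p](const std::string& w) {
        return foldCase(w).compare(0, p.size(), p) < 0;
      });
  auto upper = std::partition_point(
      lower, sortedVocab.end(), [&p](const std::string& w) {
        return foldCase(w).compare(0, p.size(), p) == 0;
      });
  return {static_cast<std::size_t>(lower - sortedVocab.begin()),
          static_cast<std::size_t>(upper - sortedVocab.begin())};
}

// Case-sensitive prefix matches only need to be searched for inside the
// case-insensitive range, because they are a subset of it.
std::vector<std::string> prefixMatchesCaseSensitive(
    const std::vector<std::string>& sortedVocab, const std::string& prefix) {
  const auto range = prefixRangeIgnoreCase(sortedVocab, prefix);
  std::vector<std::string> result;
  for (std::size_t i = range.first; i < range.second; ++i) {
    if (sortedVocab[i].compare(0, prefix.size(), prefix) == 0) {
      result.push_back(sortedVocab[i]);
    }
  }
  return result;
}
```

With a vocabulary sorted by caseInsensitiveLess, prefixRangeIgnoreCase(vocab, "astr") covers Astronaut as well as astronomy, while prefixMatchesCaseSensitive(vocab, "Astr") returns only the capitalized entries from that same range. The linear scan inside the range keeps the sketch short; a refined tie-breaking order could narrow it further.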
However, I would also like to support diacritic-agnostic filtering, and in general I don't like my current solution (it is really hacky, with uppercasing and lowercasing).
What I would really like to do is introduce boost_locale as a dependency which correctly supports all the features we would need to do all this collation stuff properly.
IMHO this dependency is justified since
a) Properly handling international text (including German umlauts etc.) should be a priority.
b) Some of the things we need (e.g. finding the range of all strings that have the same prefix) cannot be done by std::locale in a way that is correct and portable.
(Some background: Unicode supports exactly what we want, namely a collation that sorts first by the character 'value' (e), then by added accents (ée), and then by case (EeÉé). std::locale does not support performing only the first one or two of those comparison steps, since it provides a generic interface that also has to support non-Unicode locales.)
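The three comparison steps can be illustrated with a toy sketch. This is neither the Unicode Collation Algorithm nor the Boost.Locale API. The names Level, Weights, weightsOf and compareAtLevel are hypothetical, and the weight table is an assumption that covers only ASCII letters plus é and É. The point is only to show what stopping after the first or second step means.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// How many comparison steps to perform (names are hypothetical).
enum class Level { Primary = 1, Secondary = 2, Tertiary = 3 };

// Toy weights of one code point: base letter, accent flag, case rank.
struct Weights {
  char32_t base;
  int accent;    // 0 = no accent, 1 = acute accent
  int caseRank;  // 0 = uppercase, 1 = lowercase (matches the order EeÉé)
};

// Hand-written table for ASCII letters plus é/É only (an assumption of this
// sketch); every other code point is treated as its own base letter.
Weights weightsOf(char32_t c) {
  if (c >= U'a' && c <= U'z') return {c, 0, 1};
  if (c >= U'A' && c <= U'Z') {
    return {static_cast<char32_t>(c - U'A' + U'a'), 0, 0};
  }
  if (c == U'\u00E9') return {U'e', 1, 1};  // é
  if (c == U'\u00C9') return {U'e', 1, 0};  // É
  return {c, 0, 1};
}

// Compares a and b up to the given level; returns -1, 0 or 1.
// Each level is compared over the whole string before the next is consulted.
int compareAtLevel(const std::u32string& a, const std::u32string& b,
                   Level level) {
  auto compareBy = [&a, &b](auto key) -> int {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
      const auto ka = key(weightsOf(a[i]));
      const auto kb = key(weightsOf(b[i]));
      if (ka != kb) return ka < kb ? -1 : 1;
    }
    if (a.size() != b.size()) return a.size() < b.size() ? -1 : 1;
    return 0;
  };
  int r = compareBy([](const Weights& w) { return w.base; });
  if (r != 0 || level == Level::Primary) return r;
  r = compareBy([](const Weights& w) { return w.accent; });
  if (r != 0 || level == Level::Secondary) return r;
  return compareBy([](const Weights& w) { return w.caseRank; });
}
```

In this sketch, Café and cafe compare equal at Level::Primary (case- and diacritic-agnostic). They differ at Level::Secondary, where Cafe and cafe are still equal, and only Level::Tertiary tells Cafe and cafe apart.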