Problems with Regex when using SPARQL REPLACE function with certain characters #415

Open
qtips opened this Issue May 26, 2015 · 1 comment

qtips commented May 26, 2015

The following SPARQL query, which replaces %c3%85 with the letter Å, runs as expected:

    select REPLACE("%c3%85-XYZ-%20%28-DEF-%29","%C3%85", "Å", 'i') where {}
    #Result: Å-XYZ-%20%28-DEF-%29

However, when REPLACE calls are nested and the outer REPLACE uses a regex containing the "." metacharacter, the outer replacement "jumps" back one character from where the match is found:

    select REPLACE(
             REPLACE("%c3%85-XYZ-%20%28-DEF-%29","%C3%85", "Å", 'i'),
           "%..(%..)*", "?", 'i')
    where {}
    # Result:   Å-XYZ?8-DEF?9
    # Expected: Å-XYZ-?-DEF-?

This only happens for some replacement characters, including all of Æ, Ø, Å, æ, ø, and å.
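As a cross-check of the expected value, and as one possible explanation for the shift, here is a small Python sketch. It is only a hypothesis, not a confirmed diagnosis of Virtuoso: it assumes the outer REPLACE finds match positions as character offsets but splices the replacement into the UTF-8 byte string, where Å occupies two bytes. The helper `replace_with_char_offsets_on_bytes` is hypothetical and exists only for this illustration.

```python
import re

def replace_with_char_offsets_on_bytes(text, pattern, replacement):
    """Hypothetical faulty REPLACE: match on characters, splice on UTF-8 bytes."""
    data = text.encode("utf-8")
    out = bytearray()
    prev = 0
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        # m.start()/m.end() are character offsets, used here (wrongly) as byte offsets
        out += data[prev:m.start()]
        out += replacement.encode("utf-8")
        prev = m.end()
    out += data[prev:]
    return out.decode("utf-8", errors="replace")

# Inner REPLACE: turn the percent-encoded sequence into the letter Å
inner = re.sub("%C3%85", "Å", "%c3%85-XYZ-%20%28-DEF-%29", flags=re.IGNORECASE)

# Outer REPLACE with ordinary regex semantics
expected = re.sub("%..(%..)*", "?", inner, flags=re.IGNORECASE)

# Outer REPLACE under the assumed character/byte offset mismatch
simulated = replace_with_char_offsets_on_bytes(inner, "%..(%..)*", "?")

print(expected)   # Å-XYZ-?-DEF-?
print(simulated)  # Å-XYZ?8-DEF?9
```

Under this assumption the simulated output matches the reported result exactly, and a pure-ASCII input would show no shift, which would be consistent with the problem appearing only for characters that take more than one byte in UTF-8.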

A workaround is to wrap the inner REPLACE in a CONCAT before passing it to the outer REPLACE, which seems to "reset" the string:

    select REPLACE(
             CONCAT(REPLACE("%c3%85-XYZ-%20%28-DEF-%29","%C3%85", "Å", 'i'),""),
           "%..(%..)*", "?", 'i')
    where {}
    # Result: Å-XYZ-?-DEF-?

This was tested using Virtuoso version 07.20.3212 on Linux (x86_64-unknown-linux-gnu), Single Server Edition, with the Virtuoso SPARQL Query Editor.

jindrichmynarz commented Nov 20, 2016

I stumbled upon a similar issue. When I use a non-ASCII character in the replacement, it ends up broken:

    SELECT (REPLACE(".", "\\.", "á") AS ?s)
    WHERE {}
    # Result: á
    # Expected: á

However, when STR() is applied to the replacement, which should in theory be a no-op, the result is correct:

    SELECT (REPLACE(".", "\\.", STR("á")) AS ?s)
    WHERE {}
    # Result: á
    # Expected: á

Unfortunately, this workaround doesn't work for all non-ASCII characters:

    SELECT (REPLACE(".", "\\.", STR("š")) AS ?s)
    WHERE {}
    # Result: ?
    # Expected: š

This seems to be a general problem of STR(), which I've filed as #609.
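One pattern that would produce exactly this split between á and š is a lossy narrowing to Latin-1 (ISO 8859-1): á (U+00E1) fits in a single Latin-1 byte, while š (U+0161) does not and would be substituted. The Python sketch below only illustrates that pattern. Whether Virtuoso's STR() actually does this is an assumption, and `narrow_to_latin1` is a hypothetical helper.

```python
def narrow_to_latin1(text):
    """Hypothetical lossy step: characters outside Latin-1 become '?'."""
    return text.encode("latin-1", errors="replace").decode("latin-1")

print(narrow_to_latin1("á"))  # á
print(narrow_to_latin1("š"))  # ?
```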

Tested using Virtuoso version 07.20.3217.
