[BUG] replace, tokenize and analyze-string must throw when pattern matches an empty string #3803

Closed
PieterLamers opened this issue Apr 6, 2021 · 8 comments · Fixed by #4864
Labels: bug (issue confirmed as bug), xquery (issue is related to xquery implementation)

PieterLamers commented Apr 6, 2021

Describe the bug

fn:replace, fn:tokenize and fn:analyze-string accept a pattern that matches the empty string. That results in odd behaviour, as in this example:

replace( '12.34' , '^\D*', '')

The provided pattern always matches, because it can match the empty string.

Expected behavior

Error FORX0003 is thrown with location information
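
The XPath Functions and Operators 3.1 specification defines the error condition for fn:replace as the case where fn:matches("", $pattern, $flags) returns true. A minimal sketch of that check, applied to the patterns discussed in this issue (the loop itself is only an illustration, not part of any processor):

```xquery
xquery version "3.1";

(: For each pattern, report whether it matches the zero-length string.
   Patterns for which this is true should trigger FORX0003 in
   fn:replace, fn:tokenize and fn:analyze-string. :)
for $pattern in ('^\D*', '^[^0-9]*', '^\D+')
return
    $pattern || ' matches empty string: ' || matches('', $pattern)
```

The first two patterns report true, since `*` allows zero repetitions; the third reports false, since `+` requires at least one non-digit.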

Actual

The first character is swallowed: 2.34

To Reproduce

replace( '12.34' , '^\D*','')  

or

replace( '12.34' , '^[^0-9]*','')

Both return 2.34 instead of the desired 12.34.

Reference

This used to behave differently in earlier versions of eXist-db, where a pattern that matched an empty string would just return the input unchanged (likely related to #3530).

fn:replace specification

XQTS 31 tests

Context (please always complete the following information):

  • OS: Windows 10
  • eXist-db version: 5.3.0-snapshot 55e77cc
  • Java version: 1.8.0.281

line-o commented Apr 6, 2021

This might be a change in behaviour again.

replace("$11.23", "^[^0-9]*(.*)$", "$1")

is the correct form to write the replacement.


line-o commented Apr 6, 2021

Looking at it again, I am unsure. This looks like a bug. But neither of

  • replace('12.34' , '^\D','')
  • replace('12.34' , '^[^0-9]','')

matches, so neither replaces a character at the beginning.


line-o commented Apr 6, 2021

So replace($may-start-with-currency-symbol, "^\D+", "") might be the easiest solution.
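
A short sketch of that workaround, assuming inputs that may or may not start with a currency symbol (the sample values are made up for illustration):

```xquery
xquery version "3.1";

(: "^\D+" needs at least one leading non-digit, so it cannot match
   the empty string and should not raise FORX0003. :)
for $input in ('$11.23', '12.34', 'EUR 5')
return
    replace($input, '^\D+', '')

(: returns "11.23", "12.34", "5" :)
```

Inputs without a leading non-digit simply do not match and are returned unchanged, which is the behaviour the original query was after.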


line-o commented Apr 6, 2021

Related: #3530


line-o commented Apr 6, 2021

BaseX 9.5 raises the error "[FORX0003] Pattern matches empty string." for the original pattern "^[^0-9]*".


line-o commented Apr 6, 2021

Saxon 10.0 (HE) also throws the same error.
So all in all this is definitely a bug, because no error is thrown!
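
To compare processors, the error can be made visible with try/catch. This is only a sketch: the `err` prefix is declared explicitly here because not every processor predeclares it, and the exact error description differs between implementations.

```xquery
xquery version "3.1";

declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: A conforming processor should end up in the catch clause;
   an affected eXist-db build returns a replaced string instead. :)
try {
    replace('12.34', '^[^0-9]*', '')
} catch err:FORX0003 {
    'caught ' || local-name-from-QName($err:code) || ': ' || $err:description
}
```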

line-o added the bug and xquery labels on Apr 6, 2021

line-o commented Apr 6, 2021

@PieterLamers would you edit the bug description to reflect the new findings, or may I? Or should I open a separate issue?


PieterLamers commented Apr 6, 2021

Hi @line-o, thanks for the explanations! You are welcome to edit the ticket. I think I will simply replace the * with a + to avoid the error.

line-o changed the title from "[BUG] regexp error in replace#3" to "[BUG] replace and analyze-string must throw when pattern matches an empty string" on Apr 7, 2021
line-o changed the title from "[BUG] replace and analyze-string must throw when pattern matches an empty string" to "[BUG] replace, tokenize and analyze-string must throw when pattern matches an empty string" on Apr 7, 2021
joewiz added this to the eXist-5.2.1 milestone on Apr 11, 2021
line-o changed the milestone from eXist-5.2.1 to eXist-6.0.0 on Jun 28, 2021
adamretter changed the milestone from eXist-6.0.0 to eXist-7.0.0 on Feb 14, 2022
adamretter self-assigned this on Apr 10, 2023
adamretter added a commit to evolvedbinary/exist that referenced this issue Apr 10, 2023
adamretter added a commit to evolvedbinary/exist that referenced this issue Apr 10, 2023
adamretter added a commit to evolvedbinary/exist that referenced this issue Apr 10, 2023
marmoure pushed a commit to evolvedbinary/exist that referenced this issue May 12, 2023
line-o pushed a commit to eXistSolutions/exist that referenced this issue Jun 28, 2024