Regex operations #334

Open
NicoLaval opened this issue May 20, 2024 · 4 comments

Comments

@NicoLaval
Collaborator

@noahboerger you reported:

  • Different pattern syntax
  • Trevas is using the Java pattern syntax

Could you clarify, please?

NicoLaval added the "question" and "Needs more information" labels and removed the "question" label on May 20, 2024
@noahboerger
Collaborator

noahboerger commented May 24, 2024

This note comes from one of the BdI test cases, the one under "string/pattern_replacement_3".

There, the replace function is called with the pattern [a-e-i-o-u], but the intent is to replace only the letters a, e, i, o, u.
This pattern looks odd from my point of view, so I changed it to [a|e|i|o|u] to get the expected result.
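
To make the difference between these patterns concrete, here is a minimal sketch using java.util.regex, the syntax Trevas is reported to use above. The class name `CharClassDemo`, the helper `mask`, and the sample string are made up for illustration; the plain `[aeiou]` variant is added only for comparison, and the BdI engine may well behave differently.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: replace every match of `regex` in `input` with "_".
    static String mask(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("_");
    }

    public static void main(String[] args) {
        String sample = "a-b|k-u z";
        // Parsed as the ranges a-e and i-o plus the literals '-' and 'u'.
        System.out.println(mask("[a-e-i-o-u]", sample)); // ___|___ z
        // Inside a character class, '|' is a literal member, not alternation.
        System.out.println(mask("[a|e|i|o|u]", sample)); // _-b_k-_ z
        // Plain list of the five vowels.
        System.out.println(mask("[aeiou]", sample));     // _-b|k-_ z
    }
}
```

With java.util.regex, [a-e-i-o-u] also replaces letters such as b and k as well as the hyphen, which would explain an unexpected result for this test case. [a|e|i|o|u] gives the expected vowels but additionally matches a literal |.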

It was more of a note on my side: either the BdI engine and Trevas use different pattern syntaxes, or something is wrong with the test case itself. Nothing needs to be adjusted in Trevas.

So I would propose closing this issue.

@hadrienk
Collaborator

What does the spec say about the regexp syntax?

@noahboerger
Collaborator

The reference manual of match_characters provides the following information (p. 116):

match_characters returns TRUE if op matches the regular expression regexp, FALSE otherwise. The string regexp is an Extended Regular Expression as described in the POSIX standard. Different implementations of VTL may implement different versions of the POSIX standard therefore it is possible that match_characters may behave in slightly different ways.

For replace, no explicit reference to a pattern standard seems to be made, and the examples contain only simple string values.
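
The manual's caveat about implementations behaving differently can be illustrated with a small sketch, assuming an engine built on java.util.regex. POSIX bracket notation such as [[:alpha:]] is not a named class in java.util.regex; there it is read as a nested character class containing the literal characters :, a, l, p and h, and the Java spelling of the POSIX class is \p{Alpha}. The class name `PosixVsJavaDemo` and the helper `matchCharacters` are hypothetical, and matching against the whole input is an assumption of this sketch, not something taken from the manual.

```java
import java.util.regex.Pattern;

class PosixVsJavaDemo {
    // Hypothetical stand-in for match_characters: true if the whole input matches regex.
    static boolean matchCharacters(String input, String regex) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // A POSIX ERE engine would treat [[:alpha:]] as "any alphabetic character".
        // java.util.regex treats it as the set of characters ':', 'a', 'l', 'p', 'h'.
        System.out.println(matchCharacters("z", "[[:alpha:]]")); // false
        System.out.println(matchCharacters(":", "[[:alpha:]]")); // true
        // The java.util.regex spelling of the POSIX alphabetic class.
        System.out.println(matchCharacters("z", "\\p{Alpha}"));  // true
    }
}
```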

@NicoLaval
Collaborator Author

It's a problem.

I opened an issue in the TF repo.
