Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex for variable definition fails on string which includes semi colon or closing bracket #35

Closed
kdorsel opened this issue Jun 29, 2020 · 6 comments · Fixed by #38
Closed
Assignees
Labels
bug Something isn't working
Milestone

Comments

@kdorsel
Copy link
Contributor

kdorsel commented Jun 29, 2020

Losing the initialization portion if there's a semicolon inside the string
image

My initial idea was negative lookahead in the Tokenize method.

+ $@"{unit_pattern}{possible_space}{initialization}(?:;){comment}";

(?:;(?!.*;))
This negative lookahead would fail if there's a semi colon in the comment portion of the line also.

But that gives me even weirder results with the assignment operator now showing up.
image

@Roald87
Copy link
Owner

Roald87 commented Jun 30, 2020

Thanks for reporting! I earlier already noticed strange behavior when there is a closing bracket in the string ')'. You can see the failing ones and a few working ones here https://regex101.com/r/HN9oLJ/3

I tried the following to fix it, but it fails on different ones:

  1. Replace \w+\(.*\) with \w+\([^)]\) then WriteContents: STRING(1) := ')'; passes, but fbSample : FB_Sample(nId_Init := 11, fIn_Init := 33.44) ; fails https://regex101.com/r/HN9oLJ/6.
  2. Replace \w+\(.*\) with \w+\(.?\) then WriteContents: STRING(1) := ')'; passes, but fbSample : FB_Sample(nId_Init := 11, fIn_Init := 33.44) ; also fails https://regex101.com/r/HN9oLJ/7.

I also tried with your negative lookahead but that also doesn't lead to the desired result https://regex101.com/r/HN9oLJ/4

It's a though one.

@kdorsel
Copy link
Contributor Author

kdorsel commented Jun 30, 2020

Ok, this was an interesting one!

This seems to work. Add a negative look ahead/behind for the quotes
(?<!['"]) and (?!["'])
and also using the negative lookahead for the semi colon (?:;(?!.*;)). This will still fail is there's a semi colon in the comment.

I also removed the (?s) single line modifier. I'm not too sure the purpose of this one... But if needed the negative look ahead just needs to be modified to not match new lines with the dot.
https://regex101.com/r/HN9oLJ/8

This is my attempt to solve the semi colon in the comment, but still needs some work...
https://regex101.com/r/HN9oLJ/9

Any other edge cases will surely pop up as needed 😆

@Roald87
Copy link
Owner

Roald87 commented Jun 30, 2020

Yeah its a tricky one 😁 . That's why the tests are so convenient. It often happens that a small change of the regex pattern can have very large consequences for other variable declarations.

What I usually do is:

  1. Add a failing test and run it.
  2. Use regex101.com and/or https://www.debuggex.com/ to find the right pattern with a few examples
  3. Run the tests again. If it fails go to two. Else 🥳

@Roald87 Roald87 changed the title Weird string variable definition when string includes semi colon Regex for variable definition fails on string which includes semi colon or closing bracket Jun 30, 2020
@Roald87 Roald87 added the bug Something isn't working label Jun 30, 2020
@Roald87 Roald87 added this to the Version 0.2 milestone Jun 30, 2020
@kdorsel
Copy link
Contributor Author

kdorsel commented Jun 30, 2020

I think this might be easier to do in two steps. The first step would be to check for a string like initializing. Either ' or ". If found remove it and apply the current regex to part it out. The string check would have to make sure that any FB initializing with string values don't get caught.

Because the major issue I see right now is supporting those special characters inside a string. ) and ;.
https://regex101.com/r/EzvPSi/2

@Roald87
Copy link
Owner

Roald87 commented Jun 30, 2020

That seems like a good solution. Or else the regex will become very complex.

@kdorsel
Copy link
Contributor Author

kdorsel commented Jul 6, 2020

Ok, I'll look at the two step solution and open up a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants