Fix Regex Character Class Escape Tests #4423
Replying to a comment from @ptomato in #4364 here:
I can't see the changes you've made, but I've tried replicating them in the most recent commit and removed the file
For each character class escape (\d, \D, \s, \S, \w, \W), check positive cases (the escape matches all characters it's supposed to match) and negative cases (the escape doesn't match any of the characters it should not match). Each of these checks is also done in Unicode mode and with the v flag. This uses regenerate.js from the unicode-property-escapes-tests repo to generate strings that contain exactly the characters that are supposed to be matched or not matched for each escape. Comparison is done with regex test instead of regex replace to optimize the tests. This is part of my work at the SYSTEMF lab at EPFL. Avoid modifying the regenerate library object prototype.
074b5fd to 2f8296e
Thanks, that's exactly what I had in mind. I've pushed an update with some coding style fixes and split the commits into one that modifies the test generator script, and one with the resulting generated tests. I do have some comments/questions remaining so I'll reply inline.
ptomato left a comment
Actually never mind, I answered my own questions by reading the description of the original PR you opened (#4195). I think this is ready to merge. Thank you very much for sticking with this through the long process of fixing and moving over the test generator script.
Perfect, thanks for the review and edits! Happy to have helped!
This PR is a follow-up to #4364 and #4195.
For each character class escape (\d, \D, \s, \S, \w, \W), check positive cases (the escape matches all characters it's supposed to match) and negative cases (the escape doesn't match any of the characters it should not match). Each of these checks is also done in Unicode mode and with the v flag.
This uses regenerate.js from the unicode-property-escapes-tests repo to generate strings that contain exactly the characters that are supposed to be matched or not matched for each escape.
Comparison is done with regex test instead of regex replace to optimize the tests.
This is part of my work at the SYSTEMF lab at EPFL.
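For readers who want a feel for the positive/negative checks described above, here is a minimal sketch. It is not the generator script from this PR: it does not use regenerate.js, and the helper names (`stringFromRanges`, `complement`, `checkEscape`) are hypothetical, chosen only for illustration. It assumes sorted, non-overlapping code point ranges and skips the v flag on engines that do not support it.

```javascript
// Hypothetical helper: build a string containing every code point in the
// given inclusive ranges (a stand-in for what regenerate.js is used for).
function stringFromRanges(ranges) {
  const parts = [];
  for (const [start, end] of ranges) {
    for (let cp = start; cp <= end; cp++) {
      parts.push(String.fromCodePoint(cp));
    }
  }
  return parts.join("");
}

// Hypothetical helper: complement of sorted, non-overlapping ranges
// over the whole code point space U+0000..U+10FFFF.
function complement(ranges) {
  const result = [];
  let next = 0;
  for (const [start, end] of ranges) {
    if (start > next) result.push([next, start - 1]);
    next = end + 1;
  }
  if (next <= 0x10ffff) result.push([next, 0x10ffff]);
  return result;
}

// Only keep flag sets the current engine accepts (the v flag is recent).
const supportedFlags = ["", "u", "v"].filter((flags) => {
  try {
    new RegExp("", flags);
    return true;
  } catch {
    return false;
  }
});

// Positive case: every character in `matching` is matched by the escape.
// Negative case: no character in `nonMatching` is matched by the escape.
// Both use RegExp.prototype.test rather than String.prototype.replace.
function checkEscape(escape, matchingRanges) {
  const matching = stringFromRanges(matchingRanges);
  const nonMatching = stringFromRanges(complement(matchingRanges));
  const failures = [];
  for (const flags of supportedFlags) {
    const allMatch = new RegExp(`^${escape}+$`, flags);
    const anyMatch = new RegExp(escape, flags);
    if (!allMatch.test(matching)) {
      failures.push(`${escape} (flags "${flags}") missed an expected match`);
    }
    if (anyMatch.test(nonMatching)) {
      failures.push(`${escape} (flags "${flags}") matched an unexpected character`);
    }
  }
  return failures;
}

const digitRanges = [[0x30, 0x39]];
const wordRanges = [[0x30, 0x39], [0x41, 0x5a], [0x5f, 0x5f], [0x61, 0x7a]];

const cases = [
  ["\\d", digitRanges],
  ["\\D", complement(digitRanges)],
  ["\\w", wordRanges],
  ["\\W", complement(wordRanges)],
];

for (const [escape, ranges] of cases) {
  const failures = checkEscape(escape, ranges);
  console.log(escape, failures.length === 0 ? "ok" : failures);
}
```

Each check needs only two `test` calls per flag set on one large string, whereas a `replace`-based comparison has to build a new string; that is one plausible reading of the optimization mentioned above.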