Inverse regex #1
Comments
My current patch for py3 is:
--- regex_inverter.py 2020-02-13 17:25:40.000000000 +0700
+++ regex_inverter.py 2020-02-13 22:46:59.652203153 +0700
@@ -5,10 +5,11 @@
import sre_parse
import string
+# Note string.ascii_letters is not the same as \w which is unicode by default
category_chars = {
CATEGORY_DIGIT: string.digits,
CATEGORY_SPACE: string.whitespace,
- CATEGORY_WORD: string.digits + string.letters + '_'
+ CATEGORY_WORD: string.digits + string.ascii_letters + '_',
}
@@ -26,7 +27,8 @@
return string.printable
-def handle_branch((tok, val)):
+def handle_branch(val):
+ tok, val = val
all_opts = []
for toks in val:
opts = permute_toks(toks)
@@ -49,10 +51,11 @@
return [chr(val)]
-def handle_max_repeat((min, max, val)):
+def handle_max_repeat(val):
"""
Handle a repeat token such as {x,y} or ?.
"""
+ (min, max, val) = val
subtok, subval = val[0]
if max > 5000:
@@ -76,7 +79,7 @@
def handle_subpattern(val):
- return list(permute_toks(val[1]))
+ return list(permute_toks(val[3]))
def handle_tok(tok, val):
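For anyone adapting the patch by hand, here is a minimal sketch of the three Python 3 changes it makes. The helper names `WORD_CHARS`, `subpattern_tokens` and `describe_repeat` are hypothetical and not part of `regex_inverter.py`. Indexing with `val[-1]` instead of `val[3]` is an assumption of mine: it picks the inner token list whether the subpattern value has two fields or four.

```python
import string

try:
    # Python 3.11+ keeps the parser in a private module of the re package.
    from re import _parser as sre_parse
except ImportError:
    # Older Python 3 releases ship it as a top-level module.
    import sre_parse

# string.letters no longer exists in Python 3; string.ascii_letters is the
# ASCII-only replacement (note that \w itself is Unicode-aware by default).
WORD_CHARS = string.digits + string.ascii_letters + "_"


def subpattern_tokens(pattern):
    """Return the inner token list of a pattern that is a single group."""
    op, val = sre_parse.parse(pattern)[0]
    if op != sre_parse.SUBPATTERN:
        raise ValueError("pattern does not start with a group")
    # The inner tokens are the last field of the subpattern value.
    return val[-1]


def describe_repeat(val):
    """Unpack a MAX_REPEAT value in the body, not in the signature.

    Python 3 removed tuple parameters such as
    ``def handle_max_repeat((min, max, val))``.
    """
    lo, hi, sub = val
    return lo, hi, len(sub)
```

For example, `subpattern_tokens("(ab)")` yields the two literal tokens for `a` and `b`, and `describe_repeat` applied to the value parsed from `a{2,3}` reports the bounds 2 and 3.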
I see
Thank you. I'll take a look at this shortly.
Thank you for the suggestions! I've reimplemented the code and added it to my repository. I don't have time to maintain it as a Python package, but if you or anyone else needs it, feel free to copy the code.
Thanks @bjourne |
It appears that your inverse regex from https://www.mail-archive.com/python-list@python.org/msg125198.html isn't in this repo, or anywhere else. However, it is mentioned in https://stackoverflow.com/questions/17518554/how-to-reverse-a-regex-in-python/49389042 and there are a few unattributed copies on GitHub.
IMO it is a very useful solution and warrants being its own Python library on PyPI for easy adoption.
https://github.com/pyparsing/pyparsing/blob/master/examples/invRegex.py and https://pypi.org/project/er/ exist, and some other data-fuzzing-style code is available, but few of these efficiently generate the full set of permutations for a regex. The old code is py2 only, but the changes needed to support py3 are quite small.
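To illustrate what "full permutations" means here, below is a rough sketch of the general idea: walk the parse tree and enumerate every matching string. It is not the original `regex_inverter.py`. The names `expand`, `expand_token` and `invert`, and the `max_repeat` cap, are hypothetical. It only handles literals, groups, alternations and bounded repeats, and it assumes the private parser module keeps its current token layout.

```python
import itertools

try:
    from re import _parser as sre_parse  # Python 3.11+
except ImportError:
    import sre_parse  # older Python 3


def expand(tokens, max_repeat=3):
    """Yield every string matched by a parsed token sequence."""
    per_token = [list(expand_token(op, val, max_repeat)) for op, val in tokens]
    # Concatenate one choice per token, in every possible combination.
    for parts in itertools.product(*per_token):
        yield "".join(parts)


def expand_token(op, val, max_repeat):
    """Yield the strings a single token can match (small subset only)."""
    if op == sre_parse.LITERAL:
        yield chr(val)
    elif op == sre_parse.SUBPATTERN:
        # The inner token list is the last field of the group value.
        yield from expand(val[-1], max_repeat)
    elif op == sre_parse.BRANCH:
        _, alternatives = val
        for alternative in alternatives:
            yield from expand(alternative, max_repeat)
    elif op == sre_parse.MAX_REPEAT:
        lo, hi, sub = val
        # Assumption: clamp open-ended repeats such as * so output stays finite.
        hi = min(hi, max_repeat)
        pieces = list(expand(sub, max_repeat))
        for count in range(lo, hi + 1):
            for combo in itertools.product(pieces, repeat=count):
                yield "".join(combo)
    else:
        raise NotImplementedError("token %r is not handled in this sketch" % (op,))


def invert(pattern, max_repeat=3):
    """Return all strings matched by the pattern, within the sketch's limits."""
    return list(expand(sre_parse.parse(pattern), max_repeat))


print(invert("(ab|cd)e?"))  # ['ab', 'abe', 'cd', 'cde']
```

A real library would also need character classes, categories such as `\d` and `\w`, and lazy generation, so that large patterns do not build whole lists in memory.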