Option for verbose regular expressions #9596

Open
dlangBugzillaToGithub opened this issue Jan 24, 2013 · 1 comment

@dlangBugzillaToGithub

bearophile_hugs reported this on 2013-01-24T18:14:01Z

Transferred from https://issues.dlang.org/show_bug.cgi?id=9390

Description

I'd really like an option to write "verbose" regular expressions in D, like in Python:

http://docs.python.org/2/library/re.html


> re.X
> re.VERBOSE
> 
>     This flag allows you to write regular expressions that look
>     nicer. Whitespace within the pattern is ignored, except when in a
>     character class or preceded by an unescaped backslash, and, when
>     a line contains a '#' neither in a character class or preceded by
>     an unescaped backslash, all characters from the leftmost such '#'
>     through the end of the line are ignored.
> 
>     That means that the two following regular expression objects that
>     match a decimal number are functionally equal:
> 
>     a = re.compile(r"""\d +  # the integral part
>                        \.    # the decimal point
>                        \d *  # some fractional digits""", re.X)
>     b = re.compile(r"\d+\.\d*")


RE code is code like any other, so it benefits from comments and from nicer indentation and formatting.

Making REs more readable makes them easier to debug and understand. In my Python code, every RE longer than half a line is "verbose".
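
As a rough illustration of the rule quoted above, here is a minimal Python sketch of how a verbose pattern could be reduced to its compact form. The function name `strip_verbose` is hypothetical, and the handling of character classes is simplified (a `]` placed first inside a class is not treated specially). It is not a proposal for how std.regex should implement this.

```python
import re


def strip_verbose(pattern: str) -> str:
    """Drop whitespace and '#' comments outside character classes (simplified)."""
    out = []
    in_class = False
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # An escaped character is always kept, together with its backslash.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if in_class:
            # Inside [...], whitespace and '#' are literal.
            out.append(ch)
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            out.append(ch)
        elif ch == "#":
            # Skip the comment up to (not including) the end of the line.
            while i < n and pattern[i] != "\n":
                i += 1
            continue
        elif not ch.isspace():
            out.append(ch)
        i += 1
    return "".join(out)


verbose = r"""\d +  # the integral part
              \.    # the decimal point
              \d *  # some fractional digits"""

compact = strip_verbose(verbose)
samples = ["3.14", "42.", "abc", ".5"]
# Compare the stripped pattern against Python's own re.X handling.
agree = all(
    bool(re.fullmatch(compact, s)) == bool(re.fullmatch(verbose, s, re.X))
    for s in samples
)
```

On these samples the stripped pattern and the `re.X` pattern accept and reject the same strings, which is the equivalence the Python documentation describes.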
@dlangBugzillaToGithub

dmitry.olsh (@DmitryOlshansky) commented on 2013-01-25T12:13:45Z

How about adding the common extension for comments inside a regular expression?

I can't recall the syntax off-hand, but it's something like:
(?# some comment that is ignored)
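
For reference, Python's `re` module already accepts this `(?#...)` form, so a minimal sketch of what such inline comments look like in practice can be written there. This only shows the Python behavior; whether D's std.regex would adopt the same syntax is exactly what is being proposed here.

```python
import re

# Each (?#...) group is ignored by the regex engine.
commented = re.compile(
    r"\d+(?# integral part)\.(?# decimal point)\d*(?# fractional digits)"
)
plain = re.compile(r"\d+\.\d*")

samples = ["3.14", "42.", "abc", ".5"]
same_results = all(
    bool(commented.fullmatch(s)) == bool(plain.fullmatch(s)) for s in samples
)
```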


Plus you can already use any of the following:

auto pattern = r"the first piece" // comment
r"the second piece" // comment 2
...
r" the last piece"; // last comment


Or if implicit concatenation feels too dirty:

auto pattern = r"the first piece" // comment
~ r"the second piece" // comment 2
...
~ r" the last piece"; // last comment
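
The same piecewise style can be sketched in Python, where adjacent string literals inside parentheses are concatenated at compile time. This is a Python analogue of the idea, applied to the decimal-number pattern from the report, and not D code.

```python
import re

# Adjacent raw string literals are joined into one pattern string;
# the comments are ordinary source comments, invisible to the regex engine.
pattern = (
    r"\d+"   # the integral part
    r"\."    # the decimal point
    r"\d*"   # some fractional digits
)

number = re.compile(pattern)
matched = number.fullmatch("3.14") is not None
```

Unlike a verbose flag, this keeps the regex syntax itself unchanged: the comments live in the host language.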

Either way, a free-form regex plus a top-level explanatory note is enough by my standards. The rationale is that if you have to explain every piece in isolation, then it's one of 2 cases: you are explaining the mechanics to people who don't know what a regex is (and that's wrong), or the regex pattern is too darn complex for its own good.

Since this is an enhancement request, I hereby propose 2 ways to solve it: close as won't fix, or add the aforementioned extension for comments (which at least is more or less common). I'm not going to add another option that messes with syntax rules.

@LightBender LightBender removed the P4 label Dec 6, 2024