regex has no equivalent to re.Match.groups() for captures #474

Closed
Dylan-Brotherston opened this issue Jul 24, 2022 · 3 comments

@Dylan-Brotherston

Dylan-Brotherston commented Jul 24, 2022

regex adds the regex.Match.captures() method, which returns a list of all the captures that a capture group makes.
regex.Match.captures(N) returns the list of captures for group N, just as re.Match.group(N) returns the last capture.
regex.Match.captures() returns the list of captures for group 0, just as re.Match.group() returns group 0.

re also has re.Match.groups(), which returns a tuple of the last capture of every group.
But regex has no easy way to return the equivalent information, i.e. a tuple of lists of all captures.

Given the following example:

import re
import regex

ip_regex = r"(([0-9]|[1-9][0-9]|1[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"
ip = "127.0.0.1"
m1 = re.fullmatch(ip_regex, ip)
print(f"{m1.groups()=}")
print(f"{m1.group()=}")
print(f"{m1.group(0)=}")
print(f"{m1.group(1)=}")
print(f"{m1.group(2)=}")
print(f"{m1.group(3)=}")

Output:

m1.groups()=('0.', '0', '1')
m1.group()='127.0.0.1'
m1.group(0)='127.0.0.1'
m1.group(1)='0.'
m1.group(2)='0'
m1.group(3)='1'

m2 = regex.fullmatch(ip_regex, ip)
# nothing can go here to imitate print(f"{m1.groups()=}")
print(f"{m2.captures()=}")
print(f"{m2.captures(0)=}")
print(f"{m2.captures(1)=}")
print(f"{m2.captures(2)=}")
print(f"{m2.captures(3)=}")
# can't print (['127.', '0.', '0.'], ['127', '0', '0'], ['1'])

Output:

m2.captures()=['127.0.0.1']
m2.captures(0)=['127.0.0.1']
m2.captures(1)=['127.', '0.', '0.']
m2.captures(2)=['127', '0', '0']
m2.captures(3)=['1']

One possibility would be to rename regex.Match.captures() to regex.Match.capture(), mirroring re.Match.group(), and then have regex.Match.captures() mirror re.Match.groups().

That would obviously be a breaking change, so an alternative is to add a new method, e.g. regex.Match.capturestuple(), named along the lines of regex.Match.capturesdict().

@mrabarnett
Owner

A shorter name would be allcaptures, maybe?

A workaround is:

>>> m2.captures(*list(range(len(m2))))
(['127.0.0.1'], ['127.', '0.', '0.'], ['127', '0', '0'], ['1'])
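
For anyone who wants the workaround wrapped up, here is a minimal sketch of the same idea as a helper. The name all_captures is hypothetical (it is not part of regex), the pattern is a deliberately simplified stand-in for the IP regex above, and the group count is taken from match.re.groups rather than len(match); both should give the same range.

```python
import regex  # third-party "regex" package, not the stdlib "re"


def all_captures(match):
    """Return a tuple with one list of captures per group, group 0 first.

    Hypothetical helper for illustration; not part of the regex API.
    """
    # match.re.groups is the number of capture groups in the pattern;
    # add 1 so that group 0 (the whole match) is included.
    return tuple(match.captures(i) for i in range(match.re.groups + 1))


# Simplified pattern: one repeated group followed by a single group.
m = regex.fullmatch(r"(?:(\d+)\.)+(\d+)", "127.0.0.1")
result = all_captures(m)
print(result)

# Dropping group 0 gives the same shape as re.Match.groups(),
# but with every capture instead of only the last one.
print(result[1:])
```

Slicing off the first element is a choice, not a requirement: the one-liner above keeps group 0, whereas re.Match.groups() leaves it out.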

@Dylan-Brotherston
Author

That's pretty much the workaround I'm currently using.
I'm certainly not fussed about the name, but I definitely think it should be provided functionality.

@mrabarnett
Owner

Added allcaptures and allspans in regex 2022.7.24.
