### Grouping
Frequently you need to obtain more information than just whether the regex pattern matched or not.

By placing part of a regular expression inside parentheses, ( and ), you can group that part of the pattern so that it is treated as a single unit.

### Applications of grouping:
#### 1. Apply a quantifier to the entire group
For example, (ab)+ matches one or more repetitions of ab, whereas ab+ matches a single a followed by one or more b.

In [2]:
import re

from colorama import Back, Style


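def highlight_regex_matches(pattern, text, print_output=True):
    """Highlight every match of a compiled pattern in text using ANSI colors."""
    output = text
    len_inc = 0  # number of characters added to output by earlier highlights
    for match in pattern.finditer(text):
        # Shift the match indices to account for escape codes already inserted.
        start, end = match.start() + len_inc, match.end() + len_inc
        output = output[:start] + Back.YELLOW + Style.BRIGHT + output[start:end] + Style.RESET_ALL + output[end:]
        len_inc = len(output) - len(text)

    if print_output:
        print(output)
    else:
        return output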

In [3]:
txt = "abbbbbbabbbb"

In [7]:
pattern1 = re.compile("ab+")
pattern2 = re.compile("(abb)+")

In [8]:
highlight_regex_matches(pattern1,txt)

[43m[1mabbbbbb[0m[43m[1mabbbb[0m


In [9]:
highlight_regex_matches(pattern2,txt)

[43m[1mabb[0mbbbb[43m[1mabb[0mbb


#### 2. Restrict alternation to part of the regex
For example, my name is ram|sam matches either my name is ram or just sam, whereas my name is (ram|sam) matches either my name is ram or my name is sam.

In [14]:
txt = """
my name is ram
my name is sam
"""

In [15]:
pattern1 = re.compile("my name is ram|sam")
pattern2 = re.compile("my name is (ram|sam)")


In [16]:
highlight_regex_matches(pattern1,txt)


[43m[1mmy name is ram[0m
my name is [43m[1msam[0m



In [17]:
highlight_regex_matches(pattern2,txt)


[43m[1mmy name is ram[0m
[43m[1mmy name is sam[0m
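
The effect of the parentheses on alternation can also be checked without terminal colors. The following is a small sketch that collects the matched substrings for both patterns, with ram spelled as in the sample text. The variable names are illustrative assumptions.

```python
import re

sample = "my name is ram\nmy name is sam\n"

# Alternation splits the whole pattern: "my name is ram" OR "sam".
unrestricted = re.compile(r"my name is ram|sam")

# The group limits the alternation to the final word: "ram" OR "sam".
restricted = re.compile(r"my name is (ram|sam)")

unrestricted_matches = [m.group(0) for m in unrestricted.finditer(sample)]
restricted_matches = [m.group(0) for m in restricted.finditer(sample)]

print(unrestricted_matches)  # ['my name is ram', 'sam']
print(restricted_matches)    # ['my name is ram', 'my name is sam']
```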



#### 3. Capture the text matched by a group

* Groups indicated with ( and ) also capture the text they match, along with its starting and ending index.

* The captured text and its indices can be retrieved by passing the group number to group(), start(), end(), and span() of the Match object.

* Group 0 refers to the whole match; the capturing groups themselves are numbered from 1, left to right by their opening parenthesis.

* Group 0 is always present, so group(), start(), end(), and span() all use group 0 as their default argument.

Consider an example where we want to parse a date and determine day, month and year.

In [28]:
text = "12/02/2012"

In [29]:
pattern = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

In [34]:
match = pattern.match(text)

In [35]:
highlight_regex_matches(pattern,text)

[43m[1m12/02/2012[0m


In [38]:
match.group(0)

'12/02/2012'

In [37]:
match.group(2)

'02'

In [39]:
match.group(3)

'2012'

In [41]:
day,month,year = match.groups()

In [42]:
day

'12'

In [44]:
day,month,year

('12', '02', '2012')
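
The same group numbers can be passed to start(), end(), and span() to find where each part of the date sits in the string. The following is a minimal sketch that rebuilds the date match so it runs on its own; the variable names are illustrative.

```python
import re

date_pattern = re.compile(r"(\d{2})/(\d{2})/(\d{4})")
date_match = date_pattern.match("12/02/2012")

whole_span = date_match.span()    # group 0 by default: the whole match
day_span = date_match.span(1)     # first group: day
month_start = date_match.start(2) # second group: month
month_end = date_match.end(2)
year_span = date_match.span(3)    # third group: year

print(whole_span)              # (0, 10)
print(day_span)                # (0, 2)
print(month_start, month_end)  # 3 5
print(year_span)               # (6, 10)
```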

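Groups also change what findall() returns: when the pattern contains exactly one capturing group, findall() returns the text captured by that group for each match instead of the whole match.

In [45]: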
txt = """
Name: Nikhil
Age: 0
Roll No.: 15
Grade: S

Name: Ravi
Age: -1
Roll No.: 123
Grade: K

Name: Ram
Age: N/A
Roll No.: 1
Grade: G
"""

In [46]:
pattern = re.compile(r"Name: (.+)\n")


In [47]:
pattern.findall(txt)

['Nikhil', 'Ravi', 'Ram']
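
When a pattern contains more than one capturing group, findall() returns one tuple per match, with one element per group. The following is a small sketch that extends the example above with a second group for the Age field. That second group and the variable names are illustrative additions, not part of the original notebook.

```python
import re

records_text = """
Name: Nikhil
Age: 0
Roll No.: 15
Grade: S

Name: Ravi
Age: -1
Roll No.: 123
Grade: K

Name: Ram
Age: N/A
Roll No.: 1
Grade: G
"""

# Two capturing groups: the name, then the age on the following line.
record_pattern = re.compile(r"Name: (.+)\nAge: (.+)\n")

name_age_pairs = record_pattern.findall(records_text)
print(name_age_pairs)  # [('Nikhil', '0'), ('Ravi', '-1'), ('Ram', 'N/A')]
```

Every captured value is a string, so numeric fields such as the age would still need converting and validating separately.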