# Grouping

When using regular expressions, it is often useful to divide a match into groups. For example, when searching a text for email addresses, we may be interested in more than the addresses themselves: we may also want their individual parts. To access them, we can divide an email address pattern into two groups, username and domain, each of which is captured separately.

In [1]:
import re

## Basic grouping

As an example, suppose we want to use a regex to match text consisting of three parts: 3 letters, 2 whitespace characters, and 3 digits.

Grouping allows us to access parts of the matched text separately. In this case, we can get the letters, whitespace, and digits independently. This is not especially useful when the lengths of the groups are fixed, but as you will see, it becomes very useful when the lengths vary.

In [2]:
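# Three groups: 3 letters, 2 whitespace characters, 3 digits.
# The raw string (r"...") passes the backslashes to the regex engine unchanged.
pattern = re.compile(r"([a-zA-Z][a-zA-Z][a-zA-Z])(\s\s)(\d\d\d)")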
mo = pattern.search("The text is: abc\n\t321")
print(mo)

<re.Match object; span=(13, 21), match='abc\n\t321'>


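To access the captured groups, we can use the `group()` and `groups()` methods on the `Match` object. Group 0 is the entire matched text (calling `group()` with no argument is the same as `group(0)`), while groups 1 onward are the parts captured by each pair of parentheses, numbered from left to right. `groups()` returns all of the captured groups as a tuple.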

In [3]:
print(mo.groups())

('abc', '\n\t', '321')


In [4]:
print(mo.group())
print(mo.group(0))

abc
	321
abc
	321


In [5]:
print(mo.group(1))
print(mo.group(2))
print(mo.group(3))

abc

	
321


In [6]:
print(mo.group(3, 2, 1))

('321', '\n\t', 'abc')


Groups can also contain repetition modifiers, which lets each group capture a variable number of characters.

In [7]:
pattern = re.compile(r"(\d+)-(\d+)")
mo = pattern.search("182891231230-128379AAAA")
print(mo.groups())

('182891231230', '128379')
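
When combining repetition with grouping, the position of the modifier matters. The sketch below uses a made-up sample string to contrast a modifier placed inside the parentheses with one applied to the group itself. In the second case the group is matched repeatedly and keeps only its last repetition.

```python
import re

text = "id=12345"  # hypothetical sample input

# Modifier inside the group: the group captures the whole run of digits.
inside = re.search(r"id=(\d+)", text)
print(inside.group(1))   # 12345

# Modifier applied to the group: the group matches one digit at a time,
# and only the last repetition is kept.
outside = re.search(r"id=(\d)+", text)
print(outside.group(1))  # 5
print(outside.group(0))  # id=12345 (the full match is unaffected)
```

To capture a variable-length run as one unit, the modifier usually belongs inside the parentheses.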


## Named groups

We can also give groups names, which makes them easier to access than by number.

The syntax is

```python
(?P<name>...)
```

where `name` is the identifier of the group and `...` represents the pattern.

In [8]:
pattern = re.compile(r"(?P<last>\w+), (?P<first>\w+)")
mo = pattern.search("Programmer, Future")
print(mo.groupdict())
print(mo.group("first"))
print(mo.group("last"))

{'last': 'Programmer', 'first': 'Future'}
Future
Programmer
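
Returning to the email example from the introduction, here is a minimal sketch of splitting an address into username and domain with named groups. The pattern is deliberately simplified and is an assumption made for illustration; real email addresses allow more forms than it accepts. The sample text and the group names `username` and `domain` are also made up.

```python
import re

# Simplified, illustrative pattern: not a complete email validator.
email_pattern = re.compile(r"(?P<username>[\w.+-]+)@(?P<domain>[\w.-]+)")

text = "Contact us at jane.doe@example.com for details."  # made-up sample
mo = email_pattern.search(text)

# search() returns None when nothing matches, so check before using the result.
if mo is not None:
    print(mo.group("username"))  # jane.doe
    print(mo.group("domain"))    # example.com
    # Named groups are still numbered, so access by number works as well.
    print(mo.group(1) == mo.group("username"))  # True
```

Names make the code less fragile than numbers: adding another group to the pattern later does not change what `mo.group("domain")` refers to.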


## Summary

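Grouping lets you capture different parts of a match separately. Wrap each part of the pattern in parentheses, then retrieve it by number with `group()` and `groups()`, or give it a name with `(?P<name>...)` and retrieve it with `group("name")` or `groupdict()`.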