# <center>RegEx in Python</center>

![](images/memes/meme11.jpeg)

# Quantifiers

**Quantifiers** define how many times a **character**, **metacharacter**, or **character set** may be **repeated**.

Here are the 4 basic quantifiers:

<table style="border: 1px solid black; font-size:15px;">
<thead>
    <th>Symbol</th>
    <th>Name</th>
    <th>Repetition of the preceding character, metacharacter, or character set</th>
</thead>
    
<tbody>
<tr>
    <td>?</td>
    <td>Question Mark</td>
    <td>Optional (0 or 1 repetitions)</td>
</tr>
    
<tr>
    <td>*</td>
    <td>Asterisk</td>
    <td>Zero or more times</td>
</tr>

<tr>
    <td>+</td>
    <td>Plus Sign</td>
    <td>One or more times</td>
</tr>

<tr>
    <td>{n,m}</td>
    <td>Curly Braces</td>
    <td>Between n and m times</td>
</tr>
</tbody>
</table>
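As a quick illustration of the table above, the following sketch applies each quantifier to the letter `o` and checks which sample words match in full. The sample words and the `patterns` dictionary are made up for this illustration.

```python
import re

# Each pattern applies one quantifier to the letter "o".
patterns = {
    "?": r"go?gle",          # zero or one "o"
    "*": r"go*gle",          # zero or more "o"
    "+": r"go+gle",          # one or more "o"
    "{n,m}": r"go{2,3}gle",  # two or three "o"
}

words = ["ggle", "gogle", "google", "gooogle", "goooogle"]

# re.fullmatch succeeds only if the whole word matches the pattern.
results = {
    symbol: [w for w in words if re.fullmatch(p, w)]
    for symbol, p in patterns.items()
}

for symbol, matched in results.items():
    print(symbol, matched)
# ? ['ggle', 'gogle']
# * ['ggle', 'gogle', 'google', 'gooogle', 'goooogle']
# + ['gogle', 'google', 'gooogle', 'goooogle']
# {n,m} ['google', 'gooogle']
```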


Let us go through different examples to understand them one by one.

### Example 1

Find all the matches for `dog` and `dogs` in the given text.

In [1]:
import re

In [2]:
txt = """
I have 2 dogs. One dog is 1 year old and other one is 2 years old. Both dogs are very cute! 
"""

In [3]:
pattern = re.compile("dogs?")

In [4]:
pattern.findall(txt)

['dogs', 'dog', 'dogs']

In [5]:
from utils import highlight_regex_matches
highlight_regex_matches(pattern, txt)


I have 2 **dogs**. One **dog** is 1 year old and other one is 2 years old. Both **dogs** are very cute!
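The `highlight_regex_matches` helper comes from a local `utils` module that is not shown here. If that module is unavailable, a rough stand-in could look like the sketch below. The name `highlight_matches_sketch` and its `start`/`end` parameters are hypothetical, and the real helper may behave differently (for example, it may print instead of returning a string).

```python
import re

def highlight_matches_sketch(pattern, text, start="\033[43m\033[1m", end="\033[0m"):
    """Hypothetical stand-in: wrap every match in start/end markers.

    The defaults are ANSI codes for a yellow background and bold text,
    which most terminals and notebooks render as a highlight.
    """
    return pattern.sub(lambda m: f"{start}{m.group(0)}{end}", text)

# Plain brackets are used here so the result is easy to read as text.
demo = highlight_matches_sketch(
    re.compile("dogs?"), "One dog, two dogs.", start="[", end="]"
)
print(demo)  # One [dog], two [dogs].
```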



### Example 2

Find all filenames starting with `file` and ending with `.txt` in the given text.

In [6]:
txt = """
file1.txt
file_one.txt
file.txt
fil.txt
file.xml
file-1.txt
"""

In [7]:
pattern = re.compile(r"file[\w-]*\.txt")

In [8]:
pattern.findall(txt)

['file1.txt', 'file_one.txt', 'file.txt', 'file-1.txt']

In [9]:
highlight_regex_matches(pattern, txt)


**file1.txt**
**file_one.txt**
**file.txt**
fil.txt
file.xml
**file-1.txt**

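To see why `*` is the right choice in Example 2, it may help to compare it with `+` on the same filenames. This is a small sketch; the `names` list simply repeats the sample filenames, and `re.fullmatch` is used so each name is tested as a whole.

```python
import re

names = ["file1.txt", "file_one.txt", "file.txt", "fil.txt", "file.xml", "file-1.txt"]

# "*" allows zero extra characters after "file", so "file.txt" matches.
star = [n for n in names if re.fullmatch(r"file[\w-]*\.txt", n)]

# "+" requires at least one extra character, so "file.txt" is excluded.
plus = [n for n in names if re.fullmatch(r"file[\w-]+\.txt", n)]

print(star)  # ['file1.txt', 'file_one.txt', 'file.txt', 'file-1.txt']
print(plus)  # ['file1.txt', 'file_one.txt', 'file-1.txt']
```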


### Example 3

Find all filenames starting with `file` followed by 1 or more digits and ending with `.txt` in the given text.

In [10]:
txt = """
file1.txt
file_one.txt
file09.txt
fil.txt
file23.xml
file.txt
"""

In [11]:
pattern = re.compile(r"file\d+\.txt")

In [12]:
pattern.findall(txt)

['file1.txt', 'file09.txt']

In [13]:
highlight_regex_matches(pattern, txt)


**file1.txt**
file_one.txt
**file09.txt**
fil.txt
file23.xml
file.txt



The curly braces syntax also comes in the following variants:

<table style="border: 1px solid black; font-size:15px;">
<thead>
    <th>Syntax</th>
    <th>Description</th>
</thead>
    
<tbody>
<tr>
    <td>{n}</td>
    <td>The previous character is repeated exactly n times.</td>
</tr>
    
<tr>
    <td>{n,}</td>
    <td>The previous character is repeated at least n times.</td>
</tr>

<tr>
    <td>{,n}</td>
    <td>The previous character is repeated at most n times.</td>
</tr>

<tr>
    <td>{n,m}</td>
    <td>The previous character is repeated between n and m times (both inclusive).</td>
</tr>
</tbody>
</table>
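The four variants can be compared side by side with a short sketch. The sample strings of repeated `a` characters and the `variants` dictionary are made up for this illustration.

```python
import re

samples = ["a", "aa", "aaa", "aaaa", "aaaaa"]

variants = {
    "{3}": r"a{3}",      # exactly 3
    "{3,}": r"a{3,}",    # at least 3
    "{,3}": r"a{,3}",    # at most 3 (lower bound defaults to 0)
    "{2,4}": r"a{2,4}",  # between 2 and 4, both inclusive
}

# re.fullmatch checks the whole string, so the counts are exact.
brace_results = {
    name: [s for s in samples if re.fullmatch(p, s)]
    for name, p in variants.items()
}

for name, matched in brace_results.items():
    print(name, matched)
# {3} ['aaa']
# {3,} ['aaa', 'aaaa', 'aaaaa']
# {,3} ['a', 'aa', 'aaa']
# {2,4} ['aa', 'aaa', 'aaaa']
```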

### Example 4

Find years in the given text.


In [14]:
txt = """
The first season of Indian Premier League (IPL) was played in 2008. 
The second season was played in 2009 in South Africa. 
Last season was played in 2018 and won by Chennai Super Kings (CSK).
CSK won the title in 2010 and 2011 as well.
Mumbai Indians (MI) has also won the title 3 times in 2013, 2015 and 2017.
"""

In [15]:
pattern = re.compile(r"\d{4}")

In [16]:
pattern.findall(txt)

['2008', '2009', '2018', '2010', '2011', '2013', '2015', '2017']
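Note that `\d{4}` matches any run of four digits, including the first four digits of a longer number. The text in Example 4 contains no longer numbers, so this does not show up there. One possible way to guard against it is the word-boundary anchor `\b`, which is not covered in this section. The `sample` string below is invented for illustration.

```python
import re

sample = "Order 123456 shipped in 2018; ticket 98765 closed in 2019."

# Without boundaries, four-digit slices of longer numbers also match.
loose = re.findall(r"\d{4}", sample)

# \b requires a word boundary on both sides of the four digits.
strict = re.findall(r"\b\d{4}\b", sample)

print(loose)   # ['1234', '2018', '9876', '2019']
print(strict)  # ['2018', '2019']
```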

### Example 5

In the given text, extract all numbers that have 4 or more digits.

In [17]:
txt = """
123143
432
5657
4435
54
65111
"""

In [18]:
pattern = re.compile(r"\d{4,}")

In [19]:
pattern.findall(txt)

['123143', '5657', '4435', '65111']
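The difference between `{4}` and `{4,}` becomes clearer when both run on the same numbers. In this sketch, the `numbers` string repeats the values from Example 5 on a single line.

```python
import re

numbers = "123143 432 5657 4435 54 65111"

# {4} stops after exactly four digits, so longer numbers are cut short.
exactly_four = re.findall(r"\d{4}", numbers)

# {4,} keeps consuming digits, so each long number is returned whole.
four_or_more = re.findall(r"\d{4,}", numbers)

print(exactly_four)  # ['1231', '5657', '4435', '6511']
print(four_or_more)  # ['123143', '5657', '4435', '65111']
```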

### Example 6

Write a pattern to validate telephone numbers.

Telephone numbers can be of the form: `555-555-5555`, `555 555 5555`, `5555555555`

In [20]:
txt = """
555-555-5555
555 555 5555
5555555555
"""

In [21]:
pattern = re.compile(r"\d{3}[-\s]?\d{3}[-\s]?\d{4}")

In [22]:
pattern.findall(txt)

['555-555-5555', '555 555 5555', '5555555555']
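`findall` reports matches anywhere in the text, so it would also find a valid-looking number inside a longer string. To validate a single candidate as a whole, one option is `fullmatch`. The sketch below assumes a hypothetical helper named `is_valid_phone` and a made-up list of candidates.

```python
import re

phone_pattern = re.compile(r"\d{3}[-\s]?\d{3}[-\s]?\d{4}")

def is_valid_phone(candidate):
    """Hypothetical helper: True only if the whole string is a phone number."""
    return phone_pattern.fullmatch(candidate) is not None

candidates = [
    "555-555-5555",
    "555 555 5555",
    "5555555555",
    "555-555-55555",      # one digit too many
    "call 555-555-5555",  # extra text before the number
]

validity = {c: is_valid_phone(c) for c in candidates}
for candidate, ok in validity.items():
    print(candidate, ok)
```

Because each separator is matched independently, a mixed form such as `555-555 5555` is accepted as well. Whether that is desirable depends on the use case.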

![](images/memes/meme12.jpg)