# Python RegEx

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check if a string contains the specified search pattern.

---



**RegEx Module**

Python has a **built-in package called re**, which can be used to work with Regular Expressions.

Import the re module:

---



In [0]:
import re

**RegEx in Python**

When you have imported the **re** module, you can start using regular expressions:

---



In [0]:
import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

if x:
  print("YES! We have a match!")
else:
  print("No match")

YES! We have a match!
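
The pattern in the cell above combines three pieces of syntax: `^` anchors the match at the start of the string, `.*` matches any run of characters, and `$` anchors the match at the end. As a minimal sketch, the same pattern can be tried against a few strings; the sample strings other than the first are made up for illustration.

```python
import re

pattern = r"^The.*Spain$"

# Hypothetical sample strings chosen for illustration.
samples = [
    "The rain in Spain",        # starts with "The" and ends with "Spain"
    "The rain in Spain stays",  # does not end with "Spain"
    "In Spain, The rain",       # does not start with "The"
]

# Map each sample to True/False depending on whether the pattern matches.
results = {s: bool(re.search(pattern, s)) for s in samples}

for s, matched in results.items():
    print(f"{s!r}: {matched}")
```

Only the first string satisfies both anchors, so it is the only one reported as a match.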


**RegEx Functions**

The re module offers a set of functions that allow us to search a string for matches, split it, or replace the matches:

---

**Function** -> **Description**

---

**findall** -> Returns a list containing all matches

**search** -> Returns a Match object if there is a match anywhere in the string

**split** -> Returns a list where the string has been split at each match

**sub** -> Replaces one or many matches with a string
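
As a quick orientation, the four functions can be called side by side on the same string. This is only a sketch: the variable names are chosen here for illustration, and the underscore replacement is an arbitrary choice.

```python
import re

txt = "The rain in Spain"

# findall: every non-overlapping match, as a list of strings
all_matches = re.findall(r"ai", txt)

# search: a Match object for the first match, or None
first_match = re.search(r"ai", txt)

# split: the pieces of the string between matches
pieces = re.split(r"\s", txt)

# sub: a new string with each match replaced
replaced = re.sub(r"\s", "_", txt)

print(all_matches)
print(first_match.span())
print(pieces)
print(replaced)
```

Note that `re.sub` returns a new string; the original `txt` is left unchanged.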

In [0]:
import re

# Return a list containing every occurrence of "ai".
# The variable is named txt so it does not shadow the built-in str type.

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

['ai', 'ai']


The list contains the matches in the order they are found.

If no matches are found, an empty list is returned:

In [0]:
import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)

# An empty list is falsy, so it can be tested directly
if x:
  print("Yes, there is at least one match!")
else:
  print("No match")


[]
No match
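
Because `findall` always returns a list, counting matches is a matter of taking its length. The sketch below wraps that in a small helper; `count_matches` is a hypothetical name, not part of the re module, and it assumes a pattern without capture groups.

```python
import re

def count_matches(pattern, text):
    """Hypothetical helper: number of non-overlapping matches of pattern in text."""
    return len(re.findall(pattern, text))

txt = "The rain in Spain"

print(count_matches(r"ai", txt))        # two occurrences
print(count_matches(r"Portugal", txt))  # no occurrences, so the list is empty
```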


**The search() Function**

The search() function searches the string for a match, and returns a Match object if there is a match.

If there is more than one match, only the first occurrence of the match will be returned:

---

Search for the first white-space character in the string:

In [0]:
import re

txt = "The rain in Spain"
# Use a raw string so that \s reaches the regex engine unchanged
x = re.search(r"\s", txt)

print("The first white-space character is located in position:", x.start())

The first white-space character is located in position: 3


If no matches are found, the value **None** is returned:

---

Make a search that returns no match:

In [0]:
import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)

None
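
Since `search` may return None, calling a method such as `.start()` on the result without checking it first raises an AttributeError when nothing matched. One way to guard against that is sketched below; `first_whitespace_position` is a hypothetical helper written for this example.

```python
import re

def first_whitespace_position(text):
    """Hypothetical helper: index of the first white-space character, or None."""
    match = re.search(r"\s", text)
    if match is None:
        # No white-space found, so there is no Match object to query
        return None
    return match.start()

print(first_whitespace_position("The rain in Spain"))
print(first_whitespace_position("Portugal"))
```

Returning None from the helper mirrors what `search` does; a caller could equally choose -1 or raise an exception, depending on the use case.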


**The split() Function**

The split() function returns a list where the string has been split at each match:


---

Split at each white-space character:

In [0]:
import re

txt = "The rain in Spain"
# Use a raw string so that \s reaches the regex engine unchanged
x = re.split(r"\s", txt)
print(x)

['The', 'rain', 'in', 'Spain']
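
One consequence of splitting at each single white-space character is that consecutive spaces produce empty strings in the result. The sketch below compares `\s` with `\s+` (one or more white-space characters) on a made-up string that contains repeated spaces.

```python
import re

# Hypothetical input with repeated spaces, chosen for illustration.
txt = "The  rain   in Spain"

# Splitting at every single white-space character leaves empty strings
# between adjacent spaces.
per_character = re.split(r"\s", txt)

# Splitting at each run of white-space treats the run as one separator.
per_run = re.split(r"\s+", txt)

print(per_character)
print(per_run)
```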


You can limit the number of splits by specifying the maxsplit parameter:

---
Split the string only at the first occurrence:

In [0]:
import re

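txt = "The rain in Spain"
# Pass maxsplit by keyword; passing it positionally is deprecated in recent Python versions
x = re.split(r"\s", txt, maxsplit=1)
print(x)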

['The', 'rain in Spain']
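
To see how maxsplit changes the result, the same string can be split with a few different limits. This is a small sketch; the variable names are illustrative. A value of 0 is the default and means no limit.

```python
import re

txt = "The rain in Spain"

# maxsplit=0 (the default) splits at every match
unlimited = re.split(r"\s", txt, maxsplit=0)

# maxsplit=1 and maxsplit=2 stop after that many splits;
# the remainder of the string is returned as the last element
one_split = re.split(r"\s", txt, maxsplit=1)
two_splits = re.split(r"\s", txt, maxsplit=2)

print(unlimited)
print(one_split)
print(two_splits)
```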


**The sub() Function**

The sub() function replaces the matches with the text of your choice:

---

**Example:**
Replace every white-space character with the number 9:

In [0]:
import re

txt = "The rain in Spain"
# Use a raw string so that \s reaches the regex engine unchanged
x = re.sub(r"\s", "9", txt)
print(x)

The9rain9in9Spain


You can control the number of replacements by specifying the count parameter:

---

**Example:**
Replace the first 2 occurrences:

In [0]:
import re

txt = "The rain in Spain"
# Pass count by keyword; passing it positionally is deprecated in recent Python versions
x = re.sub(r"\s", "9", txt, count=2)
print(x)

The9rain9in Spain
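
The count parameter behaves like maxsplit: 0 is the default and means every match is replaced, while a positive value stops after that many replacements. A brief sketch with illustrative variable names:

```python
import re

txt = "The rain in Spain"

# count=1 replaces only the first match
first_only = re.sub(r"\s", "9", txt, count=1)

# count=0 (the default) replaces every match
every_match = re.sub(r"\s", "9", txt, count=0)

print(first_only)
print(every_match)
```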


**Match Object**

A Match Object is an object containing information about the search and the result.

---

**Note:** If there is no match, the value None is returned instead of the Match object.

---
**Example:** Do a search that will return a Match object:

In [0]:
import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  # this prints the Match object

<_sre.SRE_Match object; span=(5, 7), match='ai'>


**The Match object has properties and methods used to retrieve information about the search and the result:**

**.span()** -> returns a tuple containing the start and end positions of the match

**.string** -> returns the string passed into the function

**.group()** -> returns the part of the string where there was a match
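
The three members are related: slicing `.string` with the positions from `.span()` gives the same text as `.group()`. The sketch below checks that on the tutorial's sample string; the pattern `\bS\w+` looks for a word starting with an uppercase "S".

```python
import re

txt = "The rain in Spain"
match = re.search(r"\bS\w+", txt)

if match is not None:
    start, end = match.span()       # start and end positions of the match
    original = match.string         # the string that was searched
    matched_text = match.group()    # the text that matched

    print(match.span())
    print(matched_text)
    # The span indexes into the original string
    print(original[start:end] == matched_text)
```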

---

**Example:**
Print the position (start and end position) of the first match.

The regular expression looks for any word that starts with an uppercase "S":



In [0]:
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

(12, 17)


**Example:**
Print the string passed into the function:

In [0]:
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

The rain in Spain


**Example:**
Print the part of the string where there was a match.

The regular expression looks for any word that starts with an uppercase "S":

In [0]:
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())

Spain


**Note:** If there is no match, the value None is returned instead of the Match object, so calling a method such as .group() on the result raises an AttributeError.