# Regex

A regex (short for regular expression) is a sequence of characters that defines a search pattern. It is used mainly for matching, searching, and manipulating text.

📌 In Simple Terms:

Regex is like a smart search tool. It helps you find, extract, or replace parts of a string based on complex patterns.

✅ Common Uses of Regex:

1. Validate formats (e.g., emails, phone numbers, ZIP codes).

2. Search for specific patterns (e.g., dates, hashtags).

3. Extract data (e.g., getting all numbers from a text).

4. Replace text based on patterns.
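The four uses above map fairly directly onto functions in Python's built-in `re` module. The snippet below is a minimal sketch of that mapping; the sample sentence and the patterns are made up for illustration and are far simpler than real-world validation would need.

```python
import re

# Hypothetical sample text, chosen only to exercise each use case.
text = 'Order shipped on 2024-03-15 to ZIP 90210 #fast #tracked'

# 1. Validate: fullmatch succeeds only if the whole string fits the pattern.
is_zip = re.fullmatch(r'\d{5}', '90210') is not None

# 2. Search: find the first ISO-style date and all hashtags.
date = re.search(r'\d{4}-\d{2}-\d{2}', text).group()
hashtags = re.findall(r'#\w+', text)

# 3. Extract: collect every run of digits.
numbers = re.findall(r'\d+', text)

# 4. Replace: mask every digit with an X.
masked = re.sub(r'\d', 'X', text)

print(is_zip)    # True
print(date)      # 2024-03-15
print(hashtags)  # ['#fast', '#tracked']
print(numbers)   # ['2024', '03', '15', '90210']
print(masked)    # Order shipped on XXXX-XX-XX to ZIP XXXXX #fast #tracked
```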

![image.png](attachment:75724661-ce82-4480-9ebc-eedeb581c650.png)

![image.png](attachment:d9fc7ec5-b450-4331-a6d7-141ff4c50e11.png)

In [29]:
# Example: Regex in customer support

import re


# Retrieve Phone Number

chat1 = 'codebasics: you ask lot of questions 😠  1235678912, abc@xyz.com'
pattern1 = r'\d{10}'
P1_match = re.findall(pattern1,chat1)
print('Phone No of Chat 1 :',P1_match)


chat2 = 'codebasics: here it is: (123)-567-8912, abC@xyz!com'
pattern2 = r'\(\d{3}\)-\d{3}-\d{4}'
P2_match = re.findall(pattern2,chat2)
print('Phone No of Chat 2 :',P2_match)


chat3 = 'codebasics: yes, phone: 1235678912 email: abc@xyz.io'
pattern3 = r'\d{10}|\(\d{3}\)-\d{3}-\d{4}'  # no spaces around |: a space in a pattern is matched literally
P3_match = re.findall(pattern3,chat3)
print('Phone No of Chat 3 :',P3_match)


# Retrieve E-mail

pattern_1 = r'[a-z0-9_]*@[a-z]*\.com'
E1_match = re.findall(pattern_1,chat1)
print('\nE-mail of Chat 1 :',E1_match)


# The dot is escaped so it matches only a literal '.'; the malformed 'abC@xyz!com' in chat2 is not matched
pattern_2 = r'[a-zA-Z0-9_]*@[a-z]*\.com'
E2_match = re.findall(pattern_2,chat2)
print('E-mail of Chat 2 :',E2_match)


pattern_3 = r'[a-zA-Z0-9_]*@[a-z]*\.[a-z]*'
E3_match = re.findall(pattern_3,chat3)
print('E-mail of Chat 3 :',E3_match)


Phone No of Chat 1 : ['1235678912']
Phone No of Chat 2 : ['(123)-567-8912']
Phone No of Chat 3 : ['1235678912']

E-mail of Chat 1 : ['abc@xyz.com']
E-mail of Chat 2 : []
E-mail of Chat 3 : ['abc@xyz.io']
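Two details in phone and e-mail patterns like these are easy to get wrong: whitespace inside a pattern is matched literally, and an unescaped `.` matches any character. The sketch below illustrates both using strings taken from the chats above. The `re.VERBOSE` version is one optional way to keep a pattern readable without the whitespace becoming part of it.

```python
import re

sample = 'phone: 1235678912 email: abc@xyz.io'

# Pitfall 1: spaces around | belong to the alternatives themselves.
spaced = r'\d{10} | \(\d{3}\)-\d{3}-\d{4}'
tight = r'\d{10}|\(\d{3}\)-\d{3}-\d{4}'
print(re.findall(spaced, sample))  # ['1235678912 ']  (trailing space captured)
print(re.findall(tight, sample))   # ['1235678912']

# With re.VERBOSE, whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r'''
    \d{10}                    # ten consecutive digits
    | \(\d{3}\)-\d{3}-\d{4}   # or the (123)-567-8912 format
''', re.VERBOSE)
print(verbose.findall(sample))            # ['1235678912']
print(verbose.findall('(123)-567-8912'))  # ['(123)-567-8912']

# Pitfall 2: an unescaped dot matches any character, including '!'.
loose = r'[a-zA-Z0-9_]*@[a-z]*.com'
strict = r'[a-zA-Z0-9_]*@[a-z]*\.com'
print(re.findall(loose, 'abC@xyz!com'))   # ['abC@xyz!com']
print(re.findall(strict, 'abC@xyz!com'))  # []
```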


In [31]:
# Retrieve order number

chat1 ='codebasics: Hello, I am having an issue with my order # 412889912'
pattern1 = r'order[^\d]*(\d*)'
P1_match = re.findall(pattern1,chat1)
print('Order No of Chat 1 :',P1_match)


chat2 = 'codebasics: I have a problem with my order number 412889912'
pattern2 = r'order[^\d]*(\d*)'
P2_match = re.findall(pattern2,chat2)
print('Order No of Chat 2 :',P2_match)


chat3 = 'codebasics: My order 412889912 is having an issue, I was charged 300$ when online it says 280$'
pattern3 = r'order[^\d]*(\d*)'
P3_match = re.findall(pattern3,chat3)
print('Order No of Chat 3 :',P3_match)


Order No of Chat 1 : ['412889912']
Order No of Chat 2 : ['412889912']
Order No of Chat 3 : ['412889912']
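The order-number pattern works on these three chats, but `(\d*)` also accepts zero digits, and `order` is case-sensitive. The following sketch shows how the pattern behaves on two made-up chat lines (they are not from the original data) and how `\d+` and `re.IGNORECASE` change the result.

```python
import re

loose = r'order[^\d]*(\d*)'   # pattern used above: the group may be empty
strict = r'order[^\d]*(\d+)'  # requires at least one digit

# Hypothetical chat with the word "order" but no number.
no_number = 'codebasics: my order was delayed'
print(re.findall(loose, no_number))   # ['']
print(re.findall(strict, no_number))  # []

# Hypothetical chat where "Order" is capitalized.
with_number = 'codebasics: Order # 412889912 has not arrived'
print(re.findall(strict, with_number))                 # []
print(re.findall(strict, with_number, re.IGNORECASE))  # ['412889912']
```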


In [47]:
# Regex for Information Extraction

Text ='''
Born	Elon Reeve Musk
June 28, 1971 (age 50)
Pretoria, Transvaal, South Africa
Citizenship	
South Africa (1971–present)
Canada (1971–present)
United States (2002–present)
Education	University of Pennsylvania (BS, BA)
Title	
Founder, CEO and Chief Engineer of SpaceX
CEO and product architect of Tesla, Inc.
Founder of The Boring Company and X.com (now part of PayPal)
Co-founder of Neuralink, OpenAI, and Zip2
Spouse(s)	
Justine Wilson
(m. 2000; div. 2008)
Talulah Riley
(m. 2010; div. 2012)
(m. 2013; div. 2016)
'''


In [49]:
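def get_pattern_match(pattern, Text):
    # Return the first match (or first captured group) of pattern in Text, or None if nothing matches
    matches = re.findall(pattern, Text)
    if matches:
        return matches[0]
    return None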
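Since only the first match is needed, the same idea can also be written with `re.search`, which stops at the first match instead of collecting all of them. The helper below is a hypothetical alternative, not the notebook's function, and it assumes the pattern contains exactly one capture group.

```python
import re

def first_group(pattern, text):
    """Hypothetical helper: return group 1 of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

sample = 'June 28, 1971 (age 50)'
print(first_group(r'age (\d+)', sample))     # 50
print(first_group(r'height (\d+)', sample))  # None
```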

In [51]:
# Age

get_pattern_match(r'age (\d+)', Text)


'50'

In [59]:
# Name

get_pattern_match(r'Born(.*)\n', Text).strip()


'Elon Reeve Musk'

In [65]:
# DOB

get_pattern_match(r'Born.*\n(.*)\(age', Text)


'June 28, 1971 '

In [63]:
# Place

get_pattern_match(r'\(age.*\n(.*)', Text)


'Pretoria, Transvaal, South Africa'

In [71]:

def extract_personal_information(Text):
    age = get_pattern_match(r'age (\d+)', Text)
    full_name = get_pattern_match(r'Born(.*)\n', Text)
    birth_date = get_pattern_match(r'Born.*\n(.*)\(age', Text)
    birth_place = get_pattern_match(r'\(age.*\n(.*)', Text)
    return {
        'age': int(age),
        'name': full_name.strip(),
        'birth_date': birth_date.strip(),
        'birth_place': birth_place.strip()
    }


extract_personal_information(Text)


{'age': 50,
 'name': 'Elon Reeve Musk',
 'birth_date': 'June 28, 1971',
 'birth_place': 'Pretoria, Transvaal, South Africa'}
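The four separate patterns can also be folded into a single pattern with named groups, so that one search returns all fields at once. This is only a sketch under the assumption that the text follows the same three-line layout (name, then date with age, then place); `BIO_PATTERN` and the shortened sample string are introduced here for illustration.

```python
import re

# Hypothetical combined pattern; each (?P<name>...) group becomes a dict key.
BIO_PATTERN = re.compile(
    r'Born\s+(?P<name>.+)\n'
    r'(?P<birth_date>.+?)\s*\(age (?P<age>\d+)\)\n'
    r'(?P<birth_place>.+)'
)

# First three lines of the infobox text used above.
sample = 'Born\tElon Reeve Musk\nJune 28, 1971 (age 50)\nPretoria, Transvaal, South Africa\n'

match = BIO_PATTERN.search(sample)
info = match.groupdict() if match else {}
print(info)
# {'name': 'Elon Reeve Musk', 'birth_date': 'June 28, 1971', 'age': '50',
#  'birth_place': 'Pretoria, Transvaal, South Africa'}
```

Note that `age` comes back as a string here; convert it with `int()` if a number is needed. The trade-off is that a single pattern fails as a whole if any one line deviates from the layout, whereas separate patterns fail field by field.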

In [73]:

text = '''
Born	Mukesh Dhirubhai Ambani
19 April 1957 (age 64)
Aden, Colony of Aden
(present-day Yemen)[1][2]
Nationality	Indian
Alma mater	
St. Xavier's College, Mumbai
Institute of Chemical Technology (B.E.)
Stanford University (drop-out)
Occupation	Chairman and MD, Reliance Industries
Spouse(s)	Nita Ambani (m. 1985)[3]
Children	3
Parent(s)	
Dhirubhai Ambani (father)
Kokilaben Ambani (mother)
Relatives	Anil Ambani (brother)
Tina Ambani (sister-in-law)
'''


In [75]:
extract_personal_information(text)

{'age': 64,
 'name': 'Mukesh Dhirubhai Ambani',
 'birth_date': '19 April 1957',
 'birth_place': 'Aden, Colony of Aden'}
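Both examples contain every field, but `extract_personal_information` would raise an error on text where a field is missing, because `int(None)` and `None.strip()` are not valid. The variant below is a hypothetical, more forgiving sketch that returns `None` for any field it cannot find; the function name and the trimmed sample text are assumptions made for this illustration.

```python
import re

def get_pattern_match(pattern, text):
    matches = re.findall(pattern, text)
    return matches[0] if matches else None

def extract_personal_information_safe(text):
    """Hypothetical variant: missing fields become None instead of raising."""
    fields = {
        'age': r'age (\d+)',
        'name': r'Born(.*)\n',
        'birth_date': r'Born.*\n(.*)\(age',
        'birth_place': r'\(age.*\n(.*)',
    }
    info = {}
    for key, pattern in fields.items():
        value = get_pattern_match(pattern, text)
        info[key] = value.strip() if value is not None else None
    # Convert age only when it was actually found.
    if info['age'] is not None:
        info['age'] = int(info['age'])
    return info

# Trimmed version of the text above with the date/age line removed.
partial = 'Born\tMukesh Dhirubhai Ambani\nAden, Colony of Aden\n'
print(extract_personal_information_safe(partial))
# {'age': None, 'name': 'Mukesh Dhirubhai Ambani', 'birth_date': None, 'birth_place': None}
```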