# 🧩 Regex Pattern Tutorial
---
This notebook will teach you **regex patterns** step by step with examples.


| Pattern    | Meaning                                         | Example Text         | Matches            |
| ---------- | ----------------------------------------------- | -------------------- | ------------------ |
| `[a-z]`    | Any **small letter**                            | `"abc XYZ"`          | a, b, c            |
| `[A-Z]`    | Any **capital letter**                          | `"abc XYZ"`          | X, Y, Z            |
| `[a-zA-Z]` | Any letter (small or big)                       | `"Pizza 123"`        | P, i, z, z, a      |
| `\w`       | Any word character (letter, number, underscore) | `"Hi_123!"`          | H, i, \_, 1, 2, 3  |
| `\w+`      | Whole word (many word chars)                    | `"Pizza and Python"` | Pizza, and, Python |


| Pattern   | Meaning                | Example Text          | Matches |
| --------- | ---------------------- | --------------------- | ------- |
| `\d`      | A single digit         | `"I have 2 apples"`   | 2       |
| `\d+`     | One or more digits     | `"Room 123, floor 7"` | 123, 7  |
| `\d{3}`   | Exactly 3 digits       | `"My pin is 4567"`    | 456     |
| `\d{2,4}` | Between 2 and 4 digits | `"Born in 2005"`      | 2005    |
| `\d{2,4}` | Between 2 and 4 digits | `"Born in 05"`        |   05    |


| Pattern    | Meaning                | Example Text                 | Matches          |
| ---------- | ---------------------- | ---------------------------- | ---------------- |
| `^Hello`   | Word at **start**      | `"Hello world"`              | Hello            |
| `world$`   | Word at **end**        | `"Say world"`                | world            |
| `\bP\w+`   | Word starting with P   | `"Pizza and Python"`         | Pizza, Python    |
| `\b\w+ing` | Word ending with “ing” | `"I am running and singing"` | running, singing |


| Pattern                         | Meaning                    | Example Text                      | Matches                                         |
| ------------------------------- | -------------------------- | --------------------------------- | ----------------------------------------------- |
| `\d{10}`                        | 10-digit phone number      | `"Call 9876543210 now"`           | 9876543210                                      |
| `[a-zA-Z0-9._]+@[a-z]+\.[a-z]+` | Email                      | `"Mail me at hello123@gmail.com"` | hello123@gmail.com                              |
| `https?://\S+`                  | Website links (http/https) | `"Visit https://python.org"`      | https://python.org                              |


## 1. Letters & Words

In [6]:
import re

text = "Pizza 123 Hi_ABC"

print("[a-z] ->", re.findall(r"[a-z]", text))
print("[A-Z] ->", re.findall(r"[A-Z]", text))
print("[a-zA-Z] ->", re.findall(r"[a-zA-Z]", text))
print("\\w ->", re.findall(r"\w", text))
print("\\w+ ->", re.findall(r"\w+", text))

[a-z] -> ['i', 'z', 'z', 'a', 'i']
[A-Z] -> ['P', 'H', 'A', 'B', 'C']
[a-zA-Z] -> ['P', 'i', 'z', 'z', 'a', 'H', 'i', 'A', 'B', 'C']
\w -> ['P', 'i', 'z', 'z', 'a', '1', '2', '3', 'H', 'i', '_', 'A', 'B', 'C']
\w+ -> ['Pizza', '123', 'Hi_ABC']
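
The character classes above are case-sensitive, which is why `[a-z]` skips `P` and `H`. As a small sketch (the variable names are only illustrative), the `re.IGNORECASE` flag lets one lowercase class match both cases:

```python
import re

text = "Pizza 123 Hi_ABC"

# Case-sensitive: only runs of lowercase letters are returned.
lower_only = re.findall(r"[a-z]+", text)

# re.IGNORECASE makes [a-z] match uppercase letters as well.
any_case = re.findall(r"[a-z]+", text, flags=re.IGNORECASE)

print("lower_only ->", lower_only)
print("any_case   ->", any_case)
```

The underscore is not a letter, so `Hi_ABC` is split into `Hi` and `ABC` in the second result.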


## 2. Numbers

In [2]:
text = "My pin is 4567, born in 2005, room 123"

print("\\d ->", re.findall(r"\d", text))
print("\\d+ ->", re.findall(r"\d+", text))
print("\\d{3} ->", re.findall(r"\d{3}", text))
print("\\d{2,4} ->", re.findall(r"\d{2,4}", text))

\d -> ['4', '5', '6', '7', '2', '0', '0', '5', '1', '2', '3']
\d+ -> ['4567', '2005', '123']
\d{3} -> ['456', '200', '123']
\d{2,4} -> ['4567', '2005', '123']
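
Note that `\d{3}` returned `456` and `200`, which are only the first three digits of longer numbers. A minimal sketch of one way to keep just the standalone three-digit numbers, using `\b` word boundaries (variable names are illustrative):

```python
import re

text = "My pin is 4567, born in 2005, room 123"

# Without boundaries, \d{3} also matches inside longer numbers.
loose = re.findall(r"\d{3}", text)

# \b on both sides requires the three digits to stand alone.
strict = re.findall(r"\b\d{3}\b", text)

print("loose  ->", loose)
print("strict ->", strict)
```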


## 3. Spaces & Special Characters

In [6]:
text = "Hi there. www.google.com"

print("\\s ->", re.findall(r"\s", text))
print(". ->", re.findall(r".", text))
print("\\. ->", re.findall(r"\.", text))

\s -> [' ', ' ']
. -> ['H', 'i', ' ', 't', 'h', 'e', 'r', 'e', '.', ' ', 'w', 'w', 'w', '.', 'g', 'o', 'o', 'g', 'l', 'e', '.', 'c', 'o', 'm']
\. -> ['.', '.', '.']
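
Because an unescaped `.` matches any character, searching for a literal string such as a domain name can give false positives. The sketch below, with made-up sample strings, uses the standard-library `re.escape()` to escape the special characters automatically:

```python
import re

text = "Hi there. www.google.com"
domain = "www.google.com"

# Unescaped, each "." in the pattern matches any character.
unescaped = re.findall(domain, "wwwXgoogleYcom")

# re.escape() backslash-escapes the special characters for us.
escaped_pattern = re.escape(domain)
escaped = re.findall(escaped_pattern, "wwwXgoogleYcom")
exact = re.findall(escaped_pattern, text)

print("unescaped ->", unescaped)
print("escaped   ->", escaped)
print("exact     ->", exact)
```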


## 4. Start, End, Whole Word

In [7]:
text = "Hello world. I am running and singing with Python Pizza"

print("^Hello ->", re.findall(r"^Hello", text))
print("world$ ->", re.findall(r"world$", text))  # empty: the text ends with "Pizza", not "world"
print("\\bP\\w+ ->", re.findall(r"\bP\w+", text))
print("\\b\\w+ing ->", re.findall(r"\b\w+ing", text))

^Hello -> ['Hello']
world$ -> []
\bP\w+ -> ['Python', 'Pizza']
\b\w+ing -> ['running', 'singing']
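
`world$` returns an empty list here because `$` anchors to the end of the whole string, and this text ends with `Pizza`. As a sketch with a made-up three-line string, the `re.MULTILINE` flag makes `^` and `$` anchor at every line break as well:

```python
import re

text = "Hello world\nHello Python\nSay world"

# Default: ^ and $ anchor to the start and end of the whole string.
start_default = re.findall(r"^Hello", text)
end_default = re.findall(r"world$", text)

# With re.MULTILINE they also anchor at the start and end of each line.
start_multi = re.findall(r"^Hello", text, flags=re.MULTILINE)
end_multi = re.findall(r"world$", text, flags=re.MULTILINE)

print("default   ->", start_default, end_default)
print("multiline ->", start_multi, end_multi)
```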


## 5. Real-Life Examples

In [16]:
text = "Call 9876543210 or mail me at hello123@gmail.com. Visit https://python.org"

print("Phone numbers ->", re.findall(r"\d{10}", text))
print("Emails ->", re.findall(r"[a-zA-Z0-9._]+@[a-z]+\.[a-z]+", text))
print("Links ->", re.findall(r"https?://\S+", text))

Phone numbers -> ['9876543210']
Emails -> ['hello123@gmail.com']
Links -> ['https://python.org']


## ✅ Recap
- `[a-z]` → small letter  
- `[A-Z]` → capital letter  
- `\d` → digit  
- `\w` → word character  
- `+` → one or more  
- `{n}` → exact count  
- `{n,m}` → min and max count (no space after the comma)  
- `^` / `$` → beginning / end of the string  

Regex = **Lego blocks 🧩 for text matching!**

In [7]:
import re

text = """On 99/99/9999, Mr. Rahul Sharma called from 9876543210 to confirm the meeting. 
Later the same day (September 5, 2024) Mrs Anita Gupta emailed the agenda to rahul.sharma@example.com and to contact@school.edu. 
Please note the backup contact 9123456789 belongs to Mr. A. Verma who said he will arrive on 2024-09-06. 
Mrs. Sunita Kapoor sent a follow-up on 06/09/2024 with attachments and also cc'd anita_gupta123@mail.co.in. 
A reminder SMS was also sent from 9988776655 on Sep 7, 2024. 
For registration, write to admissions@college.ac.in or call 9012345678. 
Finally, Mr Suresh.P reported an issue on 07/09/2024 and requested support from support-team@example.org.
"""

In [8]:
patterns = {
    "phones": r"\b\d{10}\b",
    "date_dd_mm_yyyy": r"\b\d{2}/\d{2}/\d{4}\b",
    "date_yyyy_mm_dd": r"\b\d{4}-\d{2}-\d{2}\b",
    "date_month_d_comma_yyyy": r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Sept|Oct|Nov|Dec|January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},\s*\d{4}\b",
    "emails": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    "mr_mrs_names": r"\b(?:Mr|Mrs)\.?\s+[A-Z][a-zA-Z.]*?(?:\s+[A-Z][a-zA-Z.]*)?\b"
}

for name, pat in patterns.items():
    found = re.findall(pat, text)
    print(f"{name} -> {found}")

phones -> ['9876543210', '9123456789', '9988776655', '9012345678']
date_dd_mm_yyyy -> ['99/99/9999', '06/09/2024', '07/09/2024']
date_yyyy_mm_dd -> ['2024-09-06']
date_month_d_comma_yyyy -> ['September 5, 2024', 'Sep 7, 2024']
emails -> ['rahul.sharma@example.com', 'contact@school.edu', 'anita_gupta123@mail.co.in', 'admissions@college.ac.in', 'support-team@example.org']
mr_mrs_names -> ['Mr. Rahul Sharma', 'Mrs Anita Gupta', 'Mr. A', 'Mrs. Sunita Kapoor', 'Mr Suresh']
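
The `emails` pattern here is broader than the one used in section 5. A quick comparison on a made-up sentence suggests why: the simpler pattern allows only one dot after the `@` and no hyphen before it, so it returns truncated addresses.

```python
import re

text = "Write to anita_gupta123@mail.co.in or support-team@example.org"

# Simple pattern from section 5: one dot in the domain, no "-" in the name.
simple = re.findall(r"[a-zA-Z0-9._]+@[a-z]+\.[a-z]+", text)

# Broader pattern: multi-part domains and extra characters in the name.
broader = re.findall(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", text)

print("simple  ->", simple)
print("broader ->", broader)
```

Neither pattern is a full email validator; both are approximations that work for the sample text.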


# 📝 Regex Practice Questions

## Level 1 – Baby Steps
1. Write a pattern to **find all digits** in:  
   `"I have 2 apples and 15 bananas."`


In [11]:

text = "I have 2 apples and 15 bananas."
# \d+ groups adjacent digits into whole numbers; plain \d would return '2', '1', '5'
re.findall(r"\d+",text)

['2', '15']


2. Write a pattern to **find all words starting with capital letters** in:  
   `"My friend John likes Pizza and Python."`

In [None]:
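text = "My friend John likes Pizza and Python"

# Option 1: take every word, then keep those containing a capital letter anywhere
print([w for w in re.findall(r"\w+", text) if re.findall(r"[A-Z]", w)])

# Option 2: a capital letter at a word boundary, followed by more word characters
print(re.findall(r"\b[A-Z]\w+", text))

# best: a capital letter followed only by lowercase letters
print(re.findall(r"\b[A-Z][a-z]*\b", text))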


['My', 'John', 'Pizza', 'Python']
['My', 'John', 'Pizza', 'Python']
['My', 'John', 'Pizza', 'Python']



3. Write a pattern to **find all words ending with "ing"** in:  
   `"I am singing while running and eating."`

---


In [29]:
text = "I am singing while running and eating"

re.findall(r"\b\w+ing",text)

['singing', 'running', 'eating']


## Level 2 – Numbers & Words
4. Find all **3-digit numbers** in:  
   `"Pin codes: 123, 4567, and 890."`


In [36]:
text="Pin codes: 123, 4567, and 890"

print(re.findall(r"\b\d{3}\b",text))

['123', '890']



5. Find all **words that start with P** in:  
   `"Python, Pizza, Panda, Apple, Banana"`


In [43]:
text = "Python, Pizza, Panda, Apple, Banana"
print(re.findall(r"\bP\w+",text))

['Python', 'Pizza', 'Panda']



6. Find all **2-letter words** in:  
   `"It is an on or up to me."`

---


In [53]:
text = "It is an on or up to me"

print(re.findall(r"\b[a-zA-Z]{2}\b",text))

['It', 'is', 'an', 'on', 'or', 'up', 'to', 'me']



## Level 3 – Real Life Patterns
7. Find all **10-digit phone numbers** in:  
   `"Call 9876543210 or 9123456789 today."`


In [54]:
text = "Call 9876543210 or 9123456789 today"
print(re.findall(r"\d{10}",text))

['9876543210', '9123456789']



8. Find all **email addresses** in:  
   `"Contact me at hello@gmail.com or admin@school.org"`
   


In [58]:
text = "Contact me at hello@gmail.com or admin@school.org"

print(re.findall(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",text))

['hello@gmail.com', 'admin@school.org']



9. Find all **dates in DD/MM/YYYY format** in:  
   `"My birthday is 05/09/2024 and my friend’s is 15/08/2001."`
   


In [59]:
text ="My birthday is 05/09/2024 and my friend’s is 15/08/2001."

print(re.findall(r"\b\d{2}/\d{2}/\d{4}\b", text))

['05/09/2024', '15/08/2001']



10. Find all **names starting with Mr. or Mrs.** in:  
   `"Mr. Ramesh met Mrs Anita at the park."`

---


In [72]:
text = "Mr. Ramesh met Mrs Anita at the park"

# \.? makes the dot after the title optional, so both "Mr." and "Mrs" match
print(re.findall(r"\b(?:Mr|Mrs)\.?\s+[A-Z][a-zA-Z]*\b",text))

['Mr. Ramesh', 'Mrs Anita']



## 🎯 Hints
- Use `\d` for digits.  
- Use `[A-Z]` for uppercase letters.  
- Use `\w+` for whole words.  
- Use `{n}` for exact count (e.g., 10 digits = `\d{10}`).  
- Use `^` and `$` for start/end of the string (or of each line with `re.MULTILINE`).  
- Use `\b` for word boundaries.  
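
One more hint for Q10: `re.findall` returns only the group contents when a pattern contains a capturing group, which is why the title patterns in this notebook use `(?:...)`. A small sketch with a made-up sentence:

```python
import re

text = "Mr. Rahul Sharma met Mrs Anita Gupta."

# Capturing group: findall returns only the text captured by the group.
titles = re.findall(r"\b(Mr|Mrs)\.?\s+[A-Z][a-z]+", text)

# Non-capturing group (?:...): findall returns the whole match.
full = re.findall(r"\b(?:Mr|Mrs)\.?\s+[A-Z][a-z]+", text)

print("capturing     ->", titles)
print("non-capturing ->", full)
```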


<details>
<summary>🔑 Show Solution</summary>

# 📝 Regex Practice Questions with Solutions

## Level 1 – Baby Steps
**Q1.** Write a pattern to **find all digits** in:  
`"I have 2 apples and 15 bananas."`  

**Solution:**  
`\d`  
Matches: `2`, `1`, `5`

---

**Q2.** Write a pattern to **find all words starting with capital letters** in:  
`"My friend John likes Pizza and Python."`  

**Solution:**  
`\b[A-Z]\w+`  
Matches: `My`, `John`, `Pizza`, `Python`

---

**Q3.** Write a pattern to **find all words ending with "ing"** in:  
`"I am singing while running and eating."`  

**Solution:**  
`\b\w+ing`  
Matches: `singing`, `running`, `eating`

---

## Level 2 – Numbers & Words
**Q4.** Find all **3-digit numbers** in:  
`"Pin codes: 123, 4567, and 890."`  

**Solution:**  
`\b\d{3}\b`  
Matches: `123`, `890`

---

**Q5.** Find all **words that start with P** in:  
`"Python, Pizza, Panda, Apple, Banana"`  

**Solution:**  
`\bP\w+`  
Matches: `Python`, `Pizza`, `Panda`

---

**Q6.** Find all **2-letter words** in:  
`"It is an on or up to me."`  

**Solution:**  
`\b\w{2}\b`  
Matches: `It`, `is`, `an`, `on`, `or`, `up`, `to`, `me`

---

## Level 3 – Real Life Patterns
**Q7.** Find all **10-digit phone numbers** in:  
`"Call 9876543210 or 9123456789 today."`  

**Solution:**  
`\b\d{10}\b`  
Matches: `9876543210`, `9123456789`

---

**Q8.** Find all **email addresses** in:  
`"Contact me at hello@gmail.com or admin@school.org"`  

**Solution:**  
`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`  
Matches: `hello@gmail.com`, `admin@school.org`

---

**Q9.** Find all **dates in DD/MM/YYYY format** in:  
`"My birthday is 05/09/2024 and my friend’s is 15/08/2001."`  

**Solution:**  
`\b\d{2}/\d{2}/\d{4}\b`  
Matches: `05/09/2024`, `15/08/2001`

---

**Q10.** Find all **names starting with Mr. or Mrs.** in:  
`"Mr. Ramesh met Mrs Anita at the park."`  

**Solution:**  
`\b(?:Mr|Mrs)\.?\s+[A-Z][a-zA-Z.]*\b`  
Matches: `Mr. Ramesh`, `Mrs Anita`

---

# 🎯 Quick Recap
- `\d` → digit  
- `\w` → word character  
- `{n}` → exact count  
- `+` → one or more  
- `\b` → word boundary  
- `^` / `$` → start / end of the string (each line with `re.MULTILINE`)  

</details>

In [33]:
import re

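# `text` must be the multi-line sample paragraph defined earlier (the one starting "On 99/99/9999"), not the last exercise string.
# Pattern to match valid dates in DD/MM/YYYY or DD-MM-YYYY format, excluding 99/99/9999 and 99-99-9999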
date_pattern = r"\b(?!99[/-]99[/-]9999)(?:0[1-9]|[12][0-9]|3[01])[/-](?:0[1-9]|1[0-2])[/-](?:19|20)\d{2}\b"

valid_dates = re.findall(date_pattern, text)
print("Valid dates (excluding 99/99/9999 and 99-99-9999):", valid_dates)

Valid dates (excluding 99/99/9999 and 99-99-9999): ['06/09/2024', '07/09/2024']
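
The range-based pattern above still accepts impossible dates such as `31/02/2024`, because a regex cannot know how many days each month has. One possible alternative, sketched below, is to find candidates with a loose pattern and let `datetime.strptime` do the calendar check. The helper name `find_valid_dates` and the sample string are hypothetical and not part of the original notebook.

```python
import re
from datetime import datetime


def find_valid_dates(text):
    """Return DD/MM/YYYY strings from text that are real calendar dates."""
    valid = []
    for candidate in re.findall(r"\b\d{2}/\d{2}/\d{4}\b", text):
        try:
            # strptime raises ValueError for dates that do not exist.
            datetime.strptime(candidate, "%d/%m/%Y")
        except ValueError:
            continue
        valid.append(candidate)
    return valid


sample = "Dates: 99/99/9999, 31/02/2024, 06/09/2024 and 07/09/2024."
result = find_valid_dates(sample)
print(result)
```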


# Pandas

In [73]:
!pip install pandas

Collecting pandas
  Downloading pandas-2.3.2-cp313-cp313-win_amd64.whl.metadata (19 kB)
Collecting numpy>=1.26.0 (from pandas)
  Downloading numpy-2.3.3-cp313-cp313-win_amd64.whl.metadata (60 kB)
Collecting pytz>=2020.1 (from pandas)
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pandas-2.3.2-cp313-cp313-win_amd64.whl (11.0 MB)

In [84]:
import pandas as pd 
import numpy as np

In [75]:
### Series in pandas
# A Series is a one-dimensional, array-like object in pandas with a single set of row labels (its index).

In [93]:
# To create a Series, pass a list, tuple, array, or dictionary.

lst = ['pooja','Mansi','Sanket','Ritika','Rahul','Shruti','Raj']
ser1 = pd.Series(lst)
ser1

0     pooja
1     Mansi
2    Sanket
3    Ritika
4     Rahul
5    Shruti
6       Raj
dtype: object

In [94]:
print(ser1.index)
ser1.index=list("abcdefg")
print(ser1)

RangeIndex(start=0, stop=7, step=1)
a     pooja
b     Mansi
c    Sanket
d    Ritika
e     Rahul
f    Shruti
g       Raj
dtype: object
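
Once a Series has custom labels, values can be looked up by label or by position. A minimal sketch using a shortened copy of the names above (the variable names are illustrative):

```python
import pandas as pd

names = pd.Series(["pooja", "Mansi", "Sanket"], index=list("abc"))

by_label = names.loc["b"]          # look up by index label
by_position = names.iloc[1]        # look up by integer position
label_slice = names.loc["a":"b"]   # label slices include the end label
pos_slice = names.iloc[0:2]        # position slices exclude the end position

print(by_label, by_position)
print(list(label_slice), list(pos_slice))
```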


In [90]:
arr1= np.array(range(100,200,10))
seer1= pd.Series(arr1,index=list("ABCDEFGHIJ"))
seer1

A    100
B    110
C    120
D    130
E    140
F    150
G    160
H    170
I    180
J    190
dtype: int64

In [95]:
ser1

a     pooja
b     Mansi
c    Sanket
d    Ritika
e     Rahul
f    Shruti
g       Raj
dtype: object

In [98]:
dict1 = {"Pooja":"Pune","Manasi": "Mumbai","Tejal":"Delhi"}
ser1 = pd.Series(dict1)
ser1

Pooja       Pune
Manasi    Mumbai
Tejal      Delhi
dtype: object

In [99]:
ser1.rename({'Pooja':'Puja','Manasi':"a"},inplace=True)
ser1

Puja       Pune
a        Mumbai
Tejal     Delhi
dtype: object

In [100]:
print(ser1.values)
print(list(ser1.values))

['Pune' 'Mumbai' 'Delhi']
['Pune', 'Mumbai', 'Delhi']
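
Besides `.values`, a Series offers direct conversion methods. The sketch below rebuilds the original dictionary-based Series under the illustrative name `cities` and converts it back to plain Python objects:

```python
import pandas as pd

cities = pd.Series({"Pooja": "Pune", "Manasi": "Mumbai", "Tejal": "Delhi"})

as_list = cities.to_list()        # values as a Python list
labels = cities.index.to_list()   # index labels as a Python list
as_dict = cities.to_dict()        # label -> value mapping

print(as_list)
print(labels)
print(as_dict)
```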


# DataFrame

In [101]:
dict1 = {
    'Name': ['Tom', 'Jerry', 'Lee', 'Bob', 'Stock', 'Ben', 'Candy'],
    'Age': [23, 24, 12, 34, 33, 21, 45],
    'Marks': [90, 97, 85, 67, 89, 78, 70],
    'Location': ['Pune', 'Mumbai', 'Delhi', 'Banglore', 'Goa', 'Chennai', 'Nasik'],
    'Gender': ['M', 'M', 'F', 'M', 'M', 'M', 'F'],
}
df=pd.DataFrame(dict1)
df

    Name  Age  Marks  Location Gender
0    Tom   23     90      Pune      M
1  Jerry   24     97    Mumbai      M
2    Lee   12     85     Delhi      F
3    Bob   34     67  Banglore      M
4  Stock   33     89       Goa      M
5    Ben   21     78   Chennai      M
6  Candy   45     70     Nasik      F


In [102]:
df.shape

(7, 5)
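
The regex patterns from the first half of the notebook also work inside pandas. As a rough sketch on a four-row subset of the table above (the result names are illustrative), rows can be filtered with a boolean mask or with `Series.str.contains`, which treats its pattern as a regex by default:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Jerry", "Lee", "Bob"],
    "Marks": [90, 97, 85, 67],
    "Location": ["Pune", "Mumbai", "Delhi", "Banglore"],
})

# Boolean mask: keep rows where Marks is above 85.
high_scorers = df[df["Marks"] > 85]

# Regex filter: keep rows whose Location ends with the letter "e".
ends_with_e = df[df["Location"].str.contains(r"e$")]

print(high_scorers)
print(ends_with_e)
```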