
---

## Advanced Regular Expression Assignments

### Assignment 1: Extracting Phone Numbers

**Raw Text:** 
Extract all valid Pakistani phone numbers from a given text.

**Example:**
```
Text: Please contact me at 0301-1234567 or 042-35678901 for further details.
```



In [3]:
import re
Text = """Please contact me at 0301-1234567 or 042-35678901 for further details."""

pattern = r"[0-9]{3,4}-[0-9]{7,8}"
phone = re.findall(pattern, Text)
phone


['0301-1234567', '042-35678901']
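
The pattern above accepts any run of 3 to 4 digits followed by 7 to 8 digits, so it would also pick up `2345-6789012` from inside a CNIC such as `12345-6789012-3`. A stricter variant might look like the sketch below. The two number shapes it accepts (a mobile prefix starting with `03`, and a 3-digit area code starting with `0`) are assumptions made for illustration, not an official numbering specification, and `extract_phone_numbers` is a hypothetical helper name.

```python
import re

# Assumed formats (for illustration only):
#   mobile:   03XX-XXXXXXX    (4-digit prefix starting with 03, then 7 digits)
#   landline: 0XX-XXXXXXX(X)  (3-digit area code starting with 0, then 7 or 8 digits)
# The lookarounds stop a match from starting or ending inside a longer digit run.
PHONE_PATTERN = re.compile(r"(?<!\d)(?:03\d{2}-\d{7}|0\d{2}-\d{7,8})(?!\d)")

def extract_phone_numbers(text):
    """Return all substrings of text that match the assumed formats."""
    return PHONE_PATTERN.findall(text)

sample = "Call 0301-1234567 or 042-35678901, not 12345-6789012-3."
phones = extract_phone_numbers(sample)
print(phones)  # ['0301-1234567', '042-35678901']
```
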

### Assignment 2: Validating Email Addresses

**Raw Text:** 
Validate email addresses according to Pakistani domain extensions (.pk).

**Example:**
```
Text: Contact us at info@example.com or support@domain.pk for assistance.
```



In [18]:
import re
Text = """Contact us at info@example.com or support@domain.pk for assistance."""

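# Case-insensitive so that addresses such as Info@Domain.PK are also found;
# \b stops the match from ending inside a longer label such as ".pkg".
pattern = r"[a-z0-9._+-]+@[a-z0-9.-]+\.pk\b"
email = re.findall(pattern, Text, re.IGNORECASE)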
email

['support@domain.pk']
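
The task says "validate", which usually means checking one candidate string as a whole rather than searching inside a longer text. A minimal sketch of that using `re.fullmatch` follows. The helper name `is_pk_email` is hypothetical, and the character classes are a deliberate simplification rather than a complete address grammar.

```python
import re

# Simplified shape: local part, "@", one or more dot-terminated labels, then "pk".
PK_EMAIL = re.compile(r"[a-z0-9._+-]+@(?:[a-z0-9-]+\.)+pk", re.IGNORECASE)

def is_pk_email(address):
    """Return True if the whole string looks like an address under .pk."""
    return PK_EMAIL.fullmatch(address) is not None

for candidate in ["support@domain.pk", "Info@Uni.edu.pk", "info@example.com"]:
    print(candidate, is_pk_email(candidate))
```

Because `fullmatch` anchors both ends, trailing characters such as the `x` in `a@b.pkx` cause a rejection without needing explicit `^` and `$`.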

### Assignment 3: Extracting CNIC Numbers

**Raw Text:** 
Extract all Pakistani CNIC (Computerized National Identity Card) numbers from a given text.

**Example:**
```
Text: My CNIC is 12345-6789012-3 and another one is 34567-8901234-5.
```


In [24]:
import re
Text = """My CNIC is 12345-6789012-3 and another one is 34567-8901234-5."""

pattern = r"[0-9]{5}-[0-9]{7}-[0-9]{1}"
Cnic = re.findall(pattern, Text)
Cnic


['12345-6789012-3', '34567-8901234-5']


### Assignment 4: Identifying Urdu Words

**Raw Text:** 
Identify and extract Urdu words from a mixed English-Urdu text.

**Example:**
```
Text: یہ sentence میں کچھ English words بھی ہیں۔
```



In [28]:
Text= " یہ sentence میں کچھ English words بھی ہیں۔"
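# Letters and combining marks only. The wider \u0600-\u06FF range also covers
# punctuation such as the full stop (U+06D4), which would stay attached to words.
urdu_word_pattern = r'[\u0621-\u065F\u066E-\u06D3]+'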
urdu_words = re.findall(urdu_word_pattern,Text)
urdu_words

['یہ', 'میں', 'کچھ', 'بھی', 'ہیں']

### Assignment 5: Finding Dates

**Raw Text:** 
Find and extract dates in the format DD-MM-YYYY from a given text.

**Example:**
```
Text: The event will take place on 15-08-2023 and 23-09-2023.
```



In [39]:
Text= "The event will take place on 15-08-2023 and 23-09-2023."

pattern = r"\b\d{2}-\d{2}-\d{4}\b"  # exactly DD-MM-YYYY, hyphen-separated
date = re.findall(pattern, Text)
date


['15-08-2023', '23-09-2023']
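
A regular expression can only check the shape of a date, so strings like `31-02-2023` still match. One way to filter those out is to pass each candidate through `datetime.strptime`, as in the sketch below. The helper name `extract_valid_dates` and the sample sentence are made up for illustration.

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")

def extract_valid_dates(text):
    """Return date objects for DD-MM-YYYY strings that are real calendar dates."""
    dates = []
    for candidate in DATE_PATTERN.findall(text):
        try:
            dates.append(datetime.strptime(candidate, "%d-%m-%Y").date())
        except ValueError:
            # The shape matched, but the day or month is out of range.
            continue
    return dates

sample = "Valid: 15-08-2023 and 23-09-2023. Invalid: 31-02-2023, 10-13-2023."
print(extract_valid_dates(sample))
```
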

### Assignment 6: Extracting URLs

**Raw Text:** 
Extract all URLs from a text that belong to Pakistani domains.

**Example:**
```
Text: Visit http://www.example.pk or https://website.com.pk for more information.
```



In [66]:
Text= "Visit http://www.example.pk or https://website.com.pk for more information."

# \b after "pk" prevents a match from ending inside a longer label such as ".pkg".
pattern = r'https?://[a-zA-Z0-9.-]+\.pk\b'
website = re.findall(pattern, Text)
website


['http://www.example.pk', 'https://website.com.pk']

### Assignment 7: Analyzing Currency

**Raw Text:** 
Extract and analyze currency amounts in Pakistani Rupees (PKR) from a given text.

**Example:**
```
Text: The product costs PKR 1500, while the deluxe version is priced at Rs. 2500.
```



In [67]:
Text = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."

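# Accept both the "PKR" and "Rs." prefixes; the group captures only the number.
pattern = r'\b(?:PKR|Rs\.?)\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)'
Price = re.findall(pattern, Text)
Price



['1500', '2500']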
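
The task also asks to analyze the amounts, which requires numbers rather than strings. The sketch below accepts both the `PKR` and `Rs.` prefixes, strips thousands separators, and converts each match to `float` so that totals and maxima can be computed. The helper name `extract_amounts` is hypothetical, and the two prefixes are the only notations assumed.

```python
import re

# Assumed prefixes: "PKR" or "Rs"/"Rs." immediately before the number.
AMOUNT_PATTERN = re.compile(r"\b(?:PKR|Rs\.?)\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)")

def extract_amounts(text):
    """Return the amounts in text as floats, with thousands separators removed."""
    return [float(raw.replace(",", "")) for raw in AMOUNT_PATTERN.findall(text)]

sample = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."
amounts = extract_amounts(sample)
print(amounts)                     # [1500.0, 2500.0]
print(sum(amounts), max(amounts))  # 4000.0 2500.0
```

For real prices, `decimal.Decimal` would avoid floating-point rounding; `float` is used here only to keep the sketch short.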

### Assignment 8: Removing Punctuation

**Raw Text:** 
Remove all punctuation marks from a text while preserving Urdu characters.

**Example:**
```
Text: کیا! آپ, یہاں؟
```



In [93]:
Text= "کیا! آپ, یہاں؟"

# Punctuation only. The hamza (ء) is a letter, so it must not be listed here.
pattern = r'[!,؛،؟۔‘’“”]+'

urdu = re.sub(pattern, '', Text)
urdu


'کیا آپ یہاں'
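
Hand-listing punctuation characters is easy to get wrong, for example by including a letter such as the hamza (ء) or missing a mark. An alternative sketch is to drop every character whose Unicode general category starts with `P`, which covers both ASCII and Arabic-script punctuation while leaving letters alone. The helper name `strip_punctuation` is hypothetical.

```python
import unicodedata

def strip_punctuation(text):
    """Drop every character whose Unicode general category starts with 'P'."""
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P")
    )

sample = "کیا! آپ, یہاں؟"
cleaned = strip_punctuation(sample)
print(cleaned)
```

This approach does not use a regular expression at all, so it is a swap of technique rather than a refinement of the pattern above.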

### Assignment 9: Extracting City Names

**Raw Text:** 
Extract names of Pakistani cities from a given text.

**Example:**
```
Text: Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan.
```


In [94]:
Text = "Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan."
city_names = ["Lahore", "Karachi", "Islamabad", "Peshawar", "Quetta", "Faisalabad", "Rawalpindi", "Multan", "Gujranwala", "Sialkot", "Sargodha", "Bahawalpur", "Sukkur", "Larkana", "Sheikhupura", "Jhang", "Rahim Yar Khan"]
pattern = r'\b(?:' + '|'.join(re.escape(city) for city in city_names) + r')\b'
Names = re.findall(pattern, Text)
Names


['Lahore', 'Karachi', 'Islamabad', 'Peshawar']
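
Real text often varies the capitalization of city names. The sketch below matches case-insensitively, maps each hit back to its canonical spelling, and counts mentions with `collections.Counter`. The shortened city list, the sample sentence, and the helper name `count_cities` are illustrative assumptions.

```python
import re
from collections import Counter

CITY_NAMES = ["Lahore", "Karachi", "Islamabad", "Peshawar", "Rahim Yar Khan"]

# Longest names first, so a multi-word name is tried before any shorter alternative.
_alternatives = "|".join(
    re.escape(city) for city in sorted(CITY_NAMES, key=len, reverse=True)
)
CITY_PATTERN = re.compile(rf"\b(?:{_alternatives})\b", re.IGNORECASE)
_CANONICAL = {city.lower(): city for city in CITY_NAMES}

def count_cities(text):
    """Count city mentions, ignoring case, keyed by canonical spelling."""
    return Counter(_CANONICAL[m.lower()] for m in CITY_PATTERN.findall(text))

sample = "Lahore and karachi are large; LAHORE is older than Rahim Yar Khan."
counts = count_cities(sample)
print(counts)
```
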


### Assignment 10: Analyzing Vehicle Numbers

**Raw Text:** 
Identify and extract Pakistani vehicle registration numbers (e.g., ABC-123) from a text.

**Example:**
```
Text: I saw a car with the number plate LEA-567 near the market.
```



In [99]:
Text = "I saw a car with the number plate LEA-567 near the market."

pattern = r"\b[a-z]{3}-[0-9]{3}\b"
plates = re.findall(pattern, Text, re.IGNORECASE)
plates


['LEA-567']