---

## Advanced Regular Expression Assignments

### Assignment 1: Extracting Phone Numbers

**Raw Text:** 
Extract all valid Pakistani phone numbers from a given text.

**Example:**
```
Text: Please contact me at 0301-1234567 or 042-35678901 for further details.
```



In [7]:
import re

txt = "Please contact me at 0301-1234567 or 042-35678901 for further details."

# Leading 0, two or three more code digits, a hyphen, then 7-8 subscriber digits
pattern = r'\b0\d{2,3}-\d{7,8}\b'
phone_numbers = re.findall(pattern, txt)
phone_numbers



['0301-1234567', '042-35678901']
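
Once numbers are extracted, it is often useful to know which kind each one is. The following is a minimal sketch, assuming mobile numbers look like `03XX-XXXXXXX` and landlines like `0XX-XXXXXXXX` (the two shapes in the example text); the group names `mobile` and `landline` and the helper `classify_phone_numbers` are illustrative, and real numbering plans have more variants than this covers.

```python
import re

# Assumed shapes: mobile = 03XX-XXXXXXX, landline = 0XX-XXXXXXXX.
PHONE_RE = re.compile(
    r'\b(?P<mobile>03\d{2}-\d{7})\b|\b(?P<landline>0\d{2}-\d{8})\b'
)

def classify_phone_numbers(text):
    """Return (kind, number) pairs for every phone number found in text."""
    results = []
    for match in PHONE_RE.finditer(text):
        kind = match.lastgroup  # name of the alternative that matched
        results.append((kind, match.group(kind)))
    return results

sample = "Please contact me at 0301-1234567 or 042-35678901 for further details."
print(classify_phone_numbers(sample))
# [('mobile', '0301-1234567'), ('landline', '042-35678901')]
```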

### Assignment 2: Validating Email Addresses

**Raw Text:** 
Validate email addresses according to Pakistani domain extensions (.pk).

**Example:**
```
Text: Contact us at info@example.com or support@domain.pk for assistance.
```



In [15]:
import re

txt = "Contact us at info@example.com or support@domain.pk for assistance."

# Match only addresses whose domain ends in .pk
pattern = r'\b[\w.-]+@[\w.-]+\.pk\b'
email_address = re.findall(pattern, txt)
email_address


['support@domain.pk']
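
`re.findall` searches inside a longer text. To validate a single address on its own, `re.fullmatch` is a better fit because the whole string must match. This is a rough sketch with a deliberately simplified address shape, not a full email validator; `is_pk_email` is a hypothetical helper and `sales@shop.com.pk` is an invented test address.

```python
import re

# Simplified shape: local part, "@", one or more domain labels, then ".pk".
PK_EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.pk', re.IGNORECASE)

def is_pk_email(address):
    """Return True if the whole string looks like an address on a .pk domain."""
    return PK_EMAIL_RE.fullmatch(address) is not None

for candidate in ["info@example.com", "support@domain.pk", "sales@shop.com.pk"]:
    print(candidate, is_pk_email(candidate))
```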

### Assignment 3: Extracting CNIC Numbers

**Raw Text:** 
Extract all Pakistani CNIC (Computerized National Identity Card) numbers from a given text.

**Example:**
```
Text: My CNIC is 12345-6789012-3 and another one is 34567-8901234-5.
```


In [17]:
import re

txt = "My CNIC is 12345-6789012-3 and another one is 34567-8901234-5"
pattern = r'\b\d{5}-\d{7}-\d{1}\b'
CNIC = re.findall(pattern,txt)
CNIC


['12345-6789012-3', '34567-8901234-5']
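
If the three parts of each CNIC are needed separately, named groups keep the code readable. A small sketch along the same lines as the pattern above; the group names `prefix`, `body` and `suffix` are neutral labels chosen for illustration, not official field names.

```python
import re

# Same 5-7-1 digit layout, with each part captured under an illustrative name.
CNIC_RE = re.compile(r'\b(?P<prefix>\d{5})-(?P<body>\d{7})-(?P<suffix>\d)\b')

def parse_cnics(text):
    """Return one dict per CNIC found, keyed by the group names above."""
    return [match.groupdict() for match in CNIC_RE.finditer(text)]

sample = "My CNIC is 12345-6789012-3 and another one is 34567-8901234-5."
for parts in parse_cnics(sample):
    print(parts)
```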


### Assignment 4: Identifying Urdu Words

**Raw Text:** 
Identify and extract Urdu words from a mixed English-Urdu text.

**Example:**
```
Text: یہ sentence میں کچھ English words بھی ہیں۔
```



In [33]:
import re

txt = "یہ sentence میں کچھ English words بھی ہیں۔"

# U+0621 to U+06D3 covers Urdu letters but stops before the full stop ۔ (U+06D4)
pattern = r'[\u0621-\u06D3]+'
urdu_txt = re.findall(pattern, txt)
urdu_txt


['یہ', 'میں', 'کچھ', 'بھی', 'ہیں']
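
Mixed text can also be split in both directions at once. The sketch below assumes Urdu words fall in the range U+0621 to U+06D3 and English words use only ASCII letters; `split_by_script` is a hypothetical helper name.

```python
import re

URDU_WORD_RE = re.compile(r'[\u0621-\u06D3]+')   # assumed Urdu letter range
ENGLISH_WORD_RE = re.compile(r'[A-Za-z]+')       # ASCII letters only

def split_by_script(text):
    """Return (urdu_words, english_words) found in a mixed-script string."""
    return URDU_WORD_RE.findall(text), ENGLISH_WORD_RE.findall(text)

sample = "یہ sentence میں کچھ English words بھی ہیں۔"
urdu_words, english_words = split_by_script(sample)
print(english_words)    # ['sentence', 'English', 'words']
print(len(urdu_words))  # 5
```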

### Assignment 5: Finding Dates

**Raw Text:** 
Find and extract dates in the format DD-MM-YYYY from a given text.

**Example:**
```
Text: The event will take place on 15-08-2023 and 23-09-2023.
```



In [27]:
import re
txt = "the event will take place on 15-08-2023 and 23-09-2023."
# DD-MM-YYYY: two-digit day, two-digit month, four-digit year
pattern = r'\b\d{2}-\d{2}-\d{4}\b'
find_dates = re.findall(pattern,txt)
find_dates



['15-08-2023', '23-09-2023']
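
A regular expression checks only the shape of a date, so a string such as `31-02-2023` would still match. One possible follow-up, sketched here, is to pass each candidate through `datetime.strptime` and keep only real calendar dates. The helper name `extract_valid_dates` and the extra invalid date in the sample are additions for illustration.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r'\b\d{2}-\d{2}-\d{4}\b')

def extract_valid_dates(text):
    """Return date objects for DD-MM-YYYY strings that are real calendar dates."""
    dates = []
    for candidate in DATE_RE.findall(text):
        try:
            dates.append(datetime.strptime(candidate, "%d-%m-%Y").date())
        except ValueError:
            pass  # right shape, but not a real date (e.g. 31-02-2023)
    return dates

sample = "The event will take place on 15-08-2023 and 23-09-2023, not on 31-02-2023."
dates = extract_valid_dates(sample)
print([d.isoformat() for d in dates])  # ['2023-08-15', '2023-09-23']
```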

### Assignment 6: Extracting URLs

**Raw Text:** 
Extract all URLs from a text that belong to Pakistani domains.

**Example:**
```
Text: Visit http://www.example.pk or https://website.com.pk for more information.
```



In [32]:
import re
txt = "Visit http://www.example.pk or https://website.com.pk for more information."
# Scheme (http or https), then a host name ending in .pk
pattern = r'https?://[\w.-]+\.pk\b'
Url = re.findall(pattern, txt)
Url

['http://www.example.pk', 'https://website.com.pk']
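
When URLs may carry paths or query strings, checking the host name is more reliable than matching `.pk` anywhere in the string. This is a sketch of that idea using the standard library's `urllib.parse.urlparse`; the helper `pk_urls` and the choice of trailing punctuation to strip are assumptions.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r'https?://\S+')

def pk_urls(text):
    """Return URLs from text whose host name ends in .pk."""
    urls = []
    for raw in URL_RE.findall(text):
        url = raw.rstrip('.,;:!?)')          # drop trailing sentence punctuation
        host = urlparse(url).hostname or ''
        if host.endswith('.pk'):
            urls.append(url)
    return urls

sample = "Visit http://www.example.pk or https://website.com.pk for more information."
print(pk_urls(sample))
# ['http://www.example.pk', 'https://website.com.pk']
```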

### Assignment 7: Analyzing Currency

**Raw Text:** 
Extract and analyze currency amounts in Pakistani Rupees (PKR) from a given text.

**Example:**
```
Text: The product costs PKR 1500, while the deluxe version is priced at Rs. 2500.
```



In [37]:
import re
Txt = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."
# Require a PKR or Rs. prefix and capture only the amount that follows it
pattern = r'(?:PKR|Rs\.?)\s*(\d+(?:,\d{3})*)'
currency = re.findall(pattern,Txt)
currency

['1500', '2500']
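
The task also asks to analyze the amounts, which requires numbers rather than strings. A minimal sketch, assuming amounts are whole rupees written after `PKR` or `Rs.` with optional thousands separators; `extract_amounts` is a hypothetical helper.

```python
import re

# A PKR or Rs. prefix, optional whitespace, then digits with optional commas.
AMOUNT_RE = re.compile(r'(?:PKR|Rs\.?)\s*(\d+(?:,\d{3})*)')

def extract_amounts(text):
    """Return the rupee amounts found in text as integers."""
    return [int(raw.replace(',', '')) for raw in AMOUNT_RE.findall(text)]

sample = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."
amounts = extract_amounts(sample)
print(amounts)                      # [1500, 2500]
print(sum(amounts))                 # 4000
print(max(amounts) - min(amounts))  # 1000
```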

### Assignment 8: Removing Punctuation

**Raw Text:** 
Remove all punctuation marks from a text while preserving Urdu characters.

**Example:**
```
Text: کیا! آپ, یہاں؟
```



In [17]:
import re
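txt = "کیا! آپ, یہاں؟"

# ASCII punctuation plus the Urdu comma, semicolon, question mark and full stop
pattern = r'[!-/:-@\[-`{-~\u060C\u061B\u061F\u06D4]'

# re.sub deletes the punctuation and leaves the words and spaces intact
cleaned_txt = re.sub(pattern, '', txt)
cleaned_txt

'کیا آپ یہاں'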
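
An alternative that avoids listing punctuation by hand is to use Unicode categories instead of a regular expression. The sketch below drops every character whose category starts with `P` (punctuation), which covers both ASCII and Urdu marks; `strip_punctuation` is a hypothetical helper name.

```python
import unicodedata

def strip_punctuation(text):
    """Remove every character that Unicode classifies as punctuation."""
    return ''.join(
        ch for ch in text
        if not unicodedata.category(ch).startswith('P')
    )

sample = "کیا! آپ, یہاں؟"
cleaned = strip_punctuation(sample)
print(cleaned)
```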

### Assignment 9: Extracting City Names

**Raw Text:** 
Extract names of Pakistani cities from a given text.

**Example:**
```
Text: Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan.
```


In [27]:
import re
txt = "Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan."
# Captures four words in an "A, B, C, and D" list; it relies on this sentence shape
pattern = r'\b(\w+),?\s(\w+),?\s(\w+),?\sand\s(\w+)\b'
cities_names = re.findall(pattern, txt)
cities_names

[('Lahore', 'Karachi', 'Islamabad', 'Peshawar')]
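
A pattern tied to one sentence shape will miss cities mentioned in any other way. A more general option, sketched here, is to build an alternation from a list of known names. The list `KNOWN_CITIES` is a short illustrative sample, not a complete gazetteer, and `find_cities` is a hypothetical helper.

```python
import re

# Illustrative sample only; extend with the cities you need.
KNOWN_CITIES = ["Lahore", "Karachi", "Islamabad", "Peshawar", "Quetta", "Multan"]

CITY_RE = re.compile(
    r'\b(?:' + '|'.join(map(re.escape, KNOWN_CITIES)) + r')\b',
    re.IGNORECASE,
)

def find_cities(text):
    """Return known city names in order of appearance, in title case."""
    return [match.group(0).title() for match in CITY_RE.finditer(text)]

sample = "Lahore, karachi, Islamabad, and Peshawar are major cities of Pakistan."
print(find_cities(sample))
# ['Lahore', 'Karachi', 'Islamabad', 'Peshawar']
```

Because the match is case-insensitive, a lowercase `karachi` is still found and then normalized to `Karachi`.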


### Assignment 10: Analyzing Vehicle Numbers

**Raw Text:** 
Identify and extract Pakistani vehicle registration numbers (e.g., ABC-123) from a text.

**Example:**
```
Text: I saw a car with the number plate LEA-567 near the market.
```



In [66]:
import re
Txt = "I saw a car with the number plate LEA-567 near the market."
# Three capital letters, a hyphen, then three digits (the ABC-123 format)
pattern = r'\b[A-Z]{3}-\d{3}\b'
reg_num = re.findall(pattern, Txt)
reg_num

['LEA-567']
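
Plates are not always written with a hyphen. The following sketch accepts a hyphen, a space, or no separator, and rewrites each match in the `ABC-123` style. The accepted shapes (two or three letters, three or four digits) and the extra plates in the sample are assumptions for illustration, not an official specification.

```python
import re

# Assumed shapes: 2-3 capital letters, optional "-" or space, 3-4 digits.
PLATE_RE = re.compile(r'\b([A-Z]{2,3})[- ]?(\d{3,4})\b')

def normalize_plates(text):
    """Return plates found in text, rewritten as LETTERS-DIGITS."""
    return [f"{letters}-{digits}" for letters, digits in PLATE_RE.findall(text)]

sample = "I saw LEA-567 near the market, then LEB 1234 and ABC789 on the road."
print(normalize_plates(sample))
# ['LEA-567', 'LEB-1234', 'ABC-789']
```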