---

## Advanced Regular Expression Assignments

### Assignment 1: Extracting Phone Numbers

**Raw Text:** 
Extract all valid Pakistani phone numbers from a given text.

**Example:**
```
Text: Please contact me at 0301-1234567 or 042-35678901 for further details.
```



In [6]:
import re

text = "Please contact me at 0301-1234567 or 042-35678901 for further details."
# A 3- or 4-digit prefix starting with 0, a hyphen, then 7 or 8 digits.
pattern = r"\b0[0-9]{2,3}-[0-9]{7,8}\b"
result = re.findall(pattern, text)
result


['0301-1234567', '042-35678901']

### Assignment 2: Validating Email Addresses

**Raw Text:** 
Validate email addresses according to Pakistani domain extensions (.pk).

**Example:**
```
Text: Contact us at info@example.com or support@domain.pk for assistance.
```



In [1]:
import re

text = "Contact us at info@example.com or support@domain.pk for assistance."
# Escape the dot so that only a literal ".pk" suffix matches.
pattern = r"[a-z0-9._-]+@[a-z0-9.-]+\.pk\b"
result = re.findall(pattern, text)
result

['support@domain.pk']
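
The cell above extracts addresses from running text, while the task asks for validation. A minimal sketch of a validator follows, assuming a simplified rule set (letters, digits and a few symbols in the local part, and a domain ending in `.pk`). The name `is_pk_email` is hypothetical, and real-world email syntax is broader than this pattern.

```python
import re

# Assumed, simplified rule set; not a full implementation of email syntax.
PK_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.pk")

def is_pk_email(address):
    """Return True if the whole string looks like an email on a .pk domain."""
    # fullmatch requires the entire string to match, not just a substring.
    return PK_EMAIL.fullmatch(address) is not None

print(is_pk_email("support@domain.pk"))   # True
print(is_pk_email("info@example.com"))    # False
print(is_pk_email("sales@shop.com.pk"))   # True
```

Using `re.fullmatch` instead of `re.findall` rejects strings that merely contain a valid address somewhere inside them.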

### Assignment 3: Extracting CNIC Numbers

**Raw Text:** 
Extract all Pakistani CNIC (Computerized National Identity Card) numbers from a given text.

**Example:**
```
Text: My CNIC is 12345-6789012-3 and another one is 34567-8901234-5.
```


In [4]:
import re

text = "My CNIC is 12345-6789012-3 and another one is 34567-8901234-5."
# CNIC layout: 5 digits, 7 digits, then 1 digit, separated by hyphens.
pattern = r"\b[0-9]{5}-[0-9]{7}-[0-9]\b"
result = re.findall(pattern, text)
result

['12345-6789012-3', '34567-8901234-5']


### Assignment 4: Identifying Urdu Words

**Raw Text:** 
Identify and extract Urdu words from a mixed English-Urdu text.

**Example:**
```
Text: یہ sentence میں کچھ English words بھی ہیں۔
```



In [2]:
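import re

text = " یہ sentence میں کچھ English words بھی ہیں۔"
# Match runs of Arabic-script characters, excluding the comma (U+060C),
# semicolon (U+061B), question mark (U+061F) and full stop (U+06D4).
pattern = r"(?:(?![\u060c\u061b\u061f\u06d4])[\u0600-\u06ff])+"
result = re.findall(pattern, text)
result

['یہ', 'میں', 'کچھ', 'بھی', 'ہیں']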

### Assignment 5: Finding Dates

**Raw Text:** 
Find and extract dates in the format DD-MM-YYYY from a given text.

**Example:**
```
Text: The event will take place on 15-08-2023 and 23-09-2023.
```



In [5]:
import re

text = "The event will take place on 15-08-2023 and 23-09-2023."
# DD-MM-YYYY: exactly 2 digits, 2 digits and 4 digits.
pattern = r"\b[0-9]{2}-[0-9]{2}-[0-9]{4}\b"
result = re.findall(pattern, text)
result


['15-08-2023', '23-09-2023']
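
A digit-count pattern accepts any string shaped like a date, including impossible ones such as 31-02-2023. One way to filter those out, sketched here under the assumption that only real calendar dates should be kept, is to pass each candidate through `datetime.strptime`. The helper name `find_valid_dates` and the sample sentence are illustrative.

```python
import re
from datetime import datetime

def find_valid_dates(text):
    """Return DD-MM-YYYY strings from text that are real calendar dates."""
    candidates = re.findall(r"\b[0-9]{2}-[0-9]{2}-[0-9]{4}\b", text)
    valid = []
    for candidate in candidates:
        try:
            # strptime raises ValueError for dates such as 31-02-2023.
            datetime.strptime(candidate, "%d-%m-%Y")
        except ValueError:
            continue
        valid.append(candidate)
    return valid

sample = "Valid: 15-08-2023. Invalid: 31-02-2023 and 12-13-2023."
print(find_valid_dates(sample))  # ['15-08-2023']
```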

### Assignment 6: Extracting URLs

**Raw Text:** 
Extract all URLs from a text that belong to Pakistani domains.

**Example:**
```
Text: Visit http://www.example.pk or https://website.com.pk for more information.
```



In [5]:
import re

text = "Visit http://www.example.pk or https://website.com.pk for more information."
# Scheme, optional "www.", one or more domain labels, then a literal ".pk".
pattern = r"https?://(?:www\.)?[a-z0-9-]+(?:\.[a-z]+)*\.pk\b"
result = re.findall(pattern, text)
result

['http://www.example.pk', 'https://website.com.pk']
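
An alternative to encoding the whole URL grammar in one pattern is to grab anything that starts with a scheme and then let `urllib.parse.urlparse` decide whether the host ends in `.pk`. The sketch below assumes that trailing sentence punctuation should be stripped; the function name `pk_urls` and the sample text are hypothetical.

```python
import re
from urllib.parse import urlparse

def pk_urls(text):
    """Return http(s) URLs in text whose hostname ends in .pk."""
    result = []
    for url in re.findall(r"https?://\S+", text):
        # Drop punctuation that belongs to the sentence, not the URL.
        url = url.rstrip(".,;:!?")
        host = urlparse(url).hostname or ""
        if host.endswith(".pk"):
            result.append(url)
    return result

sample = "Visit http://www.example.pk, https://website.com.pk/about or https://example.com."
print(pk_urls(sample))
# ['http://www.example.pk', 'https://website.com.pk/about']
```

This keeps URLs that carry a path, which the single-pattern approach would truncate at `.pk`.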

### Assignment 7: Analyzing Currency

**Raw Text:** 
Extract and analyze currency amounts in Pakistani Rupees (PKR) from a given text.

**Example:**
```
Text: The product costs PKR 1500, while the deluxe version is priced at Rs. 2500.
```



In [7]:
import re

text = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."
# Raw string keeps \s intact; the dot in "Rs." is escaped to match literally.
pattern = r"(?:PKR|Rs\.)\s[0-9]+"
result = re.findall(pattern, text)
result

['PKR 1500', 'Rs. 2500']
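
The task also asks for analysis, which needs the amounts as numbers rather than strings. A short sketch follows, assuming whole-rupee amounts without thousands separators: a capture group returns only the digits, which are then converted to integers.

```python
import re

text = "The product costs PKR 1500, while the deluxe version is priced at Rs. 2500."

# With one capture group, findall returns only the captured digits.
amounts = [int(value) for value in re.findall(r"(?:PKR|Rs\.)\s*([0-9]+)", text)]

print(amounts)        # [1500, 2500]
print(sum(amounts))   # 4000
print(max(amounts))   # 2500
```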

### Assignment 8: Removing Punctuation

**Raw Text:** 
Remove all punctuation marks from a text while preserving Urdu characters.

**Example:**
```
Text: کیا! آپ, یہاں؟
```



In [22]:
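import re

text = " کیا! آپ, یہاں؟"
# Common Latin marks plus the Urdu comma (U+060C), semicolon (U+061B),
# question mark (U+061F) and full stop (U+06D4).
pattern = r"[!,.;:?\u060c\u061b\u061f\u06d4]"
result = re.sub(pattern, "", text)
result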

' کیا آپ یہاں'
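
Listing punctuation marks by hand misses any mark that was not anticipated. A more general sketch, assuming that "punctuation" means every character whose Unicode general category starts with `P`, uses the standard `unicodedata` module. The function name `strip_punctuation` is hypothetical.

```python
import unicodedata

def strip_punctuation(text):
    """Remove every character whose Unicode category is a punctuation class."""
    # Categories Pc, Pd, Ps, Pe, Pi, Pf and Po all start with "P".
    # Urdu letters (Lo) and diacritics (Mn) are left untouched.
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )

text = " کیا! آپ, یہاں؟"
print(strip_punctuation(text))
```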

### Assignment 9: Extracting City Names

**Raw Text:** 
Extract names of Pakistani cities from a given text.

**Example:**
```
Text: Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan.
```


In [14]:
text="Lahore, Karachi, Islamabad, and Peshawar are major cities of Pakistan."
import re
pattern=r"(?!Pakistan)[A-Z][a-z]+"
result=re.findall(pattern,text)
result

['Lahore', 'Karachi', 'Islamabad', 'Peshawar']
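
Matching every capitalized word works for this sentence but would also return words such as "The" at the start of a sentence. A sketch of a stricter approach, assuming a known list of city names is available, builds an alternation from that list. `KNOWN_CITIES` below is a short illustrative list, not a complete gazetteer.

```python
import re

# Illustrative list only; extend it with whatever cities the data requires.
KNOWN_CITIES = ["Lahore", "Karachi", "Islamabad", "Peshawar", "Quetta", "Multan"]

# re.escape guards against names that contain regex metacharacters.
city_pattern = re.compile(
    r"\b(?:" + "|".join(map(re.escape, KNOWN_CITIES)) + r")\b"
)

text = "The team flew from Karachi to Lahore, then drove to Islamabad in Pakistan."
print(city_pattern.findall(text))  # ['Karachi', 'Lahore', 'Islamabad']
```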


### Assignment 10: Analyzing Vehicle Numbers

**Raw Text:** 
Identify and extract Pakistani vehicle registration numbers (e.g., ABC-123) from a text.

**Example:**
```
Text: I saw a car with the number plate LEA-567 near the market.
```



In [17]:
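import re

text = "I saw a car with the number plate LEA-567 near the market."
# Uppercase letters, a hyphen, then digits, bounded so that fragments of
# longer tokens are not matched.
pattern = r"\b[A-Z]+-[0-9]+\b"
result = re.findall(pattern, text)
result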

['LEA-567']