## RegEx Part 3
### 📞 Extracting Phone Numbers with Regular Expressions (RegEx)

In real-world text data, phone numbers often appear in many formats: with or without country codes, separated by dashes, or mixed with other text.  
Python's `re` module allows us to **extract all such patterns** using a few simple expressions.

---

#### 🧩 Example Code

In [1]:
import re

# Sample multi-line text data
data = """nasrin +9122034 1234-5678-9872-2341
bita +9123039 @bitabita
jadi +9031415 @jadijadi 6221-0610-1111-2222
sina 9876-9383-1234-4321"""

# ------------------------------------------
# Find all international-style numbers (+countrycode + digits)
# ------------------------------------------
phones = re.findall(r"\+\d+", data)
print(phones)
# Output: ['+9122034', '+9123039', '+9031415']

# ------------------------------------------
# Find all formatted phone numbers like 1234-5678-9872-2341
# ------------------------------------------
formatted = re.findall(r"\d{4}-\d{4}-\d{4}-\d{4}", data)
print(formatted)
# Output: ['1234-5678-9872-2341', '6221-0610-1111-2222', '9876-9383-1234-4321']

# ------------------------------------------
# Match both +country codes and dashed phone numbers
# ------------------------------------------
all_numbers = re.findall(r"\+\d+|\d{4}-\d{4}-\d{4}-\d{4}", data)
print(all_numbers)
# Output (matches come back in the order they appear in the text):
# ['+9122034', '1234-5678-9872-2341', '+9123039', '+9031415', '6221-0610-1111-2222', '9876-9383-1234-4321']

['+9122034', '+9123039', '+9031415']
['1234-5678-9872-2341', '6221-0610-1111-2222', '9876-9383-1234-4321']
['+9122034', '1234-5678-9872-2341', '+9123039', '+9031415', '6221-0610-1111-2222', '9876-9383-1234-4321']
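
Note that the combined pattern returns matches in the order they occur in the text, not grouped by format. The sketch below shows one possible way to separate the two kinds afterwards. The names `pattern`, `plus_numbers`, and `dashed_numbers` are illustrative and not part of the original example, and using `re.compile()` is just an optional convenience.

```python
import re

data = """nasrin +9122034 1234-5678-9872-2341
bita +9123039 @bitabita
jadi +9031415 @jadijadi 6221-0610-1111-2222
sina 9876-9383-1234-4321"""

# Compile once so the same pattern can be reused (optional)
pattern = re.compile(r"\+\d+|\d{4}-\d{4}-\d{4}-\d{4}")

# findall scans left to right, so results follow the order of the text
all_numbers = pattern.findall(data)

# Split the mixed list by shape (illustrative names)
plus_numbers = [n for n in all_numbers if n.startswith("+")]
dashed_numbers = [n for n in all_numbers if "-" in n]

print(plus_numbers)
print(dashed_numbers)
```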


## 🔍 Using `re.finditer()` to Access Matches and Groups

The **`re.finditer()`** function returns an **iterator** that yields detailed match objects.  
Each match object contains:
- The **exact text matched**
- The **position (span)** in the input
- The ability to extract **specific parts** using `.group()`

---

### 🧩 Example 1: Basic `finditer()` with `match` objects

In [2]:
import re

data = """nasrin +9122034 1234-5678-9872-2341
bita +9123039 @bitabita
jadi +9031415 @jadijadi 6221-0610-1111-2222
sina 9876-9383-1234-4321"""

# Find all formatted phone numbers
matches = re.finditer(r"(\d{4})-\d{4}-\d{4}-\d{4}", data)

for match in matches:
    print(match)

<re.Match object; span=(16, 35), match='1234-5678-9872-2341'>
<re.Match object; span=(84, 103), match='6221-0610-1111-2222'>
<re.Match object; span=(109, 128), match='9876-9383-1234-4321'>


### 🧩 Example 2: Getting the Matched Text Only

Instead of printing the entire Match object,
you can print just the matched text using `.group()`:

In [3]:
matches = re.finditer(r"(\d{4})-\d{4}-\d{4}-\d{4}", data)

for match in matches:
    print(match.group())

1234-5678-9872-2341
6221-0610-1111-2222
9876-9383-1234-4321
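
One detail worth keeping in mind: the iterator returned by `re.finditer()` can only be consumed once, which is why each cell above calls `re.finditer()` again. The sketch below illustrates this behavior; the names `first_pass`, `second_pass`, and `saved` are illustrative.

```python
import re

data = """nasrin +9122034 1234-5678-9872-2341
bita +9123039 @bitabita
jadi +9031415 @jadijadi 6221-0610-1111-2222
sina 9876-9383-1234-4321"""

matches = re.finditer(r"\d{4}-\d{4}-\d{4}-\d{4}", data)

# The first loop consumes the iterator
first_pass = [m.group() for m in matches]

# A second loop over the same iterator finds nothing left
second_pass = [m.group() for m in matches]

# To reuse the matches, store them in a list first
saved = list(re.finditer(r"\d{4}-\d{4}-\d{4}-\d{4}", data))

print(len(first_pass), len(second_pass), len(saved))
```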


### 🧩 Example 3: Using Capture Groups `()` to Get a Specific Part

Wrapping part of the pattern in parentheses `()` creates a capture group.
`match.group(1)` then returns only that part: here, the first block of 4 digits of each number.

In [4]:
matches = re.finditer(r"(\d{4})-\d{4}-\d{4}-\d{4}", data)

for match in matches:
    print(match.group(1))


1234
6221
9876
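
The same idea extends to several groups. The sketch below uses a hypothetical variant of the pattern in which every block is captured, and it also shows how `re.findall()` behaves when the pattern contains exactly one capture group. All variable names are illustrative.

```python
import re

data = """nasrin +9122034 1234-5678-9872-2341
bita +9123039 @bitabita
jadi +9031415 @jadijadi 6221-0610-1111-2222
sina 9876-9383-1234-4321"""

# Variant of the pattern with all four blocks captured (an assumption
# for illustration; the examples above capture only the first block)
pattern = r"(\d{4})-(\d{4})-(\d{4})-(\d{4})"

first_match = re.search(pattern, data)

whole = first_match.group(0)        # full match, same as .group()
first_block = first_match.group(1)  # first capture group
last_block = first_match.group(4)   # fourth capture group
all_blocks = first_match.groups()   # tuple of all capture groups

# With exactly one capture group, findall returns only the group text
first_blocks = re.findall(r"(\d{4})-\d{4}-\d{4}-\d{4}", data)

print(whole, first_block, last_block)
print(all_blocks)
print(first_blocks)
```

This is also why the first section used patterns without parentheses: adding a capture group changes what `re.findall()` returns.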
