# <font color="#418FDE" size="6.5" uppercase>**Groups and Alternation**</font>

>Last update: 20251224.
    
By the end of this lecture, you will be able to:
- Construct regex patterns that use capturing and non-capturing groups to structure complex matches. 
- Use alternation with the `|` operator to match one of several possible substrings within a single pattern. 
- Reference captured groups in Python code to extract and reorganize matched text. 


## **1. Regex Capturing Groups**

### **1.1. Grouping with Parentheses**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_01_01.jpg?v=1766632696" width="250">



>* Parentheses group multiple regex tokens as one
>* They control scope for repetition, options, alternation

>* Parentheses control which parts quantifiers and anchors affect
>* They make repeated sequences precise, readable, maintainable

>* Use groups to mirror real-world text structure
>* Grouping clarifies organization and produces stronger patterns



In [None]:
#@title Python Code - Grouping with Parentheses

# Demonstrate regex grouping with parentheses using simple date patterns.
# Wrapping the whole pattern in one group does not change what it matches.
# With a single capturing group, re.findall returns that group's text, here the entire match.
# Note that \d{2} also matches the first two digits of a four-digit year such as 2024.

import re  # Import regular expression module for pattern matching.

text = "Sale dates: 01-02-2024 01-02-24 01-02-2025."  # Example text with dates.

pattern_ungrouped = r"\d\d-\d\d-\d{2}"  # {2} repeats only the \d directly before it.
pattern_grouped = r"(\d\d-\d\d-\d{2})"  # One capturing group around the entire date pattern.

matches_ungrouped = re.findall(pattern_ungrouped, text)  # Find matches without explicit grouping.
matches_grouped = re.findall(pattern_grouped, text)  # Find matches where entire date is grouped.

print("Text being searched:", text)  # Show the original text for reference.
print("Ungrouped pattern matches:", matches_ungrouped)  # Display ungrouped pattern matches.
print("Grouped pattern matches:", matches_grouped)  # Display grouped pattern matches clearly.
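The scope rule in the bullets above (a quantifier applies to the single token before it unless parentheses widen that scope) can be sketched with a small example. The sample text is made up for illustration, and `finditer` with `group(0)` is used so that the full match is shown even when the pattern contains a capturing group.

```python
import re

text = "haha haaa"  # Hypothetical sample text.

# Without parentheses, + repeats only the preceding token, the letter "a".
ungrouped = [m.group(0) for m in re.finditer(r"ha+", text)]

# With parentheses, + repeats the whole grouped sequence "ha".
grouped = [m.group(0) for m in re.finditer(r"(ha)+", text)]

print("ha+   :", ungrouped)  # ['ha', 'ha', 'haaa']
print("(ha)+ :", grouped)    # ['haha', 'ha']
```

The two patterns differ only in the parentheses, yet they split the same text differently because the `+` covers a different span in each.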




### **1.2. Group Numbering Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_01_02.jpg?v=1766632719" width="250">



>* Groups numbered by opening parenthesis, left-to-right
>* Numbers stay fixed, even when groups match nothing

>* Only capturing groups receive numbers and captures
>* Non-capturing groups skip numbering, preserving group indexes

>* Plan groups left to right for predictability
>* Capture only key data, use non-capturing elsewhere



In [None]:
#@title Python Code - Group Numbering Basics

# Demonstrate how regex group numbering works with capturing groups.
# Show the difference between capturing and noncapturing group numbering.
# Print matched groups to connect numbers with captured text.

import re  # Import regular expression module for pattern matching.

text_example = "Item: TV-42inch Color: Black"  # Example product description text string.

pattern_capturing = r"(Item): (\w+-\w+inch) Color: (\w+)"  # Three capturing groups pattern.

match_capturing = re.search(pattern_capturing, text_example)  # Search text using capturing pattern.

print("Capturing groups pattern groups:", match_capturing.groups())  # Print all captured groups tuple.

pattern_mixed = r"(?:Item): (\w+-\w+inch) Color: (\w+)"  # First group is noncapturing, so it gets no number.

match_mixed = re.search(pattern_mixed, text_example)  # Search text using mixed capturing pattern.

print("Mixed groups pattern groups:", match_mixed.groups())  # Print captured groups without label group.

print("First capturing group text:", match_mixed.group(1))  # Show first capturing group content.
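Two of the numbering rules above are easier to see with nested and optional groups. The following is a minimal sketch using a hypothetical date-like pattern: numbers follow the order of opening parentheses, and a group that takes no part in the match still keeps its number and reports `None`.

```python
import re

# Opening parentheses, left to right:
# group 1 = whole date, group 2 = year, group 3 = month, group 4 = optional day.
# The (?: ... )? wrapper is noncapturing, so it does not take a number.
pattern = re.compile(r"((\d{4})-(\d{2})(?:-(\d{2}))?)")

full = pattern.search("Released 2025-12-25")
partial = pattern.search("Released 2025-12")

print(full.groups())     # ('2025-12-25', '2025', '12', '25')
print(partial.groups())  # ('2025-12', '2025', '12', None)
```

Because group 4 keeps its position even when it is empty, code that reads `group(2)` or `group(3)` behaves the same for both inputs.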




### **1.3. Accessing Match Groups**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_01_03.jpg?v=1766632741" width="250">



>* Capturing groups store specific matched text pieces
>* Use groups to extract and reorganize structured data

>* Groups split semi-structured text into useful fields
>* Whole match confirms success; groups expose details

>* Plan group structure to control later access
>* Use non-capturing groups to keep numbering stable



In [None]:
#@title Python Code - Accessing Match Groups

# Demonstrate accessing regex match groups in simple Python examples.
# Show how full matches differ from individual captured groups.
# Highlight how group positions map to specific text parts.

import re  # Import regular expression module for pattern matching.

text = "Ticket #4821: Printer not working"  # Example subject line text.
pattern = r"Ticket #(\d+): (.+)"  # Pattern with two capturing groups.

match = re.search(pattern, text)  # Search text using the defined pattern.

if match:  # Check whether the pattern successfully matched the text.
    print("Full match:", match.group(0))  # Print entire matched substring.
    print("Ticket number:", match.group(1))  # Print first captured group value.
    print("Description:", match.group(2))  # Print second captured group value.

log_line = "2025-12-25 ERROR Disk almost full"  # Example log line string.
log_pattern = r"(\d{4}-\d{2}-\d{2}) (INFO|ERROR|WARN) (.+)"  # Log pattern.

log_match = re.search(log_pattern, log_line)  # Match pattern against log line.

if log_match:  # Confirm that the log line matched the pattern.
    date, level, message = log_match.groups()  # Unpack all captured groups.
    print("Date:", date, "Level:", level)  # Print date and level information.
    print("Message:", message)  # Print remaining message body from the log line.
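Captured groups can also be used to reorganize text, not only to read it. As a small sketch on made-up "Last, First" names, the replacement string passed to `re.sub` can refer to groups by number with `\1` and `\2`, and the same reordering can be done by hand from a match object.

```python
import re

names = "Smith, John; Clark, Anna"  # Hypothetical sample data.

# Group 1 is the last name, group 2 is the first name.
pattern = re.compile(r"(\w+), (\w+)")

# In the replacement, \2 and \1 insert the captured groups in swapped order.
reordered = pattern.sub(r"\2 \1", names)
print(reordered)  # John Smith; Anna Clark

# The same idea applied manually to the first match only.
first_match = pattern.search(names)
manual = f"{first_match.group(2)} {first_match.group(1)}"
print(manual)  # John Smith
```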



## **2. Noncapturing Groups Essentials**

### **2.1. Noncapturing Group Syntax**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_02_01.jpg?v=1766632767" width="250">



>* Noncapturing groups organize pattern pieces without capturing
>* Use them with alternation when capture isn't needed

>* Noncapturing groups organize alternation without storing matches
>* They keep captured group numbering stable and predictable

>* Use noncapturing groups to organize layered alternation
>* Keep captures only for data you truly need



In [None]:
#@title Python Code - Noncapturing Group Syntax

# Demonstrate noncapturing group syntax with simple honorific matching.
# Compare capturing and noncapturing groups using the same sample names.
# Show how group numbering changes when groups are noncapturing.

import re

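names_list = ["Mr. John Smith", "Ms. Anna Clark", "Dr. Mike Brown"]  # Sample names with honorifics.

pattern_capturing = re.compile(r"(Mr\.|Ms\.|Dr\.) (\w+) (\w+)")  # Honorific is group 1; names are groups 2 and 3.

pattern_noncapturing = re.compile(r"(?:Mr\.|Ms\.|Dr\.) (\w+) (\w+)")  # Honorific gets no number; names are groups 1 and 2.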

print("Using capturing group for honorific and names:")

for name in names_list:
    match = pattern_capturing.search(name)
    if match:
        print("Full:", match.group(0), "Honorific:", match.group(1), "First:", match.group(2))

print("\nUsing noncapturing group for honorific only:")

for name in names_list:
    match = pattern_noncapturing.search(name)
    if match:
        print("Full:", match.group(0), "First:", match.group(1), "Last:", match.group(2))
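The choice between `( ... )` and `(?: ... )` also changes what some `re` functions return. One illustration, on a made-up sentence: `re.split` includes the text of capturing groups in its result list, while a noncapturing group leaves the separators out.

```python
import re

text = "apples and oranges or pears"  # Hypothetical sample text.

# Capturing group: the matched separator words are kept in the result.
with_capture = re.split(r" (and|or) ", text)

# Noncapturing group: same alternation, but the separators are dropped.
without_capture = re.split(r" (?:and|or) ", text)

print(with_capture)     # ['apples', 'and', 'oranges', 'or', 'pears']
print(without_capture)  # ['apples', 'oranges', 'pears']
```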



### **2.2. Avoid Unnecessary Captures**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_02_02.jpg?v=1766632794" width="250">



>* Only capture text you actually need
>* Use noncapturing groups for structural alternation

>* Too many captures clutter complex alternation patterns
>* Use noncapturing groups to reduce confusion, mistakes

>* Unneeded captures shift group numbers when patterns change, breaking code that indexes them
>* Use captures only for data you truly need



In [None]:
#@title Python Code - Avoid Unnecessary Captures

# Demonstrate unnecessary capturing groups with alternation in simple patterns.
# Show how noncapturing groups reduce confusing unused captured groups.
# Compare match group outputs for capturing and noncapturing alternation patterns.

import re  # Import regular expression module for pattern matching.

text = "Senior Dev, Junior Dev, Lead Dev"  # Example job titles text string.

pattern_capturing = r"(Senior|Junior|Lead) Dev"  # Capturing alternation group pattern.

pattern_noncapturing = r"(?:Senior|Junior|Lead) Dev"  # Noncapturing alternation group pattern.

match_cap = re.search(pattern_capturing, text)  # Search using capturing pattern example.

match_nocap = re.search(pattern_noncapturing, text)  # Search using noncapturing pattern example.

print("Capturing pattern groups:", match_cap.groups())  # One-element tuple: ('Senior',).

print("Noncapturing pattern groups:", match_nocap.groups())  # Empty tuple: nothing was captured.

print("Matched title text:", match_nocap.group(0))  # Show full matched title text only.
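An unneeded capture can also change results in a way that is easy to miss. As a short sketch reusing the job-title text from this section: when a pattern has one capturing group, `re.findall` returns only that group's text, whereas a pattern with no capturing groups returns the full matches.

```python
import re

text = "Senior Dev, Junior Dev, Lead Dev"

# One capturing group: findall returns the group's text, not the full match.
levels_only = re.findall(r"(Senior|Junior|Lead) Dev", text)

# No capturing groups: findall returns each full match.
full_titles = re.findall(r"(?:Senior|Junior|Lead) Dev", text)

print(levels_only)  # ['Senior', 'Junior', 'Lead']
print(full_titles)  # ['Senior Dev', 'Junior Dev', 'Lead Dev']
```

If the full title is what you want, the noncapturing form gives it directly without any extra work.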



### **2.3. Clarity and Speed Gains**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_02_03.jpg?v=1766632812" width="250">



>* Noncapturing groups make alternation patterns more readable
>* They highlight shared text and reduce mental effort

>* Noncapturing groups make alternation boundaries clear, safe
>* They simplify extending, debugging, and reviewing complex patterns

>* Noncapturing alternation avoids storing unnecessary matches
>* This can trim per-match bookkeeping; any speed gain is usually small, so measure it



In [None]:
#@title Python Code - Clarity and Speed Gains

# Demonstrate the clarity gained from noncapturing groups with alternation in Python regex patterns.
# Compare capturing versus noncapturing groups when matching several related shipping status phrases.
# Show that noncapturing groups avoid unnecessary captures while keeping patterns readable and efficient.

import re

statuses_text = "Package shipped, package shipping, package shipment, package delayed, package delivered."

pattern_capturing = re.compile(r"package (shipped|shipping|shipment|delayed|delivered)")

pattern_noncapturing = re.compile(r"package (?:shipped|shipping|shipment|delayed|delivered)")

matches_capturing = list(pattern_capturing.finditer(statuses_text))

matches_noncapturing = list(pattern_noncapturing.finditer(statuses_text))

print("Capturing group pattern matches and captured words:")

for match in matches_capturing:
    print("Full:", match.group(0), "Captured:", match.group(1))

print("\nNoncapturing group pattern matches, no extra captured words:")

for match in matches_noncapturing:
    print("Full:", match.group(0), "Groups tuple:", match.groups())
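Whether noncapturing groups are measurably faster depends on the pattern, the input, and the Python version, so it is worth timing rather than assuming. The sketch below uses the standard `timeit` module on a made-up repeated input; the helper name `count_matches` and the repetition counts are arbitrary choices for illustration, and no particular timing result is implied.

```python
import re
import timeit

# Hypothetical input: the same three statuses repeated many times.
text = "package shipped, package delayed, package delivered. " * 200

capturing = re.compile(r"package (shipped|shipping|shipment|delayed|delivered)")
noncapturing = re.compile(r"package (?:shipped|shipping|shipment|delayed|delivered)")


def count_matches(pattern):
    """Count all matches of a compiled pattern in the sample text."""
    return sum(1 for _ in pattern.finditer(text))


# Time 50 full scans with each pattern; absolute numbers vary by machine.
time_cap = timeit.timeit(lambda: count_matches(capturing), number=50)
time_noncap = timeit.timeit(lambda: count_matches(noncapturing), number=50)

print(f"capturing:    {time_cap:.4f} s")
print(f"noncapturing: {time_noncap:.4f} s")
```

Both patterns find the same matches, so any difference in the printed times comes from the capture bookkeeping alone.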



## **3. Regex Alternation Basics**

### **3.1. Alternation With Pipe**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_03_01.jpg?v=1766632834" width="250">



>* Pipe alternation lets one pattern match options
>* Captured group shows which option matched in Python

>* Alternation inside groups selects which pattern matched
>* Use captured choice in Python to drive logic

>* Use alternation and groups to parse flexible dates
>* Normalize captured pieces into consistent, structured data



In [None]:
#@title Python Code - Alternation With Pipe

# Demonstrate regex alternation with pipe and capturing groups in Python.
# Show how one pattern matches several words using the or style operator.
# Use captured group text to decide which simple response message to print.

import re  # Import regular expression module for pattern matching operations.

messages = ["I have a billing question.", "This is a shipping delay.", "I need technical help."]

pattern = re.compile(r"(billing|shipping|technical)")  # Alternation group selects one category.

for message in messages:  # Loop through each example message string.

    match = pattern.search(message)  # Search message using alternation based pattern.

    if match:  # Check whether the pattern successfully matched any category word.
        category = match.group(1)  # Captured group text shows which alternative matched.
        if category == "billing":  # Decide response based on captured billing category.
            reply = "Routing to billing support desk for invoice questions."  # Billing response text.
        elif category == "shipping":  # Decide response based on captured shipping category.
            reply = "Routing to shipping support desk for delivery issues."  # Shipping response text.
        else:  # Remaining alternative must be the technical support category.
            reply = "Routing to technical support desk for troubleshooting help."  # Technical response.
    else:  # Handle messages that do not match any alternation category.
        reply = "No category detected, sending to general support queue."  # Fallback response.

    print(f"Message: {message} -> {reply}")  # Show message and chosen response together.
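The date-parsing idea in the bullets above can be sketched with alternation between two whole formats. This is a minimal illustration, not a full date parser: the function name `normalize_date` is hypothetical, and it assumes slash-separated dates are day-first (DD/MM/YYYY). Only one branch of the alternation takes part in a match, so the groups of the other branch are `None`.

```python
import re

# Two complete formats joined by |: ISO (YYYY-MM-DD) or day-first (DD/MM/YYYY).
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})|(\d{2})/(\d{2})/(\d{4})")


def normalize_date(text):
    """Return the first date found in text as YYYY-MM-DD, or None."""
    match = date_pattern.search(text)
    if match is None:
        return None
    if match.group(1) is not None:
        # ISO branch matched; groups 4 to 6 are None.
        year, month, day = match.group(1, 2, 3)
    else:
        # Day-first branch matched; groups 1 to 3 are None.
        day, month, year = match.group(4, 5, 6)
    return f"{year}-{month}-{day}"


print(normalize_date("Due 2025-12-25"))  # 2025-12-25
print(normalize_date("Due 25/12/2025"))  # 2025-12-25
print(normalize_date("No date here"))    # None
```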



### **3.2. Grouping and Precedence**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_03_02.jpg?v=1766632853" width="250">



>* Grouping controls alternation precedence and captured text
>* Careful groups simplify reorganizing matches in Python

>* Use parentheses to define interchangeable text pieces
>* Match groups to meaningful units for easy reordering

>* Good grouping keeps patterns flexible and extendable
>* Stable group positions simplify Python data extraction



In [None]:
#@title Python Code - Grouping and Precedence

# Demonstrate how the placement of groups around alternation shapes the captured text.
# Compare a coarse pattern (whole id in one group) with a finer one (id split by a second alternation).
# Show how captured groups change and affect reorganized output.

import re  # Import regular expression module for pattern matching.

text_line = "Report: HR-2048 and IT-4096 for Q4."  # Example report line.

pattern_no_group = r"(HR|IT)-([0-9]+)"  # Group 1: department code; group 2: the whole id.

match_no_group = re.search(pattern_no_group, text_line)  # Search using first pattern.

print("Coarse grouping (dept, id):", match_no_group.groups())

pattern_with_group = r"(HR|IT)-(20|40)([0-9]{2})"  # A second alternation splits the id into prefix and suffix.

match_with_group = re.search(pattern_with_group, text_line)  # Search using second pattern.

print("Finer grouping (dept, id prefix, id suffix):", match_with_group.groups())

reorganized_one = f"Department {match_no_group.group(1)} with id {match_no_group.group(2)}"  # Rebuild.

print("Reorganized using simple grouping:", reorganized_one)

reorganized_two = f"Department {match_with_group.group(1)} with id {match_with_group.group(2)}{match_with_group.group(3)}"  # Rebuild.

print("Reorganized using detailed grouping:", reorganized_two)
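Alternation has the lowest precedence of the regex operators, so without parentheses `|` splits the entire pattern rather than just the neighboring words. As a short sketch on the report line used in this section, compare an unscoped alternation with one limited by a noncapturing group.

```python
import re

text_line = "Report: HR-2048 and IT-4096 for Q4."

# No parentheses: the pattern means "HR" OR "IT-" followed by digits.
loose = re.findall(r"HR|IT-[0-9]+", text_line)

# The group limits the alternation to the department code only.
scoped = re.findall(r"(?:HR|IT)-[0-9]+", text_line)

print(loose)   # ['HR', 'IT-4096']
print(scoped)  # ['HR-2048', 'IT-4096']
```

The first pattern silently drops the id after `HR`, which is why grouping around the alternatives is usually needed.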



### **3.3. Word Variant Alternation**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_A/image_03_03.jpg?v=1766632874" width="250">



>* Use alternation groups to match word variants
>* Captured variant lets Python normalize or track preferences

>* Factor shared word parts; alternate changing pieces
>* Use captured variant segment to control Python logic

>* Grouped alternations capture nuanced word variant meanings
>* Python maps captured variants into structured data fields



In [None]:
#@title Python Code - Word Variant Alternation

# Demonstrate word variant alternation with regex groups in Python.
# Capture different word variants and inspect which alternative actually matched.
# Normalize matched variants into a single canonical label for simple downstream processing.

import re  # Import regular expression module for pattern matching operations.

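text = "The color, colour, and multi-color options were all mentioned by the customer."  # Sample feedback.

pattern = re.compile(r"\b(colo(?:r|ur|rful)|multi-color)\b", re.IGNORECASE)  # Shared stem "colo" plus variant endings.

matches = list(pattern.finditer(text))  # Collect every match object in order.

print("Original feedback sentence:")  # Label for the input text.
print(text)  # Show the sentence being searched.
print("\nFound word variants and normalized labels:")  # Label for the results.

for match in matches:  # Visit each matched variant.
    variant = match.group(1)  # Group 1 holds the exact spelling that matched.
    normalized = "color"  # Every variant in this pattern maps to one canonical label.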
    print(f"Matched variant: {variant:<12} -> Normalized label: {normalized}")  
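When several different words each have variants, the captured text can be used as a lookup key. The following is a minimal sketch under stated assumptions: the `CANONICAL` mapping, the word list, and the `normalize` helper are all hypothetical, and the replacement is always lowercase, so original capitalization is not preserved.

```python
import re

# Hypothetical mapping from spelling variants to one canonical spelling.
CANONICAL = {"colour": "color", "grey": "gray", "centre": "center"}

# One alternation lists the variants; \b restricts matches to whole words.
variant_pattern = re.compile(r"\b(colour|grey|centre)\b", re.IGNORECASE)


def normalize(text):
    """Replace each matched variant with its canonical spelling."""
    # The callable receives each match; group 1 is the variant that matched.
    return variant_pattern.sub(lambda m: CANONICAL[m.group(1).lower()], text)


print(normalize("The grey colour swatch is in the centre."))
# The gray color swatch is in the center.
```

Passing a function to `sub` keeps the pattern and the mapping separate, so new variants only need a dictionary entry and one more alternative.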




# <font color="#418FDE" size="6.5" uppercase>**Groups and Alternation**</font>


In this lecture, you learned to:
- Construct regex patterns that use capturing and non-capturing groups to structure complex matches. 
- Use alternation with the `|` operator to match one of several possible substrings within a single pattern. 
- Reference captured groups in Python code to extract and reorganize matched text. 

In the next lecture (Lecture B), we will cover 'Anchors and Boundaries'.