# <font color="#418FDE" size="6.5" uppercase>**Anchors and Boundaries**</font>

>Last update: 20251224.
    
By the end of this lecture, you will be able to:
- Use start and end anchors to ensure patterns match entire lines or strings when needed. 
- Apply word boundary tokens to match whole words without capturing partial substrings. 
- Design intermediate-level validation patterns that combine anchors, groups, and character classes. 


## **1. Line Anchors Basics**

### **1.1. Start Anchor Usage**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_01_01.jpg?v=1766634054" width="250">



>* Start anchor forces matches from string beginning
>* Ensures clean, exact starts for structured data

>* Start anchors match tags only at beginnings
>* They prevent later occurrences from matching incorrectly

>* Start anchors prevent hidden prefixes in input
>* They improve security and reliable data cleaning



In [None]:
#@title Python Code - Start Anchor Usage

# Demonstrate start anchor usage with simple subject lines example.
# Compare anchored and unanchored patterns using Python re module.
# Show which subjects truly start with desired project tag.

import re  # Import regular expression module for pattern matching.

subjects = [
    "[PROJ] Server reboot tonight",
    "Re: [PROJ] Server reboot tonight",
    "FWD: urgent [PROJ] notice",
    "[ALERT] Backup failed yesterday",
]

pattern_unanchored = re.compile(r"\[PROJ\]")  # Match tag anywhere within subject.
pattern_anchored = re.compile(r"^\[PROJ\]")  # Match tag only at very start.

print("Unanchored matches, tag appears anywhere within subject:")
for subject in subjects:
    if pattern_unanchored.search(subject):
        print("MATCH ANYWHERE ->", subject)

print("\nAnchored matches, tag must appear at string start:")
for subject in subjects:
    if pattern_anchored.search(subject):
        print("MATCH AT START ->", subject)
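
A related point worth a quick sketch: `re.match` (and a compiled pattern's `.match`) only tries position 0, so it acts like an implicit start anchor, while `.search` scans the whole string. The two subject lines below are a shortened copy of the list above, and the variable names are illustrative only.

```python
import re

# Hypothetical subject lines, shortened from the cell above.
subjects = [
    "[PROJ] Server reboot tonight",
    "Re: [PROJ] Server reboot tonight",
]

tag = re.compile(r"\[PROJ\]")            # No anchor in the pattern itself.
anchored_tag = re.compile(r"^\[PROJ\]")  # Explicit start anchor.

# match() only tries position 0; search() scans the whole string.
via_match = [s for s in subjects if tag.match(s)]
via_search = [s for s in subjects if tag.search(s)]
via_anchor = [s for s in subjects if anchored_tag.search(s)]

print("match() without anchor:", via_match)
print("search() without anchor:", via_search)
print("search() with ^ anchor:", via_anchor)
```

Writing the `^` explicitly keeps the intent visible in the pattern itself, which helps when the same pattern is later used with `search` or `findall`.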



### **1.2. End Anchor Usage**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_01_02.jpg?v=1766634088" width="250">



>* End anchor forces matches to reach text end
>* Prevents extra trailing characters, ensuring strict validation

>* End anchors enforce exact formats without leftovers
>* They prevent partial matches and subtle validation errors

>* End anchors mark where each text unit ends
>* They prevent matching keywords appearing only in middle



In [None]:
#@title Python Code - End Anchor Usage

# Demonstrate end anchor usage with simple yes or no answers.
# Show difference between anchored and unanchored regular expression patterns.
# Print which user inputs are accepted by each regular expression pattern.

import re

# Define a pattern without an end anchor, allowing extra trailing characters.
pattern_loose = re.compile(r"^(yes|no)")

# Define a pattern with an end anchor, blocking extra trailing characters.
pattern_strict = re.compile(r"^(yes|no)$")

# Prepare several example inputs that look similar but behave differently.
user_inputs = ["yes", "no", "yes!", "no later", "yesterday"]

# Check each input against both patterns and collect readable result strings.
results = []
for text in user_inputs:
    loose_match = bool(pattern_loose.search(text))
    strict_match = bool(pattern_strict.search(text))
    result_line = f"Input: {text!r}  loose: {loose_match}  strict: {strict_match}"
    results.append(result_line)

# Print a short header explaining the meaning of the printed comparison lines.
print("Comparing loose pattern and strict end anchored pattern results.")

# Print each comparison line, showing how the end anchor changes accepted inputs.
for line in results:
    print(line)
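
One subtlety about the end anchor in Python: `$` matches at the very end of the string and also just before a single trailing newline. The sketch below compares `$`, `\Z`, and `fullmatch()` on an input that still carries its newline. The inputs and variable names are assumptions chosen for illustration.

```python
import re

dollar = re.compile(r"^(yes|no)$")     # $ also matches just before one final newline.
absolute = re.compile(r"^(yes|no)\Z")  # \Z matches only at the very end of the string.
bare = re.compile(r"yes|no")           # No anchors; fullmatch() must cover everything.

# Hypothetical inputs: the second one still carries its trailing newline.
inputs = ["yes", "yes\n"]

results = {
    text: (
        bool(dollar.search(text)),
        bool(absolute.search(text)),
        bool(bare.fullmatch(text)),
    )
    for text in inputs
}

for text, (d, a, f) in results.items():
    print(f"{text!r:8} $: {d}  \\Z: {a}  fullmatch: {f}")
```

If input may arrive with a trailing newline (for example, lines read from a file), either strip it first or prefer `\Z` or `fullmatch()` when the newline must be rejected.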



### **1.3. Multiline Anchor Flags**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_01_03.jpg?v=1766634107" width="250">



>* Multiline mode changes how start and end anchors behave
>* Anchors treat each line like a separate string

>* Multiline mode checks each line’s start separately
>* Helps find lines ending with specific characters

>* Use absolute anchors for whole-text validation
>* Use multiline anchors for per-line matching



In [None]:
#@title Python Code - Multiline Anchor Flags

# Demonstrate multiline anchor flags using simple log style text.
# Compare behavior with and without multiline regex flag usage.
# Show how caret and dollar anchors change meaning per line.

import re

log_text = "ERROR Disk full on drive C:\nINFO Backup completed successfully\nWARNING Low memory detected"

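# Default mode: ^ matches only at the string start and $ only at the string end.
# Because . stops at the first newline, $ cannot match there and nothing is found.
pattern_single = re.compile(r"^ERROR.*$")
# Multiline mode: ^ and $ also match at the start and end of every line.
pattern_multi = re.compile(r"^ERROR.*$", re.MULTILINE)

single_match = pattern_single.findall(log_text)
multi_match = pattern_multi.findall(log_text)

print("Default mode matches list:", single_match)
print("Multiline mode matches list:", multi_match)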
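
The difference between per-line anchors and absolute anchors may be easier to see when the target line is not the first one. This is a minimal sketch with a hypothetical log text. `\A` is the absolute start anchor, and `re.MULTILINE` does not change its meaning.

```python
import re

# Hypothetical log text where the ERROR entry sits on the second line.
log_text = "INFO Service started\nERROR Disk full\nINFO Retry scheduled"

caret_default = re.findall(r"^ERROR.*", log_text)              # ^ = string start only.
caret_multi = re.findall(r"^ERROR.*", log_text, re.MULTILINE)  # ^ = start of any line.
abs_multi = re.findall(r"\AERROR.*", log_text, re.MULTILINE)   # \A ignores the flag.
ends_full = re.findall(r"^.*full$", log_text, re.MULTILINE)    # $ = end of any line.

print("^ without MULTILINE:", caret_default)
print("^ with MULTILINE:", caret_multi)
print("\\A with MULTILINE:", abs_multi)
print("Lines ending in 'full':", ends_full)
```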




## **2. Regex Word Boundaries**

### **2.1. Word Boundary Basics**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_02_01.jpg?v=1766634126" width="250">



>* Word boundaries help match complete words in text
>* They mark borders between word and non-word characters

>* Word boundaries mark edges between word characters
>* They ensure standalone words, not inner substrings

>* Word boundaries find precise, standalone word matches
>* They prevent noisy partial matches and enable advanced patterns



In [None]:
#@title Python Code - Word Boundary Basics

# Demonstrate basic word boundary behavior using simple Python regex examples.
# Compare matches with and without word boundary tokens in short sentences.
# Show how boundaries prevent matching unwanted partial word fragments.

import re

text_example = "The cat sat on the cathedral carpet near another cat."
pattern_plain = "cat"
pattern_boundary = r"\bcat\b"

plain_matches = re.findall(pattern_plain, text_example)
boundary_matches = re.findall(pattern_boundary, text_example)

print("Text being searched for matches with word boundaries.")
print(text_example)
print("Plain pattern matches including partial word fragments:", plain_matches)
print("Boundary pattern matches only whole word occurrences:", boundary_matches)
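
A common pitfall with `\b` is forgetting the raw-string prefix. In an ordinary Python string, `"\b"` is the backspace character, so the regex engine never sees a word boundary token. The short sketch below, using a shortened sample sentence, shows the difference.

```python
import re

text_example = "The cat sat near another cat."

# Without the r prefix, Python converts "\b" into a backspace character (\x08).
not_raw = "\bcat\b"
raw = r"\bcat\b"

print("Pattern lengths:", len(not_raw), len(raw))

not_raw_matches = re.findall(not_raw, text_example)
raw_matches = re.findall(raw, text_example)

print("Without raw string:", not_raw_matches)
print("With raw string:", raw_matches)
```

Writing every regex pattern as a raw string avoids this class of silent failure.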



### **2.2. Inside Word Boundaries**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_02_02.jpg?v=1766634147" width="250">



>* Word boundaries are conditions, not consumed characters
>* Actual match sits between boundaries as whole word

>* Word boundaries can enclose complex, flexible patterns
>* They keep matches isolated from surrounding characters

>* Use boundaries to isolate words in messy text
>* Treat inside as flexible pattern guarded by boundaries



In [None]:
#@title Python Code - Inside Word Boundaries

# Demonstrate how word boundaries frame flexible internal regex patterns.
# Show that inside boundaries we can allow optional characters and variations.
# Print matches to compare boundary based patterns with similar non boundary patterns.

import re  # Import regular expression module for pattern matching.

text = "Our product codes: X100, X-100, X_100, and AX100B in this catalog."  # Example text string.

pattern_with_boundaries = re.compile(r"\bX[-_]?100\b")  # Pattern uses word boundaries around core.
pattern_without_boundaries = re.compile(r"X[-_]?100")  # Pattern omits boundaries for comparison.

matches_with = pattern_with_boundaries.findall(text)  # Find matches that respect word boundaries.
matches_without = pattern_without_boundaries.findall(text)  # Find matches that may include partials.

print("Text being searched:")  # Show the original text for context.
print(text)  # Print the example text string.

print("\nMatches with word boundaries:")  # Show matches that behave like whole word tokens.
print(matches_with)  # Print list of matches found using boundary based pattern.

print("\nMatches without word boundaries:")  # Show matches that may include unwanted partial matches.
print(matches_without)  # Print list of matches found using non boundary pattern.
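
To illustrate that a boundary is a zero-width condition, the sketch below inspects a match span and lists every position where `\b` succeeds. The sample strings are hypothetical. Every match that `re.finditer` returns for `\b` is empty, so only the positions are collected.

```python
import re

text = "Code X-100 ships today."

match = re.search(r"\bX[-_]?100\b", text)
matched = match.group()
span = match.span()

# The span covers only the five characters of "X-100"; the boundaries add nothing.
print("Matched text:", matched, "span:", span)

# List every index where a word boundary exists in a short sample string.
boundary_positions = [m.start() for m in re.finditer(r"\b", "X-100 ok")]
print("Boundary positions in 'X-100 ok':", boundary_positions)
```

The boundary between `X` and `-` in that list is why the hyphenated form can still match as one token: the outer `\b` tokens only check the edges of the whole pattern.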



### **2.3. Preventing Partial Matches**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_02_03.jpg?v=1766634171" width="250">



>* Word boundaries stop regex from matching inside words
>* They ensure the pattern matches only standalone words

>* Word boundaries stop confusing, high‑stakes partial matches
>* They isolate exact terms such as "lease", "Gap", or "A" from longer words that contain them

>* Use word boundaries to avoid subtle false positives
>* They keep matches to complete, meaningful whole words



In [None]:
#@title Python Code - Preventing Partial Matches

# Demonstrate preventing partial matches using word boundaries in Python regex patterns.
# Compare matches with and without word boundaries around specific target words.
# Show how boundaries avoid matching fragments inside longer unrelated words.

import re  # Import regular expression module for pattern matching operations.

text = "The art class met near the heart clinic and artery research center."  # Example sentence.

pattern_loose = re.compile(r"art")  # Pattern without boundaries matches partial word fragments.
pattern_strict = re.compile(r"\bart\b")  # Pattern with boundaries matches only standalone word.

loose_matches = pattern_loose.findall(text)  # Find all loose matches inside entire sentence.
strict_matches = pattern_strict.findall(text)  # Find all strict matches using word boundaries.

print("Text being searched:")  # Print label describing following sentence content.
print(text)  # Print example sentence showing several possible matching locations.

print("\nMatches without boundaries:", loose_matches)  # Show partial and full matches together.
print("Matches with word boundaries:", strict_matches)  # Show only complete standalone word matches.
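
Keep in mind that `\b` relies on the regex definition of a word character, which includes letters, digits, and the underscore. The sketch below uses a few hypothetical strings to show where `\bart\b` does and does not find a standalone word.

```python
import re

pattern = re.compile(r"\bart\b")

# Hypothetical samples: hyphens and spaces end a word, underscores and digits do not.
samples = ["art class", "art-class", "art_class", "art2", "state of the art."]

results = {s: bool(pattern.search(s)) for s in samples}

for sample, found in results.items():
    print(f"{sample!r:22} standalone 'art' found: {found}")
```

When identifiers such as `art_class` should count as containing the word, `\b` alone is not enough and the pattern needs an explicit rule for underscores.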




## **3. Regex Validation Basics**

### **3.1. Username Pattern Rules**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_03_01.jpg?v=1766634198" width="250">



>* Username regexes encode clear identity rules precisely
>* Anchors, groups, classes define structure and options

>* Separate edge rules from interior username characters
>* Use anchors, groups, lengths for realistic validation

>* Support multiple username types with one pattern
>* Use groups, alternation, anchors to enforce rules



In [None]:
#@title Python Code - Username Pattern Rules

# Demonstrate username validation using anchors and character classes.
# Show different rules for first, middle, and last username characters.
# Print which sample usernames pass or fail the validation pattern.

import re

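# One leading letter, 1 to 14 interior characters, one trailing alphanumeric:
# the total length is therefore 3 to 16 characters.
pattern = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]{1,14}[A-Za-z0-9]$")

usernames = ["ab", "a_b2", "_start", "end_", "toolong_username_here", "ok.name1"]

print("Username policy: start letter, end alphanumeric, length three to sixteen.")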

for name in usernames:
    match = bool(pattern.fullmatch(name))
    print(f"Username: {name:20} Valid: {match}")
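
Supporting more than one username type with a single pattern can be sketched with a non-capturing group and alternation inside the anchors. The policy below (a regular handle or a `guest-` ID with four digits) is hypothetical and only serves to illustrate the structure.

```python
import re

# Hypothetical policy: a regular handle OR a guest ID such as "guest-0042".
username = re.compile(
    r"^(?:"
    r"[A-Za-z][A-Za-z0-9_]{2,15}"  # Handle: letter first, 3 to 16 characters total.
    r"|"
    r"guest-[0-9]{4}"              # Guest ID: fixed prefix plus exactly four digits.
    r")$"
)

samples = ["alice_01", "guest-0042", "guest-42", "9lives", "ab"]
results = {name: bool(username.fullmatch(name)) for name in samples}

for name, ok in results.items():
    print(f"Username: {name:12} Valid: {ok}")
```

Splitting the pattern across adjacent raw strings keeps each alternative on its own commented line without changing the compiled regex.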




### **3.2. Simple Code Validation**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_03_02.jpg?v=1766634232" width="250">



>* Short, structured codes are perfect regex practice
>* Combine anchors, groups, classes to fully validate

>* Different permit codes use structured, predictable formats
>* Anchors, classes, groups enforce strict, distinct matches

>* Use strict yet flexible patterns across many domains
>* Combine anchors, groups, classes for robust validation



In [None]:
#@title Python Code - Simple Code Validation

# Demonstrate simple code validation using anchors and character classes.
# Show how different permit formats are checked with one combined pattern.
# Print which sample codes are valid or invalid according to the regex rules.

import re  # Import regular expression module for pattern matching.

pattern = re.compile(r"^(?:[A-Z]-\d{4}|[A-Z]{2}-\d{3,5})$")

sample_codes = ["A-1234", "ST-987", "ST-98765", "B-12", "ST9876"]

print("Validation results for sample parking permit codes:")

for code in sample_codes:
    match = pattern.fullmatch(code)
    if match:
        print(f"{code:8} -> VALID code format matched.")
    else:
        print(f"{code:8} -> INVALID code format rejected.")



### **3.3. Anchors Groups Together**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Python Regex A-Z/Module_02/Lecture_B/image_03_03.jpg?v=1766634341" width="250">



>* Anchors frame the pattern to entire string
>* Groups define ordered segments like prefix, core, suffix

>* Character classes control what each group allows
>* Anchors block extra characters and catch subtle errors

>* Anchors control optional and alternative grouped sections
>* They prevent partial, malformed codes from matching



In [None]:
#@title Python Code - Anchors Groups Together

# Demonstrate anchors with grouped segments for simple code validation.
# Show how groups define ordered parts inside anchored regex patterns.
# Print which sample codes match the strict anchored pattern template.

import re  # Import regular expression module for pattern matching.

pattern = re.compile(r"^(ENG|SCI)[0-9]{2}-[0-9]{3}[A-Z]?$")

samples = [
    "ENG21-123A",  # Valid code with optional suffix letter present.
    "SCI19-777",   # Valid code without optional suffix letter.
    "XENG21-123A", # Invalid because an extra character precedes the required prefix.
    "ENG21-123A!", # Invalid because an extra symbol follows the optional suffix.
    "ENG2-123A"    # Invalid because the two-digit segment has only one digit.
]

print("Pattern uses anchors, groups, and character classes together:")

for code in samples:
    match = bool(pattern.fullmatch(code))
    print(f"{code:11} -> matches template: {match}")
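
Grouping matters when anchors meet alternation, because `|` has lower precedence than the anchors around it. The sketch below compares an ungrouped and a grouped pattern on a few hypothetical inputs.

```python
import re

ungrouped = re.compile(r"^ENG|SCI$")    # Reads as: (^ENG) or (SCI$).
grouped = re.compile(r"^(?:ENG|SCI)$")  # Reads as: the whole string is ENG or SCI.

# Hypothetical inputs: two exact prefixes and two longer words containing them.
samples = ["ENG", "SCI", "ENGINE", "NONSCI"]

ungrouped_hits = [s for s in samples if ungrouped.search(s)]
grouped_hits = [s for s in samples if grouped.search(s)]

print("Ungrouped pattern accepts:", ungrouped_hits)
print("Grouped pattern accepts:", grouped_hits)
```

Wrapping the alternatives in a group, capturing or non-capturing, makes both anchors apply to every alternative.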




# <font color="#418FDE" size="6.5" uppercase>**Anchors and Boundaries**</font>


In this lecture, you learned to:
- Use start and end anchors to ensure patterns match entire lines or strings when needed. 
- Apply word boundary tokens to match whole words without capturing partial substrings. 
- Design intermediate-level validation patterns that combine anchors, groups, and character classes. 

In the next module (Module 3), we will cover 'Advanced Regex Use'.