# Regex Quest 🧩🔥
Welcome to a gamified journey through regular expressions. Advance through levels, earn points, build streaks, unlock boss fights, and master pattern-matching skills.

## How To Play
- Run each Level cell in order.
- Write your regex or code in the Attempt cell.
- Check the inline Solution cell (after you try first!).
- Call `complete_level(level_id, used_hint=False, success=True)` to bank points.
- Use `get_hint(level_id)` if stuck (penalty applied).
- The streak grows with consecutive successes (no hints, no failures) → higher multipliers.

## Scoring System
- Base points per level: increases as difficulty rises.
- Streak multiplier: `mult = 1 + (streak * 0.10)`.
- Hint penalty: reduces awarded points by tier (levels 1-4: 5%, levels 5-8: 15%, levels 9-12 and bosses: 25%).
- A failure or a hint-assisted completion resets the streak to 0.
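
As a worked example of the rules above, the sketch below recomputes the award for a single completion. `preview_points` is a hypothetical helper used only for illustration; it mirrors the stated formula rather than any function defined in the notebook, and the inputs (a 90-point level on a streak of 3) are assumed.

```python
import math

def preview_points(base: int, streak: int, hint_penalty: float = 0.0) -> int:
    """Hypothetical helper: apply the streak multiplier, then the hint penalty."""
    mult = 1 + (streak * 0.10)  # streak multiplier from the rules above
    return math.floor(base * mult * (1 - hint_penalty))

# Assumed inputs: level 5 (90 base points) on a streak of 3.
no_hint = preview_points(90, 3)          # 90 * 1.3
with_hint = preview_points(90, 3, 0.15)  # same, minus the 15% mid-tier penalty
print(no_hint, with_hint)
```

The multiplier is applied first and the penalty second, so a hint costs more points the longer the streak is.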

## Files
- Persistent score stored in `regex_score.json`.

## Goal
Finish all levels + Boss + Tournament with maximum score and zero hints.

---
Run the next cell to load scoring utilities.

In [None]:
# Scoring & Persistence Utilities
import json, time, math, os, re
from typing import Dict, Any
SCORE_PATH = 'regex_score.json'

BASE_POINTS = {
    # Levels 1-12
    1: 50, 2: 60, 3: 70, 4: 80,
    5: 90, 6: 100, 7: 110, 8: 120,
    9: 130, 10: 140, 11: 160, 12: 180,
    # Special IDs
    100: 250,   # Mini Boss
    200: 350,   # Boss
    300: 500    # Tournament
}

HINT_PENALTY_TIERS = {
    'early': 0.05,  # Levels 1-4
    'mid': 0.15,    # Levels 5-8
    'late': 0.25    # Levels 9-12 + bosses
}

def _hint_tier(level_id:int)->str:
    if level_id <= 4: return 'early'
    if level_id <= 8: return 'mid'
    return 'late'

def load_score()->Dict[str,Any]:
    if not os.path.exists(SCORE_PATH):
        return {'user':'default','score':0,'streak':0,'history':[]}
    with open(SCORE_PATH,'r',encoding='utf-8') as f:
        return json.load(f)

def save_score(data:Dict[str,Any])->None:
    with open(SCORE_PATH,'w',encoding='utf-8') as f:
        json.dump(data,f,indent=2)

def current_multiplier(streak:int)->float:
    return 1 + (streak * 0.10)

def complete_level(level_id:int, used_hint:bool=False, success:bool=True):
    data = load_score()
    if success:
        base = BASE_POINTS.get(level_id, 50)
        mult = current_multiplier(data['streak'])
        points = base * mult
        if used_hint:
            tier = _hint_tier(level_id)
            penalty = HINT_PENALTY_TIERS[tier]
            points *= (1 - penalty)
        awarded = math.floor(points)
        data['score'] += awarded
        data['streak'] = (data['streak'] + 1) if not used_hint else 0  # a hint resets the streak to 0
        data['history'].append({
            'ts': time.time(),
            'level': level_id,
            'base': base,
            'multiplier': round(mult,2),
            'used_hint': used_hint,
            'awarded': awarded
        })
        save_score(data)
        print(f"Level {level_id} COMPLETE → +{awarded} points (streak={data['streak']})")
    else:
        data['streak'] = 0
        data['history'].append({'ts': time.time(),'level':level_id,'failure':True})
        save_score(data)
        print(f"Level {level_id} FAILED → streak reset.")

def show_status():
    d = load_score()
    print(f"User: {d['user']} | Score: {d['score']} | Streak: {d['streak']}")
    print(f"Completed: {[h['level'] for h in d['history'] if 'awarded' in h]}")

_HINTS = {}

def register_hint(level_id:int, text:str):
    _HINTS[level_id] = text.strip()

def get_hint(level_id:int):
    if level_id not in _HINTS:
        print('No hint registered for this level yet.')
        return None
    print(f"HINT (penalty applies on completion): {_HINTS[level_id]}")
    return _HINTS[level_id]

show_status()

# Lookarounds Explained (Zero-Width Assertions)
Lookarounds let you assert **context** around a match *without consuming it*.
They match a **position**, not characters.

| Type | Syntax | Meaning | Example |
|------|--------|---------|---------|
| Positive Lookahead | `X(?=Y)` | Match `X` only if followed by `Y` | `\d+(?= USD)` finds numbers before " USD" |
| Negative Lookahead | `X(?!Y)` | Match `X` only if *not* followed by `Y` | `foo(?!bar)` matches `foo` not followed by `bar` |
| Positive Lookbehind | `(?<=Y)X` | Match `X` only if preceded by `Y` | `(?<=ID:)\d+` grabs digits after `ID:` |
| Negative Lookbehind | `(?<!Y)X` | Match `X` only if *not* preceded by `Y` | `(?<!-)\b\d+\b` matches numbers not preceded by a dash |

Use cases:
- Select digits only after a label.
- Exclude words in certain contexts without capturing extra text.
- Stop a match at a boundary condition without including the boundary text in the result.
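
The four table rows can be tried directly with `re.findall`. The sample strings below are assumptions chosen for illustration; only the patterns come from the table.

```python
import re

# Positive lookahead: digits only when " USD" follows.
ahead = re.findall(r"\d+(?= USD)", "Paid 100 USD and 250 EUR")
# Negative lookahead: "foo" unless "bar" follows.
not_ahead = re.findall(r"foo(?!bar)", "foobar foobaz foo")
# Positive lookbehind: digits only when "ID:" precedes.
behind = re.findall(r"(?<=ID:)\d+", "ID:2025 ref:77")
# Negative lookbehind: whole numbers not preceded by a dash.
not_behind = re.findall(r"(?<!-)\b\d+\b", "10 -20 30-40")

print(ahead, not_ahead, behind, not_behind)
```

In every case the returned text excludes the context (`" USD"`, `"ID:"`, the dash), because the assertion checks a position and consumes nothing.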

Run the next cells for levels. Attempt first, then view solution.


In [None]:
# Level 1: Literal & Raw Strings (ID=1)
import re
level_id = 1
sample_text = "Contact: Mukesh, Python learner."
# Task: Match the exact word Mukesh.
# Write your pattern in variable `pat` then run re.search(pat, sample_text)
pat = r"Mukesh"  # TODO: modify if experimenting
match = re.search(pat, sample_text)
print(match.group() if match else 'No match')
register_hint(level_id, 'Use a simple literal; raw string not required here.')

In [None]:
# Level 1 Solution
# Explanation: Direct literal match.
complete_level(1, used_hint=False, success=True)
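
The literal in this level contains no backslashes, so a raw string changes nothing. The sketch below uses an assumed sample string to show two cases where it does matter: `\b` in a normal string is a backspace character, and `re.escape` helps when the literal contains metacharacters.

```python
import re

text = "Contact: Mukesh (admin), mukesh.k@example.com"

# In a normal string "\b" is a backspace character, so this pattern never matches.
plain = re.search("\bMukesh\b", text)
# In a raw string the regex engine receives \b and treats it as a word boundary.
raw = re.search(r"\bMukesh\b", text)

# re.escape backslash-escapes metacharacters so "(admin)" is matched literally.
literal = re.search(re.escape("(admin)"), text)

print(plain, raw.group(), literal.group())
```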

In [None]:
# Level 2: Character Classes & Negation (ID=2)
import re
level_id = 2
sample_text = "User42 scored 99 points; ID:2025; Flag:X"
# Task: 1) Find all digit sequences. 2) Find all uppercase letters excluding 'X'.
# Fill patterns pat_digits, pat_upper_no_x.
pat_digits = r"\d+"  # TODO: refine if needed
pat_upper_no_x = r"[A-WY-Z]"  # excludes X via range split
print('Digits:', re.findall(pat_digits, sample_text))
print('Upper (no X):', re.findall(pat_upper_no_x, sample_text))
register_hint(level_id, 'Split the range around X: [A-WY-Z]. A bare [^X] would also match digits and punctuation.')

In [None]:
# Level 2 Solution
# Explanation: \d+ grabs consecutive digits; ranges avoid X.
complete_level(2, used_hint=False, success=True)
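
To see why the range split is used here instead of a negated class, the sketch below compares three patterns on the level's sample text. The variable names are illustrative, and the lookahead variant is an alternative spelling rather than part of the level.

```python
import re

sample = "User42 scored 99 points; ID:2025; Flag:X"

split_range = re.findall(r"[A-WY-Z]", sample)       # uppercase letters, X left out
with_lookahead = re.findall(r"(?!X)[A-Z]", sample)  # same set via a negative lookahead
negated = re.findall(r"[^X]", sample)               # every character except X

print(split_range, with_lookahead, len(negated))
```

`[^X]` excludes only the one character, so it returns lowercase letters, digits, spaces, and punctuation as well.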

In [None]:
# Level 3: Shorthand Classes (ID=3)
import re
level_id = 3
sample_text = "Name:Mukesh\tEmail:mukesh@example.com\nLangs: Python, Java, C++"
# Task: Extract all 'words' (alphanumeric + underscore) and all whitespace segments.
pat_words = r"\b\w+\b"  # TODO adjust
pat_spaces = r"\s+"       # captures spaces/tabs/newlines
words = re.findall(pat_words, sample_text)
spaces = re.findall(pat_spaces, sample_text)
print('Words count:', len(words), '\nFirst 8:', words[:8])
print('Whitespace segments:', spaces)
register_hint(level_id, r'Use \w for word chars, \s for whitespace; anchor with \b.')  # raw string keeps \b from becoming a backspace

In [None]:
# Level 3 Solution
# Explanation: \b\w+\b ensures word boundaries; \s+ groups whitespace.
complete_level(3, used_hint=False, success=True)
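
One consequence of `\w` is worth seeing in isolation: punctuation such as `@`, `.`, and `+` is not a word character, so an email address or `C++` is split into pieces. The sketch below uses a shortened, assumed sample and contrasts `\w+` with `\S+`.

```python
import re

sample = "Email:mukesh@example.com\tLangs: C++"

words = re.findall(r"\b\w+\b", sample)  # runs of letters, digits, underscore
chunks = re.findall(r"\S+", sample)     # runs of anything that is not whitespace

print(words)
print(chunks)
```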

In [None]:
# Level 4: Quantifiers (ID=4)
import re
level_id = 4
sample_text = "Phones: 903-366-9661, 800-123-0000 alt 44-22-11"
# Task: Match US-style phone numbers ###-###-#### only.
pat_phone = r"\b\d{3}-\d{3}-\d{4}\b"  # TODO adjust or extend
phones = re.findall(pat_phone, sample_text)
print('Phones found:', phones)
register_hint(level_id, r'Use {3} and {4} quantifiers with \d; anchor with \b.')  # raw string keeps \b from becoming a backspace

In [None]:
# Level 4 Solution
# Explanation: {n} specifies exact counts; boundaries prevent partial matches.
complete_level(4, used_hint=False, success=True)
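
The effect of the boundaries is easiest to see on input where a phone-shaped run sits inside a longer digit string. The sample below is an assumption made for this sketch, not the level's text.

```python
import re

sample = "Call 903-366-9661 or ref 1903-366-96615"

unanchored = re.findall(r"\d{3}-\d{3}-\d{4}", sample)
anchored = re.findall(r"\b\d{3}-\d{3}-\d{4}\b", sample)

print(unanchored)  # also picks up the run inside the longer reference number
print(anchored)    # only the standalone phone number
```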

In [None]:
# Level 5: Anchors (ID=5)
import re
level_id = 5
sample_text = "START task A\nMiddle line\nEND42"
# Task: 1) Match 'START' only at line start. 2) Capture trailing digits at end of last line.
pat_start = r"^START"  # with MULTILINE
pat_end_digits = r"\d+$"  # end digits
start_match = re.findall(pat_start, sample_text, flags=re.MULTILINE)
end_digits = re.findall(pat_end_digits, sample_text, flags=re.MULTILINE)
print('Line-start:', start_match)
print('End digits:', end_digits)
register_hint(level_id, 'Use ^ and $ with re.MULTILINE to anchor per line.')

In [None]:
# Level 5 Solution
# Explanation: ^ and $ match start/end of line with MULTILINE flag.
complete_level(5, used_hint=False, success=True)
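
For comparison, the sketch below runs the same `^START` pattern with and without `re.MULTILINE` on an assumed three-line sample, and adds `\A`, which anchors to the start of the whole string regardless of the flag.

```python
import re

sample = "START task A\nSTART task B\nEND42"

default_mode = re.findall(r"^START", sample)                       # ^ = start of string only
per_line = re.findall(r"^START", sample, flags=re.MULTILINE)       # ^ = start of every line
string_start = re.findall(r"\ASTART", sample, flags=re.MULTILINE)  # \A ignores MULTILINE

print(default_mode, per_line, string_start)
```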

In [None]:
# Level 6: Groups & Alternation (ID=6)
import re
level_id = 6
sample_text = "Skills: Python, Java, C++, C, Rust, Python, Java"
# Task: Capture only (Python|Java|C\+\+|C) occurrences and count frequency.
pat_langs = r"\b(Python|Java|C\+\+|C)(?![\w+])"  # lookahead instead of a trailing \b, which cannot match after '+'
langs = re.findall(pat_langs, sample_text)
from collections import Counter
print('Matches:', langs)
print('Frequency:', Counter(langs))
register_hint(level_id, 'Use grouping ( ... ) with alternation | ; escape + in C++.')

In [None]:
# Level 6 Solution
# Explanation: Group with alternation; C\+\+ is listed before C so the longer name is tried first.
# A trailing \b fails between '+' and ',', so (?![\w+]) marks the end of the name instead.
complete_level(6, used_hint=False, success=True)
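
A word boundary `\b` needs a word character on exactly one side, so it cannot match between `+` and `,`. The sketch below, on a shortened copy of the sample, compares a trailing `\b` with a negative lookahead that rejects a following word character or `+`. The lookahead is one possible fix, not the only one.

```python
import re

sample = "Skills: Python, Java, C++, C, Rust"

# Trailing \b: "C++" cannot end on a boundary, so the engine falls back to plain "C".
with_boundary = re.findall(r"\b(Python|Java|C\+\+|C)\b", sample)
# Lookahead: the name ends where no word character or '+' follows.
with_lookahead = re.findall(r"\b(Python|Java|C\+\+|C)(?![\w+])", sample)

print(with_boundary)
print(with_lookahead)
```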