# GROUP 11 {style="text-align: center;"}

### **Group Members:**
- **Alliance IRIGENERA** *(Group Leader)*
- **Maryam Yahya MOHAMED**
- **Olusola Timothy OGUNDEPO**
- **Jean Baptiste HABINEZA**

## Project 5. License Plate Detection

Write a program in pure Python (no external libraries) that detects the presence of Senegalese vehicle license plate numbers inside any text string. 

We consider the following formats (letters are A–Z; digits are 0–9):

XY-abcd-T or XY-abcd-ZT

where X, Y, Z, T are letters and a, b, c, d are digits.

**Requirements**

1. The input is an arbitrary string; the output is a Boolean. If True, the program must also print all detected license plates.
2. Matching is case-insensitive; detected plates must be normalized to uppercase and printed in canonical hyphenated form XY-ABCD-T/XY-ABCD-ZT.
3. Allow either a hyphen or a single space as a separator (`-` or ` `). For example, `XY 1234 T` and `xy-1234-zt` must be recognized and normalized.
4. Do not match substrings embedded within longer alphanumeric tokens (e.g., do not match across letters/digits without a boundary).
5. If multiple plates appear, print them in the order they appear, without duplicates.
6. If no plate is found, print a clear message stating so.

**Additional Complexity**

- Tolerate trailing punctuation around a plate (e.g., commas or periods).
- Count and display the total number of unique plates found at the end.
- Provide a minimal text menu to test multiple inputs until the user quits.

In [1]:
def senegalese_plate_number_detector(text):
    """
    Detect Senegalese vehicle license plates in arbitrary text.

    Accepted formats: XY-1234-T, XY-1234-ZT, XY 1234 T, XY 1234 ZT.

    The detector is case-insensitive and allows either hyphens or spaces as
    separators. It also ensures that a detected plate is not part of a larger
    alphanumeric sequence. Detected plates are printed in canonical form
    (uppercase, hyphenated), followed by the number of unique plates.

    Parameters:
        text (str): Input text to search for Senegalese vehicle license plates.
    Returns:
        bool: True if at least one plate number is found, False otherwise.
    """
    
    text = text.upper()
    plates_found = [] # to store unique plates
    length = len(text)

    index = 0
    while index < length - 8:  # shortest plate has 9 characters, so text[index+8] must exist
        # Check first two letters (ASCII A-Z only, per the spec; str.isascii needs Python 3.7+)
        first_part = text[index:index+2]
        if not (first_part.isascii() and first_part.isalpha()):
            index += 1
            continue

        # Check first separator
        if text[index+2] not in ('-', ' '):
            index += 1
            continue

        # Check next 4 digits (ASCII 0-9 only; str.isdigit alone also accepts e.g. '²')
        digits = text[index+3:index+7]
        if not (digits.isascii() and digits.isdigit()):
            index += 1
            continue

        # Check next separator
        if text[index+7] not in ('-', ' '):
            index += 1
            continue

        # The last part of the plate number can be either 1 or 2 letters
        end = index + 8
        last_part = ""
        if end < length and text[end].isascii() and text[end].isalpha():
            last_part += text[end]
            end += 1
            if end < length and text[end].isascii() and text[end].isalpha():
                last_part += text[end]
                end += 1

        if len(last_part) == 0:
            index += 1
            continue

        # Check boundaries (before and after)
        boundary_before = (index == 0) or (not text[index-1].isalnum())
        boundary_after = (end == length) or (not text[end].isalnum())

        if boundary_before and boundary_after:
            # Normalise to canonical form with hyphens
            plate = f"{text[index:index+2]}-{text[index+3:index+7]}-{last_part}"
            if plate not in plates_found:
                plates_found.append(plate)
            index = end  # skip ahead since we found a valid plate
        else:
            index += 1
    
    if plates_found:
        print("Detected Senegalese plates:")
        for p in plates_found:
            print("---->", p)
        print(f"Total unique plates found: {len(plates_found)}")
        return True
    else:
        print("No Senegalese plate number found in the text.")
        return False

## Senegalese License Plate Detector Test

In [2]:
testing_paragraph = (
    "A man at the African Institute of Mathematical Sciences drives a vehicle with the plate number AB-1234-DC. "
    "Later, he moved to Nigeria and changed his plate to BC-6782-E. After an accident, he switched to xy 1234 ZT. "
    "Another person was seen with MN-4321-ZT, while someone else reported PQ-5678-TY. "
    "uv 9876 T was involved in a minor incident. Someone mistakenly reported BAD-233-X, but the correct plate was bA-2333-X. "
    "A claim about CD-3456-Z turned out to be fake. Finally, EF 4567 YT was spotted in the area."
)

In [3]:
senegalese_plate_number_detector(testing_paragraph)

Detected Senegalese plates:
----> AB-1234-DC
----> BC-6782-E
----> XY-1234-ZT
----> MN-4321-ZT
----> PQ-5678-TY
----> UV-9876-T
----> BA-2333-X
----> CD-3456-Z
----> EF-4567-YT
Total unique plates found: 9


True

The function detects all nine valid Senegalese license plates in the test paragraph and skips the invalid near-miss `BAD-233-X`. It also normalizes each plate to uppercase and prints it in the canonical hyphenated form XY-ABCD-T/XY-ABCD-ZT.

## Minimal Text Menu to Test Multiple Inputs

In [4]:
def test_menu():
    print("Senegalese Vehicle Plate Detector")

    while True:
        print("Enter 'q' or 'quit' to stop the program.\n")
        text = input("Enter text input: ").strip()
        if text.lower() in ('q', 'quit'):
            print("Goodbye!")
            break
        senegalese_plate_number_detector(text)
        print("-" * 40)

In [5]:
test_menu()

Senegalese Vehicle Plate Detector
Enter 'q' or 'quit' to stop the program.

Detected Senegalese plates:
----> AG-1234-H
Total unique plates found: 1
----------------------------------------
Enter 'q' or 'quit' to stop the program.

Detected Senegalese plates:
----> ER-6785-UI
Total unique plates found: 1
----------------------------------------
Enter 'q' or 'quit' to stop the program.

Detected Senegalese plates:
----> ER-3353-MN
Total unique plates found: 1
----------------------------------------
Enter 'q' or 'quit' to stop the program.

No Senegalese plate number found in the text.
----------------------------------------
Enter 'q' or 'quit' to stop the program.

Detected Senegalese plates:
----> ED-3214-TY
Total unique plates found: 1
----------------------------------------
Enter 'q' or 'quit' to stop the program.

Goodbye!


## Senegalese Vehicle Plate Detector Using Regular Expressions

In [6]:
import re

def detect_senegalese_plates_regex(text):
    """Detect Senegalese license plates in text using a regular expression.

    This version does the same job as senegalese_plate_number_detector but
    with a compact regex. It returns True if at least one plate is found,
    else False. It prints the plates (unique, in order) in normalized form.

    Rules handled:
    - Two letters, separator (- or space), four digits, separator, 1-2 letters
    - Not part of a bigger alphanumeric word (uses lookbehind/lookahead)
    - Case-insensitive match; output always uppercase with hyphens
    - Accepts hyphen or single space in both separator positions
    """

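    # re.ASCII keeps \d limited to 0-9 and the case-insensitive [A-Z] limited to ASCII letters.
    pattern = re.compile(r"(?i)(?<![A-Z0-9])([A-Z]{2})[- ](\d{4})[- ]([A-Z]{1,2})(?![A-Z0-9])", re.ASCII)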

    unique = []
    for m in pattern.finditer(text):
        left, digits, end = m.group(1).upper(), m.group(2), m.group(3).upper()
        plate = f"{left}-{digits}-{end}"
        if plate not in unique:
            unique.append(plate)

    if unique:
        print("Detected Senegalese plates (regex):")
        for p in unique:
            print("---->", p)
        print(f"Total unique plates found: {len(unique)}")
        return True
    else:
        print("No Senegalese plate number found (regex version).")
        return False

In [7]:
# Quick comparison helper: run both detectors on the same text

def compare_detectors(sample_text):
    print("Manual parser result:")
    senegalese_plate_number_detector(sample_text)
    print("\nRegex parser result:")
    detect_senegalese_plates_regex(sample_text)

# Example (can be commented out if using interactively)
compare_detectors(testing_paragraph)

Manual parser result:
Detected Senegalese plates:
----> AB-1234-DC
----> BC-6782-E
----> XY-1234-ZT
----> MN-4321-ZT
----> PQ-5678-TY
----> UV-9876-T
----> BA-2333-X
----> CD-3456-Z
----> EF-4567-YT
Total unique plates found: 9

Regex parser result:
Detected Senegalese plates (regex):
----> AB-1234-DC
----> BC-6782-E
----> XY-1234-ZT
----> MN-4321-ZT
----> PQ-5678-TY
----> UV-9876-T
----> BA-2333-X
----> CD-3456-Z
----> EF-4567-YT
Total unique plates found: 9


# Project Report: Senegalese License Plate Detection System

## Abstract
This project builds a simple Python program that finds Senegalese vehicle license plate numbers that appear inside any piece of text. It works without using external libraries. It supports two valid formats, accepts both hyphens and single spaces as separators, ignores letter case, and prints all plates it finds in a clean, standard form. We also add a second version that uses a regular expression (regex) to do the same job in fewer lines. The goal is clarity, correctness, and ease of understanding.

## 1. Introduction
We often need to pull structured items (like license plates) out of unstructured text (paragraphs, notes, reports). This project shows how to do that for Senegalese license plates using two approaches:
1. A manual step‑by‑step parser (no regex)
2. A regex-based version (shorter, but less explicit)

## 2. What Counts as a Valid Plate
Accepted shapes (letters = A–Z, digits = 0–9):
- XY-1234-T
- XY-1234-ZT

The separators may also be single spaces instead of hyphens: `XY 1234 T` or `XY 1234 ZT`.

Rules:
- First two characters: letters
- Separator: hyphen (-) or single space
- Four digits
- Separator: hyphen (-) or single space
- Last part: 1 or 2 letters
- No extra letters/digits glued directly to the left or right
- Case-insensitive input; output always uppercase with hyphens

Examples that SHOULD match:
- ab-1234-e
- ab 1234 e
- mn-4321-zt
- uv 9876 t

Examples that SHOULD NOT match:
- AAB-1234-T (three starting letters)
- AB-12345-T (five digits)
- AB_1234_T (underscores are not allowed separators)
- AB-1234-TXQ (last part too long)

## 3. Main Goals
- Detect any valid plate in a text
- Show each unique plate once, in order of appearance
- Print a helpful message if none are found
- Make the code easy to read and extend
- Provide an alternative regex version for comparison

## 4. Approach 1: Manual Parsing (No Regex)
### Idea in Plain Words
We walk through the text one character at a time. At each position we check if the next characters could form a valid plate. If not, we move one step forward. If yes, we store it (in a standard format) and then jump ahead past it.

### Steps
1. Convert the whole text to uppercase (so we only compare one case).
2. Check two letters.
3. Check a separator (hyphen or space).
4. Check four digits.
5. Check a separator.
6. Collect one or two ending letters.
7. Make sure there is a boundary (start/end of text OR a non-letter/digit) before and after.
8. Normalize to the form: `XY-1234-T` or `XY-1234-ZT`.
9. Avoid duplicates.
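
The nine steps can be sketched compactly as follows. This is an illustrative variant, not the notebook's function: it returns the list of plates instead of printing them, and it assumes that "letter" and "digit" mean ASCII A-Z and 0-9 only. The names `find_plates`, `all_in`, and the three constants are hypothetical.

```python
ASCII_LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
ASCII_DIGITS = "0123456789"
ASCII_ALNUM = ASCII_LETTERS + ASCII_DIGITS  # assumption: boundaries are ASCII-only too


def all_in(chunk, allowed):
    """True if chunk is non-empty and every character is in allowed."""
    return len(chunk) > 0 and all(ch in allowed for ch in chunk)


def find_plates(text):
    """Return unique plates in order of appearance (hypothetical helper)."""
    text = text.upper()                                   # step 1
    plates = []
    i, n = 0, len(text)
    while i + 8 < n:                                      # shortest plate has 9 characters
        shape_ok = (
            all_in(text[i:i + 2], ASCII_LETTERS)          # step 2
            and text[i + 2] in "- "                       # step 3
            and all_in(text[i + 3:i + 7], ASCII_DIGITS)   # step 4
            and text[i + 7] in "- "                       # step 5
        )
        if not shape_ok:
            i += 1
            continue

        end = i + 8                                       # step 6: take 1 or 2 letters
        while end < n and end < i + 10 and text[end] in ASCII_LETTERS:
            end += 1
        tail = text[i + 8:end]

        before_ok = i == 0 or text[i - 1] not in ASCII_ALNUM   # step 7
        after_ok = end == n or text[end] not in ASCII_ALNUM
        if tail and before_ok and after_ok:
            plate = f"{text[i:i + 2]}-{text[i + 3:i + 7]}-{tail}"  # step 8
            if plate not in plates:                       # step 9
                plates.append(plate)
            i = end                                       # jump past the plate
        else:
            i += 1
    return plates


print(find_plates("Seen ab 1234 e, then MN-4321-ZT and ab-1234-E."))
```

Returning a list rather than printing makes the result easy to compare in tests; the printing and the Boolean return value required by the project can then be layered on top.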

### Strengths / Weaknesses
Strengths: Transparent, easy to modify, no hidden behavior.  
Weaknesses: More lines of code than a regex.

## 5. Approach 2: Regex Version
The regex encodes the same rules in a single pattern, which makes the code much shorter. Output is normalized the same way (always uppercase, with hyphens). A negative lookbehind and a negative lookahead prevent partial matches inside larger alphanumeric tokens.

### Explanation of the Regex pattern
- `(?i)` ignore case
- `(?<![A-Z0-9])` left side is not a letter/digit
- `([A-Z]{2})` two letters
- `[- ]` separator
- `(\d{4})` four digits
- `[- ]` separator
- `([A-Z]{1,2})` one or two letters
- `(?![A-Z0-9])` right side is not a letter/digit
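
For readability, the same pattern can also be written with `re.VERBOSE`, which lets each piece carry its own comment. This is a sketch of an equivalent spelling, not the form used in the notebook; the name `PLATE_VERBOSE` is made up here, and `re.IGNORECASE` is passed as a flag in place of the inline `(?i)`.

```python
import re

PLATE_VERBOSE = re.compile(
    r"""
    (?<![A-Z0-9])     # left boundary: previous character is not a letter or digit
    ([A-Z]{2})        # group 1: two letters
    [- ]              # separator: hyphen or single space
    (\d{4})           # group 2: four digits
    [- ]              # separator: hyphen or single space
    ([A-Z]{1,2})      # group 3: one or two letters
    (?![A-Z0-9])      # right boundary: next character is not a letter or digit
    """,
    re.IGNORECASE | re.VERBOSE,
)

m = PLATE_VERBOSE.search("The car xy 1234 zt. was parked outside")
print(m.groups() if m else None)
```

In verbose mode, whitespace in the pattern is ignored except inside a character class, so the space in `[- ]` still counts as a separator.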

## 6. Added Comparison Helper
We included a helper function `compare_detectors()` so both methods can be run on the same sample text to show they agree.

## 7. Testing
We built a test paragraph mixing valid plates, an invalid near-miss (`BAD-233-X`), both separators, one- and two-letter endings, and case variations. Both methods returned the same nine plates.
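
Because both detectors print their results instead of returning them, agreement is otherwise checked by eye. One possible way to automate the comparison is to capture the printed output and extract the plate lines. The sketch below assumes the `---->` prefix used in this notebook; `captured_plates` is a hypothetical helper and `stand_in_detector` is only a placeholder that mimics the output format so the block runs on its own.

```python
import contextlib
import io


def captured_plates(detector, text):
    """Run a printing detector and return the plates it printed, in order."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        detector(text)
    prefix = "----> "
    return [
        line[len(prefix):]
        for line in buffer.getvalue().splitlines()
        if line.startswith(prefix)
    ]


def stand_in_detector(text):
    # Placeholder that imitates the notebook's output format; it ignores its input.
    print("Detected Senegalese plates:")
    print("---->", "AB-1234-DC")
    print("Total unique plates found: 1")
    return True


plates = captured_plates(stand_in_detector, "ignored")
print(plates)
```

In the notebook, the two real functions could be passed in place of the stand-in, for example by asserting that `captured_plates(senegalese_plate_number_detector, testing_paragraph)` equals `captured_plates(detect_senegalese_plates_regex, testing_paragraph)`.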

## 8. Results (In Simple Terms)
- All expected plates were detected.
- No incorrect matches were printed for the test examples.
- Output is consistent and clean.

## 9. Limitations
- No fuzzy matching (typos and OCR mistakes are not handled).
- Only the two Senegalese plate formats described above are supported.
- Only the pattern is checked, not whether the plate was officially issued.

## 10. Conclusion
We now have two clear ways to find Senegalese license plates inside any text: a manual parser and a regex version. Both meet the goals of correctness, clarity, and simple output. The manual version is ideal for learning and future tweaks. The regex version is concise and practical. 