[![Labellerr](https://storage.googleapis.com/labellerr-cdn/%200%20Labellerr%20template/notebook.webp)](https://www.labellerr.com)

# **Fashion Brand Tag OCR**

---

[![labellerr](https://img.shields.io/badge/Labellerr-BLOG-black.svg)](https://www.labellerr.com/blog/<BLOG_NAME>)
[![Youtube](https://img.shields.io/badge/Labellerr-YouTube-b31b1b.svg)](https://www.youtube.com/@Labellerr)
[![Github](https://img.shields.io/badge/Labellerr-GitHub-green.svg)](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)

## Overview

This notebook demonstrates an end-to-end computer vision pipeline for automated retail price tag analysis using a YOLO-based detection model combined with optical character recognition (OCR). The workflow covers real-time tag detection, intelligent text extraction, and inventory logging, designed to streamline data entry in retail environments.

#### Key Features:
* **Hybrid Architecture:** Combines YOLOv11 for precise object detection (Brand, Price, Size) with EasyOCR for robust text extraction.
* **Lightweight Pre-processing:** Converts each detected crop to grayscale and upscales it 2x with cubic interpolation so that small print is easier for EasyOCR to read.
* **Surgical OCR Logic:** Features custom "surgical" cropping techniques to eliminate currency symbols (e.g., ₹) that confuse standard OCR engines, ensuring high-accuracy price validation.
* **Global Context Awareness:** Utilizes full-tag fuzzy matching to identify brands even when specific brand bounding boxes are missed or occluded.
* **Automated Inventory Logging:** Validates the extracted price and appends each scan (Brand, Size, MRP, and a timestamp) to an Excel log in real time.

#### Real-World Applications:
* Automated inventory management and stock auditing
* Retail checkout automation and smart shopping systems
* Price verification and compliance monitoring
* Data digitization for e-commerce cataloging
* Supply chain tracking for fashion and apparel industries

## Import Libraries

This section imports all the required libraries used throughout the project for computer vision, visualization, deep learning, and structured coding.

In [None]:
import cv2
from pathlib import Path
from ultralytics import YOLO

In [None]:
# !git clone https://github.com/Labellerr/yolo_finetune_utils.git

## 📥 Download Annotations from Labellerr

After completing data labeling on the **Labellerr** platform, export the annotations in **COCO JSON format**.

Download the COCO JSON file from the Labellerr website and upload it into this project workspace to use it for further dataset preparation and training.

This COCO JSON file will be used in the next steps for:
- Frame–annotation alignment
- COCO → YOLO format conversion
- Model training and evaluation
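
To make the COCO → YOLO step concrete, the sketch below shows the standard box conversion. COCO stores each box as `[x_min, y_min, width, height]` in pixels, while a YOLO label line stores `class x_center y_center width height` normalized by the image size. The function names `coco_box_to_yolo` and `yolo_label_line` are illustrative assumptions; they are not part of `yolo_finetune_utils`, whose internals are not shown in this notebook.

```python
def coco_box_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized (xc, yc, w, h)."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w   # box center, as a fraction of image width
    y_center = (y_min + h / 2) / img_h   # box center, as a fraction of image height
    return x_center, y_center, w / img_w, h / img_h


def yolo_label_line(class_id, bbox, img_w, img_h):
    """Format one annotation as a line of a YOLO .txt label file."""
    xc, yc, w, h = coco_box_to_yolo(bbox, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Class 2 (MRP) box on a 1000x800 image
print(yolo_label_line(2, [100, 200, 300, 100], 1000, 800))
# 2 0.250000 0.312500 0.300000 0.125000
```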


In [3]:
from yolo_finetune_utils.coco_yolo_converter.bbox_converter import coco_to_yolo_converter

coco_to_yolo_converter(
    json_path="export-#LsYK4074YWR3WypfuCLR.json",
    images_dir="drive-download-20260122T045027Z-1-001",
    output_dir="yolo_dataset",
    use_split=True,
    train_ratio=0.7,
    val_ratio=0.2,
    test_ratio=0.1,
    verbose=True
)


Loading COCO dataset from export-#LsYK4074YWR3WypfuCLR.json
Found 98 images and 270 annotations
Categories mapping:
  COCO ID 0 (Brand) -> YOLO class 0
  COCO ID 1 (Size) -> YOLO class 1
  COCO ID 2 (MRP) -> YOLO class 2
Images with annotations: 92
Dataset split:
  train: 64 images
  val: 18 images
  test: 10 images

Processing train split...

Processing val split...

Processing test split...

Conversion completed:
  Successfully processed: 92 images
  Failed to find: 0 images
  Total annotations converted: 270
  Categories: 3

YOLO dataset created at: yolo_dataset
Dataset configuration: yolo_dataset\dataset.yaml


{'output_path': 'yolo_dataset',
 'yaml_path': 'yolo_dataset\\dataset.yaml',
 'stats': {'total_images': 98,
  'images_with_annotations': 92,
  'successful_copies': 92,
  'failed_copies': 0,
  'total_annotations': 270,
  'categories': 3,
  'category_mapping': {0: 0, 1: 1, 2: 2},
  'class_names': {0: 'Brand', 1: 'Size', 2: 'MRP'}}}
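
The split sizes in the log above can be reproduced by hand. The sketch below assumes the train and validation counts are truncated to integers and the test split receives the remainder. That rule is an assumption that happens to match the 64/18/10 output; it is not a description of the converter's actual code.

```python
def split_counts(n_images, train_ratio, val_ratio):
    """Return (train, val, test) counts; test takes whatever is left over."""
    n_train = int(n_images * train_ratio)   # 92 * 0.7 = 64.4 -> 64
    n_val = int(n_images * val_ratio)       # 92 * 0.2 = 18.4 -> 18
    n_test = n_images - n_train - n_val     # remainder -> 10
    return n_train, n_val, n_test


# 92 images carry annotations, so only those are split
print(split_counts(92, 0.7, 0.2))  # (64, 18, 10)
```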

### Model Training

The following script trains the YOLOv11 Nano model on the custom dataset defined in `dataset.yaml` for 100 epochs so that it learns to localize the three annotated classes: Brand, Size, and MRP.

In [None]:
# Initialize YOLOv11 Nano model
model = YOLO('yolo11n.pt')

# Train the model
results = model.train(
    data="yolo_dataset/dataset.yaml",      # Path to the yaml file created above (forward slash avoids the invalid "\d" escape)
    epochs=100,         # 100 full passes over the training set
    imgsz=640,          # Standard image size
    batch=16,           # Batch size (reduce to 8 if you run out of memory)
    project="Tag_OCR_Project",
    name="run_tag_detection",
    device="cpu"        # Change to 0 if you have an NVIDIA GPU setup
)

print(" Training Finished!")

#  Inference

This script is the central engine for the Price Tag OCR project, coordinating object detection (YOLO), text recognition (EasyOCR), and database logging (Pandas).

---

##  1. Libraries & Dependencies

| Library | Purpose |
| :--- | :--- |
| **cv2** (OpenCV) | Image processing: converting to grayscale, resizing crops, and drawing boxes. |
| **re** | Regex for cleaning OCR output (extracting 1299 from Rs. 1,299). |
| **pandas** | Database management: reading/writing the inventory_log.xlsx file. |
| **ultralytics** | Loads the custom YOLOv11 model (best.pt) for object detection. |
| **easyocr** | The Deep Learning engine that reads text from image crops. |
| **difflib** | Fuzzy logic to correct brand typos (e.g., matching "PEPE JENS" to "PEPE JEANS"). |
| **tkinter** | Creates the Windows file upload popup for "Upload Mode". |

---

##  2. Key Configuration

* **MODEL_PATH**: Points to your trained YOLO weights (best.pt).
* **CONFIDENCE_THRESHOLD = 0.3**: Ignores MRP detections below 30% confidence to reduce noise. Size detections use a lower cutoff of 0.15, which is hard-coded in `process_frame`.
* **KNOWN_BRANDS**: A strict allow-list. The system only logs brands that fuzzy-match this list, preventing random text from being saved as a brand.
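
As a rough illustration of the allow-list idea, the sketch below matches OCR text against a short brand list with `difflib.get_close_matches`. It is a simplified variant written for this explanation: the list `BRANDS` and the helper `match_brand` are hypothetical, and the sketch compares the whole string at once, whereas the notebook's `fuzzy_match_brand` scans the text word by word.

```python
from difflib import get_close_matches

# Hypothetical, shortened allow-list used only for this example
BRANDS = ["PEPE JEANS", "SPYKAR", "LIFESTYLE"]


def match_brand(ocr_text, brands=BRANDS, cutoff=0.6):
    """Return the allow-listed brand that best matches ocr_text, or None."""
    text = ocr_text.upper().strip()
    # 1) Exact substring hit: cheapest and most reliable
    for brand in brands:
        if brand in text:
            return brand
    # 2) Fuzzy fallback: tolerate OCR typos down to the similarity cutoff
    close = get_close_matches(text, brands, n=1, cutoff=cutoff)
    return close[0] if close else None


print(match_brand("Pepe Jens"))   # PEPE JEANS
print(match_brand("wash care"))   # None
```

Raising `cutoff` makes the match stricter, which trades missed brands for fewer false positives.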

---

##  3. Core Logic Breakdown

### **A. clean_price(text)**
Sanitizes raw OCR output.
* **Logic:** Removes junk tokens (MRP, ₹, Rs., commas, spaces) and uses the regex `\d+` to grab the first run of digits, so `Rs. 1,299` becomes `1299`.
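
A compact way to see the effect of this cleaning step is the sketch below. It is a reduced version for illustration only: the helper name `extract_price` is an assumption, and it strips only commas and whitespace before applying `\d+`, while the notebook's `clean_price` removes a longer list of junk tokens.

```python
import re


def extract_price(raw_text):
    """Return the first run of digits in raw_text, or None if there is none."""
    # Drop thousands separators and whitespace so "1,299" becomes "1299"
    compact = re.sub(r"[,\s]", "", raw_text)
    match = re.search(r"\d+", compact)
    return match.group(0) if match else None


for sample in ["Rs. 1,299", "MRP ₹ 1,499.00", "PRICE"]:
    print(repr(sample), "->", extract_price(sample))
# 'Rs. 1,299' -> 1299
# 'MRP ₹ 1,499.00' -> 1499
# 'PRICE' -> None
```

Because `\d+` stops at the first non-digit, the decimal part of `1,499.00` is discarded rather than merged into the price.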

### **B. process_frame(frame, model, reader)**
The main pipeline running on every image. It has two stages:

**Stage 1: YOLO Detection (Price & Size)**
* **The "Left Cut":** For MRP boxes, it keeps only the columns between 20% and 95% of the crop width. Dropping the **left 20%** physically removes the Rupee symbol (`₹`) so OCR doesn't misread it as a 2.
* **Upscaling:** Resizes crops (2x) to make small text clearer for EasyOCR.
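
The "Left Cut" is plain column slicing. The sketch below works on a single row of pixel indices so it runs without OpenCV; the helper `left_cut_columns` is hypothetical, and it uses integer percentages where the notebook uses `int(crop_w * 0.20)` and `int(crop_w * 0.95)`.

```python
def left_cut_columns(crop_width, left_pct=20, right_pct=95):
    """Return the (start, end) column indices kept after trimming the crop."""
    start = crop_width * left_pct // 100    # skip the currency symbol on the left
    end = crop_width * right_pct // 100     # trim a thin strip on the right edge
    return start, end


row = list(range(200))                      # stand-in for one row of a 200 px wide crop
start, end = left_cut_columns(len(row))
trimmed = row[start:end]
print(start, end, len(trimmed))             # 40 190 150
```

On a real crop the same indices would be applied as `crop[:, start:end]`, which keeps every row and trims only the columns.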

**Stage 2: Global Context (Full Tag Scan)**
* **Trigger:** Runs *only* if a valid Price is found.
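* **Full Scan:** Runs OCR on the whole frame (downscaled to 1600 px wide if larger) and fuzzy-matches the text against `KNOWN_BRANDS` to find the **Brand Name**. If YOLO found no Size, the same text is searched for common size tokens.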
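
The inference script also uses the full-tag text as a fallback for the size when no Size box was read. A minimal sketch of that token lookup follows; `COMMON_SIZES` is shortened here and `find_size` is an illustrative name, not a function from the notebook.

```python
# Shortened size list for illustration; list order decides which size wins
COMMON_SIZES = ["XS", "S", "M", "L", "XL", "XXL", "30", "32", "34", "36"]


def find_size(full_text, sizes=COMMON_SIZES):
    """Return the first listed size that appears as a whole token, or None."""
    tokens = full_text.upper().split()
    for size in sizes:
        # Whole-token comparison: "S" must not match inside "JEANS"
        if size in tokens:
            return size
    return None


print(find_size("Pepe Jeans slim fit XL MRP 1299"))  # XL
print(find_size("SPYKAR MRP 1299"))                  # None
```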

### **C. save_to_excel(data)**
Handles data persistence.
* **Item ID:** Builds an ID (Brand_Price) for each row. The current implementation does not check this ID against existing rows, so every scan is appended as a new row.
* **Logging:** Appends valid data (Brand, Size, MRP, Time) to inventory_log.xlsx.
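
If duplicate suppression is wanted, the Brand_Price ID can act as the key. The sketch below shows one way to do that on a plain list of row dictionaries; `append_if_new` is a hypothetical helper, and it is not what the notebook's `save_to_excel` does, since that function's docstring states that it saves every scan as a new row.

```python
def append_if_new(log, entry):
    """Append entry to log unless a row with the same ID already exists."""
    seen_ids = {row["ID"] for row in log}
    if entry["ID"] in seen_ids:
        return False            # duplicate scan: leave the log unchanged
    log.append(entry)
    return True


log = []
first = {"ID": "SPYKAR_1299", "Brand": "SPYKAR", "Size": "32", "MRP": "1299"}
print(append_if_new(log, first))        # True
print(append_if_new(log, dict(first)))  # False
print(len(log))                         # 1
```

In the webcam loop this kind of check would keep a tag held in front of the camera from being logged once per frame.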

---

##  4. Execution Modes

* **Mode 1 (Webcam):** Sets camera to HD (1280x720) for clarity and runs in a live loop.
* **Mode 2 (Upload):** Uses a hidden Tkinter window (forced to top) to open a file picker for high-res static images.

In [None]:
import cv2
import re
import pandas as pd
from ultralytics import YOLO
import easyocr
from datetime import datetime
from difflib import get_close_matches
import tkinter as tk
from tkinter import filedialog
import os
import numpy as np


MODEL_PATH = "best (5).pt"  
CONFIDENCE_THRESHOLD = 0.3 


KNOWN_BRANDS = [
    "MAXFASHION", "MAX", 
    "PEPE JEANS", "PEPE", 
    "ARVIND YOUTH", "ARVIND",
    "ADITYA BIRLA FASHION", "ADITYA BIRLA", "MADURA FASHION",
    "SPYKAR", "LIFESTYLE", "LIFESTYLE INTERNATIONAL", "MELANGE", "GINGER", "CODE"
]

def clean_price(text):
    """ Robust Price Cleaner """
    text = text.upper()
    for junk in [',', ' ', 'MRP', 'RS', 'RS.', 'INR', '₹', 'PRICE', ':', 'INCL', 'TAXES', 'ALL', 'OF', '.00', '/-']:
        text = text.replace(junk, '')
        
    match = re.search(r'(\d+)', text)
    if match: return match.group(1)
    return None

def fuzzy_match_brand(full_text):
    """ Scans the WHOLE tag text for the brand """
    if not full_text: return None
    full_text_upper = full_text.upper()
    
    
    for brand in KNOWN_BRANDS:
        if brand in full_text_upper: return brand

    
    words = full_text_upper.split()
    for word in words:
        if len(word) < 3: continue 
        matches = get_close_matches(word, KNOWN_BRANDS, n=1, cutoff=0.6)
        if matches: return matches[0]
    return None

def process_frame(frame, model, reader):
    results = model(frame, stream=True, verbose=False)
    data = {"Brand": None, "Size": None, "MRP": None}
    price_detected = False
    h, w, _ = frame.shape

    for r in results:
        for box in r.boxes:
            conf = float(box.conf[0])
            cls_name = model.names[int(box.cls[0])]
            
           
            if cls_name == 'Brand': 
                continue

            thresh = 0.15 if cls_name == 'Size' else CONFIDENCE_THRESHOLD
            if conf < thresh: continue
            
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            color = (0, 255, 0) if cls_name == 'MRP' else (0, 165, 255)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 3)
            
            
            pad = 5
            crop = frame[max(0, y1-pad):min(h, y2+pad), max(0, x1-pad):min(w, x2+pad)]
            
            try:
                
                if cls_name == 'MRP':
                    crop_h, crop_w = crop.shape[:2]
                    crop = crop[:, int(crop_w * 0.20) : int(crop_w * 0.95)]
                
            
                gray = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
                gray = cv2.resize(gray, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
                
                
                ocr_result = reader.readtext(gray, detail=0) 
                text_val = " ".join(ocr_result)
                
                print(f"RAW READ ({cls_name}): {text_val}") 
                
                if text_val:
                    if cls_name == 'MRP':
                        cleaned = clean_price(text_val)
                        if cleaned:
                            data["MRP"] = cleaned
                            price_detected = True
                            print(f"üí∞ Price Validated: {cleaned}")
                            
                    elif cls_name == 'Size':
                        size_clean = text_val.split()[0].replace(',', '').replace('.', '')
                        data["Size"] = size_clean
                    
                    cv2.putText(frame, f"{cls_name}: {text_val}", (x1, y1-15), 
                                cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
            except Exception as e:
                print(f"Error: {e}")

    if price_detected:
        print("Reading Full Tag for Brand...")
        try:
            
            scan_frame = frame.copy()
            if w > 1600:
                scale = 1600 / w
                scan_frame = cv2.resize(scan_frame, None, fx=scale, fy=scale)
            
            full_text_list = reader.readtext(scan_frame, detail=0)
            full_text = " ".join(full_text_list)
            
            found_brand = fuzzy_match_brand(full_text)
            if found_brand:
                data["Brand"] = found_brand
                cv2.rectangle(frame, (0, 0), (w, 80), (0, 255, 0), -1)
                cv2.putText(frame, f"BRAND: {found_brand}", (20, 60), 
                           cv2.FONT_HERSHEY_SIMPLEX, 1.5, (255, 255, 255), 4)

            
            if not data["Size"]:
                common_sizes = ['XS', 'S', 'M', 'L', 'XL', 'XXL', '30', '32', '34', '36', '38', '40', '42', '44']
                full_text_words = full_text.upper().replace('\n', ' ').split()
                for size in common_sizes:
                    if size in full_text_words:
                        data["Size"] = size
                        break

        except Exception as e:
            print(f"Global Error: {e}")
            
    return frame, data

def save_to_excel(data):
    """
    Saves data to Excel WITHOUT checking for duplicates.
    Every scan = New Row.
    """
    if data["MRP"]:
        final_brand = data["Brand"] if data["Brand"] else "Unknown"
        final_size = data["Size"] if data["Size"] else "N/A"
        
        try:
            df = pd.read_excel("inventory_log.xlsx")
            inventory_log = df.to_dict('records')
        except FileNotFoundError:
            # First run: no log file yet, so start with an empty list
            inventory_log = []

        
        item_id = f"{final_brand}_{data['MRP']}" 
        
        print(f"SAVING: {final_brand} | {final_size} | {data['MRP']}")
        entry = {
            "ID": item_id, 
            "Brand": final_brand, 
            "Size": final_size, 
            "MRP": data["MRP"], 
            "Time": datetime.now().strftime("%H:%M:%S")
        }
        
        inventory_log.append(entry)
        pd.DataFrame(inventory_log).to_excel("inventory_log.xlsx", index=False)
        print("Excel Updated!")

if __name__ == "__main__":
    print("⏳ Loading YOLO...")
    model = YOLO(MODEL_PATH)
    print("⏳ Loading EasyOCR...")
    reader = easyocr.Reader(['en'], gpu=False)
    
    mode = input("Select Mode:\n[1] Live Webcam\n[2] Upload Image\n>>> ")

    if mode == '2':
        root = tk.Tk(); root.withdraw(); root.attributes('-topmost', True)
        print(" Pick your image!")
        # A tuple of patterns works on every platform; "*.jpg;*.png" is Windows-only syntax
        file_path = filedialog.askopenfilename(parent=root, filetypes=[("Images", ("*.jpg", "*.png", "*.jpeg"))])
        
        if file_path:
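            img = cv2.imread(file_path)
            if img is None:
                # cv2.imread returns None instead of raising when the file cannot be decoded
                raise SystemExit(f"Could not read image: {file_path}")
            processed_img, result_data = process_frame(img, model, reader)
            save_to_excel(result_data)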
            
            h, w = processed_img.shape[:2]
            if h > 800:
                scale = 800 / h
                processed_img = cv2.resize(processed_img, None, fx=scale, fy=scale)
                
            cv2.imshow("Result", processed_img)
            cv2.waitKey(0)
            cv2.destroyAllWindows()
            
    else:
        
        cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
        print(" Webcam Started!")
        while True:
            ret, frame = cap.read()
            if not ret: break
            processed_frame, result_data = process_frame(frame, model, reader)
            if result_data["Brand"] and result_data["MRP"]:
                save_to_excel(result_data)
            cv2.imshow("Live", processed_frame)
            if cv2.waitKey(1) & 0xFF == ord('q'): break
        cap.release()
        cv2.destroyAllWindows()

---

## 👨‍💻 About Labellerr's Hands-On Learning in Computer Vision

Thank you for exploring this **Labellerr Hands-On Computer Vision Cookbook**! We hope this notebook helped you learn, prototype, and accelerate your vision projects.  
Labellerr provides ready-to-run Jupyter/Colab notebooks for the latest models and real-world use cases in computer vision, AI agents, and data annotation.

---
## 🧑‍🔬 Check Our Popular YouTube Videos

Whether you're a beginner or a practitioner, our hands-on training videos are perfect for learning custom model building, computer vision techniques, and applied AI:

- [How to Fine-Tune YOLO on Custom Dataset](https://www.youtube.com/watch?v=pBLWOe01QXU)  
  Step-by-step guide to fine-tuning YOLO for real-world use: environment setup, annotation, training, validation, and inference.
- [Build a Real-Time Intrusion Detection System with YOLO](https://www.youtube.com/watch?v=kwQeokYDVcE)  
  Create an AI-powered system to detect intruders in real time using YOLO and computer vision.
- [Finding Athlete Speed Using YOLO](https://www.youtube.com/watch?v=txW0CQe_pw0)  
  Estimate real-time speed of athletes for sports analytics.
- [Object Counting Using AI](https://www.youtube.com/watch?v=smsjBBQcIUQ)  
  Learn dataset curation, annotation, and training for robust object counting AI applications.
---

## 🎦 Popular Labellerr YouTube Videos

Level up your skills and see video walkthroughs of these tools and notebooks on the  
[Labellerr YouTube Channel](https://www.youtube.com/@Labellerr/videos):

- [How I Fixed My Biggest Annotation Nightmare with Labellerr](https://www.youtube.com/watch?v=hlcFdiuz_HI) – Solving complex annotation for ML engineers.
- [Explore Your Dataset with Labellerr's AI](https://www.youtube.com/watch?v=LdbRXYWVyN0) – Auto-tagging, object counting, image descriptions, and dataset exploration.
- [Boost AI Image Annotation 10X with Labellerr's CLIP Mode](https://www.youtube.com/watch?v=pY_o4EvYMz8) – Refine annotations with precision using CLIP mode.
- [Boost Data Annotation Accuracy and Efficiency with Active Learning](https://www.youtube.com/watch?v=lAYu-ewIhTE) – Speed up your annotation workflow using Active Learning.

> 👉 **Subscribe** for Labellerr's deep learning, annotation, and AI tutorials, or watch videos directly alongside notebooks!

---

## 🤝 Stay Connected

- **Website:** [https://www.labellerr.com/](https://www.labellerr.com/)
- **Blog:** [https://www.labellerr.com/blog/](https://www.labellerr.com/blog/)
- **GitHub:** [Labellerr/Hands-On-Learning-in-Computer-Vision](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)
- **LinkedIn:** [Labellerr](https://in.linkedin.com/company/labellerr)
- **Twitter/X:** [@Labellerr1](https://x.com/Labellerr1)

*Happy learning and building with Labellerr!*
