# Design a vision system capable of running on a Raspberry Pi 4 Model B with a Raspberry Pi camera.

The system reads the numbers shown on the displays of industrial sensors whose sampled data cannot be extracted directly.

## 1) Requirements (with links)

- **Python 3.9+**
  - Download/manage: [https://www.python.org/downloads/](https://www.python.org/downloads/)  
  - On Raspberry Pi OS it usually comes preinstalled (check with `python3 --version`).

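- **OpenCV (cv2)**
  - On Raspberry Pi OS, prefer the precompiled `python3-opencv` package from `apt` (see Quick Installation). On other platforms, install from PyPI without `sudo`:
    ```bash
    python3 -m pip install opencv-python
    ```
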
- **Tesseract OCR (system binary)**
  - Official site/docs: [tesseract-ocr.github.io](https://tesseract-ocr.github.io/)
  - **Windows (recommended installer):** [UB Mannheim – Tesseract for Windows](https://github.com/UB-Mannheim/tesseract/wiki)
  - **Linux (Debian/Ubuntu/Raspberry Pi OS):** available via `apt` as `tesseract-ocr` and language packages (e.g. `tesseract-ocr-eng`)

- **Python libraries**
  - OCR wrapper: [pytesseract (PyPI)](https://pypi.org/project/pytesseract/)
    ```bash
    python3 -m pip install pytesseract Pillow
    ```

- **Official Raspberry Pi Camera (if not using a USB webcam)**
  - Modern framework: **Picamera2**  
    Documentation: [Picamera2 – Raspberry Pi](https://www.raspberrypi.com/documentation/computers/camera_software.html#picamera2)
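
With the official camera module on current Raspberry Pi OS, `cv2.VideoCapture(0)` may not deliver frames, because the camera is driven through libcamera. Picamera2 can hand frames to OpenCV as NumPy arrays instead. The sketch below is one possible way to do that, not part of the original scripts: `open_picamera` and `PiCameraCapture` are hypothetical names, and it assumes Picamera2 is installed (for example via `sudo apt install -y python3-picamera2`).

```python
def open_picamera(size=(640, 480)):
    """Start the Pi camera and return the Picamera2 object.

    Assumption: the picamera2 package is installed. The import stays inside
    the function so this module still loads on machines without it.
    """
    from picamera2 import Picamera2

    picam2 = Picamera2()
    # In Picamera2, "RGB888" stores each pixel as B, G, R in memory,
    # which matches the channel order OpenCV expects.
    config = picam2.create_preview_configuration(
        main={"size": size, "format": "RGB888"}
    )
    picam2.configure(config)
    picam2.start()
    return picam2


class PiCameraCapture:
    """Small adapter (hypothetical) that mimics cv2.VideoCapture's read()/release()."""

    def __init__(self, camera):
        # `camera` is any object with capture_array() and stop(), e.g. a Picamera2
        self.camera = camera

    def read(self):
        """Return (ok, frame) like cv2.VideoCapture.read()."""
        frame = self.camera.capture_array()
        return frame is not None, frame

    def release(self):
        """Stop the underlying camera."""
        self.camera.stop()
```

On the Pi, `cap = PiCameraCapture(open_picamera())` could then stand in for `cv2.VideoCapture(0)`, since the capture loops only call `cap.read()` and `cap.release()`.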

---

## 2) Quick Installation

### A) Linux (Raspberry Pi OS / Ubuntu)

```bash
# Update and install Tesseract + English language
sudo apt update
sudo apt install -y tesseract-ocr tesseract-ocr-eng

# Precompiled OpenCV for Python (recommended on Raspberry Pi)
sudo apt install -y python3-opencv

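# Python tools and libraries
sudo apt install -y python3-pip
# Newer Raspberry Pi OS releases (Bookworm and later) block system-wide pip installs.
# If pip reports "externally-managed-environment", work inside a virtual environment
# that can still see the apt-installed OpenCV (the path ~/ocr-env is only an example):
#   python3 -m venv --system-site-packages ~/ocr-env && source ~/ocr-env/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install pytesseract Pillow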
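```

The first script reads digits from a fixed rectangle in the live camera image and overlays the result on the frame:

```python
import time

import cv2
import pytesseract

# Windows only: if Tesseract is not on PATH, point pytesseract at the binary.
# On Raspberry Pi OS / Linux the apt package puts `tesseract` on PATH, so leave this commented out.
# pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Initialize the camera (index 0 = first video device, e.g. a USB webcam)
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open camera")

# Define the rectangle (ROI) that covers the digits
x, y = 250, 200  # Top-left corner
w, h = 150, 50   # Width and height

# Tesseract config: single text line, digits only
TESS_CFG = r'--psm 7 -c tessedit_char_whitelist=0123456789'

try:
    while True:
        # Capture a frame from the camera
        ret, frame = cap.read()
        if not ret:
            print("Failed to grab frame")
            break

        # Run OCR on the ROI first, so the red outline drawn below
        # is not part of the OCR input
        roi = frame[y:y+h, x:x+w]
        text = pytesseract.image_to_string(roi, config=TESS_CFG).strip()

        # Draw the ROI rectangle on the frame
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)

        # Display the text just above the rectangle
        if text:
            cv2.putText(frame, f"Reading: {text}", (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

        # Display the frame
        cv2.imshow('Sensor Reading', frame)

        # Exit with the ESC key (key code 27)
        if cv2.waitKey(1) & 0xFF == 27:
            break
        time.sleep(0.01)  # Small delay to reduce CPU usage

finally:
    cap.release()
    cv2.destroyAllWindows()
```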
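
OCR on live frames tends to flicker: a single blurred frame can turn `123` into `128`. One common way to steady the output is a majority vote over the last few readings. The following is a minimal sketch of that idea, not part of the original script. The class name `ReadingSmoother` and the defaults `window=5` and `min_votes=3` are illustrative assumptions.

```python
from collections import Counter, deque


class ReadingSmoother:
    """Keep the last `window` OCR strings and report the most frequent one."""

    def __init__(self, window=5, min_votes=3):
        self.history = deque(maxlen=window)  # oldest readings drop out automatically
        self.min_votes = min_votes

    def update(self, text):
        """Add one OCR result; return the stable reading, or None if no value has enough votes."""
        if text:  # ignore empty OCR results
            self.history.append(text)
        if not self.history:
            return None
        value, votes = Counter(self.history).most_common(1)[0]
        return value if votes >= self.min_votes else None
```

In the capture loop, this could be used as `stable = smoother.update(text)`, drawing the overlay only when `stable` is not `None`.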


## Recommendations for 7-segment displays / digits

- Use `--psm 7` (single line) and restrict to digits with  
  `tessedit_char_whitelist=0123456789`.
- Crop a **ROI** (region of interest) tight around the digits.
- Preprocess: grayscale → light blur → Otsu threshold.  
  - If the background is light and digits are dark, try **BINARY**.  
  - If the opposite, use **BINARY_INV**.
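
The BINARY / BINARY_INV choice above can also be made automatically. The sketch below is a simple heuristic rather than a definitive method: it assumes the background covers more of the ROI than the digits, so the median intensity reflects the background. The function name `pick_threshold_mode` and the `midpoint` default are hypothetical.

```python
def pick_threshold_mode(gray_values, midpoint=127):
    """Return "BINARY" for a light background, "BINARY_INV" for a dark one.

    gray_values: flat iterable of 0-255 intensities,
                 e.g. gray.ravel().tolist() for a grayscale NumPy image.
    Assumption: the background occupies most of the ROI, so the median
    intensity is a background pixel.
    """
    values = sorted(gray_values)
    if not values:
        raise ValueError("empty ROI")
    median = values[len(values) // 2]
    return "BINARY" if median > midpoint else "BINARY_INV"
```

With OpenCV, the result can be mapped to the matching flag, for example `mode = getattr(cv2, "THRESH_" + pick_threshold_mode(gray.ravel().tolist()))`, and passed as `mode + cv2.THRESH_OTSU` to `cv2.threshold`. Either way the output has dark digits on a white background.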

### Minimal example (processing a frame already loaded as `frame`)

```python
import cv2
import pytesseract

# Suppose you already have a frame from camera in 'frame'
# and defined a rectangle ROI (x, y, w, h) that covers the digits:
x, y, w, h = 250, 200, 150, 50
roi = frame[y:y+h, x:x+w]

# --- Preprocessing ---
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)            # Grayscale
gray = cv2.GaussianBlur(gray, (3, 3), 0)                # Light blur
_, bw = cv2.threshold(gray, 0, 255,                     # Otsu threshold
                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Tesseract config: default OCR engine (--oem 3), single line, digits only
tess_cfg = r'--oem 3 --psm 7 -c tessedit_char_whitelist=0123456789'

# OCR
text = pytesseract.image_to_string(bw, config=tess_cfg).strip()
print("OCR Reading:", text)
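```

The full script applies the same preprocessing to a live camera feed and shows only the binarized ROI:

```python
# -----------------------------------------------------------
# Black-and-white preprocessing for 7-segment display OCR
# -----------------------------------------------------------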
import re
import time
import cv2
import pytesseract

# -----------------------------------------------------------
# CONFIGURATION
# -----------------------------------------------------------
# (Windows) If Tesseract is not in PATH, set it here:
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

CAM_INDEX = 0  # 0 = default webcam (/dev/video0 in Linux)

# ROI (Region of Interest) where the digits are located
x, y = 200, 200
w, h = 200, 100

# Tesseract config optimized for 7-segment digits
TESS_CFG = r'--oem 3 --psm 7 -c tessedit_char_whitelist=0123456789'

# -----------------------------------------------------------
# FUNCTIONS
# -----------------------------------------------------------
def preprocess_for_7seg(roi_bgr):
    """Preprocess ROI for OCR on 7-segment displays."""
    gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (3, 3), 0)
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return bw

def ocr_digits(img):
    """Run OCR restricted to digits only."""
    txt = pytesseract.image_to_string(img, config=TESS_CFG)
    digits = re.sub(r'[^0-9]', '', txt)  # keep only digits
    return digits

# -----------------------------------------------------------
# CAPTURE
# -----------------------------------------------------------
cap = cv2.VideoCapture(CAM_INDEX, cv2.CAP_ANY)
if not cap.isOpened():
    raise RuntimeError("Could not open camera")

try:
    while True:
        ok, frame = cap.read()
        if not ok:
            print("Failed to grab frame")
            break

        # Extract ROI
        roi = frame[y:y+h, x:x+w]
        if roi.size == 0:
            continue

        # Enlarge ROI to help OCR
        roi = cv2.resize(roi, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

        # Preprocess for 7-segment
        bw = preprocess_for_7seg(roi)

        # OCR
        text = ocr_digits(bw)

        # Display text directly on the ROI
        if text:
            cv2.putText(bw, f"Reading: {text}", (10, 25),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0,), 2)

        # Show only the processed ROI in BW
        cv2.imshow("ROI - 7 Segment BW", bw)

        # Exit with ESC
        if cv2.waitKey(1) & 0xFF == ord('\x1b'):
            break

        time.sleep(0.01)

finally:
    cap.release()
    cv2.destroyAllWindows()
```
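
Because the sensors cannot export their data, it may be useful to store each reading with a timestamp. The sketch below is an optional addition, not part of the original scripts. It uses only the standard library, and the helper name `log_reading` is hypothetical.

```python
import csv
from datetime import datetime


def log_reading(stream, text, when=None):
    """Append one `timestamp,reading` row to an open text stream.

    Empty readings are skipped. Returns True if a row was written.
    """
    if not text:
        return False
    when = when or datetime.now()
    csv.writer(stream).writerow([when.isoformat(timespec="seconds"), text])
    return True
```

Inside the loop this could be called as `log_reading(f, text)`, with `f` opened once beforehand via `open("readings.csv", "a", newline="")` (the file name is only an example).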
