A system that uses a camera to scan physical Braille and convert it to English text in real time.
Two-stage YOLOv8 + CNN pipeline:
- YOLOv8 detects Braille cells in camera images (63-class dot-pattern detection)
- CNN classifies each cropped cell (optional refinement stage)
- Post-processing maps detected dot patterns to English Braille Grade 1 characters
All available datasets were normalized into a single YOLO-format dataset using 63 dot-pattern classes (the complete set of all possible 6-dot Braille combinations).
| Dataset | Source | Original Format | Original Classes | Images | Annotations | Converted |
|---|---|---|---|---|---|---|
| grade1_dataset_updated | Roboflow (thesis-vglk4/grade1-0t4ax) | YOLO txt | 67 (Tagalog Grade 1) | 10,807 | 93,497 | ✅ 67→63 |
| AngelinaDataset | Ilya Ovodov (ICCV 2021) | CSV (normalized) | 63 (dot patterns) | 290 | 72,250 | ✅ CSV→YOLO |
| BrailleDataset (Roboflow) | Roboflow (flamengo/meu-dataset) | YOLO txt | 48 (Portuguese) | 8,031 | 26,817 | ✅ 48→63 |
| char_label (segment_natural) | Natural Braille Character | LabelMe JSON / VOC XML / ICDAR TXT | 63 (dot patterns) | 1,157 | 10,853 | ✅ JSON→YOLO |
| TOTAL | | | | 21,595 | 203,417 | |
| Dataset | Reason |
|---|---|
| Braille-Iberoamericano | No data.yaml — class mapping unknown (34 classes) |
| segment_label | Dot-level annotations, not cell-level |
| Braille Alphabet Image Dataset (A–Z) | Classification only — no bounding boxes |
| Braille Dataset (28×28) | Classification only — no bounding boxes |

    unified_dataset/
    ├── data.yaml                  # 63-class YOLO config
    ├── patterns_to_english.csv    # Dot pattern → English character mapping
    ├── train/
    │   ├── images/ (16,972)       # Training images
    │   └── labels/ (16,972)       # YOLO .txt labels
    ├── val/
    │   ├── images/ (2,750)        # Validation images
    │   └── labels/ (2,750)
    └── test/
        ├── images/ (1,873)        # Test images
        └── labels/ (1,873)

Each Braille cell is one of 63 possible 6-dot combinations (2^6 − 1, excluding the blank cell). The YOLO class ID (0–62) maps to a dot pattern (1–63), where each dot position has a bit value:
| Dot | Position | Value |
|---|---|---|
| 1 | top-left | 1 |
| 2 | middle-left | 2 |
| 3 | bottom-left | 4 |
| 4 | top-right | 8 |
| 5 | middle-right | 16 |
| 6 | bottom-right | 32 |
The pattern value is the sum of the values of the raised dots. For example, dots 1+4+5 = 1+8+16 = 25, which is the English letter d.
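The encoding can be sketched in a few lines of Python. The helper names below (`dots_to_pattern`, `pattern_to_dots`, `pattern_to_class_id`) are illustrative and not taken from the repository, and the `pattern - 1` offset is an assumption based on the 0–62 and 1–63 ranges stated above.

```python
# Bit value of each dot position, as in the table above.
DOT_VALUES = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32}


def dots_to_pattern(dots):
    """Sum the bit values of the raised dots, e.g. {1, 4, 5} -> 25."""
    return sum(DOT_VALUES[d] for d in set(dots))


def pattern_to_dots(pattern):
    """Inverse of dots_to_pattern: list the raised dots of a pattern (1-63)."""
    if not 1 <= pattern <= 63:
        raise ValueError(f"pattern out of range: {pattern}")
    return [d for d, v in DOT_VALUES.items() if pattern & v]


def pattern_to_class_id(pattern):
    """Assumed offset: YOLO class ID = pattern value - 1."""
    return pattern - 1


print(dots_to_pattern([1, 4, 5]))   # 25
print(pattern_to_dots(25))          # [1, 4, 5]
print(pattern_to_class_id(25))      # 24
```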
| Pattern | Dots | English | Notes |
|---|---|---|---|
| 1 | 1 | a / 1 | Also A with capital prefix |
| 3 | 1-2 | b / 2 | |
| 5 | 1-3 | k | |
| 7 | 1-2-3 | l | |
| 9 | 1-4 | c / 3 | |
| 10 | 2-4 | i / 9 | |
| 11 | 1-2-4 | f / 6 | |
| 13 | 1-3-4 | m | |
| 14 | 2-3-4 | s | |
| 15 | 1-2-3-4 | p | |
| 17 | 1-5 | e / 5 | |
| 19 | 1-2-5 | h / 8 | |
| 21 | 1-3-5 | o | |
| 23 | 1-2-3-5 | r | |
| 25 | 1-4-5 | d / 4 | |
| 26 | 2-4-5 | j / 0 | |
| 27 | 1-2-4-5 | g / 7 | |
| 29 | 1-3-4-5 | n | |
| 30 | 2-3-4-5 | t | |
| 31 | 1-2-3-4-5 | q | |
| 37 | 1-3-6 | u | |
| 39 | 1-2-3-6 | v | |
| 45 | 1-3-4-6 | x | |
| 53 | 1-3-5-6 | z | |
| 56 | 4-5-6 | capital | Prefix for uppercase |
| 58 | 2-4-5-6 | w | |
| 60 | 3-4-5-6 | number | Prefix for digits |
| 61 | 1-3-4-5-6 | y | |
Full mapping in unified_dataset/patterns_to_english.csv.
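The dot values above coincide with the bit layout of the Unicode Braille Patterns block (U+2800 plus the pattern value for dots 1–6), so a pattern can be rendered as a glyph when debugging detections. This is a small sketch; `pattern_to_glyph` is a hypothetical helper, not part of the repository.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank Braille pattern


def pattern_to_glyph(pattern):
    """Return the Unicode Braille character for a 6-dot pattern value (1-63)."""
    if not 1 <= pattern <= 63:
        raise ValueError(f"pattern out of range: {pattern}")
    return chr(BRAILLE_BASE + pattern)


# Patterns for d, a and the number indicator, taken from the table above.
print(" ".join(pattern_to_glyph(p) for p in (25, 1, 60)))  # ⠙ ⠁ ⠼
```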
All 63 classes are represented. English letters and indicators have strong coverage (1,384–15,528 samples each). The thinnest classes (<100 samples) correspond to rare punctuation or non-English dot patterns.
| Tier | Samples per class | Classes |
|---|---|---|
| Abundant | 4k–15k | a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, capital, number |
| Moderate | 500–4k | punctuation, less common patterns |
| Thin | 100–500 | rare patterns |
| Critically thin | 17–92 | patterns 28, 35, 41, 44, 54, 57 |
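Per-class counts like these can be recomputed from the YOLO label files. The sketch below assumes the standard YOLO label layout (one `class_id cx cy w h` line per box) and takes its tier boundaries from the table; `count_classes` and `tier` are hypothetical helper names, not functions from the repository.

```python
from collections import Counter
from pathlib import Path


def count_classes(labels_dir):
    """Count annotations per class ID across all YOLO .txt files in a directory."""
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


def tier(n):
    """Bucket a sample count using the tier boundaries from the table."""
    if n >= 4000:
        return "abundant"
    if n >= 500:
        return "moderate"
    if n >= 100:
        return "thin"
    return "critically thin"
```

For example, `count_classes("unified_dataset/train/labels")` returns a `Counter` keyed by class ID, and `tier(counts[class_id])` gives the bucket for each class within that split.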
The build script build_unified_dataset.py processes each dataset:
- grade1_dataset_updated (67→63): Maps character labels (a, A, capital, number, etc.) to their underlying dot patterns. Uppercase and lowercase letters merge into the same pattern; case is preserved by the separate capital indicator cell.
- AngelinaDataset (CSV→YOLO): Reads the semicolon-delimited CSV with normalized coordinates and converts it to YOLO format. The labels already use the 63 dot-pattern classes.
- character_label (JSON/XML/TXT→YOLO): Parses three annotation formats and converts dot-string labels (e.g., "1456" → pattern 57) to integer class IDs:
  - LabelMe JSON (rectangle shapes with dot-string labels like "1456")
  - VOC XML (bndbox with numeric labels)
  - ICDAR TXT (quadrilateral coords)
- BrailleDataset (Roboflow) (48→63): Maps 48 Portuguese character classes (A-Z, 0-9, accented) to their standard Ibero-American Braille dot patterns. Accented characters (á, â, ã, ç, é, ê, í, ó, ô, õ, ú) are mapped to their corresponding patterns.
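As an illustration of the JSON→YOLO step, the sketch below converts one LabelMe annotation (already loaded as a dict) into YOLO label lines. It assumes LabelMe's usual fields (`shapes`, `points`, `shape_type`, `imageWidth`, `imageHeight`) and that the class ID is the pattern value minus one. The function names are hypothetical, and build_unified_dataset.py may differ in its details.

```python
def dot_string_to_pattern(label):
    """Convert a dot-string label such as "1456" to its pattern value (57)."""
    return sum(1 << (int(ch) - 1) for ch in set(label))


def labelme_to_yolo(annotation):
    """Return YOLO lines "class cx cy w h" (normalized) for rectangle shapes."""
    img_w = annotation["imageWidth"]
    img_h = annotation["imageHeight"]
    lines = []
    for shape in annotation["shapes"]:
        if shape.get("shape_type") != "rectangle":
            continue
        (x1, y1), (x2, y2) = shape["points"]
        left, right = sorted((x1, x2))
        top, bottom = sorted((y1, y2))
        class_id = dot_string_to_pattern(shape["label"]) - 1  # assumed offset
        cx = (left + right) / 2 / img_w
        cy = (top + bottom) / 2 / img_h
        w = (right - left) / img_w
        h = (bottom - top) / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


# Made-up annotation for illustration only.
example = {
    "imageWidth": 200,
    "imageHeight": 100,
    "shapes": [{"label": "1456", "shape_type": "rectangle",
                "points": [[50, 20], [70, 60]]}],
}
print(labelme_to_yolo(example))  # ['56 0.300000 0.400000 0.100000 0.400000']
```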
Files in the unified dataset are prefixed to indicate source:
- g1_*: grade1_dataset_updated
- rf_*: BrailleDataset (Roboflow)
- cl_*: character_label (LabelMe JSON)
- icdar_*: character_label (ICDAR TXT)
- voc_*: character_label (VOC XML)
- No prefix: AngelinaDataset
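The prefixes make it straightforward to trace a unified file back to its source. A minimal sketch, assuming every non-Angelina file name starts with exactly one of the prefixes listed above (`source_of` is a hypothetical helper, and the file names in the example are made up):

```python
PREFIX_TO_SOURCE = {
    "g1_": "grade1_dataset_updated",
    "rf_": "BrailleDataset (Roboflow)",
    "cl_": "character_label (LabelMe JSON)",
    "icdar_": "character_label (ICDAR TXT)",
    "voc_": "character_label (VOC XML)",
}


def source_of(filename):
    """Map a unified-dataset file name to its source dataset by prefix."""
    for prefix, source in PREFIX_TO_SOURCE.items():
        if filename.startswith(prefix):
            return source
    return "AngelinaDataset"  # files without a known prefix


print(source_of("g1_0001.jpg"))   # grade1_dataset_updated
print(source_of("page_12.jpg"))   # AngelinaDataset
```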
Each dataset's original train/val/test splits were preserved and merged into the corresponding unified split. The grade1_dataset_updated valid directory maps to val.

    yolo train data=unified_dataset/data.yaml model=yolov8n.pt epochs=100 imgsz=640 batch=16

For higher accuracy, use yolov8m.pt or yolov8l.pt.

    yolo export model=runs/detect/train/weights/best.pt format=onnx imgsz=640

Or for TFLite:

    yolo export model=runs/detect/train/weights/best.pt format=tflite imgsz=640

After detection, each cell's class ID (0–62) maps to a pattern (1–63). Use patterns_to_english.csv to convert sequences of patterns to English text:

    capital(56) + a(1) → "A"
    number(60) + a(1) → "1"
    a(1) → "a"
    b(3) → "b"
    period(50) → "."
    comma(2) → ","

A simple text reconstruction algorithm:
- If pattern 56 (capital) → next letter is uppercase
- If pattern 60 (number) → next a-j patterns are digits
- Otherwise → lowercase letter or punctuation
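The rules above can be sketched as a small decoder. This is a simplified illustration, not the repository's implementation: the lookup tables are a hand-written partial subset of patterns_to_english.csv, the indicator values 56 and 60 are taken from the mapping table, and number mode is assumed to end at the first pattern that is not a–j.

```python
CAPITAL, NUMBER = 56, 60  # indicator patterns from the mapping table

# Partial pattern -> character table (subset of patterns_to_english.csv).
LETTERS = {1: "a", 3: "b", 9: "c", 25: "d", 17: "e", 11: "f", 27: "g",
           19: "h", 10: "i", 26: "j", 2: ",", 50: "."}
# In number mode, the a-j patterns stand for 1-9 and 0.
DIGITS = {1: "1", 3: "2", 9: "3", 25: "4", 17: "5", 11: "6", 27: "7",
          19: "8", 10: "9", 26: "0"}


def decode(patterns):
    """Convert a sequence of pattern values (1-63) to text."""
    out = []
    capitalize_next = False
    number_mode = False
    for p in patterns:
        if p == CAPITAL:
            capitalize_next = True
        elif p == NUMBER:
            number_mode = True
        elif number_mode and p in DIGITS:
            out.append(DIGITS[p])
        else:
            number_mode = False  # assumed: a non-digit pattern ends number mode
            ch = LETTERS.get(p, "?")  # "?" marks patterns missing from the subset
            out.append(ch.upper() if capitalize_next else ch)
            capitalize_next = False
    return "".join(out)


print(decode([56, 1, 3]))        # Ab
print(decode([60, 1, 3, 50]))    # 12.
```

A production decoder would load the full table from patterns_to_english.csv and would also need a policy for spaces and for low-confidence detections, which this sketch leaves out.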
| Script | Purpose |
|---|---|
| build_unified_dataset.py | Converts all datasets to unified 63-class YOLO format |
| braille_pipeline.py | Original two-stage pipeline (YOLO + CNN) |
| dect.py | Geometric Braille decoder (blob detection + DBSCAN); classical CV fallback |
| infer.py | ONNX inference pipeline (YOLO detection → CNN classification) |
| export_cnn.py | Export trained CNN to ONNX |
See requirement.txt:
- ultralytics>=8.1.0
- opencv-python>=4.8.0
- torch>=2.2.0
- onnxruntime-gpu>=1.17.0