---
title: SignBridge
emoji: 🤟
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: false
thumbnail: assets/cover.png
license: mit
short_description: Real-time ASL → English speech on AMD MI300X.
---
Two people who couldn't communicate, now can.
A deaf person signs into the webcam. SignBridge — a multi-stage vision + reasoning + voice pipeline running on a single AMD Instinct MI300X — translates the signs into spoken English in under 2 seconds.
Submission for the AMD Developer Hackathon (LabLab.ai, May 2026) — Track 3: Vision & Multimodal AI.
```
webcam frames → MediaPipe Holistic → trained sign classifier
  (1–5 fps)       (543-dim pose)      (WLASL Top-100 + alphabet)
                                                │
                                                ▼
                                 Llama-3.1-8B sentence composer
                                                │
                                                ▼
                                      Coqui XTTS-v2 → speech
```
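The 543 in "543-dim pose" is MediaPipe Holistic's landmark count: 33 pose + 468 face + 21 per hand. A minimal sketch of the flattening step (`extract_landmarks` is an illustrative helper, not necessarily the repo's actual API):

```python
import cv2
import mediapipe as mp
import numpy as np

def extract_landmarks(results) -> np.ndarray:
    """Flatten a MediaPipe Holistic result into the 543-landmark vector
    (33 pose + 468 face + 21 left hand + 21 right hand), zero-filling
    any component the model did not detect in this frame."""
    def to_xyz(landmark_list, n):
        if landmark_list is None:
            return np.zeros((n, 3), dtype=np.float32)
        return np.array([[p.x, p.y, p.z] for p in landmark_list.landmark],
                        dtype=np.float32)

    return np.concatenate([
        to_xyz(results.pose_landmarks, 33),
        to_xyz(results.face_landmarks, 468),
        to_xyz(results.left_hand_landmarks, 21),
        to_xyz(results.right_hand_landmarks, 21),
    ])  # shape (543, 3)

with mp.solutions.holistic.Holistic(static_image_mode=False) as holistic:
    frame = cv2.imread("frame.jpg")  # stand-in for one webcam frame
    results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    pose_vec = extract_landmarks(results)  # input to the sign classifier
```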
All four stages run concurrently on a single AMD Instinct MI300X via AMD Developer Cloud. Total weights ~22 GB on a 192 GB GPU — fits with margin for KV cache + serving overhead.
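A quick sanity check of that headroom on a Dev Cloud node (ROCm builds of PyTorch expose the AMD GPU through the familiar `torch.cuda` API):

```python
import torch

free, total = torch.cuda.mem_get_info()  # bytes
print(torch.cuda.get_device_name(0))     # e.g. "AMD Instinct MI300X"
print(f"free {free / 2**30:.0f} GiB / total {total / 2**30:.0f} GiB")
```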
The demo covers two modes:
- ASL fingerspelling alphabet — sign A–Z and 0–9 → the AI speaks the letters / numbers
- Top-50 WLASL signs (hello, thank you, name, please, sorry, family, eat, drink, work, …) → the AI composes grammatical English sentences
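How the two modes might diverge after classification — a runnable sketch where `speak` and `compose_sentence` are stubs standing in for the XTTS-v2 and Llama-3.1-8B stages (none of these names come from the repo):

```python
def speak(text: str) -> None:                     # stub for the XTTS-v2 stage
    print(f"[speech] {text}")

def compose_sentence(glosses: list[str]) -> str:  # stub for the Llama-3.1-8B stage
    return " ".join(glosses).capitalize() + "."

def route(label: str, gloss_buffer: list[str]) -> None:
    """Single-character labels are fingerspelling (spoken immediately);
    multi-character labels are WLASL glosses (buffered for the composer)."""
    if len(label) == 1:
        speak(label)
    else:
        gloss_buffer.append(label)

glosses: list[str] = []
for label in ["HELLO", "MY", "NAME", "J", "O", "E"]:
    route(label, glosses)
speak(compose_sentence(glosses))  # flush at an utterance boundary
```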
V1 is one-way: the deaf user signs, the hearing user hears. The reverse direction (speech → on-screen text) is planned for V2.
The MI300X's 192 GB HBM3 fits the entire pipeline (sign classifier + Llama-3.1-8B + XTTS-v2) on one GPU with margin. An NVIDIA H100 (80 GB) would require sharding, and the V2 plan to upgrade to a 70B reasoner is impossible on H100 without a 3-GPU cluster. Single-GPU concurrency plus 5.3 TB/s memory bandwidth is the actual AMD pitch: practical accessibility tools running globally need the cost-and-availability profile that AMD enables.
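The napkin math behind that claim (bf16 weights at 2 bytes per parameter; KV cache and activations come on top):

```python
def weights_gib(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate bf16 weight footprint in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30

print(f"8B composer : {weights_gib(8):6.1f} GiB")   # ~14.9 GiB
print(f"70B reasoner: {weights_gib(70):6.1f} GiB")  # ~130.4 GiB: over one 80 GB H100, under one 192 GB MI300X
```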
Sign-language interpreters cost $50–200 per hour and are scarce. Courts, hospitals, schools, and public services must by law provide interpretation (ADA Title II/III in the US, EAA 2025 in the EU). Sorenson VRS — the dominant relay-services provider — books $4B+ in annual revenue in this space. SignBridge is the open-source backbone that any country, NGO, or enterprise can deploy on their own AMD compute.
Session-only. Frames and audio are processed in-memory and not persisted server-side beyond the WebSocket / HTTP session.
SignBridge is open-source under MIT license and intentionally scoped to ASL-only V1. The pipeline is a substrate, not a finished product — Deaf-led organisations (schools-for-the-Deaf, NGOs, ministries) are the intended deployers. Other sign languages (BSL, MSL, CSL, ISL, +200 more) deserve their own teams, training data, and Deaf community leadership. See docs/walkthrough.md → "Deployment ethics" for the design principles drawn from the Deaf-led academic literature.
```bash
# Setup
pip install -r requirements.txt
cp .env.example .env  # fill in HF_TOKEN, AMD_DEV_CLOUD_*, OPENAI_API_KEY (fallback)

# Run the Gradio app
python app.py

# Run the inference backend (point at AMD Dev Cloud or local ROCm)
python -m signbridge.backend

# Train the classifier on WLASL Top-100 (Day 2 task — run on AMD Dev Cloud)
python -m signbridge.scripts.train_classifier --dataset data/wlasl --epochs 30
```

- WLASL — Word-Level American Sign Language; we use the Top-100 subset
- ASL fingerspelling alphabet (open dataset)
- `meta-llama/Llama-3.1-8B-Instruct` — sentence composer
- `coqui/XTTS-v2` — text-to-speech
- (V2 stretch) `openai/whisper-large-v3` — for the reverse direction
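Loading the two speech-side models — a minimal sketch under assumptions: Llama 3.1 is gated on the Hub (your `HF_TOKEN` needs approved access), and XTTS-v2 loads via Coqui's `TTS` package under its registry name rather than the Hub repo id. The prompt and file paths are illustrative.

```python
import torch
from transformers import pipeline
from TTS.api import TTS

# Sentence composer (gated model; requires an approved HF_TOKEN).
composer = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Text-to-speech via Coqui's model registry.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")

glosses = ["HELLO", "MY", "NAME", "WHAT"]
prompt = ("Rewrite these ASL glosses as one natural English sentence: "
          + " ".join(glosses))
sentence = composer(prompt, max_new_tokens=40,
                    return_full_text=False)[0]["generated_text"]
tts.tts_to_file(text=sentence, language="en",
                speaker_wav="reference.wav",  # any short voice reference clip
                file_path="out.wav")
```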
MIT. See LICENSE.
Active development — see CLAUDE.md for the working state and docs/walkthrough.md for the technical writeup.