This is a simple Python application that watches for eye blinks via a webcam, converts them into Morse code, and then transcribes the Morse code into text in real time.
- Detects blinks using MediaPipe Face Mesh landmarks.
- Classifies short and long blinks as dots and dashes.
- Converts Morse sequences to alphanumeric characters.
- Displays the live EAR (eye aspect ratio), detected Morse sequence, and decoded text overlaid on the camera feed.
- Optionally reads the transcript aloud through ElevenLabs once blinks pause for a couple seconds.
Requirements:

- Python 3.9+
- Webcam accessible by OpenCV
Install the required dependencies:
```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Copy the .env file (or edit the existing placeholder) and add your ElevenLabs API key:

```bash
echo "ELEVENLABS_API_KEY=your-key-here" >> .env
```

Leave the value blank or remove the line to disable text-to-speech.
Run the CLI version:

```bash
python app.py --mode cli
```

Settings can be tuned with CLI flags:

```bash
python app.py --mode cli --ear-threshold 0.22 --dot-duration 0.3 --dash-duration 0.7
```

Text-to-speech specific flags:

```bash
python app.py --mode web --voice-id 21m00Tcm4TlvDq8ikWAM --tts-idle-seconds 2.5

# Disable speech entirely:
python app.py --mode web --disable-tts
```

Tuning blink cadence:

```bash
python app.py --mode web --symbol-cooldown 0.5 --letter-pause 1.1 --dot-duration 0.3
```

`--symbol-cooldown` adds extra buffer time after each blink before a letter/word is committed, making it easier to differentiate quick vs. prolonged eye closures.
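To make the cadence flags concrete, the sketch below shows one plausible way blink and pause durations could map onto symbols and letter boundaries. The constants mirror the example flag values above; the real logic lives in app.py and may differ:

```python
from typing import Optional

# Illustrative thresholds mirroring --dot-duration, --dash-duration, and --letter-pause.
DOT_DURATION = 0.3    # eyes closed up to this long -> dot
DASH_DURATION = 0.7   # eyes closed at least this long -> dash
LETTER_PAUSE = 1.1    # eyes open this long -> commit the current letter

def classify_blink(closed_seconds: float) -> Optional[str]:
    """Map the duration of a single eye closure to a Morse symbol."""
    if closed_seconds >= DASH_DURATION:
        return "-"
    if closed_seconds <= DOT_DURATION:
        return "."
    return None  # ambiguous duration: skip rather than guess

def letter_finished(open_seconds: float) -> bool:
    """A long enough pause between blinks ends the current letter."""
    return open_seconds >= LETTER_PAUSE
```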
Helpful shortcuts while the app is running:
- Press `c` to clear the decoded text buffer.
- Press `q` to quit.
Launch the Flask-powered browser UI:
```bash
python app.py --mode web --host 0.0.0.0 --port 5000
```

Open http://localhost:5000 in your browser to view the live stream and decoded text. Use the Clear button to reset the transcription buffer. You can pass the same tuning flags (e.g. `--ear-threshold`) in web mode as in the CLI version.
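For the curious: live camera streams in Flask are usually served as an MJPEG response that yields JPEG-encoded frames. The sketch below shows that generic pattern, not the actual routes in app.py (the `/video_feed` route name is illustrative):

```python
# Generic MJPEG streaming sketch; app.py's actual routes may look different.
import cv2
from flask import Flask, Response

app = Flask(__name__)
camera = cv2.VideoCapture(0)

def frames():
    while True:
        ok, frame = camera.read()
        if not ok:
            break
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg.tobytes() + b"\r\n")

@app.route("/video_feed")
def video_feed():
    return Response(frames(), mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```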
When a new character or word is transcribed and there are roughly two seconds of inactivity, the latest transcript is synthesized with ElevenLabs and played in the browser automatically (modern browsers may require a user interaction before autoplay can start).
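Under the hood this is just an inactivity timer. A minimal sketch of the idea, with illustrative names rather than the app's actual classes:

```python
import time

class IdleSpeaker:
    """Speak the pending transcript once no new symbols arrive for a while."""

    def __init__(self, idle_seconds: float, speak):
        self.idle_seconds = idle_seconds  # e.g. the --tts-idle-seconds value
        self.speak = speak                # callable that hands text to ElevenLabs
        self.pending_text = ""
        self.last_update = 0.0

    def on_new_text(self, text: str) -> None:
        self.pending_text = text
        self.last_update = time.monotonic()

    def poll(self) -> None:
        # Call from the main loop; fires once the transcript has been idle long enough.
        if self.pending_text and time.monotonic() - self.last_update >= self.idle_seconds:
            self.speak(self.pending_text)
            self.pending_text = ""
```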
How it works:

- MediaPipe Face Mesh provides eye landmarks each frame.
- The eye aspect ratio (EAR) drops below a threshold when you blink.
- Blink duration determines whether the blink is treated as a dot or dash.
- Pauses between blinks delimit letters and words.
- The Morse code sequence is mapped to alphanumeric characters.
If detections jitter with your hardware or lighting setup, adjust the thresholds and durations until dots, dashes, and pauses are recognized reliably.
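For illustration, the widely used six-landmark EAR formula and a generic Morse lookup table look roughly like this (the exact landmark indices and decoding logic in app.py may differ):

```python
# Illustrative only: the common six-point EAR formula plus a Morse lookup table.
import math

def ear(eye):
    """Eye aspect ratio from six (x, y) points ordered p1..p6 around the eye."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = eye
    # Vertical openings over the horizontal width; drops sharply when the eye closes.
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

MORSE_TO_CHAR = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z", "-----": "0", ".----": "1", "..---": "2", "...--": "3",
    "....-": "4", ".....": "5", "-....": "6", "--...": "7", "---..": "8",
    "----.": "9",
}

def decode(sequence: str) -> str:
    """Turn a finished dot/dash sequence like '.-' into a character."""
    return MORSE_TO_CHAR.get(sequence, "?")
```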
Thank you to BlinkAI (TreeHacks2025 Winner) for the inspiration!