An AI-powered communication assistant that combines computer vision, speech recognition, and voice synthesis technologies to improve accessibility and real-time communication.
- Real-time sign language / hand gesture recognition
- Speech-to-Text conversion
- Text-to-Speech conversion
- AI-powered communication assistance
- FastAPI backend integration
- Native iOS application built with Swift and UIKit
- Computer vision-based gesture detection
- Real-time camera processing
- Machine learning model integration
- Swift
- UIKit
- AVFoundation
- Vision Framework
- Xcode
- FastAPI
- Python
- Uvicorn
- MediaPipe
- OpenCV
- Scikit-learn
- RandomForestClassifier
CommAssist-AI/
│
├── ios-app/
│ └── iOS UIKit application
│
├── backend-fastapi/
│ ├── main.py
│ ├── requirements.txt
│ ├── model.p
│ └── backend files
│
├── README.md
├── LICENSE
└── .gitignore

- The iOS application captures camera input.
- Hand gestures/signs are detected using computer vision.
- The FastAPI backend processes the gesture data using the trained AI model.
- The detected signs are converted into meaningful text/sentences.
- Speech-to-text and text-to-speech modules provide additional communication support.
- The processed response is displayed and spoken back to the user.
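
The middle steps of this flow can be sketched in plain Python. This is a minimal illustration, not the project's actual code: it assumes the backend receives hand landmarks as (x, y) pairs (MediaPipe Hands reports 21 per hand), shifts them so the features do not depend on where the hand sits in the frame, and then smooths the classifier's per-frame labels into words. The names `landmarks_to_features` and `SentenceBuilder`, the window size, and the vote threshold are all hypothetical.

```python
from collections import Counter, deque
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def landmarks_to_features(points: Sequence[Point]) -> List[float]:
    """Flatten (x, y) landmarks into a translation-invariant feature vector.

    Assumption: the trained model expects coordinates relative to the
    hand's top-left corner. The real model.p may use different features.
    """
    if not points:
        raise ValueError("at least one landmark is required")
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    features: List[float] = []
    for x, y in points:
        features.append(x - min_x)
        features.append(y - min_y)
    return features


class SentenceBuilder:
    """Turn noisy per-frame labels into text by majority vote over a window."""

    def __init__(self, window: int = 5, min_votes: int = 4) -> None:
        self.recent: deque = deque(maxlen=window)
        self.min_votes = min_votes
        self.tokens: List[str] = []
        self.last_emitted: Optional[str] = None

    def push(self, label: Optional[str]) -> None:
        """Record one frame's prediction; None means no hand was detected."""
        self.recent.append(label)
        winner, votes = Counter(self.recent).most_common(1)[0]
        if votes < self.min_votes:
            return  # prediction is not stable yet
        if winner is None:
            # The hand left the frame, so the same sign may be repeated next.
            self.last_emitted = None
        elif winner != self.last_emitted:
            self.tokens.append(winner)
            self.last_emitted = winner

    def text(self) -> str:
        return " ".join(self.tokens)
```

Feeding five frames labelled `"HELLO"` followed by five labelled `"YOU"` into a `SentenceBuilder` yields the text `HELLO YOU`. Requiring several agreeing frames before emitting a word trades a little latency for fewer flickering misclassifications.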
git clone https://github.com/yourusername/CommAssist-AI.git
cd CommAssist-AI

python -m venv .venv

Windows:
.venv\Scripts\activate

macOS/Linux:
source .venv/bin/activate

pip install -r requirements.txt

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Server URL:
http://127.0.0.1:8000

- Open the iOS project in Xcode.
- Configure the API endpoint.
- Connect a physical iPhone device.
- Build and run the application.
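
Before running the app on a device, it can help to confirm that the backend answers from a script. The sketch below uses only the Python standard library and assumes that `/predict` accepts a JSON `POST`. The `landmarks` field name and the helper names are hypothetical, so adjust them to whatever `main.py` actually expects.

```python
import json
import urllib.request
from typing import List


def build_predict_request(base_url: str, landmarks: List[float]) -> urllib.request.Request:
    """Build a POST request for the /predict endpoint.

    Assumption: the endpoint takes a JSON body with a "landmarks" array.
    """
    body = json.dumps({"landmarks": landmarks}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running locally, `send(build_predict_request("http://127.0.0.1:8000", [0.0] * 42))` sends a dummy vector of 42 values (21 landmarks with x and y each). On a physical iPhone, `127.0.0.1` refers to the phone itself, so the app needs the development machine's LAN address or a deployed URL.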
let url = URL(string: "https://your-api-url/predict")

- Multi-language support
- More advanced sentence framing
- Cloud deployment
- Real-time translation
- Enhanced gesture recognition accuracy
- User authentication system
- Conversation history
The FastAPI backend can be deployed using:
- Render
- Railway
- Koyeb
- Fly.io
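
Hosted platforms often pass the listening port to the process through a `PORT` environment variable rather than a fixed number; check each provider's documentation, since this is an assumption and not something the project specifies. A small sketch of a start command that honors it, with the hypothetical helper name `uvicorn_command`:

```python
import os
from typing import List, Mapping


def uvicorn_command(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the uvicorn start command, reading the port from PORT if set.

    Assumption: the platform exposes the port as PORT; falls back to 8000.
    """
    port = env.get("PORT", "8000")
    if not port.isdigit():
        raise ValueError(f"PORT must be a number, got {port!r}")
    # --reload is omitted on purpose: it is a development convenience.
    return ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", port]
```

Joining the result with spaces gives a start command such as `uvicorn main:app --host 0.0.0.0 --port 8000` when `PORT` is unset.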
Muhammad Hammad
This project is licensed under the MIT License.
ai
fastapi
swift
ios
uikit
speech-to-text
text-to-speech
sign-language
gesture-recognition
computer-vision
machine-learning
opencv
mediapipe
accessibility
communication-assistant