This project runs a Flask server that accepts images, runs YOLO detection (Ultralytics), queries an LLM (Ollama) for suggestions, and can send movement commands to a Raspberry Pi client. It now includes an optional Text-To-Speech (TTS) module to speak short descriptions of what the model sees.
## Quick setup

- Create and activate a Python virtualenv (recommended):

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Enable TTS (optional): the project uses `pyttsx3` for offline TTS. To enable speaking, set the environment variable `ENABLE_TTS=1` before running the server. Example:

  ```bash
  export ENABLE_TTS=1
  ```

- Run the server:

  ```bash
  python app.py
  ```

## Endpoints
- `GET /` - Web UI
- `POST /upload` - Upload an image as multipart form field `image`
- `GET /processed` - Returns the latest annotated image
- `GET /detections` - Returns a JSON list of the latest detections
- `GET /ollama` - Returns the latest Ollama suggestion
- `GET /get_command` - Endpoint polled by the Pi client for movement commands
- `GET /health` - Health and basic status
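For illustration, a Pi-side polling loop against `/get_command` might look like the sketch below. The plain-text command payload, the poll interval, and the server address are assumptions; the real protocol is defined by `pi_client.py` and the server.

```python
# Hypothetical polling loop for the Pi client; the command payload format,
# poll interval, and server address are assumptions, not the real protocol.
import time

import requests

SERVER = "http://localhost:1909"  # assumed server address

while True:
    try:
        resp = requests.get(f"{SERVER}/get_command", timeout=5)
        command = resp.text.strip()  # assumed plain-text command, e.g. "forward"
        if command:
            print(f"Received command: {command}")  # a real client would drive GPIO here
    except requests.RequestException as exc:
        print(f"Polling failed: {exc}")
    time.sleep(0.5)  # assumed poll interval
```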
## TTS behavior
- When enabled, the server calls `tts.speak_detections(...)` after processing each image. The TTS module uses `pyttsx3` by default, which works offline on many platforms. A minimal sketch of such a function follows.
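This sketch assumes each detection is a dict with a `label` key and uses the print fallback described under troubleshooting; the actual `tts` module may be structured differently.

```python
# Minimal sketch of a speak_detections-style helper; assumes each detection
# is a dict with a "label" key. The real tts module may differ.
import os

import pyttsx3

ENABLE_TTS = os.environ.get("ENABLE_TTS") == "1"

def speak_detections(detections):
    """Speak a short sentence summarizing the detected object labels."""
    if not detections or not ENABLE_TTS:
        return
    sentence = "I see " + ", ".join(d["label"] for d in detections)
    try:
        engine = pyttsx3.init()
        engine.say(sentence)
        engine.runAndWait()
    except Exception:
        # No usable audio backend (common on headless servers): print instead
        print(sentence)
```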
## Notes & troubleshooting
- You must have a YOLO model file (the repo includes `yolov8m.pt` and `yolov8n.pt`). The server attempts to load `yolov8m.pt` by default; adjust `init_yolo()` if you want a different model (see the sketch after this list).
- On headless Linux servers, `pyttsx3` may require additional audio backends (e.g., `aplay` / ALSA) to be installed. If audio playback fails, TTS falls back to printing the sentence.
- For Raspberry Pi robot control, see `pi_client.py` and configure the GPIO pins.
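For example, swapping the model could be as simple as the following sketch; the real `init_yolo()` in `app.py` may take no arguments or do additional setup.

```python
# Hypothetical shape of init_yolo(); the actual function in app.py may differ.
from ultralytics import YOLO

def init_yolo(model_path: str = "yolov8m.pt") -> YOLO:
    """Load a YOLO model; pass "yolov8n.pt" for the lighter bundled model."""
    return YOLO(model_path)
```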
## Example POST using curl

```bash
curl -X POST -F "image=@/path/to/photo.jpg" http://localhost:1909/upload
```

## Example test script
See `examples/test_request.py` for a small Python example that uploads an image and prints the response.
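A minimal version of such a script, using the `requests` library, might look like this; the hard-coded path and port mirror the curl example above, and the bundled script may differ.

```python
# Minimal upload script; mirrors the curl example above.
# examples/test_request.py in the repo may differ from this sketch.
import requests

URL = "http://localhost:1909/upload"

with open("/path/to/photo.jpg", "rb") as f:
    # The server expects the image under the multipart form field "image"
    resp = requests.post(URL, files={"image": f})

print(resp.status_code)
print(resp.text)
```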
## License
Add a LICENSE file appropriate to your project before publishing.