Daniel-Li-08/goonhacks

Vision + Robot Backend

This project runs a Flask server that accepts images, runs YOLO detection (Ultralytics), queries an LLM (Ollama) for suggestions, and can send movement commands to a Raspberry Pi client. It now includes an optional Text-To-Speech (TTS) module to speak short descriptions of what the model sees.

Quick setup

  1. Create and activate a Python virtualenv (recommended):
     python3 -m venv venv
     source venv/bin/activate
  2. Install dependencies:
     pip install -r requirements.txt
  3. Enable TTS (optional):
  • The project uses pyttsx3 for offline TTS. To enable speaking, set the environment variable ENABLE_TTS=1 before running the server. Example:
     export ENABLE_TTS=1
  4. Run the server:
     python app.py
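
The ENABLE_TTS switch from step 3 could be read on the server side roughly as follows. This is a minimal sketch, not the code in app.py: the helper name tts_enabled is hypothetical, and treating only the exact value "1" as "on" is an assumption based on the example above.

```python
import os


def tts_enabled(environ=None):
    """Return True when ENABLE_TTS is set to "1".

    `environ` defaults to the process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return env.get("ENABLE_TTS", "0") == "1"
```

With this check, leaving the variable unset (or setting it to anything other than 1) keeps the server silent.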

Endpoints

  • GET / - Web UI
  • POST /upload - Upload an image as the multipart form field named image
  • GET /processed - Returns the latest annotated image
  • GET /detections - Returns JSON list of latest detections
  • GET /ollama - Returns latest Ollama suggestion
  • GET /get_command - Endpoint polled by Pi client for movement commands
  • GET /health - Health and basic status
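
As an illustration of how a client might talk to the read-only endpoints above, here is a small standard-library sketch. The base URL reuses port 1909 from the curl example; the helper names (endpoint, get_text, get_detections, poll_commands) are hypothetical, and because the response format of /get_command is not documented here, the polling loop simply hands the raw response text to a callback.

```python
import json
import time
import urllib.request
from urllib.parse import urljoin

# Assumption: the server listens where the curl example points.
BASE_URL = "http://localhost:1909/"


def endpoint(path, base=BASE_URL):
    """Build the full URL for an endpoint path such as "/detections"."""
    return urljoin(base, path.lstrip("/"))


def get_text(path, base=BASE_URL, timeout=5):
    """GET an endpoint and return the response body as text."""
    with urllib.request.urlopen(endpoint(path, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def get_detections(base=BASE_URL):
    """Fetch the latest detections, which the server returns as JSON."""
    return json.loads(get_text("/detections", base))


def poll_commands(handle, interval=0.5, max_polls=None, base=BASE_URL):
    """Poll /get_command and pass each raw response to `handle`.

    Runs forever unless `max_polls` is given.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        handle(get_text("/get_command", base))
        polls += 1
        time.sleep(interval)
```

For example, poll_commands(print, interval=1.0, max_polls=5) would print five successive responses from a running server.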

TTS behavior

  • When enabled, the server will call tts.speak_detections(...) after processing images. The TTS module uses pyttsx3 by default, which works offline on many platforms.
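
To make the behavior concrete, the sketch below turns a list of detected labels into a short sentence and speaks it with pyttsx3, printing the sentence if speech is unavailable. It is an assumption-laden illustration rather than the project's tts module: describe_detections and speak are hypothetical names, the sentence format is invented, and the real signature of tts.speak_detections is not shown in this README.

```python
from collections import Counter


def describe_detections(labels):
    """Summarize detected class labels as one short sentence."""
    if not labels:
        return "I do not see anything."
    counts = Counter(labels)  # keeps first-seen order of labels
    parts = [f"{n} {name}" for name, n in counts.items()]
    return "I see " + ", ".join(parts) + "."


def speak(text):
    """Speak `text` with pyttsx3, or print it if speech fails."""
    try:
        import pyttsx3  # third-party; imported lazily so it stays optional

        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    except Exception as exc:
        # Mirrors the documented fallback: print instead of speaking.
        print(f"[TTS fallback] {text} ({exc})")
```

Calling speak(describe_detections(["person", "dog", "person"])) would say or print "I see 2 person, 1 dog."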

Notes & troubleshooting

  • You must have a YOLO model file (the repo includes yolov8m.pt and yolov8n.pt). The server attempts to load yolov8m.pt by default. Adjust init_yolo() if you want a different model.
  • On headless Linux servers, pyttsx3 may require additional audio backends (e.g., aplay / ALSA) to be installed. If audio playback fails, TTS falls back to printing the sentence.
  • For Raspberry Pi robot control, see pi_client.py and configure GPIO pins.
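
If you do adjust model loading, one possible shape for it is sketched below. This is not the repository's init_yolo(): falling back from yolov8m.pt to yolov8n.pt is an assumption added for illustration, and pick_model_path is a hypothetical helper. The only third-party call used is the Ultralytics YOLO(path) constructor.

```python
from pathlib import Path

# Preference order: the default model first, then the smaller one.
PREFERRED_MODELS = ("yolov8m.pt", "yolov8n.pt")


def pick_model_path(candidates=PREFERRED_MODELS, root="."):
    """Return the first candidate model file that exists under `root`."""
    for name in candidates:
        path = Path(root) / name
        if path.is_file():
            return path
    return None


def init_yolo(root="."):
    """Load a YOLO model from the first available weights file."""
    from ultralytics import YOLO  # third-party; imported lazily

    path = pick_model_path(root=root)
    if path is None:
        raise FileNotFoundError(
            f"None of {PREFERRED_MODELS} found in {Path(root).resolve()}"
        )
    return YOLO(str(path))
```

Reordering PREFERRED_MODELS is then enough to prefer the lighter model on slower hardware.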

Example POST using curl

curl -X POST -F "image=@/path/to/photo.jpg" http://localhost:1909/upload

Example test script

See examples/test_request.py for a small Python example that uploads an image and prints the response.
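
For readers without the repository at hand, an upload can also be sketched with the standard library alone. The code below is not the contents of examples/test_request.py; it is an independent illustration of the same request as the curl example, with hypothetical helper names (encode_multipart, upload_image) and the URL taken from that example.

```python
import os
import uuid
import urllib.request


def encode_multipart(field, filename, data, boundary=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(path, url="http://localhost:1909/upload"):
    """POST the file at `path` as form field "image"; return (status, text)."""
    with open(path, "rb") as f:
        data = f.read()
    body, content_type = encode_multipart("image", os.path.basename(path), data)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

With the server running, print(upload_image("/path/to/photo.jpg")) would show the HTTP status and whatever the server returns.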

License

Add a LICENSE file appropriate to your project before publishing.
