AutoFetch-Detect: An Automated LLM-Enhanced Real-Time Object Detection System

<<<<<<< HEAD

AutoFetch-Detect: An Automated LLM-Enhanced Real-Time Object Detection System

AutoFetch-Detect is an automated system that combines state-of-the-art object detection (YOLOv8) with contextual understanding through a Retrieval-Augmented Generation (RAG) system powered by a local LLM (Llama-2-7B). The system performs real-time object detection with contextual descriptions, suitable for cross-platform deployment on iOS, Android, and web.

Features

Real-time Object Detection: Uses YOLOv8 for fast and accurate object detection
Contextual Descriptions: Llama-2-7B quantized model with RAG system generates contextual descriptions for detected objects
Automated Pipeline: Complete end-to-end pipeline from dataset fetching to deployment
Cross-Platform: Flutter frontend for iOS, Android, and web deployment
Local Processing: No external APIs or cloud services required after initial setup
Modular Design: Easy to swap datasets, models, and components

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│   Flutter App   │───▶│   FastAPI API    │───▶│  YOLO + RAG      │
│                 │    │                  │    │                  │
│  • Camera feed  │    │  • /detect       │    │  • Object        │
│  • Image upload │    │  • /batch_detect │    │    detection     │
│  • Overlay      │    │  • /classes      │    │  • RAG           │
│  • Descriptions │    │  • /model_info   │    │    descriptions  │
└─────────────────┘    └──────────────────┘    └──────────────────┘

Prerequisites

Python 3.10+
Flutter 3.16+
Git
NVIDIA GPU (recommended) or CPU
At least 15GB free disk space
Access to Hugging Face (for Llama-2 model) - you must accept the terms manually

Setup

1. Clone the Repository

git clone https://github.com/yourusername/autofetch-detect.git
cd autofetch-detect

2. Install Python Dependencies

# Create virtual environment (recommended)
conda create -n autofetch python=3.10
conda activate autofetch

# Install Python dependencies
pip install -r requirements.txt

3. Setup Hugging Face Access (for Llama-2)

You need to manually accept the Llama-2 terms at https://huggingface.co/meta-llama/Llama-2-7b-chat-hf and set up your access token:

# Set your Hugging Face token
export HF_TOKEN=your_huggingface_token_here

4. Install Flutter Dependencies

# Navigate to Flutter app
cd flutter_app

# Get Flutter dependencies
flutter pub get

Usage

The system is orchestrated through the main script with different phases:

Phase 1: Setup Environment

python main.py --phase setup

Phase 2: Fetch Dataset

python main.py --phase fetch

Phase 3: Train Models

python main.py --phase train

Phase 4: Run Complete Pipeline

# Run all phases in sequence
python main.py --phase all

Phase 5: Start Server

python main.py --phase serve

Phase 6: Run Flutter App

cd flutter_app
flutter run

Custom Class Filtering

You can specify custom classes to filter the dataset:

python main.py --phase fetch --classes "hammer,screwdriver,drill"

Project Structure

obj_detect_project/
├── README.md                  # Setup, run instructions, thesis outline
├── requirements.txt           # Python deps
├── Dockerfile                 # For backend
├── main.py                    # Orchestrator: python main.py --phase [setup|fetch|train|test|serve|all]
├── data/
│   ├── fetch_dataset.py      # Auto-download/extract/convert COCO
│   ├── coco.yaml             # YOLO config (auto-generated)
│   └── raw/                  # Downloaded zips/JSons (gitignore)
├── models/
│   ├── train.py              # YOLO training + Ray Tune hyperparams
│   ├── rag_setup.py          # LLM download, embed docs, FAISS index
│   └── inference.py          # Detect + RAG query → JSON
├── backend/
│   └── app.py                # FastAPI server (/detect endpoint)
├── tests/
│   ├── test_train.py         # Pytest for metrics
│   └── test_end2end.py       # Full pipeline validation
├── notebooks/
│   └── eval_ablation.ipynb   # Jupyter for results viz (mAP tables, charts)
├── flutter_app/
│   ├── pubspec.yaml          # Flutter deps
│   ├── lib/
│   │   ├── main.dart         # Entry + camera screen
│   │   └── detection_overlay.dart  # Bbox drawing + LLM text
│   └── android/ ios/ web/    # Standard Flutter dirs
├── docs/
│   └── thesis_outline.md     # 50-page thesis template (LaTeX ready)
└── .gitignore                # Ignore data/models/large files

API Endpoints

The backend provides the following endpoints:

GET / - Health check
POST /detect - Detect objects in uploaded image
POST /batch_detect - Detect objects in multiple images
GET /classes - Get list of detectable classes
GET /model_info - Get information about loaded model
GET /stats - Get API usage statistics

Performance

Detection Speed: Up to 41+ FPS on RTX 3080, 23+ FPS on RTX 3060
Model Accuracy: mAP@0.5 of 0.7+ on COCO dataset
RAG Response: <0.8s average response time
Memory Usage: 2-8GB depending on configuration

Docker Deployment

To run the backend in Docker:

# Build the image
docker build -t autofetch-detect .

# Run the container
docker run -p 8000:8000 autofetch-detect

Troubleshooting

GPU Issues

If you encounter GPU-related issues:

Ensure CUDA drivers are properly installed
Verify PyTorch is installed with CUDA support
Check that your GPU has enough memory

Model Download Issues

If model downloads fail:

Verify your internet connection
Check if you have accepted the terms for Llama-2 on Hugging Face
Ensure you have sufficient disk space

Flutter Build Issues

For Flutter-related issues:

Ensure Flutter is properly installed and in PATH
Run flutter doctor to check for issues
Verify iOS/Android SDKs are properly configured

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

YOLOv8 for the object detection backbone
Llama-2 for the language model foundation
Hugging Face for model hosting
Flutter team for the cross-platform framework =======

object_detection

b7b766ae8bc61c27dc3245a7626fff66281fbc98

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
backend		backend
data		data
docs		docs
flutter_app		flutter_app
models		models
notebooks		notebooks
runs/train/autofetch_detect		runs/train/autofetch_detect
tests		tests
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
yolov8n.pt		yolov8n.pt

Folders and files

Latest commit

History

Repository files navigation

AutoFetch-Detect: An Automated LLM-Enhanced Real-Time Object Detection System

Features

Architecture

Prerequisites

Setup

1. Clone the Repository

2. Install Python Dependencies

3. Setup Hugging Face Access (for Llama-2)

4. Install Flutter Dependencies

Usage

Phase 1: Setup Environment

Phase 2: Fetch Dataset

Phase 3: Train Models

Phase 4: Run Complete Pipeline

Phase 5: Start Server

Phase 6: Run Flutter App

Custom Class Filtering

Project Structure

API Endpoints

Performance

Docker Deployment

Troubleshooting

GPU Issues

Model Download Issues

Flutter Build Issues

License

Contributing

Acknowledgments

object_detection

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages