cdtalley/AI-and-ComputerVision-Development-Project-VisionDetect-

VisionDetect: Advanced Object Detection with Deep Learning

License: MIT | CI Status | Python 3.8+ | PyTorch | TensorFlow

VisionDetect is a computer vision framework for object detection with deep learning. Its modular architecture supports two backends (PyTorch and TensorFlow) and is designed so that additional model architectures can be added.

Features

  • Model Architectures: Faster R-CNN is supported, and the design allows other architectures to be added
  • Multiple Backends: Implementations in both PyTorch and TensorFlow
  • Transfer Learning: Utilize pre-trained models for faster training and better performance
  • Data Augmentation: Comprehensive data augmentation pipeline for improved model generalization
  • Evaluation Metrics: Detailed performance metrics including mAP, precision, and recall
  • Visualization Tools: Utilities for visualizing predictions and model performance
  • Model Serving: REST API for serving models in production environments
  • Command-Line Interface: Easy-to-use CLI for training, evaluation, and inference
  • Comprehensive Documentation: Detailed documentation and examples
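
The evaluation metrics listed above all rest on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The following is a minimal, dependency-free sketch of that idea, not VisionDetect's own implementation: the function names, the `(x_min, y_min, x_max, y_max)` box format, and the 0.5 IoU threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # assumed format: (x_min, y_min, x_max, y_max) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], iou_thresh: float = 0.5
) -> Tuple[float, float]:
    """Greedily match predictions (assumed sorted by descending score) to ground truth."""
    unmatched = list(truths)
    true_pos = 0
    for p in preds:
        # Pick the still-unmatched ground-truth box that overlaps this prediction most.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_thresh:
            unmatched.remove(best)  # each ground-truth box can be matched only once
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

Mean average precision (mAP) builds on the same matching step: precision is averaged over recall levels as the score threshold varies, and the result is then averaged over classes. The sketch covers only a single threshold and a single class.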

Installation

Prerequisites

  • Python 3.8+
  • CUDA-compatible GPU (recommended for training)

Install from Source

# Clone the repository (replace "yourusername" with the account that hosts it)
git clone https://github.com/yourusername/visiondetect.git
cd visiondetect

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the package
pip install -e .

Quick Start

from src import VisionDetect

# Create VisionDetect instance
vd = VisionDetect()

# Train model
trainer, metrics = vd.train(
    data_dir="path/to/data",
    model_type="faster_rcnn",
    backbone="resnet50",
    num_classes=91,
    epochs=50
)

# Make prediction
result = vd.predict(
    image_path="path/to/image.jpg",
    model_path="checkpoints/best_model.pth"
)
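
This README does not document the structure of `result`. Purely as an illustration, the sketch below assumes it resembles the output of torchvision detection models, with parallel lists under the keys `boxes`, `scores`, and `labels`, and shows how low-confidence detections could be dropped. The helper `filter_detections`, the key names, and the sample values are all hypothetical; check the actual return value of `vd.predict` before relying on this.

```python
from typing import Dict, List


def filter_detections(result: Dict[str, List], min_score: float = 0.5) -> Dict[str, List]:
    """Keep only detections whose score is at least min_score."""
    keep = [i for i, score in enumerate(result["scores"]) if score >= min_score]
    # Apply the same index selection to every parallel list.
    return {key: [values[i] for i in keep] for key, values in result.items()}


# Hypothetical output; the real structure returned by vd.predict may differ.
example = {
    "boxes": [[10, 20, 110, 220], [30, 40, 60, 90]],
    "scores": [0.92, 0.31],
    "labels": [1, 17],
}
confident = filter_detections(example, min_score=0.5)
```

Here `confident` keeps only the first detection, since the second one scores below the threshold.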

Documentation

Project Structure

visiondetect/
├── config/               # Configuration files
├── data/                 # Data storage (gitignored)
├── docs/                 # Documentation
├── notebooks/            # Jupyter notebooks for exploration and demos
├── src/                  # Source code
│   ├── data/             # Data processing modules
│   ├── models/           # Model implementations
│   ├── utils/            # Utility functions
│   └── api/              # API for model serving
├── tests/                # Unit and integration tests
├── train.py              # Training script
├── evaluate.py           # Evaluation script
├── infer.py              # Inference script
├── .gitignore            # Git ignore file
├── LICENSE               # License file
├── README.md             # Project documentation
└── requirements.txt      # Python dependencies

Command-Line Interface

Training

python train.py --data-dir data --model-type faster_rcnn --backbone resnet50 --epochs 50
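
For readers who want to see how flags like these are typically handled, here is a minimal `argparse` sketch that accepts the training command shown above. It is not the contents of the project's `train.py`: the default values, help strings, and the choice to make `--data-dir` required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags used in the training command (sketch)."""
    parser = argparse.ArgumentParser(description="Train an object detection model (sketch).")
    parser.add_argument("--data-dir", required=True, help="Directory containing the training data")
    parser.add_argument("--model-type", default="faster_rcnn", help="Model architecture (assumed default)")
    parser.add_argument("--backbone", default="resnet50", help="Backbone network (assumed default)")
    parser.add_argument("--epochs", type=int, default=50, help="Number of training epochs (assumed default)")
    return parser


# Parse the same arguments as the command above, passed as a list instead of sys.argv.
args = build_parser().parse_args(
    ["--data-dir", "data", "--model-type", "faster_rcnn", "--backbone", "resnet50", "--epochs", "50"]
)
```

`argparse` converts dashes in flag names to underscores, so the parsed values are available as `args.data_dir`, `args.model_type`, `args.backbone`, and `args.epochs`.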

Evaluation

python evaluate.py --model-path checkpoints/best_model.pth --data-dir data --visualize

Inference

python infer.py --model-path checkpoints/best_model.pth --input path/to/image.jpg

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

  • The project structure and design patterns are inspired by best practices in the deep learning community
  • Pre-trained models are based on the work of various research teams
  • Special thanks to the PyTorch and TensorFlow teams for their excellent frameworks
