This project implements a complete object detection system for home objects using PyTorch, FastAPI, and React. Users upload images and receive processed copies with bounding boxes drawn around the detected home objects. It also includes an AI 3D Scene Mapping feature that converts 2D images into interactive 3D environments.
- Object Detection: Detects 25+ different home objects including toilets, sinks, mirrors, bathtubs, towels, etc.
- Web Interface: Clean, responsive React frontend for easy image upload and result viewing
- API Backend: FastAPI server for handling image processing requests
- Real-time Results: Instant visualization of detected objects with bounding boxes
- Configurable: Adjustable confidence threshold for detection sensitivity
- AI 3D Scene Mapper: NEW! Advanced feature that converts 2D images to interactive 3D environments
- Depth Estimation: Uses monocular depth estimation models (DPT, MiDaS) to estimate depth from 2D images
- 3D Reconstruction: Places detected objects in 3D space with realistic spatial relationships
- Interactive Visualization: Three.js-based 3D viewer for exploring the reconstructed scene
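The feature list above does not say how detected objects are positioned in 3D. One common approach, sketched here and not necessarily what this project does, is to back-project the center of each bounding box through a pinhole camera model using the estimated depth at that pixel. The function name, the focal lengths fx and fy, and the principal point cx and cy are assumptions made for illustration. The depth value is assumed to already be a distance along the camera axis.

```python
from typing import Tuple


def backproject_box_center(
    box: Tuple[float, float, float, float],
    depth: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float, float]:
    """Map a 2D box center plus a depth value to a 3D camera-space point.

    box is (x_min, y_min, x_max, y_max) in pixels. fx, fy, cx, cy are
    hypothetical pinhole intrinsics, not values taken from this project.
    """
    x_min, y_min, x_max, y_max = box
    # Pixel coordinates of the box center.
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    # Standard pinhole back-projection: scale the offset from the
    # principal point by depth over focal length.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


point = backproject_box_center(
    (100, 200, 300, 400), depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
```

Objects left of the image center get a negative x, and objects farther from the camera spread out more for the same pixel offset.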
[React Frontend] <---> [FastAPI Backend] <---> [PyTorch Model]
- YOLO-style architecture for object detection
- Custom bounding box drawing functionality
- Synthetic dataset generation for training
- Image upload and processing endpoints
- Object detection API
- Static file serving for results
- User-friendly interface for image uploads
- Real-time display of detection results
- Confidence threshold adjustment
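The confidence threshold mentioned above can be read as a simple filter over the model's raw detections. This is a minimal sketch. The detection dictionary layout (label, confidence, box) is hypothetical, and the real response format of this API may differ.

```python
from typing import Dict, List


def filter_by_confidence(detections: List[Dict], threshold: float) -> List[Dict]:
    """Keep only detections whose confidence is at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [d for d in detections if d["confidence"] >= threshold]


# Hypothetical detections; the keys are assumptions for illustration.
sample = [
    {"label": "Sink", "confidence": 0.91, "box": [40, 60, 220, 310]},
    {"label": "Towel", "confidence": 0.35, "box": [300, 80, 380, 260]},
]
kept = filter_by_confidence(sample, threshold=0.5)
```

Raising the threshold trades missed objects for fewer false positives. Lowering it does the opposite.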
- Python 3.7+
- Node.js 14+
- PyTorch compatible with your system (CPU or CUDA)
- Clone or download this repository
- Install Python dependencies:
    pip install -r requirements.txt
- Train the model (if not already trained):
    python object_detection_model.py
  This will create the model file home_objects_detection_model.pth
- Start the API server:
    python api_server.py
  The API will be available at http://localhost:8000
- Navigate to the frontend directory:
    cd frontend
- Install dependencies:
    npm install
- Start the development server:
    npm start
- Open your browser to http://localhost:3000
- Ensure the FastAPI backend is running on port 8000
- Open the React frontend in your browser
- Upload one or more images using the file selector
- Adjust the confidence threshold as needed
- Click "Detect Objects" to process the images
- View the results with bounding boxes and detection information
- GET / - API information
- POST /detect - Single image detection
- POST /detect-multiple - Multiple image detection
- GET /health - Health check
- GET /classes - List of detectable classes
- GET /docs - Interactive API documentation
- POST /3d-scene-map - Create 3D reconstruction from 2D image (NEW!)
- GET /3d-visualization - Serve 3D visualization interface (NEW!)
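As a rough illustration of calling POST /detect from a script, the sketch below builds a multipart upload with only the Python standard library. The form field name "file" is an assumption, so check GET /docs for the name the server actually expects. The request is built but not sent, and the send step is left commented out.

```python
import uuid
import urllib.request

API_URL = "http://localhost:8000/detect"  # endpoint from the list above


def build_detect_request(
    image_bytes: bytes, filename: str, field_name: str = "file"
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for one image.

    field_name="file" is an assumption, not confirmed by this project.
    """
    boundary = uuid.uuid4().hex
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            (
                f'Content-Disposition: form-data; name="{field_name}"; '
                f'filename="{filename}"\r\n'
            ).encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            image_bytes,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_detect_request(b"fake-image-bytes", "bathroom.jpg")

# To send it, the API server must be running on port 8000:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```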
The system can detect the following home objects:
- Toilet
- Sink
- Mirror
- Bathtub
- Showerhead
- Towel
- Toothbrush
- Toothpaste
- Soap Bar
- Shampoo Bottle
- Conditioner Bottle
- Handwash Bottle
- Toilet Paper Roll
- Towel Rack
- Bath Mat
- Hair Dryer
- Razor
- Lotion Bottle
- Trash Bin
- Shower Curtain
- Comb
- Cleaning Brush
- Bucket
- Mug
- Bathroom Shelf
- Update the HOME_OBJECTS list in object_detection_model.py
- Retrain the model with updated class information
- Update the frontend to handle new classes if needed
- Add more training data with bounding box annotations
- Adjust the model architecture in object_detection_model.py
- Tune hyperparameters in the training function
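For the HOME_OBJECTS update described above, the sketch below shows why adding a class calls for retraining: the number of classes changes, and with it the size of the model's class output. It assumes HOME_OBJECTS is a plain list of strings, which may not match the real file. The helper functions and the "Laundry Basket" class are hypothetical.

```python
from typing import Dict, List

# Shortened stand-in for the real list in object_detection_model.py.
HOME_OBJECTS: List[str] = ["Toilet", "Sink", "Mirror", "Bathtub"]


def add_class(classes: List[str], new_class: str) -> List[str]:
    """Return a new class list with new_class appended, rejecting duplicates."""
    if new_class in classes:
        raise ValueError(f"{new_class!r} is already in the class list")
    return classes + [new_class]


def class_to_index(classes: List[str]) -> Dict[str, int]:
    """Map each class name to the integer index a classifier head would use."""
    return {name: index for index, name in enumerate(classes)}


updated = add_class(HOME_OBJECTS, "Laundry Basket")
mapping = class_to_index(updated)
num_classes = len(updated)  # the class output size must match this value
```

Appending at the end keeps the indices of existing classes stable, so existing annotations stay valid.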
├── object_detection_model.py # Core detection model and training
├── api_server.py # FastAPI backend
├── api_server_README.md # API server documentation
├── requirements.txt # Python dependencies
├── frontend/ # React frontend
│ ├── public/
│ ├── src/
│ ├── package.json
│ └── README.md
├── static/ # Directory for detection results (created automatically)
└── README.md # This file
- If you get CUDA errors, ensure PyTorch with the appropriate CUDA version is installed
- Make sure the API server is running before starting the frontend
- If detection results don't load, check that the static directory is accessible
This project is available for educational and research purposes.