An intelligent image search system powered by CLIP and FaceNet models
This application allows you to search through your image collection using natural language queries like "Neelam giving speech" or "beach picnic". It combines the power of OpenAI's CLIP model for semantic image understanding and FaceNet for face recognition.
- Natural Language Search: Search images using descriptive text queries
- Face Recognition: Identify specific people in your image collection
- Multiple Format Support: JPG, PNG, HEIC, HEIF, WebP, BMP, TIFF
- HEIC Support: Full support for Apple's HEIC format with automatic fallbacks
- Adjustable Parameters: Fine-tune search sensitivity and result count
- Responsive UI: Modern web interface that works on desktop and mobile
- Real-time Results: Fast search with similarity scoring
- Frontend: Next.js 15 with React 19 and Tailwind CSS
- Backend: FastAPI with Python
- AI Models:
  - CLIP (ViT-L/14@336px) for text-to-image similarity
  - MTCNN for face detection
  - FaceNet (InceptionResnetV1) for face recognition
- Image Processing: PIL/Pillow with HEIC support via pillow-heif
- Python 3.8+ with pip
- Node.js 18.18+ with npm (the minimum version supported by Next.js 15)
- Git for cloning the repository
- 4GB+ RAM (minimum for model loading; 8GB+ recommended)
git clone https://github.com/your-username/ai_image_search.git
cd ai_image_search

Run the automated setup script that installs all dependencies and creates sample data:

python scripts/quick-setup.py

For full HEIC format support:

python scripts/install-heic-support.py

Start the backend server:

python scripts/run-backend.py

You should see:
✅ All models loaded successfully!
✅ HEIC support available via: pillow-heif (recommended)
✅ Generated embeddings for X images
✅ Generated embeddings for X known faces
INFO: Uvicorn running on http://0.0.0.0:8000
# Option 1: Use setup script (handles common issues)
python scripts/setup-frontend.py
# Option 2: Manual installation
npm install --legacy-peer-deps
npm run dev
# Option 3: Windows batch file
scripts\setup-frontend.bat

- Main Application: http://localhost:3000
- API Documentation: http://localhost:8000/docs
- Supported Formats: http://localhost:8000/supported-formats
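
To confirm from a script that the backend is reachable, the endpoints above can be queried with the Python standard library. This is a minimal sketch: the helper name `fetch_json` is hypothetical, and since the shape of the JSON returned by `/supported-formats` is not described here, the example prints the response unchanged.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url, timeout=5.0):
    """Return the decoded JSON body of `url`, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Backend not running, wrong port, or a non-JSON response.
        return None


if __name__ == "__main__":
    formats = fetch_json("http://localhost:8000/supported-formats")
    if formats is None:
        print("Backend not reachable on port 8000")
    else:
        print(formats)
```

Returning `None` instead of raising keeps the check usable in setup scripts that only need a yes/no answer.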
ai_image_search/
├── Images/                      # Your searchable image collection
│   ├── events/
│   ├── personal/
│   └── work/
├── known_faces/                 # Reference photos for face recognition
│   ├── person1.jpg
│   └── person2.jpg
├── backend/
│   └── main.py                  # FastAPI backend server
├── app/
│   ├── page.tsx                 # Main React component
│   └── layout.tsx               # App layout
├── components/ui/               # Reusable UI components
├── scripts/                     # Setup and utility scripts
│   ├── quick-setup.py           # Complete automated setup
│   ├── install-heic-support.py  # HEIC format support
│   ├── run-backend.py           # Start backend server
│   ├── setup-frontend.py        # Frontend setup helper
│   └── create-folders.py        # Create directory structure
└── README.md
- JPG/JPEG - Standard photo format
- PNG - Lossless image format
- WebP - Modern web format
- BMP - Bitmap format
- TIFF - High-quality format
- HEIC - Apple's modern format (iPhone photos)
- HEIF - High Efficiency Image Format
The system automatically detects and uses the best available HEIC decoder:
- pillow-heif (recommended) - Full HEIC support
- pyheif (fallback) - Alternative HEIC library
- opencv-python (fallback) - Basic image processing
- Place images in the `Images/` folder
- Organize in subfolders (optional): `Images/events/`, `Images/personal/`, etc.
- All formats supported: JPG, PNG, HEIC, HEIF, WebP, BMP, TIFF
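
To preview which files would be picked up, the folder can be scanned recursively for the supported extensions. This is a sketch under assumptions, not the backend's actual discovery code: `find_images` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the extension set (including `.jpeg` and `.tif`) is inferred from the format list above.

```python
from pathlib import Path

# Assumed extension set, derived from the supported-format list in this README.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".heic", ".heif", ".webp", ".bmp", ".tif", ".tiff",
}


def find_images(root):
    """Return supported image files under `root`, including subfolders, sorted by path."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for image_path in find_images("Images"):
        print(image_path)
```

The suffix is lower-cased before comparison so that files such as `IMG_0001.HEIC` from a phone are matched as well.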
- Place reference photos in the `known_faces/` folder
- Use descriptive filenames: `john_doe.jpg`, `jane_smith.png`
- Each image should contain a clear face
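
Descriptive filenames matter because a person's name can be derived from the file stem and then looked for in a query such as "John giving speech". The following is an illustrative sketch only: `name_from_filename` and `names_in_query` are hypothetical helpers, and the matching rule (any part of a name appears as a whole word in the query) is an assumption rather than a description of the backend.

```python
import re
from pathlib import Path


def name_from_filename(filename):
    """Turn 'john_doe.jpg' into 'John Doe'."""
    stem = Path(filename).stem
    words = re.split(r"[_\-\s]+", stem)
    return " ".join(word.capitalize() for word in words if word)


def names_in_query(query, known_names):
    """Return known names with at least one part appearing as a whole word in the query."""
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return [
        name
        for name in known_names
        if any(part.lower() in query_words for part in name.split())
    ]


known = [name_from_filename(f) for f in ("john_doe.jpg", "jane_smith.png")]
print(known)
print(names_in_query("John giving speech", known))
```

With the two example files, the query mentions only "John", so only the name derived from `john_doe.jpg` is returned.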
After adding new images, click the "Refresh Embeddings" button in the web interface or restart the backend server.

- "John giving speech" - Find images of a specific person doing an activity
- "beach picnic" - Find images of outdoor dining scenes
- "team meeting" - Find group meeting photos
- "office workspace" - Find workplace environments
- "mountain hiking" - Find outdoor adventure photos
- "family dinner" - Find dining/family gathering photos
- Max Results: 1-10 (default: 5)
- Similarity Threshold: 0.0-1.0 (default: 0.2)
  - Lower values = more results, less strict matching
  - Higher values = fewer results, stricter matching
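
The effect of the two parameters can be illustrated with a small ranking sketch: score every image against the query, drop scores below the threshold, then keep the best few. The function names and the two-dimensional toy vectors are made up for illustration; the real system compares CLIP embeddings, and its exact scoring may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, image_vecs, max_results=5, threshold=0.2):
    """Return up to `max_results` (name, score) pairs with score >= threshold, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in image_vecs.items()]
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_results]


# Toy embeddings, not real CLIP output.
images = {"beach.jpg": [1.0, 0.0], "office.jpg": [0.0, 1.0], "picnic.jpg": [1.0, 1.0]}
print(search([1.0, 0.0], images, threshold=0.2))
print(search([1.0, 0.0], images, threshold=0.9))
```

At the default threshold of 0.2 the query matches `beach.jpg` and `picnic.jpg`; raising it to 0.9 leaves only `beach.jpg`, which is the stricter matching described above.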
If the quick setup doesn't work, follow these manual steps:
# Install Python dependencies
pip install fastapi uvicorn python-multipart pillow torch torchvision tqdm numpy
pip install git+https://github.com/openai/CLIP.git
pip install facenet-pytorch
# Install HEIC support (optional)
pip install pillow-heif pyheif opencv-python
# Create folders
mkdir Images known_faces backend
# Start backend
python backend/main.py
# Install Node.js dependencies
npm cache clean --force
npm install --legacy-peer-deps
# Start development server
npm run dev

- "No module named uvicorn": Run `pip install uvicorn fastapi`
- CUDA out of memory: The app will automatically use CPU if GPU memory is insufficient
- No images found: Make sure images are in the `Images/` folder with supported formats
- HEIC not working: Run `python scripts/install-heic-support.py`
- HEIC images not loading: Install pillow-heif: `pip install pillow-heif`
- HEIC conversion errors: Try the alternative: `pip install pyheif`
- macOS HEIC issues: Ensure Xcode command line tools are installed
- "next is not recognized": Run `npm install next react react-dom`
- PowerShell execution policy: Use Command Prompt instead of PowerShell, or run `Set-ExecutionPolicy RemoteSigned`
- npm install fails: Try `npm install --legacy-peer-deps` or `npm install --force`
- Backend offline: Make sure backend is running on port 8000
- CORS errors: Backend allows localhost:3000 by default
- Port conflicts: Change ports in the configuration if needed
- Check the browser console for frontend errors
- Check the terminal running the backend for Python errors
- Visit http://localhost:8000/docs to test the API directly
- Test HEIC support: http://localhost:8000/supported-formats
- Ensure both servers are running simultaneously
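
A quick way to verify that both servers are up is to test whether their ports accept TCP connections. This is a minimal sketch assuming the default ports mentioned in this README (3000 for the frontend, 8000 for the backend); `is_port_open` is a hypothetical helper name.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    for label, port in (("Frontend", 3000), ("Backend", 8000)):
        state = "running" if is_port_open("localhost", port) else "not reachable"
        print(f"{label} (port {port}): {state}")
```

An open port only shows that something is listening; the API documentation page remains the better check that the backend itself is healthy.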
The application works out of the box, but you can customize:
- Backend Port: Modify `port=8000` in `backend/main.py`
- Frontend Port: Use `npm run dev -- -p 3001` or set the `PORT` environment variable (Next.js does not read the dev server port from `next.config.mjs`)
- Model Device: Automatically detects CUDA/CPU, or set `device = "cpu"` in `backend/main.py`
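
Device detection of this kind is commonly written as below. This is a sketch of the usual PyTorch idiom, not a copy of `backend/main.py`: `pick_device` and its `force_cpu` flag are hypothetical, and the import is wrapped so the snippet also runs where PyTorch is not installed.

```python
def pick_device(force_cpu=False):
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if force_cpu:
        return "cpu"
    try:
        import torch  # imported lazily so the sketch runs without PyTorch installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

Passing `force_cpu=True` has the same effect as hard-coding `device = "cpu"`, which can help when a GPU is present but has too little memory for the models.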
- CLIP Model: Currently uses `ViT-L/14@336px`
- Face Detection: MTCNN with 160px image size
- Face Recognition: InceptionResnetV1 pretrained on VGGFace2
The system automatically detects and uses available HEIC libraries in this order:
- pillow-heif (best performance and compatibility)
- pyheif (alternative implementation)
- opencv-python (basic fallback)
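
The preference order above can be checked on a given machine without importing the libraries themselves. The sketch below is an assumption about how such detection could be done, not the backend's code: `first_available` and `HEIC_BACKENDS` are hypothetical names, while `pillow_heif`, `pyheif`, and `cv2` are the import names of the three packages.

```python
import importlib.util

# (package label, import name) in the preference order listed above.
HEIC_BACKENDS = [
    ("pillow-heif", "pillow_heif"),
    ("pyheif", "pyheif"),
    ("opencv-python", "cv2"),
]


def first_available(backends=HEIC_BACKENDS):
    """Return the label of the first importable backend, or None if none is installed."""
    for label, module_name in backends:
        if importlib.util.find_spec(module_name) is not None:
            return label
    return None


if __name__ == "__main__":
    print("HEIC decoder:", first_available() or "none installed")
```

`importlib.util.find_spec` only locates the module, so the check stays fast even when a heavy package such as OpenCV is installed.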
- First startup: 30-60 seconds (downloading models)
- Subsequent startups: 5-10 seconds
- Embedding generation: ~1-2 seconds per image
- HEIC processing: ~2-3 seconds per image (first time)
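
These per-image figures translate into indexing time for a whole collection by simple multiplication. The helper below is a hypothetical back-of-the-envelope estimate using the ranges quoted above; actual times depend on hardware and image size.

```python
def estimate_indexing_seconds(image_count, seconds_per_image=(1.0, 2.0)):
    """Return (low, high) estimated seconds to embed `image_count` images."""
    low, high = seconds_per_image
    return image_count * low, image_count * high


low, high = estimate_indexing_seconds(500)
print(f"500 images: about {low / 60:.0f} to {high / 60:.0f} minutes")
```

For 500 images at 1-2 seconds each this gives 500 to 1000 seconds, or roughly 8 to 17 minutes; passing `(2.0, 3.0)` models the slower first-time HEIC case.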
- Minimum: 4GB RAM, CPU-only
- Recommended: 8GB+ RAM, NVIDIA GPU with 4GB+ VRAM
- Storage: ~2GB for models + your image collection
- JPG/PNG: Fastest processing
- HEIC/HEIF: Slightly slower due to conversion
- WebP/TIFF: Standard processing speed
- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make your changes and test thoroughly
- Commit your changes: `git commit -m 'Add feature'`
- Push to the branch: `git push origin feature-name`
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI CLIP for semantic image understanding
- FaceNet PyTorch for face recognition
- pillow-heif for HEIC support
- FastAPI for the backend framework
- Next.js for the frontend framework
- v0.dev for rapid UI development
If you encounter any issues:
- Check the troubleshooting section above
- Review the logs in your terminal
- Test the API directly at http://localhost:8000/docs
- Check HEIC support at http://localhost:8000/supported-formats
- Create an issue on GitHub with:
  - Your operating system
  - Python and Node.js versions
  - Image formats you're trying to use
  - Complete error messages
  - Steps to reproduce the issue
Happy searching!