A comprehensive, production-ready deepfake detection system supporting images, videos, and audio. Built with PyTorch and following industry best practices for scalable machine learning applications.
- Multi-Dataset Support: FaceForensics++, DFDC, Celeb-DF, WildDeepfake, ASVspoof 2021, FakeAVCeleb
- Video Processing: Face detection, extraction, and alignment with OpenCV
- Audio Processing: MFCC, spectral features, voice activity detection with librosa
- PyTorch Integration: Efficient datasets and dataloaders with augmentation
- Data Management: Automated splitting, validation, and metadata tracking
- Phase 2: Model Development (EfficientNet, Vision Transformers, AASIST)
- Phase 3: Backend API (FastAPI, real-time processing)
- Phase 4: Frontend Interface (React, mobile support)
- Phase 5: Deployment (Docker, Kubernetes, cloud)
- Phase 6: Optimization (TensorRT, quantization)
- Phase 7: Ethics & Explainability (Grad-CAM, bias mitigation)
# Clone repository
git clone https://github.com/bivek2003/DeepFake_Detection.git
cd DeepFake-Detection
# Install dependencies
pip install -r requirements.txt
# Setup development environment
make setup
DeepFake-Detection/
โโโ src/deepfake_detector/ # Main package
โ โโโ data/ # Data processing modules
โ โ โโโ dataset_manager.py # Dataset registry and management
โ โ โโโ video_processor.py # Video processing pipeline
โ โ โโโ audio_processor.py # Audio feature extraction
โ โ โโโ data_pipeline.py # PyTorch integration
โ โ โโโ __init__.py
โ โโโ models/ # Model architectures (Phase 2)
โ โโโ utils/ # Utilities and configuration
โ โโโ __init__.py
โโโ datasets/ # Raw datasets
โโโ preprocessed/ # Processed data
โโโ logs/ # Application logs
โโโ tests/ # Unit tests
โโโ requirements.txt # Dependencies
โโโ setup.py # Package setup
โโโ Makefile # Development commands
โโโ README.md # Documentation
Dataset | Type | Size | Description |
---|---|---|---|
FaceForensics++ | Video | 38.5 GB | 1,000 videos, 1.8M manipulated images |
DFDC | Video | 470 GB | 100K+ clips from 3,426 actors |
Celeb-DF | Video | 15.8 GB | 590 originals + 5,639 deepfakes |
WildDeepfake | Video | 4.2 GB | 707 "in-the-wild" deepfake videos |
DeeperForensics-1.0 | Video | 2,000 GB | 60K videos with rich perturbations |
ASVspoof 2021 | Audio | 23.1 GB | TTS/VC speech deepfake dataset |
FakeAVCeleb | Multimodal | 87.5 GB | Synchronized video + audio deepfakes |
Total Dataset Size: ~2.6 TB
The system uses YAML configuration files for easy customization:
# config.yaml
data:
data_root: "./datasets"
video_target_size: [224, 224]
audio_sample_rate: 16000
test_size: 0.2
val_size: 0.1
training:
batch_size: 32
learning_rate: 1e-4
num_epochs: 50
device: "auto"
logging:
level: "INFO"
file_logging: true
log_dir: "./logs"
- Face Detection: OpenCV Haar cascades or DNN models
- Face Extraction: Automatic cropping with padding
- Alignment & Resizing: Standardized 224ร224 RGB images
- Quality Analysis: Resolution, FPS, face detection rate scoring
- Augmentation: Flips, rotations, color jitter, compression artifacts
- Loading: Multi-format support (WAV, MP3, FLAC)
- Resampling: Standardized 16kHz mono audio
- Feature Extraction: MFCC, spectral features, mel spectrograms
- Voice Activity Detection: Automatic silence removal
- Augmentation: Noise addition, pitch shifting, time stretching
- Stratified Splitting: Maintains class balance across train/val/test
- PyTorch Integration: Efficient datasets and dataloaders
- Batch Processing: Parallel processing with configurable workers
- Memory Management: Optional preloading for faster training
- Metadata Tracking: Comprehensive logging and validation
make install # Install dependencies
make install-dev # Install with development tools
make test # Run tests
make lint # Check code style
make format # Format code with black
make clean # Clean generated files
make setup # Full project setup
from deepfake_detector.data import DatasetRegistry, DatasetInfo
# Register new dataset
registry = DatasetRegistry()
registry.datasets["new_dataset"] = DatasetInfo(
name="New Deepfake Dataset",
type="video",
url="https://example.com/dataset",
description="Custom deepfake dataset",
file_count=1000,
size_gb=50.0
)
# Run all tests
pytest tests/ -v
# Run specific test category
pytest tests/test_video_processor.py -v
pytest tests/test_audio_processor.py -v
# Run with coverage
pytest tests/ --cov=src/deepfake_detector --cov-report=html
- Face Detection: ~30 FPS on CPU, ~100 FPS on GPU
- Feature Extraction: 224ร224 faces at 60 FPS
- Batch Processing: 4ร speedup with parallel workers
- Feature Extraction: Real-time processing (3s audio in <0.1s)
- MFCC Computation: 13 coefficients in ~10ms
- Batch Processing: 8ร speedup with multiprocessing
- Video Dataset: ~2GB RAM for 10K samples (with preloading)
- Audio Dataset: ~1GB RAM for 50K samples (MFCC features)
- Streaming Mode: <100MB RAM (no preloading)
- Face Detection: OpenCV Haar cascades (fast) or DNN models (accurate)
- Preprocessing: Face alignment, padding, normalization
- Augmentation: TorchVision transforms with custom video-specific augmentations
- Quality Scoring: Multi-factor analysis (resolution, FPS, face detection rate)
- Feature Extraction: Librosa-based pipeline with 13 MFCC coefficients
- Voice Activity Detection: Energy and zero-crossing rate thresholding
- Augmentation: Time-domain and frequency-domain transformations
- Quality Analysis: SNR estimation, spectral quality metrics
- Stratified Splitting: Maintains class distribution across splits
- PyTorch Integration: Custom Dataset classes with efficient loading
- Parallel Processing: ThreadPoolExecutor for I/O bound operations
- Error Handling: Graceful degradation with fallback mechanisms
- EfficientNet backbone for video classification
- Vision Transformer integration
- AASIST audio architecture
- Ensemble methods and model fusion
- Cross-dataset evaluation
- FastAPI backend with async processing
- WebSocket support for real-time streams
- Rate limiting and authentication
- Comprehensive API documentation
- React web interface
- React Native mobile app
- Real-time camera processing
- Progressive Web App (PWA)
- Docker containerization
- Kubernetes orchestration
- CI/CD pipelines
- Cloud deployment (AWS/GCP/Azure)
- TensorRT acceleration
- Model quantization
- On-device inference
- Real-time performance (30+ FPS)
- Grad-CAM visualizations
- Bias detection and mitigation
- Fairness metrics
- Responsible AI guidelines
This project is licensed under the MIT License - see the LICENSE file for details.
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature
) - Make your changes following the code style
- Run tests (
make test
) - Commit your changes (
git commit -m 'Add amazing feature'
) - Push to the branch (
git push origin feature/amazing-feature
) - Open a Pull Request
- Follow PEP 8 guidelines
- Use Black for code formatting
- Add type hints where possible
- Write comprehensive docstrings
- Maintain test coverage >90%
- ๐ง Email: sharmabivek12@gmail.com
- ๐ Issues: GitHub Issues
- ๐ Documentation: Project Wiki
- FaceForensics++ team for the foundational dataset
- DFDC Challenge organizers for the comprehensive benchmark
- ASVspoof community for audio spoofing research
- PyTorch team for the excellent deep learning framework
- OpenCV and librosa communities for robust media processing tools
If you use this project in your research, please cite:
@misc{deepfake-detector-2025,
title={Comprehensive Deepfake Detection System},
author={Bivek Sharma Panthi},
year={2025},
url={https://github.com/bivek2003/DeepFake_Detection}
}
โญ Star this repository if you find it helpful!