Image forensics and analysis suite with ML-powered detection
Installation • Usage • Features • Web Interface • API • Limitations
⚠️ **Project Status & Developer's Note**

This is a hobby project in active development. I'm building this tool for fun and learning purposes. While I strive for quality, please be aware:
- **May contain bugs** - This is not production-ready software
- **Constantly improving** - Features and APIs may change
- **Updated in my free time** - Progress depends on my availability
- **Suggestions welcome** - If you find issues or have ideas for improvement, feel free to share!
- **Learning project** - May contain experimental or incomplete features
Thanks for your understanding and patience!
PixIntelligence v2.0 is a forensic image analysis tool designed for OSINT analysts, digital forensics professionals, and security researchers. It combines traditional forensic techniques with modern machine learning to provide comprehensive manipulation detection, steganography analysis, camera fingerprinting, and metadata extraction.
- Multi-Quality ELA - Enhanced Error Level Analysis across multiple compression levels
- Double JPEG Detection - DCT histogram analysis for precise compression history
- Median Filter Detection - Identify editing artifacts through streak analysis
- PRNU Analysis - Camera sensor fingerprinting for source identification
- Lighting Analysis - 3D lighting direction estimation with object detection
- ML Detection - ManTra-Net integration for pixel-level manipulation probability
⚠️ Note: The ML model is not included by default. See Detection Limitations for details.
- Web Dashboard - Modern React-based interface with real-time analysis
- FastAPI Backend - RESTful API with WebSocket support for progress updates
- JSON-Only CLI - Streamlined command-line with machine-readable output
**Multi-Quality ELA**
- Analysis at 7 different quality levels (60-95)
- Composite heatmap generation
- Consistency scoring across compression levels
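The core idea behind multi-quality ELA can be sketched in a few lines: re-save the image at several JPEG qualities and average the per-pixel re-compression error. This is a minimal illustration using Pillow and NumPy; `ela_map` is an illustrative name, not PixIntelligence's actual API.

```python
import io

import numpy as np
from PIL import Image

def ela_map(img, qualities=(60, 75, 90)):
    """Average re-compression error across several JPEG quality levels."""
    src = np.asarray(img.convert("RGB"), dtype=np.float32)
    acc = np.zeros(src.shape[:2], dtype=np.float32)
    for q in qualities:
        buf = io.BytesIO()
        img.convert("RGB").save(buf, format="JPEG", quality=q)
        buf.seek(0)
        rec = np.asarray(Image.open(buf), dtype=np.float32)
        # Regions pasted in from another source tend to re-compress with a
        # different error profile than the rest of the image.
        acc += np.abs(src - rec).mean(axis=2)
    return acc / len(qualities)

img = Image.new("RGB", (64, 64), (128, 64, 32))
heat = ela_map(img)  # per-pixel error map, same height/width as the input
```

The real implementation analyzes 7 quality levels and fuses the maps into a composite heatmap with a consistency score; this sketch only shows the error-map primitive.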
**Double JPEG Detection**
- DCT coefficient histogram analysis
- Quantization artifact detection
- Compression history estimation
**Median Filter Detection**
- Streak detection algorithm
- Zero-difference analysis
- Filter size estimation
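The zero-difference idea can be demonstrated compactly: a median filter is nearly idempotent, so re-applying one to an already-filtered image changes far fewer pixels than applying it to raw sensor noise. A NumPy-only sketch (function names are illustrative):

```python
import numpy as np

def median3(img):
    """3x3 median filter (edge-padded), NumPy only."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return np.median(stack, axis=0)

def zero_diff_ratio(img):
    """Fraction of pixels left unchanged by re-applying a median filter."""
    return float(np.mean(median3(img) == img))

rng = np.random.default_rng(0)
noise = rng.integers(0, 256, (64, 64)).astype(np.uint8)
smoothed = median3(noise).astype(np.uint8)
# An already-filtered image re-filters to (almost) itself, so its
# zero-difference ratio is far higher than that of raw noise.
r_noise, r_smoothed = zero_diff_ratio(noise), zero_diff_ratio(smoothed)
```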
**PRNU Analysis**
- Camera sensor fingerprint extraction via wavelet denoising
- Peak-to-Correlation-Energy (PCE) matching
- Splice and tampering detection
- HDF5-based camera database
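The PRNU workflow in miniature: extract a noise residual from each image, average residuals from known reference shots into a camera fingerprint, then correlate a query image's residual against it. The real pipeline uses wavelet denoising and PCE matching; this sketch substitutes a box blur and plain normalized cross-correlation, and all names are illustrative.

```python
import numpy as np

def residual(img):
    """Noise residual: the image minus a 3x3 box-blurred copy of itself."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    blur = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def ncc(a, b):
    """Normalized cross-correlation between two residuals."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(42)
pattern = rng.normal(0.0, 3.0, (64, 64))   # stands in for a sensor's PRNU
shots = [rng.normal(128.0, 10.0, (64, 64)) + pattern for _ in range(8)]
# Averaging residuals keeps the shared sensor pattern, attenuates content
fingerprint = np.mean([residual(s) for s in shots], axis=0)

same_cam = rng.normal(128.0, 10.0, (64, 64)) + pattern
other_cam = rng.normal(128.0, 10.0, (64, 64))
match = ncc(fingerprint, residual(same_cam))
no_match = ncc(fingerprint, residual(other_cam))
```

In the real tool, splice detection works the same way locally: a pasted region's residual fails to correlate with the fingerprint even when the rest of the image matches.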
**Lighting Inconsistency**
- YOLOv8 object detection
- Per-object 3D lighting direction estimation
- Angular difference analysis
- Lighting vector visualization
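A toy version of the azimuth estimate: under a rough Lambertian assumption, the dominant direction of the intensity gradient tracks the 2D lighting azimuth (sign conventions vary). The real pipeline estimates a full 3D direction per detected object and compares them; this sketch is illustrative only.

```python
import numpy as np

def lighting_azimuth_deg(gray):
    """Mean intensity-gradient direction as a rough 2D lighting azimuth."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return float(np.degrees(np.arctan2(gy.mean(), gx.mean())))

# A ramp brightening to the right reads as azimuth ~0 degrees;
# a ramp brightening downward reads as ~90 degrees.
right_lit = np.tile(np.arange(64, dtype=np.float64), (64, 1))
down_lit = right_lit.T
az_a, az_b = lighting_azimuth_deg(right_lit), lighting_azimuth_deg(down_lit)
```

Comparing per-object azimuths is what flags composites: two objects lit from directions ~90° apart, as in the example output further below, are unlikely to share one light source.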
**ML-Based Detection**
- ManTra-Net integration (with fallback)
- Sliding window inference
- Pixel-level manipulation probability
- Confidence mapping
**Traditional Methods**
- Clone detection (Copy-Move)
- Noise pattern analysis
- Edge inconsistency detection
- Color manipulation analysis
- LSB (Least Significant Bit) analysis
- Chi-square statistical test
- RS Steganalysis
- Frequency domain analysis
- Pattern recognition
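The chi-square test in the list above exploits a statistical quirk of LSB embedding: writing random bits into least-significant bits equalizes the counts of each "pair of values" (2k, 2k+1), so an unusually low chi-square statistic over those pairs is the suspicious outcome. A minimal sketch (illustrative names, not the project's implementation):

```python
import numpy as np

def chi_square_lsb(values):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    LSB embedding tends to equalize counts within each pair, so a LOW
    statistic (relative to its degrees of freedom) hints at embedding.
    """
    counts = np.bincount(np.asarray(values).ravel(), minlength=256).astype(float)
    chi = 0.0
    for k in range(128):
        pair = counts[2 * k] + counts[2 * k + 1]
        if pair > 0:
            expected = pair / 2.0
            chi += (counts[2 * k] - expected) ** 2 / expected
    return chi

rng = np.random.default_rng(1)
# A cover with maximally unpaired values (all even), then the same data
# after simulated full-capacity LSB embedding of random bits.
cover = (rng.integers(0, 128, 50_000) * 2).astype(np.uint8)
stego = (cover & 0xFE) | rng.integers(0, 2, cover.size).astype(np.uint8)
```

As the Limitations section below explains, a "natural-looking" statistic proves nothing: well-encrypted, sparsely embedded data can stay below this test's sensitivity.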
- Complete EXIF - Camera, settings, date/time
- GPS - Coordinates, altitude, direction
- XMP - Adobe metadata, editing history
- IPTC - Copyright, keywords, location
- Embedded thumbnail analysis
- Sharpness and focus assessment
- Blur detection
- Color and saturation analysis
- Face detection
- Texture analysis
- Cryptographic: MD5, SHA1, SHA256, SHA512, BLAKE2
- Perceptual: pHash, dHash, aHash, wHash
- Structural: SIFT, ORB, contour-based
- Similarity comparison
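Perceptual hashes differ from cryptographic ones in that visually similar images produce nearby hashes. The simplest of the listed family, aHash, fits in a few lines with Pillow and NumPy (illustrative sketch, not the project's implementation):

```python
import numpy as np
from PIL import Image

def ahash(img, size=8):
    """Average hash: downscale, grayscale, threshold each pixel at the mean."""
    small = np.asarray(img.convert("L").resize((size, size)), dtype=np.float64)
    bits = (small > small.mean()).ravel()
    h = 0
    for b in bits:              # pack the 64 bits into one integer
        h = (h << 1) | int(b)
    return h

def hamming(h1, h2):
    """Number of differing bits between two hashes (0 = likely duplicates)."""
    return bin(h1 ^ h2).count("1")

# A left-to-right ramp and its mirror image: same pixels, opposite layout,
# so almost every hash bit flips.
ramp = np.tile(np.linspace(0, 255, 64).astype(np.uint8), (64, 1))
img = Image.fromarray(ramp, "L")
mirrored = Image.fromarray(ramp[:, ::-1].copy(), "L")
```

Similarity comparison is then just a Hamming-distance threshold: small distances indicate the same underlying picture despite re-encoding or mild edits.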
- Interactive Web UI with real-time analysis and visualizations
- Structured JSON for integration with other tools
- Batch analysis with summary reports
- Image comparison side-by-side
No system can detect steganography with 100% certainty - this is a fundamental mathematical limitation, not a technical one.
- **Mathematical Limitation:** If hidden data is well-distributed and encrypted, it becomes indistinguishable from natural image noise.
- **No Original Image:** Without the "clean" image for comparison, we cannot determine which changes are natural and which are intentional.
- **Fundamental Trade-off:**
- Sensitive thresholds → more detections, but more false positives
- Strict thresholds → fewer false positives, but missed detections
- Interpretation: Very unlikely that steganography is present
- Recommendation: Image appears natural
- False positives: ~5-10%
- Interpretation: Some anomalies, but could be natural
- Recommendation: Review context (JPEG compression, normal editing)
- False positives: ~20-30%
- Interpretation: Multiple suspicious indicators
- Recommendation: Additional investigation recommended
- False positives: ~30-40%
- Interpretation: Strong evidence of steganography
- Recommendation: Likely contains hidden data
- Note: "Confirmed" means "high probability", NOT "absolute certainty"
- False positives: ~10-20%
- Compression artifacts can appear as alterations
- Especially with quality < 70%
- Natural sensor noise has high entropy
- Can resemble encrypted random data
- Brightness/contrast adjustments modify LSBs
- Filters and effects alter statistical distributions
- Natural textures (grass, sand, clouds) have high entropy
- May trigger randomness tests
- **Don't rely on a single indicator:** Multiple positive tests increase confidence
- **Consider the context:**
- Was the image compressed/recompressed?
- Was it edited with software?
- Does it have high ISO noise?
- **Look for consistent patterns:** If multiple different techniques detect anomalies in the same regions, the probability increases
- **Combine with other evidence:** Metadata, user behavior, timing, etc.
PixIntelligence now includes a dedicated Uncertainty Analysis tab that provides:
- Test Agreement Metrics: Visual representation of how many tests agree
- Bayesian Probability Analysis: Mathematical confidence based on test reliability
- Payload Capacity Estimation: Approximate size of potentially hidden data
- Adaptive Steganography Detection: Detection of sophisticated hiding techniques
- Contextual Recommendations: Warnings about image characteristics that may cause false positives
This feature is based on information-theoretic models and provides transparency about detection limitations.
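The Bayesian part of this analysis can be illustrated with a naive likelihood-ratio update over independent test outcomes. The function below is a sketch of the reasoning, not the tab's actual code, and the sensitivity/false-positive rates in the example are illustrative placeholders, not calibrated values.

```python
def posterior(prior, tests):
    """Combine independent detector outcomes via likelihood ratios.

    tests: iterable of (fired, true_positive_rate, false_positive_rate)
    tuples, one per detector. Returns the posterior probability that
    hidden data is present.
    """
    odds = prior / (1.0 - prior)
    for fired, tpr, fpr in tests:
        # A positive outcome multiplies the odds by TPR/FPR;
        # a negative outcome multiplies by (1-TPR)/(1-FPR).
        odds *= (tpr / fpr) if fired else ((1.0 - tpr) / (1.0 - fpr))
    return odds / (1.0 + odds)

# Two tests fire and one stays silent, starting from a 5% base rate:
p = posterior(0.05, [(True, 0.80, 0.10), (True, 0.70, 0.20), (False, 0.60, 0.05)])
```

This also shows why agreement between tests matters: each additional positive multiplies the odds, while a silent test pulls the posterior back down.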
The easiest way to run PixIntelligence is using Docker:

```bash
# Clone repository
git clone git@github.com:jacobobb/pixintelligence.git
cd pixintelligence

# Build and start with Docker Compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop the service
docker-compose down
```

Access the application:
- Web UI: http://localhost:8000/ui
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/
Docker commands:

```bash
# Build the image
docker build -t pixintelligence:latest .

# Run manually (without docker-compose)
docker run -d \
  --name pixintelligence \
  -p 8000:8000 \
  -v $(pwd)/src/data:/app/src/data \
  -v $(pwd)/output:/app/output \
  pixintelligence:latest

# Access container shell
docker exec -it pixintelligence bash

# View real-time logs
docker logs -f pixintelligence

# Restart container
docker restart pixintelligence

# Stop and remove
docker stop pixintelligence && docker rm pixintelligence
```

Production deployment:
```bash
# Run in production mode (detached, with restart policy)
docker-compose -f docker-compose.yml up -d --build

# Scale if needed (for load balancing)
docker-compose up -d --scale pixintelligence=3

# Update to latest version
git pull
docker-compose down
docker-compose up -d --build
```

Prerequisites:

- Python 3.8 or higher
- pip (Python package manager)
- Node.js 16+ (for the web interface)
- Git (optional)
```bash
# Clone repository
git clone git@github.com:jacobobb/pixintelligence.git
cd pixintelligence

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .
```

Frontend setup:

```bash
# Navigate to frontend directory
cd src/web/frontend

# Install Node dependencies
npm install

# Build for production
npm run build

# Or run in development mode
npm run dev
```

YOLOv8 (Object Detection):
- ✅ Automatic download on first use
- Downloads a ~6 MB model (yolov8n.pt)
- No manual intervention needed

ManTra-Net (ML Manipulation Detection):

- ⚠️ Uses an intelligent fallback if not available
- The system works without it (falls back to edge-based detection)
- Optional: download pre-trained weights if desired

```bash
# Optional: for better ML detection (if you have the model)
# Place ManTra-Net weights in: src/ml/models/mantranet.pth
```

For full XMP functionality:
```bash
# Ubuntu/Debian
sudo apt-get install libexempi3

# macOS
brew install exempi

# Then install python-xmp-toolkit
pip install python-xmp-toolkit
```

Test that everything is working:
```bash
# Run info command
python pixintelligence.py info

# Test with sample image
python pixintelligence.py analyze tests/test_images/DSCN0010.jpg --verbose
```

Launch the web dashboard for interactive analysis:
```bash
# Start the web server
pixintelligence serve

# Custom host and port
pixintelligence serve --host 0.0.0.0 --port 8000

# Development mode with auto-reload
pixintelligence serve --reload
```

Then open http://localhost:8000 in your browser.
Features:

- Drag-and-drop image upload
- Real-time analysis progress (WebSocket)
- Interactive manipulation heatmaps
- Detailed metrics dashboard
- PRNU camera database management
- Analysis history and search
- Export results as JSON
```bash
# Complete analysis (JSON output)
pixintelligence analyze image.jpg

# With pretty-printed JSON
pixintelligence analyze image.jpg --pretty

# Custom output path
pixintelligence analyze image.jpg --output results/my_analysis.json

# Specific checks only
pixintelligence analyze image.jpg --checks manipulation stego

# Verbose mode (detailed console output)
pixintelligence analyze image.jpg --verbose
```

```bash
# Analyze directory (JSON output for each image)
pixintelligence batch ./images

# Recursive search in subdirectories
pixintelligence batch ./images --recursive

# With specific checks
pixintelligence batch ./images --checks manipulation metadata --recursive

# Custom output directory
pixintelligence batch ./images --output-dir ./results
```

```bash
# Compare two images
pixintelligence compare image1.jpg image2.jpg

# Custom output directory
pixintelligence compare image1.jpg image2.jpg --output-dir ./results
```

```bash
# Add camera to PRNU database (10+ images recommended)
pixintelligence prnu-add camera_001 "Canon EOS 5D Mark IV" ref1.jpg ref2.jpg ref3.jpg ref4.jpg ref5.jpg

# List all cameras in database
pixintelligence prnu-list

# Example output:
# Camera ID       Model                   Samples   Added
# ====================================================================================
# camera_001      Canon EOS 5D Mark IV    5         2025-11-07
```

```bash
# Show tool capabilities and version
pixintelligence info
```

| Parameter | Description | Values |
|---|---|---|
| `--checks`, `-c` | Types of analysis to perform | `all`, `metadata`, `manipulation`, `stego`, `quality`, `hash` |
| `--output`, `-o` | Custom JSON output path | File path |
| `--output-dir`, `-d` | Output directory | Directory path |
| `--pretty`, `-p` | Pretty-print JSON | Flag |
| `--recursive`, `-r` | Recursive search | Flag |
| `--verbose`, `-v` | Detailed output | Flag |
| `--host` | Server host (`serve` command) | IP address (default: `0.0.0.0`) |
| `--port` | Server port (`serve` command) | Port number (default: `8000`) |
| `--reload` | Auto-reload server (`serve` command) | Flag |
```bash
# Launch web interface
pixintelligence serve

# Then in browser:
# 1. Upload image via drag-and-drop
# 2. Watch real-time progress
# 3. View interactive heatmap
# 4. Explore detailed metrics
# 5. Download JSON results
```

```bash
# Analyze suspicious image with all detection methods
pixintelligence analyze suspicious_photo.jpg --verbose

# Output shows:
# ✓ Multi-Quality ELA: 7 levels analyzed
# ✓ Double JPEG: Detected (confidence: 78%)
# ✓ Median Filter: Not detected
# ✓ PRNU Analysis: No camera match
# ✓ Lighting Analysis: 3 objects detected, consistent
# ✓ ML Detection: Manipulation probability 45%
# ✓ Manipulation Score: 65% (High likelihood)
# ✓ JSON report saved: ./output/suspicious_photo_20251107_120000_report.json
```

```bash
# Step 1: Build camera fingerprint database
pixintelligence prnu-add my_camera "Canon EOS 5D" \
    ref1.jpg ref2.jpg ref3.jpg ref4.jpg ref5.jpg \
    ref6.jpg ref7.jpg ref8.jpg ref9.jpg ref10.jpg

# Step 2: Analyze unknown image
pixintelligence analyze unknown.jpg --verbose

# Results will show:
# PRNU Analysis:
#   - Camera Match: Yes (PCE: 125.4)
#   - Best Match: my_camera (Canon EOS 5D)
#   - Tampering Detected: No
#   - Match Confidence: Very High
```

```bash
# Analyze composite image for lighting issues
pixintelligence analyze composite.jpg --checks manipulation --verbose

# Detection results:
# Lighting Analysis:
#   - Objects Detected: 4
#   - Object 1 (person): Azimuth 45°, Elevation 30°
#   - Object 2 (person): Azimuth 135°, Elevation 25°
#   - Object 3 (car): Azimuth 50°, Elevation 28°
#   - Object 4 (tree): Azimuth 225°, Elevation 35°
#   - Inconsistency: DETECTED
#   - Angular Difference: 90° (Objects 1-2)
#   - Confidence: 85%
```

```bash
# Analyze entire evidence folder
pixintelligence batch ./evidence --recursive --verbose

# Process:
# Processing images ████████████████████ 100% [25/25]
# ✓ Successfully analyzed: 25 images
# ✓ Reports generated in ./output/
#
# Summary:
#   - High manipulation likelihood: 5 images
#   - Medium likelihood: 8 images
#   - Low likelihood: 12 images
```

```python
import requests

# Upload and analyze
with open('image.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/api/analyze',
        files={'file': f}
    )
analysis_id = response.json()['analysis_id']

# Get results
results = requests.get(f'http://localhost:8000/api/results/{analysis_id}')
data = results.json()

print(f"Manipulation: {data['manipulation']['likelihood']}")
print(f"Score: {data['manipulation']['score']}%")

# Detection flags
flags = data['flags']
if flags['double_jpeg']:
    print("⚠️ Double JPEG compression detected")
if flags['lighting_inconsistency']:
    print("⚠️ Lighting inconsistency detected")
if flags['prnu_tampering']:
    print("⚠️ PRNU tampering detected")
```

PixIntelligence v2.0 provides a complete REST API:
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/` | API root information |
| `GET` | `/api/health` | Health check |
| `POST` | `/api/analyze` | Upload and analyze image |
| `GET` | `/api/results/{id}` | Get analysis results |
| `GET` | `/api/reports` | List all analyses |
| `GET` | `/api/heatmap/{id}` | Get manipulation heatmap |
| `POST` | `/api/prnu/add-reference` | Add camera reference |
| `GET` | `/api/prnu/cameras` | List camera database |
| `DELETE` | `/api/analysis/{id}` | Delete analysis |
| `WS` | `/ws` | WebSocket for real-time progress |
```python
import requests

# Health check
response = requests.get('http://localhost:8000/api/health')
print(response.json())  # {"status": "healthy", "timestamp": "..."}

# Upload image
with open('image.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/api/analyze',
        files={'file': f}
    )
analysis_id = response.json()['analysis_id']

# Get results
results = requests.get(f'http://localhost:8000/api/results/{analysis_id}')
print(results.json())

# Get heatmap
heatmap = requests.get(f'http://localhost:8000/api/heatmap/{analysis_id}')
print(heatmap.json()['statistics'])

# List all reports
reports = requests.get('http://localhost:8000/api/reports')
for report in reports.json()['analyses']:
    print(f"{report['filename']}: {report['manipulation_likelihood']}")
```

```javascript
const ws = new WebSocket('ws://localhost:8000/ws');

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === 'progress') {
    console.log(`Stage: ${data.data.stage}`);
    console.log(`Progress: ${data.data.progress}%`);
  }
};
```

```
image-metadata-extractor/
├── src/
│   ├── main.py                    # CLI entry point
│   ├── core/                      # Core analysis modules
│   │   ├── manipulation.py        # Manipulation detection (updated)
│   │   ├── double_jpeg.py         # NEW: Double JPEG detection
│   │   ├── median_filter.py       # NEW: Median filter detection
│   │   ├── heatmap_generator.py   # NEW: Unified heatmap
│   │   ├── prnu_analysis.py       # NEW: PRNU fingerprinting
│   │   ├── prnu_database.py       # NEW: Camera database
│   │   ├── lighting_analysis.py   # NEW: Lighting analysis
│   │   ├── metadata_extractor.py  # Metadata extraction
│   │   ├── steganography.py       # Steganography detection
│   │   ├── image_analyzer.py      # Image quality analysis
│   │   └── hash_generator.py      # Hash generation
│   ├── ml/                        # Machine learning modules
│   │   ├── mantranet_detector.py  # NEW: ML manipulation detection
│   │   ├── object_detector.py     # NEW: YOLOv8 integration
│   │   └── stego_detector.py      # ML stego detection
│   ├── web/                       # NEW: Web application
│   │   ├── app.py                 # FastAPI backend
│   │   ├── models.py              # Database models
│   │   ├── database.py            # Database connection
│   │   └── frontend/              # React frontend
│   │       ├── src/
│   │       │   ├── App.jsx
│   │       │   └── components/
│   │       ├── package.json
│   │       └── vite.config.js
│   ├── reports/                   # Report generation
│   │   ├── json_exporter.py       # JSON reports (updated)
│   │   └── html_generator.py      # HTML (web only)
│   ├── data/                      # NEW: Data storage
│   │   ├── camera_fingerprints/   # PRNU database
│   │   ├── uploads/               # Uploaded images
│   │   └── pixintelligence.db     # SQLite database
│   └── utils/                     # Utilities
├── tests/                         # Tests and sample images
├── output/                        # CLI output directory
├── requirements.txt               # Python dependencies
├── README.md                      # This file
├── USAGE_v2.md                    # Detailed usage guide
└── IMPLEMENTATION_SUMMARY.md      # Implementation details
```
| Level | Range | Meaning |
|---|---|---|
| Very Low | 0-20% | Very unlikely |
| Low | 20-40% | Unlikely |
| Medium | 40-60% | Possible, requires investigation |
| High | 60-80% | Likely |
| Very High | 80-100% | Very likely |
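A direct translation of the table into code might look like the sketch below. The function name and the boundary handling (whether a score of exactly 20% falls in "Very Low" or "Low") are our assumptions, not the project's documented behavior.

```python
def likelihood_level(score):
    """Map a 0-100 confidence score to the levels in the table above."""
    if score < 20:
        return "Very Low"
    if score < 40:
        return "Low"
    if score < 60:
        return "Medium"
    if score < 80:
        return "High"
    return "Very High"
```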
Manipulation:
- Areas with different compression levels (ELA)
- Inconsistent noise patterns
- Detected cloned regions
- Multiple JPEG compressions
Steganography:
- Modified LSBs in color channels
- Positive chi-square test
- Anomalous frequency patterns
- Known tool signatures
- Interactive visualizations with Plotly
- Color histogram charts
- Organized metadata tables
- Visual risk indicators
- Tab navigation
```json
{
  "version": "1.0",
  "exported_at": "2024-01-01T12:00:00",
  "results": {
    "file": "image.jpg",
    "analysis": {
      "manipulation": {
        "likelihood": "Medium",
        "score": 45.5,
        "indicators": ["ELA anomalies", "Noise patterns"]
      },
      "steganography": {
        "likelihood": "Low",
        "score": 15.2
      },
      "metadata": {
        "exif": {...},
        "gps": {...}
      }
    }
  }
}
```

PyTorch Installation Fails:
```bash
# Try CPU-only version
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
```

Ultralytics/YOLO Installation Fails:

```bash
# Install without dependencies first
pip install ultralytics --no-deps
pip install opencv-python numpy
```

Web Dependencies Issues:

```bash
# Install web stack separately
pip install fastapi uvicorn[standard] sqlalchemy python-multipart websockets aiofiles
```

Frontend Build Issues:

```bash
# Clear cache and reinstall
cd src/web/frontend
rm -rf node_modules package-lock.json
npm install
```

"Module not found" errors:

```bash
# Make sure you're in the project directory
cd /path/to/image-metadata-extractor

# Activate virtual environment
source venv/bin/activate

# Reinstall in development mode
pip install -e .
```

YOLOv8 download fails:

```python
# Pre-download the model manually
from ultralytics import YOLO
model = YOLO('yolov8n.pt')  # Downloads to ~/.cache/torch/hub/
```

Web server won't start:

```bash
# Check if port is in use
lsof -ti:8000 | xargs kill -9

# Try different port
pixintelligence serve --port 8001
```

Database errors:

```bash
# Remove the database; it will be recreated on the next run
rm src/data/pixintelligence.db
```

PRNU database issues:

```bash
# Check database location
pixintelligence prnu-list

# Clear PRNU database if corrupted
rm src/data/camera_fingerprints/camera_fingerprints.h5
```

Analysis is slow:

- Disable ML detection if not needed
- Reduce image resolution before analysis
- Use `--checks` to run specific analyses only
- ML models require significant memory

Memory usage is high:

```bash
# Check if multiple analyses are running
ps aux | grep pixintelligence

# Restart web server if a memory leak is suspected
# (kill and restart `pixintelligence serve`)
```

Q: Do I need a GPU for ML detection? A: No, CPU works fine. GPU is optional and will speed up ML inference.
Q: Can I use this offline? A: Yes, after initial model downloads (YOLOv8), everything works offline.
Q: How accurate is the manipulation detection? A: Combines 7+ methods. No single method is 100% accurate, but combined confidence is high. Always verify findings manually.
Q: What image formats are supported? A: JPG, JPEG, PNG, GIF, BMP, TIFF, WEBP, HEIC, HEIF
Q: Can I integrate this into my application?
A: Yes! Use the REST API (pixintelligence serve) or import modules directly in Python.
- Multi-Quality ELA
- Double JPEG Detection
- Median Filter Detection
- PRNU Analysis
- Lighting Analysis
- ML Integration (ManTra-Net)
- Web Dashboard
- REST API + WebSocket
- PRNU Camera Database
- Load actual ManTra-Net pre-trained weights
- Alternative models: MVSS-Net, Noiseprint
- Ensemble predictions from multiple models
- GPU acceleration support
- Model fine-tuning interface
- Reverse image search APIs (Google, TinEye, Yandex)
- Social media metadata extraction
- Geolocation enrichment
- Timeline analysis
- Related images search
- Deepfake detection (video frames)
- OCR for text extraction
- Face recognition and tracking
- PDF report generation
- Batch comparison mode
- Image clustering and similarity
- User authentication and roles
- PostgreSQL support
- S3/Cloud storage integration
- Docker containerization
- Kubernetes deployment
- Monitoring and logging (Prometheus/Grafana)
- Rate limiting and API keys
- Comprehensive test suite
- Frame-by-frame video analysis
- Improved ML for steganography
- Parallel processing
- GPU support
- Results caching
- Distributed analysis
- Plugins and extensions
This project is dual-licensed:
Licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0)
✅ Free for:
- Personal use
- Research and education
- Non-profit organizations
- Open source projects
❌ Not allowed without a commercial license:
- Selling this software
- Using in commercial products/services
- Profit-generating activities
For commercial licensing, please contact the author. A separate commercial license is required for any commercial use.
See the LICENSE file for complete details.
- Jacobo Blancas Barroso - Initial development
- OSINT community
- Project contributors
- Open source libraries used
PixIntelligence - Professional tool for forensic image analysis
Made with ❤️ for the OSINT and digital forensics community
