Image forensics and analysis suite with ML-powered detection
Installation • Usage • Features • Web Interface • API • Limitations
⚠️ **Project Status & Developer's Note**

This is a hobby project in active development. I'm building this tool for fun and learning purposes. While I strive for quality, please be aware:
- **May contain bugs** - This is not production-ready software
- **Constantly improving** - Features and APIs may change
- **Updated in my free time** - Progress depends on my availability
- **Suggestions welcome** - If you find issues or have ideas for improvement, feel free to share!
- **Learning project** - May contain experimental or incomplete features
Thanks for your understanding and patience!
PixIntelligence v2.0 is a forensic image analysis tool designed for OSINT analysts, digital forensics professionals, and security researchers. It combines traditional forensic techniques with modern machine learning to provide comprehensive manipulation detection, steganography analysis, camera fingerprinting, and metadata extraction.
- Multi-Quality ELA - Enhanced Error Level Analysis across multiple compression levels
- Double JPEG Detection - DCT histogram analysis for precise compression history
- Median Filter Detection - Identify editing artifacts through streak analysis
- PRNU Analysis - Camera sensor fingerprinting for source identification
- Lighting Analysis - 3D lighting direction estimation with object detection
- ML Detection - ManTra-Net integration for pixel-level manipulation probability
⚠️ Note: The ML model is not included by default. See Detection Limitations for details.
- Web Dashboard - Modern React-based interface with real-time analysis
- FastAPI Backend - RESTful API with WebSocket support for progress updates
- JSON-Only CLI - Streamlined command-line with machine-readable output
**Multi-Quality ELA**
- Analysis at 7 different quality levels (60-95)
- Composite heatmap generation
- Consistency scoring across compression levels
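The core idea behind multi-quality ELA can be sketched in a few lines: re-save the image at several JPEG qualities and average the per-pixel re-compression error. This is a minimal illustration using Pillow and NumPy; `ela_map` is an illustrative name, not PixIntelligence's actual API.

```python
import io

import numpy as np
from PIL import Image

def ela_map(img, qualities=(60, 75, 90)):
    """Average re-compression error across several JPEG quality levels."""
    src = np.asarray(img.convert("RGB"), dtype=np.float32)
    acc = np.zeros(src.shape[:2], dtype=np.float32)
    for q in qualities:
        buf = io.BytesIO()
        img.convert("RGB").save(buf, format="JPEG", quality=q)
        buf.seek(0)
        rec = np.asarray(Image.open(buf), dtype=np.float32)
        # Regions pasted in from another source tend to re-compress with a
        # different error profile than the rest of the image.
        acc += np.abs(src - rec).mean(axis=2)
    return acc / len(qualities)

img = Image.new("RGB", (64, 64), (128, 64, 32))
heat = ela_map(img)  # per-pixel error map, same height/width as the input
```

The real implementation analyzes 7 quality levels and fuses the maps into a composite heatmap with a consistency score; this sketch only shows the error-map primitive.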
**Double JPEG Detection**
- DCT coefficient histogram analysis
- Quantization artifact detection
- Compression history estimation
**Median Filter Detection**
- Streak detection algorithm
- Zero-difference analysis
- Filter size estimation
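The zero-difference idea can be demonstrated compactly: a median filter is nearly idempotent, so re-applying one to an already-filtered image changes far fewer pixels than applying it to raw sensor noise. A NumPy-only sketch (function names are illustrative):

```python
import numpy as np

def median3(img):
    """3x3 median filter (edge-padded), NumPy only."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return np.median(stack, axis=0)

def zero_diff_ratio(img):
    """Fraction of pixels left unchanged by re-applying a median filter."""
    return float(np.mean(median3(img) == img))

rng = np.random.default_rng(0)
noise = rng.integers(0, 256, (64, 64)).astype(np.uint8)
smoothed = median3(noise).astype(np.uint8)
# An already-filtered image re-filters to (almost) itself, so its
# zero-difference ratio is far higher than that of raw noise.
r_noise, r_smoothed = zero_diff_ratio(noise), zero_diff_ratio(smoothed)
```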
**PRNU Analysis**
- Camera sensor fingerprint extraction via wavelet denoising
- Peak-to-Correlation-Energy (PCE) matching
- Splice and tampering detection
- HDF5-based camera database
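The PRNU workflow in miniature: extract a noise residual from each image, average residuals from known reference shots into a camera fingerprint, then correlate a query image's residual against it. The real pipeline uses wavelet denoising and PCE matching; this sketch substitutes a box blur and plain normalized cross-correlation, and all names are illustrative.

```python
import numpy as np

def residual(img):
    """Noise residual: the image minus a 3x3 box-blurred copy of itself."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    blur = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def ncc(a, b):
    """Normalized cross-correlation between two residuals."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(42)
pattern = rng.normal(0.0, 3.0, (64, 64))   # stands in for a sensor's PRNU
shots = [rng.normal(128.0, 10.0, (64, 64)) + pattern for _ in range(8)]
# Averaging residuals keeps the shared sensor pattern, attenuates content
fingerprint = np.mean([residual(s) for s in shots], axis=0)

same_cam = rng.normal(128.0, 10.0, (64, 64)) + pattern
other_cam = rng.normal(128.0, 10.0, (64, 64))
match = ncc(fingerprint, residual(same_cam))
no_match = ncc(fingerprint, residual(other_cam))
```

In the real tool, splice detection works the same way locally: a pasted region's residual fails to correlate with the fingerprint even when the rest of the image matches.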
**Lighting Inconsistency**
- YOLOv8 object detection
- Per-object 3D lighting direction estimation
- Angular difference analysis
- Lighting vector visualization
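A toy version of the azimuth estimate: under a rough Lambertian assumption, the dominant direction of the intensity gradient tracks the 2D lighting azimuth (sign conventions vary). The real pipeline estimates a full 3D direction per detected object and compares them; this sketch is illustrative only.

```python
import numpy as np

def lighting_azimuth_deg(gray):
    """Mean intensity-gradient direction as a rough 2D lighting azimuth."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return float(np.degrees(np.arctan2(gy.mean(), gx.mean())))

# A ramp brightening to the right reads as azimuth ~0 degrees;
# a ramp brightening downward reads as ~90 degrees.
right_lit = np.tile(np.arange(64, dtype=np.float64), (64, 1))
down_lit = right_lit.T
az_a, az_b = lighting_azimuth_deg(right_lit), lighting_azimuth_deg(down_lit)
```

Comparing per-object azimuths is what flags composites: two objects lit from directions ~90° apart, as in the example output further below, are unlikely to share one light source.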
**ML-Based Detection**
- ManTra-Net integration (with fallback)
- Sliding window inference
- Pixel-level manipulation probability
- Confidence mapping
**Traditional Methods**
- Clone detection (Copy-Move)
- Noise pattern analysis
- Edge inconsistency detection
- Color manipulation analysis
- LSB (Least Significant Bit) analysis
- Chi-square statistical test
- RS Steganalysis
- Frequency domain analysis
- Pattern recognition
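The chi-square test in the list above exploits a statistical quirk of LSB embedding: writing random bits into least-significant bits equalizes the counts of each "pair of values" (2k, 2k+1), so an unusually low chi-square statistic over those pairs is the suspicious outcome. A minimal sketch (illustrative names, not the project's implementation):

```python
import numpy as np

def chi_square_lsb(values):
    """Chi-square statistic over the value pairs (2k, 2k+1).

    LSB embedding tends to equalize counts within each pair, so a LOW
    statistic (relative to its degrees of freedom) hints at embedding.
    """
    counts = np.bincount(np.asarray(values).ravel(), minlength=256).astype(float)
    chi = 0.0
    for k in range(128):
        pair = counts[2 * k] + counts[2 * k + 1]
        if pair > 0:
            expected = pair / 2.0
            chi += (counts[2 * k] - expected) ** 2 / expected
    return chi

rng = np.random.default_rng(1)
# A cover with maximally unpaired values (all even), then the same data
# after simulated full-capacity LSB embedding of random bits.
cover = (rng.integers(0, 128, 50_000) * 2).astype(np.uint8)
stego = (cover & 0xFE) | rng.integers(0, 2, cover.size).astype(np.uint8)
```

As the Limitations section below explains, a "natural-looking" statistic proves nothing: well-encrypted, sparsely embedded data can stay below this test's sensitivity.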
- Complete EXIF - Camera, settings, date/time
- GPS - Coordinates, altitude, direction
- XMP - Adobe metadata, editing history
- IPTC - Copyright, keywords, location
- Embedded thumbnail analysis
- Sharpness and focus assessment
- Blur detection
- Color and saturation analysis
- Face detection
- Texture analysis
- Cryptographic: MD5, SHA1, SHA256, SHA512, BLAKE2
- Perceptual: pHash, dHash, aHash, wHash
- Structural: SIFT, ORB, contour-based
- Similarity comparison
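Perceptual hashes differ from cryptographic ones in that visually similar images produce nearby hashes. The simplest of the listed family, aHash, fits in a few lines with Pillow and NumPy (illustrative sketch, not the project's implementation):

```python
import numpy as np
from PIL import Image

def ahash(img, size=8):
    """Average hash: downscale, grayscale, threshold each pixel at the mean."""
    small = np.asarray(img.convert("L").resize((size, size)), dtype=np.float64)
    bits = (small > small.mean()).ravel()
    h = 0
    for b in bits:              # pack the 64 bits into one integer
        h = (h << 1) | int(b)
    return h

def hamming(h1, h2):
    """Number of differing bits between two hashes (0 = likely duplicates)."""
    return bin(h1 ^ h2).count("1")

# A left-to-right ramp and its mirror image: same pixels, opposite layout,
# so almost every hash bit flips.
ramp = np.tile(np.linspace(0, 255, 64).astype(np.uint8), (64, 1))
img = Image.fromarray(ramp, "L")
mirrored = Image.fromarray(ramp[:, ::-1].copy(), "L")
```

Similarity comparison is then just a Hamming-distance threshold: small distances indicate the same underlying picture despite re-encoding or mild edits.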
- Interactive Web UI with real-time analysis and visualizations
- Structured JSON for integration with other tools
- Batch analysis with summary reports
- Image comparison side-by-side
No system can detect steganography with 100% certainty - this is a fundamental mathematical limitation, not a technical one.
- **Mathematical Limitation:** If hidden data is well-distributed and encrypted, it becomes indistinguishable from natural image noise.
- **No Original Image:** Without the "clean" image for comparison, we cannot determine which changes are natural and which are intentional.
- **Fundamental Trade-off:**
- Sensitive thresholds → more detections, but more false positives
- Strict thresholds → fewer false positives, but missed detections
- Interpretation: Very unlikely that steganography is present
- Recommendation: Image appears natural
- False positives: ~5-10%
- Interpretation: Some anomalies, but could be natural
- Recommendation: Review context (JPEG compression, normal editing)
- False positives: ~20-30%
- Interpretation: Multiple suspicious indicators
- Recommendation: Additional investigation recommended
- False positives: ~30-40%
- Interpretation: Strong evidence of steganography
- Recommendation: Likely contains hidden data
- Note: "Confirmed" means "high probability", NOT "absolute certainty"
- False positives: ~10-20%
- Compression artifacts can appear as alterations
- Especially with quality < 70%
- Natural sensor noise has high entropy
- Can resemble encrypted random data
- Brightness/contrast adjustments modify LSBs
- Filters and effects alter statistical distributions
- Natural textures (grass, sand, clouds) have high entropy
- May trigger randomness tests
- **Don't rely on a single indicator:** Multiple positive tests increase confidence
- **Consider the context:**
- Was the image compressed/recompressed?
- Was it edited with software?
- Does it have high ISO noise?
- **Look for consistent patterns:** If multiple different techniques detect anomalies in the same regions, the probability increases
- **Combine with other evidence:** Metadata, user behavior, timing, etc.
PixIntelligence now includes a dedicated Uncertainty Analysis tab that provides:
- Test Agreement Metrics: Visual representation of how many tests agree
- Bayesian Probability Analysis: Mathematical confidence based on test reliability
- Payload Capacity Estimation: Approximate size of potentially hidden data
- Adaptive Steganography Detection: Detection of sophisticated hiding techniques
- Contextual Recommendations: Warnings about image characteristics that may cause false positives
This feature is based on information-theoretic models and provides transparency about detection limitations.
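The Bayesian part of this analysis can be illustrated with a naive likelihood-ratio update over independent test outcomes. The function below is a sketch of the reasoning, not the tab's actual code, and the sensitivity/false-positive rates in the example are illustrative placeholders, not calibrated values.

```python
def posterior(prior, tests):
    """Combine independent detector outcomes via likelihood ratios.

    tests: iterable of (fired, true_positive_rate, false_positive_rate)
    tuples, one per detector. Returns the posterior probability that
    hidden data is present.
    """
    odds = prior / (1.0 - prior)
    for fired, tpr, fpr in tests:
        # A positive outcome multiplies the odds by TPR/FPR;
        # a negative outcome multiplies by (1-TPR)/(1-FPR).
        odds *= (tpr / fpr) if fired else ((1.0 - tpr) / (1.0 - fpr))
    return odds / (1.0 + odds)

# Two tests fire and one stays silent, starting from a 5% base rate:
p = posterior(0.05, [(True, 0.80, 0.10), (True, 0.70, 0.20), (False, 0.60, 0.05)])
```

This also shows why agreement between tests matters: each additional positive multiplies the odds, while a silent test pulls the posterior back down.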
The easiest way to run PixIntelligence is using Docker:

```bash
# Clone repository
git clone git@github.com:jacobobb/pixintelligence.git
cd pixintelligence

# Build and start with Docker Compose
docker-compose up -d

# View logs
docker-compose logs -f

# Stop the service
docker-compose down
```

Access the application:
- Web UI: http://localhost:8000/ui
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/
Docker commands:

```bash
# Build the image
docker build -t pixintelligence:latest .

# Run manually (without docker-compose)
docker run -d \
  --name pixintelligence \
  -p 8000:8000 \
  -v $(pwd)/src/data:/app/src/data \
  -v $(pwd)/output:/app/output \
  pixintelligence:latest

# Access container shell
docker exec -it pixintelligence bash

# View real-time logs
docker logs -f pixintelligence

# Restart container
docker restart pixintelligence

# Stop and remove
docker stop pixintelligence && docker rm pixintelligence
```

Production deployment:
```bash
# Run in production mode (detached, with restart policy)
docker-compose -f docker-compose.yml up -d --build

# Scale if needed (for load balancing)
docker-compose up -d --scale pixintelligence=3

# Update to latest version
git pull
docker-compose down
docker-compose up -d --build
```

Prerequisites:

- Python 3.8 or higher
- pip (Python package manager)
- Node.js 16+ (for the web interface)
- Git (optional)
```bash
# Clone repository
git clone git@github.com:jacobobb/pixintelligence.git
cd pixintelligence

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .
```

Frontend setup:

```bash
# Navigate to frontend directory
cd src/web/frontend

# Install Node dependencies
npm install

# Build for production
npm run build

# Or run in development mode
npm run dev
```

YOLOv8 (Object Detection):
- ✅ Automatic download on first use
- Downloads a ~6 MB model (yolov8n.pt)
- No manual intervention needed

ManTra-Net (ML Manipulation Detection):

- ⚠️ Uses an intelligent fallback if not available
- The system works without it (falls back to edge-based detection)
- Optional: download pre-trained weights if desired

```bash
# Optional: for better ML detection (if you have the model)
# Place ManTra-Net weights in: src/ml/models/mantranet.pth
```

For full XMP functionality:
```bash
# Ubuntu/Debian
sudo apt-get install libexempi3

# macOS
brew install exempi

# Then install python-xmp-toolkit
pip install python-xmp-toolkit
```

Test that everything is working:
```bash
# Run info command
python pixintelligence.py info

# Test with sample image
python pixintelligence.py analyze tests/test_images/DSCN0010.jpg --verbose
```

Launch the web dashboard for interactive analysis:
```bash
# Start the web server
pixintelligence serve

# Custom host and port
pixintelligence serve --host 0.0.0.0 --port 8000

# Development mode with auto-reload
pixintelligence serve --reload
```

Then open http://localhost:8000 in your browser.
Features:

- Drag-and-drop image upload
- Real-time analysis progress (WebSocket)
- Interactive manipulation heatmaps
- Detailed metrics dashboard
- PRNU camera database management
- Analysis history and search
- Export results as JSON
```bash
# Complete analysis (JSON output)
pixintelligence analyze image.jpg

# With pretty-printed JSON
pixintelligence analyze image.jpg --pretty

# Custom output path
pixintelligence analyze image.jpg --output results/my_analysis.json

# Specific checks only
pixintelligence analyze image.jpg --checks manipulation stego

# Verbose mode (detailed console output)
pixintelligence analyze image.jpg --verbose
```

```bash
# Analyze directory (JSON output for each image)
pixintelligence batch ./images

# Recursive search in subdirectories
pixintelligence batch ./images --recursive

# With specific checks
pixintelligence batch ./images --checks manipulation metadata --recursive

# Custom output directory
pixintelligence batch ./images --output-dir ./results
```

```bash
# Compare two images
pixintelligence compare image1.jpg image2.jpg

# Custom output directory
pixintelligence compare image1.jpg image2.jpg --output-dir ./results
```

```bash
# Add camera to PRNU database (10+ images recommended)
pixintelligence prnu-add camera_001 "Canon EOS 5D Mark IV" ref1.jpg ref2.jpg ref3.jpg ref4.jpg ref5.jpg

# List all cameras in database
pixintelligence prnu-list

# Example output:
# Camera ID       Model                   Samples   Added
# ====================================================================================
# camera_001      Canon EOS 5D Mark IV    5         2025-11-07
```

```bash
# Show tool capabilities and version
pixintelligence info
```

| Parameter | Description | Values |
|---|---|---|
| `--checks`, `-c` | Types of analysis to perform | `all`, `metadata`, `manipulation`, `stego`, `quality`, `hash` |
| `--output`, `-o` | Custom JSON output path | File path |
| `--output-dir`, `-d` | Output directory | Directory path |
| `--pretty`, `-p` | Pretty-print JSON | Flag |
| `--recursive`, `-r` | Recursive search | Flag |
| `--verbose`, `-v` | Detailed output | Flag |
| `--host` | Server host (`serve` command) | IP address (default: `0.0.0.0`) |
| `--port` | Server port (`serve` command) | Port number (default: `8000`) |
| `--reload` | Auto-reload server (`serve` command) | Flag |
```bash
# Launch web interface
pixintelligence serve

# Then in browser:
# 1. Upload image via drag-and-drop
# 2. Watch real-time progress
# 3. View interactive heatmap
# 4. Explore detailed metrics
# 5. Download JSON results
```

```bash
# Analyze suspicious image with all detection methods
pixintelligence analyze suspicious_photo.jpg --verbose

# Output shows:
# ✓ Multi-Quality ELA: 7 levels analyzed
# ✓ Double JPEG: Detected (confidence: 78%)
# ✓ Median Filter: Not detected
# ✓ PRNU Analysis: No camera match
# ✓ Lighting Analysis: 3 objects detected, consistent
# ✓ ML Detection: Manipulation probability 45%
# ✓ Manipulation Score: 65% (High likelihood)
# ✓ JSON report saved: ./output/suspicious_photo_20251107_120000_report.json
```

```bash
# Step 1: Build camera fingerprint database
pixintelligence prnu-add my_camera "Canon EOS 5D" \
    ref1.jpg ref2.jpg ref3.jpg ref4.jpg ref5.jpg \
    ref6.jpg ref7.jpg ref8.jpg ref9.jpg ref10.jpg

# Step 2: Analyze unknown image
pixintelligence analyze unknown.jpg --verbose

# Results will show:
# PRNU Analysis:
#   - Camera Match: Yes (PCE: 125.4)
#   - Best Match: my_camera (Canon EOS 5D)
#   - Tampering Detected: No
#   - Match Confidence: Very High
```

```bash
# Analyze composite image for lighting issues
pixintelligence analyze composite.jpg --checks manipulation --verbose

# Detection results:
# Lighting Analysis:
#   - Objects Detected: 4
#   - Object 1 (person): Azimuth 45°, Elevation 30°
#   - Object 2 (person): Azimuth 135°, Elevation 25°
#   - Object 3 (car): Azimuth 50°, Elevation 28°
#   - Object 4 (tree): Azimuth 225°, Elevation 35°
#   - Inconsistency: DETECTED
#   - Angular Difference: 90° (Objects 1-2)
#   - Confidence: 85%
```

```bash
# Analyze entire evidence folder
pixintelligence batch ./evidence --recursive --verbose

# Process:
# Processing images ████████████████████ 100% [25/25]
# ✓ Successfully analyzed: 25 images
# ✓ Reports generated in ./output/
#
# Summary:
#   - High manipulation likelihood: 5 images
#   - Medium likelihood: 8 images
#   - Low likelihood: 12 images
```

```python
import requests

# Upload and analyze
with open('image.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/api/analyze',
        files={'file': f}
    )
analysis_id = response.json()['analysis_id']

# Get results
results = requests.get(f'http://localhost:8000/api/results/{analysis_id}')
data = results.json()

print(f"Manipulation: {data['manipulation']['likelihood']}")
print(f"Score: {data['manipulation']['score']}%")

# Detection flags
flags = data['flags']
if flags['double_jpeg']:
    print("⚠️ Double JPEG compression detected")
if flags['lighting_inconsistency']:
    print("⚠️ Lighting inconsistency detected")
if flags['prnu_tampering']:
    print("⚠️ PRNU tampering detected")
```

PixIntelligence v2.0 provides a complete REST API:
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/` | API root information |
| `GET` | `/api/health` | Health check |
| `POST` | `/api/analyze` | Upload and analyze image |
| `GET` | `/api/results/{id}` | Get analysis results |
| `GET` | `/api/reports` | List all analyses |
| `GET` | `/api/heatmap/{id}` | Get manipulation heatmap |
| `POST` | `/api/prnu/add-reference` | Add camera reference |
| `GET` | `/api/prnu/cameras` | List camera database |
| `DELETE` | `/api/analysis/{id}` | Delete analysis |
| `WS` | `/ws` | WebSocket for real-time progress |
```python
import requests

# Health check
response = requests.get('http://localhost:8000/api/health')
print(response.json())  # {"status": "healthy", "timestamp": "..."}

# Upload image
with open('image.jpg', 'rb') as f:
    response = requests.post(
        'http://localhost:8000/api/analyze',
        files={'file': f}
    )
analysis_id = response.json()['analysis_id']

# Get results
results = requests.get(f'http://localhost:8000/api/results/{analysis_id}')
print(results.json())

# Get heatmap
heatmap = requests.get(f'http://localhost:8000/api/heatmap/{analysis_id}')
print(heatmap.json()['statistics'])

# List all reports
reports = requests.get('http://localhost:8000/api/reports')
for report in reports.json()['analyses']:
    print(f"{report['filename']}: {report['manipulation_likelihood']}")
```

```javascript
const ws = new WebSocket('ws://localhost:8000/ws');

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === 'progress') {
    console.log(`Stage: ${data.data.stage}`);
    console.log(`Progress: ${data.data.progress}%`);
  }
};
```

```
image-metadata-extractor/
├── src/
│   ├── main.py                    # CLI entry point
│   ├── core/                      # Core analysis modules
│   │   ├── manipulation.py        # Manipulation detection (updated)
│   │   ├── double_jpeg.py         # NEW: Double JPEG detection
│   │   ├── median_filter.py       # NEW: Median filter detection
│   │   ├── heatmap_generator.py   # NEW: Unified heatmap
│   │   ├── prnu_analysis.py       # NEW: PRNU fingerprinting
│   │   ├── prnu_database.py       # NEW: Camera database
│   │   ├── lighting_analysis.py   # NEW: Lighting analysis
│   │   ├── metadata_extractor.py  # Metadata extraction
│   │   ├── steganography.py       # Steganography detection
│   │   ├── image_analyzer.py      # Image quality analysis
│   │   └── hash_generator.py      # Hash generation
│   ├── ml/                        # Machine learning modules
│   │   ├── mantranet_detector.py  # NEW: ML manipulation detection
│   │   ├── object_detector.py     # NEW: YOLOv8 integration
│   │   └── stego_detector.py      # ML stego detection
│   ├── web/                       # NEW: Web application
│   │   ├── app.py                 # FastAPI backend
│   │   ├── models.py              # Database models
│   │   ├── database.py            # Database connection
│   │   └── frontend/              # React frontend
│   │       ├── src/
│   │       │   ├── App.jsx
│   │       │   └── components/
│   │       ├── package.json
│   │       └── vite.config.js
│   ├── reports/                   # Report generation
│   │   ├── json_exporter.py       # JSON reports (updated)
│   │   └── html_generator.py      # HTML (web only)
│   ├── data/                      # NEW: Data storage
│   │   ├── camera_fingerprints/   # PRNU database
│   │   ├── uploads/               # Uploaded images
│   │   └── pixintelligence.db     # SQLite database
│   └── utils/                     # Utilities
├── tests/                         # Tests and sample images
├── output/                        # CLI output directory
├── requirements.txt               # Python dependencies
├── README.md                      # This file
├── USAGE_v2.md                    # Detailed usage guide
└── IMPLEMENTATION_SUMMARY.md      # Implementation details
```
| Level | Range | Meaning |
|---|---|---|
| Very Low | 0-20% | Very unlikely |
| Low | 20-40% | Unlikely |
| Medium | 40-60% | Possible, requires investigation |
| High | 60-80% | Likely |
| Very High | 80-100% | Very likely |
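A direct translation of the table into code might look like the sketch below. The function name and the boundary handling (whether a score of exactly 20% falls in "Very Low" or "Low") are our assumptions, not the project's documented behavior.

```python
def likelihood_level(score):
    """Map a 0-100 confidence score to the levels in the table above."""
    if score < 20:
        return "Very Low"
    if score < 40:
        return "Low"
    if score < 60:
        return "Medium"
    if score < 80:
        return "High"
    return "Very High"
```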
Manipulation:
- Areas with different compression levels (ELA)
- Inconsistent noise patterns
- Detected cloned regions
- Multiple JPEG compressions
Steganography:
- Modified LSBs in color channels
- Positive chi-square test
- Anomalous frequency patterns
- Known tool signatures
- Interactive visualizations with Plotly
- Color histogram charts
- Organized metadata tables
- Visual risk indicators
- Tab navigation
```json
{
  "version": "1.0",
  "exported_at": "2024-01-01T12:00:00",
  "results": {
    "file": "image.jpg",
    "analysis": {
      "manipulation": {
        "likelihood": "Medium",
        "score": 45.5,
        "indicators": ["ELA anomalies", "Noise patterns"]
      },
      "steganography": {
        "likelihood": "Low",
        "score": 15.2
      },
      "metadata": {
        "exif": {...},
        "gps": {...}
      }
    }
  }
}
```

PyTorch Installation Fails:
```bash
# Try CPU-only version
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
```

Ultralytics/YOLO Installation Fails:

```bash
# Install without dependencies first
pip install ultralytics --no-deps
pip install opencv-python numpy
```

Web Dependencies Issues:

```bash
# Install web stack separately
pip install fastapi uvicorn[standard] sqlalchemy python-multipart websockets aiofiles
```

Frontend Build Issues:

```bash
# Clear cache and reinstall
cd src/web/frontend
rm -rf node_modules package-lock.json
npm install
```

"Module not found" errors:

```bash
# Make sure you're in the project directory
cd /path/to/image-metadata-extractor

# Activate virtual environment
source venv/bin/activate

# Reinstall in development mode
pip install -e .
```

YOLOv8 download fails:

```python
# Pre-download the model manually
from ultralytics import YOLO
model = YOLO('yolov8n.pt')  # Downloads to ~/.cache/torch/hub/
```

Web server won't start:

```bash
# Check if port is in use
lsof -ti:8000 | xargs kill -9

# Try different port
pixintelligence serve --port 8001
```

Database errors:

```bash
# Remove the database; it will be recreated on the next run
rm src/data/pixintelligence.db
```

PRNU database issues:

```bash
# Check database location
pixintelligence prnu-list

# Clear PRNU database if corrupted
rm src/data/camera_fingerprints/camera_fingerprints.h5
```

Analysis is slow:

- Disable ML detection if not needed
- Reduce image resolution before analysis
- Use `--checks` to run specific analyses only
- ML models require significant memory

Memory usage is high:

```bash
# Check if multiple analyses are running
ps aux | grep pixintelligence

# Restart web server if a memory leak is suspected
# (kill and restart `pixintelligence serve`)
```

Q: Do I need a GPU for ML detection? A: No, CPU works fine. GPU is optional and will speed up ML inference.
Q: Can I use this offline? A: Yes, after initial model downloads (YOLOv8), everything works offline.
Q: How accurate is the manipulation detection? A: Combines 7+ methods. No single method is 100% accurate, but combined confidence is high. Always verify findings manually.
Q: What image formats are supported? A: JPG, JPEG, PNG, GIF, BMP, TIFF, WEBP, HEIC, HEIF
Q: Can I integrate this into my application?
A: Yes! Use the REST API (pixintelligence serve) or import modules directly in Python.
- Multi-Quality ELA
- Double JPEG Detection
- Median Filter Detection
- PRNU Analysis
- Lighting Analysis
- ML Integration (ManTra-Net)
- Web Dashboard
- REST API + WebSocket
- PRNU Camera Database
- Load actual ManTra-Net pre-trained weights
- Alternative models: MVSS-Net, Noiseprint
- Ensemble predictions from multiple models
- GPU acceleration support
- Model fine-tuning interface
- Reverse image search APIs (Google, TinEye, Yandex)
- Social media metadata extraction
- Geolocation enrichment
- Timeline analysis
- Related images search
- Deepfake detection (video frames)
- OCR for text extraction
- Face recognition and tracking
- PDF report generation
- Batch comparison mode
- Image clustering and similarity
- User authentication and roles
- PostgreSQL support
- S3/Cloud storage integration
- Docker containerization
- Kubernetes deployment
- Monitoring and logging (Prometheus/Grafana)
- Rate limiting and API keys
- Comprehensive test suite
- Frame-by-frame video analysis
- Improved ML for steganography
- Parallel processing
- GPU support
- Results caching
- Distributed analysis
- Plugins and extensions
This project is dual-licensed:
Licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0)
✅ Free for:
- Personal use
- Research and education
- Non-profit organizations
- Open source projects
❌ Not allowed without a commercial license:
- Selling this software
- Using in commercial products/services
- Profit-generating activities
For commercial licensing, please contact the author. A separate commercial license is required for any commercial use.
See the LICENSE file for complete details.
- Jacobo Blancas Barroso - Initial development
- OSINT community
- Project contributors
- Open source libraries used
PixIntelligence - Professional tool for forensic image analysis
Made with ❤️ for the OSINT and digital forensics community
