# Violence Detection MVP - End-to-End Demo

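This notebook walks through the complete Violence Detection MVP pipeline: data preprocessing, feature extraction, model training, evaluation, and inference. The training, evaluation, and inference cells run on synthetic features, so the notebook can be executed without any video data.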

## Overview

The Violence Detection MVP uses:
- **VGG19** for feature extraction from video frames
- **LSTM with Attention** for sequence classification
- **Transfer Learning**: only the LSTM-Attention classifier is trained, on top of features extracted by VGG19
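
To make the attention step concrete, here is a minimal pure-Python sketch of dot-product attention pooling over per-frame vectors. It is an illustration only: the project's actual layer lives in `model_architecture` and may score frames differently. The names `softmax`, `attention_pool`, and the fixed `query` vector are hypothetical and introduced just for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_states, query):
    """Collapse per-frame vectors into one vector with dot-product attention.

    frame_states: list of T vectors (for example, one LSTM output per frame)
    query: vector of the same width; learned in a real model, fixed here
    """
    # One relevance score per frame
    scores = [sum(h * q for h, q in zip(state, query)) for state in frame_states]
    weights = softmax(scores)
    width = len(frame_states[0])
    # Weighted sum of the frame vectors, dimension by dimension
    pooled = [
        sum(w * state[d] for w, state in zip(weights, frame_states))
        for d in range(width)
    ]
    return pooled, weights


# Toy input: 3 frames with 2-dimensional states
states = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
pooled, weights = attention_pool(states, query=[1.0, 1.0])
print("attention weights:", [round(w, 3) for w in weights])
print("pooled vector:", [round(v, 3) for v in pooled])
```

Frames whose state aligns best with the query receive the largest weight, so the pooled vector is dominated by the most relevant frames rather than by a plain average.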

## Pipeline Steps

1. **Data Preprocessing**: Extract frames from videos and create labels
2. **Feature Extraction**: Use VGG19 to extract features from frames
3. **Model Training**: Train LSTM-Attention model on extracted features
4. **Evaluation**: Evaluate model performance with comprehensive metrics
5. **Inference**: Make predictions on new videos
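
Step 1 has to reduce every video to a fixed number of frames. The source does not say how frames are chosen, so the sketch below assumes evenly spaced sampling, with the last frame repeated for clips that are too short. `evenly_spaced_indices` is a hypothetical helper, not a function from `data_preprocessing`.

```python
def evenly_spaced_indices(total_frames, frames_per_video=20):
    """Return frame indices that cover a clip uniformly.

    Assumption: short clips are padded by repeating their last frame.
    """
    if total_frames <= 0 or frames_per_video <= 0:
        raise ValueError("total_frames and frames_per_video must be positive")

    if total_frames <= frames_per_video:
        padding = [total_frames - 1] * (frames_per_video - total_frames)
        return list(range(total_frames)) + padding

    # Step is larger than 1 here, so the indices are strictly increasing
    step = total_frames / frames_per_video
    return [int(i * step) for i in range(frames_per_video)]


print(evenly_spaced_indices(100, 20))
print(evenly_spaced_indices(5, 8))
```

Sampling across the whole clip keeps the sequence length constant for the LSTM while still covering events that happen late in the video.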

## Setup and Imports

In [None]:
import sys
import os
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt

# Add src directory to path
project_root = Path().absolute().parent
src_path = project_root / 'src'
sys.path.append(str(src_path))

# Import project modules
from config import Config
from data_preprocessing import DataPreprocessor, VideoFrameExtractor
from feature_extraction import FeaturePipeline
from model_architecture import ViolenceDetectionModel
from training import TrainingPipeline, ExperimentManager
from evaluation import ModelEvaluator
from inference import ViolencePredictor, InferenceAPI
from visualization import TrainingVisualizer, EvaluationVisualizer, DataVisualizer
from utils import SystemInfo, validate_project_setup

print("All modules imported successfully!")
print(f"Project root: {project_root}")

## System Validation

In [None]:
# Check system information
system_info = SystemInfo.get_system_info()
dependencies = SystemInfo.check_dependencies()

print("System Information:")
print(f"Platform: {system_info['platform']}")
print(f"Python Version: {system_info['python_version']}")
print(f"Memory: {system_info['memory_total_gb']:.1f} GB total, {system_info['memory_available_gb']:.1f} GB available")
print(f"CPU Count: {system_info['cpu_count']}")

print("\nDependency Check:")
for dep, available in dependencies['dependencies'].items():
    status = "✓" if available else "✗"
    version = dependencies['versions'][dep]
    print(f"{status} {dep}: {version}")

print(f"\nAll dependencies available: {dependencies['all_available']}")
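
For readers without access to `utils`, the following standard-library sketch produces a report with the same three keys the cell above reads (`dependencies`, `versions`, `all_available`). It is an assumption about what such a check might look like, not the implementation of `SystemInfo.check_dependencies`; the package list is illustrative.

```python
from importlib import metadata, util


def check_dependencies(packages):
    """Check importability and installed version for each package.

    packages maps an import name to its distribution name,
    for example {"cv2": "opencv-python"}.
    """
    available = {}
    versions = {}
    for import_name, dist_name in packages.items():
        # find_spec locates a top-level module without importing it
        available[import_name] = util.find_spec(import_name) is not None
        try:
            versions[import_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            versions[import_name] = "not installed"
    return {
        "dependencies": available,
        "versions": versions,
        "all_available": all(available.values()),
    }


report = check_dependencies({
    "numpy": "numpy",
    "tensorflow": "tensorflow",
    "cv2": "opencv-python",
    "sklearn": "scikit-learn",
})
for name, ok in report["dependencies"].items():
    print(f"{'✓' if ok else '✗'} {name}: {report['versions'][name]}")
```

Keeping the import name and the distribution name separate matters for packages such as OpenCV and scikit-learn, where the two differ.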

## Configuration

In [None]:
# Initialize configuration
config = Config()

print("Configuration Settings:")
print(f"Image Size: {config.IMG_SIZE}x{config.IMG_SIZE}")
print(f"Frames per Video: {config.FRAMES_PER_VIDEO}")
print(f"Batch Size: {config.BATCH_SIZE}")
print(f"Learning Rate: {config.LEARNING_RATE}")
print(f"RNN Size: {config.RNN_SIZE}")
print(f"Dropout Rate: {config.DROPOUT_RATE}")

print("\nData Directories:")
print(f"Raw Data: {config.RAW_DATA_DIR}")
print(f"Processed Data: {config.PROCESSED_DATA_DIR}")
print(f"Models: {config.MODELS_DIR}")
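
The attributes printed above suggest a flat settings object. The sketch below is a hypothetical stand-in, not the project's `Config`: the class name `SketchConfig` and every default marked as a placeholder are assumptions. It shows one way to derive a smaller configuration for a quick run without mutating a shared instance.

```python
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class SketchConfig:
    """Illustrative settings object; values are NOT the project's defaults."""
    IMG_SIZE: int = 224                  # default VGG19 input resolution
    FRAMES_PER_VIDEO: int = 20           # matches the 20 x 4096 comment in this notebook
    TRANSFER_VALUES_SIZE: int = 4096     # width of a VGG19 fully connected layer
    BATCH_SIZE: int = 32                 # placeholder
    EPOCHS: int = 10                     # placeholder
    LEARNING_RATE: float = 1e-4          # placeholder
    RNN_SIZE: int = 128                  # placeholder
    DROPOUT_RATE: float = 0.5            # placeholder
    RAW_DATA_DIR: Path = Path("data/raw")
    PROCESSED_DATA_DIR: Path = Path("data/processed")
    MODELS_DIR: Path = Path("models")


base = SketchConfig()
# replace() returns a modified copy, so the base settings stay untouched
quick = replace(base, EPOCHS=3, BATCH_SIZE=8, RNN_SIZE=32)

print(f"base:  epochs={base.EPOCHS}, batch={base.BATCH_SIZE}, rnn={base.RNN_SIZE}")
print(f"quick: epochs={quick.EPOCHS}, batch={quick.BATCH_SIZE}, rnn={quick.RNN_SIZE}")
```

A frozen dataclass makes accidental edits fail loudly, which helps when several cells share one configuration object.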

## Model Architecture Validation

In [None]:
# Create and validate model architecture
from model_architecture import validate_model_architecture

validation_result = validate_model_architecture(config)

if validation_result['success']:
    print("✓ Model architecture validation successful!")
    print(f"Output shape: {validation_result['output_shape']}")
    print(f"Total parameters: {validation_result['metrics']['total_parameters']:,}")
    print(f"Trainable parameters: {validation_result['metrics']['trainable_parameters']:,}")
    print(f"Model size: {validation_result['metrics']['model_size_mb']:.1f} MB")
else:
    print("✗ Model architecture validation failed!")
    print(f"Error: {validation_result['error']}")

## Display Model Architecture

In [None]:
# Create and display model
model_builder = ViolenceDetectionModel(config)
model = model_builder.create_model()

print("Model Architecture:")
model.summary()

## Data Pipeline Demo (if data available)

In [None]:
# Check if sample data is available
sample_data_dir = config.RAW_DATA_DIR

if sample_data_dir.exists() and any(sample_data_dir.iterdir()):
    print(f"Sample data found in: {sample_data_dir}")
    
    # List available files
    video_files = []
    for ext in config.VIDEO_EXTENSIONS:
        video_files.extend(list(sample_data_dir.glob(f"*{ext}")))
    
    print(f"Found {len(video_files)} video files")
    
    if video_files:
        # Show first few files
        print("\nSample files:")
        for i, file_path in enumerate(video_files[:5]):
            print(f"  {i+1}. {file_path.name}")
        
        if len(video_files) > 5:
            print(f"  ... and {len(video_files) - 5} more")
else:
    print(f"No sample data found in: {sample_data_dir}")
    print("To run the full demo:")
    print("1. Place video files in the data/raw/ directory")
    print("2. Use naming convention: 'fi_*.avi' or 'V_*.avi' for violence videos")
    print("3. Use naming convention: 'no_*.avi' or 'NV_*.avi' for non-violence videos")
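
The naming convention printed above can be turned into labels with a small helper. This is a sketch under two assumptions: prefix matching is case-sensitive, and labels use the one-hot layout from the rest of this notebook (`[1, 0]` for violence, `[0, 1]` for no violence). `label_from_filename` is a hypothetical name and not part of `data_preprocessing`.

```python
from pathlib import Path

VIOLENCE_PREFIXES = ("fi_", "V_")
NON_VIOLENCE_PREFIXES = ("no_", "NV_")


def label_from_filename(path):
    """Map a video file name to a one-hot label, or None if it matches no prefix."""
    name = Path(path).name
    if name.startswith(VIOLENCE_PREFIXES):
        return [1, 0]  # Violence
    if name.startswith(NON_VIOLENCE_PREFIXES):
        return [0, 1]  # No violence
    return None  # Unlabeled file: skip it or report it


for example in ["data/raw/fi_001.avi", "NV_12.avi", "holiday_clip.avi"]:
    print(example, "->", label_from_filename(example))
```

Returning `None` for unmatched names lets the caller decide whether to skip the file or stop with an error, instead of silently assigning a class.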

## Feature Extraction Demo

In [None]:
# Demo feature extraction pipeline
from feature_extraction import print_vgg19_info

print("VGG19 Model Information:")
print_vgg19_info()
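
Caching extracted features costs disk space and memory, and the amount is easy to estimate. The arithmetic below assumes 20 frames per video, 4096 values per frame, and 2-byte `float16` storage, which are the shapes and dtype used by the synthetic data later in this notebook. `feature_cache_bytes` is a hypothetical helper.

```python
def feature_cache_bytes(num_videos, frames_per_video=20, feature_size=4096,
                        bytes_per_value=2):
    """Size of a dense feature cache in bytes (float16 uses 2 bytes per value)."""
    return num_videos * frames_per_video * feature_size * bytes_per_value


per_video = feature_cache_bytes(1)
thousand_videos = feature_cache_bytes(1000)

print(f"One video:   {per_video:,} bytes ({per_video / 1024:.0f} KiB)")
print(f"1000 videos: {thousand_videos / 1024**2:.2f} MiB")
```

Storing the same cache as `float32` doubles these numbers, which is one reason to keep cached features in half precision.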

## Training Demo (Synthetic Data)

In [None]:
# Create synthetic data for demonstration
def create_synthetic_data(num_samples=100):
    """Create synthetic data for demo purposes."""
    np.random.seed(42)
    
    # Create synthetic features (20 frames x 4096 features)
    data = []
    targets = []
    
    for i in range(num_samples):
        # Random features
        features = np.random.normal(0, 1, (config.FRAMES_PER_VIDEO, config.TRANSFER_VALUES_SIZE))
        features = features.astype(np.float16)
        
        # Alternate labels so that every contiguous slice (train subset,
        # test split) contains both classes in equal proportion
        if i % 2 == 0:
            label = [1, 0]  # Violence
        else:
            label = [0, 1]  # No violence
        
        data.append(features)
        targets.append(label)
    
    return data, targets

# Create synthetic data
print("Creating synthetic data for demo...")
synthetic_data, synthetic_targets = create_synthetic_data(100)

print(f"Created {len(synthetic_data)} synthetic samples")
print(f"Data shape: {synthetic_data[0].shape}")
print(f"Label distribution: {np.sum([t[0] for t in synthetic_targets])} violence, {np.sum([t[1] for t in synthetic_targets])} no violence")
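
A positional split only preserves class balance when the samples are interleaved or shuffled. A stratified split keeps the class ratio in both partitions regardless of ordering. The helper below is a standard-library sketch with hypothetical names; it is not taken from the project's `training` module.

```python
import random


def stratified_split(samples, labels, train_fraction=0.8, seed=42):
    """Split samples so each class keeps the same train/test proportion."""
    rng = random.Random(seed)

    # Group sample indices by label (one-hot lists become hashable tuples)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(tuple(label), []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])

    # Mix the classes again so batches are not grouped by label
    rng.shuffle(train_idx)
    rng.shuffle(test_idx)

    def pick(idxs):
        return [samples[i] for i in idxs], [labels[i] for i in idxs]

    return pick(train_idx), pick(test_idx)


# Worst case for a positional split: all of one class first
demo_labels = [[1, 0]] * 50 + [[0, 1]] * 50
demo_samples = list(range(100))
(train_x, train_y), (test_x, test_y) = stratified_split(demo_samples, demo_labels)

print(f"train: {len(train_x)} samples, {sum(y[0] for y in train_y)} violence")
print(f"test:  {len(test_x)} samples, {sum(y[0] for y in test_y)} violence")
```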

## Quick Training Demo

In [None]:
# Quick training demo with synthetic data
# Split data
split_idx = int(len(synthetic_data) * 0.8)
train_data = synthetic_data[:split_idx]
train_targets = synthetic_targets[:split_idx]
test_data = synthetic_data[split_idx:]
test_targets = synthetic_targets[split_idx:]

print(f"Training samples: {len(train_data)}")
print(f"Test samples: {len(test_data)}")

# Create a small model for quick demo
demo_config = Config()
demo_config.EPOCHS = 3  # Very few epochs for demo
demo_config.BATCH_SIZE = 8
demo_config.RNN_SIZE = 32  # Smaller model

demo_model_builder = ViolenceDetectionModel(demo_config)
demo_model = demo_model_builder.create_model()

print("\nTraining demo model (3 epochs)...")

# Convert to numpy arrays
X_train = np.array(train_data[:32])  # Use subset for quick demo
y_train = np.array(train_targets[:32])
X_test = np.array(test_data)
y_test = np.array(test_targets)

# Train
history = demo_model.fit(
    X_train, y_train,
    batch_size=demo_config.BATCH_SIZE,
    epochs=demo_config.EPOCHS,
    validation_data=(X_test, y_test),
    verbose=1
)

print("\nDemo training completed!")

## Training Visualization

In [None]:
# Visualize training results
training_viz = TrainingVisualizer(config)

# Plot training history
fig = training_viz.plot_training_history(
    history.history,
    show_plot=True
)

plt.show()

## Model Evaluation Demo

In [None]:
# Evaluate the demo model
print("Evaluating demo model...")

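# Make predictions
predictions = demo_model.predict(X_test, verbose=0)
binary_predictions = (predictions > 0.5).astype(int)

# Calculate basic metrics on class indices. Passing the thresholded one-hot
# rows directly would make scikit-learn treat the task as multilabel.
from sklearn.metrics import accuracy_score, classification_report

y_true_idx = np.argmax(y_test, axis=1)
y_pred_idx = np.argmax(predictions, axis=1)

accuracy = accuracy_score(y_true_idx, y_pred_idx)
print(f"Test Accuracy: {accuracy:.4f}")

print("\nClassification Report:")
print(classification_report(
    y_true_idx, y_pred_idx,
    labels=[0, 1],
    target_names=['Violence', 'No Violence'],
    zero_division=0
))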
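
The figures in a classification report follow directly from four counts. The sketch below recomputes precision, recall, F1, and accuracy by hand for class index 0 (Violence in this notebook), which can help when interpreting the report. `binary_metrics` is a hypothetical helper and the label vectors are made-up toy data.

```python
def binary_metrics(y_true, y_pred, positive=0):
    """Precision, recall, F1 and accuracy for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels: 0 = Violence, 1 = No Violence
toy_true = [0, 0, 0, 1, 1, 1]
toy_pred = [0, 0, 1, 1, 1, 0]
metrics = binary_metrics(toy_true, toy_pred)
for key, value in metrics.items():
    print(f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}")
```

For a violence detector, recall on the Violence class is usually the number to watch, because a missed violent clip tends to cost more than a false alarm.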

## Evaluation Visualization

In [None]:
# Visualize evaluation results
eval_viz = EvaluationVisualizer(config)

# Confusion matrix
from sklearn.metrics import confusion_matrix

y_true_single = np.argmax(y_test, axis=1)
y_pred_single = np.argmax(binary_predictions, axis=1)
cm = confusion_matrix(y_true_single, y_pred_single)

fig = eval_viz.plot_confusion_matrix(cm, show_plot=True)
plt.show()

## Inference Demo

In [None]:
# Demo inference on synthetic data
print("Inference Demo:")

# Use the trained model for inference
sample_features = X_test[0:1]  # Take first test sample
true_label = y_test[0]

# Make prediction
prediction = demo_model.predict(sample_features, verbose=0)[0]
predicted_class_idx = np.argmax(prediction)
confidence = prediction[predicted_class_idx]

class_names = ['Violence', 'No Violence']
predicted_class = class_names[predicted_class_idx]
true_class = class_names[np.argmax(true_label)]

print(f"True Class: {true_class}")
print(f"Predicted Class: {predicted_class}")
print(f"Confidence: {confidence:.4f}")
print(f"Probabilities: Violence={prediction[0]:.4f}, No Violence={prediction[1]:.4f}")

# Demo batch inference
print("\nBatch Inference Demo (first 5 test samples):")
batch_predictions = demo_model.predict(X_test[:5], verbose=0)

for i in range(len(batch_predictions)):
    pred = batch_predictions[i]
    pred_class = class_names[np.argmax(pred)]
    true_class = class_names[np.argmax(y_test[i])]
    confidence = np.max(pred)
    
    status = "✓" if pred_class == true_class else "✗"
    print(f"Sample {i+1}: {status} True: {true_class}, Pred: {pred_class} ({confidence:.3f})")
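
Taking the `argmax` is equivalent to a 0.5 threshold on the violence probability. In a deployment it can be useful to tune that threshold, for example lowering it to catch more violent clips at the cost of more false alarms. The sketch below assumes a two-element probability vector ordered as `[Violence, No Violence]`, as in this notebook; `decide` is a hypothetical helper and its result keys only mirror those shown in the usage examples, not the actual `ViolencePredictor` output.

```python
def decide(probabilities, violence_threshold=0.5,
           class_names=("Violence", "No Violence")):
    """Turn class probabilities into a decision using a tunable threshold."""
    p_violence = probabilities[0]
    idx = 0 if p_violence >= violence_threshold else 1
    return {
        "violence_detected": idx == 0,
        "predicted_class": class_names[idx],
        "confidence": probabilities[idx],
    }


sample = [0.3, 0.7]
print(decide(sample))                           # default threshold
print(decide(sample, violence_threshold=0.25))  # more sensitive setting
```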

## Model Architecture Visualization

In [None]:
# Display model architecture diagram (if available)
try:
    from tensorflow.keras.utils import plot_model
    
    # Create model diagram
    plot_model(
        demo_model,
        to_file='model_architecture.png',
        show_shapes=True,
        show_layer_names=True,
        rankdir='TB',
        expand_nested=False,
        dpi=96
    )
    
    print("Model architecture diagram saved as 'model_architecture.png'")
    
except ImportError:
    print("pydot or graphviz not available for model visualization")
except Exception as e:
    print(f"Could not create model diagram: {e}")

## Performance Analysis

In [None]:
# Analyze model performance
print("Performance Analysis:")

# Model size and parameters
total_params = demo_model.count_params()
model_size_mb = total_params * 4 / (1024 * 1024)  # Approximate size in MB, assuming 4-byte float32 weights

print(f"Model Parameters: {total_params:,}")
print(f"Approximate Model Size: {model_size_mb:.2f} MB")

# Inference time analysis
import time

# Time single prediction (perf_counter is a monotonic, high-resolution clock)
start_time = time.perf_counter()
_ = demo_model.predict(X_test[0:1], verbose=0)
single_inference_time = time.perf_counter() - start_time

# Time batch prediction
start_time = time.perf_counter()
_ = demo_model.predict(X_test, verbose=0)
batch_inference_time = time.perf_counter() - start_time

print("\nInference Performance:")
print(f"Single prediction time: {single_inference_time:.4f} seconds")
print(f"Batch prediction time ({len(X_test)} samples): {batch_inference_time:.4f} seconds")
print(f"Average time per sample: {batch_inference_time / len(X_test):.4f} seconds")
print(f"Throughput: {len(X_test) / batch_inference_time:.1f} samples/second")
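
A single timed call is noisy, because caches, lazy initialization, and other processes all affect it. A more robust measurement discards a warm-up run and reports the median of several repeats. The helper below is a generic sketch with a hypothetical name; the workload is a stand-in so the example runs without a model.

```python
import statistics
import time


def time_callable(fn, warmup=1, repeats=5):
    """Run fn several times and summarize the wall-clock duration in seconds."""
    for _ in range(warmup):
        fn()  # Warm-up runs are executed but not measured

    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)

    return {
        "median_s": statistics.median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
        "repeats": repeats,
    }


# Stand-in workload; replace with a real prediction call
stats = time_callable(lambda: sum(i * i for i in range(10_000)))
print(f"median: {stats['median_s']:.6f} s "
      f"(min {stats['min_s']:.6f}, max {stats['max_s']:.6f})")
```

In this notebook the same helper could wrap the model call, for example `time_callable(lambda: demo_model.predict(X_test, verbose=0))`.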

## Project Structure Summary

In [None]:
# Display project structure
from utils import print_project_structure, validate_project_setup

print("Project Structure:")
print_project_structure(project_root, max_depth=3)

print("\nProject Validation:")
validation = validate_project_setup(project_root)

print(f"Setup Complete: {validation['setup_complete']}")
print(f"All Directories Exist: {validation['all_directories_exist']}")
print(f"All Files Exist: {validation['all_files_exist']}")
print(f"Dependencies Available: {validation['dependencies']['all_available']}")

## Usage Examples

In [None]:
print("=" * 50)
print("VIOLENCE DETECTION MVP - USAGE EXAMPLES")
print("=" * 50)

print("""
## 1. Training a Model

```python
from src.training import TrainingPipeline
from src.config import Config
from pathlib import Path

# Initialize training pipeline
config = Config()
trainer = TrainingPipeline(config)

# Train model
data_dir = Path("data/raw")
train_data, train_targets, test_data, test_targets = trainer.prepare_data(data_dir)
history = trainer.train_model(train_data, train_targets, test_data, test_targets)
```

## 2. Evaluating a Model

```python
from src.evaluation import ModelEvaluator
from pathlib import Path

# Load and evaluate model
evaluator = ModelEvaluator(config)
evaluator.load_model(Path("models/violence_detection_model.h5"))
results = evaluator.evaluate_model_comprehensive(test_data, test_targets)
```

## 3. Making Predictions

```python
from src.inference import ViolencePredictor
from pathlib import Path

# Initialize predictor
predictor = ViolencePredictor(Path("models/violence_detection_model.h5"))

# Predict single video
result = predictor.predict_video(Path("path/to/video.avi"))
print(f"Violence detected: {result['violence_detected']}")
print(f"Confidence: {result['confidence']:.3f}")
```

## 4. Visualization

```python
from src.visualization import TrainingVisualizer, EvaluationVisualizer

# Plot training curves
training_viz = TrainingVisualizer(config)
training_viz.plot_training_history(history.history)

# Plot confusion matrix (cm is the array returned by sklearn's confusion_matrix)
eval_viz = EvaluationVisualizer(config)
eval_viz.plot_confusion_matrix(cm)
```

## 5. Real-time Processing

```python
from src.inference import RealTimeVideoProcessor
from pathlib import Path

# Process video stream
processor = RealTimeVideoProcessor(Path("models/violence_detection_model.h5"))
processor.process_video_stream("path/to/video.mp4", display=True)
```
""")

## Next Steps

In [None]:
print("=" * 50)
print("NEXT STEPS FOR REAL DATA")
print("=" * 50)

print("""
To use this MVP with real video data:

1. **Prepare Data**:
   - Place video files in data/raw/ directory
   - Use naming convention:
     * Violence: 'fi_*.avi', 'V_*.avi'
     * No Violence: 'no_*.avi', 'NV_*.avi'

2. **Train Model**:
   ```python
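   from pathlib import Path
   from src.config import Config
   from src.training import TrainingPipeline
   trainer = TrainingPipeline(Config())
   train_data, train_targets, test_data, test_targets = trainer.prepare_data(Path("data/raw"))
   history = trainer.train_model(train_data, train_targets, test_data, test_targets)
   # Training will automatically extract features and cache them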
   ```

3. **Evaluate Performance**:
   ```python
   from pathlib import Path
   from src.evaluation import evaluate_model_from_cache
   results = evaluate_model_from_cache(
       Path("models/violence_detection_model.h5"),
       Path("data/processed/test_features.h5")
   )
   ```

4. **Deploy for Inference**:
   ```python
   from pathlib import Path
   from src.inference import InferenceAPI
   api = InferenceAPI(Path("models/violence_detection_model.h5"))
   result = api.predict_single_video("new_video.avi")
   ```

5. **Monitor and Improve**:
   - Use visualization tools to analyze performance
   - Experiment with hyperparameters
   - Collect more data for better accuracy
""")

print("\n" + "=" * 50)
print("DEMO COMPLETED SUCCESSFULLY!")
print("=" * 50)