# YOLO + Face ID + Autoencoder + FFT-Based Anomaly Scoring (Colab Demo)

This notebook demonstrates the end-to-end pipeline:

1. **Train** a per-camera convolutional autoencoder on **normal** person crops
2. **Process** a video using:
   - YOLO person detection
   - optional face recognition identity
   - AE reconstruction error
   - FFT motion feature (center_y trajectory)
   - combined anomaly score and flags
3. **Save** outputs (models, checkpoints, CSV logs, annotated videos) to **Google Drive**.


In [None]:
from google.colab import drive
drive.mount("/content/drive")


Define Paths / Config

In [None]:
import os

# Change this if you prefer a different Drive location
DRIVE_BASE = "/content/drive/MyDrive/anomaly_yolo_face_surveillance"

REPO_DIR = "/content/anomaly-yolo-face-surveillance"

DATA_DIR = os.path.join(DRIVE_BASE, "data")          # Your crops + faces
VIDEOS_DIR = os.path.join(DRIVE_BASE, "videos")      # Input videos
OUTPUTS_DIR = os.path.join(DRIVE_BASE, "outputs")    # Logs + annotated videos
MODELS_DIR = os.path.join(DRIVE_BASE, "models")      # Saved .keras models
CKPT_DIR = os.path.join(DRIVE_BASE, "checkpoints")   # Saved weights .h5

for p in [DATA_DIR, VIDEOS_DIR, OUTPUTS_DIR, MODELS_DIR, CKPT_DIR]:
    os.makedirs(p, exist_ok=True)

print("Drive base:", DRIVE_BASE)
print("DATA_DIR:", DATA_DIR)
print("VIDEOS_DIR:", VIDEOS_DIR)
print("OUTPUTS_DIR:", OUTPUTS_DIR)
print("MODELS_DIR:", MODELS_DIR)
print("CKPT_DIR:", CKPT_DIR)


Clone Repo

In [None]:
!rm -rf {REPO_DIR}
!git clone https://github.com/giacomobettas/anomaly-yolo-face-surveillance.git {REPO_DIR}
%cd {REPO_DIR}
!ls


Install Dependencies

In [None]:
!python -m pip install --upgrade pip
!pip install -r requirements.txt


## Optional: enable face recognition (face_recognition + dlib)

Face recognition is **optional** in this repo.

- If you skip it, identities will be `"unknown"`, but everything else works.
- If you enable it, you need system build tools (`cmake`, `build-essential`), and `dlib` is compiled during installation.


In [None]:
# OPTIONAL: enable face_recognition in Colab (Linux)
# This may take a while because it compiles dlib.

!apt-get update -y
!apt-get install -y cmake build-essential
!pip install face_recognition
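
Whether or not you ran the optional install, it can help to confirm what the runtime actually has before relying on face identities. The following is a minimal sketch using only the standard library; `has_module` and `FACE_ID_AVAILABLE` are hypothetical names introduced here, not part of the repo.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


# Hypothetical flag for this notebook; the repo itself is described as
# falling back to "unknown" identities when the package is missing.
FACE_ID_AVAILABLE = has_module("face_recognition")
print("face_recognition available:", FACE_ID_AVAILABLE)
```

If this prints `False`, the rest of the notebook should still run, with identities reported as `"unknown"`.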


## Expected data layout (stored in Drive)

### Autoencoder training (normal crops per camera):
`{DATA_DIR}/cam1/normal_crops/*.jpg`

Example:
- `.../data/cam1/normal_crops/img001.jpg`
- `.../data/cam1/normal_crops/img002.jpg`

### Optional face identities:
`{DATA_DIR}/faces/<person_name>/*.jpg`

Example:
- `.../data/faces/person1/face1.jpg`
- `.../data/faces/person2/face2.jpg`

### Input videos:
`{VIDEOS_DIR}/cam1_example.mp4`
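
To check the layout above before training, a small helper can count the crops and list the face identities it finds. This is a sketch under the assumption that crops are `.jpg` files and that each identity is a subfolder of `faces`; `summarize_layout` is a hypothetical helper, not a function from the repo.

```python
from pathlib import Path


def summarize_layout(data_dir: str, cam_id: str = "cam1") -> dict:
    """Count training crops and list face identities under data_dir."""
    root = Path(data_dir)
    crops_dir = root / cam_id / "normal_crops"
    faces_dir = root / "faces"

    # Missing folders are reported as empty rather than raising.
    n_crops = len(list(crops_dir.glob("*.jpg"))) if crops_dir.is_dir() else 0
    identities = (
        sorted(p.name for p in faces_dir.iterdir() if p.is_dir())
        if faces_dir.is_dir()
        else []
    )
    return {"normal_crops": n_crops, "identities": identities}


# In this notebook you would call: summarize_layout(DATA_DIR, "cam1")
```

A result with `normal_crops` equal to 0 means the training cell has nothing to learn from yet.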


## Quick Synthetic Crops Generator

If you want the notebook to run “out of the box” before uploading any data, this cell generates a tiny fake dataset. It is only meant for smoke-testing AE training, not for producing a useful model.

In [None]:
import numpy as np
import cv2
import os

cam_id = "cam1"
normal_dir = os.path.join(DATA_DIR, cam_id, "normal_crops")
os.makedirs(normal_dir, exist_ok=True)

# Create a few synthetic "person-like" blobs
H, W = 128, 64
for i in range(40):
    img = np.zeros((H, W, 3), dtype=np.uint8)
    cx = np.random.randint(W//3, 2*W//3)
    cy = np.random.randint(H//3, 2*H//3)
    cv2.ellipse(img, (cx, cy), (W//6, H//4), 0, 0, 360, (255, 255, 255), -1)
    cv2.GaussianBlur(img, (5, 5), 0, dst=img)
    cv2.imwrite(os.path.join(normal_dir, f"synthetic_{i:03d}.jpg"), img)

print("Synthetic crops written to:", normal_dir)
print("Count:", len(os.listdir(normal_dir)))


## Train the autoencoder (per camera)

We train the AE on **normal** crops only.
Outputs are saved to Drive:

- Best weights: `{CKPT_DIR}/<cam_id>_ae_best.weights.h5`
- Final model: `{MODELS_DIR}/<cam_id>_ae_person_autoencoder.keras`

Because all outputs live in Drive, you can restart the Colab runtime and pick up from the saved files; just re-run the Drive mount and config cells first.
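
Training on normal crops only matters because the score used later is a reconstruction error: crops that resemble the training data should reconstruct well, and unusual ones should not. The sketch below assumes a per-pixel mean squared error on crops scaled to [0, 1]; the repo's actual metric may differ, and `reconstruction_error` is a hypothetical name.

```python
import numpy as np


def reconstruction_error(crop: np.ndarray, recon: np.ndarray) -> float:
    """Mean squared error between a crop and its reconstruction (both in [0, 1])."""
    crop = crop.astype(np.float32)
    recon = recon.astype(np.float32)
    return float(np.mean((crop - recon) ** 2))


# Stand-ins for real data: with a trained Keras model you would typically
# obtain `recon` from something like model.predict(crop[None])[0].
crop = np.full((128, 64, 3), 0.5, dtype=np.float32)
close_recon = crop + 0.01          # what a well-fit AE might return
poor_recon = np.zeros_like(crop)   # a badly reconstructed crop

print("close:", reconstruction_error(crop, close_recon))
print("poor: ", reconstruction_error(crop, poor_recon))
```

The poor reconstruction yields a much larger error than the close one, which is the gap the anomaly score relies on.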


Run Training

In [None]:
cam_id = "cam1"

train_dir = os.path.join(DATA_DIR, cam_id, "normal_crops")
best_ckpt = os.path.join(CKPT_DIR, f"{cam_id}_ae_best.weights.h5")
final_model = os.path.join(MODELS_DIR, f"{cam_id}_ae_person_autoencoder.keras")

!python -m src.ae_train \
  --camera_id {cam_id} \
  --data_dir "{train_dir}" \
  --image_size 128 64 \
  --color_mode rgb \
  --batch_size 16 \
  --epochs 10 \
  --checkpoint_path "{best_ckpt}" \
  --model_path "{final_model}" \
  --patience 3


## Provide an input video

Place a video at:
`{VIDEOS_DIR}/cam1_example.mp4`

If you use a different file, upload a short test video to the Drive `videos` folder manually,
then update the filename in the cell below.


Set Video Paths

In [None]:
cam_id = "cam1"

video_path = os.path.join(VIDEOS_DIR, "cam1_example.mp4")  # change if needed
ae_model_path = os.path.join(MODELS_DIR, f"{cam_id}_ae_person_autoencoder.keras")

annotated_out = os.path.join(OUTPUTS_DIR, f"{cam_id}_annotated.avi")
csv_out = os.path.join(OUTPUTS_DIR, f"{cam_id}_log.csv")

print("video_path:", video_path)
print("ae_model_path:", ae_model_path)
print("annotated_out:", annotated_out)
print("csv_out:", csv_out)


## Run the full pipeline

This performs:
- YOLO person detection
- optional face identification (`--faces_root`)
- AE reconstruction error per person crop
- FFT motion feature (10s window by default)
- combined anomaly scoring

Outputs:
- Annotated video
- CSV log (per frame/person scores)
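
To make the FFT motion feature listed above more concrete, here is one way such a feature could be computed from a window of vertical bbox centers. This is a sketch, not the repo's implementation: it assumes `center_y` is already normalized by frame height, and the function name, the `min_hz` cutoff, and the "fraction of non-DC energy" definition are all assumptions made for illustration.

```python
import numpy as np


def fft_motion_score(center_y, fps: float, min_hz: float = 0.5) -> float:
    """Fraction of non-DC spectral energy at or above min_hz in one window."""
    y = np.asarray(center_y, dtype=np.float64)
    if y.size < 4:
        return 0.0                          # too few samples for a spectrum
    y = y - y.mean()                        # remove the constant (DC) offset
    power = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(y.size, d=1.0 / fps)
    total = power[1:].sum()                 # energy excluding the DC bin
    if total <= 1e-12:
        return 0.0                          # flat trajectory, no motion
    return float(power[freqs >= min_hz].sum() / total)


fps = 25.0
t = np.arange(int(10.0 * fps)) / fps        # a 10 s window, as in the default
drift = 0.5 + 0.001 * t                     # slow vertical drift
bounce = 0.5 + 0.1 * np.sin(2 * np.pi * 2.0 * t)  # 2 Hz oscillation

print("drift: ", fft_motion_score(drift, fps))
print("bounce:", fft_motion_score(bounce, fps))
```

Under these assumptions a slow drift scores low and a rapid oscillation scores close to 1, so the feature responds to how quickly the vertical position changes rather than to where the person stands in the frame.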


## Run Processing

If you want face recognition, set `faces_root` to `os.path.join(DATA_DIR, "faces")`.
If not, set `faces_root = ""`.

In [None]:
faces_root = os.path.join(DATA_DIR, "faces")  # optional (works only if face_recognition installed)

# If you want to disable face ID entirely:
# faces_root = ""

!python -m src.process_video \
  --camera_id {cam_id} \
  --video_path "{video_path}" \
  --ae_model_path "{ae_model_path}" \
  --output_video "{annotated_out}" \
  --output_csv "{csv_out}" \
  --image_size 128 64 \
  --color_mode rgb \
  --yolo_model yolov8n.pt \
  --conf_thres 0.25 \
  --iou_thres 0.45 \
  --fft_window_seconds 10.0 \
  --w_recon 0.6 \
  --w_posture 0.25 \
  --w_fft 0.15 \
  --recon_max 0.1 \
  --anomaly_threshold 0.5 \
  --faces_root "{faces_root}"


Preview Results

In [None]:
import pandas as pd

df = pd.read_csv(csv_out)
df.head()


Basic Summary of Anomalies

In [None]:
# Rows flagged as anomalous (the log holds per frame/person scores, so these are rows, not unique frames)
anoms = df[df["is_anomaly"] == 1]
print("Total rows:", len(df))
print("Anomaly rows:", len(anoms))

# Show the 10 rows with the highest combined scores
df.sort_values("combined_score", ascending=False).head(10)
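
When deciding whether `--anomaly_threshold 0.5` suits your footage, it can be useful to see how the flagged fraction changes across candidate thresholds. The sketch below relies only on the `combined_score` column used above; `anomaly_rate_by_threshold` and the threshold grid are hypothetical, and the `>=` comparison is an assumption about how the flag is derived.

```python
import pandas as pd


def anomaly_rate_by_threshold(df, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Fraction of rows whose combined_score meets each candidate threshold."""
    scores = df["combined_score"]
    rates = {t: float((scores >= t).mean()) for t in thresholds}
    return pd.Series(rates, name="anomaly_rate")


# Tiny stand-in table; in the notebook, pass the `df` loaded from the CSV log.
demo = pd.DataFrame({"combined_score": [0.1, 0.35, 0.55, 0.8]})
print(anomaly_rate_by_threshold(demo))
```

A rate that barely moves between neighbouring thresholds suggests the scores are well separated; a steep change suggests the threshold sits in the middle of the score distribution.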


Display Annotated Video Inline

In [None]:
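from IPython.display import Video

# Most browsers cannot play AVI inline, so re-encode to H.264 MP4 first (ffmpeg is normally preinstalled on Colab).
mp4_out = os.path.splitext(annotated_out)[0] + ".mp4"
!ffmpeg -y -loglevel error -i "{annotated_out}" -vcodec libx264 -pix_fmt yuv420p "{mp4_out}"
Video(mp4_out, embed=True)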


## Notes

- The autoencoder is trained **only on normal data**, camera-by-camera.
- The combined anomaly score uses:
  - reconstruction error (primary signal)
  - posture score (weak supporting feature)
  - FFT motion score on the normalized vertical bbox center trajectory
- Face recognition is optional and gracefully degrades to `"unknown"` if not installed.
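
The combined score described in the list above can be sketched as a weighted sum. The default values below mirror the CLI flags used earlier in this notebook (`--w_recon`, `--w_posture`, `--w_fft`, `--recon_max`, `--anomaly_threshold`), but the formula itself is an assumption: it presumes the posture and FFT scores already lie in [0, 1], that the reconstruction error is scaled by `recon_max` and clipped, and that a row is flagged when the score reaches the threshold. The repo may combine them differently.

```python
def combined_score(
    recon_err: float,
    posture: float,
    fft: float,
    *,
    w_recon: float = 0.6,
    w_posture: float = 0.25,
    w_fft: float = 0.15,
    recon_max: float = 0.1,
) -> float:
    """Weighted sum of three signals, each assumed to lie in [0, 1]."""
    # Assumed normalization: scale by recon_max and clip to [0, 1].
    recon_norm = min(max(recon_err / recon_max, 0.0), 1.0)
    return w_recon * recon_norm + w_posture * posture + w_fft * fft


def is_anomaly(score: float, threshold: float = 0.5) -> bool:
    """Assumed flag rule: anomalous when the score reaches the threshold."""
    return score >= threshold


calm = combined_score(recon_err=0.02, posture=0.1, fft=0.0)
odd = combined_score(recon_err=0.10, posture=0.5, fft=0.5)
print(calm, is_anomaly(calm))
print(odd, is_anomaly(odd))
```

With these weights the reconstruction error alone can push a row over the threshold, while posture and FFT together cannot (0.25 + 0.15 = 0.4), which matches their description as supporting features.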

In production, anomaly flags could trigger a notification system (webhook/SMS/email),
but this repository focuses on **modeling + scoring + reproducible inference**.
