# 3DDFA_V2 Interactive Demo

This notebook demonstrates how to use the 3DDFA_V2 (Three-D Dense Face Alignment Version 2) library for 3D face reconstruction from 2D images.

## What You'll Learn:
1. **Face Detection**: How to find faces in images using FaceBoxes
2. **3D Reconstruction**: How to build 3D face models from 2D images
3. **Visualization**: Different ways to display the results (landmarks, meshes, depth maps)
4. **Performance**: How to use ONNX for faster inference

## Key Concepts:
- **3DMM (3D Morphable Model)**: Mathematical model representing faces as combinations of shape and expression
- **Parameters**: 62 numbers that describe any face (pose + shape + expression)
- **Vertices**: 3D coordinates of points on the face surface
- **Dense vs Sparse**: Dense = full mesh (~38k points), Sparse = landmarks (68 points)
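
As a rough illustration of the 3DMM idea above, the sketch below builds a face as a mean shape plus weighted shape and expression bases. The bases are random placeholders with a toy size (`n_vertices = 5`), not the real model data, and the names `morph`, `shape_basis`, and `exp_basis` are hypothetical. Only the 40 + 10 coefficient split comes from the parameter description above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vertices = 5                      # toy size; the real dense mesh has ~38k vertices

# Placeholder model data (random, for illustration only)
mean_shape = rng.normal(size=(3 * n_vertices, 1))
shape_basis = rng.normal(size=(3 * n_vertices, 40))   # 40 shape directions
exp_basis = rng.normal(size=(3 * n_vertices, 10))     # 10 expression directions

def morph(alpha_shp, alpha_exp):
    """Return a (3, n_vertices) array: mean face plus weighted basis offsets."""
    flat = mean_shape + shape_basis @ alpha_shp + exp_basis @ alpha_exp
    return flat.reshape(3, -1, order="F")   # each column is one (x, y, z) vertex

neutral = morph(np.zeros((40, 1)), np.zeros((10, 1)))
varied = morph(rng.normal(size=(40, 1)), np.zeros((10, 1)))
print(neutral.shape, np.abs(varied - neutral).max() > 0)
```

With all coefficients at zero the result is just the mean face; changing the shape coefficients moves every vertex along the shape directions.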

## A simple demonstration of how to run 3DDFA_V2

In [None]:
# IMPORTANT: Before running this notebook, make sure you've built the required modules:
# Run this in terminal: sh build.sh

# === CORE COMPUTER VISION LIBRARIES ===
import cv2                    # OpenCV - for image processing (reading, displaying, color conversion)
import yaml                   # For loading configuration files (model settings, paths, etc.)

# === 3DDFA_V2 CORE MODULES ===
from FaceBoxes import FaceBoxes           # Face detector - finds rectangular boxes around faces
from TDDFA import TDDFA                   # Main 3D face alignment engine - converts 2D faces to 3D models
from utils.functions import draw_landmarks # Helper function to visualize facial landmarks
from utils.render import render           # 3D mesh renderer - draws 3D face models on images  
from utils.depth import depth             # Depth map generator - shows face depth information

# === VISUALIZATION LIBRARY ===
import matplotlib.pyplot as plt  # For displaying images and plots in the notebook

print("All libraries imported successfully!")
print("Ready to perform 3D face reconstruction!")

### Load configs

In [None]:
# === STEP 1: LOAD CONFIGURATION ===
# The config file contains important settings like:
# - Model architecture (MobileNet, ResNet, etc.)
# - Input image size (120x120 pixels)
# - Model file paths and parameters
# - 3DMM (3D Morphable Model) settings
with open('configs/mb1_120x120.yml') as f:  # Context manager closes the file handle
    cfg = yaml.safe_load(f)                 # Same as yaml.load(..., Loader=yaml.SafeLoader)
print("Configuration loaded successfully!")
print(f"Model architecture: {cfg.get('arch', 'Unknown')}")
print(f"Input size: {cfg.get('size', 'Unknown')}x{cfg.get('size', 'Unknown')} pixels")

# === STEP 2: INITIALIZE MODELS ===
# We can use either ONNX (faster) or PyTorch (standard) models
onnx_flag = True  # Set to True for faster inference, False for standard PyTorch

if onnx_flag:
    print("\n🚀 Using ONNX optimized models (faster inference)")
    
    # Set environment variables for optimal CPU performance
    import os
    os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'  # Avoid OpenMP library conflicts
    os.environ['OMP_NUM_THREADS'] = '4'          # Use 4 CPU threads for parallel processing
    
    # Import optimized ONNX versions
    from FaceBoxes.FaceBoxes_ONNX import FaceBoxes_ONNX
    from TDDFA_ONNX import TDDFA_ONNX
    
    # Initialize the models
    face_boxes = FaceBoxes_ONNX()    # Fast face detector (~1.5 ms per image; varies with hardware)
    tddfa = TDDFA_ONNX(**cfg)        # Fast 3D face alignment (~1.35ms per face)
    
    print("✅ ONNX models loaded - ready for fast inference!")
else:
    print("\n🐌 Using standard PyTorch models")
    
    # Initialize standard models
    tddfa = TDDFA(gpu_mode=False, **cfg)  # 3D face alignment (CPU mode)
    face_boxes = FaceBoxes()              # Face detection
    
    print("✅ PyTorch models loaded - ready for inference!")
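
To check the speed difference between the two backends on your own machine, a small timing helper can be handy. This is a minimal sketch: `mean_latency_ms` is a hypothetical helper, not part of 3DDFA_V2, and the workload timed here is a stand-in.

```python
import time

def mean_latency_ms(fn, *args, warmup=2, repeats=10):
    """Average wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):          # warm-up runs are excluded from the timing
        fn(*args)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) * 1000 / repeats

# Stand-in workload for illustration
latency = mean_latency_ms(sum, range(10_000))
print(f"{latency:.3f} ms per call")
```

Once an image is loaded, a call such as `mean_latency_ms(face_boxes, img)` would time the detector in the same way.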

In [None]:
# === STEP 3: LOAD TEST IMAGE ===
# Load an example image to test face detection and 3D reconstruction
img_fp = 'examples/inputs/emma.jpg'  # Path to test image
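img = cv2.imread(img_fp)             # Load image using OpenCV (BGR format)
if img is None:                      # cv2.imread returns None on a bad path instead of raising
    raise FileNotFoundError(f"Could not read image: {img_fp}")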

print(f"Image loaded: {img_fp}")
print(f"Image dimensions: {img.shape}")  # Shows (height, width, channels)
print(f"Image type: {img.dtype}")        # Should be uint8 (0-255 pixel values)

# Display the image (convert BGR to RGB for matplotlib)
plt.figure(figsize=(8, 6))
plt.imshow(img[..., ::-1])  # Convert BGR to RGB for correct colors
plt.title("Input Image for 3D Face Reconstruction")
plt.axis('off')             # Hide axes for cleaner display
plt.show()

print("✅ Image loaded and displayed successfully!")
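
The `img[..., ::-1]` trick used above reverses the channel axis, which turns OpenCV's BGR order into the RGB order matplotlib expects. A tiny sketch with a made-up two-pixel image:

```python
import numpy as np

# A 1x2 "image" in BGR order: one pure-blue pixel and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

rgb = bgr[..., ::-1]          # reverse the last (channel) axis; a view, no copy
print(rgb[0, 0].tolist())     # blue pixel, now in RGB order
print(rgb[0, 1].tolist())     # red pixel, now in RGB order
```

Because the result is a view with a negative stride, wrap it in `np.ascontiguousarray(...)` if a function needs contiguous memory.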

### Detect faces using FaceBoxes

In [None]:
# === STEP 4: DETECT FACES IN THE IMAGE ===
# FaceBoxes will scan the image and return bounding boxes around detected faces

print("🔍 Detecting faces in the image...")
boxes = face_boxes(img)  # Run face detection on the image

print(f"✅ Detection complete!")
print(f"Number of faces detected: {len(boxes)}")

# Display information about each detected face
for i, box in enumerate(boxes):
    x1, y1, x2, y2, confidence = box
    width = x2 - x1
    height = y2 - y1
    print(f"Face {i+1}:")
    print(f"  - Position: ({x1:.1f}, {y1:.1f}) to ({x2:.1f}, {y2:.1f})")
    print(f"  - Size: {width:.1f} x {height:.1f} pixels")
    print(f"  - Confidence: {confidence:.3f}")
    print()

# Explanation of the bounding box format:
print("📝 Bounding box format: [x1, y1, x2, y2, confidence]")
print("   - (x1, y1): Top-left corner coordinates")
print("   - (x2, y2): Bottom-right corner coordinates") 
print("   - confidence: How sure the detector is (0-1, higher = more confident)")
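
Given the `[x1, y1, x2, y2, confidence]` format described above, filtering and ranking detections takes only a few lines. The helpers `keep_confident` and `box_area`, the 0.9 threshold, and the sample boxes are all made up for illustration.

```python
def keep_confident(boxes, min_conf=0.9):
    """Keep boxes whose confidence (5th value) reaches min_conf."""
    return [b for b in boxes if b[4] >= min_conf]

def box_area(box):
    """Area in square pixels of an [x1, y1, x2, y2, confidence] box."""
    x1, y1, x2, y2, _ = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

# Made-up detections for illustration
boxes_demo = [
    [10.0, 20.0, 110.0, 140.0, 0.99],
    [200.0, 50.0, 230.0, 90.0, 0.42],
    [300.0, 10.0, 380.0, 100.0, 0.95],
]
confident = keep_confident(boxes_demo)
largest = max(confident, key=box_area)
print(len(confident), box_area(largest))
```

Picking the largest confident box is one simple way to select the main subject when an image contains several faces.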

### Regressing 3DMM parameters, reconstruction and visualization

In [None]:
# === STEP 5: 3D FACE RECONSTRUCTION - PARAMETER REGRESSION ===
# Now we use TDDFA to analyze each detected face and predict 3D parameters

print("🧠 Running 3D face analysis...")
print("This step:")
print("1. Crops each face region based on the detected bounding box")
print("2. Resizes to 120x120 pixels (neural network input size)")  
print("3. Runs the face through a neural network")
print("4. Outputs 62 parameters that describe the 3D face")
print()

# Run 3DMM parameter regression
param_lst, roi_box_lst = tddfa(img, boxes)

print(f"✅ 3D analysis complete!")
print(f"Generated parameters for {len(param_lst)} faces")

# Explain what we got back
for i, (param, roi_box) in enumerate(zip(param_lst, roi_box_lst)):
    print(f"\nFace {i+1} analysis results:")
    print(f"  - 3DMM parameters: {len(param)} values")
    print(f"    • First 12: Pose (rotation + translation)")
    print(f"    • Next 40: Shape coefficients (face geometry)")
    print(f"    • Last 10: Expression coefficients (facial expression)")
    print(f"  - Refined bounding box: {[f'{x:.1f}' for x in roi_box]}")
    
    # Show a few sample parameter values
    print(f"  - Sample pose values: {param[:3].round(3).tolist()}")
    print(f"  - Sample shape values: {param[12:15].round(3).tolist()}")
    print(f"  - Sample expression values: {param[52:55].round(3).tolist()}")

print("\n📚 The 62 parameters are like 'DNA' for the face - they contain all")
print("   the information needed to reconstruct the 3D face shape!")
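
The 12 + 40 + 10 layout printed above can be unpacked with plain slicing. The sketch below uses dummy values and assumes the 12 pose numbers form a row-major 3x4 matrix whose last column is the translation. `split_params` is a hypothetical helper, not the library's own parser.

```python
import numpy as np

def split_params(param):
    """Split a 62-vector into pose matrix parts, shape and expression coefficients."""
    param = np.asarray(param, dtype=np.float32)
    if param.shape != (62,):
        raise ValueError(f"expected 62 values, got shape {param.shape}")
    pose = param[:12].reshape(3, 4)      # assumed row-major 3x4 layout: [R | t]
    rotation_scaled = pose[:, :3]        # rotation combined with scale
    translation = pose[:, 3]             # offset column
    shape = param[12:52]                 # 40 shape coefficients
    expression = param[52:62]            # 10 expression coefficients
    return rotation_scaled, translation, shape, expression

demo = np.arange(62, dtype=np.float32)   # dummy values 0..61
R, t, shp, exp = split_params(demo)
print(R.shape, t.tolist(), shp.shape, exp.shape)
```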

In [None]:
# === STEP 6A: SPARSE LANDMARK RECONSTRUCTION ===
# Convert the 62 parameters into actual 3D coordinates (sparse version)

print("📍 Reconstructing SPARSE landmarks (68 key facial points)...")
print("Sparse landmarks include:")
print("- Face outline (17 points)")  
print("- Eyebrows (10 points)")
print("- Eyes (12 points)")
print("- Nose (9 points)")
print("- Mouth (20 points)")
print()

# Reconstruct 3D vertices in sparse mode (68 landmarks only)
dense_flag = False  # False = sparse landmarks, True = dense mesh
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=dense_flag)

print(f"✅ Sparse reconstruction complete!")
for i, ver in enumerate(ver_lst):
    print(f"Face {i+1}: Generated {ver.shape[1]} landmark points")
    print(f"  - 3D coordinates shape: {ver.shape} (3 rows: X,Y,Z coords)")
    print(f"  - Sample coordinates: X={ver[0,0]:.1f}, Y={ver[1,0]:.1f}, Z={ver[2,0]:.1f}")

print("\n🎨 Drawing sparse landmarks on the image...")
# draw_landmarks plots onto its own matplotlib figure rather than returning an
# image (per upstream utils/functions.py), so title and show that figure directly.
draw_landmarks(img, ver_lst, dense_flag=dense_flag)
plt.title("Sparse Landmarks (68 Key Facial Points)")
plt.show()

print("✅ Sparse landmark visualization complete!")
print("Each dot represents a key facial feature point in 3D space.")
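
The group sizes listed above (17 + 10 + 12 + 9 + 20 = 68) map onto index ranges in the conventional 68-point landmark ordering. The sketch below assumes that ordering applies to the sparse output here. `LANDMARK_GROUPS` and `group_points` are hypothetical names, and the coordinates are dummy values.

```python
import numpy as np

# Conventional 68-point index ranges (assumed to apply here)
LANDMARK_GROUPS = {
    "outline": range(0, 17),
    "eyebrows": range(17, 27),
    "nose": range(27, 36),
    "eyes": range(36, 48),
    "mouth": range(48, 68),
}

def group_points(ver, name):
    """Return the (3, k) columns of a (3, 68) landmark array for one group."""
    return ver[:, list(LANDMARK_GROUPS[name])]

ver_demo = np.arange(3 * 68, dtype=np.float32).reshape(3, 68)  # dummy coordinates
for name, idx in LANDMARK_GROUPS.items():
    print(name, len(idx), group_points(ver_demo, name).shape)
```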

In [None]:
# === STEP 6B: DENSE MESH RECONSTRUCTION ===
# Convert the same 62 parameters into a DENSE 3D face mesh

print("🕸️ Reconstructing DENSE mesh (~38,000 vertices)...")
print("Dense reconstruction creates a complete 3D face surface with:")
print("- Every point on the face surface")
print("- Smooth transitions between features") 
print("- Full geometric detail")
print("- Ready for 3D rendering and analysis")
print()

# Reconstruct 3D vertices in dense mode (full mesh)
dense_flag = True   # True = dense mesh (~38k points), False = sparse landmarks
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=dense_flag)

print(f"✅ Dense reconstruction complete!")
for i, ver in enumerate(ver_lst):
    print(f"Face {i+1}: Generated {ver.shape[1]:,} mesh vertices")
    print(f"  - 3D coordinates shape: {ver.shape}")
    print(f"  - Memory usage: ~{ver.nbytes / 1024:.1f} KB per face")
    
    # Show coordinate ranges to understand the 3D shape
    x_range = ver[0].max() - ver[0].min()
    y_range = ver[1].max() - ver[1].min() 
    z_range = ver[2].max() - ver[2].min()
    print(f"  - Face dimensions: {x_range:.1f} x {y_range:.1f} x {z_range:.1f} pixels")

print("\n🎨 Drawing dense landmarks on the image...")
# draw_landmarks plots onto its own matplotlib figure rather than returning an
# image (per upstream utils/functions.py), so title and show that figure directly.
draw_landmarks(img, ver_lst, dense_flag=dense_flag)
plt.title("Dense Landmarks (~38,000 Face Surface Points)")
plt.show()

print("✅ Dense landmark visualization complete!")
print("Notice how much more detailed the face surface representation is!")
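
Plotting tens of thousands of points can be slow and cluttered, so a dense `(3, N)` array is often thinned before drawing. A minimal sketch on a random stand-in mesh; the size of 38,000 and the step of 6 are illustrative choices, not values read from the library:

```python
import numpy as np

rng = np.random.default_rng(0)
ver_dense = rng.uniform(0, 200, size=(3, 38_000)).astype(np.float32)  # stand-in mesh

step = 6                                # keep every 6th vertex for plotting
ver_sub = ver_dense[:, ::step]

extent = ver_dense.max(axis=1) - ver_dense.min(axis=1)   # x, y, z ranges
print(ver_sub.shape, f"{ver_dense.nbytes / 1024:.1f} KB", extent.round(1).tolist())
```

The memory figure follows directly from the shape: 3 rows x N columns x 4 bytes per `float32` value.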

In [None]:
# === STEP 7: 3D MESH RENDERING ===
# Render the 3D face mesh as a solid surface overlay on the original image

print("🎭 Rendering 3D face mesh overlay...")
print("This step:")
print("1. Takes the dense 3D vertices")
print("2. Connects them into triangular faces using mesh topology")
print("3. Renders the 3D surface with proper lighting and transparency")
print("4. Overlays the result on the original image")
print()

# Reconstruct vertices for rendering (dense mesh required, so request it explicitly
# instead of relying on dense_flag left over from an earlier cell)
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=True)

print(f"📐 Mesh topology info:")
print(f"  - Vertices: {ver_lst[0].shape[1]:,} points")
print(f"  - Triangles: {len(tddfa.tri):,} faces")
print(f"  - Each triangle connects 3 vertices to form a surface patch")
print()

print("🎨 Rendering 3D mesh...")
# Render the 3D mesh with semi-transparent overlay
result_img = render(img, ver_lst, tddfa.tri, alpha=0.6, show_flag=True)

# Note: The render function with show_flag=True will display the result automatically
# and return the rendered image

print("✅ 3D mesh rendering complete!")
print()
print("🔍 What you're seeing:")
print("- The original image with a 3D face mesh overlay")
print("- Semi-transparent rendering (alpha=0.6) so you can see both")
print("- Proper 3D lighting and shading on the mesh surface")
print("- The mesh follows the exact contours of the detected face")
print()
print("🎯 This shows the system has successfully:")
print("1. Detected the face in 2D")
print("2. Estimated the 3D pose and shape") 
print("3. Reconstructed a complete 3D model")
print("4. Rendered it back onto the 2D image")
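
The semi-transparent overlay controlled by `alpha` can be thought of as a per-pixel weighted mix of the rendered mesh and the photo. The sketch below shows that idea in NumPy on stand-in images. `blend` is a hypothetical helper, and whether `render` uses exactly this formula internally is an assumption.

```python
import numpy as np

def blend(background, overlay, alpha=0.6):
    """Weighted mix: alpha * overlay + (1 - alpha) * background, as uint8."""
    mixed = alpha * overlay.astype(np.float32) + (1.0 - alpha) * background.astype(np.float32)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

background = np.full((2, 2, 3), 100, dtype=np.uint8)   # stand-in photo
overlay = np.full((2, 2, 3), 200, dtype=np.uint8)      # stand-in rendered mesh
out = blend(background, overlay, alpha=0.6)
print(out[0, 0].tolist())
```

With `alpha=0` only the photo is visible and with `alpha=1` only the mesh; values in between trade one off against the other.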

In [None]:
# === STEP 8: DEPTH MAP VISUALIZATION ===
# Generate a depth map showing the 3D structure of the face

print("🗺️ Generating face depth map...")
print("A depth map visualizes 3D information by:")
print("- Encoding relative depth as grayscale brightness")
print("- Closer parts appear brighter")
print("- Further parts appear darker") 
print("- Reveals the 3D shape and structure of the face")
print()

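# Reconstruct the dense 3D vertices for the depth map (request the dense mesh
# explicitly instead of relying on dense_flag left over from an earlier cell)
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=True)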

print(f"📊 Depth analysis:")
for i, ver in enumerate(ver_lst):
    z_coords = ver[2]  # Z row holds relative depth in image-scale units, not a measured camera distance
    min_depth = z_coords.min()
    max_depth = z_coords.max()
    depth_range = max_depth - min_depth
    print(f"Face {i+1} depth info:")
    print(f"  - Smallest z (furthest, drawn darkest): {min_depth:.1f}")
    print(f"  - Largest z (closest, drawn brightest): {max_depth:.1f}")
    print(f"  - Total depth range: {depth_range:.1f}")

print("\n🎨 Rendering depth map...")
# Generate depth visualization
result_img = depth(img, ver_lst, tddfa.tri, show_flag=True)

# Note: The depth function with show_flag=True will display the result automatically

print("✅ Depth map generation complete!")
print()
print("🔍 How to read the depth map:")
print("- Brighter pixels = closer to camera (nose tip, forehead)")
print("- Darker pixels = further from camera (eye sockets, sides)")
print("- The gradient shows the 3D curvature of facial features")
print("- This is similar to how 3D scanners represent depth information")
print()
print("🎯 Applications of depth maps:")
print("- 3D face recognition and biometrics")
print("- Augmented reality face filters")
print("- Facial animation and motion capture")
print("- Medical and forensic face analysis")
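
The brightness mapping described above amounts to rescaling the z values of one face to a fixed range. A minimal sketch, assuming a linear min-max mapping to 8-bit gray; `depth_to_gray` is a hypothetical helper and the z values are made up:

```python
import numpy as np

def depth_to_gray(z):
    """Map depth values linearly to 0..255 (smallest z -> 0, largest z -> 255)."""
    z = np.asarray(z, dtype=np.float32)
    span = z.max() - z.min()
    if span == 0:                       # flat surface: avoid division by zero
        return np.zeros(z.shape, dtype=np.uint8)
    return np.rint((z - z.min()) / span * 255).astype(np.uint8)

z_demo = np.array([-30.0, 0.0, 15.0, 60.0])   # made-up depth values
gray = depth_to_gray(z_demo)
print(gray.tolist())
```

Because the scaling is per face, the same gray level in two different faces does not imply the same absolute depth.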