# Lab Exercise 4: Blind Hyperspectral Unmixing (Advanced)

## Objectives
- Understand the challenges of blind unmixing
- Implement the Block Coordinate Descent (BCD) algorithm
- Analyze initialization strategies
- Compare blind vs. supervised unmixing performance

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append('../src')

from data_loader import HyperspectralDataLoader, create_synthetic_data
from visualization import HSIVisualizer
from optimization import HyperspectralUnmixer, BlindUnmixer
from metrics import UnmixingEvaluator, compute_endmember_similarity

%matplotlib inline

## Task 1: Understanding the Blind Unmixing Problem

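In blind unmixing, neither factor is known: we must estimate the endmember matrix $\mathbf{S}$ (bands × endmembers) and the abundance matrix $\mathbf{A}$ (endmembers × pixels) simultaneously from the data $\mathbf{Y}$ (bands × pixels):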

$$\min_{\mathbf{S}, \mathbf{A}} \|\mathbf{S}\mathbf{A} - \mathbf{Y}\|_F^2 \quad \text{s.t.} \quad \mathbf{A} \geq 0, \mathbf{1}^T\mathbf{A} = \mathbf{1}^T, \mathbf{S} \geq 0$$
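
One reason this problem is hard is that its solution is not unique: reordering the columns of $\mathbf{S}$ and the rows of $\mathbf{A}$ in the same way leaves $\mathbf{S}\mathbf{A}$ unchanged. The following is a minimal NumPy sketch of the objective, the constraints, and this permutation ambiguity. The helper names `blind_objective` and `is_feasible` are illustrative and are not part of the lab's `src` modules.

```python
import numpy as np

def blind_objective(S, A, Y):
    """Squared Frobenius norm of the reconstruction residual S A - Y."""
    return float(np.sum((S @ A - Y) ** 2))

def is_feasible(S, A, tol=1e-8):
    """Check S >= 0, A >= 0 and that every column of A sums to one."""
    return bool(
        np.all(S >= -tol)
        and np.all(A >= -tol)
        and np.allclose(A.sum(axis=0), 1.0, atol=tol)
    )

# Toy problem: 6 bands, 3 endmembers, 10 pixels (noise-free)
rng = np.random.default_rng(0)
S = rng.random((6, 3))
A = rng.dirichlet(np.ones(3), size=10).T   # shape (3, 10), columns on the simplex
Y = S @ A

# Permute endmembers and abundances consistently: the product is unchanged
perm = [2, 0, 1]
S_perm, A_perm = S[:, perm], A[perm, :]

print(is_feasible(S, A), is_feasible(S_perm, A_perm))
print(blind_objective(S, A, Y), blind_objective(S_perm, A_perm, Y))
```

Both factor pairs are feasible and fit the data equally well, which is why estimated endmembers have to be matched to reference endmembers before they are compared.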

In [None]:
# Load data
loader = HyperspectralDataLoader("../data/")
hsi_data, ground_truth = loader.load_indian_pines()
Y, _ = loader.vectorize_data()

# Load known endmembers for comparison
S_true = np.load('../data/extracted_endmembers.npy')
endmember_names = np.load('../data/endmember_names.npy', allow_pickle=True)

print(f"Data shape: {Y.shape}")
print(f"True endmembers shape: {S_true.shape}")
print(f"Endmember names: {list(endmember_names)}")

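# Analyze the challenge: count unknowns vs. observations.
# This is only a rough guide: the model is bilinear in S and A, so having
# more observations than unknowns does not guarantee a unique solution.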
num_bands, num_pixels = Y.shape
num_endmembers = S_true.shape[1]
num_unknowns = num_bands * num_endmembers + num_endmembers * num_pixels
num_observations = num_bands * num_pixels

print(f"\nProblem Analysis:")
print(f"  Number of observations: {num_observations}")
print(f"  Number of unknowns: {num_unknowns}")
print(f"  Problem is {'over' if num_observations > num_unknowns else 'under'}-determined")

## Task 2: Implement Block Coordinate Descent

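BCD alternates between two subproblems: updating A with S fixed (simplex-constrained least squares) and updating S with A fixed (non-negative least squares). Each subproblem is convex, but the joint problem in (S, A) is not, so the result depends on the starting point.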
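
For reference, the alternation can be sketched in plain NumPy, independently of the lab's `HyperspectralUnmixer`. This sketch assumes projected gradient descent for both blocks, with step size equal to the inverse Lipschitz constant of each block's gradient, and a sort-based Euclidean projection onto the simplex. The names `project_columns_to_simplex` and `bcd_sketch` are hypothetical, and the library's own solvers may work differently.

```python
import numpy as np

def project_columns_to_simplex(V):
    """Euclidean projection of each column of V onto the probability simplex."""
    K, N = V.shape
    U = -np.sort(-V, axis=0)                      # each column sorted descending
    css = np.cumsum(U, axis=0) - 1.0
    ks = np.arange(1, K + 1).reshape(-1, 1)
    cond = U - css / ks > 0                       # first row is always True
    rho = K - 1 - np.argmax(cond[::-1], axis=0)   # last True index per column
    theta = css[rho, np.arange(N)] / (rho + 1)
    return np.maximum(V - theta, 0.0)

def bcd_sketch(Y, K, outer_iters=30, inner_iters=20, seed=0):
    """Alternate projected-gradient updates of A (simplex) and S (non-negative)."""
    rng = np.random.default_rng(seed)
    num_bands, num_pixels = Y.shape
    S = rng.random((num_bands, K))
    A = project_columns_to_simplex(rng.random((K, num_pixels)))
    history = []
    for _ in range(outer_iters):
        # A-step: gradient of 0.5 * ||S A - Y||_F^2 w.r.t. A is S^T (S A - Y)
        step_A = 1.0 / max(np.linalg.norm(S.T @ S, 2), 1e-12)
        for _ in range(inner_iters):
            A = project_columns_to_simplex(A - step_A * (S.T @ (S @ A - Y)))
        # S-step: gradient w.r.t. S is (S A - Y) A^T, then clip at zero
        step_S = 1.0 / max(np.linalg.norm(A @ A.T, 2), 1e-12)
        for _ in range(inner_iters):
            S = np.maximum(S - step_S * ((S @ A - Y) @ A.T), 0.0)
        history.append(0.5 * float(np.sum((S @ A - Y) ** 2)))
    return S, A, history

# Small noise-free demo: 20 bands, 3 endmembers, 200 pixels
rng = np.random.default_rng(1)
S_demo = rng.random((20, 3))
A_demo = rng.dirichlet(np.ones(3), size=200).T
Y_demo = S_demo @ A_demo
S_hat, A_hat, hist = bcd_sketch(Y_demo, K=3)
print(f"objective: {hist[0]:.4f} -> {hist[-1]:.4f}")
```

Because each block update is a projected gradient step with a conservative step size, the recorded objective should not increase from one outer iteration to the next. This sketch omits the optional endmember normalization.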

In [None]:
def block_coordinate_descent_manual(Y, num_endmembers, max_iter=50, 
                                  inner_max_iter=100, tolerance=1e-6):
    """
    Manual implementation of Block Coordinate Descent for blind unmixing.
    
    TODO: Implement the BCD algorithm
    """
    bands, num_pixels = Y.shape
    
    # Initialize endmembers and abundances
    # TODO: Initialize S with random normalized columns
    S = # YOUR CODE HERE
    
    # TODO: Initialize A on the simplex
    A = # YOUR CODE HERE
    
    unmixer = HyperspectralUnmixer()
    objective_values = []
    
    for iteration in range(max_iter):
        # Compute current objective
        residual = S @ A - Y
        objective = 0.5 * np.sum(residual**2)
        objective_values.append(objective)
        
        print(f"Iteration {iteration+1}/{max_iter}, Objective: {objective:.6f}")
        
        # Step 1: Update A with S fixed (simplex-constrained least squares)
        # TODO: Use the unmixer to solve the A subproblem
        A, _ = # YOUR CODE HERE
        
        # Step 2: Update S with A fixed (non-negative least squares)
        # This is equivalent to solving: min ||A^T S^T - Y^T||_F^2 s.t. S^T >= 0
        # TODO: Use projected gradient for the S subproblem
        S_new, _ = # YOUR CODE HERE (note the transpose!)
        S = S_new.T
        
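        # Normalize endmembers (optional, helps with scaling).
        # Note: A is not rescaled here, so this step can change the objective.
        # The lower bound on the norm avoids division by zero for all-zero columns.
        S = S / np.maximum(np.linalg.norm(S, axis=0, keepdims=True), 1e-12)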
        
        # Check convergence
        if iteration > 0:
            # Relative change of the objective; guard against a zero denominator
            rel_change = abs(objective_values[-1] - objective_values[-2]) / max(objective_values[-2], 1e-12)
            if rel_change < tolerance:
                print(f"Converged after {iteration+1} iterations")
                break
    
    return S, A, objective_values

# Test with synthetic data first
print("Testing BCD with synthetic data...")
hsi_synthetic, S_synth_true, A_synth_true = create_synthetic_data(
    height=30, width=30, bands=50, num_endmembers=3, noise_level=0.02)

Y_synth = hsi_synthetic.reshape(-1, hsi_synthetic.shape[2]).T

# Apply BCD
S_estimated, A_estimated, obj_values = block_coordinate_descent_manual(
    Y_synth, num_endmembers=3, max_iter=20)

print(f"\nSynthetic test completed in {len(obj_values)} iterations")
print(f"Final objective: {obj_values[-1]:.6f}")

## Task 3: Evaluate Synthetic Results

Compare estimated endmembers and abundances with ground truth.
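
Blind unmixing returns endmembers in arbitrary order, so each estimated column should be matched to a reference column before per-endmember errors are reported. The sketch below does this by brute force over permutations, which is only practical for a small number of endmembers. The function names are hypothetical, and it is an assumption that the lab's `compute_endmember_similarity` may or may not already perform such matching.

```python
import itertools
import numpy as np

def sam_matrix_degrees(S_ref, S_est):
    """Pairwise spectral angles (degrees) between reference and estimated columns."""
    ref = S_ref / np.linalg.norm(S_ref, axis=0, keepdims=True)
    est = S_est / np.linalg.norm(S_est, axis=0, keepdims=True)
    cosines = np.clip(ref.T @ est, -1.0, 1.0)
    return np.degrees(np.arccos(cosines))

def match_endmembers(S_ref, S_est):
    """Return (perm, mean_sam) where S_est[:, perm[i]] is matched to S_ref[:, i]."""
    angles = sam_matrix_degrees(S_ref, S_est)
    K = angles.shape[0]
    best = min(
        itertools.permutations(range(K)),
        key=lambda p: sum(angles[i, p[i]] for i in range(K)),
    )
    mean_sam = float(np.mean([angles[i, best[i]] for i in range(K)]))
    return np.array(best), mean_sam

# Demo: the "estimate" is a shuffled, rescaled copy of the reference
rng = np.random.default_rng(0)
S_ref = rng.random((50, 3))
S_shuffled = 2.0 * S_ref[:, [2, 0, 1]]
perm, mean_sam = match_endmembers(S_ref, S_shuffled)
S_aligned = S_shuffled[:, perm]   # columns now follow the reference order
print(perm, mean_sam)
```

The same permutation should be applied to the rows of the estimated abundance matrix before comparing it with the true abundances.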

In [None]:
# TODO: Evaluate endmember recovery
endmember_similarity = compute_endmember_similarity(S_synth_true, S_estimated)

print("Endmember Recovery Analysis:")
print(f"  Mean SAM (degrees): {endmember_similarity['mean_endmember_sam_degrees']:.4f}")
print(f"  Correlation: {endmember_similarity['endmember_correlation']:.4f}")

# TODO: Evaluate abundance recovery
evaluator = UnmixingEvaluator()
aad = evaluator.abundance_angle_distance(A_synth_true, A_estimated)
print(f"  Abundance AAD (degrees): {np.degrees(aad):.4f}")

# Visualize synthetic results
visualizer = HSIVisualizer()

# Plot true vs estimated endmembers
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

for i in range(S_synth_true.shape[1]):
    axes[0].plot(S_synth_true[:, i], label=f'True EM{i+1}', linewidth=2)
    axes[1].plot(S_estimated[:, i], label=f'Est. EM{i+1}', linewidth=2)

axes[0].set_title('True Endmembers')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

axes[1].set_title('Estimated Endmembers')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Plot convergence
visualizer.plot_convergence(obj_values, "BCD Convergence (Synthetic Data)")

## Task 4: Apply to Real Data

Now apply blind unmixing to the Indian Pines dataset.

In [None]:
# Apply BCD to real data (use subset for faster computation)
print("Applying BCD to Indian Pines data...")

# Use library implementation for efficiency
unmixer = HyperspectralUnmixer()
blind_unmixer = BlindUnmixer(unmixer)

num_endmembers = S_true.shape[1]  # Same number as supervised case

# Use subset of data for faster computation
step = 3  # Use every 3rd pixel
Y_subset = Y[:, ::step]
print(f"Using subset: {Y_subset.shape}")

# Run blind unmixing
S_blind, A_blind_subset, obj_blind = blind_unmixer.block_coordinate_descent(
    Y_subset, num_endmembers, max_iter=30, inner_max_iter=100)

print(f"\nBlind unmixing completed in {len(obj_blind)} iterations")
print(f"Final objective: {obj_blind[-1]:.6f}")

# Apply estimated endmembers to full dataset
A_blind_full, _ = unmixer.fully_constrained_least_squares(S_blind, Y, max_iter=200)

print(f"Estimated endmembers shape: {S_blind.shape}")
print(f"Full abundance matrix shape: {A_blind_full.shape}")

## Task 5: Compare Blind vs. Supervised Results

Evaluate how blind unmixing compares to supervised unmixing.
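
As a rough guide to what a reconstruction-based comparison measures, the sketch below computes the RMSE and the mean per-pixel spectral angle between the data and its reconstruction $\mathbf{S}\mathbf{A}$. The function `reconstruction_metrics` is hypothetical; the exact definitions used by `UnmixingEvaluator.evaluate_reconstruction` are not shown in this notebook and may differ.

```python
import numpy as np

def reconstruction_metrics(S, A, Y, eps=1e-12):
    """RMSE and mean per-pixel spectral angle between Y and its reconstruction S A."""
    Y_hat = S @ A
    rmse = float(np.sqrt(np.mean((Y_hat - Y) ** 2)))
    dots = np.sum(Y_hat * Y, axis=0)
    norms = np.linalg.norm(Y_hat, axis=0) * np.linalg.norm(Y, axis=0)
    cosines = np.clip(dots / np.maximum(norms, eps), -1.0, 1.0)
    mean_sam = float(np.degrees(np.mean(np.arccos(cosines))))
    return {"rmse": rmse, "mean_sam_degrees": mean_sam}

# Demo: exact factors vs. the same factors evaluated against noisy data
rng = np.random.default_rng(0)
S_toy = rng.random((30, 4))
A_toy = rng.dirichlet(np.ones(4), size=100).T
Y_clean = S_toy @ A_toy
Y_noisy = Y_clean + 0.05 * rng.standard_normal(Y_clean.shape)

metrics_exact = reconstruction_metrics(S_toy, A_toy, Y_clean)
metrics_noisy = reconstruction_metrics(S_toy, A_toy, Y_noisy)
print(metrics_exact)
print(metrics_noisy)
```

Note that these metrics only measure how well the data are reproduced. A blind solution can reconstruct the data well while its endmembers differ from the reference ones, so endmember similarity should be checked separately.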

In [None]:
# Load supervised results for comparison
A_supervised = np.load('../data/simplex_abundances.npy')

# Evaluate both methods
height, width = hsi_data.shape[:2]
rgb_bands = loader.get_rgb_bands()

# Supervised results
results_supervised = evaluator.evaluate_reconstruction(
    S_true, A_supervised, Y, (height, width), rgb_bands)

# Blind results
results_blind = evaluator.evaluate_reconstruction(
    S_blind, A_blind_full, Y, (height, width), rgb_bands)

# Compare methods
comparison = {
    'Supervised': results_supervised,
    'Blind': results_blind
}

evaluator.compare_methods(comparison)

# Endmember similarity analysis
endmember_sim = compute_endmember_similarity(S_true, S_blind)
print(f"\nEndmember Recovery from Real Data:")
print(f"  Mean SAM (degrees): {endmember_sim['mean_endmember_sam_degrees']:.4f}")
print(f"  Correlation: {endmember_sim['endmember_correlation']:.4f}")

## Task 6: Visualization and Analysis

Create comprehensive visualizations of blind unmixing results.

In [None]:
# TODO: Plot endmember comparison
fig, axes = plt.subplots(1, 2, figsize=(15, 6))

# True endmembers
for i in range(S_true.shape[1]):
    axes[0].plot(S_true[:, i], label=endmember_names[i], linewidth=2)
axes[0].set_title('Supervised Endmembers (Ground Truth)')
axes[0].set_xlabel('Band Index')
axes[0].set_ylabel('Reflectance')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

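# Estimated endmembers (blind unmixing returns them in arbitrary order,
# so use neutral labels rather than the reference names)
estimated_names = [f'Est. EM{i+1}' for i in range(S_blind.shape[1])]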
for i in range(S_blind.shape[1]):
    axes[1].plot(S_blind[:, i], label=estimated_names[i], linewidth=2)
axes[1].set_title('Blind Unmixing Endmembers')
axes[1].set_xlabel('Band Index')
axes[1].set_ylabel('Reflectance')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# TODO: Plot abundance maps
print("\nBlind Unmixing Abundance Maps:")
visualizer.plot_abundance_maps(A_blind_full, (height, width), estimated_names)

# TODO: Plot convergence
visualizer.plot_convergence(obj_blind, "Blind Unmixing Convergence (Real Data)")

## Task 7: Initialization Sensitivity Analysis

Test how sensitive the algorithm is to different initializations.
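
To make "different initializations" concrete, here is a minimal sketch of two common ways to initialize the endmembers and one way to initialize the abundances. All function names are hypothetical. The only option this notebook shows for the library's `block_coordinate_descent` is `initialization='random'`, so these helpers are intended for a manual implementation.

```python
import numpy as np

def init_endmembers_random(num_bands, K, rng):
    """Random non-negative endmembers with unit-norm columns."""
    S = rng.random((num_bands, K)) + 1e-6   # small offset avoids all-zero columns
    return S / np.linalg.norm(S, axis=0, keepdims=True)

def init_endmembers_from_pixels(Y, K, rng):
    """Use K distinct, randomly chosen pixels of Y as initial endmembers."""
    idx = rng.choice(Y.shape[1], size=K, replace=False)
    return np.clip(Y[:, idx], 0.0, None)

def init_abundances_uniform(K, num_pixels):
    """Start every pixel at the centre of the simplex."""
    return np.full((K, num_pixels), 1.0 / K)

# Demo on toy data: 50 bands, 400 pixels, 4 endmembers
rng = np.random.default_rng(42)
Y_toy = rng.random((50, 400))
S0_random = init_endmembers_random(50, 4, rng)
S0_pixels = init_endmembers_from_pixels(Y_toy, 4, rng)
A0 = init_abundances_uniform(4, 400)
print(S0_random.shape, S0_pixels.shape, A0.shape)
```

Passing an explicit `np.random.default_rng(seed)` generator keeps each trial reproducible without changing NumPy's global random state.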

In [None]:
# TODO: Test multiple random initializations
num_trials = 5
initialization_results = []

# Use smaller subset for multiple trials
Y_small = Y[:, ::10]  # Every 10th pixel
print(f"Testing with {Y_small.shape[1]} pixels")

for trial in range(num_trials):
    print(f"\nTrial {trial+1}/{num_trials}")
    
    # Set different random seed for each trial
    np.random.seed(trial + 42)
    
    try:
        S_trial, A_trial, obj_trial = blind_unmixer.block_coordinate_descent(
            Y_small, num_endmembers, max_iter=20, initialization='random')
        
        # Evaluate trial
        final_objective = obj_trial[-1]
        # Relative objective reduction over the whole run (not a per-iteration rate)
        convergence_rate = (obj_trial[0] - obj_trial[-1]) / max(obj_trial[0], 1e-12)
        
        # Endmember similarity (if possible to match order)
        try:
            endmember_sim = compute_endmember_similarity(S_true, S_trial)
            mean_sam = endmember_sim['mean_endmember_sam_degrees']
        except Exception:
            mean_sam = float('inf')
        
        initialization_results.append({
            'trial': trial + 1,
            'final_objective': final_objective,
            'convergence_rate': convergence_rate,
            'mean_sam': mean_sam,
            'iterations': len(obj_trial)
        })
        
        print(f"  Final objective: {final_objective:.6f}")
        print(f"  Convergence rate: {convergence_rate:.4f}")
        print(f"  Mean SAM: {mean_sam:.4f}°")
        
    except Exception as e:
        print(f"  Trial {trial+1} failed: {e}")

# Analyze initialization sensitivity
if initialization_results:
    objectives = [r['final_objective'] for r in initialization_results]
    sams = [r['mean_sam'] for r in initialization_results if np.isfinite(r['mean_sam'])]
    
    print(f"\nInitialization Sensitivity Analysis:")
    print(f"  Objective std: {np.std(objectives):.6f}")
    print(f"  Best objective: {np.min(objectives):.6f}")
    print(f"  Worst objective: {np.max(objectives):.6f}")
    
    if sams:
        print(f"  SAM std: {np.std(sams):.4f}°")
        print(f"  Best SAM: {np.min(sams):.4f}°")

## Task 8: Algorithm Limitations Analysis

Investigate the limitations and failure modes of blind unmixing.

In [None]:
# TODO: Test with different numbers of endmembers
endmember_counts = [2, 3, 4, 5]
count_results = {}

for K in endmember_counts:
    print(f"\nTesting with {K} endmembers...")
    
    try:
        S_K, A_K, obj_K = blind_unmixer.block_coordinate_descent(
            Y_small, K, max_iter=15)
        
        # Evaluate reconstruction quality
        reconstruction_error = np.mean((S_K @ A_K - Y_small)**2)
        
        count_results[K] = {
            'final_objective': obj_K[-1],
            'reconstruction_error': reconstruction_error,
            'iterations': len(obj_K)
        }
        
        print(f"  Final objective: {obj_K[-1]:.6f}")
        print(f"  Reconstruction error: {reconstruction_error:.6f}")
        
    except Exception as e:
        print(f"  Failed with {K} endmembers: {e}")

# Plot results
if count_results:
    counts = list(count_results.keys())
    objectives = [count_results[k]['final_objective'] for k in counts]
    errors = [count_results[k]['reconstruction_error'] for k in counts]
    
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    
    axes[0].plot(counts, objectives, 'bo-', linewidth=2, markersize=8)
    axes[0].set_xlabel('Number of Endmembers')
    axes[0].set_ylabel('Final Objective')
    axes[0].set_title('Objective vs. Number of Endmembers')
    axes[0].grid(True, alpha=0.3)
    
    axes[1].plot(counts, errors, 'ro-', linewidth=2, markersize=8)
    axes[1].set_xlabel('Number of Endmembers')
    axes[1].set_ylabel('Reconstruction Error')
    axes[1].set_title('Error vs. Number of Endmembers')
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Analyze identifiability issues
print(f"\nIdentifiability Analysis:")
print(f"  True number of endmembers: {S_true.shape[1]}")
print(f"  Recommended: Use prior knowledge when possible")
print(f"  Challenge: Local minima depend on initialization")
print(f"  Solution: Multiple runs with different initializations")

## Reflection Questions

1. **What are the main challenges of blind unmixing compared to supervised unmixing?**

   *Your answer here*

2. **How does initialization affect the final result? Why?**

   *Your answer here*

3. **What role does the number of endmembers play in algorithm performance?**

   *Your answer here*

4. **Under what conditions might blind unmixing be preferred over supervised methods?**

   *Your answer here*

5. **How could you improve the robustness of blind unmixing algorithms?**

   *Your answer here*

## Summary and Best Practices

Based on this exercise, here are key takeaways for blind unmixing:

1. **Initialization matters**: Run multiple trials with different random seeds
2. **Convergence monitoring**: Always plot objective function evolution
3. **Parameter tuning**: Adjust step sizes and iteration counts based on data
4. **Validation**: Compare with known methods when possible
5. **Regularization**: Consider adding constraints or penalties for better solutions
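
The first takeaway can be turned into a small multi-start wrapper that keeps the run with the lowest final objective. This is a sketch under assumptions: `best_of_restarts` and `toy_run` are hypothetical names, and `toy_run` is only a stand-in that imitates the `(S, A, objective_history)` return shape of a BCD run.

```python
import numpy as np

def best_of_restarts(run_fn, seeds):
    """Call run_fn(seed) for each seed and keep the run with the lowest final objective."""
    best = None
    for seed in seeds:
        S, A, history = run_fn(seed)
        if best is None or history[-1] < best["objective"]:
            best = {"seed": seed, "S": S, "A": A, "objective": history[-1]}
    return best

def toy_run(seed):
    """Stand-in for a BCD run: returns dummy factors and a fake objective history."""
    rng = np.random.default_rng(seed)
    final_objective = float(rng.random())
    return np.zeros((2, 2)), np.zeros((2, 2)), [1.0, final_objective]

best = best_of_restarts(toy_run, seeds=range(42, 47))
print(f"best seed: {best['seed']}, objective: {best['objective']:.4f}")
```

In this notebook, `run_fn` could be a small function that calls `np.random.seed(seed)` and then `blind_unmixer.block_coordinate_descent(...)`, as in Task 7. The lowest objective does not guarantee the most physically meaningful endmembers, so the selected run should still be inspected.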

In [None]:
# Final comparison summary
print("\n" + "="*60)
print("FINAL COMPARISON: SUPERVISED vs BLIND UNMIXING")
print("="*60)

print(f"Supervised (SAM): {results_supervised['mean_sam_degrees']:.4f}°")
print(f"Blind (SAM):      {results_blind['mean_sam_degrees']:.4f}°")
print(f"Supervised (RMSE): {results_supervised['rmse']:.6f}")
print(f"Blind (RMSE):      {results_blind['rmse']:.6f}")

# Ratio of reconstruction SAM values; guard against a zero supervised SAM
performance_ratio = results_blind['mean_sam_degrees'] / max(results_supervised['mean_sam_degrees'], 1e-12)
print(f"\nBlind/Supervised performance ratio: {performance_ratio:.2f}")

if performance_ratio < 2.0:
    print("✓ Blind unmixing achieved reasonable performance!")
else:
    print("⚠ Blind unmixing significantly worse - try different parameters")

print("\nLab Exercise 4 completed!")