# Real-World Application: Instrument + Room Acoustic Pipeline

This notebook demonstrates a **cascaded Volterra system** for modeling a complete audio chain:

$$
\text{Input signal} \xrightarrow{\text{Instrument}} \text{Nonlinear output} \xrightarrow{\text{Room}} \text{Final recording}
$$

## Application: Electric Guitar + Amplifier + Room

We'll model a complete signal chain:
1. **Instrument stage**: Guitar → nonlinear saturation/distortion (MP or GMP)
2. **Room stage**: Acoustic response with reverb/reflections (linear convolution)
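
Written out, one common form of such a cascade is a memory polynomial followed by an FIR filter. The notation below is an assumption for illustration ($K$, $M$, and $L$ correspond to the `instrument_order`, `instrument_memory`, and `room_ir_length` settings used later); the exact parameterization inside `AcousticChain` may differ:

$$
z[n] = \sum_{k=1}^{K} \sum_{m=0}^{M-1} a_{k,m}\, x[n-m]^{k}, \qquad y[n] = \sum_{l=0}^{L-1} h[l]\, z[n-l]
$$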

**Why Volterra?**
- Captures harmonic distortion from tube amplifiers
- Models speaker nonlinearities (magnetic saturation)
- Handles memory effects (bias drift, thermal dynamics)

**Use cases:**
- Digital audio effects (amp simulators, distortion pedals)
- Virtual acoustics (room emulation)
- Audio forensics (device fingerprinting)
- Hearing aid nonlinearity compensation

---

## Setup

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

from volterra import AcousticChain, GeneralizedMemoryPolynomial

np.random.seed(1011)

plt.rcParams['figure.figsize'] = (14, 5)
plt.rcParams['font.size'] = 10

---

## 1. Generate Synthetic Instrument + Room System

We'll create:
- **Instrument**: Nonlinear saturation (third-order polynomial soft clipper followed by a short IIR resonance)
- **Room**: Linear impulse response (exponentially decaying reverb)
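
For the room, an amplitude envelope $e^{-\delta t}$ falls by 60 dB (a factor of 1000 in amplitude) at $t = \mathrm{RT60}$ when $\delta = 3\ln(10)/\mathrm{RT60}$. A quick numerical check of that relationship, as a standalone sketch (the helper name `decay_constant` is illustrative, not part of the `volterra` package):

```python
import numpy as np

def decay_constant(rt60):
    """Amplitude decay rate (1/s) such that the envelope drops 60 dB after rt60 seconds."""
    return 3 * np.log(10) / rt60

rt60 = 0.3
delta = decay_constant(rt60)
level_db = 20 * np.log10(np.exp(-delta * rt60))  # envelope level at t = rt60
print(f"decay constant: {delta:.2f} 1/s, level at RT60: {level_db:.1f} dB")
```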

In [None]:
# Configuration
fs = 48000  # Sampling rate
duration = 2.0  # seconds
n_samples = int(fs * duration)

# Generate input signal: guitar-like plucked tone
def generate_guitar_signal(fs, duration, f0=220):
    """
    Simulate a guitar pluck using Karplus-Strong algorithm.
    """
    t = np.arange(int(fs * duration)) / fs
    
    # Initial excitation: burst of noise
    excitation_length = int(fs / f0)  # One period
    excitation = np.random.randn(excitation_length) * 0.8
    excitation = np.concatenate([excitation, np.zeros(len(t) - excitation_length)])
    
    # Feedback filter (lowpass for realistic decay)
    delay = int(fs / f0)
    y = np.zeros_like(t)
    y[:delay] = excitation[:delay]
    
    alpha = 0.995  # Decay factor
    for i in range(delay, len(t)):
        y[i] = alpha * 0.5 * (y[i - delay] + y[i - delay - 1]) + excitation[i]
    
    # Add some harmonic content
    y += 0.2 * np.sin(2 * np.pi * f0 * t) * np.exp(-3 * t)
    y += 0.1 * np.sin(2 * np.pi * 2 * f0 * t) * np.exp(-4 * t)
    
    return y / np.max(np.abs(y)) * 0.6  # Normalize

x = generate_guitar_signal(fs, duration, f0=220)  # A3 note (220 Hz)

print(f"Generated guitar signal: {len(x)} samples ({duration}s at {fs} Hz)")
print(f"RMS: {np.sqrt(np.mean(x**2)):.4f}")
print(f"Peak: {np.max(np.abs(x)):.4f}")
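
The sample-by-sample feedback loop above is a linear recursion, so it can also be written as a single IIR filter call. The sketch below assumes the same recursion, $y[n] = x[n] + \tfrac{\alpha}{2}\,(y[n-D] + y[n-D-1])$, and uses `scipy.signal.lfilter`; the function name `karplus_strong` is hypothetical:

```python
import numpy as np
from scipy import signal

def karplus_strong(excitation, delay, alpha=0.995):
    """Run the averaged-feedback comb recursion over `excitation` as one IIR filter."""
    a = np.zeros(delay + 2)
    a[0] = 1.0
    a[delay] = -0.5 * alpha      # feedback tap at lag D
    a[delay + 1] = -0.5 * alpha  # feedback tap at lag D + 1
    return signal.lfilter([1.0], a, excitation)

fs, f0 = 48000, 220
delay = int(fs / f0)
rng = np.random.default_rng(0)
excitation = np.zeros(fs // 10)
excitation[:delay] = 0.8 * rng.standard_normal(delay)  # one period of noise
pluck = karplus_strong(excitation, delay)
```

Vectorizing this way avoids the Python-level loop, which matters for multi-second signals at 48 kHz.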

In [None]:
# Define instrument nonlinearity (amp distortion)
def instrument_distortion(x):
    """
    Tube amplifier-like soft saturation:
    - Linear at low levels
    - Soft clipping at high levels (tanh-like)
    - Adds harmonics
    """
    # Polynomial approximation of soft saturation
    gain = 1.5  # Drive
    x_driven = x * gain
    
    y_nl = (
        0.85 * x_driven +           # Linear (fundamental)
        0.12 * x_driven**2 +        # Even harmonics (2nd)
        -0.08 * x_driven**3         # Odd harmonics (3rd) + soft clipping
    )
    
    # Add memory via simple IIR (speaker resonance)
    b = [0.25, -0.45, 0.22]
    a = [1.0, -1.88, 0.92]
    y = signal.lfilter(b, a, y_nl)
    
    return y

# Define room impulse response (exponential decay reverb)
def generate_room_ir(fs, duration=0.8, rt60=0.3):
    """
    Generate a synthetic room impulse response.
    RT60: reverberation time (time for 60 dB decay)
    """
    n_samples = int(fs * duration)
    t = np.arange(n_samples) / fs
    
    # Exponential decay envelope
    decay_const = 3 * np.log(10) / rt60  # For 60 dB = factor of 1000
    envelope = np.exp(-decay_const * t)
    
    # Diffuse reverberant tail: Gaussian noise shaped by the decay envelope
    ir = envelope * np.random.randn(n_samples) * 0.1
    
    # Direct sound (delta at t=0)
    ir[0] += 1.0
    
    # Early reflections (first 50ms)
    early_times = [0.008, 0.015, 0.023, 0.035, 0.048]  # seconds
    for t_reflect in early_times:
        idx = int(t_reflect * fs)
        if idx < len(ir):
            ir[idx] += np.random.uniform(0.2, 0.5) * np.exp(-decay_const * t_reflect)
    
    return ir / np.max(np.abs(ir))  # Normalize

room_ir = generate_room_ir(fs, duration=0.8, rt60=0.3)

print(f"\nRoom impulse response: {len(room_ir)} samples ({len(room_ir)/fs:.3f}s)")
print(f"RT60: 0.3 seconds (typical small room)")

In [None]:
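# Create full chain: input → instrument → room → output
y_instrument = instrument_distortion(x)
# Causal convolution: keep the first n_samples of the full result.
# (mode='same' would return the centered part and advance the output
# by about half the IR length, which no causal model can reproduce.)
y_clean = signal.fftconvolve(y_instrument, room_ir, mode='full')[:n_samples]

# Add measurement noise
noise_std = 0.005
y_full = y_clean + np.random.randn(n_samples) * noise_std

print(f"Output signal RMS: {np.sqrt(np.mean(y_full**2)):.4f}")
print(f"SNR: ~{20 * np.log10(np.sqrt(np.mean(y_clean**2)) / noise_std):.1f} dB")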

In [None]:
# Visualize the acoustic chain
fig, axes = plt.subplots(2, 2, figsize=(16, 8))

# Time-domain signals
n_plot = int(0.5 * fs)  # First 0.5 seconds
t_plot = np.arange(n_plot) / fs

axes[0, 0].plot(t_plot, x[:n_plot], alpha=0.7, linewidth=1)
axes[0, 0].set_xlabel('Time (s)')
axes[0, 0].set_ylabel('Amplitude')
axes[0, 0].set_title('1. Input: Guitar Pluck')
axes[0, 0].grid(True, alpha=0.3)

axes[0, 1].plot(t_plot, y_instrument[:n_plot], alpha=0.7, linewidth=1, color='orange')
axes[0, 1].set_xlabel('Time (s)')
axes[0, 1].set_ylabel('Amplitude')
axes[0, 1].set_title('2. After Instrument (nonlinear distortion)')
axes[0, 1].grid(True, alpha=0.3)

axes[1, 0].plot(t_plot, y_full[:n_plot], alpha=0.7, linewidth=1, color='green')
axes[1, 0].set_xlabel('Time (s)')
axes[1, 0].set_ylabel('Amplitude')
axes[1, 0].set_title('3. Final Output (after room)')
axes[1, 0].grid(True, alpha=0.3)

# Room impulse response
t_ir = np.arange(len(room_ir)) / fs * 1000  # milliseconds
axes[1, 1].plot(t_ir, room_ir, alpha=0.7, linewidth=1, color='red')
axes[1, 1].set_xlabel('Time (ms)')
axes[1, 1].set_ylabel('Amplitude')
axes[1, 1].set_title('Room Impulse Response')
axes[1, 1].set_xlim([0, 300])
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Spectral analysis: compare input, instrument, and room outputs
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

for ax, signal_data, title in zip(axes, [x, y_instrument, y_full], 
                                   ['Input', 'After Instrument', 'Final Output']):
    f, Pxx = signal.welch(signal_data, fs=fs, nperseg=4096)
    ax.semilogy(f / 1000, Pxx, alpha=0.7, linewidth=1.5)
    ax.set_xlabel('Frequency (kHz)')
    ax.set_ylabel('PSD (V²/Hz)')
    ax.set_title(title)
    ax.set_xlim([0, 5])
    ax.grid(True, alpha=0.3)
    
    # Mark fundamental and harmonics
    f0 = 220  # Hz
    for harmonic in [1, 2, 3, 4]:
        ax.axvline(harmonic * f0 / 1000, color='red', linestyle='--', 
                   linewidth=0.8, alpha=0.5)

plt.suptitle('Spectral Evolution Through Acoustic Chain (harmonics marked in red)', 
             fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("Notice the added harmonics from instrument distortion!")
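
To see why a polynomial nonlinearity adds harmonics, pass a pure tone through the same polynomial and look at which FFT bins are populated. This is a standalone sketch with assumed test parameters (a 1 kHz tone, one second at 48 kHz), not part of the notebook's pipeline:

```python
import numpy as np

fs = 48000
f0 = 1000                      # test tone (Hz); integer number of cycles in 1 s
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * f0 * t)

# Same polynomial coefficients as in instrument_distortion (without the drive and IIR stage)
y = 0.85 * x + 0.12 * x**2 - 0.08 * x**3

spectrum = np.abs(np.fft.rfft(y)) / len(y)   # 1 Hz bin spacing
for k in (1, 2, 3, 4):
    print(f"harmonic {k} ({k * f0} Hz): {20 * np.log10(spectrum[k * f0] + 1e-15):.1f} dB")
```

The second-order term populates the 2nd harmonic (and DC), the third-order term the 3rd harmonic, and no harmonic above the polynomial order appears.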

---

## 2. Model the Acoustic Chain with Volterra

We'll use `AcousticChain` to jointly identify:
- **Instrument model**: Nonlinear (MP or GMP)
- **Room model**: Linear FIR filter
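
As a rough picture of what fitting the instrument stage involves, a memory polynomial is linear in its coefficients, so it can be estimated by regularized least squares. The sketch below is a minimal NumPy version under that assumption; `mp_regressors` and `fit_mp_ridge` are hypothetical helpers, and `AcousticChain.fit` may use a different algorithm internally:

```python
import numpy as np

def mp_regressors(x, memory, order):
    """Stack x[n-m]**k for m < memory, k = 1..order into a (len(x), memory*order) matrix."""
    n = len(x)
    cols = []
    for k in range(1, order + 1):
        for m in range(memory):
            delayed = np.concatenate([np.zeros(m), x[:n - m]])  # x[n-m], zero-padded
            cols.append(delayed ** k)
    return np.column_stack(cols)

def fit_mp_ridge(x, y, memory, order, lambda_reg=1e-5):
    """Solve min ||Phi a - y||^2 + lambda ||a||^2 via the normal equations."""
    phi = mp_regressors(x, memory, order)
    gram = phi.T @ phi + lambda_reg * np.eye(phi.shape[1])
    return np.linalg.solve(gram, phi.T @ y)

# Synthetic check: recover known coefficients from noiseless data
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
a_true = rng.normal(size=2 * 3)              # memory=2, order=3
y = mp_regressors(x, 2, 3) @ a_true
a_hat = fit_mp_ridge(x, y, memory=2, order=3, lambda_reg=1e-8)
print(np.max(np.abs(a_hat - a_true)))
```

The `lambda_reg` term plays the same role as the regularization setting passed to `AcousticChain`: it keeps the solve well conditioned when the polynomial regressors are strongly correlated.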

In [None]:
# Split data for training and testing
n_train = int(0.7 * n_samples)
x_train, x_test = x[:n_train], x[n_train:]
y_train, y_test = y_full[:n_train], y_full[n_train:]

print(f"Training samples: {n_train} ({n_train/fs:.2f}s)")
print(f"Testing samples: {len(x_test)} ({len(x_test)/fs:.2f}s)")

In [None]:
# Create AcousticChain model
chain = AcousticChain(
    instrument_memory=8,      # Memory length for instrument nonlinearity
    instrument_order=3,       # Nonlinearity order (cubic)
    room_ir_length=2048,      # Room IR length (42.7 ms at 48 kHz; shorter than the 0.8 s simulated IR)
    lambda_reg=1e-5           # Regularization
)

print("AcousticChain configuration:")
print(f"  Instrument: Memory={chain.instrument_memory}, Order={chain.instrument_order}")
print(f"  Room IR length: {chain.room_ir_length} samples ({chain.room_ir_length/fs*1000:.1f} ms)")
print(f"  Regularization: {chain.lambda_reg}")

In [None]:
# Fit the acoustic chain
import time

print("Fitting acoustic chain model...\n")
start = time.time()

chain.fit(x_train, y_train)

fit_time = time.time() - start

print(f"\nModel fitted in {fit_time:.2f} seconds")
print(f"Instrument model parameters: {chain.instrument_model.coeffs_.size}")
print(f"Room IR parameters: {len(chain.room_ir)}")

---

## 3. Evaluate Model Performance
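
The figure of merit used below is the normalized mean squared error in decibels, $10\log_{10}\big(\sum (y - \hat{y})^2 / \sum y^2\big)$; more negative is better. A small helper makes the definition explicit (the name `nmse_db` is illustrative):

```python
import numpy as np

def nmse_db(y_true, y_pred):
    """Normalized mean squared error in dB (error power relative to signal power)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 10 * np.log10(np.mean((y_true - y_pred) ** 2) / np.mean(y_true ** 2))

# A prediction with a 10% amplitude error everywhere sits at -20 dB
y = np.sin(np.linspace(0, 20, 1000))
print(f"{nmse_db(y, 0.9 * y):.1f} dB")
```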

In [None]:
# Predict on test set
y_test_pred = chain.predict(x_test)

# Align the ground truth with the prediction, accounting for both instrument and room memory
total_memory = chain.instrument_memory + chain.room_ir_length - 1
y_test_trimmed = y_test[total_memory - 1:len(y_test_pred) + total_memory - 1]

print(f"Prediction shape: {y_test_pred.shape}")
print(f"Ground truth (trimmed) shape: {y_test_trimmed.shape}")

# Compute NMSE
mse = np.mean((y_test_trimmed - y_test_pred) ** 2)
signal_power = np.mean(y_test_trimmed ** 2)
nmse_db = 10 * np.log10(mse / signal_power)

print(f"\nTest NMSE: {nmse_db:.2f} dB")
print(f"Test MSE: {mse:.6f}")
print(f"\nInterpretation:")
if nmse_db < -20:
    print("  ✅ Excellent fit! Model captured instrument + room dynamics")
elif nmse_db < -10:
    print("  ⚠️  Good fit, but some details missing (try higher order or longer memory)")
else:
    print("  ❌ Poor fit (check model configuration)")

In [None]:
# Visualize test set predictions
fig, axes = plt.subplots(2, 2, figsize=(16, 8))

n_plot = int(0.4 * fs)
t_plot = np.arange(n_plot) / fs

# Time-domain comparison
axes[0, 0].plot(t_plot, y_test_trimmed[:n_plot], label='True output', alpha=0.7, linewidth=1.5)
axes[0, 0].plot(t_plot, y_test_pred[:n_plot], label='Chain prediction', alpha=0.7, linewidth=1.5, linestyle='--')
axes[0, 0].set_xlabel('Time (s)')
axes[0, 0].set_ylabel('Amplitude')
axes[0, 0].set_title(f'Time Domain Prediction (NMSE: {nmse_db:.2f} dB)')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# Prediction error
error = y_test_trimmed[:n_plot] - y_test_pred[:n_plot]
axes[0, 1].plot(t_plot, error, alpha=0.7, color='red')
axes[0, 1].axhline(0, color='black', linestyle='--', linewidth=0.8)
axes[0, 1].set_xlabel('Time (s)')
axes[0, 1].set_ylabel('Error')
axes[0, 1].set_title(f'Prediction Error (RMS: {np.sqrt(mse):.5f})')
axes[0, 1].grid(True, alpha=0.3)

# Spectral comparison
f, Pyy_true = signal.welch(y_test_trimmed, fs=fs, nperseg=4096)
f, Pyy_pred = signal.welch(y_test_pred, fs=fs, nperseg=4096)
axes[1, 0].semilogy(f / 1000, Pyy_true, label='True', alpha=0.7, linewidth=1.5)
axes[1, 0].semilogy(f / 1000, Pyy_pred, label='Predicted', alpha=0.7, linewidth=1.5, linestyle='--')
axes[1, 0].set_xlabel('Frequency (kHz)')
axes[1, 0].set_ylabel('PSD (V²/Hz)')
axes[1, 0].set_title('Spectral Comparison')
axes[1, 0].set_xlim([0, 4])
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)

# Scatter plot
axes[1, 1].scatter(y_test_trimmed[::20], y_test_pred[::20], alpha=0.3, s=2)
y_range = [y_test_trimmed.min(), y_test_trimmed.max()]
axes[1, 1].plot(y_range, y_range, 'r--', linewidth=2, label='Perfect')
axes[1, 1].set_xlabel('True output')
axes[1, 1].set_ylabel('Predicted output')
axes[1, 1].set_title('Predicted vs. True')
axes[1, 1].legend()
axes[1, 1].grid(True, alpha=0.3)
axes[1, 1].axis('equal')

plt.tight_layout()
plt.show()

---

## 4. Analyze Learned Components

Let's examine what the model learned about the instrument and room.

In [None]:
# Extract learned instrument kernels
instrument_kernels = chain.instrument_model.coeffs_

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

for n in range(chain.instrument_order):
    h_n = instrument_kernels[:, n]
    
    axes[n].stem(range(len(h_n)), h_n, basefmt=' ')
    axes[n].set_xlabel('Memory lag')
    axes[n].set_ylabel(f'h_{n+1}[m]')
    axes[n].set_title(f'Instrument Kernel: Order {n+1}')
    axes[n].grid(True, alpha=0.3)

plt.suptitle('Learned Instrument Nonlinearity', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

print("Instrument kernel interpretation:")
print("  - Order 1 (linear): Main signal path + frequency shaping")
print("  - Order 2 (quadratic): Even harmonics (2nd, 4th, ...)")
print("  - Order 3 (cubic): Odd harmonics (3rd, 5th, ...) + soft clipping")

In [None]:
# Extract learned room IR
learned_room_ir = chain.room_ir

fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Time domain comparison
t_ir = np.arange(len(room_ir)) / fs * 1000
t_learned = np.arange(len(learned_room_ir)) / fs * 1000

axes[0].plot(t_ir, room_ir, label='True room IR', alpha=0.7, linewidth=1.5)
axes[0].plot(t_learned[:len(room_ir)], learned_room_ir[:len(room_ir)], 
             label='Learned room IR', alpha=0.7, linewidth=1.5, linestyle='--')
axes[0].set_xlabel('Time (ms)')
axes[0].set_ylabel('Amplitude')
axes[0].set_title('Room Impulse Response: Time Domain')
axes[0].set_xlim([0, 300])
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Frequency domain comparison
H_true = np.fft.rfft(room_ir, n=8192)
H_learned = np.fft.rfft(learned_room_ir, n=8192)
freqs = np.fft.rfftfreq(8192, 1/fs)

axes[1].semilogx(freqs[1:] / 1000, 20 * np.log10(np.abs(H_true[1:])), 
                 label='True', alpha=0.7, linewidth=1.5)
axes[1].semilogx(freqs[1:] / 1000, 20 * np.log10(np.abs(H_learned[1:])), 
                 label='Learned', alpha=0.7, linewidth=1.5, linestyle='--')
axes[1].set_xlabel('Frequency (kHz)')
axes[1].set_ylabel('Magnitude (dB)')
axes[1].set_title('Room Response: Frequency Domain')
axes[1].set_xlim([0.1, 10])
axes[1].set_ylim([-40, 5])
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nRoom IR analysis:")
print(f"  True IR length: {len(room_ir)} samples")
print(f"  Learned IR length: {len(learned_room_ir)} samples")
print(f"  Correlation: {np.corrcoef(room_ir[:min(len(room_ir), len(learned_room_ir))], learned_room_ir[:min(len(room_ir), len(learned_room_ir))])[0, 1]:.3f}")

---

## 5. Application: Real-Time Processing

Once trained, the model can be used for real-time audio processing.
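
Block-based processing only matches offline processing if filter state is carried from one block to the next. For the linear room stage this can be sketched with `scipy.signal.lfilter` and its `zi` argument; the example uses a made-up 64-tap impulse response and is independent of `AcousticChain`, whose streaming interface is not shown in this notebook:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
h = rng.standard_normal(64) * np.exp(-np.arange(64) / 16)  # toy room IR (assumption)
x = rng.standard_normal(4096)
block_size = 512

state = np.zeros(len(h) - 1)          # FIR delay line, initially at rest
blocks = []
for start in range(0, len(x), block_size):
    y_block, state = signal.lfilter(h, [1.0], x[start:start + block_size], zi=state)
    blocks.append(y_block)
y_stream = np.concatenate(blocks)

y_offline = signal.lfilter(h, [1.0], x)
print(np.allclose(y_stream, y_offline))
```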

In [None]:
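# Simulate real-time processing with block-based prediction
block_size = 512  # typical audio buffer size

# chain.predict() is called statelessly here, so each call must also receive
# the past samples that the instrument and room memories reach back to.
context = chain.instrument_memory + chain.room_ir_length - 2
first_block = -(-context // block_size)  # first block with full history (ceil division)
n_blocks = len(x_test) // block_size - first_block

print(f"Simulating real-time processing:")
print(f"  Block size: {block_size} samples ({block_size/fs*1000:.2f} ms latency)")
print(f"  Number of blocks: {n_blocks}")

# Process block-by-block
y_realtime = []
processing_times = []

for i in range(first_block, first_block + n_blocks):
    start_idx = i * block_size
    end_idx = start_idx + block_size
    
    # Current block preceded by its history
    x_block = x_test[start_idx - context:end_idx]
    
    # Time the prediction
    start = time.perf_counter()
    y_block = chain.predict(x_block)[-block_size:]  # keep only the new samples
    elapsed = time.perf_counter() - start
    
    y_realtime.append(y_block)
    processing_times.append(elapsed)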
y_realtime = np.concatenate(y_realtime)

# Analyze real-time performance
avg_time = np.mean(processing_times) * 1000  # Convert to ms
max_time = np.max(processing_times) * 1000
realtime_factor = (block_size / fs * 1000) / avg_time  # How many times faster than real-time

print(f"\nReal-time performance:")
print(f"  Average processing time: {avg_time:.3f} ms")
print(f"  Max processing time: {max_time:.3f} ms")
print(f"  Block duration: {block_size/fs*1000:.2f} ms")
print(f"  Real-time factor: {realtime_factor:.1f}× (higher is better)")

if realtime_factor > 1:
    print(f"  ✅ Real-time capable! ({realtime_factor:.1f}× faster than real-time)")
else:
    print(f"  ❌ Not real-time (too slow by {1/realtime_factor:.1f}×)")

---

## Summary

In this notebook, we:

1. **Simulated a complete acoustic chain** (instrument + room)
2. **Modeled the system using AcousticChain** (cascaded Volterra)
3. **Evaluated model performance** on test data
4. **Analyzed learned components** (instrument kernels and room IR)
5. **Demonstrated real-time processing** capability

### Key takeaways:
- **Cascaded Volterra systems** can model multi-stage audio chains
- **Joint identification** learns both the nonlinear instrument and the linear room
- **Real-time processing** is feasible with an efficient implementation
- **Spectral fidelity**: how well harmonics and reverb are captured should be read off the NMSE and spectral plots; the 2048-tap room model spans only the first 42.7 ms of the 0.8 s simulated IR

### Practical applications:
1. **Digital audio effects**:
   - Amp simulators (guitar, bass)
   - Vintage gear emulation (tape saturation, tube warmth)
   - Pedal modeling (distortion, overdrive)

2. **Virtual acoustics**:
   - Concert hall simulation
   - Studio monitoring emulation
   - Headphone spatialization

3. **System identification**:
   - Loudspeaker characterization
   - Microphone calibration
   - Audio codec modeling

4. **Audio forensics**:
   - Device fingerprinting
   - Room acoustics analysis
   - Source separation

### Extensions:
- **Multiple instruments**: Use MIMO TT-Volterra for multi-microphone setups
- **Time-varying systems**: Track parameter drift with RLS/adaptive filtering
- **Higher-order models**: Capture more complex nonlinearities (N > 3)
- **GPU acceleration**: Use JAX/PyTorch for real-time on embedded devices
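
For the time-varying case, the simplest member of the adaptive-filter family is normalized LMS (NLMS), shown here instead of RLS because it fits in a few lines. This is a sketch for a linear FIR stage only, with an assumed step size and a toy 8-tap system; it is not part of the `volterra` API:

```python
import numpy as np

def nlms(x, d, n_taps, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that w @ [x[n], x[n-1], ...] tracks d[n]."""
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)                 # most recent sample first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        err = d[n] - w @ buf
        w += mu * err * buf / (buf @ buf + eps)   # normalized gradient step
    return w

rng = np.random.default_rng(0)
h_true = rng.standard_normal(8)
x = rng.standard_normal(5000)
d = np.convolve(x, h_true)[:len(x)]        # noiseless desired signal
w = nlms(x, d, n_taps=8)
print(np.max(np.abs(w - h_true)))
```

Because the weights are updated on every sample, the same loop keeps following the system if `h_true` drifts slowly over time.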

### Model deployment:
```python
# Save trained model
import pickle
with open('guitar_amp_chain.pkl', 'wb') as f:
    pickle.dump(chain, f)

# Load and use in production
with open('guitar_amp_chain.pkl', 'rb') as f:
    chain_loaded = pickle.load(f)

# Process audio stream
y_processed = chain_loaded.predict(x_stream)
```
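
Pickle files can execute arbitrary code on load and are tied to the installed class definitions, so when only the learned arrays are needed, plain NumPy archives are an alternative. The sketch below uses stand-in arrays; with a fitted model they would presumably be `chain.instrument_model.coeffs_` and `chain.room_ir` as accessed earlier in this notebook:

```python
import os
import tempfile
import numpy as np

# Stand-ins for the learned parameters (shapes are assumptions)
instrument_coeffs = np.arange(24, dtype=float).reshape(8, 3)
room_ir = np.linspace(1.0, 0.0, 2048)

path = os.path.join(tempfile.mkdtemp(), "guitar_amp_chain.npz")
np.savez(path, instrument_coeffs=instrument_coeffs, room_ir=room_ir)

with np.load(path) as data:        # allow_pickle defaults to False
    coeffs_loaded = data["instrument_coeffs"]
    room_ir_loaded = data["room_ir"]
```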

---

This completes the notebook series! You now have a complete toolkit for:
- Memory Polynomial (MP) modeling
- Generalized Memory Polynomial (GMP) with cross-terms
- Tensor-Train Volterra for full MIMO systems
- Automatic model selection
- Real-world acoustic chain applications