# VSP Analysis with DAS Data

This notebook demonstrates Vertical Seismic Profiling (VSP) processing workflows for Distributed Acoustic Sensing (DAS) data.

## Overview

VSP is a seismic technique where sensors are placed at various depths in a borehole to record seismic waves generated by a surface source. When combined with DAS technology, fiber optic cables provide high-resolution, continuous measurements along the entire wellbore.

### Key VSP Processing Steps:
1. Data loading and quality control
2. First break picking
3. Upgoing/downgoing wave separation (median filtering)
4. Corridor stack generation
5. Velocity analysis
6. Well-to-seismic tie

### Learning Objectives:
- Understand DAS-VSP data characteristics
- Implement wavefield separation techniques
- Generate corridor stacks for well ties
- Handle real-world data quality issues

In [None]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal, interpolate
from scipy.ndimage import median_filter
import h5py
import warnings
warnings.filterwarnings('ignore')

# Set plotting parameters
plt.rcParams['figure.figsize'] = (14, 8)
plt.rcParams['font.size'] = 10

print("Libraries imported successfully")

## 1. Synthetic VSP Data Generation

For demonstration purposes, we'll create synthetic DAS-VSP data that mimics real acquisition scenarios.
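
The synthetic model in the next cell reduces to three arrival-time relations and a Ricker wavelet. As a compact summary (the symbols are notation introduced here, not names from the code): $z$ is the receiver depth, $z_r$ a reflector depth, $v$ the formation velocity, $v_t$ the tube-wave velocity, $f_p$ the wavelet peak frequency, and $t_0$ a constant delay standing in for the fixed sample offset added to every arrival.

$$
t_{\mathrm{down}}(z) = t_0 + \frac{z}{v}, \qquad
t_{\mathrm{up}}(z) = t_0 + \frac{2 z_r - z}{v} \quad (z < z_r), \qquad
t_{\mathrm{tube}}(z) = t_0 + \frac{z}{v_t}
$$

$$
w(t) = \left(1 - 2\pi^2 f_p^2 t^2\right) e^{-\pi^2 f_p^2 t^2}
$$

The downgoing and upgoing arrivals have slopes of equal magnitude and opposite sign in the time-depth plane ($\pm 1/v$), which is the property the wavefield separation step relies on.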

In [None]:
def generate_synthetic_vsp_data(n_channels=500, n_samples=2000, fs=1000, 
                                channel_spacing=1.0, velocity=3000):
    """
    Generate synthetic DAS-VSP data with downgoing and upgoing waves.
    
    Parameters:
    -----------
    n_channels : int
        Number of depth channels (DAS channels along fiber)
    n_samples : int
        Number of time samples
    fs : float
        Sampling frequency in Hz
    channel_spacing : float
        Spatial sampling in meters
    velocity : float
        Wave propagation velocity in m/s
    
    Returns:
    --------
    data : ndarray
        Synthetic VSP data (n_channels x n_samples)
    time : ndarray
        Time axis in seconds
    depth : ndarray
        Depth axis in meters
    """
    # Create axes
    time = np.arange(n_samples) / fs
    depth = np.arange(n_channels) * channel_spacing
    
    # Initialize data array
    data = np.zeros((n_channels, n_samples))
    
    # Create source wavelet (Ricker wavelet)
    f_peak = 50  # Hz
    t_wavelet = np.arange(-0.05, 0.05, 1/fs)
    wavelet = (1 - 2*(np.pi*f_peak*t_wavelet)**2) * np.exp(-(np.pi*f_peak*t_wavelet)**2)
    
    # Add downgoing wave
    for i, d in enumerate(depth):
        travel_time = d / velocity
        arrival_sample = int(travel_time * fs) + 200  # offset for visualization
        
        if arrival_sample < n_samples - len(wavelet):
            data[i, arrival_sample:arrival_sample+len(wavelet)] += wavelet * np.exp(-d/2000)
    
    # Add upgoing reflections (from multiple reflectors)
    reflector_depths = [1500, 2500, 3500]  # meters
    reflector_coeffs = [-0.3, 0.5, -0.4]    # reflection coefficients
    
    for refl_depth, refl_coef in zip(reflector_depths, reflector_coeffs):
        for i, d in enumerate(depth):
            if d < refl_depth:
                # Travel time down to the reflector and back up to the receiver
                travel_time = (2*refl_depth - d) / velocity
                arrival_sample = int(travel_time * fs) + 200
                
                if arrival_sample < n_samples - len(wavelet):
                    data[i, arrival_sample:arrival_sample+len(wavelet)] += \
                        wavelet * refl_coef * np.exp(-refl_depth/2000)
    
    # Add realistic noise
    noise_level = 0.05
    data += np.random.randn(n_channels, n_samples) * noise_level
    
    # Add coherent noise (tube waves)
    tube_wave_velocity = 1500  # m/s (slower than formation)
    for i, d in enumerate(depth):
        travel_time = d / tube_wave_velocity
        arrival_sample = int(travel_time * fs) + 200
        
        if arrival_sample < n_samples - len(wavelet):
            tube_wavelet = wavelet * 0.3 * np.exp(-d/1000)
            data[i, arrival_sample:arrival_sample+len(wavelet)] += tube_wavelet
    
    return data, time, depth

# Generate synthetic VSP data
vsp_data, time_axis, depth_axis = generate_synthetic_vsp_data(
    n_channels=500, 
    n_samples=2000, 
    fs=1000,
    channel_spacing=1.0,
    velocity=3000
)

print(f"VSP Data shape: {vsp_data.shape}")
print(f"Time range: {time_axis[0]:.3f} - {time_axis[-1]:.3f} s")
print(f"Depth range: {depth_axis[0]:.1f} - {depth_axis[-1]:.1f} m")

## 2. Data Visualization and Quality Control

The first step in any VSP processing workflow is to visualize the raw data and assess its quality.

In [None]:
def plot_vsp_data(data, time, depth, title="VSP Data", vmin=None, vmax=None, cmap='seismic'):
    """
    Plot VSP data with proper scaling and labels.
    """
    if vmin is None:
        vmin = -np.percentile(np.abs(data), 98)
    if vmax is None:
        vmax = np.percentile(np.abs(data), 98)
    
    fig, ax = plt.subplots(figsize=(12, 8))
    
    im = ax.imshow(data, aspect='auto', cmap=cmap, 
                   extent=[time[0]*1000, time[-1]*1000, depth[-1], depth[0]],
                   vmin=vmin, vmax=vmax, interpolation='bilinear')
    
    ax.set_xlabel('Time (ms)', fontsize=12)
    ax.set_ylabel('Depth (m)', fontsize=12)
    ax.set_title(title, fontsize=14, fontweight='bold')
    
    plt.colorbar(im, ax=ax, label='Amplitude')
    plt.tight_layout()
    return fig, ax

# Plot raw VSP data
fig, ax = plot_vsp_data(vsp_data, time_axis, depth_axis, title="Raw DAS-VSP Data")
plt.show()

print("\n✓ Raw data visualization complete")
print("\nObservations:")
print("- Downgoing direct wave (linear moveout from top-left)")
print("- Upgoing reflections (opposite moveout)")
print("- Tube waves (slower linear events)")

## 3. First Break Picking

First breaks mark the arrival time of the downgoing direct wave at each depth. This is crucial for:
- Velocity analysis
- Time-to-depth conversion
- Wavefield separation
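
As an illustration of automatic picking on a single trace, here is a minimal STA/LTA (short-term average over long-term average) sketch. It is a generic variant, not the picker used in this notebook: the function name `sta_lta_pick`, the window lengths, and the choice of a trailing LTA window with a leading STA window are all assumptions made for the example.

```python
import numpy as np

def sta_lta_pick(trace, dt, sta_len=0.01, lta_len=0.1):
    """Return the time (s) at which the STA/LTA energy ratio is largest.

    Hypothetical helper: the LTA window covers the samples just before the
    candidate pick, the STA window covers the samples just after it.
    """
    n_sta = max(1, int(round(sta_len / dt)))
    n_lta = max(n_sta + 1, int(round(lta_len / dt)))
    energy = np.asarray(trace, dtype=float) ** 2

    # Cumulative sum lets every window sum be read off in constant time
    csum = np.concatenate(([0.0], np.cumsum(energy)))

    # Candidate boundaries that leave room for both windows
    idx = np.arange(n_lta, len(energy) - n_sta + 1)
    lta = (csum[idx] - csum[idx - n_lta]) / n_lta
    sta = (csum[idx + n_sta] - csum[idx]) / n_sta
    ratio = sta / (lta + 1e-12)

    return idx[np.argmax(ratio)] * dt

# Toy trace: weak noise, then a 50 Hz arrival starting at 0.400 s
rng = np.random.default_rng(0)
dt = 0.001
trace = 0.01 * rng.standard_normal(1000)
trace[400:] += np.sin(2 * np.pi * 50 * np.arange(600) * dt)

pick = sta_lta_pick(trace, dt)
print(f"Picked onset at {pick:.3f} s")
```

On real DAS data the window lengths usually need tuning to the dominant period and the noise level, and picks should still be reviewed by eye.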

In [None]:
def pick_first_breaks(data, time, depth, search_window=(0.15, 0.8), 
                      method='energy_ratio'):
    """
    Automatic first break picking using energy ratio method.
    
    Parameters:
    -----------
    data : ndarray
        VSP data (n_channels x n_samples)
    time : ndarray
        Time axis
    depth : ndarray
        Depth axis
    search_window : tuple
        (start_time, end_time) in seconds for searching
    method : str
        Picking method. Only 'energy_ratio' is implemented here.
    
    Returns:
    --------
    first_breaks : ndarray
        First break times for each channel
    """
    n_channels = data.shape[0]
    first_breaks = np.zeros(n_channels)
    
    # Find search window indices
    dt = time[1] - time[0]
    start_idx = int(search_window[0] / dt)
    end_idx = int(search_window[1] / dt)
    
    if method == 'energy_ratio':
        for i in range(n_channels):
            trace = data[i, start_idx:end_idx]
            
            # Calculate energy ratio
            window_len = 20
            energy_ratio = np.zeros(len(trace) - 2*window_len)
            
            for j in range(len(energy_ratio)):
                pre_energy = np.sum(trace[j:j+window_len]**2)
                post_energy = np.sum(trace[j+window_len:j+2*window_len]**2)
                energy_ratio[j] = post_energy / (pre_energy + 1e-10)
            
            # Pick maximum energy ratio
            pick_idx = np.argmax(energy_ratio) + window_len + start_idx
            first_breaks[i] = time[pick_idx]
    
    else:
        raise ValueError(f"Unsupported picking method: {method!r}")

    return first_breaks

# Pick first breaks
first_breaks = pick_first_breaks(vsp_data, time_axis, depth_axis)

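# Calculate interval velocity.
# Picks are quantized to the sample interval, and the moveout between adjacent
# channels can be smaller than one sample, so smooth the picks over depth
# before differentiating (the 51-channel window is an illustrative choice).
fb_smooth = signal.savgol_filter(first_breaks, window_length=51, polyorder=2)
slowness = np.gradient(fb_smooth, depth_axis)
with np.errstate(divide='ignore', invalid='ignore'):
    interval_velocity = np.where(slowness > 0, 1.0 / slowness, np.nan)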

# Plot first breaks and velocity
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))

# Plot data with first breaks
im = ax1.imshow(vsp_data, aspect='auto', cmap='seismic',
                extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
                vmin=-np.percentile(np.abs(vsp_data), 98),
                vmax=np.percentile(np.abs(vsp_data), 98))
ax1.plot(first_breaks*1000, depth_axis, 'r-', linewidth=2, label='First Breaks')
ax1.set_xlabel('Time (ms)')
ax1.set_ylabel('Depth (m)')
ax1.set_title('First Break Picks')
ax1.legend()

# Plot interval velocity
ax2.plot(interval_velocity, depth_axis, 'b-', linewidth=2)
ax2.set_xlabel('Interval Velocity (m/s)')
ax2.set_ylabel('Depth (m)')
ax2.set_title('Interval Velocity from First Breaks')
ax2.grid(True, alpha=0.3)
ax2.invert_yaxis()

plt.tight_layout()
plt.show()

print(f"✓ First breaks picked for {len(first_breaks)} channels")
print(f"Median interval velocity: {np.nanmedian(interval_velocity[np.isfinite(interval_velocity)]):.1f} m/s")

## 4. Wavefield Separation: Downgoing vs Upgoing

One of the most critical steps in VSP processing is separating the downgoing waves from upgoing reflections.

### Method: Median Filtering
- Exploits the opposite moveout of upgoing and downgoing waves
- Median filter in time-depth domain removes events with specific moveout
- Filter length controls which events are separated
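
A median filter applied across depth keeps whatever is flat from channel to channel and rejects what dips, so the wavefield to be kept is normally flattened on its first-break times before filtering and shifted back afterwards. The toy example below is a sketch under simplifying assumptions (single-sample spikes instead of wavelets, exact integer shifts, hypothetical variable names) that contrasts filtering with and without that alignment.

```python
import numpy as np
from scipy.ndimage import median_filter

n_ch, n_t = 41, 200
channels = np.arange(n_ch)
fb_samples = 20 + 2 * channels     # downgoing: arrives later with depth
up_samples = 150 - 2 * channels    # upgoing: arrives earlier with depth

data = np.zeros((n_ch, n_t))
data[channels, fb_samples] += 1.0  # downgoing spike, amplitude 1.0
data[channels, up_samples] += 0.5  # upgoing spike, amplitude 0.5

# Without alignment both events dip, so the depth median keeps neither
naive_down = median_filter(data, size=(9, 1), mode='reflect')

# Flatten the downgoing event, filter along depth, then undo the shift
aligned = np.stack([np.roll(tr, -s) for tr, s in zip(data, fb_samples)])
down_aligned = median_filter(aligned, size=(9, 1), mode='reflect')
downgoing = np.stack([np.roll(tr, s) for tr, s in zip(down_aligned, fb_samples)])
upgoing = data - downgoing

print("Energy kept without alignment:", np.abs(naive_down).sum())
print("Downgoing amplitude recovered:", downgoing[channels, fb_samples].mean())
print("Upgoing amplitude recovered:", upgoing[channels, up_samples].mean())
```

With band-limited wavelets and sub-sample moveout per channel the contrast is less stark than with spikes, but the same reasoning applies: the filter length has to span enough channels for the unwanted event to move by a good fraction of its period.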

In [None]:
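def separate_wavefields(data, filter_length=15, first_breaks=None, dt=None):
    """
    Separate upgoing and downgoing waves using median filtering.
    
    Parameters:
    -----------
    data : ndarray
        Input VSP data (n_channels x n_samples)
    filter_length : int
        Median filter length (odd number)
        - Smaller values: preserve more high-frequency content
        - Larger values: stronger separation but may distort signals
    first_breaks : ndarray, optional
        First break times (s) used to flatten the downgoing wave before
        filtering. If omitted, the data are filtered without alignment.
    dt : float, optional
        Time sampling interval (s); required together with first_breaks
    
    Returns:
    --------
    downgoing : ndarray
        Downgoing wavefield
    upgoing : ndarray
        Upgoing wavefield (reflections)
    """
    # Ensure filter length is odd
    if filter_length % 2 == 0:
        filter_length += 1
    
    # Per-channel shift (in samples) that flattens the downgoing wave
    n_channels = data.shape[0]
    if first_breaks is not None and dt is not None:
        shifts = np.round(np.asarray(first_breaks) / dt).astype(int)
    else:
        shifts = np.zeros(n_channels, dtype=int)
    
    # Align on first breaks: downgoing events become flat across depth,
    # while upgoing events keep a steep moveout
    aligned = np.stack([np.roll(data[i], -shifts[i]) for i in range(n_channels)])
    
    # Median filter along the depth axis keeps the flat (downgoing) events
    down_aligned = median_filter(aligned, size=(filter_length, 1), mode='reflect')
    
    # Undo the alignment shift
    downgoing = np.stack([np.roll(down_aligned[i], shifts[i]) for i in range(n_channels)])
    
    # Upgoing = Total - Downgoing
    upgoing = data - downgoing
    
    return downgoing, upgoing

# Separate wavefields (flattened on the first breaks picked above)
downgoing_wave, upgoing_wave = separate_wavefields(
    vsp_data, filter_length=15,
    first_breaks=first_breaks, dt=time_axis[1] - time_axis[0]
)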

# Plot separated wavefields
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

titles = ['Raw VSP Data', 'Downgoing Wavefield', 'Upgoing Wavefield (Reflections)']
datasets = [vsp_data, downgoing_wave, upgoing_wave]

for ax, data, title in zip(axes, datasets, titles):
    vmax = np.percentile(np.abs(data), 98)
    im = ax.imshow(data, aspect='auto', cmap='seismic',
                   extent=[time_axis[0]*1000, time_axis[-1]*1000, 
                          depth_axis[-1], depth_axis[0]],
                   vmin=-vmax, vmax=vmax)
    ax.set_xlabel('Time (ms)')
    ax.set_ylabel('Depth (m)')
    ax.set_title(title, fontweight='bold')
    plt.colorbar(im, ax=ax)

plt.tight_layout()
plt.show()

print("✓ Wavefield separation complete")
print("\nKey observations:")
print("- Downgoing: Direct wave and tube waves")
print("- Upgoing: Reflection events from subsurface interfaces")

## 5. Corridor Stack Generation

The corridor stack is created by:
1. Shifting each upgoing trace by its first-break time, which moves reflections to two-way time and flattens them
2. Extracting a narrow time window (corridor) that starts at the first break on each trace
3. Stacking the windowed traces

This produces a pseudo-reflection trace that can be compared with surface seismic data for well-to-seismic tie.

In [None]:
def generate_corridor_stack(upgoing_data, time, depth, first_breaks, 
                           corridor_width=0.05, taper_width=0.01):
    """
    Generate corridor stack from upgoing wavefield.
    
    Each trace is delayed by its first-break time, which places upgoing
    reflections at two-way time (TWT) and flattens them. On each trace only
    a corridor starting at the first break (TWT = 2 x first-break time) is
    kept, and the windowed traces are averaged.
    
    Parameters:
    -----------
    upgoing_data : ndarray
        Upgoing wavefield data (n_channels x n_samples)
    time : ndarray
        Time axis
    depth : ndarray
        Depth axis
    first_breaks : ndarray
        First break times for each depth
    corridor_width : float
        Length of the corridor after the first break, in seconds
    taper_width : float
        Taper width at the corridor edges in seconds
    
    Returns:
    --------
    corridor_stack : ndarray
        Stacked trace
    two_way_time : ndarray
        Two-way time axis (starts at zero)
    flattened : ndarray
        Upgoing wavefield shifted to two-way time
    """
    n_channels, n_samples = upgoing_data.shape
    dt = time[1] - time[0]
    corridor_samples = max(1, int(round(corridor_width / dt)))
    taper_samples = int(round(taper_width / dt))
    
    # Corridor window: ones with a linear taper at both edges
    window = np.ones(corridor_samples)
    if 0 < 2 * taper_samples <= corridor_samples:
        ramp = np.linspace(0, 1, taper_samples)
        window[:taper_samples] = ramp
        window[-taper_samples:] = ramp[::-1]
    
    flattened = np.zeros_like(upgoing_data)
    corridor_stack = np.zeros(n_samples)
    weights = np.zeros(n_samples)
    
    for i in range(n_channels):
        shift = int(round(first_breaks[i] / dt))
        if shift >= n_samples:
            continue
        
        # Delay the trace by its first-break time (one-way time -> TWT)
        flattened[i, shift:] = upgoing_data[i, :n_samples - shift]
        
        # After the shift the first break sits at 2*shift; keep the corridor after it
        start = 2 * shift
        end = min(n_samples, start + corridor_samples)
        if start < end:
            w = window[:end - start]
            corridor_stack[start:end] += flattened[i, start:end] * w
            weights[start:end] += w
    
    # Normalize by the summed weights of the contributing traces
    contributing = weights > 0
    corridor_stack[contributing] /= weights[contributing]
    
    # Two-way time axis of the flattened section
    two_way_time = np.arange(n_samples) * dt
    
    return corridor_stack, two_way_time, flattened

# Generate corridor stack
corridor_stack, twt_axis, flattened_data = generate_corridor_stack(
    upgoing_wave, time_axis, depth_axis, first_breaks,
    corridor_width=0.05, taper_width=0.01
)

# Plot corridor stack generation
fig = plt.figure(figsize=(16, 10))
gs = fig.add_gridspec(2, 2, hspace=0.3, wspace=0.3)

# Upgoing wavefield
ax1 = fig.add_subplot(gs[0, 0])
vmax = np.percentile(np.abs(upgoing_wave), 98)
ax1.imshow(upgoing_wave, aspect='auto', cmap='seismic',
           extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
           vmin=-vmax, vmax=vmax)
ax1.plot(first_breaks*1000, depth_axis, 'r--', linewidth=2, label='First Breaks')
ax1.set_xlabel('Time (ms)')
ax1.set_ylabel('Depth (m)')
ax1.set_title('Upgoing Wavefield', fontweight='bold')
ax1.legend()

# Flattened data
ax2 = fig.add_subplot(gs[0, 1])
vmax = np.percentile(np.abs(flattened_data), 98)
ax2.imshow(flattened_data, aspect='auto', cmap='seismic',
           extent=[twt_axis[0]*1000, twt_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
           vmin=-vmax, vmax=vmax)
ax2.axvline(0, color='r', linestyle='--', linewidth=2, label='Zero Time')
ax2.set_xlabel('Two-Way Time (ms)')
ax2.set_ylabel('Depth (m)')
ax2.set_title('Flattened Upgoing Wavefield', fontweight='bold')
ax2.legend()

# Corridor stack
ax3 = fig.add_subplot(gs[1, :])
ax3.plot(twt_axis*1000, corridor_stack, 'b-', linewidth=2)
ax3.fill_between(twt_axis*1000, 0, corridor_stack, alpha=0.3)
ax3.axhline(0, color='k', linestyle='-', linewidth=0.5)
ax3.axvline(0, color='r', linestyle='--', linewidth=1)
ax3.set_xlabel('Two-Way Time (ms)')
ax3.set_ylabel('Amplitude')
ax3.set_title('Corridor Stack (Pseudo-Reflection)', fontweight='bold')
ax3.grid(True, alpha=0.3)
ax3.set_xlim([twt_axis[0]*1000, twt_axis[-1]*1000])

plt.show()

print("✓ Corridor stack generated successfully")
print("\nThis trace can be used for:")
print("- Well-to-seismic tie")
print("- Wavelet extraction")
print("- Synthetic seismogram comparison")

## 6. Real-World Scenarios and Challenges

### Scenario 1: Dealing with Poor Coupling

**Problem**: DAS fiber may have poor coupling in certain depth intervals, leading to weak signals or gaps in data.

**Solution**: Identify and interpolate over dead channels

In [None]:
def detect_dead_channels(data, threshold_ratio=0.5):
    """
    Detect channels with anomalously low energy (poor coupling).
    
    Parameters:
    -----------
    data : ndarray
        VSP data (n_channels x n_samples)
    threshold_ratio : float
        A channel is flagged when its RMS amplitude falls below this
        fraction of the median RMS amplitude over all channels
    
    Returns:
    --------
    dead_channels : ndarray (bool)
        Boolean mask of dead channels
    channel_energy : ndarray
        RMS amplitude of each channel
    """
    # Calculate RMS amplitude for each channel
    channel_energy = np.sqrt(np.mean(data**2, axis=1))
    
    # A fixed percentile would always flag the same share of channels, even
    # on clean data, so compare each channel against the median level instead
    threshold = threshold_ratio * np.median(channel_energy)
    dead_channels = channel_energy < threshold
    
    return dead_channels, channel_energy

def interpolate_dead_channels(data, dead_channels):
    """
    Interpolate over dead channels using neighboring traces.
    """
    data_interp = data.copy()
    n_channels = data.shape[0]
    
    # Indices of the channels flagged as dead
    dead_idx = np.where(dead_channels)[0]
    
    for idx in dead_idx:
        # Find nearest live channels
        left = idx - 1
        while left >= 0 and dead_channels[left]:
            left -= 1
        
        right = idx + 1
        while right < n_channels and dead_channels[right]:
            right += 1
        
        # Interpolate
        if left >= 0 and right < n_channels:
            weight = (idx - left) / (right - left)
            data_interp[idx] = (1 - weight) * data[left] + weight * data[right]
        elif left >= 0:
            data_interp[idx] = data[left]
        elif right < n_channels:
            data_interp[idx] = data[right]
    
    return data_interp

# Simulate dead channels
vsp_with_dead = vsp_data.copy()
vsp_with_dead[100:110, :] *= 0.1  # Simulate poor coupling
vsp_with_dead[250, :] *= 0.05     # Single dead channel

# Detect and fix
dead_mask, energy = detect_dead_channels(vsp_with_dead)
vsp_fixed = interpolate_dead_channels(vsp_with_dead, dead_mask)

# Plot comparison
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

# Original
vmax = np.percentile(np.abs(vsp_data), 98)
axes[0].imshow(vsp_data, aspect='auto', cmap='seismic',
               extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
               vmin=-vmax, vmax=vmax)
axes[0].set_title('Original Data')
axes[0].set_xlabel('Time (ms)')
axes[0].set_ylabel('Depth (m)')

# With dead channels
axes[1].imshow(vsp_with_dead, aspect='auto', cmap='seismic',
               extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
               vmin=-vmax, vmax=vmax)
axes[1].axhspan(depth_axis[100], depth_axis[110], alpha=0.3, color='red', 
                label='Dead zone')
axes[1].axhline(depth_axis[250], color='red', linestyle='--', linewidth=2)
axes[1].set_title('Data with Poor Coupling')
axes[1].set_xlabel('Time (ms)')
axes[1].legend()

# Fixed
axes[2].imshow(vsp_fixed, aspect='auto', cmap='seismic',
               extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
               vmin=-vmax, vmax=vmax)
axes[2].set_title('After Interpolation')
axes[2].set_xlabel('Time (ms)')

plt.tight_layout()
plt.show()

print(f"✓ Detected {np.sum(dead_mask)} dead channels ({100*np.sum(dead_mask)/len(dead_mask):.1f}%)")

### Scenario 2: Tube Wave Attenuation

**Problem**: Tube waves are coherent noise traveling along the borehole, interfering with reflection signals.

**Solution**: F-K filtering to remove coherent noise with specific velocity
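
Before designing the reject band it helps to check where each event plots in the f-k plane and whether it is spatially aliased. The sketch below uses the relation k = f / v for a linear event with apparent velocity v; the helper names are hypothetical, and the velocities and channel spacing are simply the values of the synthetic example.

```python
import numpy as np

def wavenumber_of_event(freq_hz, apparent_velocity):
    """Wavenumber (cycles/m) at which a linear event plots for each frequency."""
    return np.asarray(freq_hz, dtype=float) / apparent_velocity

def spatial_alias_frequency(apparent_velocity, dx):
    """Frequency above which the event exceeds the spatial Nyquist 1 / (2 dx)."""
    return apparent_velocity / (2.0 * dx)

dx = 1.0            # channel spacing (m)
n_channels = 500
freqs = np.array([25.0, 50.0, 100.0])

k_tube = wavenumber_of_event(freqs, 1500.0)       # tube wave
k_formation = wavenumber_of_event(freqs, 3000.0)  # formation P-wave

# Wavenumber resolution of the array, and the separation in resolution cells
dk = 1.0 / (n_channels * dx)
separation_bins = (k_tube - k_formation) / dk

f_alias_tube = spatial_alias_frequency(1500.0, dx)

for f, bins in zip(freqs, separation_bins):
    print(f"{f:5.0f} Hz: events are {bins:.1f} wavenumber bins apart")
print(f"Tube wave aliases above {f_alias_tube:.0f} Hz")
```

The two events move apart in wavenumber as frequency increases, so a velocity fan separates them cleanly at the dominant frequency but less well at the low end of the band.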

In [None]:
def fk_filter(data, dt, dx, velocity_min, velocity_max):
    """
    Apply F-K (frequency-wavenumber) filter to remove coherent noise.
    
    Events whose apparent velocity |f/k| lies between velocity_min and
    velocity_max are rejected; everything else is passed.
    
    Parameters:
    -----------
    data : ndarray
        Input data (n_channels x n_samples)
    dt : float
        Time sampling interval (s)
    dx : float
        Spatial sampling interval (m)
    velocity_min : float
        Lower edge of the rejected apparent-velocity band (m/s)
    velocity_max : float
        Upper edge of the rejected apparent-velocity band (m/s)
    
    Returns:
    --------
    filtered_data : ndarray
        Filtered data
    """
    # 2D FFT to F-K domain
    fk_data = np.fft.fft2(data)
    fk_data = np.fft.fftshift(fk_data)
    
    # Create frequency and wavenumber axes
    n_channels, n_samples = data.shape
    freq = np.fft.fftshift(np.fft.fftfreq(n_samples, dt))
    knum = np.fft.fftshift(np.fft.fftfreq(n_channels, dx))
    
    # Create 2D meshgrid
    K, F = np.meshgrid(knum, freq, indexing='ij')
    
    # Calculate apparent velocity: v = f/k
    # The small offset avoids division by zero at k = 0
    with np.errstate(divide='ignore', invalid='ignore'):
        velocity = np.abs(F / (K + 1e-10))
    
    # Create filter mask (reject velocities inside the tube wave range)
    filter_mask = np.ones_like(velocity)
    filter_mask[(velocity >= velocity_min) & (velocity <= velocity_max)] = 0
    
    # Cosine tapers on both band edges to avoid ringing
    taper_width = 0.1  # half-width of each taper as a fraction of the band
    v_range = velocity_max - velocity_min
    tw = taper_width * v_range
    
    # Lower edge: 1 at velocity_min - tw, falling to 0 at velocity_min + tw
    transition_zone = (velocity >= velocity_min - tw) & \
                     (velocity <= velocity_min + tw)
    filter_mask[transition_zone] = 0.5 * (1 + np.cos(
        np.pi * (velocity[transition_zone] - (velocity_min - tw)) / (2 * tw)))
    
    # Upper edge: 0 at velocity_max - tw, rising to 1 at velocity_max + tw
    transition_zone = (velocity >= velocity_max - tw) & \
                     (velocity <= velocity_max + tw)
    filter_mask[transition_zone] = 0.5 * (1 - np.cos(
        np.pi * (velocity[transition_zone] - (velocity_max - tw)) / (2 * tw)))
    
    # Apply filter
    fk_filtered = fk_data * filter_mask
    
    # Inverse FFT back to space-time domain
    fk_filtered = np.fft.ifftshift(fk_filtered)
    filtered_data = np.real(np.fft.ifft2(fk_filtered))
    
    return filtered_data

# Apply F-K filter to remove tube waves (1200-1800 m/s)
dt = time_axis[1] - time_axis[0]
dx = depth_axis[1] - depth_axis[0]

vsp_fk_filtered = fk_filter(vsp_data, dt, dx, 
                            velocity_min=1200, velocity_max=1800)

# Plot comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

vmax = np.percentile(np.abs(vsp_data), 98)

axes[0].imshow(vsp_data, aspect='auto', cmap='seismic',
               extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
               vmin=-vmax, vmax=vmax)
axes[0].set_title('Before F-K Filter')
axes[0].set_xlabel('Time (ms)')
axes[0].set_ylabel('Depth (m)')

axes[1].imshow(vsp_fk_filtered, aspect='auto', cmap='seismic',
               extent=[time_axis[0]*1000, time_axis[-1]*1000, depth_axis[-1], depth_axis[0]],
               vmin=-vmax, vmax=vmax)
axes[1].set_title('After F-K Filter (Tube Waves Removed)')
axes[1].set_xlabel('Time (ms)')

plt.tight_layout()
plt.show()

print("✓ F-K filter applied to attenuate tube waves (1200-1800 m/s)")

## 7. Troubleshooting Guide

### Common Issues and Solutions

#### Issue 1: Weak or Missing First Breaks
**Symptoms**: 
- Automatic picking fails
- Irregular first break times

**Possible Causes**:
- Poor signal-to-noise ratio
- Incorrect search window
- Source coupling issues

**Solutions**:
1. Apply bandpass filter before picking (20-100 Hz typical)
2. Adjust search window based on expected velocities
3. Use correlation-based picking with reference trace
4. Manual quality control and correction
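
Solution 1 can be sketched with a zero-phase Butterworth bandpass from `scipy.signal`. Zero-phase filtering matters for picking because it does not delay the onset. The function name `bandpass`, the filter order, and the test signal are assumptions for illustration; the 20-100 Hz corners are the typical values quoted above.

```python
import numpy as np
from scipy import signal

def bandpass(data, fs, f_low=20.0, f_high=100.0, order=4):
    """Zero-phase Butterworth bandpass applied along the time (last) axis."""
    sos = signal.butter(order, [f_low, f_high], btype='bandpass',
                        fs=fs, output='sos')
    return signal.sosfiltfilt(sos, data, axis=-1)

fs = 1000.0
t = np.arange(2000) / fs
in_band = np.sin(2 * np.pi * 50 * t)

# Two toy traces: in-band signal plus high-frequency noise, and plus a DC offset
traces = np.vstack([in_band + np.sin(2 * np.pi * 300 * t),
                    in_band + 0.5])
filtered = bandpass(traces, fs)

# Compare away from the trace ends, where edge effects are largest
residual = np.max(np.abs(filtered[:, 500:1500] - in_band[500:1500]))
print(f"Largest deviation from the 50 Hz signal: {residual:.4f}")
```

For a two-dimensional VSP array the same call filters every channel at once, for example `bandpass(vsp_data, fs=1000)`.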

#### Issue 2: Poor Wavefield Separation
**Symptoms**:
- Residual downgoing energy in upgoing section
- Distorted reflection signals

**Possible Causes**:
- Inappropriate median filter length
- Complex velocity structure
- Multiple tube wave modes

**Solutions**:
1. Test different filter lengths (typically 9-21)
2. Apply F-K filtering before median filtering
3. Use adaptive filters based on local dip
4. Consider tau-p domain separation

#### Issue 3: Noisy Corridor Stack
**Symptoms**:
- Low signal-to-noise in corridor stack
- Difficult well-to-seismic tie

**Possible Causes**:
- Incomplete wavefield separation
- Corridor window too narrow/wide
- Poor data quality in certain depth ranges

**Solutions**:
1. Optimize corridor width (typically 40-100 ms)
2. Apply depth-dependent weighting
3. Exclude poor-quality depth intervals
4. Apply spectral balancing
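
Solutions 2 and 3 both amount to replacing the plain average over channels with a weighted one, where a zero weight excludes a channel. A minimal sketch, assuming the traces have already been flattened to two-way time; `weighted_stack` and the toy data are hypothetical.

```python
import numpy as np

def weighted_stack(traces, weights):
    """Weighted average over the channel axis; zero-weight channels are excluded."""
    traces = np.asarray(traces, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total <= 0:
        raise ValueError("At least one channel needs a positive weight")
    return (weights[:, None] * traces).sum(axis=0) / total

# Five flattened traces carrying the same reflection signal
signal_trace = np.array([0.0, 1.0, -0.5, 0.0])
traces = np.tile(signal_trace, (5, 1))
traces[2] += 10.0          # one channel from a poorly coupled interval

weights = np.ones(5)
weights[2] = 0.0           # exclude the corrupted channel

plain = traces.mean(axis=0)
stack = weighted_stack(traces, weights)
print("Plain stack:   ", plain)
print("Weighted stack:", stack)
```

A common refinement is to set each weight from an estimate of that channel's noise level, for instance the inverse of the variance in a window before the first break.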

#### Issue 4: DAS-Specific Challenges
**Symptoms**:
- Spatially varying sensitivity
- Fading or signal dropout

**DAS-Specific Considerations**:
1. **Gauge length effects**: 
   - DAS measures strain (or strain rate) averaged over the gauge length (typically around 10 m)
   - Acts as spatial averaging → reduces resolution
   - Response notches where the gauge length equals a whole number of wavelengths along the fiber

2. **Coupling variations**:
   - Cement quality affects signal strength
   - Temperature gradients affect fiber response
   - Solution: Amplitude normalization by depth

3. **Directional sensitivity**:
   - DAS measures axial strain only
   - Horizontal fiber → poor response to near-vertical P-waves (broadside incidence)
   - Vertical fiber → optimal for VSP
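
The gauge-length and directional effects listed above can be put into numbers with two standard idealizations: averaging a plane wave over a length L multiplies its amplitude by sin(πkL)/(πkL), with k in cycles per metre along the fiber, and the axial strain from a plane P-wave scales with cos² of the angle between the propagation direction and the fiber. The sketch below assumes exactly those models; the function names and example values are illustrative.

```python
import numpy as np

def gauge_length_response(freq_hz, gauge_length, apparent_velocity):
    """Amplitude response of averaging over the gauge length (sinc model)."""
    k = np.asarray(freq_hz, dtype=float) / apparent_velocity  # cycles/m
    return np.abs(np.sinc(k * gauge_length))  # np.sinc(x) = sin(pi x) / (pi x)

def first_notch_frequency(gauge_length, apparent_velocity):
    """Lowest frequency at which one full wavelength fits in the gauge length."""
    return apparent_velocity / gauge_length

def p_wave_axial_sensitivity(angle_deg):
    """Relative response to a P-wave arriving at angle_deg from the fiber axis."""
    return np.cos(np.radians(angle_deg)) ** 2

gauge_length = 10.0   # m
velocity = 3000.0     # m/s, apparent velocity along the fiber

f_notch = first_notch_frequency(gauge_length, velocity)
print(f"First notch at {f_notch:.0f} Hz")
for f in (50.0, 150.0, 300.0):
    r = gauge_length_response(f, gauge_length, velocity)
    print(f"  response at {f:5.0f} Hz: {r:.3f}")
for angle in (0, 45, 90):
    print(f"  P-wave sensitivity at {angle:2d} deg: "
          f"{p_wave_axial_sensitivity(angle):.2f}")
```

Slower events reach their notch at lower frequencies, so the same gauge length that barely affects a 3000 m/s arrival attenuates a tube wave more strongly in the same band.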

### Quality Control Checklist

- [ ] Check data acquisition parameters (sample rate, gauge length, spacing)
- [ ] Verify depth calibration against known markers
- [ ] Inspect raw data for obvious issues (noise, dead channels)
- [ ] Validate first break picks (should be smooth with depth)
- [ ] Check interval velocities (should match expected geology)
- [ ] Verify wavefield separation quality visually
- [ ] Ensure corridor stack shows coherent reflections
- [ ] Compare with offset VSP or surface seismic if available
- [ ] Document all processing parameters for reproducibility
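
The check that first break picks are smooth with depth can be partly automated by comparing each pick with a running median over neighbouring channels. This is one possible approach rather than a standard procedure; `flag_pick_outliers`, the window length, and the 5 ms tolerance are assumptions to adapt to the data.

```python
import numpy as np
from scipy.ndimage import median_filter

def flag_pick_outliers(first_breaks, window=21, tol=0.005):
    """Flag picks that deviate more than tol seconds from the running median."""
    fb = np.asarray(first_breaks, dtype=float)
    trend = median_filter(fb, size=window, mode='nearest')
    return np.abs(fb - trend) > tol

# Toy picks: constant 3000 m/s velocity with two mispicks inserted
depth = np.arange(200.0)
picks = 0.2 + depth / 3000.0
picks[50] += 0.03
picks[120] -= 0.02

bad = flag_pick_outliers(picks)
print("Suspicious channels:", np.flatnonzero(bad))
```

Flagged picks can then be re-picked in a narrower search window or replaced by interpolation from their neighbours before computing interval velocities.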

## 8. Complete Processing Workflow Example

In [None]:
def process_vsp_workflow(raw_data, time, depth, 
                        apply_fk=True, 
                        median_length=15,
                        corridor_width=0.05):
    """
    Complete VSP processing workflow.
    
    Returns all intermediate and final products.
    """
    results = {}
    
    print("VSP Processing Workflow")
    print("=" * 50)
    
    # Step 1: Quality control
    print("[1/6] Quality control and dead channel detection...")
    dead_channels, energy = detect_dead_channels(raw_data)
    data_qc = interpolate_dead_channels(raw_data, dead_channels)
    results['data_qc'] = data_qc
    print(f"      Found and fixed {np.sum(dead_channels)} dead channels")
    
    # Step 2: F-K filtering (optional)
    if apply_fk:
        print("[2/6] F-K filtering to remove tube waves...")
        dt = time[1] - time[0]
        dx = depth[1] - depth[0]
        data_fk = fk_filter(data_qc, dt, dx, 1200, 1800)
        results['data_fk'] = data_fk
        print("      Tube waves attenuated")
    else:
        data_fk = data_qc
        print("[2/6] Skipping F-K filter")
    
    # Step 3: First break picking
    print("[3/6] First break picking...")
    fb = pick_first_breaks(data_fk, time, depth)
    results['first_breaks'] = fb
    print("      First breaks picked")
    
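    # Step 4: Wavefield separation
    print("[4/6] Wavefield separation...")
    # Flatten the downgoing wave on the first breaks so the median filter
    # along depth keeps it, then undo the shift afterwards
    fb_shift = np.round(fb / (time[1] - time[0])).astype(int)
    aligned = np.stack([np.roll(tr, -s) for tr, s in zip(data_fk, fb_shift)])
    downgoing, _ = separate_wavefields(aligned, median_length)
    downgoing = np.stack([np.roll(tr, s) for tr, s in zip(downgoing, fb_shift)])
    upgoing = data_fk - downgoing
    results['downgoing'] = downgoing
    results['upgoing'] = upgoing
    print("      Upgoing and downgoing waves separated")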
    
    # Step 5: Corridor stack
    print("[5/6] Generating corridor stack...")
    corridor, twt, flattened = generate_corridor_stack(
        upgoing, time, depth, fb, corridor_width
    )
    results['corridor_stack'] = corridor
    results['two_way_time'] = twt
    results['flattened'] = flattened
    print("      Corridor stack generated")
    
    # Step 6: Velocity analysis
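    print("[6/6] Velocity analysis...")
    # Smooth the sample-quantized picks before differentiating to avoid
    # zero time steps (assumes at least 51 channels)
    fb_smooth = signal.savgol_filter(fb, window_length=51, polyorder=2)
    slowness = np.gradient(fb_smooth, depth)
    with np.errstate(divide='ignore', invalid='ignore'):
        interval_vel = np.where(slowness > 0, 1.0 / slowness, np.nan)
    results['interval_velocity'] = interval_vel
    print(f"      Median interval velocity: {np.nanmedian(interval_vel):.1f} m/s")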
    
    print("\n" + "=" * 50)
    print("Processing complete!")
    
    return results

# Run complete workflow
processed = process_vsp_workflow(
    vsp_data, time_axis, depth_axis,
    apply_fk=True,
    median_length=15,
    corridor_width=0.05
)

print("\nProcessed results available:")
for key in processed.keys():
    print(f"  - {key}")

## Summary and Best Practices

### Key Takeaways:

1. **DAS-VSP advantages**:
   - High spatial sampling density
   - Continuous coverage (no gaps)
   - Permanent monitoring capability
   - Cost-effective for long wells

2. **Critical processing steps**:
   - Quality control (dead channels, coupling)
   - First break picking (foundation for everything)
   - Wavefield separation (median filtering)
   - Corridor stack (for well ties)

3. **Common pitfalls to avoid**:
   - Blindly trusting automatic picks
   - Over-filtering (destroys signal)
   - Ignoring DAS-specific effects (gauge length)
   - Poor corridor window selection

4. **Best practices**:
   - Always visualize intermediate results
   - Document all processing parameters
   - Compare with independent data (logs, surface seismic)
   - Perform sensitivity analysis on key parameters
   - Use domain knowledge (expected velocities, geology)

### Further Reading:
- Mateeva et al. (2014): "Distributed acoustic sensing for reservoir monitoring with vertical seismic profiling"
- Daley et al. (2013): "Field testing of fiber-optic distributed acoustic sensing (DAS) for subsurface seismic monitoring"
- Hardage (2000): "Vertical Seismic Profiling: Principles" (classic reference)

### Next Steps:
- Apply to real field data
- Integrate with petrophysical analysis
- Time-lapse VSP for reservoir monitoring
- 3D VSP for imaging around wellbore