# Synthetic Signal Sweep with Normal Noise Integration

This notebook generates synthetic bearing fault signals by iterating over:
- **RPM**: 1730, 1750, 1772, 1797
- **Fault Diameter**: 0.178mm, 0.356mm, 0.533mm
- **K (Pulse Height)**: Fixed per fault type (Inner: 0.1, Outer: 0.008, Ball: 0.05)

Crucially, the generated synthetic signal is **summed with real 'Normal' baseline data** loaded from the dataset, creating a realistic noisy fault signal.
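
As a quick sanity check on the size of the sweep, the grid can be enumerated up front. This is a minimal sketch that copies the values from the list above; `DIAMETERS_MM` and `sweep` are illustrative names, not part of the notebook. K is fixed per fault type, so it does not add a dimension.

```python
from itertools import product

# Sweep values copied from the list above.
RPMS = [1730, 1750, 1772, 1797]
DIAMETERS_MM = [0.178, 0.356, 0.533]
FAULT_TYPES = ["inner", "outer", "ball"]

# One tuple per (rpm, fault_type, diameter) combination.
sweep = list(product(RPMS, FAULT_TYPES, DIAMETERS_MM))

print(len(sweep))  # 4 RPMs x 3 fault types x 3 diameters
print(sweep[0])
```

If every Normal baseline file is found, the generation loop should therefore produce 36 records.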

## 1. Imports and Setup

In [23]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import sys
sys.path.append(os.path.abspath('src'))
import bearing_utils as bu

# Configuration
DATA_DIR = os.path.abspath('data')
FS = 12000  # Sampling rate

# Parameters for Sweep
RPMS = [1730, 1750, 1772, 1797]
DIAMETERS = [0.178, 0.356, 0.533]
K_VALUES = {'inner': 0.1, 'outer': 0.008, 'ball': 0.05}
FAULT_TYPES = ['inner', 'outer', 'ball']

print("Configuration loaded.")

Configuration loaded.


## 2. Helper Functions

In [24]:
def load_normal_baseline(rpm, data_dir=DATA_DIR):
    """
    Loads the 'Normal' .npz file for the specific RPM.
    Expected filename format: '1730_Normal.npz', '1750_Normal.npz', etc.
    Returns the full signal array or None if file not found.
    """
    filename = f"{rpm}_Normal.npz"
    filepath = os.path.join(data_dir, str(rpm) + " RPM", filename)
    
    if not os.path.exists(filepath):
        # Expected layout: data_dir / "{RPM} RPM" / "{RPM}_Normal.npz".
        # No fallback search is attempted; the caller must handle None.
        print(f"Warning: Normal baseline file not found at {filepath}")
        return None
    
    try:
        data = np.load(filepath)
        # Key names follow the clean-and-transformation logic and look like
        # '1730_Normal_1' or similar, so we do not hard-code one.
        # Take the first array stored in the archive.
        key = data.files[0]
        return data[key]
    except Exception as e:
        print(f"Error loading {filename}: {e}")
        return None

def synthesize_time_signal(spectrum_df, duration=1.0, fs=FS):
    """
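    Reconstructs a time-domain signal from a spectrum DataFrame (freq, amp)
    as a sum of cosines. All components use zero phase (the worst case for
    peak amplitude) rather than random phase, consistent with previous notebooks.
    """
    # Build the time axis from an integer sample count; np.arange with a
    # float step can return one sample more or fewer than expected.
    n_samples = int(round(duration * fs))
    t = np.arange(n_samples) / fs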
    signal = np.zeros_like(t)
    
    # Check column names from bearing_utils output
    # bu returns: 'Frequency_Hz', 'Amplitude_m_s2'
    
    freqs = spectrum_df['Frequency_Hz'].values
    amps = spectrum_df['Amplitude_m_s2'].values
    
    # Vectorized sum is memory intensive for long signals/many freqs.
    # Loop is safer for memory.
    for f, A in zip(freqs, amps):
        signal += A * np.cos(2 * np.pi * f * t)
        
    return signal
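
The zero-phase choice in `synthesize_time_signal` can be illustrated with a small standalone sketch. It assumes a hypothetical three-line spectrum (the frequencies and amplitudes below are made up, not output of `bearing_utils`), and `synthesize` is an illustrative variant that accepts optional phases. With zero phase every cosine equals 1 at `t = 0`, so the first sample is the sum of the amplitudes, which is the largest value the signal can reach.

```python
import numpy as np

def synthesize(freqs, amps, n_samples, fs, phases=None):
    """Sum of cosines; phases default to zero (illustrative helper)."""
    t = np.arange(n_samples) / fs
    if phases is None:
        phases = np.zeros(len(freqs))
    signal = np.zeros(n_samples)
    for f, a, p in zip(freqs, amps, phases):
        signal += a * np.cos(2 * np.pi * f * t + p)
    return signal

# Hypothetical spectrum: frequencies in Hz, amplitudes in m/s^2.
freqs = np.array([100.0, 200.0, 300.0])
amps = np.array([0.5, 0.3, 0.2])

zero_phase = synthesize(freqs, amps, 4096, 12000)

rng = np.random.default_rng(0)
random_phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
random_phase = synthesize(freqs, amps, 4096, 12000, phases=random_phases)

print("zero-phase first sample:", zero_phase[0])
print("random-phase peak:", np.max(np.abs(random_phase)))
```

Random phases spread the peaks of the components apart, so the random-phase peak cannot exceed the zero-phase one.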

## 3. Main Generation Loop

We iterate over every combination, generate the synthetic fault signal, load the matching Normal signal, crop both to a common segment length (4096 samples), and sum them.

Each combined signal is stored together with its synthetic and Normal components so that examples can be plotted for verification.
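
The crop-and-sum step can be sketched in isolation. This is a minimal illustration, not the notebook's pipeline: the baseline is seeded random noise standing in for a real Normal recording, and the single 100 Hz tone is a made-up fault component.

```python
import numpy as np

FS = 12000           # sampling rate in Hz, as in the configuration cell
SEGMENT_SIZE = 4096  # samples per segment

# Stand-in for a real Normal recording (hypothetical data, illustration only).
rng = np.random.default_rng(0)
normal_sig = rng.normal(scale=0.05, size=20000)

# Crop from the start of the recording so the result is reproducible.
normal_segment = normal_sig[:SEGMENT_SIZE]

# Build the time axis from integer sample indices so its length is exactly
# SEGMENT_SIZE, with no floating-point rounding at the end of the range.
t = np.arange(SEGMENT_SIZE) / FS

# Hypothetical single-tone fault component (amplitude and frequency made up).
syn_sig = 0.1 * np.cos(2 * np.pi * 100.0 * t)

combined_sig = normal_segment + syn_sig
print(combined_sig.shape)
```

Deriving the time axis from the sample count means both arrays have the same length by construction, so no trimming or padding is needed before the sum.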

In [27]:
generated_data = []
segment_size = 4096

for rpm in RPMS:
    print(f"Processing RPM: {rpm}")
    normal_sig = load_normal_baseline(rpm)
    
    if normal_sig is None:
        # No baseline for this RPM: skip it rather than generate
        # fault signals without the real noise component.
        continue
        
    print(f"  Loaded Normal Baseline. Shape: {normal_sig.shape}")
    
    # Use the first segment_size samples of the Normal signal as noise.
    # Starting at the beginning (not a random offset) keeps runs reproducible.
    if len(normal_sig) < segment_size:
        # Baseline too short to fill one segment (unlikely for this dataset): skip.
        continue
    
    normal_segment = normal_sig[:segment_size]
    
    # Calculate Time duration for this segment size
    duration_seg = segment_size / FS

    for f_type in FAULT_TYPES:
        k_val = K_VALUES[f_type]
        for diam in DIAMETERS:
            # Generate Synthetic Signal
            print(f"  Generating {f_type.title()} ({k_val:.3f}) fault, D={diam}, K={k_val}")
            
            # We need spectrum first
            if f_type == 'inner':
                spec_df = bu.calcular_espectro_inner_completo(diam, rpm, K=k_val)
            elif f_type == 'outer':
                spec_df = bu.calcular_espectro_outer_race(diam, rpm, K=k_val)
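            elif f_type == 'ball':
                spec_df = bu.calcular_espectro_ball_completo(diam, rpm, K=k_val)
            else:
                # Guard against spec_df being undefined or stale for an unknown type
                raise ValueError(f"Unknown fault type: {f_type}")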
            
            # Synthesize Time Domain
            syn_sig = synthesize_time_signal(spec_df, duration=duration_seg, fs=FS)
            
            # Ensure lengths match exactly (floating point duration might cause +/-1 sample diff)
            if len(syn_sig) > len(normal_segment):
                syn_sig = syn_sig[:len(normal_segment)]
            elif len(syn_sig) < len(normal_segment):
                # Pad with zeros if short (shouldn't happen with correct logic)
                syn_sig = np.pad(syn_sig, (0, len(normal_segment) - len(syn_sig)))
                
            # SUMMATION
            combined_sig = normal_segment + syn_sig
            
            # Store Result
            generated_data.append({
                'rpm': rpm,
                'fault_type': f_type,
                'diameter': diam,
                'k_val': k_val,
                'signal': combined_sig,
                'synthetic_component': syn_sig,
                'normal_component': normal_segment
            })

print("Generation Complete.")

Processing RPM: 1730
  Loaded Normal Baseline. Shape: (485643,)
  Generating Inner Race (0.100) fault, D=0.178, K=0.1
  Generating Inner Race (0.100) fault, D=0.356, K=0.1
  Generating Inner Race (0.100) fault, D=0.533, K=0.1
  Generating Outer Race (0.008) fault, D=0.178, K=0.008
  Generating Outer Race (0.008) fault, D=0.356, K=0.008
  Generating Outer Race (0.008) fault, D=0.533, K=0.008
  Generating Ball (0.050) fault, D=0.178, K=0.05
  Generating Ball (0.050) fault, D=0.356, K=0.05
  Generating Ball (0.050) fault, D=0.533, K=0.05
Processing RPM: 1750
  Loaded Normal Baseline. Shape: (485643,)
  Generating Inner Race (0.100) fault, D=0.178, K=0.1
  Generating Inner Race (0.100) fault, D=0.356, K=0.1
  Generating Inner Race (0.100) fault, D=0.533, K=0.1
  Generating Outer Race (0.008) fault, D=0.178, K=0.008
  Generating Outer Race (0.008) fault, D=0.356, K=0.008
  Generating Outer Race (0.008) fault, D=0.533, K=0.008
  Generating Ball (0.050) fault, D=0.178, K=0.05
  Generating Bal

## 4. Visualization

Let's inspect a few examples to verify the summation looks correct.
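
Alongside the plots, the summation can be checked numerically. The sketch below assumes records with the same keys as those stored by the generation loop (`signal`, `synthetic_component`, `normal_component`); the record itself is a toy built from made-up data, and `summation_residual` and `rms` are illustrative helpers, not functions from `bearing_utils`.

```python
import numpy as np

def summation_residual(record):
    """Largest absolute difference between 'signal' and the sum of its parts."""
    expected = record["normal_component"] + record["synthetic_component"]
    return float(np.max(np.abs(record["signal"] - expected)))

def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))

# Toy record with hypothetical data, mirroring the stored dictionary layout.
t = np.arange(4096) / 12000
normal = np.random.default_rng(1).normal(scale=0.05, size=t.size)
synthetic = 0.1 * np.cos(2 * np.pi * 100.0 * t)
record = {
    "signal": normal + synthetic,
    "synthetic_component": synthetic,
    "normal_component": normal,
}

print("residual:", summation_residual(record))
print("synthetic/normal RMS ratio:", rms(synthetic) / rms(normal))
```

A residual near zero confirms the stored parts add up to the combined signal, and the RMS ratio gives a rough idea of how visible the fault component should be against the baseline.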

In [26]:
# Let's inspect RPM 1730 for all 3 fault types
sample_data = [d for d in generated_data if d['rpm'] == 1730]

fig, axes = plt.subplots(3, 1, figsize=(12, 10), sharex=True)

for i, f_type in enumerate(FAULT_TYPES):
    ax = axes[i]
    # grab first example of this fault type (smallest diameter)
    example = next(d for d in sample_data if d['fault_type'] == f_type)
    
    # Create time axis
    t = np.arange(len(example['signal'])) / FS
    
    ax.plot(t, example['signal'], label='Combined (Normal + Synthetic)', alpha=0.7)
    ax.plot(t, example['synthetic_component'], label='Synthetic Fault Only', alpha=0.7, linestyle='--')
    ax.set_title(f"RPM=1730, {f_type.upper()} Fault, D={example['diameter']}mm, K={example['k_val']}")
    ax.legend(loc='upper right')
    ax.grid(True, alpha=0.3)
    ax.set_ylabel("Acceleration (m/s²)")

axes[-1].set_xlabel("Time (s)")
plt.tight_layout()
plt.show()

NameError: name 'generated_data' is not defined