# Automated Test Log Analysis System
### Interactive Walkthrough

Step through each stage of the semiconductor test log analysis pipeline.  
Run each cell with **Shift+Enter** to see results and visualizations inline.

In [None]:
import os, sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display, Markdown

%matplotlib inline
plt.rcParams['figure.figsize'] = (14, 5)
plt.rcParams['figure.dpi'] = 120
sns.set_theme(style='whitegrid', palette='muted', font_scale=1.1)

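# Notebooks do not define __file__, so use the working directory as the project root.
# This assumes the notebook is launched from the repository root.
PROJECT_ROOT = os.getcwd()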
sys.path.insert(0, PROJECT_ROOT)

from src import generate_logs, data_processing, error_extraction
from src import pattern_detection, anomaly_detection, trend_analysis

print('All modules loaded.')

---
## Step 1: Generate Simulated Test Logs

Creates **55,000+** test records across **10,000 devices**, **25 batches**, and **8 test types**.  
The data includes two controlled anomalies: Batch_17 fails at roughly 4x the baseline rate, and failure rates rise for tests run above 80°C.
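
As a rough illustration of how such controlled anomalies can be injected, the sketch below draws a per-record failure probability and inflates it for one batch and for hot tests. The function name `simulate_logs`, the 2% baseline, the 2x temperature multiplier, and the temperature range are assumptions made for this example; the logic inside `generate_logs.generate` is not shown in this notebook and may differ.

```python
import numpy as np
import pandas as pd

def simulate_logs(n_records=1000, seed=0):
    """Hypothetical generator with two injected anomalies (not the project's own code)."""
    rng = np.random.default_rng(seed)
    batch = rng.integers(1, 26, size=n_records)             # batch numbers 1..25
    temperature = rng.uniform(20.0, 100.0, size=n_records)  # assumed range in °C

    p_fail = np.full(n_records, 0.02)     # assumed 2% baseline failure probability
    p_fail[batch == 17] *= 4.0            # Batch 17 fails about 4x as often
    p_fail[temperature > 80.0] *= 2.0     # assumed multiplier above 80 °C
    fails = rng.random(n_records) < p_fail

    return pd.DataFrame({
        'Batch_ID': [f'Batch_{b:02d}' for b in batch],
        'Temperature': temperature.round(1),
        'Result': np.where(fails, 'FAIL', 'PASS'),
    })

sim = simulate_logs(n_records=50_000, seed=42)
print(sim.groupby('Batch_ID')['Result'].apply(lambda s: (s == 'FAIL').mean() * 100).round(2))
```

Fixing the seed keeps the simulated anomalies reproducible, which makes it easier to check that later stages actually recover them.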

In [None]:
data_path = os.path.join(PROJECT_ROOT, 'data', 'test_logs.csv')
generate_logs.generate(data_path)

# Preview raw data
raw = pd.read_csv(data_path, nrows=10)
display(Markdown('### Raw Data Preview (first 10 rows)'))
display(raw)

---
## Step 2: Data Processing & Cleaning

Handles messy engineering data:
- Converts numeric columns (coerces invalid values to NaN)
- Removes out-of-range readings (e.g. Temperature = -999)
- Drops rows with missing critical values
- Parses timestamps and standardizes the `Result` column
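
The cleaning steps listed above might look roughly like the following in pandas. This is a minimal sketch, not the body of `data_processing.process`: the function name `clean_logs`, the `Voltage` and `Timestamp` column names, the accepted temperature range, and the report keys are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def clean_logs(raw, temp_range=(-55.0, 200.0)):
    """Hypothetical cleaning pass; returns the clean frame and a small report."""
    df = raw.copy()

    # Convert numeric columns; unparseable values become NaN.
    for col in ('Temperature', 'Voltage'):
        df[col] = pd.to_numeric(df[col], errors='coerce')

    # Treat out-of-range readings (such as the -999 sentinel) as missing.
    lo, hi = temp_range
    df.loc[~df['Temperature'].between(lo, hi), 'Temperature'] = np.nan

    # Parse timestamps and normalize the result labels to PASS / FAIL.
    df['Timestamp'] = pd.to_datetime(df['Timestamp'], errors='coerce')
    df['Result'] = df['Result'].astype('string').str.strip().str.upper()

    # Drop rows that are missing any critical value.
    rows_in = len(df)
    df = df.dropna(subset=['Temperature', 'Voltage', 'Timestamp', 'Result'])
    report = {'rows_in': rows_in, 'rows_dropped': rows_in - len(df), 'rows_out': len(df)}
    return df.reset_index(drop=True), report

raw_example = pd.DataFrame({
    'Timestamp': ['2024-01-01 08:00:00', '2024-01-01 08:05:00', 'not a date', '2024-01-01 08:15:00'],
    'Temperature': ['25.0', '-999', '30.5', 'N/A'],
    'Voltage': ['1.8', '1.8', '1.7', '1.9'],
    'Result': [' pass', 'FAIL', 'Fail', 'PASS'],
})
clean_example, clean_report = clean_logs(raw_example)
print(clean_report)
```

In this toy input only the first row survives: the others carry a sentinel temperature, an unparseable timestamp, or a non-numeric temperature.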

In [None]:
df, cleaning_report = data_processing.process(data_path)

display(Markdown('### Cleaning Report'))
for key, val in cleaning_report.items():
    print(f'  {key:<30} : {val:,}')

display(Markdown('### Clean Data Sample'))
display(df.head(10))

display(Markdown('### Data Types'))
display(df.dtypes)

---
## Step 3: Error Extraction & Failure Classification

Filters all FAIL results and maps each error code to one of four categories:
- **Timing Failure** — clock/setup/hold violations
- **Overcurrent** — voltage/leakage anomalies
- **Data Corruption** — integrity check failures
- **Thermal Issue** — burn-in/stress thermal events
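
A dictionary lookup is one straightforward way to implement this kind of classification. In the sketch below the error codes (`E101` and so on), the `Unknown` fallback, and the function name `classify_failures` are invented for illustration; the mapping that `error_extraction.analyze` actually uses is not shown in this notebook.

```python
import pandas as pd

# Hypothetical error codes, chosen only for this example.
CATEGORY_BY_CODE = {
    'E101': 'Timing Failure',
    'E202': 'Overcurrent',
    'E303': 'Data Corruption',
    'E404': 'Thermal Issue',
}

def classify_failures(df):
    """Keep FAIL rows, attach a category, and count failures per category."""
    failures = df[df['Result'] == 'FAIL'].copy()
    failures['Failure_Category'] = (
        failures['Error_Code'].map(CATEGORY_BY_CODE).fillna('Unknown')
    )
    cat_summary = (
        failures['Failure_Category']
        .value_counts()
        .rename_axis('Failure_Category')
        .reset_index(name='Count')
    )
    return failures, cat_summary

logs_example = pd.DataFrame({
    'Result': ['PASS', 'FAIL', 'FAIL', 'FAIL', 'FAIL'],
    'Error_Code': [None, 'E101', 'E303', 'E303', 'E999'],
})
failed_example, cat_example = classify_failures(logs_example)
print(cat_example)
```

Routing unmapped codes to an explicit `Unknown` bucket, rather than dropping them, keeps new or mistyped error codes visible in the summary.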

In [None]:
failures, err_summary, cat_summary = error_extraction.analyze(df)

display(Markdown('### Error Code Summary'))
display(err_summary)

display(Markdown('### Failure Category Breakdown'))
display(cat_summary)

In [None]:
# Visualize: Error code frequency
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

top = err_summary.head(12)
axes[0].barh(top['Error_Code'] + ' (' + top['Test_Name'] + ')', top['Count'],
             color=sns.color_palette('Set2', len(top)), edgecolor='gray')
axes[0].set_xlabel('Failure Count')
axes[0].set_title('Top Error Codes by Frequency', fontweight='bold')
axes[0].invert_yaxis()

# Visualize: Failure categories pie
axes[1].pie(cat_summary['Count'], labels=cat_summary['Failure_Category'],
            autopct='%1.1f%%', colors=sns.color_palette('Set2', len(cat_summary)),
            startangle=140, pctdistance=0.85)
axes[1].set_title('Failure Distribution by Category', fontweight='bold')

plt.tight_layout()
plt.show()

---
## Step 4: Pattern Detection & Correlation Analysis

Looks for relationships between failures and test conditions:  
temperature, voltage, batch, test type, and execution duration.
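
Two simple ways to look for such relationships are a binned failure rate and a correlation between a numeric feature and a 0/1 failure flag. The sketch below shows both on made-up data; the helper name `failure_rate_by_bin` and the bin edges are assumptions, and `pattern_detection.analyze` may compute its tables differently.

```python
import pandas as pd

def failure_rate_by_bin(df, column, bins):
    """Failure rate (%) of df within each interval of `column` (hypothetical helper)."""
    is_fail = df['Result'] == 'FAIL'
    binned = pd.cut(df[column], bins=bins)
    rates = (is_fail.groupby(binned, observed=False).mean() * 100).reset_index()
    rates.columns = [f'{column}_bin', 'Failure_Rate_Pct']
    return rates

# Toy data, not taken from the generated logs.
logs_example = pd.DataFrame({
    'Temperature': [30.0, 50.0, 70.0, 75.0, 85.0, 90.0, 95.0, 99.0],
    'Result': ['PASS', 'PASS', 'PASS', 'FAIL', 'FAIL', 'PASS', 'FAIL', 'FAIL'],
})

temp_rates = failure_rate_by_bin(logs_example, 'Temperature', bins=[20, 40, 60, 80, 100])
print(temp_rates)

# Pearson correlation against a 0/1 flag (the point-biserial correlation).
fail_flag = (logs_example['Result'] == 'FAIL').astype(float)
temp_corr = logs_example['Temperature'].corr(fail_flag)
print(f'Correlation with failure: {temp_corr:.2f}')
```

The binned view is often the more informative of the two here, because a threshold effect (a jump above 80°C, for instance) can produce only a modest linear correlation.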

In [None]:
patterns = pattern_detection.analyze(df)

display(Markdown('### Feature Correlations with Failure'))
display(patterns['correlations'])

In [None]:
# Failure rate vs Temperature
temp = patterns['failure_vs_temperature']
fig, axes = plt.subplots(1, 2, figsize=(16, 5))

axes[0].bar(temp['Temperature_bin'].astype(str), temp['Failure_Rate_Pct'],
            color='coral', edgecolor='gray')
axes[0].axhline(y=temp['Failure_Rate_Pct'].mean(), color='red',
                linestyle='--', label='Average')
axes[0].set_xlabel('Temperature Range (°C)')
axes[0].set_ylabel('Failure Rate (%)')
axes[0].set_title('Failure Rate vs Temperature', fontweight='bold')
axes[0].legend()
plt.setp(axes[0].xaxis.get_majorticklabels(), rotation=45, ha='right')

# Failure rate vs Voltage
volt = patterns['failure_vs_voltage']
axes[1].bar(volt['Voltage_bin'].astype(str), volt['Failure_Rate_Pct'],
            color='steelblue', edgecolor='gray')
axes[1].set_xlabel('Voltage Range (V)')
axes[1].set_ylabel('Failure Rate (%)')
axes[1].set_title('Failure Rate vs Voltage', fontweight='bold')
plt.setp(axes[1].xaxis.get_majorticklabels(), rotation=45, ha='right')

plt.tight_layout()
plt.show()

In [None]:
# Failure heatmap: Test Type x Batch
pivot = df.pivot_table(index='Test_Name', columns='Batch_ID',
                       values='Result',
                       aggfunc=lambda x: (x == 'FAIL').mean() * 100)

fig, ax = plt.subplots(figsize=(18, 6))
sns.heatmap(pivot, annot=True, fmt='.1f', cmap='YlOrRd', ax=ax,
            linewidths=0.5, cbar_kws={'label': 'Failure Rate (%)'})
ax.set_title('Failure Rate Heatmap: Test Type vs Batch', fontsize=14, fontweight='bold')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

---
## Step 5: Anomaly Detection

Uses statistical methods to flag unusual behavior:
- **Z-score** on batch failure rates → anomalous production lots
- **Moving average + bounds** on daily failures → error spikes
- **Z-score** on per-device failure rates → outlier devices
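
The two techniques named above can be sketched as follows. The function names, the Z-score threshold of 3, the 7-day window, and the choice to compare each day against the preceding window only are assumptions for this example; the 2.5σ bound and the column names match the plots later in this step, but the internals of `anomaly_detection.analyze` are not shown here.

```python
import pandas as pd

def zscore_outliers(rates, threshold=3.0):
    """Return the Z-scores of entries that lie more than `threshold` sigmas from the mean."""
    z = (rates - rates.mean()) / rates.std(ddof=0)
    return z[z.abs() > threshold]

def flag_spikes(daily_failures, window=7, n_sigma=2.5):
    """Flag days whose count exceeds moving average + n_sigma * moving std."""
    out = pd.DataFrame({'Daily_Failures': daily_failures})
    # shift(1) excludes the current day, so a spike cannot inflate its own bound.
    history = out['Daily_Failures'].shift(1).rolling(window, min_periods=window)
    out['Moving_Avg'] = history.mean()
    out['Upper_Bound'] = out['Moving_Avg'] + n_sigma * history.std()
    out['Is_Spike'] = out['Daily_Failures'] > out['Upper_Bound']
    return out

# Toy batch failure rates (%): every batch at 2.0 except Batch_17.
batch_rates = pd.Series(2.0, index=[f'Batch_{i:02d}' for i in range(1, 26)])
batch_rates['Batch_17'] = 8.0
flagged_batches = zscore_outliers(batch_rates)
print(flagged_batches)

# Toy daily failure counts with one obvious spike on the eighth day.
days = pd.date_range('2024-01-01', periods=10, freq='D')
daily_counts = pd.Series([10, 12, 11, 9, 10, 11, 10, 30, 10, 11], index=days)
spike_example = flag_spikes(daily_counts)
print(spike_example[spike_example['Is_Spike']])
```

Days without a full window of history get a NaN bound and are never flagged, which avoids spurious spikes at the start of the series.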

In [None]:
anomaly_results = anomaly_detection.analyze(df, failures)

display(Markdown('### Anomalous Batches'))
display(anomaly_results['anomalous_batches'])

display(Markdown(f"### Outlier Devices: {len(anomaly_results['outlier_devices'])} found"))
display(anomaly_results['outlier_devices'].head(10))

In [None]:
# Batch failure rate with anomaly highlighting
batch = patterns['failure_vs_batch']
mean_rate = batch['Failure_Rate_Pct'].mean()

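# Bars above twice the mean rate are drawn in red. This is a visual heuristic
# and is separate from the Z-score flags returned by anomaly_detection.analyze.
colors = ['tomato' if r > mean_rate * 2 else 'steelblue'
          for r in batch['Failure_Rate_Pct']]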

fig, ax = plt.subplots(figsize=(15, 5))
ax.bar(batch['Batch_ID'], batch['Failure_Rate_Pct'], color=colors, edgecolor='gray')
ax.axhline(y=mean_rate, color='red', linestyle='--',
           label=f'Average ({mean_rate:.1f}%)')
ax.set_xlabel('Batch ID')
ax.set_ylabel('Failure Rate (%)')
ax.set_title('Failure Rate by Batch (anomalies in red)', fontweight='bold')
ax.legend()
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

In [None]:
# Error spike timeline
spike_data = anomaly_results['spike_data']

fig, ax = plt.subplots(figsize=(15, 5))
ax.fill_between(spike_data.index, spike_data['Daily_Failures'], alpha=0.3, color='cornflowerblue')
ax.plot(spike_data.index, spike_data['Daily_Failures'], color='cornflowerblue',
        linewidth=1, label='Daily Failures')
ax.plot(spike_data.index, spike_data['Moving_Avg'], color='blue',
        linestyle='--', linewidth=1.5, label='Moving Avg')
ax.plot(spike_data.index, spike_data['Upper_Bound'], color='red',
        linestyle=':', linewidth=1, label='Upper Bound (2.5σ)')

spikes = spike_data[spike_data['Is_Spike']]
if not spikes.empty:
    ax.scatter(spikes.index, spikes['Daily_Failures'], color='red',
              s=80, zorder=5, label='Spikes')

ax.set_xlabel('Date')
ax.set_ylabel('Failure Count')
ax.set_title('Daily Failure Count with Spike Detection', fontweight='bold')
ax.legend()
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

---
## Step 6: Trend Analysis

Tracks failure rates over time using daily measurements plus 7-day and 14-day rolling averages.
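
A daily failure rate with rolling averages can be computed in a few lines of pandas. The sketch below returns the same keys that the next cell reads (`daily`, `rolling_7d`, `rolling_14d`), but the function name `failure_rate_trends`, the `Timestamp` column name, and the `min_periods=1` setting are assumptions; `trend_analysis.analyze` may handle gaps and window edges differently.

```python
import pandas as pd

def failure_rate_trends(df):
    """Daily failure rate (%) plus 7-day and 14-day rolling means (hypothetical helper)."""
    is_fail = (df['Result'] == 'FAIL').astype(float)
    day = pd.to_datetime(df['Timestamp']).dt.floor('D')
    daily = (is_fail.groupby(day).mean() * 100).rename('Failure_Rate_Pct')
    daily = daily.asfreq('D')  # days with no tests become NaN instead of vanishing
    return {
        'daily': daily.to_frame(),
        'rolling_7d': daily.rolling(7, min_periods=1).mean(),
        'rolling_14d': daily.rolling(14, min_periods=1).mean(),
    }

# Toy data: three days with failure rates of 25%, 0%, and 50%.
stamps = pd.to_datetime(
    ['2024-01-01 08:00'] * 4 + ['2024-01-02 08:00'] * 4 + ['2024-01-03 08:00'] * 2
)
results = ['FAIL', 'PASS', 'PASS', 'PASS'] + ['PASS'] * 4 + ['FAIL', 'PASS']
trend_example = failure_rate_trends(pd.DataFrame({'Timestamp': stamps, 'Result': results}))
print(trend_example['daily'])
print(trend_example['rolling_7d'])
```

With `min_periods=1` the rolling lines start on the first day rather than after a full window, at the cost of noisier early values.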

In [None]:
trends = trend_analysis.analyze(df)

daily = trends['daily']
rolling_7 = trends['rolling_7d']
rolling_14 = trends['rolling_14d']

fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(daily.index, daily['Failure_Rate_Pct'], alpha=0.3, color='gray',
        label='Daily', linewidth=0.8)
ax.plot(rolling_7.index, rolling_7, color='coral',
        label='7-day Rolling Avg', linewidth=2)
ax.plot(rolling_14.index, rolling_14, color='steelblue',
        label='14-day Rolling Avg', linewidth=2)
ax.set_xlabel('Date')
ax.set_ylabel('Failure Rate (%)')
ax.set_title('Failure Rate Trend Over Time', fontweight='bold')
ax.legend()
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

In [None]:
# Failure rate distribution by test type
device_test = df.groupby(['Device_ID', 'Test_Name'])['Result'].agg(
    lambda x: (x == 'FAIL').mean() * 100
).reset_index()
device_test.columns = ['Device_ID', 'Test_Name', 'Failure_Rate_Pct']

fig, ax = plt.subplots(figsize=(14, 6))
sns.boxplot(data=device_test, x='Test_Name', y='Failure_Rate_Pct',
            hue='Test_Name', palette='Set2', fliersize=3, legend=False, ax=ax)
ax.set_xlabel('Test Name')
ax.set_ylabel('Failure Rate (%)')
ax.set_title('Failure Rate Distribution by Test Type', fontweight='bold')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()

---
## Step 7: Summary Report

Auto-generated engineering analysis with key findings and recommendations.

In [None]:
from src import report_generator

report_path = os.path.join(PROJECT_ROOT, 'output', 'reports', 'analysis_report.txt')
report_text = report_generator.generate_report(
    df, cleaning_report, failures, err_summary, cat_summary,
    patterns, anomaly_results, trends, report_path
)

print(report_text)

---
## Key Findings

| Finding | Detail |
|---------|--------|
| Highest failure test | **Timing_Check** (~4.7%) |
| Temperature effect | Failure rate jumps above 80°C |
| Anomalous batch | **Batch_17** (~8.7%, Z-score > 4) |
| Top failure category | **Data Corruption** (~29%) |
| Outlier devices | 200+ flagged by Z-score |

### Recommendations
1. Targeted root cause analysis on Timing_Check failures
2. Thermal mitigation review for high-temperature test conditions
3. Process investigation for Batch_17 production lot
4. Continuous monitoring of temperature as the primary failure correlate