# 01 - Correlation Analysis: Electrical Conductivity vs Machinability

**Date:** 2026-02-12  
**Research Question:** RQ1 -- *Is there a statistically significant correlation between the electrical conductivity of low-alloy steels and their machinability indicators?*  

## Objective

This notebook performs the foundational correlation analysis for RQ1 of the dissertation
*"Determining machinability of low-alloy steels via electrical conductivity measurements."*

We investigate whether electrical conductivity (%IACS or MS/m) correlates with three
primary machinability metrics:

| Metric | Symbol | Unit | Interpretation |
|--------|--------|------|----------------|
| Flank wear | VB | mm | Lower is better (longer tool life) |
| Surface roughness | Ra | um | Lower is better (smoother surface) |
| Cutting force | Fc | N | Lower is better (easier to cut) |

## Hypothesis

**H1:** Electrical conductivity of low-alloy steels has a statistically significant
(p < 0.05) monotonic relationship with machinability indicators (VB, Ra, Fc),
such that steels with higher conductivity (indicative of softer, more homogeneous
microstructures) tend to exhibit lower tool wear, better surface finish, and
lower cutting forces.

**Rationale:** Electrical conductivity is sensitive to microstructural features --
lattice defects, phase boundaries, precipitates, and dislocation density -- that
also govern mechanical behaviour and, by extension, machinability. Conductivity is
generally expected to decrease as the microstructure hardens, in the order
ferrite > pearlite > bainite > martensite, so softer phases show lower resistivity.
This shared physical basis suggests that a measurable correlation between
conductivity and machinability should exist.

## Analysis Plan

1. Load and explore the evidence matrix data
2. Parse numeric conductivity and machinability values
3. Plot conductivity against each machinability metric (scatter plots)
4. Quantify correlations using Pearson (linear) and Spearman (monotonic) methods
5. Stratify by steel grade to check for sub-group patterns
6. Log results with MLflow for experiment tracking

---
## 1. Imports

In [None]:
import sys
import re
import warnings
from pathlib import Path

import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import mlflow

# Ensure the repo root is on sys.path so that src.machinability is importable
REPO_ROOT = Path.cwd().parent  # assumes notebook is in <repo>/notebooks/
if str(REPO_ROOT) not in sys.path:
    sys.path.insert(0, str(REPO_ROOT))

from src.machinability.utils.config import (
    REPO_ROOT as CFG_REPO_ROOT,
    MLFLOW_TRACKING_URI,
    MLFLOW_EXPERIMENT_NAME,
)
from src.machinability.data.loader import load_evidence_matrix
from src.machinability.analysis.correlation import (
    pearson_correlation,
    spearman_correlation,
    correlation_matrix,
)
from src.machinability.visualization.plots import (
    scatter_with_regression,
    steel_grade_comparison,
)

# Plot settings
%matplotlib inline
sns.set_theme(style="whitegrid", palette="colorblind", font_scale=1.1)
plt.rcParams["figure.dpi"] = 120
plt.rcParams["savefig.dpi"] = 200
plt.rcParams["figure.figsize"] = (10, 6)
warnings.filterwarnings("ignore", category=FutureWarning)

# MLflow setup
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
mlflow.set_experiment(MLFLOW_EXPERIMENT_NAME)

print(f"Repository root : {CFG_REPO_ROOT}")
print(f"MLflow URI      : {MLFLOW_TRACKING_URI}")
import scipy  # top-level package, imported here only to report its version

print(f"NumPy {np.__version__}, Pandas {pd.__version__}, SciPy {scipy.__version__}")

---
## 2. Data Loading

We load the evidence matrix from `02_EVIDENCE_MATRIX/evidence_matrix.csv`.
Since data collection is ongoing, the matrix may have limited rows. If the data
is insufficient for meaningful analysis (fewer than 10 rows with paired
conductivity + machinability values), we generate a synthetic demonstration
dataset based on literature-reported ranges so that the analysis pipeline can
be validated end-to-end.
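
The parsing helper that follows reports every conductivity value in %IACS. As a quick sanity check on the unit conversions, here is a minimal sketch, assuming the usual definition that 100 %IACS corresponds to 58 MS/m. The function names `ms_per_m_to_iacs` and `micro_ohm_cm_to_iacs` are illustrative only and are not part of the project modules.

```python
IACS_100_MS_PER_M = 58.0  # assumption: 100 %IACS is defined as 58 MS/m


def ms_per_m_to_iacs(sigma_ms_per_m: float) -> float:
    """Convert conductivity in MS/m to %IACS."""
    return sigma_ms_per_m / IACS_100_MS_PER_M * 100.0


def micro_ohm_cm_to_iacs(rho_micro_ohm_cm: float) -> float:
    """Convert resistivity in micro-ohm-cm to conductivity in %IACS."""
    # 1 micro-ohm-cm = 1e-8 ohm-m, so sigma [MS/m] = 100 / rho [micro-ohm-cm]
    sigma_ms_per_m = 100.0 / rho_micro_ohm_cm
    return ms_per_m_to_iacs(sigma_ms_per_m)


print(round(ms_per_m_to_iacs(2.8), 2))       # 4.83
print(round(micro_ohm_cm_to_iacs(18.5), 2))  # 9.32
```

A resistivity of about 1.7241 micro-ohm-cm (annealed copper) should map to roughly 100 %IACS, which is a convenient check on the factor of 100 in the resistivity branch.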

In [None]:
def parse_conductivity(value_str):
    """Extract numeric conductivity value from the evidence matrix string format.
    
    Handles formats like '3.1 %IACS', '2.8 MS/m', '18.5 micro-ohm-cm'.
    Returns conductivity in %IACS (converts other units).
    """
    if pd.isna(value_str) or not isinstance(value_str, str):
        return np.nan
    value_str = value_str.strip()
    # Try to extract a leading number
    # Accept integers and decimals; a trailing '.' is excluded so float() cannot fail
    match = re.match(r"(\d+(?:\.\d+)?|\.\d+)", value_str)
    if not match:
        return np.nan
    num = float(match.group(1))
    lower = value_str.lower()
    if "%iacs" in lower or "iacs" in lower:
        return num
    elif "ms/m" in lower:
        # Convert MS/m to %IACS: %IACS = (MS/m / 58.0) * 100
        return (num / 58.0) * 100.0
    elif "micro-ohm" in lower or "uohm" in lower:
        # Resistivity in micro-ohm-cm: 1 micro-ohm-cm = 1e-8 ohm-m, so
        # conductivity (MS/m) = 100 / resistivity (micro-ohm-cm).
        if num <= 0:
            return np.nan
        ms_per_m = 100.0 / num
        return (ms_per_m / 58.0) * 100.0
    else:
        # Assume %IACS if no unit recognized
        return num


def parse_machinability_metric(metric_str):
    """Extract VB, Ra, and Fc values from the machinability_metric string.
    
    Handles formats like 'VB=0.3mm', 'Ra=1.2um', 'Fc=450N',
    and compound strings like 'VB=0.3mm at T=18min'.
    Returns dict with keys 'VB_mm', 'Ra_um', 'Fc_N' (NaN if not found).
    """
    result = {"VB_mm": np.nan, "Ra_um": np.nan, "Fc_N": np.nan}
    if pd.isna(metric_str) or not isinstance(metric_str, str):
        return result
    # Each metric appears as '<symbol>=<number><unit>'. The number pattern accepts
    # integers and decimals but not a trailing '.', which would break float().
    symbols = {"VB_mm": "VB", "Ra_um": "Ra", "Fc_N": "Fc"}
    for key, symbol in symbols.items():
        match = re.search(rf"\b{symbol}\s*=\s*(\d+(?:\.\d+)?|\.\d+)",
                          metric_str, re.IGNORECASE)
        if match:
            result[key] = float(match.group(1))
    return result


# Load evidence matrix
raw_df = load_evidence_matrix()
print(f"Evidence matrix shape: {raw_df.shape}")
print(f"Columns: {list(raw_df.columns)}")
raw_df.head()

In [None]:
# Parse numeric values from the evidence matrix
if len(raw_df) > 0:
    raw_df["conductivity_IACS"] = raw_df["conductivity_value_units"].apply(parse_conductivity)
    metric_parsed = raw_df["machinability_metric"].apply(parse_machinability_metric).apply(pd.Series)
    raw_df = pd.concat([raw_df, metric_parsed], axis=1)

# Check how many usable paired observations we have
usable_cols = ["conductivity_IACS", "VB_mm", "Ra_um", "Fc_N"]
if len(raw_df) > 0 and all(c in raw_df.columns for c in usable_cols):
    n_cond = raw_df["conductivity_IACS"].notna().sum()
    n_vb = (raw_df["conductivity_IACS"].notna() & raw_df["VB_mm"].notna()).sum()
    n_ra = (raw_df["conductivity_IACS"].notna() & raw_df["Ra_um"].notna()).sum()
    n_fc = (raw_df["conductivity_IACS"].notna() & raw_df["Fc_N"].notna()).sum()
    print(f"Conductivity values: {n_cond}")
    print(f"Paired (cond, VB): {n_vb}")
    print(f"Paired (cond, Ra): {n_ra}")
    print(f"Paired (cond, Fc): {n_fc}")
    USE_REAL_DATA = max(n_vb, n_ra, n_fc) >= 10
else:
    n_vb = n_ra = n_fc = 0
    USE_REAL_DATA = False

print(f"\nUsing {'REAL' if USE_REAL_DATA else 'SYNTHETIC'} data for analysis.")

In [None]:
def generate_synthetic_data(n_per_grade=15, seed=42):
    """Generate synthetic conductivity-machinability data for pipeline validation.
    
    Values are drawn from literature-reported ranges for common low-alloy steels.
    Correlations are embedded to reflect the expected physical relationships:
    higher conductivity -> lower VB, lower Ra, lower Fc.
    
    Steel grades and their approximate conductivity ranges (%IACS):
      - AISI 1045:  ~7.5-8.5  (medium carbon, ferrite-pearlite)
      - AISI 4140:  ~3.0-4.5  (Cr-Mo, typically Q&T)
      - AISI 4340:  ~2.5-3.5  (Ni-Cr-Mo, high hardenability)
      - AISI 1020:  ~9.0-11.0 (low carbon, mostly ferrite)
      - AISI 8620:  ~5.5-7.0  (Ni-Cr-Mo, carburising grade)
    """
    rng = np.random.default_rng(seed)
    
    grades = {
        "AISI 1020": {"cond_mean": 10.0, "cond_std": 0.6,
                       "ht": ["annealed", "normalised"],
                       "micro": ["ferrite-pearlite"]},
        "AISI 1045": {"cond_mean": 8.0,  "cond_std": 0.5,
                       "ht": ["annealed", "normalised", "Q&T 600C"],
                       "micro": ["ferrite-pearlite", "tempered martensite"]},
        "AISI 4140": {"cond_mean": 3.8,  "cond_std": 0.4,
                       "ht": ["annealed", "Q&T 400C", "Q&T 600C"],
                       "micro": ["tempered martensite", "bainite"]},
        "AISI 4340": {"cond_mean": 3.0,  "cond_std": 0.3,
                       "ht": ["Q&T 400C", "Q&T 600C"],
                       "micro": ["tempered martensite"]},
        "AISI 8620": {"cond_mean": 6.2,  "cond_std": 0.5,
                       "ht": ["annealed", "normalised", "carburised"],
                       "micro": ["ferrite-pearlite", "bainite"]},
    }
    
    rows = []
    for grade, props in grades.items():
        cond = rng.normal(props["cond_mean"], props["cond_std"], n_per_grade)
        cond = np.clip(cond, 1.5, 12.0)
        
        # Generate machinability metrics with embedded negative correlations
        # VB (flank wear, mm): typically 0.1 - 0.6 mm; lower conductivity -> higher wear
        vb = 0.55 - 0.03 * cond + rng.normal(0, 0.04, n_per_grade)
        vb = np.clip(vb, 0.05, 0.80)
        
        # Ra (surface roughness, um): typically 0.5 - 5.0 um
        ra = 4.5 - 0.25 * cond + rng.normal(0, 0.4, n_per_grade)
        ra = np.clip(ra, 0.3, 6.0)
        
        # Fc (cutting force, N): typically 200 - 1200 N
        fc = 1100 - 60 * cond + rng.normal(0, 50, n_per_grade)
        fc = np.clip(fc, 150, 1500)
        
        for i in range(n_per_grade):
            rows.append({
                "paper_id": f"synthetic_{grade.replace(' ', '_')}_{i:02d}",
                "year": rng.choice([2010, 2012, 2015, 2018, 2020, 2022, 2024]),
                "steel_grade": grade,
                "heat_treatment": rng.choice(props["ht"]),
                "microstructure": rng.choice(props["micro"]),
                "conductivity_IACS": round(cond[i], 2),
                "VB_mm": round(vb[i], 3),
                "Ra_um": round(ra[i], 2),
                "Fc_N": round(fc[i], 1),
            })
    
    # Add some missing values to simulate real-world data (~10% missing)
    df = pd.DataFrame(rows)
    n_total = len(df)
    for col in ["VB_mm", "Ra_um", "Fc_N"]:
        mask = rng.choice(n_total, size=int(0.10 * n_total), replace=False)
        df.loc[mask, col] = np.nan
    
    return df


# Build the analysis dataframe
if USE_REAL_DATA:
    df = raw_df[["paper_id", "year", "steel_grade", "heat_treatment",
                 "microstructure", "conductivity_IACS", "VB_mm", "Ra_um", "Fc_N"]].copy()
    data_source = "evidence_matrix"
    print("Proceeding with real evidence matrix data.")
else:
    df = generate_synthetic_data(n_per_grade=15, seed=42)
    data_source = "synthetic"
    print("Evidence matrix has insufficient paired data.")
    print("Generated synthetic dataset for pipeline validation.")
    print("NOTE: Replace with real data as evidence matrix is populated.")

print(f"\nAnalysis dataframe: {df.shape[0]} rows x {df.shape[1]} columns")
df.head(10)

---
## 3. Data Exploration

In [None]:
# Basic info
print("=" * 60)
print("DataFrame Info")
print("=" * 60)
df.info()
print()

In [None]:
# Summary statistics for numeric columns
numeric_cols = ["conductivity_IACS", "VB_mm", "Ra_um", "Fc_N"]
summary = df[numeric_cols].describe().T
summary["missing"] = df[numeric_cols].isna().sum()
summary["missing_%"] = (df[numeric_cols].isna().sum() / len(df) * 100).round(1)
print("Summary Statistics")
print("=" * 60)
summary

In [None]:
# Distribution plots for each numeric variable
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

for ax, col, color, unit in zip(
    axes.flat,
    numeric_cols,
    ["steelblue", "coral", "seagreen", "mediumpurple"],
    ["%IACS", "mm", "um", "N"],
):
    data = df[col].dropna()
    ax.hist(data, bins=15, color=color, edgecolor="white", alpha=0.8)
    ax.axvline(data.mean(), color="black", linestyle="--", linewidth=1.2,
               label=f"Mean = {data.mean():.2f}")
    ax.axvline(data.median(), color="red", linestyle=":", linewidth=1.2,
               label=f"Median = {data.median():.2f}")
    ax.set_xlabel(f"{col} ({unit})", fontsize=11)
    ax.set_ylabel("Count", fontsize=11)
    ax.set_title(f"Distribution of {col}", fontsize=12)
    ax.legend(fontsize=9)

fig.suptitle(f"Variable Distributions (n={len(df)}, source: {data_source})",
             fontsize=14, fontweight="bold", y=1.02)
fig.tight_layout()
plt.show()

In [None]:
# Missing data overview
fig, ax = plt.subplots(figsize=(10, 4))
missing = df[numeric_cols].isna().sum()
present = df[numeric_cols].notna().sum()

bars_present = ax.barh(numeric_cols, present, color="steelblue", label="Present")
bars_missing = ax.barh(numeric_cols, missing, left=present, color="lightcoral", label="Missing")

for bar, m, p in zip(bars_missing, missing, present):
    total = m + p
    ax.text(total + 0.3, bar.get_y() + bar.get_height() / 2,
            f"{m}/{total} ({m/total*100:.0f}%)",
            va="center", fontsize=10)

ax.set_xlabel("Number of observations", fontsize=11)
ax.set_title("Data Completeness", fontsize=13, fontweight="bold")
ax.legend(loc="lower right", fontsize=10)
fig.tight_layout()
plt.show()

In [None]:
# Steel grade distribution
if "steel_grade" in df.columns:
    grade_counts = df["steel_grade"].value_counts()
    print("Observations per steel grade:")
    print(grade_counts)
    print(f"\nUnique grades: {df['steel_grade'].nunique()}")

    fig, ax = plt.subplots(figsize=(8, 4))
    grade_counts.plot(kind="barh", ax=ax, color="steelblue", edgecolor="white")
    ax.set_xlabel("Count", fontsize=11)
    ax.set_title("Observations by Steel Grade", fontsize=13, fontweight="bold")
    fig.tight_layout()
    plt.show()

---
## 4. Conductivity vs Machinability Scatter Plots

For each machinability metric (VB, Ra, Fc), we plot conductivity on the x-axis
and the machinability indicator on the y-axis. A linear regression line with
R-squared is overlaid.
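
How `scatter_with_regression` fits its line is not shown in this notebook. The sketch below assumes an ordinary least-squares fit and uses `scipy.stats.linregress` to obtain the slope, intercept, and R-squared. The data are made up for illustration and are not taken from the evidence matrix.

```python
import numpy as np
from scipy import stats

# Illustrative data only: a negative linear trend plus Gaussian noise
rng = np.random.default_rng(0)
conductivity = np.linspace(3.0, 10.0, 30)  # %IACS
flank_wear = 0.55 - 0.03 * conductivity + rng.normal(0, 0.04, conductivity.size)

# Ordinary least-squares fit of flank wear on conductivity
fit = stats.linregress(conductivity, flank_wear)
r_squared = fit.rvalue ** 2  # for a simple linear fit, R^2 equals Pearson r squared

print(f"slope={fit.slope:.4f}, intercept={fit.intercept:.4f}, R^2={r_squared:.3f}")
```

For a single predictor, R-squared is just the square of the Pearson coefficient, so the regression overlay and the Pearson analysis in Section 5 describe the same linear association.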

In [None]:
# Individual scatter plots using the project's visualization module
metric_info = [
    ("VB_mm", "Flank Wear VB (mm)", "Tool wear decreases with higher conductivity?"),
    ("Ra_um", "Surface Roughness Ra (um)", "Surface finish improves with higher conductivity?"),
    ("Fc_N",  "Cutting Force Fc (N)",      "Cutting forces decrease with higher conductivity?"),
]

fig, axes = plt.subplots(1, 3, figsize=(18, 5.5))

for ax, (col, ylabel, question) in zip(axes, metric_info):
    mask = df["conductivity_IACS"].notna() & df[col].notna()
    if mask.sum() < 3:
        ax.text(0.5, 0.5, f"Insufficient data\n(n={mask.sum()})",
                ha="center", va="center", transform=ax.transAxes, fontsize=12)
        ax.set_title(col, fontsize=12)
        continue
    
    x = df.loc[mask, "conductivity_IACS"].values
    y = df.loc[mask, col].values
    
    scatter_with_regression(
        x, y,
        xlabel="Electrical Conductivity (%IACS)",
        ylabel=ylabel,
        title=question,
        ax=ax,
    )
    ax.annotate(f"n = {mask.sum()}", xy=(0.05, 0.88), xycoords="axes fraction", fontsize=10)

fig.suptitle("Conductivity vs Machinability Metrics",
             fontsize=15, fontweight="bold", y=1.03)
fig.tight_layout()
plt.show()

In [None]:
# Colour-coded by steel grade using the project's visualization module
for col, ylabel, _ in metric_info:
    subset = df.dropna(subset=["conductivity_IACS", col, "steel_grade"])
    if len(subset) < 3:
        print(f"Skipping steel grade comparison for {col}: insufficient data.")
        continue
    
    fig = steel_grade_comparison(
        subset,
        x_col="conductivity_IACS",
        y_col=col,
        hue_col="steel_grade",
        xlabel="Electrical Conductivity (%IACS)",
        ylabel=ylabel,
    )
    fig.suptitle(f"Conductivity vs {col} by Steel Grade",
                 fontsize=14, fontweight="bold", y=1.02)
    plt.show()

---
## 5. Correlation Analysis

We compute both **Pearson** (linear relationship) and **Spearman** (monotonic
relationship) correlations. Spearman is more robust to outliers and non-linear
monotonic trends, which is important since the conductivity-machinability
relationship may not be strictly linear.

**Significance threshold:** p < 0.05 (two-tailed).
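
To illustrate why both coefficients are reported, the sketch below uses a made-up, strictly decreasing but non-linear relationship (an exponential decay chosen only for illustration). It calls `scipy.stats` directly rather than the project wrappers, whose internals are not shown here.

```python
import numpy as np
from scipy import stats

# Illustrative data only: y falls monotonically with x, but not along a straight line
x = np.linspace(2.0, 11.0, 25)
y = 5.0 * np.exp(-0.4 * x)

pearson_r, pearson_p = stats.pearsonr(x, y)
spearman_rho, spearman_p = stats.spearmanr(x, y)

print(f"Pearson r    = {pearson_r:+.3f}")
print(f"Spearman rho = {spearman_rho:+.3f}")
```

Because the ranks of `x` and `y` are perfectly reversed, Spearman's rho reaches -1, while Pearson's r stays smaller in magnitude because the points do not lie on a line.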

In [None]:
# Compute Pearson and Spearman correlations using project modules
results = []

for col, label, _ in metric_info:
    mask = df["conductivity_IACS"].notna() & df[col].notna()
    if mask.sum() < 4:
        print(f"Skipping {col}: need at least 4 paired observations (have {mask.sum()}).")
        continue
    
    x = df.loc[mask, "conductivity_IACS"]
    y = df.loc[mask, col]
    
    pear = pearson_correlation(x, y)
    spear = spearman_correlation(x, y)
    
    results.append({
        "Metric": col,
        "n": pear["n"],
        "Pearson_r": round(pear["r"], 4),
        "Pearson_p": pear["p_value"],
        "Pearson_sig": "***" if pear["p_value"] < 0.001 else
                       "**" if pear["p_value"] < 0.01 else
                       "*" if pear["p_value"] < 0.05 else "ns",
        "Spearman_rho": round(spear["rho"], 4),
        "Spearman_p": spear["p_value"],
        "Spearman_sig": "***" if spear["p_value"] < 0.001 else
                        "**" if spear["p_value"] < 0.01 else
                        "*" if spear["p_value"] < 0.05 else "ns",
    })

corr_results = pd.DataFrame(results)
print("Correlation Results: Conductivity (%IACS) vs Machinability Metrics")
print("=" * 80)
print("Significance: *** p<0.001, ** p<0.01, * p<0.05, ns = not significant")
print()
corr_results

In [None]:
# Correlation heatmap using project's correlation_matrix function
analysis_cols = ["conductivity_IACS", "VB_mm", "Ra_um", "Fc_N"]
available_cols = [c for c in analysis_cols if c in df.columns]

corr_mat = correlation_matrix(df, columns=available_cols)

# Also compute p-value matrix for annotation
p_matrix = pd.DataFrame(np.ones((len(available_cols), len(available_cols))),
                         index=available_cols, columns=available_cols)
for i, c1 in enumerate(available_cols):
    for j, c2 in enumerate(available_cols):
        if i != j:
            mask = df[c1].notna() & df[c2].notna()
            if mask.sum() >= 4:
                _, p = stats.pearsonr(df.loc[mask, c1], df.loc[mask, c2])
                p_matrix.loc[c1, c2] = p

# Build annotation matrix: r (sig)
annot = corr_mat.round(3).astype(str)
for i in available_cols:
    for j in available_cols:
        p = p_matrix.loc[i, j]
        sig = "***" if p < 0.001 else "**" if p < 0.01 else "*" if p < 0.05 else ""
        annot.loc[i, j] = f"{corr_mat.loc[i, j]:.3f}{sig}"

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(
    corr_mat,
    annot=annot,
    fmt="",
    cmap="RdBu_r",
    center=0,
    vmin=-1, vmax=1,
    square=True,
    linewidths=1,
    cbar_kws={"label": "Pearson r"},
    ax=ax,
)
ax.set_title("Correlation Heatmap: Conductivity & Machinability\n(* p<0.05, ** p<0.01, *** p<0.001)",
             fontsize=13, fontweight="bold")
fig.tight_layout()
plt.show()

print("\nCorrelation matrix (Pearson):")
corr_mat

---
## 6. Steel Grade Comparison

We examine whether the conductivity-machinability correlation holds consistently
across different steel grades, or whether the relationship is driven primarily
by between-grade variation (confounded by composition differences).
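
The confounding concern can be made concrete with a toy dataset. In the sketch below, three hypothetical grades (`A`, `B`, `C`) differ in mean conductivity and mean flank wear, but within each grade the two variables are generated independently. All names and numbers are invented for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical grades: (mean conductivity in %IACS, mean flank wear in mm)
grade_means = {"A": (3.0, 0.45), "B": (6.0, 0.35), "C": (10.0, 0.25)}

frames = []
for grade, (cond_mean, vb_mean) in grade_means.items():
    frames.append(pd.DataFrame({
        "grade": grade,
        # Within a grade, conductivity and wear are drawn independently
        "conductivity": rng.normal(cond_mean, 0.4, 60),
        "vb": rng.normal(vb_mean, 0.03, 60),
    }))
demo = pd.concat(frames, ignore_index=True)


def spearman_rho(frame: pd.DataFrame) -> float:
    """Spearman rho between conductivity and flank wear for one frame."""
    rho, _ = stats.spearmanr(frame["conductivity"], frame["vb"])
    return float(rho)


pooled_rho = spearman_rho(demo)
within_rho = {grade: spearman_rho(group) for grade, group in demo.groupby("grade")}

print(f"Pooled rho: {pooled_rho:+.2f}")
for grade, rho in within_rho.items():
    print(f"  Grade {grade}: rho = {rho:+.2f}")
```

The pooled coefficient comes out strongly negative even though no within-grade relationship was built in, which is the pattern the within-grade analysis in this section is designed to detect.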

In [None]:
# Group-level correlation analysis
grade_corr_rows = []

if "steel_grade" in df.columns:
    for grade, group in df.groupby("steel_grade"):
        for col, label, _ in metric_info:
            mask = group["conductivity_IACS"].notna() & group[col].notna()
            n = mask.sum()
            if n < 4:
                continue
            
            x = group.loc[mask, "conductivity_IACS"]
            y = group.loc[mask, col]
            
            pear = pearson_correlation(x, y)
            spear = spearman_correlation(x, y)
            
            grade_corr_rows.append({
                "Steel Grade": grade,
                "Metric": col,
                "n": n,
                "Pearson_r": round(pear["r"], 4),
                "Pearson_p": round(pear["p_value"], 4),
                "Spearman_rho": round(spear["rho"], 4),
                "Spearman_p": round(spear["p_value"], 4),
            })

if grade_corr_rows:
    grade_corr_df = pd.DataFrame(grade_corr_rows)
    print("Within-Grade Correlations: Conductivity vs Machinability")
    print("=" * 80)
    display(grade_corr_df)
else:
    print("Insufficient data for within-grade correlation analysis.")
    grade_corr_df = pd.DataFrame()

In [None]:
# Visualise within-grade correlations
if len(grade_corr_df) > 0:
    for col, label, _ in metric_info:
        subset = grade_corr_df[grade_corr_df["Metric"] == col]
        if len(subset) == 0:
            continue
        
        fig, ax = plt.subplots(figsize=(8, 4))
        colors = ["steelblue" if p < 0.05 else "lightgray"
                  for p in subset["Spearman_p"]]
        bars = ax.barh(subset["Steel Grade"], subset["Spearman_rho"],
                       color=colors, edgecolor="black", linewidth=0.5)
        ax.axvline(0, color="black", linewidth=0.8)
        ax.set_xlabel("Spearman rho", fontsize=11)
        ax.set_title(f"Within-Grade Spearman Correlation: Conductivity vs {col}\n"
                     f"(blue = p < 0.05, gray = not significant)",
                     fontsize=12, fontweight="bold")
        ax.set_xlim(-1, 1)
        
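        for bar, row in zip(bars, subset.itertuples()):
            width = bar.get_width()
            # Place the label just beyond the bar end, on the side the bar points to
            ax.text(width + (0.03 if width >= 0 else -0.03),
                    bar.get_y() + bar.get_height() / 2,
                    f"rho={row.Spearman_rho:.2f} (p={row.Spearman_p:.3f})",
                    ha="left" if width >= 0 else "right",
                    va="center", fontsize=9)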
        
        fig.tight_layout()
        plt.show()

In [None]:
# Box plots of conductivity by steel grade
if "steel_grade" in df.columns and df["steel_grade"].nunique() > 1:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    # Conductivity by grade
    order = df.groupby("steel_grade")["conductivity_IACS"].median().sort_values().index
    sns.boxplot(data=df, x="steel_grade", y="conductivity_IACS", order=order,
                ax=axes[0], palette="Blues")
    axes[0].set_xlabel("Steel Grade", fontsize=11)
    axes[0].set_ylabel("Conductivity (%IACS)", fontsize=11)
    axes[0].set_title("Conductivity Distribution by Steel Grade", fontsize=12, fontweight="bold")
    axes[0].tick_params(axis="x", rotation=45)
    
    # VB by grade (if available)
    vb_data = df.dropna(subset=["VB_mm"])
    if len(vb_data) > 0:
        order_vb = vb_data.groupby("steel_grade")["VB_mm"].median().sort_values().index
        sns.boxplot(data=vb_data, x="steel_grade", y="VB_mm", order=order_vb,
                    ax=axes[1], palette="Reds")
        axes[1].set_xlabel("Steel Grade", fontsize=11)
        axes[1].set_ylabel("Flank Wear VB (mm)", fontsize=11)
        axes[1].set_title("Flank Wear Distribution by Steel Grade", fontsize=12, fontweight="bold")
        axes[1].tick_params(axis="x", rotation=45)
    
    fig.tight_layout()
    plt.show()

---
## 7. MLflow Logging

Log the key correlation results to MLflow for experiment tracking and
reproducibility.

In [None]:
# Log results to MLflow
with mlflow.start_run(run_name="01_correlation_analysis"):
    # Log parameters
    mlflow.log_params({
        "data_source": data_source,
        "n_observations": len(df),
        "n_steel_grades": df["steel_grade"].nunique() if "steel_grade" in df.columns else 0,
        "analysis_type": "pearson_spearman_correlation",
        "significance_threshold": 0.05,
    })
    
    # Log correlation metrics
    if len(corr_results) > 0:
        for _, row in corr_results.iterrows():
            metric_name = row["Metric"].replace("_", "-")
            mlflow.log_metrics({
                f"{metric_name}_pearson_r": row["Pearson_r"],
                f"{metric_name}_pearson_p": row["Pearson_p"],
                f"{metric_name}_spearman_rho": row["Spearman_rho"],
                f"{metric_name}_spearman_p": row["Spearman_p"],
                f"{metric_name}_n_pairs": row["n"],
            })
    
    # Log the correlation results table as an artifact
    if len(corr_results) > 0:
        results_path = Path("correlation_results.csv")
        corr_results.to_csv(results_path, index=False)
        mlflow.log_artifact(str(results_path))
        results_path.unlink()  # clean up temp file
    
    if len(grade_corr_df) > 0:
        grade_path = Path("grade_correlations.csv")
        grade_corr_df.to_csv(grade_path, index=False)
        mlflow.log_artifact(str(grade_path))
        grade_path.unlink()
    
    print("Results logged to MLflow.")
    print(f"Run ID: {mlflow.active_run().info.run_id}")

---
## 8. Results Summary

### Key Findings

In [None]:
# Programmatic summary
print("=" * 70)
print("RESULTS SUMMARY: RQ1 Correlation Analysis")
print("=" * 70)
print(f"Data source: {data_source}")
print(f"Total observations: {len(df)}")
if "steel_grade" in df.columns:
    print(f"Steel grades: {', '.join(df['steel_grade'].unique())}")
print()

if len(corr_results) > 0:
    print("Overall Correlations (conductivity vs machinability):")
    print("-" * 70)
    for _, row in corr_results.iterrows():
        direction = "negative" if row["Pearson_r"] < 0 else "positive"
        strength = (
            "strong" if abs(row["Pearson_r"]) > 0.7 else
            "moderate" if abs(row["Pearson_r"]) > 0.4 else
            "weak"
        )
        sig_text = "SIGNIFICANT" if row["Pearson_p"] < 0.05 else "NOT significant"
        print(f"  {row['Metric']:12s}: r = {row['Pearson_r']:+.4f}, "
              f"rho = {row['Spearman_rho']:+.4f}, "
              f"n = {row['n']}, "
              f"{strength} {direction}, {sig_text}")
    
    n_significant = sum(1 for _, r in corr_results.iterrows() if r["Pearson_p"] < 0.05)
    print(f"\n{n_significant}/{len(corr_results)} metrics show statistically significant "
          f"Pearson correlation with conductivity (p < 0.05).")

print()
if data_source == "synthetic":
    print("IMPORTANT: These results are based on SYNTHETIC data generated for")
    print("pipeline validation. Re-run this notebook once the evidence matrix")
    print("contains sufficient real data (>= 10 paired observations).")

### Interpretation

**Expected pattern (from hypothesis):**  
Negative correlations between conductivity and all three machinability metrics
(VB, Ra, Fc), meaning that steels with higher electrical conductivity tend to
be easier to machine (lower tool wear, smoother surfaces, lower cutting forces).

**Physical explanation:**  
Higher conductivity in low-alloy steels reflects:
- Fewer lattice defects and dislocations
- More ordered, softer microstructures (ferrite-dominated)
- Fewer hard precipitates scattering electrons

These same microstructural features reduce resistance to cutting, producing
lower forces and temperatures, which in turn reduces tool wear and improves
surface finish.

**Caveats:**
- Between-grade variation may dominate: steels with very different compositions
  naturally cluster, which can inflate the overall correlation. Within-grade
  analysis is essential to confirm the relationship is not solely a composition proxy.
- Temperature, cutting parameters, and tool geometry are confounders not yet
  controlled for in this bivariate analysis.
- Spearman correlation is preferred over Pearson when the relationship may be
  monotonic but non-linear.

### Next Steps

- [ ] Populate the evidence matrix with real data from systematic literature review
- [ ] Control for confounders (cutting parameters, tool type) using partial correlation
- [ ] Perform within-grade analysis with sufficient sample sizes (n >= 10 per grade)
- [ ] Investigate non-linear models (RQ2) if scatter plots suggest curvature
- [ ] Proceed to `02_regression_models.ipynb` for predictive modelling (RQ3)
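
The partial-correlation item in the list above could be approached as in the following sketch. It uses the residual method: regress both variables on the confounder, then correlate the residuals. The function `partial_correlation` is a hypothetical helper, not part of the project modules. The data are contrived so that cutting speed drives both variables, and they make no claim about real steels.

```python
import numpy as np


def partial_correlation(x, y, covariates):
    """Pearson correlation of x and y after removing linear effects of covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(covariates, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    # Design matrix with an intercept column plus the covariates
    design = np.column_stack([np.ones(len(x)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Contrived data: both variables depend on cutting speed, with no direct link
rng = np.random.default_rng(2)
cutting_speed = rng.uniform(100, 300, 200)
conductivity = 2.0 + 0.03 * cutting_speed + rng.normal(0, 0.8, 200)
cutting_force = 1100 - 2.0 * cutting_speed + rng.normal(0, 40, 200)

raw_r = float(np.corrcoef(conductivity, cutting_force)[0, 1])
partial_r = partial_correlation(conductivity, cutting_force, cutting_speed)

print(f"Raw Pearson r     = {raw_r:+.3f}")
print(f"Partial r | speed = {partial_r:+.3f}")
```

In this toy case the raw correlation is strong while the partial correlation is close to zero, the signature of an association carried entirely by the confounder. With real data, a partial correlation that stays clearly non-zero would support a direct conductivity-machinability link.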