| Category                     | Example Columns                                                     | Use                                                                               |
| ---------------------------- | ------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| **Chemical composition**     | C, Si, Mn, Ni, Cr, Mo, W, etc.                                      | Predict martensitic transformation, solid-solution strengthening, and weldability |
| **Process parameters**       | Current, Voltage, Heat input, Interpass temperature, PWHT temp/time | Define energy input, cooling rate, and phase transformation control               |
| **Mechanical properties**    | Yield strength, UTS, Elongation, Charpy impact toughness, Hardness  | Primary quality indicators                                                        |
| **Microstructural features** | % Ferrite, % Martensite, % Acicular ferrite, etc.                   | Validate process–structure relationships                                          |
| **Metadata**                 | Weld ID, Type of weld                                               | Sample identity or grouping for model training                                    |


| Metric                              | Column(s)                         | Interpretation                                     |
| ----------------------------------- | --------------------------------- | -------------------------------------------------- |
| **Yield Strength (YS)**             | `Yield strength / MPa`            | Base measure of load-bearing capacity              |
| **Ultimate Tensile Strength (UTS)** | `Ultimate tensile strength / MPa` | Correlates with YS; used for strength ratio checks |
| **Elongation (El%)**                | `Elongation / %`                  | Ductility; inversely related to Ni and C           |
| **Charpy Toughness (CVN)**          | `Charpy impact toughness / J`     | Resistance to brittle fracture                     |
| **Hardness (HV)**                   | `Hardness / kgmm-2`               | Local strength / microstructural indicator         |


In [2]:
import pandas as pd
import numpy as np

In [7]:
# load dataset
df = pd.read_csv("../data/clean_weld_quality_dataset.csv")

In [9]:
# Basic cleaning and unit normalization

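# Replace missing P and S with mean values (as done in the thesis)
for col in ["Sulphur concentration / (weight%)", "Phosphorus concentration / (weight%)"]:
    # Assign back: fillna(inplace=True) on a column selection is chained assignment
    # and may not modify df in recent pandas versions
    df[col] = df[col].fillna(df[col].mean())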

# Replace other missing alloying elements with 0 (not deliberate additions)
elements_to_zero = [
    "Nickel concentration / (weight%)", "Chromium concentration / (weight%)",
    "Molybdenum concentration / (weight%)", "Vanadium concentration / (weight%)",
    "Copper concentration / (weight%)", "Cobalt concentration / (weight%)",
    "Tungsten concentration / (weight%)", "Titanium concentration / parts per million by weight",
    "Aluminium concentration / parts per million by weight", "Boron concentration / parts per million by weight",
    "Niobium concentration / parts per million by weight", "Tin concentration / parts per million by weight",
    "Arsenic concentration / parts per million by weight", "Antimony concentration / parts per million by weight"
]
df[elements_to_zero] = df[elements_to_zero].fillna(0)
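The two imputation rules in the cell above can be checked on a small toy frame. This is a minimal sketch: the short column names `S` and `Ni` and all values are hypothetical stand-ins, not the dataset's real headers or data. It also records which cells were missing before filling, which is an optional extra step and not something the original notebook does.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical column names and values (illustration only).
toy = pd.DataFrame({
    "S": [0.010, np.nan, 0.020],   # impurity: always present, so impute the mean
    "Ni": [np.nan, 1.5, np.nan],   # deliberate addition: missing is read as "not added"
})

# Remember which cells were missing so imputed rows stay identifiable later.
was_missing = toy.isna()

# Assign the filled column back instead of calling fillna(inplace=True).
toy["S"] = toy["S"].fillna(toy["S"].mean())
toy["Ni"] = toy["Ni"].fillna(0)

print(toy)
print(was_missing.sum())
```

Keeping the `was_missing` mask makes it possible to test later whether a model's behaviour depends on imputed rows.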



Carbon Equivalent (CE) is a single numerical value that combines the effects of the alloying elements (C, Mn, Cr, Mo, V, Ni, Cu, etc.) into one number that represents how hardenable, and therefore how crack-susceptible, a steel or weld metal is.

Think of it as a weldability index: it indicates how readily a steel forms hard, brittle martensite as the weld cools.

When welding steel, if cooling is too fast and the alloy is rich in hardening elements (such as C, Cr, Mo, V, Ni):
-  the microstructure becomes hard and brittle,
-  which leads to cracking, low toughness, and poor ductility.

CE estimates the risk of that happening.

Formula (IIW carbon equivalent, all concentrations in weight%):   $CE = C + \frac{Mn}{6} + \frac{Cr + Mo + V}{5} + \frac{Ni + Cu}{15}$



[[source]](https://ewi.org/wp-content/uploads/2016/06/Great-Minds-of-Carbon-Equivalent_Part-3-Wang.pdf)

| CE Value        | Weldability | Recommendation                                                        |
| --------------- | ----------- | --------------------------------------------------------------------- |
| **< 0.40**      | Excellent   | Easy to weld; minimal preheat needed                                  |
| **0.40 – 0.50** | Fair        | May need moderate preheat / controlled cooling                        |
| **> 0.50**      | Poor        | High hardenability; risk of cracking; must preheat and PWHT carefully |
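As a worked illustration of the formula and the table above, the sketch below computes CE for a single composition and maps it to a weldability band. The function names and the example composition are hypothetical. Treating a CE of exactly 0.50 as "Fair" is an assumption, because the table does not say which band owns the boundary values.

```python
def carbon_equivalent(c, mn, cr=0.0, mo=0.0, v=0.0, ni=0.0, cu=0.0):
    """Carbon equivalent from concentrations in weight%."""
    return c + mn / 6 + (cr + mo + v) / 5 + (ni + cu) / 15


def weldability_band(ce):
    """Map a CE value to the bands of the table (boundary handling is an assumption)."""
    if ce < 0.40:
        return "Excellent"
    if ce <= 0.50:
        return "Fair"
    return "Poor"


# Hypothetical weld-metal composition in weight% (not taken from the dataset).
ce = carbon_equivalent(c=0.10, mn=1.20, cr=0.50, mo=0.25, ni=0.30)
band = weldability_band(ce)
print(f"CE = {ce:.2f} -> {band}")
```

Term by term, the example gives 0.10 + 1.20/6 + (0.50 + 0.25)/5 + 0.30/15 = 0.10 + 0.20 + 0.15 + 0.02 = 0.47, which falls in the 0.40 – 0.50 band.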


In [10]:
# Convert ppm-based elements to weight% so all compositions share one unit
# (none of these elements enter the CE formula itself)
ppm_cols = [
    "Titanium concentration / parts per million by weight",
    "Nitrogen concentration / parts per million by weight",
    "Aluminium concentration / parts per million by weight",
    "Boron concentration / parts per million by weight",
    "Niobium concentration / parts per million by weight",
    "Tin concentration / parts per million by weight",
    "Arsenic concentration / parts per million by weight",
    "Antimony concentration / parts per million by weight"
]
for col in ppm_cols:
    df[col + " (wt%)"] = df[col] / 1e4  # weight percent instead of ppm (1 wt% = 10,000 ppm)
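The factor of `1e4` follows from the definitions: 1 wt% is one part per hundred and 1 ppm is one part per million, so 1 wt% = 10,000 ppm. A minimal sketch of the conversion follows; the helper name, the constant name, and the values are hypothetical.

```python
import pandas as pd

PPM_PER_WT_PERCENT = 1e4  # 1 wt% = 1/100, 1 ppm = 1/1,000,000, so 1 wt% = 10,000 ppm


def ppm_to_wt_percent(ppm):
    """Convert parts per million by weight to weight percent."""
    return ppm / PPM_PER_WT_PERCENT


# Hypothetical titanium contents in ppm (illustration only).
ti_ppm = pd.Series([0, 50, 400], name="Ti / ppm")
ti_wt = ppm_to_wt_percent(ti_ppm)
print(ti_wt)
```

The function works for both scalars and pandas Series, since it uses only division.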

In [11]:
df["Carbon Equivalent (CE)"] = (
    df["Carbon concentration / (weight%)"]
    + df["Manganese concentration / (weight%)"] / 6
    + (df["Chromium concentration / (weight%)"]
       + df["Molybdenum concentration / (weight%)"]
       + df["Vanadium concentration / (weight%)"]) / 5
    + (df["Nickel concentration / (weight%)"]
       + df["Copper concentration / (weight%)"]) / 15
)

#### Metrics

In [None]:




# Impurity index: combined S + P content (weight%)
df["Impurity Index (II)"] = (
    df["Sulphur concentration / (weight%)"] + df["Phosphorus concentration / (weight%)"]
)

# Simple solid solution strengthening index (MPa estimate)
df["SS Strengthening (MPa)"] = (
    105 * df["Silicon concentration / (weight%)"]
    + 30 * df["Manganese concentration / (weight%)"]
)

# === 4️⃣ Mechanical Metrics ===

df["Strength Ratio (SR)"] = df["Ultimate tensile strength / MPa"] / df["Yield strength / MPa"]

# Define Mechanical Quality Index (normalized)
ys_target, uts_target, el_target, cvn_target = 600, 700, 20, 100  # adjust as desired
df["MQI"] = (
    0.25 * (df["Yield strength / MPa"] / ys_target)
    + 0.25 * (df["Ultimate tensile strength / MPa"] / uts_target)
    + 0.25 * (df["Elongation / %"] / el_target)
    + 0.25 * (df["Charpy impact toughness / J"] / cvn_target)
)

# === 5️⃣ Process Metrics ===

# Normalize heat input (lower is better)
HI = df["Heat input / kJmm-1"]
df["HI_norm"] = (HI.max() - HI) / (HI.max() - HI.min())

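# Fill missing interpass temperature (median) and PWHT time (0, treated as no PWHT)
# Assign back rather than using fillna(inplace=True) on a column selection
df["Interpass temperature / °C"] = df["Interpass temperature / °C"].fillna(
    df["Interpass temperature / °C"].median()
)
df["Post weld heat treatment time / hours"] = df["Post weld heat treatment time / hours"].fillna(0)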

df["Process Stability Index (PSI)"] = (
    0.5 * df["HI_norm"]
    + 0.3 * (df["Interpass temperature / °C"] / df["Interpass temperature / °C"].max())
    + 0.2 * (df["Post weld heat treatment time / hours"] / df["Post weld heat treatment time / hours"].max())
)

# === 6️⃣ Microstructural Metrics (if available) ===

if "Acicular ferrite / %" in df.columns and "Martensite / %" in df.columns:
    df["Microstructural Balance (MB)"] = (
        df["Acicular ferrite / %"] / (df["Martensite / %"] + 1)
    )
else:
    df["Microstructural Balance (MB)"] = np.nan

# === 7️⃣ Final Composite Weld Quality Index (WQI) ===

# Normalize key metrics for 0–1 range
def normalize(series):
    return (series - series.min()) / (series.max() - series.min())

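# Note: NaN in any component propagates to WQI. MB is all-NaN when the
# microstructure columns are absent, and classify_wqi would then return "Poor".
df["WQI"] = (
    0.3 * normalize(df["MQI"])
    + 0.2 * normalize(df["Process Stability Index (PSI)"])  # full column name as created above
    + 0.2 * normalize(df["Microstructural Balance (MB)"])
    + 0.2 * (1 - normalize(df["Carbon Equivalent (CE)"]))  # lower CE = better weldability
    + 0.1 * normalize(df["SS Strengthening (MPa)"])
)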

# === 8️⃣ Interpret Quality Classes ===

def classify_wqi(wqi):
    if wqi >= 0.8:
        return "Excellent"
    elif wqi >= 0.6:
        return "Acceptable"
    else:
        return "Poor"

df["Quality_Class"] = df["WQI"].apply(classify_wqi)

# === 9️⃣ Save Results ===

df.to_csv("weld_quality_metrics.csv", index=False)

print(df[["Weld ID", "WQI", "Quality_Class"]].head())
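The min-max `normalize` helper and the weighted sum above return NaN when a component is entirely missing (as `Microstructural Balance (MB)` is when the microstructure columns are absent) and divide by zero when a column is constant. One possible way to make the composite more robust is sketched below. This is an assumption, not the notebook's method: `safe_normalize`, `weighted_index`, the toy column names, and the weights are all hypothetical. Renormalizing the weights over the components that are present changes the meaning of the index compared with the fixed-weight WQI.

```python
import numpy as np
import pandas as pd


def safe_normalize(series):
    """Min-max scale to [0, 1]; all-NaN or constant input yields all NaN."""
    lo, hi = series.min(), series.max()
    if pd.isna(lo) or hi == lo:
        return pd.Series(np.nan, index=series.index)
    return (series - lo) / (hi - lo)


def weighted_index(components, weights):
    """Row-wise weighted mean over the components that are not NaN.

    components: DataFrame of scores already scaled to [0, 1]
    weights: dict mapping column name -> weight
    """
    w = pd.Series(weights).reindex(components.columns)
    present = components.notna()
    weighted_sum = (components * w).sum(axis=1, skipna=True)
    weight_total = present.mul(w, axis=1).sum(axis=1)
    # Rows with no available component get NaN instead of a division by zero.
    return weighted_sum / weight_total.where(weight_total > 0)


# Toy data with hypothetical names: "mb" is entirely missing.
toy = pd.DataFrame({
    "mqi": [0.2, 0.6, 1.0],
    "mb": [np.nan, np.nan, np.nan],
})
scores = pd.DataFrame({
    "mqi": safe_normalize(toy["mqi"]),
    "mb": safe_normalize(toy["mb"]),
})
wqi = weighted_index(scores, {"mqi": 0.6, "mb": 0.4})
print(wqi)
```

In the toy case the index falls back to the `mqi` score alone, so the three rows still receive distinct values instead of NaN.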
