# Part 2 — Structural Looseness Prediction


# Structural Looseness – Physical Hypothesis and Literature Background

## Physical Context

Structural looseness typically involves insufficient clamping force in non-rotating components (e.g., loose bolts in motor bases or supports).

Unlike lubrication-related faults — which often produce broadband high-frequency energy due to surface interaction and micro-impacts — structural looseness primarily alters the global dynamic behavior of the system.

Looseness changes structural stiffness and introduces non-linear contact behavior, which manifests predominantly as:

- Increased vibration at 1× rotational frequency
- Presence of harmonics (2×, 3×, ...)
- Possible sub-harmonics (0.5×, 1/3×)
- Low-frequency amplitude amplification
- Amplitude modulation effects

This behavior is consistent with classical vibration analysis references and experimental studies on bolted structures.
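The symptoms listed above can be read directly off a spectrum once the running speed is known. The following is a minimal sketch, not the project's implementation: `order_amplitudes`, the Hann window, and the ±2 Hz search tolerance are all assumptions chosen for illustration. It converts rpm to Hz and picks the peak amplitude near each multiple of running speed, here on a synthetic signal.

```python
import numpy as np

def order_amplitudes(x, fs, rpm, orders=(0.5, 1, 2, 3), tol_hz=2.0):
    """Peak single-sided FFT amplitude near each multiple of running speed.

    Hypothetical helper: the window choice and tol_hz are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # remove DC offset
    window = np.hanning(len(x))
    # Scale so that a sine of amplitude A yields a spectral peak close to A
    spec = np.abs(np.fft.rfft(x * window)) * 2.0 / window.sum()
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    f_rot = rpm / 60.0                        # running speed in Hz
    out = {}
    for k in orders:
        band = (freqs >= k * f_rot - tol_hz) & (freqs <= k * f_rot + tol_hz)
        out[k] = float(spec[band].max()) if band.any() else float("nan")
    return out

# Synthetic signal: 1x component plus a weaker 2x harmonic
fs, rpm = 4000.0, 1598
t = np.arange(8192) / fs
f_rot = rpm / 60.0
x = 1.0 * np.sin(2 * np.pi * f_rot * t) + 0.4 * np.sin(2 * np.pi * 2 * f_rot * t)

amps = order_amplitudes(x, fs, rpm)
harmonic_ratio_2x = amps[2] / amps[1]         # one possible harmonic-ratio feature
print(amps, harmonic_ratio_2x)
```

Because the search bands are defined in multiples of running speed rather than in fixed Hz, the same function can be applied to recordings with different sampling rates and lengths.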

---

## Literature Support

### 1) Bolted Joint Looseness and Dynamic Response

Review studies on looseness detection in bolted structures indicate that loosening reduces joint stiffness and modifies system modal properties, leading to observable changes in low-frequency vibration behavior.

- Changes in dynamic stiffness affect vibration amplitudes near excitation frequencies.
- Structural looseness is often detected via spectral analysis in low-frequency bands.

(See: Review papers on vibration-based detection of looseness in bolted structures.)

---

### 2) Rotational Machinery Diagnostics

Industrial vibration analysis references (e.g., SKF vibration guidelines) indicate that:

- Mechanical faults such as looseness, misalignment, and unbalance primarily appear at 1× rotational frequency and its harmonics.
- Spectral peaks at harmonics of running speed are strong indicators of mechanical looseness.
- Sub-harmonics and modulation may appear when contact becomes intermittent.

This contrasts with lubrication-related degradation, which elevates broadband high-frequency energy.

---

## Implications for This Case

Part 1 of this project focused on broadband high-frequency elevation (carpet-like behavior), conceptually aligned with lubrication-related faults.

Part 2 focuses on structural looseness, which is expected to manifest predominantly in:

- Low-frequency components
- Harmonics and sub-harmonics of rotational frequency
- Increased amplitude at 1× and multiples

Given that rotational speed (rpm) is provided in Part 2 metadata, frequency components synchronized with running speed become physically meaningful and robust across sensors and sampling rates.

Therefore, the feature engineering strategy for Part 2 will prioritize:

- Rotation-synchronous spectral components
- Low-frequency band energy
- Harmonic ratios
- Modulation indicators
- Complementary time-domain impulsiveness metrics (if present)

This physics-driven approach keeps the features interpretable and should improve generalization across different acquisition setups.
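As an illustration of two of the feature families listed above, the sketch below computes a low-frequency band energy ratio and two time-domain impulsiveness metrics. The function names and the cutoff at 10× running speed are assumptions made for this example; they are not taken from the project code.

```python
import numpy as np

def low_band_energy_ratio(x, fs, rpm, max_order=10.0):
    """Fraction of spectral energy at or below max_order x running speed (assumed cutoff)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    cutoff_hz = max_order * rpm / 60.0
    total = power.sum()
    return float(power[freqs <= cutoff_hz].sum() / total) if total > 0 else 0.0

def crest_factor(x):
    """Peak absolute value divided by RMS."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    return float(np.max(np.abs(x)) / rms) if rms > 0 else 0.0

def kurtosis(x):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    var = np.mean(d ** 2)
    return float(np.mean(d ** 4) / var ** 2) if var > 0 else 0.0

# Synthetic check on a pure 1x sine
fs, rpm = 4000.0, 1598
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * (rpm / 60.0) * t)

ratio = low_band_energy_ratio(x, fs, rpm)
cf = crest_factor(x)
kurt = kurtosis(x)
print(ratio, cf, kurt)
```

For a clean sine, the crest factor is close to √2 and the kurtosis close to 1.5; values noticeably above those baselines would point to impulsive content.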


## Phase 1 — Data preparation

This notebook covers:
- Train/test metadata loading
- Orientation mapping parsing/validation
- Samples index construction (train labeled / train unlabeled / test)

In [1]:
from pathlib import Path
import pandas as pd
import numpy as np
from tractian_cm.io.metadata_part2 import load_train_metadata_part2, load_test_metadata_part2
from tractian_cm.part2.sample_index import build_part2_samples_index
from tractian_cm.io.loaders import load_raw_triaxial_part2_csv
from tractian_cm.part2.orientation import to_hva_waves

In [2]:
# Adjust these paths to your repo layout
REPO_ROOT = Path("..")  # notebooks/ -> repo root
DATA_DIR = REPO_ROOT / "data" / "part_2" / "data"           # <-- adjust if needed
TEST_DATA_DIR = REPO_ROOT / "data" / "part_2" / "test_data" # <-- adjust if needed

TRAIN_MD_PATH = REPO_ROOT / "data" / "part_2" / "part_3_metadata.csv"
TEST_MD_PATH  = REPO_ROOT / "data" / "part_2" / "test_metadata.csv"

print("DATA_DIR:", DATA_DIR)
print("TEST_DATA_DIR:", TEST_DATA_DIR)
print("TRAIN_MD_PATH:", TRAIN_MD_PATH)
print("TEST_MD_PATH:", TEST_MD_PATH)


DATA_DIR: ..\data\part_2\data
TEST_DATA_DIR: ..\data\part_2\test_data
TRAIN_MD_PATH: ..\data\part_2\part_3_metadata.csv
TEST_MD_PATH: ..\data\part_2\test_metadata.csv


In [3]:
train_md = load_train_metadata_part2(str(TRAIN_MD_PATH))
test_md  = load_test_metadata_part2(str(TEST_MD_PATH))

display(train_md.head(), test_md.head())

Unnamed: 0,sample_id,label,condition,rpm,sensor_id,orientation
0,007b7aba-18a5-5e4a-a887-e1de8cce30f2,True,structural_looseness,1598,VLQ4172,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
1,0123d223-8578-5c3c-997f-de2e7c3df494,False,healthy,1598,UKK6686,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."
2,0279ceb1-110e-5460-9e8a-75bf69d185bb,True,structural_looseness,1598,VLQ4172,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
3,054f22ed-91c6-5e06-ad80-fef14e34cf6d,True,structural_looseness,1598,VLQ4172,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
4,05a1c152-629f-5db4-99e1-bba33daee471,True,structural_looseness,1598,VLQ4172,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."


Unnamed: 0,sample_id,rpm,asset,orientation
0,01e98ad9-23c9-5986-ace0-4519bad71198,1785,bearing,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
1,1dab1534-b8a8-5962-b01c-bff0782d54a9,3545,compressor,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."
2,2211750b-6672-5a94-bd40-cda811f69d01,2025,fan,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
3,33542920-30ea-5844-861d-2c82d79087b8,1170,electric-motor,"{'axisX': 'vertical', 'axisY': 'horizontal', '..."
4,680bbcbf-b1c8-544d-8f80-bf763cdcd128,3573,compressor,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."


## Orientation sanity checks
- orientation keys must be axisX/axisY/axisZ
- values must be exactly one of each: horizontal/vertical/axial
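The two rules above can be expressed as a small standalone check. This is a sketch only: `validate_orientation` is a hypothetical helper, and the assumption that the metadata may store the mapping as a dict-like string is inferred from how it is displayed in this notebook, not from the `tractian_cm` source.

```python
import ast

EXPECTED_AXES = {"axisX", "axisY", "axisZ"}
EXPECTED_DIRECTIONS = {"horizontal", "vertical", "axial"}

def validate_orientation(orientation):
    """Return the orientation as a dict, or raise if it breaks either rule."""
    # Assumption: the raw CSV may hold the mapping as a string such as "{'axisX': ...}"
    if isinstance(orientation, str):
        orientation = ast.literal_eval(orientation)
    if not isinstance(orientation, dict):
        raise TypeError(f"orientation must be a dict, got {type(orientation).__name__}")
    if set(orientation) != EXPECTED_AXES:
        raise ValueError(f"unexpected axis keys: {sorted(orientation)}")
    # Each direction must appear exactly once, so compare sorted lists, not sets
    if sorted(orientation.values()) != sorted(EXPECTED_DIRECTIONS):
        raise ValueError(f"unexpected directions: {list(orientation.values())}")
    return orientation

valid = validate_orientation("{'axisX': 'horizontal', 'axisY': 'axial', 'axisZ': 'vertical'}")
print(valid)
```

Comparing sorted value lists rather than sets matters here: a mapping with a missing or unknown direction has the right number of entries but fails the list comparison.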


In [4]:
# quick sanity summaries
print("Train labeled samples:", len(train_md))
print("Test samples:", len(test_md))

# show the first few orientation mappings from each split
display(train_md["orientation"].head(3).tolist(), test_md["orientation"].head(3).tolist())


Train labeled samples: 250
Test samples: 7


[{'axisX': 'horizontal', 'axisY': 'axial', 'axisZ': 'vertical'},
 {'axisX': 'vertical', 'axisY': 'axial', 'axisZ': 'horizontal'},
 {'axisX': 'horizontal', 'axisY': 'axial', 'axisZ': 'vertical'}]

[{'axisX': 'horizontal', 'axisY': 'axial', 'axisZ': 'vertical'},
 {'axisX': 'vertical', 'axisY': 'axial', 'axisZ': 'horizontal'},
 {'axisX': 'horizontal', 'axisY': 'axial', 'axisZ': 'vertical'}]

## Build unified samples index
This index will drive EDA and training later, and it will be reused by scripts and the webapp.


In [5]:
samples_index = build_part2_samples_index(
    data_dir=str(DATA_DIR),
    test_data_dir=str(TEST_DATA_DIR),
    train_metadata_path=str(TRAIN_MD_PATH),
    test_metadata_path=str(TEST_MD_PATH),
)

samples_index.head(10)


Unnamed: 0,sample_id,split,filepath,rpm,sensor_id,asset,label,condition,orientation
0,01e98ad9-23c9-5986-ace0-4519bad71198,test,..\data\part_2\test_data\01e98ad9-23c9-5986-ac...,1785,,bearing,,,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
1,1dab1534-b8a8-5962-b01c-bff0782d54a9,test,..\data\part_2\test_data\1dab1534-b8a8-5962-b0...,3545,,compressor,,,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."
2,2211750b-6672-5a94-bd40-cda811f69d01,test,..\data\part_2\test_data\2211750b-6672-5a94-bd...,2025,,fan,,,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
3,33542920-30ea-5844-861d-2c82d79087b8,test,..\data\part_2\test_data\33542920-30ea-5844-86...,1170,,electric-motor,,,"{'axisX': 'vertical', 'axisY': 'horizontal', '..."
4,680bbcbf-b1c8-544d-8f80-bf763cdcd128,test,..\data\part_2\test_data\680bbcbf-b1c8-544d-8f...,3573,,compressor,,,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."
5,9f3b933a-1bc3-5093-9dee-800cc03c6b1d,test,..\data\part_2\test_data\9f3b933a-1bc3-5093-9d...,1590,,bearing,,,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
6,e057600e-3b4e-58ba-b8b8-357169ae6bf6,test,..\data\part_2\test_data\e057600e-3b4e-58ba-b8...,1800,,spindle,,,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
7,007b7aba-18a5-5e4a-a887-e1de8cce30f2,train_labeled,..\data\part_2\data\007b7aba-18a5-5e4a-a887-e1...,1598,VLQ4172,,True,structural_looseness,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."
8,0123d223-8578-5c3c-997f-de2e7c3df494,train_labeled,..\data\part_2\data\0123d223-8578-5c3c-997f-de...,1598,UKK6686,,False,healthy,"{'axisX': 'vertical', 'axisY': 'axial', 'axisZ..."
9,0279ceb1-110e-5460-9e8a-75bf69d185bb,train_labeled,..\data\part_2\data\0279ceb1-110e-5460-9e8a-75...,1598,VLQ4172,,True,structural_looseness,"{'axisX': 'horizontal', 'axisY': 'axial', 'axi..."


In [6]:
samples_index["split"].value_counts(dropna=False)


split
train_labeled    250
test               7
Name: count, dtype: int64

## Key takeaways
- Metadata parsed successfully and orientation mappings validated
- A master samples index was built; in this dataset it contains only `train_labeled` (250) and `test` (7) samples, with no `train_unlabeled` rows
- Next: raw CSV loader normalization and waveform-level sanity checks


## Waveform-level sanity checks (Part 2)

This section validates raw signals:
- Schema normalization (data vs test_data)
- Time monotonicity
- Estimated sampling frequency (fs_est)
- Sample length consistency
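For reference, the time-axis checks in this list can be sketched in a few lines. This is an assumption about how such checks might be done, not a description of what `load_raw_triaxial_part2_csv` does internally; `check_time_axis` and its output keys are hypothetical, and the median of the time differences is one of several reasonable estimators for the sampling frequency.

```python
import numpy as np

def check_time_axis(t):
    """Basic sanity metrics for a time vector given in seconds."""
    t = np.asarray(t, dtype=float)
    if t.size < 2:
        raise ValueError("need at least two time stamps")
    dt = np.diff(t)
    median_dt = np.median(dt)
    return {
        "n_samples": int(t.size),
        "is_monotonic": bool(np.all(dt > 0)),       # strictly increasing time
        "fs_est_hz": float(1.0 / median_dt),        # robust to a few irregular steps
        "dt_rel_std": float(np.std(dt) / median_dt) # relative timing jitter
    }

# Synthetic time vector: 2048 samples at a nominal 4 kHz
t = np.arange(2048) / 4000.0
report = check_time_axis(t)
print(report)
```

A large `dt_rel_std` or a failed monotonicity check would suggest resampling or dropping the record before any FFT-based feature is computed.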


In [7]:
# pick a few samples from each split
# (train_unlabeled has no rows in this index, so it contributes nothing to the concat)
samples_labeled = samples_index[samples_index["split"] == "train_labeled"].head(2)
samples_unlabeled = samples_index[samples_index["split"] == "train_unlabeled"].head(2)
samples_test = samples_index[samples_index["split"] == "test"].head(2)

samples_to_check = pd.concat([samples_labeled, samples_unlabeled, samples_test], ignore_index=True)
samples_to_check[["sample_id", "split", "filepath"]]


Unnamed: 0,sample_id,split,filepath
0,007b7aba-18a5-5e4a-a887-e1de8cce30f2,train_labeled,..\data\part_2\data\007b7aba-18a5-5e4a-a887-e1...
1,0123d223-8578-5c3c-997f-de2e7c3df494,train_labeled,..\data\part_2\data\0123d223-8578-5c3c-997f-de...
2,01e98ad9-23c9-5986-ace0-4519bad71198,test,..\data\part_2\test_data\01e98ad9-23c9-5986-ac...
3,1dab1534-b8a8-5962-b01c-bff0782d54a9,test,..\data\part_2\test_data\1dab1534-b8a8-5962-b0...


In [8]:
rows = []
for _, r in samples_to_check.iterrows():
    raw = load_raw_triaxial_part2_csv(r["filepath"])
    rows.append({
        "sample_id": r["sample_id"],
        "split": r["split"],
        "schema": raw.schema,
        "n_samples": raw.n_samples,
        "fs_est_hz": raw.fs_est,
        "t0": float(raw.t[0]),
        "t_end": float(raw.t[-1]),
    })

sanity_df = pd.DataFrame(rows)
sanity_df


Unnamed: 0,sample_id,split,schema,n_samples,fs_est_hz,t0,t_end
0,007b7aba-18a5-5e4a-a887-e1de8cce30f2,train_labeled,part2_data,2048,4004.488829,0.0,0.511176
1,0123d223-8578-5c3c-997f-de2e7c3df494,train_labeled,part2_data,2048,4035.992675,0.0,0.507186
2,01e98ad9-23c9-5986-ace0-4519bad71198,test,part2_test,16384,7882.740562,0.0,2.078338
3,1dab1534-b8a8-5962-b01c-bff0782d54a9,test,part2_test,16384,7928.741966,0.0,2.06628


In [9]:
sanity_df.groupby(["split", "schema"])[["n_samples", "fs_est_hz"]].agg(["min","median","max"])


split,schema,n_samples_min,n_samples_median,n_samples_max,fs_est_hz_min,fs_est_hz_median,fs_est_hz_max
test,part2_test,16384,16384.0,16384,7882.740562,7905.741264,7928.741966
train_labeled,part2_data,2048,2048.0,2048,4004.488829,4020.240752,4035.992675


## Orientation mapping: axis → horizontal/vertical/axial

The case requires the model to receive waves in H/V/A directions, not raw axis X/Y/Z.
Here we validate the mapping using metadata orientation.
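Conceptually, the remapping is a relabeling of the three raw channels according to the metadata. The sketch below illustrates the idea with a hypothetical `remap_to_hva` function; it is not the signature or behavior of `to_hva_waves`, and the `x`/`y`/`z` and `h`/`v`/`a` key names are assumptions.

```python
import numpy as np

AXIS_TO_CHANNEL = {"axisX": "x", "axisY": "y", "axisZ": "z"}
DIRECTION_TO_KEY = {"horizontal": "h", "vertical": "v", "axial": "a"}

def remap_to_hva(waves_xyz, orientation):
    """Relabel raw x/y/z waveforms as h/v/a using the orientation mapping."""
    out = {}
    for axis, direction in orientation.items():
        channel = AXIS_TO_CHANNEL[axis]
        out[DIRECTION_TO_KEY[direction]] = np.asarray(waves_xyz[channel])
    if set(out) != {"h", "v", "a"}:
        raise ValueError(f"orientation does not cover h/v/a exactly once: {orientation}")
    return out

# Constant dummy waveforms make the relabeling easy to verify by eye
waves = {"x": np.full(4, 1.0), "y": np.full(4, 2.0), "z": np.full(4, 3.0)}
orientation = {"axisX": "vertical", "axisY": "axial", "axisZ": "horizontal"}

hva = remap_to_hva(waves, orientation)
print({k: v[0] for k, v in hva.items()})
```

With this mapping the sensor's X channel ends up as the vertical wave, Y as axial, and Z as horizontal, so downstream features always see directions in a consistent physical frame.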


In [18]:
# pick 2 labeled samples; samples_index already carries orientation, rpm and label,
# so no merge with train_md is needed (merging on sample_id would suffix the
# overlapping columns as rpm_x/rpm_y and label_x/label_y, which makes the
# column selection below raise a KeyError)
labeled = samples_index[samples_index["split"] == "train_labeled"].head(2)

labeled[["sample_id", "rpm", "label"]]