<a href="https://colab.research.google.com/github/RealThanosP/pred-main-mod/blob/main/Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Clone repository from GitHub

In [None]:
# Clone the repository, which includes the data, into the Colab session
!git clone https://github.com/RealThanosP/pred-main-mod

# Change your working directory inside the repository
%cd pred-main-mod

# Dataset:
Helwig, N., Pignanelli, E., & Schütze, A. (2015). Condition monitoring of hydraulic systems [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5CW21.


# Import libraries

In [164]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import xgboost as xgb
from sklearn.metrics import mean_squared_error
from scipy.fftpack import fft

# Getting the sensor files in a list



In [2]:
import os

folder_path = "data/condition+monitoring+of+hydraulic+systems"

# Define the list of sensors and their corresponding column names
sensors = [
    "PS1", "PS2", "PS3", "PS4", "PS5", "PS6",  # Pressure sensors
    "EPS1",  # Motor power
    "FS1", "FS2",  # Volume flow
    "TS1", "TS2", "TS3", "TS4",  # Temperature sensors
    "VS1",  # Vibration
    "CE",  # Cooling efficiency
    "CP",  # Cooling power
    "SE"  # Efficiency factor
]

# Initialize a list to hold the data from each txt file
all_data = []

In [114]:
# Gets all the text files from dataset
txt_files = [f for f in os.listdir(folder_path) if f.endswith('.txt')]

# Keeps only the files whose filename contains one of the sensor names
sensor_file_path_list = [f"{folder_path}/{f}" for f in txt_files if any(sensor in f for sensor in sensors)]
sensor_file_path_list.sort()
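
Substring matching works for this dataset, but it would also accept any `.txt` file whose name merely contains a sensor name such as `SE` or `CE`. A stricter variant is sketched below, assuming every sensor file is named exactly `<sensor>.txt`. The helper `select_sensor_files` and the demo filenames are hypothetical and not part of the notebook.

```python
import os

def select_sensor_files(filenames, sensor_names):
    """Keep only .txt files whose name without extension is exactly a sensor name."""
    wanted = set(sensor_names)
    return sorted(
        f for f in filenames
        if f.endswith(".txt") and os.path.splitext(f)[0] in wanted
    )

# Hypothetical directory listing; "CASE_notes.txt" contains "SE" as a substring
demo_files = ["PS1.txt", "EPS1.txt", "SE.txt", "profile.txt", "CASE_notes.txt"]
selected = select_sensor_files(demo_files, ["PS1", "EPS1", "SE"])
print(selected)
```

With exact matching, `CASE_notes.txt` and `profile.txt` are both dropped, whereas a substring test would keep the former.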

# Filter sensor files into different lists
Each list holds sensor file paths, so any list item can be passed directly to **pd.read_csv()**.

In [142]:
# Multiple sensors
pressure_sensors = [f for f in sensor_file_path_list if "PS" in f and "EPS" not in f]
temp_sensors = [f for f in sensor_file_path_list if "TS" in f]
flow_sensors = [f for f in sensor_file_path_list if "FS" in f]

# Single-sensor measurements (one file each)
vibration_sensor = [f for f in sensor_file_path_list if "VS" in f][0]
cooling_efficiency_sensor = [f for f in sensor_file_path_list if "CE" in f][0]
cooling_power_sensor = [f for f in sensor_file_path_list if "CP" in f][0]
efficiency_factor_sensor = [f for f in sensor_file_path_list if "SE" in f][0]
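
Because every entry is a plain file path, reading one sensor is a single `pd.read_csv` call with `sep="\t"` and `header=None`, the same arguments used when loading the pressure data in this notebook. The sketch below uses a small in-memory stand-in with hypothetical values instead of a real sensor file, assuming one cycle per row and one sample per column.

```python
import io
import pandas as pd

# Hypothetical stand-in for a sensor file: 2 cycles (rows) x 3 samples (columns),
# tab-separated with no header row.
fake_sensor_file = io.StringIO("1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n")

df_demo = pd.read_csv(fake_sensor_file, sep="\t", header=None)
print(df_demo.shape)  # (number of cycles, samples per cycle)
```

With a real file, the path taken from one of the lists above would replace `fake_sensor_file`.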

# Load Pressure Data

Takes a long time to load, so just run it once.

---


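Based on the dataset **documentation**, the pressure sensors are sampled at 100 Hz. Each cycle lasts 60 seconds, so every sensor contributes 6000 samples per cycle.

Concatenating the six pressure sensors therefore gives 36000 columns.

Each block of 6000 columns contains the data of the corresponding sensor, PS1 to PS6.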

In [147]:
df_pressure = pd.concat([pd.read_csv(f, sep="\t", header=None) for f in pressure_sensors], axis=1)
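
The 36000 columns come from 6 sensors times 6000 samples per cycle. Since `pd.concat(..., axis=1)` places the sensor blocks side by side in list order, a single sensor can be recovered by positional slicing. The following is a minimal sketch on a synthetic frame, assuming the files were concatenated in the order PS1 to PS6. `split_sensor_blocks` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
import pandas as pd

SAMPLES_PER_CYCLE = 6000  # Samples each pressure sensor records per cycle
N_SENSORS = 6             # PS1 to PS6

def split_sensor_blocks(df, n_sensors=N_SENSORS, samples=SAMPLES_PER_CYCLE):
    """Return {"PS1": block, ...}; each block has shape (cycles, samples)."""
    # iloc is positional, so repeated column labels after concat are not a problem
    return {
        f"PS{i + 1}": df.iloc[:, i * samples:(i + 1) * samples]
        for i in range(n_sensors)
    }

# Synthetic stand-in for df_pressure: 3 cycles x 36000 columns
demo = pd.DataFrame(
    np.arange(3 * N_SENSORS * SAMPLES_PER_CYCLE).reshape(3, -1)
)
blocks = split_sensor_blocks(demo)
print({name: block.shape for name, block in blocks.items()})
```

Positional slicing with `iloc` is used because concatenating frames read with `header=None` repeats the column labels 0 to 5999 for every sensor.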

# FFT Feature Analysis

In [168]:
# Sampling frequency of the pressure sensors in Hz (100 Hz per the dataset documentation)
fs = 100

def extract_fft_features(signal, fs):
    """Computes FFT features for a given time-series signal"""
    N = len(signal)  # Number of samples (36000 per row here: six 6000-sample sensor blocks)
    fft_values = fft(signal)  # Apply FFT
    fft_magnitudes = np.abs(fft_values)[:N//2]  # Keep only positive frequencies
    freqs = np.fft.fftfreq(N, d=1/fs)[:N//2]  # Frequency bins

    # Feature Extraction
    dominant_freq = freqs[np.argmax(fft_magnitudes)]  # Frequency with max amplitude
    spectral_energy = np.sum(fft_magnitudes**2)  # Total power
    p = fft_magnitudes / np.sum(fft_magnitudes)  # Normalize magnitudes into a distribution
    spectral_entropy = -np.sum(p * np.log(p))  # Shannon entropy of the normalized spectrum

    return dominant_freq, spectral_energy, spectral_entropy

# Apply FFT feature extraction to each row (cycle)
fft_features = df_pressure.apply(lambda row: extract_fft_features(row.values, fs), axis=1)

# Convert list of tuples into DataFrame
fft_df = pd.DataFrame(fft_features.tolist(), columns=['Dominant_Freq', 'Spectral_Energy', 'Spectral_Entropy'])

# Save the FFT features to CSV and preview the first rows
fft_df.to_csv("fft_features.csv", index=False)
print(fft_df.head())

   Dominant_Freq  Spectral_Energy  Spectral_Entropy
0            0.0     5.895798e+12          6.425514
1            0.0     5.883376e+12          6.412172
2            0.0     5.862218e+12          6.411171
3            0.0     5.849176e+12          6.403472
4            0.0     5.833152e+12          6.405609
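
A dominant frequency of 0.0 Hz in every row is what one sees when the DC component (the signal mean) has the largest magnitude, which is typical for readings with a large constant offset. One common way to make this feature more informative is to subtract the mean before the transform. The sketch below illustrates the effect on a synthetic signal, assuming a 100 Hz sampling rate and a 6000-sample cycle. `dominant_frequency` and the demo signal are hypothetical and are not the notebook's method.

```python
import numpy as np

def dominant_frequency(signal, fs, remove_mean=True):
    """Return the frequency (Hz) of the largest-magnitude FFT bin."""
    x = np.asarray(signal, dtype=float)
    if remove_mean:
        x = x - x.mean()  # Drop the DC offset so bin 0 no longer dominates
    magnitudes = np.abs(np.fft.rfft(x))        # One-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)  # Matching frequency bins in Hz
    return freqs[np.argmax(magnitudes)]

# Synthetic signal: constant offset of 10 plus a 5 Hz sine, 6000 samples at 100 Hz
demo_fs = 100
t = np.arange(6000) / demo_fs
demo_signal = 10.0 + np.sin(2 * np.pi * 5.0 * t)

with_dc = dominant_frequency(demo_signal, demo_fs, remove_mean=False)
without_dc = dominant_frequency(demo_signal, demo_fs, remove_mean=True)
print(with_dc, without_dc)
```

Without mean removal the offset wins and the result is the DC bin. With it, the 5 Hz component is picked up. Whether mean removal suits the real pressure data is a modeling choice left to the reader.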
