In [1]:
import autorootcwd  # noqa
import ipywidgets as widgets
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from ipywidgets import interact
from utils.signal_representations import get_fft, get_fftfreq

## Data Description

Five .npy files hold the readings of the five sensors installed on a single electric motor. All sensors were sampled simultaneously at 10 kHz.

The remaining .npy file holds the class label of each sample, ranging from A to E. Each class represents a different operating state, which may correspond to normal behaviour or to a particular fault.
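The 10 kHz sampling rate bounds what the spectra can show. The sketch below works out the highest observable frequency (Nyquist) and the frequency resolution for a given signal length. The helper name `spectral_limits` and the value `n_samples=2_000` are assumptions made for illustration only; the actual number of points per sample is not stated here.

```python
SAMPLING_RATE = 10_000  # Hz, as stated in the data description


def spectral_limits(n_samples: int, fs: float = SAMPLING_RATE) -> dict:
    """Return the duration, Nyquist frequency and FFT bin width of a signal."""
    return {
        "duration_s": n_samples / fs,     # length of the recording in seconds
        "nyquist_hz": fs / 2,             # highest frequency that can be represented
        "resolution_hz": fs / n_samples,  # spacing between adjacent FFT bins
    }


# Hypothetical signal length, used only to show the arithmetic.
limits = spectral_limits(n_samples=2_000)
print(limits)
```

Whatever the real signal length turns out to be, the Nyquist limit stays at 5 kHz; only the duration and the bin width change.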

In [2]:
SAMPLING_RATE = 10_000

In [3]:
classes = np.load("data/raw/Classes.npy", allow_pickle=True)
sensor_1 = np.load("data/raw/Dados_1.npy", allow_pickle=True)
sensor_2 = np.load("data/raw/Dados_2.npy", allow_pickle=True)
sensor_3 = np.load("data/raw/Dados_3.npy", allow_pickle=True)
sensor_4 = np.load("data/raw/Dados_4.npy", allow_pickle=True)
sensor_5 = np.load("data/raw/Dados_5.npy", allow_pickle=True)

In [4]:
df_classes = pd.DataFrame(classes, columns=["classes"])
df_classes.value_counts()

classes 
Classe A    10000
Classe B    10000
Classe C    10000
Classe D    10000
Classe E    10000
Name: count, dtype: int64

## Visualizing the data

### Raw data visualization

Plotting the 5 sensors in a single figure and stepping through the samples leads to a few observations:

- The first three sensors seem to be vibration sensors with a mean value close to 0, while the fifth sensor has a DC offset;
- The fourth sensor does not seem to be working properly: it shows only the value 50 for all data points, so it has no predictive value for the task.
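The visual impressions above (zero mean, DC offset, constant channel) can also be checked numerically. The following is a minimal sketch on synthetic stand-in signals, not on the real sensor arrays; the helper `summarize_offsets` and the demo values are assumptions made for illustration.

```python
import numpy as np


def summarize_offsets(signals: dict) -> dict:
    """Return (mean, std) for each signal, ignoring NaN values."""
    return {
        name: (float(np.nanmean(x)), float(np.nanstd(x)))
        for name, x in signals.items()
    }


# Synthetic stand-ins for the three kinds of behaviour described above.
rng = np.random.default_rng(0)
t = np.arange(1000) / 10_000  # 0.1 s at 10 kHz
demo = {
    "zero_mean": np.sin(2 * np.pi * 60 * t),                 # vibration-like signal
    "dc_offset": 2.5 + 0.1 * rng.standard_normal(t.size),    # noise around an offset
    "constant": np.full(t.size, 50.0),                       # stuck channel
}

stats = summarize_offsets(demo)
for name, (mean, std) in stats.items():
    print(f"{name}: mean={mean:.3f}, std={std:.3f}")
```

A mean far from zero relative to the standard deviation suggests a DC offset, while a standard deviation of exactly zero flags a channel that never changes.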

In [5]:
@interact(sample=widgets.IntSlider(min=0, max=classes.shape[0] - 1, step=1, value=0))
def raw_sensor_data_exploration(sample: int):
    fig, axs = plt.subplots(nrows=1, ncols=5, figsize=(24, 4), dpi=72)
    axes = axs.ravel()

    sensors = [
        sensor_1[sample],
        sensor_2[sample],
        sensor_3[sample],
        sensor_4[sample],
        sensor_5[sample],
    ]
    fig.suptitle(f"Sample {sample} - {classes[sample][0]}")
    
    for i, sensor in enumerate(sensors):
        axes[i].set_title(f"Sensor {i + 1}")
        axes[i].plot(sensor)
    


Listing the unique values of sensor 4 confirms the assumption: apart from missing values (NaN), every reading is 50.

In [6]:
pd.unique(sensor_4.ravel('K'))

array([50., nan])
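Since the output above contains NaN alongside 50, it can help to quantify how much of the channel is missing and to test for constancy explicitly. The sketch below does this on a small made-up array; `constant_channel_report` is a hypothetical helper, and the conversion to `float` assumes the array is rectangular.

```python
import numpy as np


def constant_channel_report(x) -> dict:
    """Summarize NaN share and constancy of a sensor array."""
    values = np.asarray(x, dtype=float)
    nan_mask = np.isnan(values)
    finite = values[~nan_mask]
    return {
        "nan_fraction": float(nan_mask.mean()),  # share of missing readings
        "unique_finite": np.unique(finite),      # distinct non-NaN values
        # Constant if there is at least one reading and min equals max.
        "is_constant": bool(finite.size > 0 and finite.max() == finite.min()),
    }


# Made-up data mimicking a stuck sensor with padded (NaN) readings.
demo = np.array([[50.0, 50.0, np.nan], [50.0, np.nan, np.nan]])
report = constant_channel_report(demo)
print(report)
```

Applied to `sensor_4`, a report with `is_constant` set to `True` would support dropping the channel.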

### Spectrum visualization

With sensor_4 removed, the spectrum of each sample gives a few clues about what each sensor measures and what the classes may represent.

Sensors 1, 2, and 3 have clear peaks that may correspond to harmonics of the motor's fundamental rotation frequency. In rotating machinery, the first few harmonics can indicate different faults, such as unbalance, misalignment, and a bent shaft [1].

Sensor 5, on the other hand, behaves differently from the rest. Its amplitudes are much higher and its spectrum is noisier overall, with less well-defined peaks. This suggests that either a different type of sensor is used here or it is mounted at a location with more vibration.


> [1] RANDALL, Robert Bond. Vibration-based condition monitoring: industrial, automotive and aerospace applications. John Wiley & Sons, 2021.
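To make the idea of harmonic peaks concrete, the sketch below builds a synthetic signal with a 30 Hz fundamental and two harmonics, then recovers them from a one-sided amplitude spectrum. The function `amplitude_spectrum` is a stand-in written for this example with NumPy's `rfft`; it is not the project's `get_fft`/`get_fftfreq`, whose implementation is not shown here, and the 30 Hz fundamental is an arbitrary choice.

```python
import numpy as np


def amplitude_spectrum(signal, fs, remove_mean=True):
    """One-sided amplitude spectrum of a real signal (illustrative stand-in)."""
    x = np.asarray(signal, dtype=float)
    if remove_mean:
        x = x - x.mean()  # drop the DC component so it does not dominate the plot
    n = x.size
    # Factor 2/n gives sinusoid amplitudes for all bins except DC and Nyquist.
    amplitudes = 2.0 * np.abs(np.fft.rfft(x)) / n
    frequencies = np.fft.rfftfreq(n, d=1.0 / fs)
    return frequencies, amplitudes


fs = 10_000
t = np.arange(fs) / fs  # 1 second of data, so bins are 1 Hz apart
f0 = 30.0               # assumed fundamental, for illustration only
x = (
    1.00 * np.sin(2 * np.pi * f0 * t)
    + 0.50 * np.sin(2 * np.pi * 2 * f0 * t)
    + 0.25 * np.sin(2 * np.pi * 3 * f0 * t)
    + 3.0  # DC offset, removed by remove_mean=True
)

freqs, amps = amplitude_spectrum(x, fs)
top3 = np.sort(freqs[np.argsort(amps)[-3:]])  # frequencies of the 3 largest peaks
print(top3)
```

On real data the peaks will not fall exactly on FFT bins, so some leakage around each harmonic is to be expected.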

In [7]:
@interact(sample=widgets.IntSlider(min=0, max=(classes.shape[0] - 1), step=1, value=0))
def spectrum_exploration(sample: int):
    fig, axs = plt.subplots(nrows=1, ncols=4, figsize=(24, 4), dpi=72)
    axes = axs.ravel()

    sensors = [sensor_1[sample], sensor_2[sample], sensor_3[sample], sensor_5[sample]]
    
    sensor_names = {0: "Sensor 1", 1: "Sensor 2", 2: "Sensor 3", 3: "Sensor 5"}
    
    fig.suptitle(f"Sample {sample} - {classes[sample][0]}")

    for i, sensor in enumerate(sensors):
        if np.isnan(sensor).any():
            sensor = sensor[~np.isnan(sensor)]
        amplitudes = get_fft(sensor, remove_mean=True)
        frequencies = get_fftfreq(len(sensor), fs=SAMPLING_RATE)
        axes[i].set_title(sensor_names[i])
        axes[i].plot(frequencies, amplitudes)
