# Team 4: Machine Predictive Maintenance Classification

Contributors: Elan Wilkinson, Zack Robertson, Alden Caterio, Laxmi Sulakshana Rapolu

## Introduction

Predictive maintenance is pivotal in industry, where anticipating machine failures can save substantial costs and prevent downtime. However, real-world predictive maintenance datasets are hard to obtain because of privacy and proprietary concerns. This paper uses the AI4I 2020 Predictive Maintenance Dataset, a synthetic dataset designed to mirror the characteristics of genuine industrial predictive maintenance data. Machine learning (ML) is integral to predictive maintenance: supervised models are trained on labeled data (datasets with known target variables) to predict when maintenance is needed, so that maintenance happens early enough to prevent failures without incurring unnecessary cost. Techniques such as decision trees, support vector machines, and neural networks, as well as more sophisticated ensemble and deep learning methods, are used to improve the accuracy of these predictions. By analyzing operational parameters such as temperature and vibration, ML models can forecast machine failures, turning maintenance from a fixed schedule into a data-driven decision process.

## Dataset

The AI4I 2020 Predictive Maintenance Dataset contains 10,000 data points, each with 14 features. These features include a unique identifier (UID), product ID with quality variants (L, M, H), air temperature [K], process temperature [K], rotational speed [rpm], torque [Nm], tool wear [min], and a ‘machine failure’ label indicating whether a failure occurred. The dataset simulates five independent failure modes: tool wear failure (TWF), heat dissipation failure (HDF), power failure (PWF), overstrain failure (OSF), and random failures (RNF). Each mode is defined by specific conditions: for example, power failure depends on the product of torque and rotational speed, and overstrain failure depends on the product of tool wear and torque.

Dataset URL: https://archive.ics.uci.edu/dataset/601/ai4i+2020+predictive+maintenance+dataset
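
The failure-mode conditions described above can be made concrete with a small sketch. The thresholds used here are taken from the dataset description on the UCI page rather than from this notebook, so they should be treated as assumptions and verified against that source. The function name `simulate_failure_modes`, its argument names, and the example input values are hypothetical. RNF is left out because it is random by definition, and TWF is reduced to the wear window in which a tool failure can be triggered.

```python
import math

# Overstrain limits (tool wear [min] x torque [Nm]) per quality variant.
# Assumption: values as listed in the UCI dataset description.
OSF_LIMITS = {"L": 11_000, "M": 12_000, "H": 13_000}


def simulate_failure_modes(quality, air_temp_k, process_temp_k,
                           rot_speed_rpm, torque_nm, tool_wear_min):
    """Evaluate the deterministic failure-mode conditions for one data point."""
    # Mechanical power [W] = torque [Nm] x angular velocity [rad/s].
    power_w = torque_nm * rot_speed_rpm * 2 * math.pi / 60
    return {
        # Wear window in which a tool wear failure may occur (assumed 200-240 min).
        "TWF_window": 200 <= tool_wear_min <= 240,
        # Heat dissipation: small temperature gap combined with low speed.
        "HDF": (process_temp_k - air_temp_k) < 8.6 and rot_speed_rpm < 1380,
        # Power failure: power outside the assumed 3500-9000 W band.
        "PWF": power_w < 3500 or power_w > 9000,
        # Overstrain: wear-torque product above the limit for this variant.
        "OSF": tool_wear_min * torque_nm > OSF_LIMITS[quality],
    }


# Illustrative input: high speed with very low torque gives low power.
example = simulate_failure_modes("L", 298.9, 309.1, 2861, 4.6, 143)
print(example)
```

With these illustrative inputs the computed power is roughly 1.4 kW, below the assumed 3500 W lower bound, so only the power failure condition is met.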

## Packages

In [1]:
# Install necessary packages (if not already installed)
!pip install ucimlrepo pandas numpy matplotlib seaborn

# Import required libraries
from ucimlrepo import fetch_ucirepo
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

## Load Dataset

In [2]:
# fetch dataset 
ai4i_2020_predictive_maintenance_dataset = fetch_ucirepo(id=601) 
  
# data (as pandas dataframes) 
df_ids = ai4i_2020_predictive_maintenance_dataset.data.ids # data with role ID
df_features = ai4i_2020_predictive_maintenance_dataset.data.features # data with role Feature
df_target = ai4i_2020_predictive_maintenance_dataset.data.targets # data with role Target
df = pd.concat([df_ids, df_features, df_target], axis=1)
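
Before exploring the combined frame, a quick sanity check can confirm that the concatenation kept every row and show how rare failures are. The following is a minimal sketch and not part of the original notebook. The helper name `summarize_frame` is hypothetical, and it is demonstrated on a tiny made-up frame so that it runs without downloading anything. The only name taken from the notebook is the `Machine failure` column.

```python
import pandas as pd


def summarize_frame(frame, target="Machine failure"):
    """Return basic integrity and class-balance statistics for a DataFrame."""
    return {
        "n_rows": len(frame),
        "n_cols": frame.shape[1],
        # Total number of missing cells across all columns.
        "n_missing": int(frame.isna().sum().sum()),
        # Column labels that occur more than once after concatenation.
        "n_duplicate_cols": int(frame.columns.duplicated().sum()),
        # Share of rows where the binary target equals 1.
        "failure_rate": float(frame[target].mean()),
    }


# Toy stand-in for df (values are made up for illustration only).
toy = pd.DataFrame({
    "UID": [1, 2, 3, 4],
    "Torque": [42.8, 46.3, 49.4, 39.5],
    "Machine failure": [0, 0, 1, 0],
})
summary = summarize_frame(toy)
print(summary)
```

On the real data, `summarize_frame(df)` would be expected to report 10,000 rows and no missing values, in line with the dataset metadata.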

In [3]:
# metadata 
print('Metadata: \r\n', ai4i_2020_predictive_maintenance_dataset.metadata, '\r\n') 
  
# variable information 
print('Variable information: \r\n', ai4i_2020_predictive_maintenance_dataset.variables, '\r\n')

# ID dataset
print('Dataset with role ID: \r\n', df_ids.head(), '\r\n')

# Feature dataset
print('Dataset with role Feature: \r\n', df_features.head(), '\r\n')

# Target dataset
print('Dataset with role Target: \r\n', df_target.head(), '\r\n')

# Final dataset
print('Final dataset: \r\n', df.head(), '\r\n')

# Machine failure dataset
print('Machine failure dataset: \r\n', df[df['Machine failure'] != 0].head())

Metadata: 
 {'uci_id': 601, 'name': 'AI4I 2020 Predictive Maintenance Dataset', 'repository_url': 'https://archive.ics.uci.edu/dataset/601/ai4i+2020+predictive+maintenance+dataset', 'data_url': 'https://archive.ics.uci.edu/static/public/601/data.csv', 'abstract': 'The AI4I 2020 Predictive Maintenance Dataset is a synthetic dataset that reflects real predictive maintenance data encountered in industry.', 'area': 'Computer Science', 'tasks': ['Classification', 'Regression', 'Causal-Discovery'], 'characteristics': ['Multivariate', 'Time-Series'], 'num_instances': 10000, 'num_features': 6, 'feature_types': ['Real'], 'demographics': [], 'target_col': ['Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'], 'index_col': ['UID', 'Product ID'], 'has_missing_values': 'no', 'missing_values_symbol': None, 'year_of_dataset_creation': 2020, 'last_updated': 'Wed Feb 14 2024', 'dataset_doi': '10.24432/C5HS5C', 'creators': [], 'intro_paper': {'title': 'Explainable Artificial Intelligence for Predictive

## References
*AI4I 2020 Predictive Maintenance Dataset*. (2020). UCI Machine Learning Repository. https://doi.org/10.24432/C5HS5C