# Domain Knowledge

A milling machine is a machine tool used for machining solid materials. It typically consists of a worktable and a spindle that holds a rotating cutting tool. The workpiece is secured to the worktable, and the cutting tool removes material from the workpiece to achieve the desired shape or features.

![image.png](attachment:image.png)

# Client Requirements
The client needs accurate predictions of potential failures, identification of key failure modes, and interpretable models. Further requirements are customization for different product variants, scalability, cost reduction, integration with existing systems, data security, comprehensive documentation, training, and long-term support.

# Problems faced in Milling Machines
Predictive maintenance for milling machines typically has to address the following problems:

Tool Wear and Replacement: Over time, milling tools wear out and need replacement. Predicting the optimal time for tool replacement can improve efficiency and reduce downtime.

Heat Dissipation Issues: In milling processes, heat dissipation can be critical. Monitoring the temperature of the milling machine components and predicting potential overheating can prevent failures.

Power-related Failures: Fluctuations in power, whether too low or too high, can lead to machine failures. Identifying and predicting power-related issues can enhance the reliability of the milling machine.

Overstrain and Load Issues: Excessive strain on the machine, especially in terms of tool wear and torque, can lead to failures. Predicting and preventing overstrain conditions can extend the machine's lifespan.

Random Failures: Despite optimal conditions, random failures can still occur. Understanding and predicting these random failures can contribute to overall maintenance strategies.

These challenges lead to the following problem statement for the project:

# Problem Statement:
Design and implement a predictive maintenance model for milling machines to address tool wear, heat dissipation issues, power-related failures, overstrain conditions, and random failures. The model should analyze historical data, including features such as air temperature, process temperature, rotational speed, torque, and tool wear, to predict potential machine failures. The goal is to improve the reliability and efficiency of milling processes by enabling proactive maintenance interventions.

This problem statement provides a clear focus on the key challenges associated with milling machine maintenance and sets the direction for developing a predictive maintenance solution.

# Five Independent Failure Modes in the Dataset
The machine failure consists of five independent failure modes:

1. tool wear failure (TWF): the tool is replaced or fails at a randomly selected tool wear time between 200 and 240 min (120 times in our dataset). At this point in time, the tool is replaced 69 times and fails 51 times (randomly assigned).
2. heat dissipation failure (HDF): heat dissipation causes a process failure if the difference between air and process temperature is below 8.6 K and the tool's rotational speed is below 1380 rpm. This is the case for 115 data points.
3. power failure (PWF): the product of torque and rotational speed (in rad/s) equals the power required for the process. If this power is below 3500 W or above 9000 W, the process fails, which is the case 95 times in our dataset.
4. overstrain failure (OSF): if the product of tool wear and torque exceeds 11,000 min·Nm for the L product variant (12,000 for M, 13,000 for H), the process fails due to overstrain. This is true for 98 data points.
5. random failures (RNF): each process has a 0.1 % chance of failing regardless of its process parameters. This is the case for only 5 data points, fewer than the roughly 10 that would be expected for the 10,000 data points in our dataset.

If at least one of the above failure modes is true, the process fails and the 'machine failure' label is set to 1. The label alone therefore does not tell the machine learning method which of the failure modes caused the process to fail.
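The three deterministic rules above (HDF, PWF, OSF) can be written down directly. The following is a minimal sketch of those rules, not the code that generated the dataset; the function and argument names are made up for illustration. TWF and RNF are randomly assigned, so they are passed in as flags rather than computed.

```python
import math

# Overstrain thresholds in min*Nm per product variant, as listed above.
OSF_LIMIT = {"L": 11_000, "M": 12_000, "H": 13_000}


def failure_modes(air_temp_k, process_temp_k, speed_rpm, torque_nm,
                  tool_wear_min, variant):
    """Return the deterministic failure flags (HDF, PWF, OSF) for one process."""
    # HDF: small air/process temperature gap combined with low spindle speed.
    hdf = (process_temp_k - air_temp_k) < 8.6 and speed_rpm < 1380

    # PWF: power = torque * angular speed, with rpm converted to rad/s.
    power_w = torque_nm * speed_rpm * 2 * math.pi / 60
    pwf = power_w < 3500 or power_w > 9000

    # OSF: tool wear times torque above the variant-specific limit.
    osf = tool_wear_min * torque_nm > OSF_LIMIT[variant]

    return {"HDF": hdf, "PWF": pwf, "OSF": osf, "power_w": power_w}


def machine_failure(flags, twf=False, rnf=False):
    """The 'machine failure' label is 1 if at least one failure mode is true."""
    return int(flags["HDF"] or flags["PWF"] or flags["OSF"] or twf or rnf)


# Hypothetical reading: 10.5 K temperature gap, 1551 rpm, 42.8 Nm, new tool.
flags = failure_modes(298.1, 308.6, 1551, 42.8, 0, "M")
print(flags, machine_failure(flags))
```

Because the label is the logical OR of all five modes, a model trained only on 'machine failure' has to rediscover these thresholds from the raw features.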

# Data Acquisition 
* The data engineer extracted the data from the client database and delivered it to the team as CSV files.

# EDA 
1. Explore
2. Column Datatypes 
3. Null values 
4. Outliers 
5. Visualization (kdeplot, histplot, scatterplot, countplot, pairplot, boxplot, heatmap)
6. Correlation
7. VIF (variance inflation factor)
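As an illustration of the last two EDA steps, here is a small sketch that computes a correlation matrix and the VIF with NumPy only. The data is synthetic and only mimics the idea that process temperature tracks air temperature; the `vif` helper is written for this example and is not the project's code.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of a 2-D array X."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    factors = []
    for j in range(n_cols):
        y = X[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
        r2 = 1.0 - residual.var() / y.var()
        # VIF_j = 1 / (1 - R^2_j)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic data (assumption): process temperature follows air temperature.
rng = np.random.default_rng(0)
air = rng.normal(300, 2, 500)
process = air + 10 + rng.normal(0, 0.5, 500)
speed = rng.normal(1500, 150, 500)
X = np.column_stack([air, process, speed])

corr = np.corrcoef(X, rowvar=False)   # pairwise Pearson correlation
vifs = vif(X)
print(np.round(corr, 2))
print(np.round(vifs, 1))
```

A high VIF for two columns signals that they carry nearly the same information, which matters most for linear models such as logistic regression.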

# Feature Engineering 
1. Replace null values - mean, median, mode, KNN imputer
2. Encoding - label encoding, one-hot encoding
3. Scaling (Normalization and standardization)
4. Handle Outliers 
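The first three steps can be sketched with NumPy on a handful of made-up values. The arrays below are hypothetical, and median imputation is just one of the options listed above.

```python
import numpy as np

# Hypothetical sample: torque readings with one missing value, plus variants.
torque = np.array([42.8, 46.3, np.nan, 39.5, 40.0])
variant = np.array(["M", "L", "L", "H", "L"])

# 1. Replace null values with the median of the observed readings.
torque_filled = np.where(np.isnan(torque), np.nanmedian(torque), torque)

# 2a. Label encoding: map each category to an integer (L < M < H).
label_map = {"L": 0, "M": 1, "H": 2}
variant_label = np.array([label_map[v] for v in variant])

# 2b. One-hot encoding: one 0/1 column per category, in the order L, M, H.
categories = np.array(["L", "M", "H"])
variant_one_hot = (variant[:, None] == categories[None, :]).astype(int)

# 3a. Standardization: zero mean, unit variance.
standardized = (torque_filled - torque_filled.mean()) / torque_filled.std()

# 3b. Normalization (min-max): rescale to the range [0, 1].
span = torque_filled.max() - torque_filled.min()
normalized = (torque_filled - torque_filled.min()) / span

print(torque_filled, variant_label, variant_one_hot, sep="\n")
```

In a real pipeline the imputation value and the scaling statistics should be computed on the training split only and then applied to the test split.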

# Feature Selection
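1. Filter (before model training, using statistical scores such as correlation)
2. Wrapper (repeatedly trains a model on candidate feature subsets)
3. Embedded (selection happens inside model training, e.g. regularization or tree feature importances)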
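A minimal sketch of the three approaches using scikit-learn, on synthetic data from `make_classification` rather than the project dataset. The specific selectors chosen here (`SelectKBest`, `RFE`, `SelectFromModel`) are one possible choice per category, not necessarily what the project used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic data (assumption): 6 features, 3 of them informative.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           n_redundant=0, random_state=0)

# Filter: score each feature against the target, no model involved.
filter_sel = SelectKBest(score_func=f_classif, k=3).fit(X, y)

# Wrapper: recursive feature elimination refits the model on shrinking subsets.
wrapper_sel = RFE(estimator=LogisticRegression(max_iter=1000),
                  n_features_to_select=3).fit(X, y)

# Embedded: use the importances a random forest learns while training.
embedded_sel = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0),
    max_features=3, threshold=-np.inf,
).fit(X, y)

for name, sel in [("filter", filter_sel), ("wrapper", wrapper_sel),
                  ("embedded", embedded_sel)]:
    print(name, np.flatnonzero(sel.get_support()))
```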

# Feature Extraction (PCA)
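To show what PCA does, here is a small sketch built on NumPy's SVD. The data is synthetic (two strongly correlated temperature columns and one independent speed column), and the `pca` helper is written for this example only.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its first n_components principal components."""
    X = np.asarray(X, dtype=float)
    # Standardize so features on large scales (e.g. rpm) do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    scores = Z @ Vt[:n_components].T
    return scores, explained_ratio[:n_components]


# Synthetic data (assumption): process temperature follows air temperature.
rng = np.random.default_rng(0)
air = rng.normal(300, 2, 500)
process = air + 10 + rng.normal(0, 0.5, 500)
speed = rng.normal(1500, 150, 500)
X = np.column_stack([air, process, speed])

scores, explained = pca(X, n_components=2)
print(scores.shape, np.round(explained, 3))
```

Because the two temperature columns are nearly collinear, two components retain almost all of the variance of the three original features. The trade-off is interpretability: components are mixtures of the original sensors.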

# Model Selection 
1. Logistic regression
2. Decision Tree
3. Random Forest
4. Gradient Boosting 
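A sketch of how the four candidates could be compared with cross-validation in scikit-learn. The data is synthetic and imbalanced (about 10 % positives, an assumption for illustration), and the settings are defaults chosen for speed, not the project's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the failure data (assumption).
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           weights=[0.9, 0.1], random_state=0)

models = {
    # Logistic regression benefits from scaled inputs, so wrap it in a pipeline.
    "Logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50,
                                                    random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validated F1 on the positive (failure) class.
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    results[name] = scores.mean()
    print(f"{name:20s} mean F1 = {results[name]:.3f}")
```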

# Evaluation 
1. Confusion Matrix
2. Accuracy
3. F1 score 
4. Evaluation metrics
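The relationship between the confusion matrix, accuracy, and F1 score can be sketched in plain Python. The labels below are invented to show why accuracy alone is misleading when failures are rare: a model that never predicts a failure still scores 0.95 accuracy on this sample.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 means failure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives.
    return 2 * tp / denom if denom else 0.0


# Hypothetical sample: 5 failures among 100 processes.
y_true = [1] * 5 + [0] * 95
never_fails = [0] * 100                        # always predicts "no failure"
useful = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93   # 4 hits, 1 miss, 2 false alarms

for name, pred in [("never_fails", never_fails), ("useful", useful)]:
    print(name, confusion_counts(y_true, pred),
          round(accuracy(y_true, pred), 3), round(f1(y_true, pred), 3))
```

The two models differ by only 0.02 in accuracy, while F1 separates them clearly, which is why F1 is listed alongside accuracy.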


# Model Optimization
* Hyperparameter tuning
1. `n_estimators`
2. `criterion`
3. `min_samples_split`
4. `min_samples_leaf`
5. `max_depth`
6. `max_features`
7. `oob_score`
8. `bootstrap`
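These names match the parameters of scikit-learn's `RandomForestClassifier`. The following is a hedged sketch of how they could be tuned with `RandomizedSearchCV`; the candidate values, the synthetic data, and the choice of randomized over grid search are assumptions for illustration, not the project's actual search.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic, imbalanced stand-in for the failure data (assumption).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           weights=[0.9, 0.1], random_state=0)

# Candidate values are illustrative only.
param_distributions = {
    "n_estimators": [50, 100],
    "criterion": ["gini", "entropy"],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
    "max_depth": [None, 5, 10],
    "max_features": ["sqrt", "log2", None],
}

# oob_score requires bootstrap=True, so both are fixed rather than searched.
forest = RandomForestClassifier(bootstrap=True, oob_score=True, random_state=0)

search = RandomizedSearchCV(
    forest,
    param_distributions=param_distributions,
    n_iter=8,          # sample 8 combinations instead of the full grid
    scoring="f1",
    cv=3,
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print("best CV F1:", round(search.best_score_, 3))
print("OOB accuracy of refit model:", round(search.best_estimator_.oob_score_, 3))
```

The out-of-bag score is an accuracy estimate computed from the samples each tree did not see during bootstrapping, so it comes at no extra cost but shares accuracy's weakness on imbalanced labels.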

# Final Model Selection
* Random Forest with Hyperparameter tuning

# Deployment 
* Deployment was done on AWS by the deployment team, but I also have hands-on experience deploying on AWS.