<a href="https://colab.research.google.com/github/akshayugalmogale/3d-Printing-Regression-Model/blob/main/Predict_Roughness_of_3D_Print_Material.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [16]:
# Importing the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [17]:
printer = pd.read_csv('/content/3DPrinting_data.csv')

In [18]:
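# Encode the two categorical columns as 0/1. Assigning the result back avoids
# the chained inplace replace, which newer pandas versions warn about.
printer['infill_pattern'] = printer['infill_pattern'].map({'grid': 0, 'honeycomb': 1})
printer['material'] = printer['material'].map({'abs': 0, 'pla': 1})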
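
The encoding step above can also be written so that it fails loudly when a category is missing from the mapping. The following is a minimal sketch on a toy frame; `toy` and `encode_binary` are hypothetical names introduced here for illustration, and the toy values are not taken from the CSV.

```python
import pandas as pd

# Toy frame standing in for the real CSV (values are illustrative only)
toy = pd.DataFrame({
    'infill_pattern': ['grid', 'honeycomb', 'grid'],
    'material': ['abs', 'pla', 'pla'],
})

def encode_binary(series, mapping):
    """Map labels to integers and raise if any label is missing from the mapping."""
    encoded = series.map(mapping)
    if encoded.isna().any():
        unknown = sorted(set(series[encoded.isna()]))
        raise ValueError(f"unmapped labels: {unknown}")
    return encoded.astype(int)

toy['infill_pattern'] = encode_binary(toy['infill_pattern'], {'grid': 0, 'honeycomb': 1})
toy['material'] = encode_binary(toy['material'], {'abs': 0, 'pla': 1})
```

Raising on unknown labels is a design choice: a silent `NaN` in a feature column would otherwise only surface later, when the regression is fitted.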

# Model 1 : Predicting roughness based on 9 features.
- Here we will be **building** a **multiple linear regression** model to **predict** `roughness` based on all the 9 features from `layer_height` to `fan_speed`

In [19]:
# Defining features and labels
X = printer.drop(['roughness','tension_strenght','elongation'], axis = 1)
y = printer['roughness']

In [20]:
X.head()

Unnamed: 0,layer_height,wall_thickness,infill_density,infill_pattern,nozzle_temperature,bed_temperature,print_speed,material,fan_speed
0,0.02,8,90,0,220,60,40,0,0
1,0.02,7,90,1,225,65,40,0,25
2,0.02,1,80,0,230,70,40,0,50
3,0.02,4,70,1,240,75,40,0,75
4,0.02,6,90,0,250,80,40,0,100


In [21]:
#importing statsmodels library
import statsmodels.api as sm

In [22]:
# let's define a function for the multiple regression

def linear_Regression(x,y):
    
    x = sm.add_constant(x)
    
    #defining the model, fitting the model and printing the results
    multiple_model = sm.OLS(y,x).fit()
    print(multiple_model.summary())
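
To make clear what the function above does under the hood, here is a rough NumPy-only sketch of the same two steps: adding an intercept column and solving the least-squares problem. The synthetic data and the names `X_demo`, `y_demo` and `design` are assumptions for illustration; statsmodels additionally computes standard errors, p-values and the other summary statistics, which this sketch leaves out.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))                        # two synthetic features
y_demo = 3.0 + 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1]   # noise-free target

# Same idea as sm.add_constant: prepend a column of ones for the intercept
design = np.column_stack([np.ones(len(X_demo)), X_demo])

# Ordinary least squares: minimise ||design @ beta - y||^2
beta, *_ = np.linalg.lstsq(design, y_demo, rcond=None)

# R-squared = 1 - SS_res / SS_tot
residuals = y_demo - design @ beta
ss_res = residuals @ residuals
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```

Because the synthetic target has no noise, `beta` recovers the intercept and the two slopes used to generate it.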

In [23]:
#calling the linear regression function
linear_Regression(X,y)

                            OLS Regression Results                            
Dep. Variable:              roughness   R-squared:                       0.875
Model:                            OLS   Adj. R-squared:                  0.851
Method:                 Least Squares   F-statistic:                     35.95
Date:                Mon, 14 Nov 2022   Prob (F-statistic):           3.83e-16
Time:                        16:08:26   Log-Likelihood:                -248.19
No. Observations:                  50   AIC:                             514.4
Df Residuals:                      41   BIC:                             531.6
Df Model:                           8                                         
Covariance Type:            nonrobust                                         
                         coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------
const                 -0.9534      0


### Inference from model
1. The **R-squared** value is **0.875 (87.5%)**, meaning that **about 87.5%** of the **variability** in `roughness` is **explained** by this **linear regression model**. R-squared is a useful summary of **fit**: the higher the value, the better the fit. It never decreases when features are added, however, so the **adjusted R-squared (0.851)** is the fairer figure when comparing models with different numbers of features.

2. The **p-value** of each coefficient is an **important measure** for evaluating the individual **variables**: it tests whether that coefficient differs from zero. The **closer** the **p-value is to 1**, the **weaker** the evidence that the **feature variable** is related to the **label variable** once the other features are accounted for. In our model, `wall_thickness`, `infill_density` and `infill_pattern` all have p-values close to 1, meaning they **don't** play **much role** in predicting roughness.

3. Finally, the **expression** for this model can be written from the **coefficients** as follows

`Roughness = -0.9534 + 1269.4449*layer_height + 2.3342*wall_thickness - 0.0423*infill_density - 0.1255*infill_pattern + 15.0562*nozzle_temperature - 55.6225*bed_temperature + 0.6496*print_speed + 298.4514*material + 7.8989*fan_speed`

4. The coefficients suggest the following
   - The **constant term** is the predicted roughness when **every feature is zero**: **-0.9534 micrometres**. A negative roughness is not physically meaningful, because all-zero settings lie far outside the range of the data. The constant may also absorb **other factors** like **ambient temperature** which are **not considered** in the model.
   - A **positive coefficient value** states that an **increase in wall thickness** (for example) by 1 mm results in an **increase in roughness** of 2.3342 micrometres, with all other features held constant.
   - Similarly, **a negative coefficient value** states that an **increase in bed temperature** by one degree C causes a **55.6225 micrometre decrease** in roughness, again with all other features held constant.
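
As a quick illustration of how the expression above is used, the sketch below evaluates it for the first row shown in `X.head()`. The coefficients are copied from the expression; the names `coefficients`, `intercept`, `predict_roughness` and `first_row` are hypothetical helpers introduced here, not part of the notebook.

```python
# Coefficients copied from the model expression (units: micrometres per unit of feature)
intercept = -0.9534
coefficients = {
    'layer_height': 1269.4449,
    'wall_thickness': 2.3342,
    'infill_density': -0.0423,
    'infill_pattern': -0.1255,
    'nozzle_temperature': 15.0562,
    'bed_temperature': -55.6225,
    'print_speed': 0.6496,
    'material': 298.4514,
    'fan_speed': 7.8989,
}

def predict_roughness(row):
    """Evaluate the linear expression: intercept plus the sum of coefficient * feature."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)

# First row of X.head() as displayed earlier in the notebook
first_row = {
    'layer_height': 0.02, 'wall_thickness': 8, 'infill_density': 90,
    'infill_pattern': 0, 'nozzle_temperature': 220, 'bed_temperature': 60,
    'print_speed': 40, 'material': 0, 'fan_speed': 0,
}

prediction = predict_roughness(first_row)
print(f"Predicted roughness: {prediction:.2f} micrometres")
```

For this row the expression evaluates to roughly 40.3 micrometres. Note how the large nozzle-temperature and bed-temperature terms nearly cancel each other.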

### Now let's eliminate `wall_thickness`, `infill_density` and `infill_pattern` and rebuild the model to see if there's any improvement.

In [24]:
#check X
X.head()

Unnamed: 0,layer_height,wall_thickness,infill_density,infill_pattern,nozzle_temperature,bed_temperature,print_speed,material,fan_speed
0,0.02,8,90,0,220,60,40,0,0
1,0.02,7,90,1,225,65,40,0,25
2,0.02,1,80,0,230,70,40,0,50
3,0.02,4,70,1,240,75,40,0,75
4,0.02,6,90,0,250,80,40,0,100


In [25]:
X = X.drop(['wall_thickness','infill_density','infill_pattern'], axis = 1)

In [26]:
X.head()

Unnamed: 0,layer_height,nozzle_temperature,bed_temperature,print_speed,material,fan_speed
0,0.02,220,60,40,0,0
1,0.02,225,65,40,0,25
2,0.02,230,70,40,0,50
3,0.02,240,75,40,0,75
4,0.02,250,80,40,0,100


In [27]:
#calling the linear regression function
linear_Regression(X,y)

                            OLS Regression Results                            
Dep. Variable:              roughness   R-squared:                       0.872
Model:                            OLS   Adj. R-squared:                  0.857
Method:                 Least Squares   F-statistic:                     59.78
Date:                Mon, 14 Nov 2022   Prob (F-statistic):           1.67e-18
Time:                        16:08:26   Log-Likelihood:                -248.88
No. Observations:                  50   AIC:                             509.8
Df Residuals:                      44   BIC:                             521.2
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
                         coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------
const                 -0.9307      0


## Inference
1. The **R-squared** value drops only slightly, from 87.5% to **87.2%**, while the **adjusted R-squared** rises from 0.851 to 0.857.
2. The **p-values** indicate that all the **remaining features** play an **important role** in predicting the label.
3. We can **improve** the **model** further by considering **interaction terms** based on the **correlation heatmap**.
4. The feature pairs chosen for interaction terms, with their correlations, are as follows
    - Material and nozzle temperature (-0.78)
    - Bed temperature and nozzle temperature (0.6)
    - Fan speed and nozzle temperature (0.6)
    - Fan speed and bed temperature (1)
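
A correlation of 1 between fan speed and bed temperature means one column is an exact linear function of the other, so OLS cannot separate their individual effects. This would also be consistent with the summaries reporting a `Df Model` smaller than the number of features passed in, although that is an inference from the printed output rather than something the notebook states. The sketch below checks the relationship using only the ten rows displayed in this notebook; the full dataset may differ.

```python
import numpy as np

# bed_temperature and fan_speed for the ten rows displayed in this notebook
bed = np.array([60, 65, 70, 75, 80, 60, 65, 70, 75, 80], dtype=float)
fan = np.array([0, 25, 50, 75, 100, 0, 25, 50, 75, 100], dtype=float)

# Pearson correlation between the two columns
correlation = np.corrcoef(bed, fan)[0, 1]

# Design matrix with an intercept column plus the two features.
# A rank below the number of columns signals perfect multicollinearity.
design = np.column_stack([np.ones_like(bed), bed, fan])
rank = np.linalg.matrix_rank(design)

# In these rows fan_speed is exactly 5 * bed_temperature - 300
is_exact_linear = np.allclose(fan, 5 * bed - 300)

print(f"correlation={correlation:.3f}, columns={design.shape[1]}, rank={rank}")
```

When two features are perfectly collinear, their individual coefficients are not uniquely determined, so it is safer to interpret only their combined effect or to drop one of the two.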

In [28]:
X.head()

Unnamed: 0,layer_height,nozzle_temperature,bed_temperature,print_speed,material,fan_speed
0,0.02,220,60,40,0,0
1,0.02,225,65,40,0,25
2,0.02,230,70,40,0,50
3,0.02,240,75,40,0,75
4,0.02,250,80,40,0,100


In [30]:
#get the interaction terms by multiplying values

inter_mn = X['material']*X['nozzle_temperature']
inter_bn = X['bed_temperature']*X['nozzle_temperature']
inter_fn = X['fan_speed']*X['nozzle_temperature']
inter_fb = X['fan_speed']*X['bed_temperature']

In [31]:
# add the interaction terms to the dataset using pandas' concat() function
# we will call the resulting dataset `interaction`

interaction = pd.concat([X,inter_mn,inter_bn,inter_fn,inter_fb], axis = 1)

# the unnamed Series arrive as columns 0..3, so rename them to descriptive names
interaction = interaction.rename(columns = {0:'interct_mn', 1:'interact_bn', 2:'interact_fn',
                             3:'interact_fb'})

interaction.head(10)

Unnamed: 0,layer_height,nozzle_temperature,bed_temperature,print_speed,material,fan_speed,interct_mn,interact_bn,interact_fn,interact_fb
0,0.02,220,60,40,0,0,0,13200,0,0
1,0.02,225,65,40,0,25,0,14625,5625,1625
2,0.02,230,70,40,0,50,0,16100,11500,3500
3,0.02,240,75,40,0,75,0,18000,18000,5625
4,0.02,250,80,40,0,100,0,20000,25000,8000
5,0.02,200,60,40,1,0,200,12000,0,0
6,0.02,205,65,40,1,25,205,13325,5125,1625
7,0.02,210,70,40,1,50,210,14700,10500,3500
8,0.02,215,75,40,1,75,215,16125,16125,5625
9,0.02,220,80,40,1,100,220,17600,22000,8000
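
The same interaction columns can be built and named in one step with `DataFrame.assign`, which avoids relying on the automatic `0..3` column labels. This is an alternative sketch, not the notebook's method; `sample` holds rows 0, 1 and 5 from the table above, and the column name `interact_mn` is spelled differently from the notebook's `interct_mn`.

```python
import pandas as pd

# Rows 0, 1 and 5 from the table above (only the columns needed here)
sample = pd.DataFrame({
    'nozzle_temperature': [220, 225, 200],
    'bed_temperature': [60, 65, 60],
    'material': [0, 0, 1],
    'fan_speed': [0, 25, 0],
})

# Each interaction term is the element-wise product of two feature columns
with_interactions = sample.assign(
    interact_mn=sample['material'] * sample['nozzle_temperature'],
    interact_bn=sample['bed_temperature'] * sample['nozzle_temperature'],
    interact_fn=sample['fan_speed'] * sample['nozzle_temperature'],
    interact_fb=sample['fan_speed'] * sample['bed_temperature'],
)

print(with_interactions)
```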


In [32]:
# Now let's fit the model with the interaction terms using the linear regression function

linear_Regression(interaction,y)

                            OLS Regression Results                            
Dep. Variable:              roughness   R-squared:                       0.925
Model:                            OLS   Adj. R-squared:                  0.910
Method:                 Least Squares   F-statistic:                     63.22
Date:                Mon, 14 Nov 2022   Prob (F-statistic):           1.31e-20
Time:                        16:08:26   Log-Likelihood:                -235.45
No. Observations:                  50   AIC:                             488.9
Df Residuals:                      41   BIC:                             506.1
Df Model:                           8                                         
Covariance Type:            nonrobust                                         
                         coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------
const                 -1.0735      0


## Model Summary
- We now have an R-squared of 0.925 (92.5%) and an adjusted R-squared of 0.910, both higher than in the previous two models.
- We will stop here and take this model as the most suitable of the three for predicting roughness.
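
The comparison between the three models can be double-checked from the numbers printed in the summaries. The sketch below recomputes adjusted R-squared and AIC from the reported R-squared, `Df Model` and log-likelihood with 50 observations. The formulas are the standard ones; the dictionary `models` and the helper names are assumptions introduced for illustration.

```python
n_obs = 50  # "No. Observations" in every summary

# Values copied from the three OLS summaries in this notebook
models = {
    'all features':      {'r2': 0.875, 'df_model': 8, 'llf': -248.19},
    'reduced features':  {'r2': 0.872, 'df_model': 5, 'llf': -248.88},
    'with interactions': {'r2': 0.925, 'df_model': 8, 'llf': -235.45},
}

def adjusted_r2(r2, n, df_model):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - df_model - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - df_model - 1)

def aic(llf, df_model):
    """AIC = 2k - 2 * log-likelihood, with k = df_model + 1 (intercept included)."""
    return 2 * (df_model + 1) - 2 * llf

results = {
    name: {
        'adj_r2': adjusted_r2(m['r2'], n_obs, m['df_model']),
        'aic': aic(m['llf'], m['df_model']),
    }
    for name, m in models.items()
}

# Lower AIC is better; higher adjusted R^2 is better
best_by_aic = min(results, key=lambda name: results[name]['aic'])

for name, r in results.items():
    print(f"{name:18s} adj_r2={r['adj_r2']:.3f} aic={r['aic']:.1f}")
print("Lowest AIC:", best_by_aic)
```

Both criteria favour the model with interaction terms, which agrees with the conclusion above. These are in-sample measures on 50 observations, so a held-out test set would give a more reliable picture of predictive performance.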

## Similarly, we will fit models for predicting tension strength and elongation. I will reduce the explanations here. The procedure is exactly the same: start with all features, then tune for better results.