# <h1 style="text-align:center;">Machine Learning</h1>


<h2> Backward Elimination </h2>


**Background:**
- Backward elimination is a feature selection technique used when building a machine learning model. Its goal is to remove features that do not significantly affect the prediction of the output (dependent variable).


**Methods for Building a Model in Machine Learning:**
- There are various methods to build a machine learning model, such as:
  1. All-in
  2. Backward Elimination
  3. Forward Selection
  4. Bidirectional Elimination
  5. Score Comparison
- In this explanation, we will focus on the Backward Elimination method.


**Steps of Backward Elimination:**
1. **Select a Significance Level (SL):** Choose a significance level (usually 0.05) that determines which features should stay in the model.

2. **Fit the Complete Model:** Initially, create a model with all available predictors (independent variables).

3. **Check P-values:** Calculate the P-value for each predictor. The P-value measures the significance of a variable's impact on the dependent variable.

4. **Select the Predictor:** Identify the predictor with the highest P-value.
   - If the P-value of this predictor is greater than the chosen significance level (SL), proceed to the next step.
   - If the P-value is less than or equal to SL, you can finish the process, and your model is ready.


5. **Remove the Predictor:** Remove the predictor selected in step 4 from the model.

6. **Rebuild the Model:** Fit a new model with the remaining variables, then return to step 3. Repeat until the highest P-value among the remaining predictors is less than or equal to SL.
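
The decision made in steps 4 and 5 can be sketched as a small helper. This is only an illustration: the function name `predictor_to_remove` and the p-values in `example` are hypothetical and do not come from the 50_Startups data.

```python
from typing import Dict, Optional


def predictor_to_remove(p_values: Dict[str, float], sl: float = 0.05) -> Optional[str]:
    """Return the predictor to drop next, or None if the model is final."""
    if not p_values:
        return None
    # Step 4: find the predictor with the highest p-value.
    worst = max(p_values, key=p_values.get)
    # Step 5: drop it only if its p-value is greater than the significance level.
    return worst if p_values[worst] > sl else None


# Hypothetical p-values, made up purely for illustration.
example = {"feature_a": 0.001, "feature_b": 0.60, "feature_c": 0.07}
print(predictor_to_remove(example))  # feature_b
```

After `feature_b` is removed, the model would be refitted and the helper called again with the new p-values, since p-values change whenever a predictor is dropped.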


**Need for Backward Elimination:**
- The goal of backward elimination is to create an optimal Multiple Linear Regression (MLR) model. In MLR, you typically have several independent variables and one dependent variable, so it is essential to identify which independent variables have the most significant impact on the prediction and which ones have the least.


**Why Backward Elimination:**
- Including unnecessary features can make the model overly complex and may lead to poorer performance. Backward elimination streamlines the model by keeping only the statistically significant features, which simplifies it and can improve its predictive performance.



<h3> Steps for Backward Elimination method:</h3>

We will use the same model that we built in the previous chapter on MLR.

In [1]:
# importing libraries  
import numpy as nm  
import matplotlib.pyplot as mtp  
import pandas as pd  
  
#importing datasets  
data_set= pd.read_csv('50_Startups.csv')  
  
#Extracting Independent and dependent Variable  
x= data_set.iloc[:, :-1].values  
y= data_set.iloc[:, 4].values  
  
# Categorical data  
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Assuming you have your data in the variable 'x' where the 4th column is categorical
# Create a ColumnTransformer to apply transformations to specific columns
column_transformer = ColumnTransformer(
    transformers=[
        ('onehot', OneHotEncoder(), [3])  # Apply OneHotEncoder to the 4th column
    ],
    remainder='passthrough'  # Keep the other columns as they are
)

# Apply the transformations to your data
x = column_transformer.fit_transform(x) 
  
#Avoiding the dummy variable trap:  
x = x[:, 1:]  
  
  
# Splitting the dataset into training and test set.  
from sklearn.model_selection import train_test_split  
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.2, random_state=0)  
  
#Fitting the MLR model to the training set:  
from sklearn.linear_model import LinearRegression  
regressor= LinearRegression()  
regressor.fit(x_train, y_train)  
  
# Predicting the test set result:  
y_pred= regressor.predict(x_test)  
  
#Checking the score  
print('Train Score: ', regressor.score(x_train, y_train))  
print('Test Score: ', regressor.score(x_test, y_test))

Train Score:  0.9501847627493607
Test Score:  0.9347068473282966


The difference between the two scores is about 0.0155, so the model performs similarly on the training and test sets.
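
For a regressor such as `LinearRegression`, `score` returns the coefficient of determination (R²). The sketch below shows how that quantity is computed by hand with NumPy. The function name `r2_manual` and the toy numbers are assumptions made for illustration and are not taken from the dataset above.

```python
import numpy as np


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares: the error left after prediction.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares: the error of always predicting the mean.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Toy values, made up purely for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]
print(round(r2_manual(y_true, y_pred), 3))  # 0.975
```

A score of 1.0 means perfect predictions, while a score of 0.0 means the model does no better than always predicting the mean of `y_true`.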

<h4>Step 1: Preparation of Backward Elimination</h4>

Importing the library: First, we need to import the statsmodels.api module, which is used to estimate various statistical models such as OLS (Ordinary Least Squares).

OLS chooses the model coefficients that minimize the sum of the squared residuals:

$$\text{SSR} = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value for observation $i$.

Below is the code for the import:

In [2]:
import statsmodels.api as smf  
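
To make the OLS objective concrete, here is a minimal NumPy-only sketch that fits a line to toy data and evaluates the sum of squared residuals. The data and variable names are assumptions for illustration. It uses `np.linalg.lstsq` rather than statsmodels, so it reports no p-values.

```python
import numpy as np

# Toy data (made up for illustration): y = 2 + 3 * x exactly.
x_feature = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x_feature

# Design matrix with a leading column of ones for the constant term.
X = np.column_stack([np.ones_like(x_feature), x_feature])

# lstsq returns the coefficients b that minimize sum((y - X @ b) ** 2).
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sum of squared residuals at the fitted coefficients.
residuals = y - X @ coef
ssr = float(np.sum(residuals ** 2))

print(np.round(coef, 6))
print(ssr < 1e-12)
```

Because the toy data lie exactly on a line, the fitted coefficients recover the intercept 2 and slope 3, and the sum of squared residuals is essentially zero.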

Adding a column to the matrix of features: The MLR equation (a) contains a constant term b0, but our matrix of features has no column for it, so we need to add one manually. We add a column in which x0 = 1 for every row, and this column is associated with the constant term b0.
To do this, we use the append function of the NumPy library (already imported as nm) to place a column of ones in front of the existing features. Below is the code for it.

In [3]:
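import numpy as nm
# Prepend a column of ones (x0 = 1); x.shape[0] is the number of rows (50 for this dataset)
x = nm.append(arr=nm.ones((x.shape[0], 1)).astype(int), values=x, axis=1)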

Here we have used axis =1, as we wanted to add a column. For adding a row, we can use axis =0.
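
The effect of the `axis` argument can be checked on a small array. This is a sketch on made-up data; the names `features`, `ones_col`, and `new_row` are illustrative only.

```python
import numpy as np

# A tiny 3 x 2 feature matrix, made up for illustration.
features = np.array([[10.0, 20.0],
                     [30.0, 40.0],
                     [50.0, 60.0]])

# axis=1 joins along columns: the ones become a new first column.
ones_col = np.ones((features.shape[0], 1))
with_const = np.append(arr=ones_col, values=features, axis=1)
print(with_const.shape)  # (3, 3)

# axis=0 joins along rows: the new values become an extra last row.
new_row = np.array([[70.0, 80.0]])
with_row = np.append(arr=features, values=new_row, axis=0)
print(with_row.shape)  # (4, 2)
```

Passing the column of ones as `arr` and the features as `values` is what puts the constant column first. statsmodels also provides `add_constant` for the same purpose, although this chapter builds the column by hand.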

Output: After executing the nm.append call, x has a new first column in which every value is equal to 1. We can check this by displaying x:

In [19]:
x


array([[1, 0.0, 1.0, 165349.2, 136897.8, 471784.1],
       [1, 0.0, 0.0, 162597.7, 151377.59, 443898.53],
       [1, 1.0, 0.0, 153441.51, 101145.55, 407934.54],
       [1, 0.0, 1.0, 144372.41, 118671.85, 383199.62],
       [1, 1.0, 0.0, 142107.34, 91391.77, 366168.42],
       [1, 0.0, 1.0, 131876.9, 99814.71, 362861.36],
       [1, 0.0, 0.0, 134615.46, 147198.87, 127716.82],
       [1, 1.0, 0.0, 130298.13, 145530.06, 323876.68],
       [1, 0.0, 1.0, 120542.52, 148718.95, 311613.29],
       [1, 0.0, 0.0, 123334.88, 108679.17, 304981.62],
       [1, 1.0, 0.0, 101913.08, 110594.11, 229160.95],
       [1, 0.0, 0.0, 100671.96, 91790.61, 249744.55],
       [1, 1.0, 0.0, 93863.75, 127320.38, 249839.44],
       [1, 0.0, 0.0, 91992.39, 135495.07, 252664.93],
       [1, 1.0, 0.0, 119943.24, 156547.42, 256512.92],
       [1, 0.0, 1.0, 114523.61, 122616.84, 261776.23],
       [1, 0.0, 0.0, 78013.11, 121597.55, 264346.06],
       [1, 0.0, 1.0, 94657.16, 145077.58, 282574.31],
       [1, 1.0, 0.0, 9

In [20]:
x.astype(float)

array([[1.0000000e+00, 0.0000000e+00, 1.0000000e+00, 1.6534920e+05,
        1.3689780e+05, 4.7178410e+05],
       [1.0000000e+00, 0.0000000e+00, 0.0000000e+00, 1.6259770e+05,
        1.5137759e+05, 4.4389853e+05],
       [1.0000000e+00, 1.0000000e+00, 0.0000000e+00, 1.5344151e+05,
        1.0114555e+05, 4.0793454e+05],
       [1.0000000e+00, 0.0000000e+00, 1.0000000e+00, 1.4437241e+05,
        1.1867185e+05, 3.8319962e+05],
       [1.0000000e+00, 1.0000000e+00, 0.0000000e+00, 1.4210734e+05,
        9.1391770e+04, 3.6616842e+05],
       [1.0000000e+00, 0.0000000e+00, 1.0000000e+00, 1.3187690e+05,
        9.9814710e+04, 3.6286136e+05],
       [1.0000000e+00, 0.0000000e+00, 0.0000000e+00, 1.3461546e+05,
        1.4719887e+05, 1.2771682e+05],
       [1.0000000e+00, 1.0000000e+00, 0.0000000e+00, 1.3029813e+05,
        1.4553006e+05, 3.2387668e+05],
       [1.0000000e+00, 0.0000000e+00, 1.0000000e+00, 1.2054252e+05,
        1.4871895e+05, 3.1161329e+05],
       [1.0000000e+00, 0.0000000e+00,

<h4>Step 2:</h4>

Now we apply the backward elimination process itself.

1. Choose a Significance Level: Define the significance level (e.g., 0.05) that a predictor must meet to stay in the model. This value is often denoted as SL.


2. Fit the Initial Model: Fit the model with all the possible predictors included.


3. Consider the Predictor with the Highest P-value: Identify the predictor with the highest p-value. If the p-value is greater than your chosen significance level (SL), go to the next step. Otherwise, go to step 6.


4. Remove the Predictor: Remove the predictor identified in step 3.


5. Fit the Model Again: Fit the model without the removed predictor and go back to step 3.


6. Finalize the Model: Once the highest remaining p-value is less than or equal to SL, every predictor left in the model is significant at the chosen level, and your model is ready.