<a href="https://colab.research.google.com/github/amzad-786githumb/AI_and_ML_by-Microsoft/blob/main/15_Implementing_LASSO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h2>Tasks:</h2>


*   **Implement LASSO regression:** Apply LASSO to perform feature selection and regularization in an ML model.
*   **Adjust regularization strength:** Experiment with different values of the regularization parameter (alpha) to understand its impact on model complexity and performance.
*   **Interpret LASSO results:** Analyze the coefficients of the features to identify which ones are most relevant and how LASSO helps in simplifying the model.


<h3>Step 1: Import the required libraries</h3>

In [1]:
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

<h3>Step 2: Load and prepare the data</h3>

In [2]:
# Sample dataset: Study hours, previous exam scores, and pass/fail labels
data = {
    'StudyHours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'PrevExamScore': [30, 40, 45, 50, 60, 65, 70, 75, 80, 85],
    'Pass': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 0 = Fail, 1 = Pass
}

df = pd.DataFrame(data)

# Features and target variable
X = df[['StudyHours', 'PrevExamScore']]  # Features
y = df['Pass']  # Target variable

<h3>3. Splitting the data</h3>

In [3]:
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
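
With only ten rows, it can help to confirm what the split actually produced. The snippet below is a minimal sketch that re-creates the sample data so it runs on its own. With `test_size=0.2`, only two rows are held out, so any score computed on the test set rests on just two predictions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Re-create the sample dataset from Step 2 so this snippet is self-contained
df = pd.DataFrame({
    'StudyHours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'PrevExamScore': [30, 40, 45, 50, 60, 65, 70, 75, 80, 85],
    'Pass': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})
X = df[['StudyHours', 'PrevExamScore']]
y = df['Pass']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Report the size of each partition and show the held-out rows
print(f'Training rows: {X_train.shape[0]}, test rows: {X_test.shape[0]}')
print(X_test.assign(Pass=y_test))
```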

<h3>4. Applying LASSO</h3>

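LASSO (Least Absolute Shrinkage and Selection Operator) applies L1 regularization: it adds a penalty proportional to the sum of the absolute values of the coefficients to the loss function. This penalty can shrink the coefficients of less important features all the way to zero, which effectively selects only the most relevant features for the model.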
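
For reference, the objective that scikit-learn's `Lasso` minimizes can be written as below. The symbols are notation chosen for this sketch: $n$ is the number of training samples, $X$ the feature matrix, $y$ the target vector, $w$ the coefficient vector, and $\alpha$ the regularization parameter.

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

The first term is the usual squared-error loss and the second is the L1 penalty. With $\alpha = 0$ the penalty vanishes and the problem reduces to ordinary least squares; as $\alpha$ grows, the penalty carries more weight and more coefficients are pushed to exactly zero.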

**Step-by-step process:**

1. **Initialize the LASSO model:** Specify a value for the regularization parameter alpha.
2. **Train the model:** Use the training data to fit the LASSO model.
3. **Evaluate the model:** Use the test data to make predictions and calculate the performance of the model using R-squared.

In [4]:
# Initialize the LASSO model with a regularization strength (alpha) of 0.1
lasso_model = Lasso(alpha=0.1)

# Train the model on the training data
lasso_model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = lasso_model.predict(X_test)

# Evaluate the model's performance using R-squared (true labels vs. predictions)
r2 = r2_score(y_test, y_pred)
print(f'R-squared score: {r2}')

R-squared score: 0.9997884297520662


<h3>6. Analyzing the results</h3>

In [5]:
# Display the coefficients of the features
print(f'LASSO Coefficients: {lasso_model.coef_}')

LASSO Coefficients: [0.         0.02463636]




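*   The coefficient for StudyHours is exactly 0, meaning LASSO removed it from the model at alpha = 0.1. The two features are strongly correlated in this dataset, so StudyHours adds little information beyond PrevExamScore.
*   The coefficient for PrevExamScore is nonzero, meaning it was retained in the model and carries all of the prediction.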
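
A raw coefficient array is easier to read when each value is paired with its feature name. The following is a minimal sketch that rebuilds the data and the alpha = 0.1 model so it runs on its own. The names `coefs` and `selected` are illustrative, not part of the original notebook. It also prints the correlation between the two features in the training set, which is one plausible reason LASSO keeps only one of them.

```python
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Re-create the sample dataset and the same train/test split
df = pd.DataFrame({
    'StudyHours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'PrevExamScore': [30, 40, 45, 50, 60, 65, 70, 75, 80, 85],
    'Pass': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})
X = df[['StudyHours', 'PrevExamScore']]
y = df['Pass']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the same model as in the notebook (alpha = 0.1)
lasso_model = Lasso(alpha=0.1).fit(X_train, y_train)

# Label each coefficient with its feature name (illustrative variable names)
coefs = pd.Series(lasso_model.coef_, index=X.columns)
selected = coefs[coefs != 0].index.tolist()

print(coefs)
print('Selected features:', selected)

# Correlation between the two features in the training data
corr = X_train['StudyHours'].corr(X_train['PrevExamScore'])
print(f'Feature correlation: {corr:.4f}')
```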




<h3>7. Tuning the regularization parameter</h3>

In [7]:
# Try different alpha values and compare the results

for alpha in [0.01, 0.05, 0.1, 0.5, 1]:
    lasso_model = Lasso(alpha=alpha)
    lasso_model.fit(X_train, y_train)
    # Recompute predictions for this alpha so the score reflects the current model
    y_pred = lasso_model.predict(X_test)
    r2 = r2_score(y_test, y_pred)
    print(f'Alpha: {alpha}, R-squared: {r2:.4f}, Coefficients: {lasso_model.coef_}')


Alpha: 0.01, R-squared: 0.9981, Coefficients: [0.08153909 0.01180619]
Alpha: 0.05, R-squared: 0.9999, Coefficients: [0.         0.02481818]
Alpha: 0.1, R-squared: 0.9998, Coefficients: [0.         0.02463636]
Alpha: 0.5, R-squared: 0.9947, Coefficients: [0.         0.02318182]
Alpha: 1, R-squared: 0.9788, Coefficients: [0.         0.02136364]


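*  Lower alpha values keep more features in the model but may lead to overfitting. Here, alpha = 0.01 is the only setting that keeps a nonzero coefficient for StudyHours.

*  Higher alpha values simplify the model but may reduce its accuracy. From alpha = 0.05 upward, StudyHours is dropped, and the PrevExamScore coefficient shrinks steadily, from about 0.0248 at alpha = 0.05 to about 0.0214 at alpha = 1.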
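
Instead of looping over alpha values by hand, the choice can be made by cross-validation on the training data. The sketch below uses scikit-learn's `LassoCV` with the same candidate values. The choice of `cv=4` is an assumption made here so that each fold of the eight training rows holds two samples. With so little data, the selected alpha should be read as illustrative only.

```python
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split

# Re-create the sample dataset and the same train/test split
df = pd.DataFrame({
    'StudyHours': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'PrevExamScore': [30, 40, 45, 50, 60, 65, 70, 75, 80, 85],
    'Pass': [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})
X = df[['StudyHours', 'PrevExamScore']]
y = df['Pass']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Candidate alphas match the manual loop; cv=4 is an assumption for this tiny dataset
candidate_alphas = [0.01, 0.05, 0.1, 0.5, 1]
cv_model = LassoCV(alphas=candidate_alphas, cv=4)
cv_model.fit(X_train, y_train)

# alpha_ holds the value with the lowest mean cross-validated error
print(f'Selected alpha: {cv_model.alpha_}')
print(f'Coefficients at that alpha: {cv_model.coef_}')
```

Because the selection uses only the training rows, the two test rows remain available for a final check of the chosen model.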