In [10]:
import pandas as pd

# Correct file path based on the file available in your directory
file_path = '/kaggle/input/supervised-learning-algorithm/updated_supervised_learning_algorithms.xlsx'

# Read the Excel file (if the file contains only one sheet, or specify the sheet name if needed)
df = pd.read_excel(file_path, sheet_name='Sheet1')  # Specify the sheet name if needed

# Set pandas to display all rows and columns
#pd.set_option('display.max_rows', None)  # Display all rows
#pd.set_option('display.max_columns', None)  # Display all columns

df

Unnamed: 0,Algorithm,Type,Purpose,Method,Use Cases,One-liner Code
0,Linear Regression,Regression,Predict continuous output values,Linear equation minimizing sum of squares of r...,Predicting continuous values,from sklearn.linear_model import LinearRegress...
1,Logistic Regression,Classification,Predict binary output variable,Logistic function transforming linear relation...,Binary classification tasks,from sklearn.linear_model import LogisticRegre...
2,Decision Trees,Both,Model decisions and outcomes,Tree-like structure with decisions and outcomes,Classification and Regression tasks,from sklearn.tree import DecisionTreeClassifie...
3,Random Forests,Both,Improve classification and regression accuracy,Combining multiple decision trees,"Reducing overfitting, improving prediction acc...",from sklearn.ensemble import RandomForestClass...
4,SVM,Both,Create hyperplane for classification or predic...,Maximizing margin between classes or predictin...,Classification and Regression tasks,from sklearn.svm import SVC\nmodel = SVC().fit...
5,KNN,Both,Predict class or value based on k closest neig...,Finding k closest neighbors and predicting bas...,"Classification and Regression tasks, sensitive...",from sklearn.neighbors import KNeighborsClassi...
6,Gradient Boosting,Both,Combine weak learners to create strong model,Iteratively correcting errors with new models,Classification and Regression tasks to improve...,from sklearn.ensemble import GradientBoostingC...
7,Naive Bayes,Classification,Predict class based on feature independence as...,Bayes’ theorem with feature independence assum...,"Text classification, spam filtering, sentiment...",from sklearn.naive_bayes import GaussianNB\nmo...
8,Polynomial Regression,Regression,Predict continuous values with polynomial rela...,Polynomial terms added to linear regression,"Predicting stock prices, forecasting",from sklearn.linear_model import PolynomialFea...
9,Ridge Regression,Regression,Regularized linear regression to prevent overf...,Minimizing residuals with L2 regularization,Preventing overfitting in linear regression,from sklearn.linear_model import Ridge\nmodel ...


# Linear Regression 
Linear regression uses the relationship between the data points to fit a straight line that passes as close to them as possible.
This line can then be used to predict future values.

Linear regression is one of the most fundamental algorithms in machine learning. It's used to predict a continuous target variable (or dependent variable) based on one or more input features (independent variables). The relationship between the input and the output is assumed to be linear.

The basic equation for linear regression is:

\( y = \beta_0 + \beta_1 x + \epsilon \)

Where:

- \( y \) is the target value being predicted (dependent variable),
- \( x \) is the input feature (independent variable),
- \( \beta_0 \) is the intercept,
- \( \beta_1 \) is the slope (coefficient),
- \( \epsilon \) is the error term (difference between predicted and actual values).

Key Concepts to Understand:

- Slope (\( \beta_1 \)): Indicates how much the dependent variable changes for a unit change in the independent variable.
- Intercept (\( \beta_0 \)): Represents the value of \( y \) when \( x = 0 \).
- Error (\( \epsilon \)): The difference between the predicted value and the actual value. We try to minimize this difference using a method called least squares.
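
For a single feature, the least-squares estimates have a well-known closed form: \( \beta_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2 \) and \( \beta_0 = \bar{y} - \beta_1 \bar{x} \). The sketch below applies these formulas with NumPy. The data values are made up purely for illustration and are not part of this notebook's datasets.

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.2, 5.9, 8.1, 9.9])

x_mean, y_mean = x.mean(), y.mean()

# Closed-form least-squares estimates for one feature
beta_1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
beta_0 = y_mean - beta_1 * x_mean

# Residuals: the part of y that the fitted line does not explain
residuals = y - (beta_0 + beta_1 * x)

print(f"slope = {beta_1:.4f}, intercept = {beta_0:.4f}")
print(f"sum of squared residuals = {np.sum(residuals ** 2):.4f}")
```

Any other choice of slope and intercept would give a larger sum of squared residuals on this data, which is what "least squares" refers to.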

Types of Linear Regression

Simple Linear Regression: Uses one feature to predict the target variable.

Multiple Linear Regression: Uses multiple features to predict the target variable.

Real-World Scenarios

- Predicting House Prices: Based on features like square footage, number of rooms, and location.
- Stock Market Prediction: Predicting future stock prices based on past data and other financial indicators.
- Sales Forecasting: Estimating future sales based on advertising spend or seasonality.

In [None]:
#Predict the speed of a 10 years old car:
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

slope, intercept, r, p, std_err = stats.linregress(x, y)
print(r)
# r is the coefficient of correlation: it measures how strong the linear relationship is.
# The r value ranges from -1 to 1, where 0 means no linear relationship
# and 1 (or -1) means a perfect linear relationship.
# A value close to 0 means a straight line is a poor fit for the data.

def myfunc(x):
  return slope * x + intercept

speed = myfunc(10)

print(speed)
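
To make the meaning of `r` more concrete, the sketch below recomputes the correlation coefficient by hand on the same car data and compares it with the value returned by `stats.linregress`. The variable names `r_manual` and `result` are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# Pearson correlation: co-variation of x and y scaled by the spread of each
dx, dy = x - x.mean(), y - y.mean()
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

result = stats.linregress(x, y)

print(f"manual r = {r_manual:.4f}, linregress r = {result.rvalue:.4f}")
# r squared is the share of the variance in y explained by the fitted line
print(f"r^2 = {result.rvalue ** 2:.4f}")
```

The sign of `r` follows the sign of the slope, so a negative value here reflects that older cars in this sample tend to be slower.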

## Simple Linear Regression

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Dataset
# Square footage. reshape(-1, 1) turns the array into a 2D array with one column,
# which is needed because scikit-learn expects input features in a 2D format.
X = np.array([1500, 1800, 2400, 3000, 3500]).reshape(-1, 1)

y = np.array([400000, 450000, 600000, 650000, 700000])  # House prices (target variable)

# Create an instance of the LinearRegression model.
# This instance (model) will be trained on the data and used to make predictions.
model = LinearRegression()

# Fit (train) the model on the dataset (X, y).
# This step calculates the intercept and slope of the line that best fits the data points.
model.fit(X, y)

# Print the model parameters: intercept and slope of the regression line
print(f"Intercept: {model.intercept_}, Slope: {model.coef_}")

# Predict house price for a new square footage (e.g., 2500 sqft)
predicted_price = model.predict([[2500]])
print(f"Predicted Price for 2500 sqft house: ${predicted_price[0]:,.2f}")

# Visualize the data and the regression line
plt.scatter(X, y, color='blue')  # Scatter plot of the original data points
plt.plot(X, model.predict(X), color='pink')  # Regression line from the model's predictions on X
plt.title("House Price Prediction")  # Set the title of the plot
plt.xlabel("Square Footage")
plt.ylabel("Price")
plt.show()

## Multiple Linear Regression

In [9]:
# Multiple Linear Regression
from sklearn.linear_model import LinearRegression
import numpy as np

# Features: Square footage, number of bedrooms, and age of the house
X_multi = np.array([
    [1500, 3, 10],
    [1800, 4, 5],
    [2400, 3, 20],
    [3000, 4, 10],
    [3500, 5, 15]
])

# Target: Price of the house
y_multi = np.array([400000, 450000, 600000, 650000, 700000])

model_multi = LinearRegression()
model_multi.fit(X_multi, y_multi)

# Print model parameters
print(f"Intercept: {model_multi.intercept_}")
print(f"Coefficients: {model_multi.coef_}")

# Predict price for a new house with 2500 sqft, 4 bedrooms, and 12 years old
predicted_price_multi = model_multi.predict([[2500, 4, 12]])
print(f"Predicted Price for 2500 sqft, 4 bedrooms, 12 years old house: ${predicted_price_multi[0]:,.2f}")


Intercept: 248904.10958903958
Coefficients: [   175.34246575 -33835.61643836    986.30136986]
Predicted Price for 2500 sqft, 4 bedrooms, 12 years old house: $563,753.42
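
The prediction above is just the intercept plus each coefficient multiplied by its feature value. The sketch below refits the same model and reproduces the prediction by hand. The names `new_house`, `manual_price`, and `sklearn_price` are introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as the cell above: square footage, bedrooms, age
X_multi = np.array([
    [1500, 3, 10],
    [1800, 4, 5],
    [2400, 3, 20],
    [3000, 4, 10],
    [3500, 5, 15],
])
y_multi = np.array([400000, 450000, 600000, 650000, 700000])

model_multi = LinearRegression().fit(X_multi, y_multi)

new_house = np.array([2500, 4, 12])

# Manual prediction: intercept + sum of (coefficient * feature value)
manual_price = model_multi.intercept_ + np.dot(model_multi.coef_, new_house)
sklearn_price = model_multi.predict(new_house.reshape(1, -1))[0]

print(f"Manual:  ${manual_price:,.2f}")
print(f"predict: ${sklearn_price:,.2f}")
```

With only five houses and three features, the individual coefficients (for example the negative bedroom coefficient in the output above) should be read with caution.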


## Understanding Ridge and Lasso Regression
Both Ridge and Lasso regression are extensions of linear regression designed to address some of its limitations, especially overfitting in cases where there are many features or highly correlated features.

## Ridge Regression
Purpose: Ridge regression (also called L2 regularization) adds a penalty to the linear regression cost function, aiming to reduce the magnitude of the coefficients. This prevents any one feature from having too much influence on the prediction.

Mathematical Explanation: In regular linear regression, we minimize the sum of squared errors:

\( \text{Cost} = \sum (y - \hat{y})^2 \)

In Ridge Regression, we add a penalty term:

\( \text{Cost} = \sum (y - \hat{y})^2 + \lambda \sum w_i^2 \)

where \( \lambda \) is a parameter that controls the strength of the penalty (regularization term), and \( w_i \) are the weights (coefficients).

Effect: With the penalty, coefficients for less important features are shrunk towards zero, but never exactly zero. This reduces model complexity and overfitting.
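
The shrinking effect can be seen by fitting Ridge with increasing penalty strength. The sketch below uses synthetic data generated with a fixed seed; the "true" weights are hypothetical and chosen only for illustration. In scikit-learn the penalty strength \( \lambda \) is passed as `alpha`.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (assumption: made up for this illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([4.0, -2.0, 0.5])  # hypothetical ground-truth weights
y = X @ true_w + rng.normal(scale=0.5, size=50)

alphas = [0.01, 1.0, 10.0, 100.0]
norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    # L2 norm of the coefficient vector: overall size of the weights
    norms.append(np.linalg.norm(model.coef_))
    print(f"alpha={alpha:>6}: coef={np.round(model.coef_, 3)}, norm={norms[-1]:.3f}")
```

The coefficient norm decreases as `alpha` grows, while the individual coefficients stay non-zero.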

## Lasso Regression
Purpose: Lasso regression (short for Least Absolute Shrinkage and Selection Operator or L1 regularization) also adds a penalty to the cost function, but it penalizes the absolute values of coefficients, which can drive some coefficients to exactly zero.
Mathematical Explanation:

\( \text{Cost} = \sum (y - \hat{y})^2 + \lambda \sum |w_i| \)

Here, the penalty term is the sum of absolute values of the weights.

Effect: Lasso regression can reduce the coefficients of some features to zero, effectively selecting a smaller subset of features. This makes it useful for feature selection.
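
The feature-selection behaviour can be illustrated with synthetic data in which only some features matter. This is a minimal sketch under stated assumptions: the data, the "true" weights, and the value `alpha=0.5` are all made up for the illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumption): only the first two of five features affect y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 0.0])  # hypothetical ground truth
y = X @ true_w + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

print("Lasso coefficients:", np.round(lasso.coef_, 3))
print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("Coefficients set to exactly zero by Lasso:", int(np.sum(lasso.coef_ == 0)))
```

Lasso also shrinks the coefficients it keeps, so the surviving weights are smaller in magnitude than the ones used to generate the data.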


### Why Ridge and Lasso Are Related to Linear Regression
In linear regression, we minimize the prediction error without any regularization term. This means that in cases with too many features, especially when they are highly correlated, linear regression can overfit, capturing noise rather than the true trend. Ridge and Lasso add a regularization term to the cost function, helping to prevent overfitting by controlling the model complexity.

### Real-World Scenario Example
Suppose you’re predicting the price of a house based on features like square footage, number of bedrooms, neighborhood index, etc. If there are many features and some are highly correlated (like house age and condition), linear regression might overfit by giving too much importance to some features. By applying Ridge or Lasso regression:

Ridge: It’ll distribute the weights more evenly, preventing any one feature from dominating.

Lasso: It may drop less important features, making the model simpler and more interpretable.



In [1]:
# Import libraries
import numpy as np
from sklearn.linear_model import Ridge, Lasso, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Example dataset
X = np.array([[1500, 3], [1800, 4], [2400, 3], [3000, 5], [3500, 4]])  # Square footage and number of bedrooms
y = np.array([400000, 450000, 600000, 650000, 700000])  # Prices

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the models
# Ridge Regression with a regularization strength (λ) of 1.0. Higher alpha means more regularization.
ridge = Ridge(alpha=1.0)

# Lasso Regression with a regularization strength of 0.1. A lower alpha applies weaker regularization.
lasso = Lasso(alpha=0.1)

# Basic Linear Regression without regularization
linear = LinearRegression()

# Train the models
ridge.fit(X_train, y_train)
lasso.fit(X_train, y_train)
linear.fit(X_train, y_train)

# Predict with each model
ridge_preds = ridge.predict(X_test)
lasso_preds = lasso.predict(X_test)
linear_preds = linear.predict(X_test)

# Print errors for comparison
# Mean Squared Error (MSE) measures the average squared difference between actual and predicted values.
# Note: with 5 samples and test_size=0.2, the test set contains a single house.
print("Linear Regression MSE:", mean_squared_error(y_test, linear_preds))
print("Ridge Regression MSE:", mean_squared_error(y_test, ridge_preds))
print("Lasso Regression MSE:", mean_squared_error(y_test, lasso_preds))

Linear Regression MSE: 90738216.80545887
Ridge Regression MSE: 174889190.64929992
Lasso Regression MSE: 90742762.46610925
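
Because the test set above holds only one house, these MSE values depend heavily on which house was held out. One possible alternative, sketched below, is leave-one-out cross-validation, where every house serves as the test point once. The sketch also standardizes the features inside a pipeline, since the Ridge and Lasso penalties depend on feature scale. Both choices are assumptions made for this illustration and are not part of the original cell.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same example dataset as above
X = np.array([[1500, 3], [1800, 4], [2400, 3], [3000, 5], [3500, 4]])
y = np.array([400000, 450000, 600000, 650000, 700000])

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1, max_iter=10_000),
}

cv_mse = {}
for name, estimator in models.items():
    # Scaling is fitted on the training folds only, then applied to the held-out house
    pipeline = make_pipeline(StandardScaler(), estimator)
    scores = cross_val_score(
        pipeline, X, y, cv=LeaveOneOut(), scoring="neg_mean_squared_error"
    )
    cv_mse[name] = -scores.mean()  # scikit-learn returns negated MSE
    print(f"{name} leave-one-out MSE: {cv_mse[name]:,.2f}")
```

With five samples the estimates remain noisy, so this is a way to compare models more fairly rather than a reliable measure of accuracy.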


In [4]:
# Predict prices for new data points
# Two new houses with features [square footage, number of bedrooms]
new_data = np.array([[2000, 3], [2800, 4]])

ridge_new_preds = ridge.predict(new_data)
lasso_new_preds = lasso.predict(new_data)
linear_new_preds = linear.predict(new_data)

# Display predictions
print("\nPredictions for new data points:")
for i, house in enumerate(new_data):
    print(f"House {i+1} (Square footage: {house[0]}, Bedrooms: {house[1]})")
    print(f"  Linear Regression Predicted Price: {linear_new_preds[i]:,.2f}")
    print(f"  Ridge Regression Predicted Price:  {ridge_new_preds[i]:,.2f}")
    print(f"  Lasso Regression Predicted Price:  {lasso_new_preds[i]:,.2f}")


Predictions for new data points:
House 1 (Square footage: 2000, Bedrooms: 3)
  Linear Regression Predicted Price: 501,360.81
  Ridge Regression Predicted Price:  500,115.02
  Lasso Regression Predicted Price:  501,360.73
House 2 (Square footage: 2800, Bedrooms: 4)
  Linear Regression Predicted Price: 616,213.06
  Ridge Regression Predicted Price:  616,628.33
  Lasso Regression Predicted Price:  616,213.09


## Linear, Ridge, and Lasso Regression Interview Questions
 
---

### **General Linear Regression Questions**
1. **What is Linear Regression? Explain its working.**
   - Expected answer: A supervised learning algorithm used to predict a continuous target variable by finding a linear relationship between the input features and the target.

2. **What are the assumptions of Linear Regression?**
   - Linear relationship between features and target.
   - Homoscedasticity (constant variance of errors).
   - Independence of errors.
   - No multicollinearity among features.
   - Errors are normally distributed.

3. **What is the formula for a simple Linear Regression model?**
   - \( y = \beta_0 + \beta_1x + \epsilon \), where:
     - \( y \): Target variable.
     - \( \beta_0 \): Intercept.
     - \( \beta_1 \): Slope (coefficient for feature \( x \)).
     - \( \epsilon \): Error term.

4. **How do you evaluate the performance of a Linear Regression model?**
   - Metrics: Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R-squared, Adjusted R-squared.

5. **What are the limitations of Linear Regression?**
   - Sensitivity to outliers.
   - Assumes a linear relationship between features and the target.
   - Can overfit when there are too many features without regularization.
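
The metrics listed under question 4 can be computed as in the sketch below. The actual and predicted values, and the assumed number of features `n_features`, are hypothetical and only serve to show the calculations.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 9.0, 11.0, 13.5])
y_pred = np.array([2.8, 5.4, 7.0, 9.3, 11.2, 13.0])
n_features = 2  # assumed number of input features in the model

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)  # same units as the target
r2 = r2_score(y_true, y_pred)

# Adjusted R-squared penalizes R-squared for the number of features used
n = len(y_true)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)

print(f"MSE = {mse:.4f}, RMSE = {rmse:.4f}")
print(f"R-squared = {r2:.4f}, Adjusted R-squared = {adj_r2:.4f}")
```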

---

### **Ridge Regression Questions**
1. **What is Ridge Regression? How is it different from Linear Regression?**
   - Ridge Regression adds an \( L2 \)-regularization term (\( \lambda \sum_{j=1}^{p} \beta_j^2 \)) to the loss function to penalize large coefficients, reducing overfitting.

2. **Explain the Ridge Regression loss function.**
   - \( J(\beta) = \sum_{i=1}^{n} (y_i - (\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}))^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \), where \( \lambda \) controls the regularization strength.

3. **When should you use Ridge Regression?**
   - When the dataset has multicollinearity or a large number of features, and you want to reduce overfitting.

4. **What is the role of the hyperparameter \( \lambda \) in Ridge Regression?**
   - \( \lambda \) determines the strength of regularization. Larger \( \lambda \) values shrink coefficients more, potentially leading to underfitting.

5. **Does Ridge Regression set any coefficients to zero?**
   - No, Ridge Regression reduces the magnitude of coefficients but does not set them to zero.

---

### **Lasso Regression Questions**
1. **What is Lasso Regression? How does it differ from Ridge Regression?**
   - Lasso Regression adds an \( L1 \)-regularization term (\( \lambda \sum_{j=1}^{p} |\beta_j| \)) to the loss function, which can shrink some coefficients to exactly zero, performing feature selection.

2. **Explain the Lasso Regression loss function.**
   - \( J(\beta) = \sum_{i=1}^{n} (y_i - (\beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}))^2 + \lambda \sum_{j=1}^{p} |\beta_j| \).

3. **When should you use Lasso Regression?**
   - When you want to perform feature selection or simplify a model by eliminating irrelevant features.

4. **What is the significance of \( \lambda \) in Lasso Regression?**
   - \( \lambda \) controls the regularization strength. Higher \( \lambda \) values increase sparsity by setting more coefficients to zero.

5. **What are the key differences between Lasso and Ridge Regression?**
   - Ridge penalizes \( L2 \) norm; Lasso penalizes \( L1 \) norm.
   - Ridge shrinks coefficients but does not set them to zero; Lasso can set coefficients to zero, performing feature selection.

---

### **Comparison and Practical Scenarios**
1. **How do Ridge and Lasso Regression address multicollinearity?**
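   - Ridge: Does not remove multicollinearity, but stabilizes the coefficient estimates by penalizing large coefficients, so correlated features tend to share the weight.
   - Lasso: Tends to keep one feature from a group of correlated features and set the others to zero, which reduces multicollinearity among the selected features.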

2. **What happens when \( \lambda = 0 \) in Ridge or Lasso Regression?**
   - Both regressions become equivalent to standard Linear Regression.

3. **Can you combine Ridge and Lasso? What is the resulting algorithm?**
   - Yes, the combined approach is called **Elastic Net**, which uses both \( L1 \) and \( L2 \)-regularization.

4. **Which regression model would you choose for high-dimensional datasets and why?**
   - Lasso for feature selection when many features are irrelevant.
   - Ridge for datasets with multicollinearity or when all features are potentially useful.

5. **Why might you prefer Lasso over Ridge in some cases?**
   - When you want a simpler model by automatically excluding irrelevant features.
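
The claims in questions 2 and 3 can be checked with a short sketch. It uses synthetic data with hypothetical weights. Ridge is used for the \( \lambda = 0 \) check because the scikit-learn documentation advises against fitting `Lasso` with `alpha=0`. The `ElasticNet` settings (`alpha=0.1`, `l1_ratio=0.5`) are arbitrary illustrative values.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LinearRegression, Ridge

# Synthetic data (assumption: made up for this illustration)
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.5, -3.0, 0.0, 2.0]) + rng.normal(scale=0.3, size=60)

# With no penalty, Ridge should recover the ordinary least-squares solution
ols = LinearRegression().fit(X, y)
ridge_zero = Ridge(alpha=0.0).fit(X, y)
print("Ridge(alpha=0) matches LinearRegression:",
      np.allclose(ols.coef_, ridge_zero.coef_))

# Elastic Net mixes both penalties; l1_ratio sets the share of the L1 part
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print("Elastic Net coefficients:", np.round(enet.coef_, 3))
```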

---

### **Coding-Based Questions**
1. **How do you implement Ridge and Lasso Regression in Python using scikit-learn?**
   ```python
   from sklearn.linear_model import Ridge, Lasso
   ridge = Ridge(alpha=1.0)
   lasso = Lasso(alpha=0.1)
   ridge.fit(X_train, y_train)
   lasso.fit(X_train, y_train)
   ```

2. **How do you tune \( \lambda \) (regularization parameter) in Ridge and Lasso Regression?**
   - Use **GridSearchCV** or **RandomizedSearchCV** to search over candidate values of \( \lambda \) (called `alpha` in scikit-learn) and keep the value with the best cross-validated score.

3. **How do you evaluate the performance of Ridge and Lasso models?**
   - Use metrics like MSE, RMSE, and R-squared on test data and compare with Linear Regression.

4. **What datasets are suitable for Ridge and Lasso Regression?**
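   - Ridge: High-dimensional datasets, especially when features are correlated and most of them are expected to be useful.
   - Lasso: Datasets where only a few features are expected to be relevant, so a sparse set of coefficients is appropriate.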
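
As a sketch of the tuning approach from question 2, the example below runs `GridSearchCV` over a few candidate `alpha` values for Ridge. The synthetic data, the candidate grid, and the choice of 5 folds are assumptions made for illustration. The same pattern applies to `Lasso`.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data (assumption: made up for this illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
y = X @ np.array([2.0, -1.0, 0.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=80)

# Candidate regularization strengths (hypothetical grid)
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = GridSearchCV(
    Ridge(), param_grid, cv=5, scoring="neg_mean_squared_error"
)
search.fit(X, y)

print("Best alpha:", search.best_params_["alpha"])
print(f"Cross-validated MSE at best alpha: {-search.best_score_:.4f}")
```

After fitting, `search.best_estimator_` holds a Ridge model refit on all of the data with the selected `alpha`.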