# Multiple Linear Regression

Practical machine learning problems often involve multiple features (or independent variables). **Simple linear regression**, which uses a single independent variable, is the usual way to introduce regression. Real-world targets usually depend on several variables at once, and that is where **multiple linear regression** comes in.

## What is Multiple Linear Regression?

**Multiple linear regression** models the relationship between a continuous dependent variable (the target) and two or more independent variables (the features). It extends **simple linear regression** by fitting one coefficient per feature, plus an intercept.





```python
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample data (House prices dataset)
data = {
    'Square_Feet': [1500, 1800, 2400, 3000, 3500],
    'Bedrooms': [3, 4, 3, 5, 4],
    'Age': [10, 15, 20, 5, 8],
    'Price': [400000, 500000, 600000, 650000, 700000]
}

# Create a DataFrame
df = pd.DataFrame(data)
```


```python
# Independent variables (features)
X = df[['Square_Feet', 'Bedrooms', 'Age']]

# Dependent variable (target)
y = df['Price']

# Split the data into training and testing sets.
# With only 5 rows, test_size=0.2 holds out a single row for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the held-out data
y_pred = model.predict(X_test)

# Evaluate the model (MSE is in squared units of the target)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
```


Mean Squared Error: 439085070.8493813
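The MSE above is expressed in squared price units, which makes it hard to read. A minimal sketch of how it might be interpreted: taking the square root gives the root mean squared error (RMSE), about 20,954 in the same units as `Price`. The snippet also checks how many rows the split holds out; the names `train_idx` and `test_idx` are illustrative and not part of the original code.

```python
import math
from sklearn.model_selection import train_test_split

# Same split settings as above, applied to the five row indices
train_idx, test_idx = train_test_split(list(range(5)), test_size=0.2, random_state=42)
print(f"Training rows: {len(train_idx)}, test rows: {len(test_idx)}")

# MSE value copied from the output above
mse = 439085070.8493813

# RMSE is in the same units as the target (price)
rmse = math.sqrt(mse)
print(f"RMSE: {rmse:,.2f}")
```

Because only one row is held out, this RMSE is simply the absolute error of a single prediction. The model also has four parameters (three coefficients plus the intercept) and is trained on four rows, so the number says little about how well it would generalize. A larger dataset would be needed for a meaningful evaluation.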


### Key Points:
- **Goal**: The goal of multiple linear regression is to predict a continuous target variable based on multiple input features.
- **Regression Process**: Fitting finds the hyperplane that best represents the relationship between the target and the independent variables. For ordinary least squares, which scikit-learn's `LinearRegression` uses, "best" means minimizing the sum of squared differences between the observed and predicted target values.
- **Hyperplane**: With \( n \) features, the fitted model is a flat \( n \)-dimensional surface in an \( (n+1) \)-dimensional space that has one axis per feature plus one for the target. With one feature it is a line, with two features a plane. The prediction is a weighted combination of all the features.
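The "best hyperplane" idea can be sketched directly with NumPy, assuming the ordinary least squares criterion and using all five rows of the sample data (unlike the train/test split above). The variable names here (`A`, `beta`, `sse`) are illustrative. The sketch solves for the coefficients and then checks that nudging one of them makes the sum of squared errors worse.

```python
import numpy as np

# Features: square feet, bedrooms, age (all five sample rows)
X = np.array([
    [1500, 3, 10],
    [1800, 4, 15],
    [2400, 3, 20],
    [3000, 5, 5],
    [3500, 4, 8],
], dtype=float)
y = np.array([400000, 500000, 600000, 650000, 700000], dtype=float)

# Prepend a column of ones so the first coefficient acts as the intercept
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution: minimizes the sum of squared residuals
beta, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)

def sse(coefficients):
    """Sum of squared errors for a given coefficient vector."""
    residuals = y - A @ coefficients
    return float(residuals @ residuals)

# Any other hyperplane fits worse; here the bedrooms coefficient is shifted
perturbed = beta + np.array([0.0, 0.0, 1000.0, 0.0])

print("Coefficients (intercept first):", beta)
print("SSE at the least-squares solution:", sse(beta))
print("SSE after perturbing one coefficient:", sse(perturbed))
```

The perturbation is arbitrary; any change to the coefficients would raise the sum of squared errors, since the least-squares solution is its minimum.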

## Example: Multiple Linear Regression

Let's say we want to predict the **house price** based on various independent variables such as **square footage, number of bedrooms, and age of the house**.

### Formula:
The equation for multiple linear regression is:
\[
y = \beta_0 + \beta_1 \cdot X_1 + \beta_2 \cdot X_2 + \beta_3 \cdot X_3 + \cdots + \beta_n \cdot X_n + \epsilon
\]
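where:
- \( y \) is the dependent variable (house price)
- \( X_1, X_2, \ldots, X_n \) are the independent variables (in the house example \( n = 3 \): square footage, number of bedrooms, and age)
- \( \beta_0 \) is the intercept, the predicted value of \( y \) when every \( X_i \) is zero
- \( \beta_1, \beta_2, \ldots, \beta_n \) are the coefficients; each \( \beta_i \) is the change in the predicted \( y \) for a one-unit increase in \( X_i \) with the other variables held fixed
- \( \epsilon \) is the error term, the part of \( y \) that the linear combination does not explain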
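To connect the formula to the fitted model, the sketch below evaluates \( \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 \) by hand and compares it with `model.predict`. It makes two assumptions that are not in the original code: the model is fitted on all five rows rather than on a training split, and the house being priced (2000 square feet, 3 bedrooms, 12 years old) is a made-up example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    'Square_Feet': [1500, 1800, 2400, 3000, 3500],
    'Bedrooms': [3, 4, 3, 5, 4],
    'Age': [10, 15, 20, 5, 8],
    'Price': [400000, 500000, 600000, 650000, 700000],
})
features = ['Square_Feet', 'Bedrooms', 'Age']

# Fit on all rows (assumption: no train/test split in this sketch)
model = LinearRegression().fit(df[features], df['Price'])

# Hypothetical house, not part of the dataset
new_house = pd.DataFrame([[2000, 3, 12]], columns=features)

# beta_0 + beta_1 * X_1 + beta_2 * X_2 + beta_3 * X_3, computed by hand
manual = model.intercept_ + np.dot(model.coef_, new_house.iloc[0].to_numpy())

# The same prediction through the scikit-learn API
predicted = model.predict(new_house)[0]

print(f"Intercept (beta_0): {model.intercept_}")
print(f"Coefficients (beta_1..beta_3): {model.coef_}")
print(f"Manual: {manual}, predict(): {predicted}")
```

The two values agree up to floating-point rounding, which shows that `intercept_` and `coef_` hold the estimated \( \beta_0 \) and \( \beta_1, \ldots, \beta_n \) from the equation.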
