# Multiple Linear Regression and Logistic Regression

## 📖 Theory Explanation

### Multiple Linear Regression

#### Definition
A statistical method that extends simple linear regression to predict a dependent variable (target) using multiple independent variables (features).

#### Equation
$$
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \epsilon
$$

Where:

- $Y$: Dependent variable (target).
- $X_1, X_2, \dots, X_n$: Independent variables (features).
- $\beta_0$: Intercept, the expected value of $Y$ when every feature is zero.
- $\beta_1, \dots, \beta_n$: Coefficients; $\beta_i$ is the expected change in $Y$ for a one-unit increase in $X_i$ with the other features held fixed.
- $\epsilon$: Error term, the part of $Y$ that the linear combination of the features does not explain.
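To make the equation concrete, here is a minimal sketch that evaluates its right-hand side for a single observation, leaving out the error term. The `predict` helper and the coefficient values are hypothetical, chosen only for illustration; they are not fitted from any data.

```python
def predict(intercept, coefficients, features):
    """Return beta_0 + beta_1*x_1 + ... + beta_n*x_n for one observation."""
    if len(coefficients) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical parameters (assumed for illustration, not estimated from data):
beta_0 = 50_000.0            # intercept
betas = [100.0, 20_000.0]    # price per unit of area, price per room

# One house described by two features: area = 1500, rooms = 3.
y_hat = predict(beta_0, betas, [1500, 3])
print(y_hat)  # 50000 + 100*1500 + 20000*3 = 260000.0
```

Fitting a model amounts to choosing the intercept and coefficients so that predictions like `y_hat` fall as close as possible to the observed targets.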


#### Applications
- Predicting house prices based on factors like size, location, and age.
- Estimating sales revenue based on marketing spend and customer demographics.

---

### Logistic Regression

#### Definition
A classification algorithm that models the probability of a binary outcome, such as Yes/No or True/False.

#### Sigmoid Function
Maps the linear combination of the features to a probability between 0 and 1:

$$
P(Y = 1 \mid X) = \frac{1}{1 + e^{-\left(\beta_0 + \beta_1 X_1 + \dots + \beta_n X_n\right)}}
$$

#### Key Points

- $P(Y = 1 \mid X)$: Probability of the dependent variable being 1, given $X$.
- $\beta_0, \beta_1, \dots, \beta_n$: Parameters learned from data.
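The sigmoid formula can be sketched in a few lines of plain Python. The helper names `sigmoid` and `predict_probability`, and the parameter values in the usage lines, are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Map a real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an algebraically equivalent form so that
    # math.exp never receives a large positive argument (avoids OverflowError).
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(intercept, coefficients, features):
    """P(Y = 1 | X) for one observation."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(z)


print(sigmoid(0.0))  # 0.5: a linear output of zero means even odds

# Hypothetical parameters: intercept -3.0 and one coefficient 1.0, so z = 0.
print(predict_probability(-3.0, [1.0], [3.0]))  # 0.5
```

Large positive linear outputs push the probability toward 1 and large negative ones push it toward 0, which is what lets a linear score be read as a probability.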



#### Goal
Predict probabilities and classify based on a threshold (commonly 0.5).
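As a small sketch of the thresholding step, the hypothetical `classify` helper below turns a probability into a class label. The probabilities are made up for illustration, and treating a probability exactly equal to the threshold as class 1 is a choice made here, not a universal convention.

```python
def classify(probability, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if probability >= threshold else 0


# Hypothetical predicted probabilities P(Y = 1 | X) for four observations.
probs = [0.10, 0.45, 0.50, 0.90]

default_labels = [classify(p) for p in probs]
strict_labels = [classify(p, threshold=0.8) for p in probs]

print(default_labels)  # [0, 0, 1, 1]
print(strict_labels)   # [0, 0, 0, 1]
```

Raising the threshold makes the classifier more reluctant to predict class 1, which can be useful when false positives are costlier than false negatives.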

#### Applications
- Predicting whether a customer will churn.
- Classifying emails as spam or not spam.

## 💻 Practical Implementation
### 1. Multiple Linear Regression Example

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Sample dataset
data = {'Area': [1200, 1500, 1800, 2000, 2500],
        'Rooms': [2, 3, 4, 4, 5],
        'Price': [300000, 350000, 400000, 450000, 500000]}

df = pd.DataFrame(data)

# Features (X) and target (y)
X = df[['Area', 'Rooms']]
y = df['Price']

# Fit the model
model = LinearRegression()
model.fit(X, y)

# Predict on the training data and measure the error
predictions = model.predict(X)
mse = mean_squared_error(y, predictions)

print("Predicted Prices:", predictions)
print("Mean Squared Error (MSE):", mse)
```

```
Predicted Prices: [303125. 353125. 403125. 431250. 509375.]
Mean Squared Error (MSE): 93750000.0
```


#### Output Explanation:

- **Predicted Prices:** Estimated prices based on the input features.
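- **MSE:** The average squared difference between the predicted and actual prices; lower is better. Here it is computed on the same data the model was trained on, so it measures how well the model fits that data, not how it would perform on unseen houses.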
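The MSE reported above can be reproduced by hand from the actual and predicted prices, as in this short sketch. The RMSE line is an addition for illustration: it is simply the square root of the MSE, which brings the error back to the units of the target.

```python
import math

actual = [300000, 350000, 400000, 450000, 500000]
predicted = [303125.0, 353125.0, 403125.0, 431250.0, 509375.0]

# Residual = actual minus predicted, one per house.
residuals = [a - p for a, p in zip(actual, predicted)]

# MSE = mean of the squared residuals.
mse = sum(r ** 2 for r in residuals) / len(residuals)

# RMSE puts the error back into the units of the target (price).
rmse = math.sqrt(mse)

print(residuals)    # [-3125.0, -3125.0, -3125.0, 18750.0, -9375.0]
print(mse)          # 93750000.0
print(round(rmse))  # 9682
```

Because the residuals are squared, the single large miss on the fourth house (18,750) dominates the total.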

### 2. Logistic Regression Example

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Sample dataset
data = {'Hours_Studied': [1, 2, 3, 4, 5],
        'Pass': [0, 0, 1, 1, 1]}  # 0: Fail, 1: Pass

df = pd.DataFrame(data)

# Feature (X) and binary target (y)
X = df[['Hours_Studied']]
y = df['Pass']

# Fit the classifier
log_model = LogisticRegression()
log_model.fit(X, y)

# Class labels and per-class probabilities for the training data
predictions = log_model.predict(X)
probabilities = log_model.predict_proba(X)

print("Predicted Classes:", predictions)
print("Predicted Probabilities:", probabilities)
print("Accuracy Score:", accuracy_score(y, predictions))
print("Classification Report:\n", classification_report(y, predictions))
```

```
Predicted Classes: [0 0 1 1 1]
Predicted Probabilities: [[0.8157613  0.1842387 ]
 [0.60836998 0.39163002]
 [0.35275336 0.64724664]
 [0.16051755 0.83948245]
 [0.06286686 0.93713314]]
Accuracy Score: 1.0
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00         2
           1       1.00      1.00      1.00         3

    accuracy                           1.00         5
   macro avg       1.00      1.00      1.00         5
weighted avg       1.00      1.00      1.00         5
```



#### Output Explanation:

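- **Predicted Classes:** Pass (1) or fail (0), assigned by comparing the predicted pass probability with 0.5.
- **Predicted Probabilities:** One row per student, in the order [P(fail), P(pass)]; for example, the first row gives a pass probability of about 0.18.
- **Accuracy Score:** The fraction of students classified correctly (1.0 here, measured on the training data).
- **Classification Report:** Shows precision, recall, and F1-score for each class, along with the number of samples per class (support).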
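To show what the numbers in the classification report mean, the sketch below computes precision, recall, and F1 for one class from plain label lists. The `precision_recall_f1` helper is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for simplicity. The second set of predictions is invented to show what a single mistake does to the scores.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumption: report 0.0 when a score is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


y_true = [0, 0, 1, 1, 1]

# Predictions from the example above: every label is correct.
print(precision_recall_f1(y_true, [0, 0, 1, 1, 1]))  # (1.0, 1.0, 1.0)

# Hypothetical predictions with one false positive (second student).
p, r, f1 = precision_recall_f1(y_true, [0, 1, 1, 1, 1])
print(p, r, round(f1, 3))  # 0.75 1.0 0.857
```

Precision drops when the model predicts "pass" for a student who failed, while recall stays at 1.0 because no actual pass was missed.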

---

## 🔑 Key Takeaways
- **Multiple Linear Regression:** Effective for predicting continuous outcomes influenced by multiple factors.
- **Logistic Regression:** Ideal for binary classification problems.
- Both models are fundamental supervised learning techniques.

---

## Conclusion
These models serve as the building blocks for more advanced predictive and classification tasks in real-world applications! 🚀

---