# Machine Learning Basics: Linear Regression
This notebook introduces the basic concepts of machine learning with a focus on linear regression.

## 1. Introduction to Machine Learning
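Machine Learning (ML) is a field of artificial intelligence in which computers learn patterns from data instead of following explicitly programmed rules. Linear regression, the focus of this notebook, is a supervised learning method: it learns from examples that pair inputs with known outputs.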

## 2. Linear Regression
Linear regression is a statistical method used to model the relationship between a dependent variable (y) and one or more independent variables (x).

### 2.1 The Mathematical Formula
For a simple linear regression with one feature:
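\[ y = b_0 + b_1 x \]
For multiple features (multiple linear regression):
\[ y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \]
Here \( b_0 \) is the intercept and \( b_1, \dots, b_n \) are the slopes, one per feature. The goal is to find the coefficients \( b_0, b_1, \dots, b_n \) that minimize the error between the predicted and the observed values of \( y \).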
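
One common way to make "the error" concrete is the sum of squared differences between the observed values and the values a candidate line predicts. The sketch below compares two candidate lines on the small dataset used in the manual example of this notebook. The helper name `sum_squared_error` and the first coefficient pair are illustrative choices, not part of the original notebook. The second pair is what the manual calculation in Section 4 yields for this data.

```python
import numpy as np

# Same small dataset as the manual example in this notebook
X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])

def sum_squared_error(b0, b1, X, y):
    """Sum of squared differences between observed y and the line b0 + b1 * X."""
    residuals = y - (b0 + b1 * X)
    return float(np.sum(residuals ** 2))

# Two candidate lines: an arbitrary guess and the least-squares fit
sse_guess = sum_squared_error(0.0, 2.0, X, y)
sse_ols = sum_squared_error(-1.0, 2.2, X, y)

print(f"SSE for b0 = 0.0, b1 = 2.0: {sse_guess:.2f}")   # 4.00
print(f"SSE for b0 = -1.0, b1 = 2.2: {sse_ols:.2f}")    # 2.80
```

The second line has the smaller error, which is the sense in which its coefficients are "better".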

## 3. Ordinary Least Squares (OLS) Method
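The OLS method estimates the coefficients by minimizing the sum of squared residuals, where a residual is the difference between an observed value and the value the model predicts for it.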
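
For the one-feature case, this objective and its solution can be written out explicitly. The notation below is introduced here for illustration: \( m \) is the number of data points \( (x_i, y_i) \), and \( \bar{x} \), \( \bar{y} \) are the sample means. The quantity being minimized is
\[ S(b_0, b_1) = \sum_{i=1}^{m} \left( y_i - b_0 - b_1 x_i \right)^2 \]
Setting the partial derivatives of \( S \) with respect to \( b_0 \) and \( b_1 \) to zero gives the closed-form estimates
\[ b_1 = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]
These are the same expressions that the manual example in Section 4 evaluates with NumPy.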

## 4. Example: Finding Coefficients Manually

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Example data
X = np.array([1, 2, 3, 4, 5])  # Independent variable
y = np.array([2, 3, 5, 7, 11])  # Dependent variable

# Compute means
X_mean = np.mean(X)
y_mean = np.mean(y)

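# Closed-form OLS estimates:
# slope = sum of cross-deviations / sum of squared deviations of X
b1 = np.sum((X - X_mean) * (y - y_mean)) / np.sum((X - X_mean) ** 2)
# The intercept makes the fitted line pass through (X_mean, y_mean)
b0 = y_mean - b1 * X_mean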

print(f"Calculated coefficients: b0 = {b0:.2f}, b1 = {b1:.2f}")
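
As a cross-check, the same coefficients can be obtained by solving the least-squares problem in matrix form, which also extends to more than one feature. The following is a minimal sketch that uses `np.linalg.lstsq`. The variable names `A`, `b0_lstsq`, and `b1_lstsq` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])

# Design matrix: a column of ones (for the intercept) next to the feature column
A = np.column_stack([np.ones(len(X)), X])

# Solve min ||A @ b - y||^2; the first returned value holds [b0, b1]
coeffs, residual_ss, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
b0_lstsq, b1_lstsq = coeffs

print(f"Least-squares coefficients: b0 = {b0_lstsq:.2f}, b1 = {b1_lstsq:.2f}")
# Least-squares coefficients: b0 = -1.00, b1 = 2.20
```

Adding more feature columns to `A` gives the multiple-regression case without changing the rest of the code.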

## 5. Implementing Regression using Scikit-Learn

In [None]:
from sklearn.linear_model import LinearRegression

# scikit-learn expects a 2-D feature array of shape (n_samples, n_features)
X = X.reshape(-1, 1)

# Train the model
model = LinearRegression()
model.fit(X, y)

# Get coefficients
b0_sklearn = model.intercept_
b1_sklearn = model.coef_[0]

print(f"Scikit-learn coefficients: b0 = {b0_sklearn:.2f}, b1 = {b1_sklearn:.2f}")
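
Once fitted, the model can predict the target for inputs it has not seen. The sketch below refits the same one-feature model so that it runs on its own. The values in `X_new` are hypothetical and lie outside the observed range of X, so the predictions are extrapolations and should be read with caution.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 3, 5, 7, 11])

model = LinearRegression().fit(X, y)

# New inputs must also be 2-D: one row per sample, one column per feature
X_new = np.array([[6], [7]])  # hypothetical unseen values
y_new = model.predict(X_new)

for x_val, y_val in zip(X_new[:, 0], y_new):
    print(f"x = {x_val}: predicted y = {y_val:.2f}")
# x = 6: predicted y = 12.20
# x = 7: predicted y = 14.40
```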

## 6. Visualizing the Regression Line

In [None]:
plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(X, model.predict(X), color='red', label='Regression Line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()

## 7. Complex Example: Multiple Linear Regression

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generating a dataset with multiple features
np.random.seed(42)
X_complex = np.random.rand(100, 3)  # Three independent variables
# Target: true intercept 3 and slopes 5, 2, 1.5, plus standard normal noise
y_complex = 3 + 5 * X_complex[:, 0] + 2 * X_complex[:, 1] + 1.5 * X_complex[:, 2] + np.random.randn(100)

# Splitting into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_complex, y_complex, test_size=0.2, random_state=42)

# Training the model
model_complex = LinearRegression()
model_complex.fit(X_train, y_train)

# Getting coefficients
b0_complex = model_complex.intercept_
b1_complex, b2_complex, b3_complex = model_complex.coef_

print(f"Complex Model Coefficients: b0 = {b0_complex:.2f}, b1 = {b1_complex:.2f}, b2 = {b2_complex:.2f}, b3 = {b3_complex:.2f}")

# Making predictions
y_pred = model_complex.predict(X_test)

# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
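
Mean squared error is expressed in squared units of the target, which can be hard to interpret. Two related measures are the root mean squared error (RMSE), which is in the same units as the target, and the coefficient of determination \( R^2 \), the share of the target's variance that the model explains. The sketch below defines both with NumPy. The function names `rmse` and `r_squared` and the demo arrays are illustrative and not part of the original notebook.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Small made-up arrays, used only to show the calls
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 6.5, 9.5])

print(f"RMSE: {rmse(y_true_demo, y_pred_demo):.2f}")       # RMSE: 0.50
print(f"R^2: {r_squared(y_true_demo, y_pred_demo):.2f}")   # R^2: 0.95
```

In the notebook, the same functions could be called as `rmse(y_test, y_pred)` and `r_squared(y_test, y_pred)`. scikit-learn also provides `r2_score` in `sklearn.metrics`, and `model_complex.score(X_test, y_test)` returns the same \( R^2 \) value.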

## 8. Conclusion
This notebook introduced linear regression and showed how to calculate the coefficients both manually and with scikit-learn.
It also worked through a multiple-feature example, evaluated with mean squared error on a held-out test set.
Happy Learning! 🎉