# Machine Learning Project: 2. Model - Choosing a Machine Learning Model

As always, we need to solve real-world business problems. Machine learning is just a tool that we can use.

### 1. What is a Machine Learning Model?

A machine learning model is a function that maps inputs (features) to outputs (predictions) based on learned patterns from data. It consists of:
- **Mathematical representation**: Functions, equations, probabilities, or optimization techniques.
- **Training process**: Adjusting parameters to minimize error.
- **Evaluation and inference**: Making predictions on new data.

---

### Example: Linear Regression Model

A simple linear regression model follows this equation:

$$
y = wX + b
$$

where:
- $y$ is the predicted output
- $X$ is the input feature
- $w$ is the weight (coefficient)
- $b$ is the bias (intercept)
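
As a quick illustration, the equation can be evaluated directly with NumPy. The values `w = 3` and `b = 5` are assumptions chosen only for this sketch; a trained model would learn them from data.

```python
import numpy as np

# Assumed parameters for illustration only (not learned from data)
w = 3.0  # weight (slope)
b = 5.0  # bias (intercept)

# Three input values for a single feature
X = np.array([2.0, 5.0, 10.0])

# Apply y = wX + b element-wise
y = w * X + b
print(y)  # [11. 20. 35.]
```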


### Scratch-Built Implementation of Linear Regression Using Only NumPy

The following code implements linear regression from scratch using only NumPy, with no external ML libraries such as scikit-learn. The implementation includes:

- **Mathematical foundation**: Using the equation $y = wX + b$

- **Gradient Descent**: To learn $w$ and $b$

- **Training**: Adjusting weights based on loss

- **Prediction**: Making predictions on new data


In [2]:
import numpy as np

class LinearRegressionScratch:
    def __init__(self, learning_rate=0.01, epochs=1000):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # Initialize weights and bias
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Gradient Descent
        for _ in range(self.epochs):
            # Compute predictions
            y_pred = np.dot(X, self.weights) + self.bias
            
            # Compute gradients of the squared-error loss
            dw = (1 / n_samples) * np.dot(X.T, (y_pred - y))  # Gradient w.r.t. weights
            db = (1 / n_samples) * np.sum(y_pred - y)  # Gradient w.r.t. bias
            
            # Update weights and bias
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias

In [3]:
# Example usage:
if __name__ == "__main__":
    # Generate some random data
    np.random.seed(42)
    X = np.random.rand(100, 1) * 10  # 100 samples, 1 feature
    y = 3 * X.squeeze() + 5 + np.random.randn(100) * 2  # y = 3X + 5 + noise

    # Train the model
    model = LinearRegressionScratch(learning_rate=0.01, epochs=1000)
    model.fit(X, y)

    # Predict on new data
    X_test = np.array([[2], [5], [10]])  # Test samples
    predictions = model.predict(X_test)

    print("Predictions:", predictions)

Predictions: [11.03652962 19.90030725 34.67326997]


### How It Works

#### **Initialization:**
- We start with the weights ($w$) and bias ($b$) both initialized to zero, not to random values.

#### **Training (Gradient Descent):**
1. **Compute predictions:** $y_{\text{pred}} = wX + b$

2. **Compute gradients**: the partial derivatives of the squared-error loss with respect to $w$ and $b$.

3. **Update $w$ and $b$**: move each parameter a small step, scaled by the learning rate, against its gradient.
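
The gradients in step 2 can be written out explicitly. The code does not name its loss function; the expressions for `dw` and `db` in `fit` are consistent with a mean squared error scaled by $\tfrac{1}{2}$, which is assumed here. The symbols $L$ (loss), $n$ (number of samples), and $\alpha$ (learning rate) are notation introduced for this sketch:

$$
L(w, b) = \frac{1}{2n} \sum_{i=1}^{n} \left( y_{\text{pred}}^{(i)} - y^{(i)} \right)^2
$$

$$
\frac{\partial L}{\partial w} = \frac{1}{n} X^{\top} \left( y_{\text{pred}} - y \right), \qquad
\frac{\partial L}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{\text{pred}}^{(i)} - y^{(i)} \right)
$$

$$
w \leftarrow w - \alpha \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \alpha \frac{\partial L}{\partial b}
$$

Under this assumption, the factor $\tfrac{1}{2}$ cancels the 2 produced by differentiating the square, which is why no 2 appears in `dw` or `db`.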

#### **Prediction:**
- Once trained, the model predicts new values using: $y = wX + b$


### 2. What is an Estimator?

An estimator, in scikit-learn's terminology, is an object that learns from data. It typically implements the following methods:
- **fit(X, y)**: Learns from data
- **predict(X)**: Makes predictions
- **score(X, y)**: Evaluates performance

Most models in **scikit-learn** follow this interface. **TensorFlow/Keras** models offer a similar `fit`/`predict`/`evaluate` workflow, while **PyTorch** modules define only the forward computation and are usually trained with an explicit loop.
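
The three methods listed above can be sketched without any ML library. The class name `LeastSquaresEstimator` is hypothetical, and solving the fit in closed form with `np.linalg.lstsq` is a choice made for this sketch, not how any particular library implements it. The `score` method returns the coefficient of determination ($R^2$), which is one common choice of metric for regression.

```python
import numpy as np

class LeastSquaresEstimator:
    """Hypothetical estimator exposing fit, predict, and score."""

    def fit(self, X, y):
        # Append a column of ones so the intercept is learned with the weights
        X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
        # Solve the least-squares problem in closed form
        coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
        self.weights, self.bias = coef[:-1], coef[-1]
        return self  # returning self allows chained calls

    def predict(self, X):
        return X @ self.weights + self.bias

    def score(self, X, y):
        # Coefficient of determination (R^2): 1 is a perfect fit
        residual = np.sum((y - self.predict(X)) ** 2)
        total = np.sum((y - np.mean(y)) ** 2)
        return 1.0 - residual / total

# Noise-free data generated from y = 3x + 5
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.squeeze() + 5.0

est = LeastSquaresEstimator().fit(X, y)
print(est.predict(np.array([[2.0]])))
print(est.score(X, y))
```

Because the data here contains no noise, the fitted weight and bias should recover 3 and 5 up to floating-point error, and the score should be very close to 1.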

---

### Example: Logistic Regression (for classification)

A simple logistic regression model follows this equation:

$$
P(y=1 | X) = \frac{1}{1 + e^{-(wX + b)}}
$$

where:
- $P(y=1 | X)$ is the probability that $y = 1$ given $X$
- $X$ is the input feature
- $w$ is the weight (coefficient)
- $b$ is the bias (intercept)
- $e$ is Euler's number (≈2.718)
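
The equation can be evaluated in a few lines of NumPy. The parameter values and the 0.5 decision threshold below are assumptions for illustration; in practice $w$ and $b$ are learned from data and the threshold depends on the application.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Assumed parameters for illustration only (not learned from data)
w = 2.0
b = -4.0

X = np.array([0.0, 2.0, 4.0])

# P(y = 1 | X) for each input
probs = sigmoid(w * X + b)

# Threshold at 0.5 (an assumed cutoff) to turn probabilities into class labels
labels = (probs >= 0.5).astype(int)
print(probs)
print(labels)  # [0 1 1]
```

The middle input gives $wX + b = 0$, so its probability is exactly 0.5, which marks the decision boundary of this model.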


### 3. Neural Networks as ML Models

In deep learning, a neural network stacks multiple layers, each applying weights, a bias, and an activation function. A single layer computes:

$$
y = f(WX + B)
$$

where:
- $y$ is the output
- $W$ is the weight matrix
- $X$ is the input
- $B$ is the bias vector
- $f$ is an activation function (e.g., ReLU, Sigmoid)
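
A single layer of this form can be sketched in NumPy. The shapes and values of `W`, `X`, and `B` below are made up for illustration, and ReLU is used as the activation $f$.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values become zero
    return np.maximum(0.0, z)

# Assumed shapes and values for illustration: 3 inputs, 2 outputs
W = np.array([[1.0, -1.0, 0.5],
              [0.0, 2.0, -1.0]])   # weight matrix, shape (2, 3)
X = np.array([1.0, 2.0, 3.0])      # input vector, shape (3,)
B = np.array([0.5, -2.0])          # bias vector, shape (2,)

# y = f(WX + B)
y = relu(W @ X + B)
print(y)  # [1. 0.]
```

The second output is zero because its pre-activation value is negative, which ReLU clips.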

---

### Example: A Simple Neural Network

A simple neural network consists of an input layer, one or more hidden layers, and an output layer. Each layer has neurons (nodes) that apply weights, biases, and activation functions to the inputs to produce an output.

The general structure is as follows:
- **Input Layer**: Takes in the features (input data).
- **Hidden Layer(s)**: Processes the data using weights, biases, and activation functions.
- **Output Layer**: Produces the final prediction or classification.
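
The three parts can be sketched as a forward pass in NumPy. The layer sizes (4 features, 8 hidden units, 3 classes) are assumptions for illustration, and the weights are random and untrained, so the output probabilities carry no meaning. Samples are stored as rows here, so each layer is written as `X @ W + b`.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row-wise max for numerical stability
    exp = np.exp(z - z.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Assumed layer sizes: 4 input features, 8 hidden units, 3 output classes
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(X):
    hidden = relu(X @ W1 + b1)         # hidden layer
    return softmax(hidden @ W2 + b2)   # output layer: class probabilities

X = rng.normal(size=(5, 4))   # input layer: 5 samples, 4 features each
probs = forward(X)
print(probs.shape)  # (5, 3)
```

Each row of `probs` sums to 1, so it can be read as a distribution over the three assumed classes.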

In [4]:
# scikit-learn Estimator - Linear Regression
from sklearn.linear_model import LinearRegression
model_sklearn = LinearRegression()

In [None]:
# TensorFlow/Keras model - Sequential (trained with fit, used with predict)
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Create a Sequential model: 784 input features, two hidden layers, 10 output classes
model = Sequential([
    Input(shape=(784,)),
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

# Print the model summary
model.summary()

In [None]:
# PyTorch module - single linear layer (training requires an explicit loop)
import torch
import torch.nn as nn
model_pytorch = nn.Linear(1, 1)  # Single-layer linear model: 1 input feature, 1 output