# **Complete Machine Learning Notes**

# ✅ **AI vs ML vs DL vs DS**

---

## Artificial Intelligence (AI)

- AI = Machines that can **think, learn, and make decisions** like humans
- Goal: Build **smart systems** that can simulate intelligence

**Examples:**  
- Chatbots, Self-driving cars, Siri, Game bots

---

## Machine Learning (ML)

- ML = Subfield of AI where machines **learn from data**  
- Learns patterns from past data and **makes predictions**

---

## Deep Learning (DL)

- DL = Subset of ML using **neural networks** with many layers
- Loosely inspired by how neurons in the human brain are connected
- Best for large, complex data like **images, speech, text**

---

## Data Science (DS)

- DS = Field that uses **data + tools + math + storytelling**
- Combines AI, ML, stats, and domain knowledge to extract insights.

---

# ✅ **Types of Machine Learning (ML)**

---

## 1. Supervised Learning

- Uses **labeled data** (input + correct output)
- The model learns to **map inputs to outputs**

**Goal:** Predict outcomes on new, unseen data

**Examples:**
- Spam detection (Email → Spam or Not Spam)
- House price prediction (Features → Price)

**Algorithms:**

- **Linear Regression** – Predict continuous values  
- **Logistic Regression** – Binary classification (0 or 1)  
- **Decision Tree** – Tree-based prediction  
- **Random Forest** – Many decision trees (ensemble)  
- **Support Vector Machine (SVM)** – Find best separating boundary  
- **K-Nearest Neighbors (KNN)** – Based on closest points  
- **Naive Bayes** – Based on probability (Bayes theorem)  
- **Gradient Boosting** – Boosting weak models  
- **XGBoost / LightGBM / CatBoost** – Fast gradient boosting models  
- **Neural Networks** – For complex data (also used in DL)

---

## 2. Unsupervised Learning

- Uses **unlabeled data** (only input, no output)
- The model tries to **find patterns or structure**

**Goal:** Group or organize data automatically

**Examples:**
- Customer segmentation  
- Market basket analysis  
- Anomaly detection

**Algorithms:**

- **K-Means Clustering** – Group similar data points  
- **Hierarchical Clustering** – Tree-like cluster hierarchy  
- **DBSCAN** – Density-based clustering  
- **Principal Component Analysis (PCA)** – Dimensionality reduction  
- **t-SNE** – Visualization of high-dimensional data  
- **Autoencoders** – Neural network for feature learning  
- **Apriori Algorithm** – Association rule learning  
- **Gaussian Mixture Models (GMM)** – Probabilistic clustering  
- **Isolation Forest** – Anomaly detection  
- **Fuzzy C-Means** – Soft clustering (points can belong to multiple clusters)

---

## 3. Reinforcement Learning

- The model **learns by interacting** with an environment
- Learns from **reward or punishment** (trial and error)

**Goal:** Maximize cumulative reward

**Examples:**
- Game playing (like chess, Atari)  
- Robotics (walking, picking objects)  
- Self-driving cars

**Key Terms:**
- **Agent**: the learner  
- **Environment**: where it acts  
- **Reward**: feedback for action  
- **Policy**: strategy to choose actions

---



# ✅ **Linear Regression Algorithm**
**Definition**: Predicts a continuous output by fitting a straight line.  
**Formula**:  
$$
y = mx + c
$$

Where:

`y` = predicted output, `m` = slope, `c` = intercept (bias), `x` = input

**Example**: Predict house price from size.
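
As a minimal sketch of the line-fitting idea, the snippet below estimates `m` and `c` with the standard least-squares formulas. The function name `fit_line` and the size/price numbers are made up for illustration and are not part of the notes.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    c = mean_y - m * mean_x
    return m, c

# Hypothetical data: house size and price, generated from price = 0.2 * size
sizes = [500, 1000, 1500, 2000]
prices = [100, 200, 300, 400]

m, c = fit_line(sizes, prices)
predicted_price = m * 1200 + c   # prediction for a house of size 1200
```

Because the toy data lie exactly on a line, the fitted slope is about 0.2 and the intercept about 0.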

---

## Cost Function


In [4]:
from IPython.display import display, HTML

display(HTML("""
<div style="text-align: center;">
    <img src="Screenshots/1.png" style="width: 500px;"/>
</div>
"""))


---
## MSE vs MAE vs RMSE

In [6]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/MSE.png" style="width: 500px;"/><br>
        <p style="text-align: center;">MSE</p>
    </div>
    <div>
        <img src="Screenshots/MAE.png" style="width: 500px;"/><br>
        <p style="text-align: center;">MAE</p>
    </div>
    <div>
        <img src="Screenshots/RMSE.png" style="width: 300px;"/><br>
        <p style="text-align: center;">RMSE</p>
    </div>
</div>
"""))

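
The three error metrics can be sketched in a few lines, assuming the screenshots above show the standard definitions (mean of squared errors, mean of absolute errors, and the square root of MSE). The sample values are hypothetical.

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the same units as y."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical values; the last prediction is a large miss (an outlier error)
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 13.0]

mse_value = mse(y_true, y_pred)    # errors 1, 0, 1, 4 -> (1 + 0 + 1 + 16) / 4
mae_value = mae(y_true, y_pred)    # (1 + 0 + 1 + 4) / 4
rmse_value = rmse(y_true, y_pred)
```

The single large error dominates MSE and RMSE much more than MAE, which is why MAE is usually described as more robust to outliers.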

---
## Gradient Descent

In [8]:
from IPython.display import display, HTML
display(HTML("""
<div style="text-align: center;">
    <img src="Screenshots/2.png" style="width: 500px;"/>
</div>
"""))

---
## Convergence Algorithm

**Definition**: The point at which the cost function stops decreasing significantly, meaning the model has found a good solution (or minimum).

**How to know**:
- Plot cost vs iterations → the curve flattens.
- The change in cost between iterations drops below a threshold (e.g., 0.0001).


In [7]:
from IPython.display import display, HTML
display(HTML("""
<div style="text-align: center;">
    <img src="Screenshots/3.png" style="width: 600px;"/>
</div>
"""))

---
## Learning Rate (α)  
**Definition**: Controls how big the steps are during gradient descent updates.

- Too small → slow convergence. 
- Too large → may overshoot and never converge.
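
The pieces above (gradient descent, the convergence check, and the learning rate) can be combined in one rough sketch. It assumes the MSE cost for `y = m*x + c`; the function name, the default `alpha`, and the toy data are illustrative choices, not values from the notes. On this data `m` and `c` end up close to 2 and 1.

```python
def gradient_descent(xs, ys, alpha=0.05, threshold=1e-4, max_iters=10_000):
    """Fit y = m*x + c by gradient descent on the MSE cost."""
    m, c = 0.0, 0.0
    n = len(xs)
    previous_cost = float("inf")
    for iteration in range(max_iters):
        errors = [(m * x + c) - y for x, y in zip(xs, ys)]
        cost = sum(e * e for e in errors) / n
        # Convergence check: stop once the cost barely changes
        if abs(previous_cost - cost) < threshold:
            return m, c, iteration
        previous_cost = cost
        # Gradients of the MSE with respect to m and c
        grad_m = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_c = (2 / n) * sum(errors)
        # Step against the gradient, scaled by the learning rate alpha
        m -= alpha * grad_m
        c -= alpha * grad_c
    return m, c, max_iters

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]   # generated from y = 2x + 1
m, c, steps = gradient_descent(xs, ys)
```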
---

## Final Note for Linear Regression

In [12]:
from IPython.display import display, HTML
display(HTML("""
<div style="text-align: center;">
    <img src="Screenshots/4.png" style="width: 500px;"/>
</div>
"""))

---
#  ✅ **R-squared vs Adjusted R-squared**

---

### R² (Coefficient of Determination)

**Definition**: Measures how much of the variation in the dependent variable is explained by the model.

**Formula**:  
$$
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
$$

Where:  
- $SS_{res} = \sum (y_i - \hat{y}_i)^2$ → Residual Sum of Squares
- $SS_{tot} = \sum (y_i - \bar{y})^2$ → Total Sum of Squares

**Interpretation**:  
- R² = 0 → Explains 0% variance  
- R² = 1 → Explains 100% variance  
- R² = 0.85 → Explains 85% variance

---

### Adjusted R²

**Definition**: Adjusted version of R² that **penalizes adding features that do not improve the model**.

**Formula**:  
$$
R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}
$$

Where:  
- $n$: number of samples  
- $k$: number of features
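
A small worked sketch of both scores, using the standard definitions R² = 1 - SS_res / SS_tot and adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1). The data values are hypothetical.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n samples and k features (requires n > k + 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical targets and predictions
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r2_score(y_true, y_pred)              # SS_res = 0.10, SS_tot = 10
adj_one_feature = adjusted_r2(r2, n=5, k=1)
adj_two_features = adjusted_r2(r2, n=5, k=2)
```

With the same R², the adjusted score drops as `k` grows, which is how it discourages adding features that do not help.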

---


# ✅  **Bias, Variance, Overfitting, Underfitting**

---

## Bias
- Error due to **simplistic assumptions**
- Model is **too simple**, misses patterns
- Dartboard analogy: the shots land consistently off target
- Leads to **underfitting**
- Shows up as high error on the **training data** itself, so check the training error to diagnose bias

---

## Variance
- Error due to **too much sensitivity** to training data
- Model is **too complex**, learns noise
- Dartboard analogy: the shots are scattered inconsistently
- Leads to **overfitting**
- Shows up as low training error but much higher error on the **testing data**, so compare training and test error to diagnose variance


---

## Underfitting
- Model performs poorly on training **and** test data
- Caused by **high bias**

---

## Overfitting
- Model performs well on training but poorly on test data
- Caused by **high variance**

---

## Bias-Variance Trade-off

| Model Type     | Bias      | Variance   |
|----------------|-----------|------------|
|  Underfitting  | High      | Low        |
| Overfitting    | Low       | High       |
| Just Right     | Low       | Low        |

---

# ✅ **Confusion Matrix**

A confusion matrix shows how well the classification model is performing.

For **binary classification**:

|               | Predicted Positive | Predicted Negative |
|---------------|--------------------|--------------------|
| Actual Positive | True Positive (TP)  | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN)  |
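
A minimal sketch of how the four cells are counted from true and predicted labels. The function name and the label lists are hypothetical; class `1` is treated as the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1   # predicted positive, actually positive
        elif predicted == positive:
            fp += 1   # predicted positive, actually negative
        elif actual == positive:
            fn += 1   # predicted negative, actually positive
        else:
            tn += 1   # predicted negative, actually negative
    return tp, fp, fn, tn

# Hypothetical labels
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```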

---

# ✅ **Performance Metrics**


In [13]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/Accuracy.png" style="width: 600px;"/><br>
    </div>
    <div>
        <img src="Screenshots/Precision.png" style="width: 600px;"/><br>
    </div>
</div>
"""))


In [14]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/Recall.png" style="width: 600px;"/><br>
    </div>
    <div>
        <img src="Screenshots/F1 Score.png" style="width: 600px;"/><br>
    </div>
</div>
"""))

In [15]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/ROC.png" style="width: 600px;"/><br>
    </div>
    <div>
        <img src="Screenshots/AUC.png" style="width: 600px;"/><br>
    </div>
</div>
"""))

- **ROC Curve**: Plot of TPR (= Recall, y-axis) vs FPR (x-axis) across all classification thresholds
- **AUC**: Area under the ROC curve; a single-number summary of it (1.0 = perfect ranking, 0.5 = random guessing)
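
The threshold-based metrics can be sketched directly from confusion-matrix counts, assuming the screenshots above show the standard formulas for accuracy, precision, recall, and F1. The counts used here are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: 40 TP, 10 FP, 20 FN, 30 TN
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```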
---

# ✅ **Training, Validation and Testing Data**

## Training Data
- Used to **teach** the model.
- The model **learns patterns** from this data.

## Testing Data
- Used to **evaluate** the model **after training**.
- Helps check how well the model performs on **unseen data**.

## Validation Data
- Used during training to **tune hyperparameters** (like choosing the best model settings).
- Helps detect **overfitting** before the final test.
  
---

# ✅ **Cross-Validation (CV)**

- **Cross-validation** is a technique used to **evaluate the performance** of a machine learning model by splitting the data into multiple parts and testing the model on different subsets.
- It helps check whether the model **generalizes well** to unseen data, which makes overfitting easier to detect.

---

## Types of Cross-Validation

### 1. K-Fold Cross-Validation
- Split data into **K equal parts** (folds)
- Train on K-1 folds, test on the remaining fold
- Repeat K times and average the results
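
The fold rotation can be sketched with plain index lists. This is an illustrative helper (the name `k_fold_indices` is hypothetical), without shuffling, so it only shows how each fold takes one turn as the test set.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
first_train, first_test = folds[0]
```

In practice the model would be trained on `train_idx` and scored on `test_idx` for every fold, and the K scores averaged.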

### 2. Stratified K-Fold
- Like K-Fold but **preserves class distribution** in each fold (useful for classification)

### 3. Leave-One-Out (LOO)
- Each sample is used once as a test set, rest as training
- Uses almost all the data for training in every round, but is **computationally expensive** (one model fit per sample)

### 4. Hold-Out Validation
- Simple split into **train and test** sets (e.g., 80/20)
- Fast but may not be as reliable
  
### 5. Time Series Split
- Used for **time-dependent data**
- Ensures training data is always earlier than test data

---

## Summary Table

| Type               | Description                                | Use Case                  |
|--------------------|--------------------------------------------|---------------------------|
| K-Fold             | Split into K parts, rotate test fold       | General-purpose           |
| Stratified K-Fold  | K-Fold with class balance                  | Classification problems   |
| Leave-One-Out      | One sample per test set                    | Small datasets            |
| Hold-Out           | One-time train/test split                  | Quick evaluation          |
| Time Series Split  | Respects time order                        | Time series forecasting   |


---
# ✅ **Ridge, Lasso and ElasticNet Regression Algorithms**

### Why Linear Regression Fails:

1. **Overfitting**: 
   - Tries to fit every detail including noise when many features exist.

2. **Multicollinearity**: 
   - Highly correlated features cause **unstable** weights.

3. **No Feature Selection**: 
   - Keeps **all features**, even irrelevant ones.

---



## **Ridge Regression (L2 Regularization)**

### Formula:

$$
J(w) = \text{MSE} + \lambda \sum w_i^2
$$

Where:

- $w_i$ = weight (slope coefficient) of the i-th feature
- $\lambda$ = regularization strength

### Meaning:

- Adds a **penalty on the square of weights**.  
- **Shrinks** large weights but **keeps all features**.  
- Good for **reducing overfitting** when features are **correlated**.

### Use when:

- You want to **reduce model complexity** but not remove features.

---

## **Lasso Regression (L1 Regularization)**

### Formula:

$$
J(w) = \text{MSE} + \lambda \sum |w_i|
$$
                                   
### Meaning:

- Adds a **penalty on the absolute value of weights**.  
- **Can make some weights zero** ➝ **feature selection**.  
- Good for **sparse models** with **irrelevant features**.

### Use when:

- You want to **remove unnecessary features** from your model.
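
To see why the L2 penalty only shrinks a weight while the L1 penalty can set it exactly to zero, here is a sketch for the simplest case: one feature and no intercept. Under that assumption, minimizing `MSE + λ·w²` and `MSE + λ·|w|` has the closed forms used below (the second is the soft-thresholding rule). The function names and data are hypothetical.

```python
def ridge_weight(xs, ys, lam):
    """Minimizer of mean((y - w*x)^2) + lam * w^2 for a single feature."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * lam)          # shrinks toward 0, never exactly 0

def lasso_weight(xs, ys, lam):
    """Minimizer of mean((y - w*x)^2) + lam * |w| for a single feature."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    threshold = n * lam / 2
    # Soft-thresholding: weak relationships are cut to exactly zero
    if sxy > threshold:
        return (sxy - threshold) / sxx
    if sxy < -threshold:
        return (sxy + threshold) / sxx
    return 0.0

# Hypothetical weak feature: y = 0.1 * x
xs = [1, 2, 3, 4]
ys = [0.1, 0.2, 0.3, 0.4]

ridge_w = ridge_weight(xs, ys, lam=2.0)   # small but non-zero
lasso_w = lasso_weight(xs, ys, lam=2.0)   # exactly 0.0, feature dropped
```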

---

## **ElasticNet Regression (L1 + L2 Regularization)**

### Formula:

$$
J(w) = \text{MSE} + \lambda_1 \sum |w_i| + \lambda_2 \sum w_i^2
$$

### Where:
- $\lambda_1$: L1 (Lasso) penalty strength  
- $\lambda_2$: L2 (Ridge) penalty strength  
- Combines **both L1 and L2 penalties**

### Meaning:

- Mixes benefits of **Lasso (feature selection)** and **Ridge (shrinkage)**  
- Helps when:
  - You have **many features**
  - Some are **correlated**
  - You also want **feature selection**

### Use when:

- You want a **balanced model** that can both **select features** and **handle multicollinearity**
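
The ElasticNet cost formula can be evaluated directly. This sketch only computes `J(w)` for given weights (it does not train anything); the function name and the tiny dataset are hypothetical. Setting `lam1 = 0` gives the Ridge cost and `lam2 = 0` gives the Lasso cost.

```python
def elastic_net_cost(weights, bias, X, y, lam1, lam2):
    """J(w) = MSE + lam1 * sum(|w_i|) + lam2 * sum(w_i^2)."""
    n = len(X)
    predictions = [sum(w * x for w, x in zip(weights, row)) + bias for row in X]
    mse = sum((p - t) ** 2 for p, t in zip(predictions, y)) / n
    l1_penalty = sum(abs(w) for w in weights)
    l2_penalty = sum(w * w for w in weights)
    return mse + lam1 * l1_penalty + lam2 * l2_penalty

# Hypothetical data and weights
X = [[1, 0], [0, 1]]
y = [1, 2]
weights = [1, -2]

elastic_cost = elastic_net_cost(weights, 0.0, X, y, lam1=0.5, lam2=0.1)
ridge_cost = elastic_net_cost(weights, 0.0, X, y, lam1=0.0, lam2=0.1)
lasso_cost = elastic_net_cost(weights, 0.0, X, y, lam1=0.5, lam2=0.0)
```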


In [17]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/Ridge Regression.png" style="width: 500px;"/><br>
        <p style="text-align: center;">Ridge</p>
    </div>
    <div>
        <img src="Screenshots/Lasso Regression.png" style="width: 600px;"/><br>
        <p style="text-align: center;">Lasso</p>
    </div>
    <div>
        <img src="Screenshots/ElasticNet Regression.png" style="width: 600px;"/><br>
        <p style="text-align: center;">ElasticNet</p>
    </div>
</div>
"""))


---
# ✅ **Logistic Regression Algorithm**

##  Why We Use It?

- Used for classification, not regression (despite its name).
- Predicts probability of class (e.g., spam/not spam, 0/1, yes/no).
- Especially used for binary classification problems.
- Use **One-vs-Rest (OvR)** or **Softmax function** for multi-class classification.
---


In [18]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/LR1.png" style="width: 600px;"/><br>
    </div>
    
</div>
"""))


In [19]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/LR2.png" style="width: 600px;"/><br>
    </div>
    
</div>
"""))


In [20]:
from IPython.display import display, HTML

display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/LR3.png" style="width: 600px;"/><br>
    </div>
    
</div>
"""))


---
# ✅ **Convex vs Non-Convex Functions**

---

## Convex Function

- Looks like a **U-shape (bowl)**  
- Every local minimum is also a **global minimum** (no bad valleys to get stuck in)  
- Easy to optimize using gradient descent

**Example:**  

f(x) = x^2

---

## Non-Convex Function

- Has **many hills and valleys**  
- Has **multiple local minima**  
- Gradient descent may get **stuck**

**Example:**  

f(x) = sin(x)

---

## Notes

- Linear Regression, Logistic Regression have convex cost functions → easy to optimize.
- Neural Networks have non-convex loss functions → hard to optimize, but techniques like **SGD, momentum, and Adam** help.

---


# ✅ **Sigmoid Function**

In [21]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/SF.png" style="width: 500px;"/><br>   
    </div>
    
</div>
"""))

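
A short sketch of the sigmoid, assuming the screenshot above shows the standard form σ(z) = 1 / (1 + e^(-z)). The two-branch implementation is a common way to avoid overflow for large negative inputs; the 0.5 decision threshold in `predict_class` is the usual default, used here as an assumption.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    # For negative z, rewrite the formula so exp() never overflows
    exp_z = math.exp(z)
    return exp_z / (1 + exp_z)

def predict_class(z, threshold=0.5):
    """Turn the sigmoid probability into a 0/1 label."""
    return 1 if sigmoid(z) >= threshold else 0

p_zero = sigmoid(0)        # 0.5: the model is undecided
p_large = sigmoid(10)      # close to 1
p_small = sigmoid(-10)     # close to 0
```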
---
# ✅ **Naive Bayes Algorithm**

## What is Naive Bayes?

- A **classification algorithm** based on **Bayes' Theorem**
- Called "naive" because it **assumes all features are independent** of each other given the class
- Works well for small and **text data** (e.g., spam detection, sentiment analysis)
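
As a rough sketch of how the independence assumption is used: the score for each class is its prior multiplied by the per-feature likelihoods, and the scores are then normalized into posteriors. All probabilities below are invented for illustration only.

```python
def naive_bayes_posterior(priors, likelihoods, features):
    """P(class | features) under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature in features:
            # Independence assumption: multiply per-feature likelihoods
            score *= likelihoods[cls][feature]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}

# Hypothetical numbers for a spam filter
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "win": 0.4},    # P(word | spam)
    "ham": {"free": 0.1, "win": 0.05},    # P(word | ham)
}

posterior = naive_bayes_posterior(priors, likelihoods, ["free", "win"])
```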
---

## Types of Naive Bayes:

- Gaussian Naive Bayes: For continuous data
- Multinomial Naive Bayes: For word counts (e.g., in text)
- Bernoulli Naive Bayes: For binary features (yes/no or 0/1)
---

In [22]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/BT1.png" style="width: 600px;"/><br>   
    </div>
    
</div>
"""))

In [23]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/BT2.png" style="width: 600px;"/><br>   
    </div>
    
</div>
"""))

In [24]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/BT3.png" style="width: 600px;"/><br>   
    </div>
    
</div>
"""))

---
# ✅ **K-Nearest Neighbors(KNN)**

## What is KNN?

- KNN = **K-Nearest Neighbors**
- Works for **classification** and **regression**
- KNN is a **non-parametric** algorithm: it does not learn a model equation; it simply stores the training data.
- At prediction time, it uses a distance metric to find the closest stored examples, with no fitted parameters involved.

---

## Intuition:

- To predict a new point, look at the **K closest points** (neighbors) from the training data using distance metrics.
- Use **majority voting** (classification) or **average** (regression)
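
The intuition above can be sketched as a tiny classifier. It assumes Euclidean distance and a simple majority vote; the function name and the sample points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # No training step: distances are computed at prediction time
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training data: two well-separated groups
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

near_group_a = knn_predict(train_points, train_labels, (2, 2))
near_group_b = knn_predict(train_points, train_labels, (6, 5))
```

For regression, the vote would be replaced by the average of the neighbors' target values.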

---

## Advantages:

- Simple and easy to understand
- No training time (lazy learner)
- Works well on small datasets

---

## Disadvantages:

- Slow for large datasets (needs to compute distance to all points)
- Sensitive to outliers
- Needs feature normalization


In [25]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/KNN1.png" style="width: 500px;"/><br>   
    </div>
    
</div>
"""))

---
# ✅ **Distance Metrics**

In [26]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DM1.png" style="width: 400px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/DM2.png" style="width: 400px;"/><br>   
    </div>
    
</div>
"""))

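
A minimal sketch of two common distance metrics, Euclidean and Manhattan, assuming these are the ones shown in the screenshots above.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Grid distance: sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))

point_a, point_b = (0, 0), (3, 4)
d_euclidean = euclidean_distance(point_a, point_b)   # 3-4-5 triangle
d_manhattan = manhattan_distance(point_a, point_b)   # 3 + 4
```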
---
# ✅ **Support Vector Machine**

See the link below for SVM notes (SVC and SVR):

https://github.com/atharparvezce/Complete_ML_Notes/blob/main/Screenshots/SVR%20Algorithms.pdf

---
# ✅ **Kernels**

## What are Kernels?

Kernels are functions used in Machine Learning (especially in SVM) to:

> **Transform data into higher dimensions** to make it easier to classify using linear methods.

They help solve problems where the data is **not linearly separable** in its original space.

---

## Why Do We Need Kernels?

Some datasets can't be separated by a straight line (or hyperplane) in their current space.

**Kernels** allow us to implicitly map them to a **higher-dimensional space** where separation **is possible**, without computing that transformation directly.
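
One way to see the "without computing that transformation directly" part is the degree-2 polynomial kernel on 2-D points. The sketch below checks that `(x · y)²`, computed in the original space, equals the dot product of an explicit 3-D feature map. The kernel choice and function names are illustrative assumptions.

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2, computed in 2-D."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit 3-D mapping whose dot product reproduces poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, y)                         # (1*3 + 2*4)^2
explicit_value = dot(feature_map(x), feature_map(y))     # same number, more work
```

Both routes give the same value, but the kernel never builds the higher-dimensional vectors.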

---

In [27]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/Kernel1.png" style="width: 500px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/Kernel2.png" style="width: 500px;"/><br>   
    </div>
    
</div>
"""))

---
## Summary

- Kernels help convert **non-linear problems** into **linear ones** in higher dimensions.
- Used heavily in **SVM**, **Kernel PCA**, etc.
- **Kernel Trick** avoids explicitly computing the expensive high-dimensional mapping.
- Make ML models more powerful with less manual feature engineering.

---

# ✅ **Decision Tree**

For handwritten notes, see the link below:

https://github.com/atharparvezce/Complete_ML_Notes/blob/main/Screenshots/decision%20tree.pdf

In [28]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT1.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [29]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT2.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [30]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT3.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [31]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT4.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [32]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT5.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [33]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT6.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [34]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/DT7.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

---

# ✅ **Ensemble Techniques**

**Ensemble methods** combine predictions from multiple models to improve accuracy and robustness.  
> “A group of weak learners can come together to form a strong learner.”

---

## Types of Ensemble Techniques

### 1. Bagging (Bootstrap Aggregating)

- **Goal**: Reduce **variance** and avoid overfitting.
- **How it works**:
  - Train multiple models (usually of the same type) in **parallel**.
  - Each model is trained on a **random subset** of the data (with replacement).
  - Final prediction is made by **averaging** (regression) or **voting** (classification).
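
The two bagging steps (sampling with replacement, then aggregating) can be sketched as follows. To keep it short, each "model" is just the mean of its bootstrap sample, a stand-in for a real base learner such as a decision tree. All names and numbers are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items at random WITH replacement."""
    return [rng.choice(data) for _ in data]

def bagged_mean_prediction(targets, n_models=25, seed=0):
    """Average the outputs of n_models, each fit on its own bootstrap sample."""
    rng = random.Random(seed)
    model_outputs = []
    for _ in range(n_models):
        sample = bootstrap_sample(targets, rng)
        # Stand-in base learner: predicts the mean of its sample
        model_outputs.append(sum(sample) / len(sample))
    return sum(model_outputs) / len(model_outputs)   # averaging (regression)

def majority_vote(predictions):
    """Aggregate class predictions by voting (classification)."""
    return Counter(predictions).most_common(1)[0][0]

targets = [10.0, 12.0, 11.0, 13.0, 50.0]
regression_prediction = bagged_mean_prediction(targets)
class_prediction = majority_vote(["spam", "ham", "spam"])
```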

**Popular Bagging Algorithms:**

- Random Forest
- Bagged Decision Trees
- Extra Trees (Extremely Randomized Trees)

---

### 2. Boosting

- **Goal**: Reduce **bias** and improve model performance.
- **How it works**:
  - Models are trained **sequentially**.
  - Each new model focuses on **correcting the errors** of the previous one.
  - Final prediction is a **weighted combination** of all models.

**Popular Boosting Algorithms:**

- AdaBoost (Adaptive Boosting)
- Gradient Boosting Machines (GBM)
- XGBoost (Extreme Gradient Boosting)
- LightGBM
- CatBoost
---


# ✅ **Random Forest (RF)**

Random Forest is an ensemble learning method that builds many decision trees and combines their results to make better predictions.


- It uses Bagging (Bootstrap Aggregating).
- Each tree is trained on a random subset of the data.
- It also selects random features for splitting at each node.
- Final prediction is made by:
  - Majority vote (for classification)
  - Average (for regression)

---

## Why Use Random Forest?

- More accurate than a single decision tree.
- Handles non-linear relationships well; some implementations can also handle missing values.
- Reduces overfitting.

---

## Note:

| Concept           | Description                                                  |
|-------------------|--------------------------------------------------------------|
| Row Sampling      | Random rows selected with replacement for each tree          |
| Feature Sampling  | Random subset of features used at each split                 |
| OOB Samples       | Data not used to train a specific tree                       |
| OOB Score         | Accuracy on OOB samples, used to evaluate model performance  |

---


# ✅ **Important Automatic EDA Libraries**

---

## 1. Pandas Profiling (now published as `ydata-profiling`)
- Generates a **detailed report** from a DataFrame.
- Includes **statistics**, **missing values**, **correlations**, and **data types**.
- Output: Interactive HTML report.
- Best for: Quick, **in-depth overview** of your dataset.

---

## 2. AutoViz
- Automatically visualizes any dataset with **minimal code**.
- Handles **CSV files, DataFrames**, and **large datasets**.
- Creates **histograms, scatter plots, box plots**, etc.
- Best for: **Automated visual EDA** with minimal setup.

---

## 3. SweetViz
- Creates **beautiful, high-contrast visual reports**.
- Compares **training vs testing data**.
- Highlights **target relationships** and **data insights**.
- Best for: **Comparative analysis** and **presentation-ready visuals**.

---

## 4. D-Tale
- Combines **Pandas + Flask + React** to give a **GUI** for DataFrames.
- Lets you **interact, filter, sort, and visualize** data in your browser.
- Best for: **Interactive exploration** without writing code.

---

# ✅ **AdaBoost Algorithm**

AdaBoost builds a strong classifier by sequentially training weak classifiers (typically decision stumps—trees with one split), each focusing more on the instances misclassified by previous ones. It assigns weights to training samples and updates them iteratively to emphasize harder examples.
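
One round of the sample-weight update can be sketched as below. It assumes the classic discrete AdaBoost rule: the weak learner's say is `alpha = 0.5 * ln((1 - error) / error)`, misclassified samples are scaled up by `e^alpha`, correct ones down by `e^(-alpha)`, and the weights are renormalized. The function name and labels are hypothetical.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """Return (alpha, new_weights) after one weak learner's predictions.

    Assumes the weights sum to 1 and the weighted error is strictly
    between 0 and 1.
    """
    # Weighted error: total weight of the misclassified samples
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1 - error) / error)
    # Increase weights of mistakes, decrease weights of correct predictions
    updated = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Five samples with equal weights; the stump gets the last one wrong
weights = [0.2] * 5
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After this round the misclassified sample carries half of the total weight, so the next stump is pushed to get it right.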

In [35]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/AD1.png" style="width: 800px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/AD2.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [36]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/AD3.png" style="width: 800px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/AD5.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [37]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/AD6.png" style="width: 800px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/AD7.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [38]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/AD8.png" style="width: 800px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/AD9.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [39]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/AD10.png" style="width: 500px;"/><br>   
    </div>
</div>
"""))

---
# ✅ **Gradient Boosting Algorithm**

Gradient Boosting is a powerful machine learning technique used for both classification and regression tasks. It builds a strong model by combining many weak models, usually decision trees, in a sequential way.

The core idea is:

> "Train models one after another, each trying to fix the mistakes made by the previous ones."

Instead of re-weighting training samples like AdaBoost, each new model in Gradient Boosting is fit to the errors (residuals) of the current ensemble, which corresponds to taking a gradient descent step on the loss function.
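
A compact sketch of the residual-fitting loop for regression with squared error. It assumes 1-D inputs and uses single-split regression stumps as the weak models; the function names, the learning rate, and the data are illustrative choices rather than anything prescribed by the notes.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (least squared error)."""
    best = None
    distinct = sorted(set(xs))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit stumps to the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Hypothetical data with a jump between x = 3 and x = 4
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]

model = gradient_boost(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

On the training data, the boosted error is lower than the error of simply predicting the mean.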

In [40]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/GB1.png" style="width: 500px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/GB2.png" style="width: 500px;"/><br>   
    </div>
</div>
"""))

In [41]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/GB3.png" style="width: 500px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/GB4.png" style="width: 500px;"/><br>   
    </div>
</div>
"""))

In [42]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/GB5.png" style="width: 500px;"/><br>   
    </div>
    <div>
        <img src="Screenshots/GB6.png" style="width: 500px;"/><br>   
    </div>
</div>
"""))

In [43]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/GB7.png" style="width: 500px;"/><br>   
    </div>
</div>
"""))

---
# ✅ **K-Means Clustering (Unsupervised Learning)**

---

- K-Means is an **unsupervised algorithm** used for **clustering**.
- It groups data into **K clusters** based on similarity.
- Each cluster has a **centroid** (center point).
- Goal: **Minimize the distance** between points and their cluster center.
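
A minimal sketch of the usual assign-then-update loop (Lloyd's algorithm), assuming Euclidean distance and random initial centroids picked from the data. The function names, seed, and points are hypothetical.

```python
import math
import random

def k_means(points, k, n_iters=100, seed=0):
    """Return (centroids, clusters) after alternating assign/update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initial centroids from the data
    clusters = []
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:         # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters

def wcss(centroids, clusters):
    """Within-cluster sum of squared distances (inertia)."""
    return sum(math.dist(p, c) ** 2
               for c, cluster in zip(centroids, clusters) for p in cluster)

# Hypothetical data: two compact groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, k=2)
inertia = wcss(centroids, clusters)
```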

---

##  Algorithm Steps:

In [46]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/KMeans1.png" style="width: 500px;"/><br>   
    </div>   
</div>
"""))

In [49]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/KMeans2.png" style="width: 500px;"/><br>   
    </div>   
</div>
"""))

In [51]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/KMeans3.png" style="width: 500px;"/><br>   
    </div>   
</div>
"""))

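- **Note:** WCSS (Within-Cluster Sum of Squares) and inertia are the same quantity: the sum of squared distances from each point to its cluster centroid.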
---

# ✅ **Elbow Method**

---

## What is it?

- A method to find the **best number of clusters (K)** in K-Means.
- It uses the **WCSS (inertia)** value to decide.

---

## Key Idea:

- As **K increases**, WCSS decreases.
- But after a point, adding more clusters doesn't help much.
- The point where the curve **bends** (like an elbow) is the best **K**.

---

## Steps:

1. Run K-Means for K = 1 to 10
2. Calculate **WCSS (inertia)** for each K
3. Plot K vs WCSS
4. The "elbow point" = optimal K
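
The steps above can be sketched without a plotting library by printing or inspecting the WCSS per K. This uses a small 1-D K-Means with a deterministic, evenly spaced initialization (an assumption made to keep the example reproducible) and only K = 1 to 5, because the hypothetical dataset has just nine points.

```python
def k_means_wcss_1d(values, k, n_iters=100):
    """Run a simple 1-D K-Means and return the final WCSS (inertia)."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: centroids spread evenly through the sorted data
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(n_iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum((v - c) ** 2
               for c, cluster in zip(centroids, clusters) for v in cluster)

# Hypothetical data with three obvious groups
values = [1, 2, 3, 11, 12, 13, 21, 22, 23]

wcss_by_k = {k: k_means_wcss_1d(values, k) for k in range(1, 6)}
# Drop in WCSS gained by going from K-1 to K clusters
improvements = {k: wcss_by_k[k - 1] - wcss_by_k[k] for k in range(2, 6)}
```

Here the WCSS falls sharply up to K = 3 and barely changes afterwards, so the elbow sits at K = 3, matching the three groups in the data.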

---


# ✅ **Hierarchical Clustering**

---
- A clustering algorithm that builds a **tree-like structure** (dendrogram).
- You **don’t need to predefine** the number of clusters.
- Two types:
  - **Agglomerative** (bottom-up)
  - **Divisive** (top-down)

---

## How Agglomerative Clustering Works:

1. Start with each point as its own cluster.
2. Calculate the **distance** between all clusters.
3. **Merge** the two closest clusters.
4. Repeat until all points are in one cluster.
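
The four steps can be sketched with a brute-force loop. This version assumes Euclidean distance and **single linkage** (cluster distance = distance between the closest pair of members), which is only one of several linkage rules; `n_clusters` lets the loop stop early, similar to cutting the dendrogram. Names and points are hypothetical.

```python
import math

def agglomerative(points, n_clusters=1):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]       # step 1: every point is a cluster
    merges = []                            # merge history (dendrogram data)
    while len(clusters) > n_clusters:
        best = None
        # Step 2: distance between every pair of clusters (single linkage)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        # Step 3: merge the closest pair and record the merge height
        merges.append((list(clusters[i]), list(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical points: two tight pairs and one isolated point
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]

three_clusters, _ = agglomerative(points, n_clusters=3)
one_cluster, merge_history = agglomerative(points, n_clusters=1)
```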

---

In [53]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/HC1.png" style="width: 500px;"/><br>   
    </div>   
</div>
"""))

---
## Distance Metrics:
- Euclidean Distance
- Manhattan Distance
  
---

## Output: Dendrogram

- A **tree diagram** showing how clusters were merged.
- Cut at a specific height to get desired number of clusters.

---

## Note:

> Hierarchical Clustering builds a **nested structure** of clusters based on **distance** and **linkage rules**, shown as a dendrogram.

---

# ✅ **DBSCAN Clustering**

---

- An **unsupervised** clustering algorithm based on **density**.
- Groups points that are close together and labels low-density points as **noise**.
- Works well for **irregular-shaped clusters** and **detecting outliers**.

---

## Key Terms:

- **eps (ε)**: Radius around a point to check neighbors.
- **minPts**: Minimum number of points required to form a dense area.
- **Core Point**: Has at least `minPts` points in its `eps` neighborhood.
- **Border Point**: Has fewer than `minPts` points in its `eps` neighborhood, but lies within `eps` of a core point.
- **Noise**: Not a core or border point.

---

## Distance Metric:

- Euclidean Distance

## How DBSCAN Works:

1. Pick a random unvisited point.
2. If it has **minPts or more** points within `eps`, it becomes a **core point** and a new cluster starts.
3. Expand the cluster by finding all **density-reachable** points.
4. If a point has too few neighbors, label it as **noise**.
5. Repeat for all points.
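
A compact sketch of these steps, assuming Euclidean distance and counting a point as part of its own neighborhood. For reproducibility it visits points in list order rather than randomly. The function name, label convention (`-1` for noise), and data are hypothetical.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)          # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE              # may later become a border point
            continue
        cluster_id += 1                    # i is a core point: new cluster
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id     # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Hypothetical data: two dense squares and one far-away point
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (5, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)
```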

---

## Note:

> DBSCAN = Density-Based Spatial Clustering of Applications with Noise  
> It groups dense regions together and marks sparse points as noise.

---


# ✅ **Silhouette Score (Clustering Evaluation Metric)**

- A metric to measure the quality of clustering.
- It tells how well each point fits within its own cluster vs. how far it is from other clusters.
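
A sketch of the score, assuming the standard definition `s = (b - a) / max(a, b)`, where `a` is a point's mean distance to the other points in its own cluster and `b` is its lowest mean distance to the points of any other cluster. It needs at least two clusters, and singleton clusters are scored 0 by convention. The data are hypothetical.

```python
import math

def silhouette_score(clusters):
    """Mean silhouette over all points; `clusters` is a list of point lists."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for pi, point in enumerate(cluster):
            if len(cluster) == 1:
                scores.append(0.0)         # convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(point, q)
                    for qi, q in enumerate(cluster) if qi != pi) / (len(cluster) - 1)
            # b: mean distance to the nearest other cluster
            b = min(sum(math.dist(point, q) for q in other) / len(other)
                    for oi, other in enumerate(clusters) if oi != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# The same four 1-D points, grouped well and grouped badly
good_clustering = [[(0,), (1,)], [(10,), (11,)]]
bad_clustering = [[(0,), (10,)], [(1,), (11,)]]

good_score = silhouette_score(good_clustering)   # close to +1
bad_score = silhouette_score(bad_clustering)     # negative
```

Scores near +1 mean points sit well inside their own cluster, values near 0 mean overlapping clusters, and negative values suggest points assigned to the wrong cluster.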

In [54]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/SC1.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [56]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/SC2.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

In [57]:
from IPython.display import display, HTML
display(HTML("""
<div style="display: flex; justify-content: center; gap: 20px;">
    <div>
        <img src="Screenshots/SC3.png" style="width: 800px;"/><br>   
    </div>
</div>
"""))

---
## Note:

Silhouette Score tells how well a point fits in its own cluster vs. other clusters.

---

# ✅ **Please visit here for Complete ML by Krish Naik**

https://github.com/krishnaik06/Machine-Learning-Algorithms-Materials/tree/main