# Contents
- What is Machine Learning?
- How Does Machine Learning Work?
- Types of Machine Learning
- Key Concepts of Machine Learning
- Machine Learning Algorithm
    - Supervised Learning Algorithm
    - Unsupervised Learning Algorithm
    - Reinforcement Learning Algorithm
    - Deep Learning Algorithm
- Machine Learning Workflow
- Deep Learning
- Applications of Machine Learning
- Visualization technique according to algorithm
- Evaluation Metrics according to algorithm
- Confusion Matrix
- Algorithm implementation steps

# What is Machine Learning?

Machine Learning (ML) is a branch of artificial intelligence (AI) that focuses on developing systems that can learn from and make decisions based on data. It is built on the idea that systems can automatically learn patterns and insights from data without being explicitly programmed.

# How Does Machine Learning Work?
Machine learning relies on data and algorithms:

- __Data__: Data is the backbone of ML. It can be numerical data, text, images, audio, etc. In general, the more representative, good-quality data a model has, the better it can learn the underlying patterns.
    - __Continuous Data__: Continuous data can take any value within a given range. It can include fractions and decimals, allowing for infinitely many possible values.
        - measurable, not countable
    - __Discrete Data__: Discrete data can only take specific values, typically whole numbers.
        - countable, not measurable

    For linear regression, the output variable is continuous (the inputs are usually continuous too, although discrete inputs can be used once they are encoded as numbers). For logistic regression, the input variables can be discrete or continuous, but the output variable is always discrete.
- __Algorithms__: These are the mathematical or statistical methods that process the data, find patterns, and make predictions. Different types of algorithms are used depending on the problem and the nature of the data.
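
To make the continuous/discrete distinction concrete, here is a minimal sketch. The weights and biases are made-up values chosen for illustration (they are not learned from data), and `linear_predict` and `logistic_predict` are hypothetical helper names.

```python
import math

def linear_predict(x, weight=2.0, bias=1.0):
    """Linear-regression-style prediction: the output can be any real number."""
    return weight * x + bias

def logistic_predict(x, weight=2.0, bias=-1.0, threshold=0.5):
    """Logistic-regression-style prediction: a probability in (0, 1),
    then mapped to a discrete class label (0 or 1)."""
    probability = 1.0 / (1.0 + math.exp(-(weight * x + bias)))
    return int(probability >= threshold)

inputs = (0.0, 0.25, 1.5)
continuous_outputs = [linear_predict(x) for x in inputs]
discrete_outputs = [logistic_predict(x) for x in inputs]

print(continuous_outputs)  # [1.0, 1.5, 4.0]
print(discrete_outputs)    # [0, 0, 1]
```

The first list can contain any decimal value, while the second only ever contains the class labels 0 and 1.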

# Types of Machine Learning
Machine learning is generally classified into three main categories:
1. __Supervised Learning__: The algorithm is trained on labeled data, meaning the training `data has input-output pairs`. The model learns the relationship between inputs (features) and outputs (labels).

2. __Unsupervised Learning__: The algorithm is given data without labeled responses. It tries to `learn the underlying structure from the data`, often by grouping similar data points or discovering patterns.

3. __Reinforcement Learning__: An agent learns to `interact with an environment` by performing actions and `receiving rewards or penalties`. The goal is to learn a policy that maximizes cumulative reward over time.

# Key Concepts of Machine Learning
- __Features__: Input variables used to make predictions. 
- __Labels__: Output variables that the model tries to predict.
- __Training Data__: A dataset used to train the model.
- __Testing Data__: A separate dataset used to evaluate the model's performance after training. It checks how well the model generalizes to new, unseen data.
- __Validation Data__: A separate dataset used during development to tune hyperparameters and compare models, which helps detect and limit overfitting.
- __Model Evaluation Metrics:__
    Common metrics to evaluate model performance include:
    - __*Accuracy*__: The percentage of correct predictions (used for classification).
    - __*Precision and Recall*__: Precision measures the proportion of positive identifications that are actually correct, while recall measures the proportion of actual positives that are correctly identified.
    - __*Mean Squared Error (MSE)*__: A metric for regression tasks that measures the average squared difference between the predicted and actual values.
- __Overfitting__: When a model learns the training data too well, including its noise and outliers, leading to poor performance on new data. It’s like memorizing rather than learning.
- __Underfitting__: When a model is too simple to capture the underlying patterns in the data, leading to poor performance on both training and testing data.
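
As a rough illustration of the training/validation/testing split and of MSE, here is a small sketch in plain Python. The helper names, the 60/20/20 proportions, and the toy data are assumptions made for this example only.

```python
import random

def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the rows, then cut them into train, validation, and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    n_val = int(len(rows) * val_fraction)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy dataset: feature x with label y = 2x.
rows = [(x, 2 * x) for x in range(10)]
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))  # 6 2 2

# Errors are 1 and -2, so MSE = (1 + 4) / 2 = 2.5
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
print(mse)  # 2.5
```

A model that scores well on `train` but poorly on `val` or `test` is a typical sign of overfitting; poor scores on all three suggest underfitting.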

# Machine Learning Algorithm
Machine learning algorithms are the mathematical models that process data, learn from patterns, and make predictions or decisions. They are categorized based on the learning task (supervised, unsupervised, or reinforcement learning) and the type of problem they solve (regression, classification, clustering, etc.).

## 1. Supervised Learning Algorithm
### Regression Algorithms
- __Linear Regression__ Predict house prices, stock prices.
- Multiple Linear Regression
- __Polynomial Regression__ The relationship between input and output is not linear.
### Classification Algorithms
- __Logistic Regression__ Binary classification, like spam detection.
- __Naive Bayes__ Multi-class classification, like text classification (spam filtering, sentiment analysis).
### Both Regression & Classification
- __Support Vector Machines (SVM/SVR)__ Text classification, image recognition.
- __K-Nearest Neighbors (KNN)__ Image recognition, recommendation systems.
- __Decision Tree__ Customer segmentation, credit risk analysis.
- __Random Forest__ Fraud detection, stock market prediction.
## 2. Unsupervised Learning Algorithm
### 2.1 Clustering Algorithms
- __K-Means__ - Customer segmentation, document clustering.
- __Hierarchical Clustering__ Gene expression analysis, market segmentation.
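
To show what a clustering algorithm does with unlabeled data, here is a minimal K-Means sketch using NumPy. It is a simplified version (random initial centroids taken from the data, a fixed number of iterations); the function names and the toy points are assumptions for this example, not a reference implementation.

```python
import numpy as np

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def k_means(points, k, n_iterations=10, seed=0):
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points picked at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iterations):
        labels = assign_to_nearest(points, centroids)
        for cluster in range(k):
            members = points[labels == cluster]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[cluster] = members.mean(axis=0)
    return assign_to_nearest(points, centroids), centroids

# Two well-separated groups of points, with no labels attached.
points = [[0, 0], [1, 0], [2, 1], [20, 20], [21, 20], [22, 21]]
labels, centroids = k_means(points, k=2)
print(labels)      # first three points share one label, last three the other
print(centroids)   # one centroid per group
```

Which group receives label 0 and which receives label 1 depends on the random initialization; only the grouping itself is meaningful.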
### 2.2 Dimensionality Reduction Algorithms
- __Principal Component Analysis (PCA)__ - Feature reduction for large datasets, image compression.
- Linear Discriminant Analysis (LDA) (note: LDA uses class labels, so strictly speaking it is a supervised technique)
## 3. Reinforcement Learning Algorithm
- Upper Confidence Bound (UCB) Algorithm
- Thompson Sampling
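
As an illustration of the reward-driven loop, here is a minimal sketch of an Upper Confidence Bound strategy on a simulated multi-armed bandit. It assumes the common UCB1 variant with 0/1 rewards; the function name `ucb1`, the reward probabilities, and the number of rounds are made up for this example.

```python
import math
import random

def ucb1(reward_probabilities, n_rounds=2000, seed=0):
    """Play a simulated bandit with UCB1 and return how often each arm was pulled."""
    rng = random.Random(seed)
    n_arms = len(reward_probabilities)
    pulls = [0] * n_arms
    total_reward = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once before using the formula
        else:
            # Average reward so far plus an exploration bonus that shrinks
            # as an arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda a: total_reward[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        # Simulated environment: reward 1 with the arm's hidden probability.
        reward = 1.0 if rng.random() < reward_probabilities[arm] else 0.0
        pulls[arm] += 1
        total_reward[arm] += reward
    return pulls

pulls = ucb1([0.1, 0.5, 0.9])
print(pulls)  # the arm with probability 0.9 should receive most of the pulls
```

The agent never sees the true probabilities; it only observes rewards, which is what distinguishes this setting from supervised learning.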
## 4. Deep Learning Algorithm
- __Artificial Neural Networks (ANNs)__ Predicting stock prices, detecting fraud.
- __Convolutional Neural Networks (CNNs)__ Image recognition, object detection.
- __Recurrent Neural Networks (RNNs)__ Time series forecasting, language modeling.

# Machine Learning Workflow
The general process of creating an ML model includes:

1. __Data Collection__: Gathering relevant data.
2. __Data Preprocessing__: Cleaning, transforming, and organizing data (handling missing values, scaling features).
3. __Feature Engineering__: Selecting or creating important features that have the most impact on predictions.
4. __Model Selection__: Choosing the right algorithm based on the problem type (regression, classification, clustering, etc.).
5. __Training the Model__: Feeding training data to the model to learn patterns.
6. __Model Evaluation__: Testing the model on unseen data and measuring its performance using evaluation metrics.
7. __Hyperparameter Tuning__: Adjusting the parameters that control the learning process to improve model performance.
8. __Deployment__: Integrating the model into a real-world system or application for making predictions.
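
The workflow above can be sketched with scikit-learn roughly as follows. This is one possible arrangement, not the only one: synthetic data stands in for steps 1 to 3, logistic regression is an arbitrary model choice, and the hyperparameter grid is a made-up example. Tuning is done with cross-validation on the training data, and the held-out test set is used only once at the end.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-3: synthetic data stands in for collected, cleaned, engineered features.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Steps 2 and 4: preprocessing (scaling) and the chosen model in one pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Steps 5 and 7: train while tuning the regularization strength C by cross-validation.
search = GridSearchCV(pipeline, param_grid={"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Step 6: evaluate the tuned model on data it has never seen.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print(search.best_params_, test_accuracy)
```

Step 8 (deployment) is outside the scope of this sketch; in practice the fitted `search.best_estimator_` is what would be saved and served.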

# Deep Learning
Deep learning is a subset of machine learning that uses neural networks with many layers (deep neural networks). It's especially powerful for handling large datasets and complex data like images, audio, and text.

- __Convolutional Neural Networks (CNNs)__: Best for image data.
- __Recurrent Neural Networks (RNNs)__: Useful for sequential data like time series or natural language processing (NLP).
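
To show what "many layers" means in practice, here is a minimal forward pass through a small fully connected network in NumPy. The layer sizes, the ReLU and sigmoid activations, and the random (untrained) weights are all assumptions for illustration; no learning happens in this sketch.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Pass a batch through the hidden layers (ReLU), then a sigmoid output layer."""
    activation = x
    for weights, bias in layers[:-1]:
        activation = relu(activation @ weights + bias)
    weights, bias = layers[-1]
    return sigmoid(activation @ weights + bias)

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 8, 1]  # 4 inputs, two hidden layers of 8 units, 1 output
layers = [
    (rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

batch = rng.normal(size=(5, 4))  # 5 examples with 4 features each
probabilities = forward(batch, layers)
print(probabilities.shape)  # (5, 1)
```

Training would adjust the weights in `layers` to reduce a loss; CNNs and RNNs replace the plain matrix multiplications with convolutional or recurrent layers.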

# Applications of Machine Learning
- __Computer Vision__: Image recognition, facial recognition, self-driving cars.
- __Natural Language Processing (NLP)__: Chatbots, sentiment analysis, language translation.
- __Recommendation Systems__: Suggesting products or content based on user behavior (e.g., Netflix, Amazon).
- __Healthcare__: Predicting disease progression, drug discovery, personalized medicine.
- __Finance__: Fraud detection, stock market predictions, algorithmic trading.

# Visualization technique according to algorithm
Here's a list of common machine learning algorithms along with the corresponding plots or graphs that can be used to represent their outputs or performance. Each algorithm typically has a specific type of visualization that is most informative.

| **Algorithm**                  | **Corresponding Plot/Graph**                        | **Description**                                                  |
|--------------------------------|-----------------------------------------------------|------------------------------------------------------------------|
| **Linear Regression**          | Scatter Plot with Regression Line                   | Shows the relationship between independent and dependent variables. |
| **Logistic Regression**        | ROC Curve                                          | Visualizes the trade-off between true positive rate and false positive rate. |
| **Decision Tree**              | Decision Tree Diagram                               | Illustrates the splits made at each node based on feature values. |
| **Random Forest**              | Feature Importance Bar Plot                         | Shows the importance scores of features in the model.             |
| **Support Vector Machine (SVM)** | Decision Boundary Plot                              | Visualizes the hyperplane that separates classes in the feature space. |
| **K-Nearest Neighbors (KNN)**  | Scatter Plot with KNN Decision Boundary            | Displays the decision boundaries and data points classified by KNN. |
| **K-Means Clustering**         | Scatter Plot with Cluster Centroids                | Visualizes the clusters formed, often with centroids marked.     |
| **Principal Component Analysis (PCA)** | 2D/3D Scatter Plot of Principal Components | Shows the reduced-dimensional representation of data.             |
| **Naive Bayes**                | Confusion Matrix                                   | Visualizes the performance of the classifier on test data.        |
| **Neural Networks**            | Loss Curve                                          | Plots training and validation loss over epochs during training.   |
| **Gradient Boosting**          | Learning Curve                                      | Shows the model's performance as the number of boosting iterations increases. |
| **XGBoost**                    | Feature Importance Plot                             | Displays the importance of features in the model's predictions.   |
| **Time Series Analysis**       | Time Series Plot                                   | Displays the data points over time to visualize trends or seasonality. |
| **Association Rule Learning**  | Network Graph                                      | Visualizes relationships between items or features in the dataset. |
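
As an example of the numbers behind one of these plots, the sketch below computes the points of a ROC curve (false positive rate against true positive rate) by sweeping a threshold over predicted scores. The labels and scores are made-up toy values, and `roc_points` and `area_under_curve` are hypothetical helper names.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
auc = area_under_curve(points)
print(points[-1])     # (1.0, 1.0)
print(round(auc, 3))  # 0.889
```

Plotting the x and y values of `points` with any plotting library gives the ROC curve; the closer the curve hugs the top-left corner, the better the classifier separates the classes.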


# Evaluation Metrics according to algorithm
Evaluation metrics in machine learning are quantitative measures used to assess the performance of a model on a given task, helping determine how well it makes predictions based on the data it has been trained on.

Here is a list of common machine learning algorithms along with their corresponding evaluation metrics, depending on whether they are used for regression, classification, or both:

| **Algorithm**                    | **Evaluation Metrics**                                        |
|----------------------------------|---------------------------------------------------------------|
| **Linear Regression**            | MAE, MSE, RMSE, R²                                            |
| **Logistic Regression**          | Accuracy, Precision, Recall, F1-Score, ROC-AUC                |
| **KNN**                          | MAE, MSE, RMSE, R² (Regression), Accuracy, Precision, Recall (Classification) |
| **Decision Trees**               | MAE, MSE, RMSE, R² (Regression), Accuracy, Confusion Matrix (Classification) |
| **Random Forest**                | MAE, MSE, RMSE, R² (Regression), Accuracy, Precision, Recall (Classification), plus Feature Importance for interpretation |
| **SVM**                          | MAE, MSE, RMSE, R² (Regression), Accuracy, Precision, Recall (Classification) |
| **Gradient Boosting (XGBoost)**  | MAE, MSE, RMSE, R² (Regression), Accuracy, ROC-AUC (Classification), plus Feature Importance for interpretation |
| **Naive Bayes**                  | Accuracy, Precision, Recall, F1-Score, ROC-AUC                |
| **Neural Networks**              | MAE, MSE, RMSE, R² (Regression), Accuracy, Log Loss (Classification) |
| **K-Means**                      | Silhouette Score, Inertia, Davies-Bouldin Index               |
| **PCA**                          | Explained Variance Ratio, Scree Plot                          |
| **Hierarchical Clustering**      | Dendrogram, Silhouette Score, Davies-Bouldin Index            |


## Confusion Matrix
A confusion matrix is a table that is used to evaluate the performance of a classification algorithm. It helps in understanding the true performance by showing where the model is getting confused when making predictions.

### Structure of the Confusion Matrix

For a binary classification problem, the confusion matrix is a 2x2 table with four outcomes:

|                        | **Predicted Positive** | **Predicted Negative** |
|------------------------|------------------------|------------------------|
| **Actual Positive**    | True Positive (TP)     | False Negative (FN)    |
| **Actual Negative**    | False Positive (FP)    | True Negative (TN)     |

#### Definitions
- **True Positive (TP)**: The model predicted "Positive" and it was actually "Positive."
- **False Negative (FN)**: The model predicted "Negative" but it was actually "Positive."
- **False Positive (FP)**: The model predicted "Positive" but it was actually "Negative."
- **True Negative (TN)**: The model predicted "Negative" and it was actually "Negative."

### Example

Suppose we have a binary classification problem where we're trying to predict whether an email is **spam** or **not spam**. 
- Positive (P): Spam emails
- Negative (N): Not spam emails

Let's say after running a classification algorithm on a test set of 100 emails, we get the following confusion matrix:

|                        | **Predicted Spam**     | **Predicted Not Spam** |
|------------------------|------------------------|------------------------|
| **Actual Spam**        | 40 (TP)                | 10 (FN)                |
| **Actual Not Spam**    | 5 (FP)                 | 45 (TN)                |

In this case:
- **True Positive (TP)** = 40: The model correctly predicted 40 spam emails as spam.
- **False Negative (FN)** = 10: The model predicted 10 emails as not spam, but they were actually spam.
- **False Positive (FP)** = 5: The model predicted 5 emails as spam, but they were actually not spam.
- **True Negative (TN)** = 45: The model correctly predicted 45 emails as not spam.

### Key Metrics from the Confusion Matrix

1. **Accuracy**: The proportion of correct predictions (both true positives and true negatives) out of the total predictions.
   $$
   \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{40 + 45}{40 + 45 + 5 + 10} = \frac{85}{100} = 0.85 \text{ (or 85\%)}
   $$
   So, the model is correct 85% of the time.

2. **Precision**: The proportion of predicted positives (spam) that are actually positive (spam).
   $$
   \text{Precision} = \frac{TP}{TP + FP} = \frac{40}{40 + 5} = \frac{40}{45} \approx 0.89 \text{ (or 89\%)}
   $$
   So, when the model predicts spam, it’s correct 89% of the time.

3. **Recall (Sensitivity or True Positive Rate)**: The proportion of actual positives (spam) that are correctly identified.
   $$
   \text{Recall} = \frac{TP}{TP + FN} = \frac{40}{40 + 10} = \frac{40}{50} = 0.80 \text{ (or 80\%)}
   $$
   So, the model correctly identifies 80% of the actual spam emails.

4. **F1 Score**: A balance between Precision and Recall, especially useful when you have imbalanced datasets.
   $$
   F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.89 \times 0.80}{0.89 + 0.80} = 2 \times \frac{0.712}{1.69} \approx 0.84
   $$
   The F1 score here is about 0.84 (or 84%).

5. **False Positive Rate (FPR)**: The proportion of actual negatives (not spam) that are incorrectly classified as positive (spam).
   $$
   \text{FPR} = \frac{FP}{FP + TN} = \frac{5}{5 + 45} = \frac{5}{50} = 0.10 \text{ (or 10\%)}
   $$
   So, 10% of non-spam emails were incorrectly classified as spam.
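
The five formulas above can be checked with a few lines of Python using the counts from the spam example. The function name `metrics_from_counts` is a hypothetical helper introduced for this sketch.

```python
def metrics_from_counts(tp, fn, fp, tn):
    """Compute the confusion-matrix metrics from the four raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
    }

# Counts from the spam example: 40 TP, 10 FN, 5 FP, 45 TN.
spam_metrics = metrics_from_counts(tp=40, fn=10, fp=5, tn=45)
for name, value in spam_metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.85
# precision: 0.89
# recall: 0.80
# f1: 0.84
# fpr: 0.10
```

Working from unrounded precision and recall gives an F1 of about 0.842, which agrees with the rounded hand calculation.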

A confusion matrix is extremely useful for understanding the details of how a classification model is performing, especially in cases where accuracy alone might be misleading. By analyzing Precision, Recall, F1 Score, and other metrics, we can understand how well the model balances its predictions and where it goes wrong.


```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Actual labels (1: Spam, 0: Not Spam)
y_true = np.array([1]*50 + [0]*50)  # 50 spam emails, 50 not spam
# Predicted labels, arranged to reproduce the example above
y_pred = np.array([1]*40 + [0]*10 + [1]*5 + [0]*45)  # 40 TP, 10 FN, 5 FP, 45 TN

# labels=[1, 0] lists Spam first so rows and columns match the table above
conf_matrix = confusion_matrix(y_true, y_pred, labels=[1, 0])

plt.figure(figsize=(6, 4))
sns.heatmap(
    conf_matrix,
    annot=True,
    fmt='d',
    cmap='Blues',
    yticklabels=['Spam', 'Not Spam'],
    xticklabels=['Spam', 'Not Spam'],
)
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.title('Confusion Matrix')
plt.show()
```

# Algorithm implementation steps

1. Import the Required Libraries
2. Load the Dataset
3. Extract Features
4. Preprocess Data
5. Split the Data into Training and Test Sets
6. Create the Model (for example, a regression model)
7. Train the Model
8. Make Predictions
9. Evaluate the Model
10. Visualization
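
The steps above might look roughly like this for a linear regression with scikit-learn. This is a sketch under stated assumptions: synthetic data replaces a real dataset, the coefficients used to generate it are made up, and the plot in step 10 is only described in a comment.

```python
# 1. Import the required libraries
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 2. Load the dataset (assumption: synthetic data stands in for a real file)
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))
target = 3.0 * data[:, 0] - 2.0 * data[:, 1] + 5.0 + rng.normal(scale=0.5, size=200)

# 3. Extract features (X) and labels (y)
X, y = data, target

# 4. Preprocess: prepare a scaler; it is fitted after the split so that
#    no statistics from the test set leak into training
scaler = StandardScaler()

# 5. Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 6. Create the model
model = LinearRegression()

# 7. Train the model
model.fit(X_train_scaled, y_train)

# 8. Make predictions
predictions = model.predict(X_test_scaled)

# 9. Evaluate the model
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MSE: {mse:.3f}, R2: {r2:.3f}")

# 10. Visualization: for example, a scatter plot of y_test against predictions
#     (omitted here to keep the sketch free of plotting dependencies)
```

For a classification task the same skeleton applies, with a classifier in step 6 and metrics such as accuracy or a confusion matrix in step 9.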