# <u>Supervised Machine Learning</u>

Supervised learning is a type of machine learning in which an algorithm is trained on labeled data. The algorithm learns to map inputs to the correct outputs (labels) by generalizing patterns from the training dataset, so that it can then predict outputs for unseen data.

### `→ How Supervised Learning is Achieved`
Supervised learning involves the following steps:

1. Data Collection: Gather labeled data, where each input is paired with the correct output.
2. Data Preprocessing: Clean the data, remove outliers, handle missing values, and scale features.
3. Model Training: Feed the labeled data to a machine learning algorithm that learns the mapping between inputs and outputs.
4. Prediction: Once trained, the model predicts outputs for new, unseen data.
5. Model Evaluation: Assess the performance of the model using metrics such as accuracy, precision, recall, and F1-score for classification, or RMSE (Root Mean Square Error) for regression.
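
The five steps above can be sketched end to end in a few lines of Python. This is a minimal illustration using only the standard library, not a production pipeline: the data is synthetic, the nearest-centroid classifier is just one simple choice of model, and all function names (`make_points`, `column_stats`, `fit_centroids`, `predict`) are hypothetical helpers introduced here. A held-out test split is added so that the evaluation step uses data the model has not seen.

```python
import random

random.seed(0)

# 1. Data collection: synthetic labeled points (hypothetical data).
#    Class 0 is centered near (1, 100), class 1 near (3, 300).
def make_points(center, label, n=50):
    return [([random.gauss(center[0], 0.5), random.gauss(center[1], 50.0)], label)
            for _ in range(n)]

data = make_points((1.0, 100.0), 0) + make_points((3.0, 300.0), 1)
random.shuffle(data)

# Hold out 20% of the rows so evaluation uses data the model never saw.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

# 2. Preprocessing: standardize each feature using training-set statistics only.
def column_stats(rows):
    stats = []
    for j in range(len(rows[0][0])):
        col = [x[j] for x, _ in rows]
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        stats.append((mean, std))
    return stats

def scale(x, stats):
    return [(v - m) / s for v, (m, s) in zip(x, stats)]

stats = column_stats(train)

# 3. Training: a nearest-centroid classifier stores the mean point of each class.
def fit_centroids(rows, stats):
    sums, counts = {}, {}
    for x, y in rows:
        z = scale(x, stats)
        acc = sums.setdefault(y, [0.0] * len(z))
        for j, v in enumerate(z):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

centroids = fit_centroids(train, stats)

# 4. Prediction: assign the label of the closest centroid.
def predict(x):
    z = scale(x, stats)
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(z, centroids[y])))

# 5. Evaluation: accuracy on the held-out test set.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Scaling matters here because the second feature has a much larger range than the first; without it, distances would be dominated by that one feature.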

### `→ Types of Supervised Learning:`

Supervised learning can be divided into two main categories: classification and regression.

1. Classification: The output variable is categorical (e.g., spam detection, fraud detection). Algorithms used for classification include logistic regression, decision trees, support vector machines (SVM), k-nearest neighbors (KNN), neural networks, and ensemble methods like random forests or gradient boosting.

2. Regression: The output variable is continuous (e.g., housing price prediction, stock market analysis). Algorithms used for regression include linear regression, decision trees, support vector machines (SVM), neural networks, and ensemble methods like random forests or gradient boosting.
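
To make the regression case concrete, here is a small sketch of simple linear regression fitted with the closed-form least-squares solution. The house sizes and prices are made-up numbers chosen so the relationship is exactly linear, and `fit_line` is a hypothetical helper, not a library function.

```python
# Hypothetical data: house size in square meters -> price in thousands.
sizes = [50.0, 60.0, 80.0, 100.0, 120.0]
prices = [150.0, 180.0, 240.0, 300.0, 360.0]  # exactly 3 * size

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sizes, prices)

# The output is a continuous value, not a category.
predicted = slope * 90.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, price for 90 m^2={predicted:.1f}")
```

A classifier would instead return one of a fixed set of labels for each input; the continuous output is what makes this a regression problem.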

### `→ Modeling Process:`

1. Data Splitting: Split the dataset into a training set (to train the model) and a test set (to evaluate performance).
2. Model Selection: Choose a suitable algorithm (e.g., linear regression, decision trees).
3. Model Training: The algorithm adjusts its internal parameters (weights in the case of neural networks) to minimize the error between predicted and actual labels using optimization techniques like gradient descent.
4. Model Evaluation: Evaluate the model using metrics such as accuracy, precision, recall, and F1-score for classification, or RMSE (Root Mean Square Error) for regression. Evaluation of a supervised model involves:

    * Confusion Matrix: Measures true positives, false positives, true negatives, and false negatives (for classification tasks).
    * Precision, Recall, and F1-Score: More informative than accuracy when the classes are imbalanced.
    * Accuracy: Percentage of correct predictions over total predictions.
    * R-Squared and RMSE: Commonly used for regression models to measure the closeness of predicted values to actual values.

5. Improving Model Performance: Based on the evaluation results, the following methods can improve performance:

    * Hyperparameter Tuning: Adjust parameters like learning rate, regularization strength, etc.
    * Feature Engineering: Create new features or transform existing ones for better predictive power.
    * Cross-Validation: Use techniques like K-fold cross-validation to get a more reliable estimate of generalization performance and to detect overfitting.
    * Regularization: Apply L1 (Lasso) or L2 (Ridge) regularization to reduce model complexity.
    * Ensemble Methods: Use techniques like Bagging (e.g., Random Forest) or Boosting (e.g., AdaBoost) to combine multiple models into a stronger one.
    * Transfer Learning: Use pre-trained models (e.g., VGG16, ResNet50) and fine-tune their weights on a new dataset.
    * Deep Learning: Train deep learning models like convolutional neural networks (CNNs) or recurrent neural networks (RNNs) to capture complex patterns and relationships in the data.
    * Bayesian Optimization: Use Bayesian optimization techniques to find the best hyperparameters for a machine learning model.
    * AutoML: Use automated machine learning tools like TPOT (Tree-based Pipeline Optimization Tool) or Auto-sklearn to automatically search for the best model and hyperparameters.
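
The evaluation metrics listed in step 4 can be computed by hand, which helps show what each one measures. The sketch below uses only the standard library; the function names and the sample labels are illustrative assumptions, and it treats label `1` as the positive class.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def classification_report(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many are found
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def rmse(y_true, y_pred):
    """Root mean square error for regression."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Fraction of the variance in y_true explained by the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up labels and predictions for illustration.
labels      = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
report = classification_report(labels, predictions)
print(confusion_counts(labels, predictions), report)

actual    = [3.0, 5.0, 7.0]
estimated = [2.0, 5.0, 8.0]
print(rmse(actual, estimated), r_squared(actual, estimated))
```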

### `→ Use Cases of Supervised Learning:`
1. Banking and Finance: Predicting customer defaults using logistic regression or decision trees.
2. Healthcare: Predicting medical conditions using logistic regression or neural networks.
3. Telecommunications: Predicting customer churn using decision trees or random forests.
4. Retail and E-commerce: Predicting customer purchases using logistic regression or neural networks.
5. Social Media and Marketing: Analyzing user behavior using logistic regression or neural networks.
6. Product Recommendations: Predicting products users might like using collaborative filtering or content-based filtering.
7. Email Classification: Classifying emails as personal, work, or social using Naive Bayes or SVM.
8. Social Media Monitoring: Monitoring user behavior using logistic regression or neural networks.
9. Sentiment Analysis: Analyzing customer feedback using logistic regression or neural networks.
10. Image Classification: Recognizing objects in images using convolutional neural networks (CNNs).
11. Customer Segmentation: Assigning customers to predefined segments using classifiers trained on labeled examples. (Grouping customers without labels, e.g., with K-means or hierarchical clustering, is unsupervised learning.)
12. Anomaly Detection: Identifying unusual patterns or events using classifiers trained on labeled normal and anomalous examples. (Without labels, this is typically done with statistical or unsupervised methods.)
13. Spam Detection: Classifying emails as spam or not spam using Naive Bayes or SVM.
14. Fraud Detection: Identifying fraudulent transactions using decision trees or random forests.
15. Medical Diagnosis: Predicting diseases based on patient symptoms using logistic regression or neural networks.
16. Customer Churn Prediction: Identifying customers likely to leave using decision trees or random forests.

### `→ Techniques Involved in Supervised Learning:`

1. Classification: Predict categorical labels (e.g., spam vs. not spam).
2. Regression: Predict continuous values (e.g., house prices).
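3. Clustering: Group similar data points together (e.g., customer segmentation). Clustering is itself an unsupervised technique; in a supervised workflow it is used mainly to explore the data or to derive additional features.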
4. Regularization: L1 (Lasso) and L2 (Ridge) regularization to avoid overfitting.
5. Feature Engineering: Techniques like one-hot encoding, label encoding, or binning to transform categorical variables into numerical features.
6. Dimensionality Reduction: Techniques like Principal Component Analysis (PCA) to reduce feature space.
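
The feature engineering techniques named above (one-hot encoding, label encoding, and binning) can be sketched as follows. This is a standard-library illustration with hypothetical helper names and toy inputs; in practice a library encoder would usually be used instead.

```python
from bisect import bisect_right

def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector; columns follow sorted category order."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded

def label_encode(values):
    """Map each category to an integer code (implies an ordering, so use with care)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return categories, [index[v] for v in values]

def bin_values(values, edges):
    """Return the index of the bin each value falls into; edges must be sorted."""
    return [bisect_right(edges, v) for v in values]

colors = ["red", "green", "blue", "green"]
categories, one_hot = one_hot_encode(colors)
_, codes = label_encode(colors)

ages = [5, 18, 40, 70]
age_bins = bin_values(ages, edges=[18, 65])  # bins: <18, 18 to 64, 65 and above

print(categories, one_hot, codes, age_bins)
```

One-hot encoding avoids suggesting an order between categories, at the cost of one column per category; label encoding is compact but is best suited to tree-based models or genuinely ordinal features.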

### `→ List of Models in Supervised Learning:`

1. Classification Models - In classification tasks, the goal is to predict discrete labels or categories. The model classifies input data into one of several predefined categories. Common algorithms include:
    - Logistic Regression
    - Decision Trees
    - Random Forests
    - Support Vector Machines (SVM)
    - Naive Bayes
    - K-Nearest Neighbors (KNN)
    - Neural Networks (e.g., Feedforward Neural Networks, Recurrent Neural Networks)

2. Regression Models - In regression tasks, the goal is to predict continuous values. The model learns from the labeled training data and predicts outputs that fall on a continuous spectrum. Common algorithms include:
    - Linear Regression
    - Polynomial Regression
    - Ridge Regression
    - Lasso Regression
    - Support Vector Regression (SVR)
    - Decision Trees (for regression)
    - Bayesian Linear Regression

3. Ensemble Methods - Ensemble methods in machine learning combine the predictions of multiple models (usually called "base learners") to improve the overall performance. The idea is that by aggregating the outputs of different models, the ensemble reduces errors and improves generalizability. Some common types of ensemble methods include:
    - Bagging (Bootstrap Aggregating): Builds several independent models by training each on a different bootstrap sample (a random subset of the data drawn with replacement); their predictions are averaged (or voted on) to produce the final output.
    - Boosting: An ensemble technique where models are built sequentially, each new model focusing on the mistakes made by previous models. The goal is to turn weak learners into a strong learner, for example by assigning more weight to incorrectly classified instances (AdaBoost) or by fitting each new model to the remaining errors (gradient boosting). Common boosting algorithms include:
        - AdaBoost (Adaptive Boosting)
        - Gradient Boosting
        - XGBoost
        - LightGBM
        - CatBoost

4. Deep Learning - Multi-layer neural networks (e.g., Convolutional Neural Networks, Recurrent Neural Networks), usable for both classification and regression.
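
As a rough illustration of the bagging idea described under ensemble methods, the sketch below trains several one-feature decision stumps on bootstrap samples and combines them by majority vote. The stump learner, the toy data, and the names `fit_stump`, `bagging_fit`, and `bagging_predict` are assumptions made for this example; real implementations such as random forests use full decision trees and random feature subsets.

```python
import random

def fit_stump(rows):
    """Pick the threshold on a single feature that misclassifies the fewest rows.
    The stump predicts 1 when x > threshold, else 0."""
    best_threshold, best_errors = None, None
    for threshold, _ in rows:
        errors = sum((x > threshold) != bool(y) for x, y in rows)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(rows, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(rows) rows with replacement.
        sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Majority vote over the stumps' individual predictions."""
    votes = sum(x > threshold for threshold in models)
    return 1 if votes > len(models) / 2 else 0

# Toy one-feature dataset with two noisy points (1.8 labeled 1, 2.6 labeled 0).
rows = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (2.6, 0),
        (1.8, 1), (2.2, 1), (3.0, 1), (3.5, 1), (4.0, 1)]

models = bagging_fit(rows)
print(sorted(set(models)))  # the distinct thresholds the stumps learned
print(bagging_predict(models, 0.0), bagging_predict(models, 5.0))
```

Because each stump sees a slightly different sample, the learned thresholds vary; voting averages out that variation, which is the variance-reduction effect bagging relies on.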

### `→ Real-World Examples:`
1. Netflix: Uses supervised learning to recommend movies based on user preferences and past viewing history.
2. Amazon: Predicts product recommendations based on customer data.
3. Credit Scoring: Banks use supervised models to predict the likelihood of loan defaults.

### `→ Conclusion:`

Supervised learning is a powerful and widely used technique in machine learning, with applications across many domains. The steps outlined above provide a foundation for building models that make accurate predictions. Remember to apply feature engineering, cross-validation, and regularization techniques to improve model performance.


# <u>Starting with the Classification Models</u> 

## Logistic Regression

Logistic regression is a supervised learning algorithm used primarily for binary classification, where the dependent variable is categorical and takes one of two values (0 or 1, True or False). It models the probability that a given input belongs to a particular class using a logistic (sigmoid) function.

#### `→ Modeling Process:`

The modeling process includes:

* Step 1: Define the problem as a binary classification task.

* Step 2: Train the model by finding the best coefficients that fit the data. Logistic regression uses maximum likelihood estimation to find the optimal parameters.

* Step 3: Predict probabilities using the logistic function:

    <img src="./img/Screenshot1.png"><br/>
    where "w" is the vector of coefficients, "x" is the input, and "b" is the bias term.

    `Note:` Here's a breakdown of each component:
    1. ŷ: The predicted probability that the output belongs to a particular class (typically the positive class). It always falls between 0 and 1 due to the properties of the sigmoid function.
    2. σ(·): The sigmoid function transforms the linear combination of inputs and coefficients into a probability. As z approaches infinity, σ(z) approaches 1; as z approaches negative infinity, σ(z) approaches 0.
    3. w: The vector of coefficients (weights) that represents the strength and direction of the relationship between each input feature and the outcome.
    4. x: The input feature vector. Each feature contributes to the prediction based on its corresponding weight in w.
    5. b: The bias term allows the model to fit the data more flexibly by shifting the sigmoid curve left or right.
    6. wᵀx + b: The linear combination of the input features, adjusted by the weights and bias. The result, z, is fed into the sigmoid function to yield a probability.

        The logistic regression model uses this probability to classify inputs into one of the two classes, typically setting a threshold (often 0.5) to decide the final classification outcome.
* Step 4: Decision boundary: Convert the probabilities into class labels by applying a threshold (typically 0.5).
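
Steps 3 and 4 can be sketched in a few lines: compute the linear combination of weights and inputs plus the bias, pass it through the sigmoid, and apply a threshold. The coefficients below are made up purely for illustration (they are not fitted to any data), and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x, b):
    """Probability of the positive class: sigmoid of the weighted sum plus bias."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def predict_label(w, x, b, threshold=0.5):
    """Convert the probability into a class label using the threshold."""
    return 1 if predict_proba(w, x, b) >= threshold else 0

# Hypothetical coefficients, chosen only for illustration.
w = [0.8, -0.4]
b = -0.2
x = [2.0, 1.0]

p = predict_proba(w, x, b)  # z = 1.6 - 0.4 - 0.2 = 1.0
print(f"probability={p:.4f}, label={predict_label(w, x, b)}")
```

Raising the threshold (for example to 0.9) makes the model predict the positive class less often, which trades recall for precision.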

#### `→ Maths Involved:`

* <img src="./img/Sigmoid_func.png"><br/>
    <br/>
* <img src="./img/Log-Loss_func.png"><br/>
    <br/>
* <img src="./img/Gradient_descent.png"><br/>
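
The three ingredients pictured above (the sigmoid function, the log-loss, and gradient descent) fit together in a short training loop. The sketch below assumes the standard forms: the sigmoid 1 / (1 + e^(-z)), the average binary cross-entropy as the log-loss, and the update "parameter minus learning rate times gradient". The toy dataset, learning rate, and iteration count are arbitrary choices for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def log_loss(w, b, X, y):
    """Average binary cross-entropy over the dataset."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(w, b, xi), eps), 1 - eps)
        total += -(yi * math.log(p) + (1 - yi) * math.log(1 - p))
    return total / len(X)

def gradient_step(w, b, X, y, lr):
    """One batch gradient descent update of the weights and bias."""
    n = len(X)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for xi, yi in zip(X, y):
        error = predict_proba(w, b, xi) - yi  # derivative of the loss w.r.t. z
        for j, xj in enumerate(xi):
            grad_w[j] += error * xj / n
        grad_b += error / n
    new_w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return new_w, b - lr * grad_b

# Toy one-feature dataset: small values are class 0, large values are class 1.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = [0.0], 0.0
initial_loss = log_loss(w, b, X, y)  # equals ln(2) when all weights are zero
for _ in range(2000):
    w, b = gradient_step(w, b, X, y, lr=0.1)
final_loss = log_loss(w, b, X, y)

print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}, w={w[0]:.2f}, b={b:.2f}")
```

Minimizing the log-loss is equivalent to the maximum likelihood estimation mentioned in the modeling process, since the log-loss is the negative average log-likelihood of the labels.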


#### `→ Use Cases of Logistic Regression:`

1. Medical Diagnosis: Predicting whether a patient has a disease (e.g., cancer detection).
2. Marketing: Customer churn prediction, determining whether a customer will buy a product.
3. Finance: Credit card fraud detection or predicting loan defaults.
4. Email Spam Detection: Classifying emails as spam or not spam.

#### `→ Techniques in Logistic Regression:`

1. Binary Logistic Regression: The basic form, where the output is a binary class (0 or 1).
2. Multinomial Logistic Regression: An extension to handle multi-class classification problems.
3. Regularized Logistic Regression:
    * L1 (Lasso): Adds a penalty on the absolute values of the weights, which can shrink some weights to exactly zero.
    * L2 (Ridge): Adds a penalty on the squared weights, which shrinks them toward zero to prevent overfitting.
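
Two of the techniques above can be illustrated briefly: the softmax function that multinomial logistic regression uses to turn per-class scores into probabilities, and the L1 and L2 penalty terms added to the loss in regularized logistic regression. The names `softmax`, `penalized_loss`, and the strength parameter `lam` are illustrative assumptions, and the sketch follows the common convention of penalizing only the weights, not the bias.

```python
import math

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def l1_penalty(w, lam):
    """Lasso penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(wj) for wj in w)

def l2_penalty(w, lam):
    """Ridge penalty: lam times the sum of squared weights."""
    return lam * sum(wj ** 2 for wj in w)

def penalized_loss(data_loss, w, lam, kind="l2"):
    """Total objective: the data loss (e.g., log-loss) plus the chosen penalty."""
    penalty = l1_penalty(w, lam) if kind == "l1" else l2_penalty(w, lam)
    return data_loss + penalty

probs = softmax([1.0, 2.0, 3.0])
predicted_class = probs.index(max(probs))
print([round(p, 4) for p in probs], predicted_class)

weights = [1.0, -2.0]
print(penalized_loss(0.4, weights, lam=0.5, kind="l1"))  # data loss + 0.5 * (1 + 2)
print(penalized_loss(0.4, weights, lam=0.5, kind="l2"))  # data loss + 0.5 * (1 + 4)
```

A larger `lam` puts more weight on keeping coefficients small relative to fitting the training data, so it is usually tuned with cross-validation.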

#### `→ Models Used in Logistic Regression:`

1. Logistic Regression Model (Binary): Used for binary classification.
2. Multinomial Logistic Regression: Used for multi-class classification.
3. Regularized Logistic Regression (Lasso, Ridge): Used when there is a need to penalize large coefficients and reduce overfitting.

#### `→ Real-World Examples:`

1. Healthcare: Logistic regression is widely used to predict the likelihood of diseases based on patient data (e.g., heart disease, diabetes prediction).
2. Finance: Used to predict credit card fraud or whether a customer will default on a loan.
3. Social Media: Predicting user engagement or click-through rates for advertisements.

#### `→ Implementation:` 

The implementation of logistic regression is in the notebook `logistic_regression.ipynb`, in the `ClassificationModels` folder.