Q1: Explain the following with an example:

- Artificial Intelligence
- Machine Learning
- Deep Learning



### Artificial Intelligence (AI)

**Definition:**
Artificial Intelligence is the broader concept of machines being able to carry out tasks in a way that we would consider "smart." It encompasses a range of technologies that enable computers to perform tasks that normally require human intelligence. This includes problem-solving, understanding natural language, recognizing patterns, and making decisions.

**Example:**
An example of AI is a virtual assistant like Siri or Alexa. These assistants can understand spoken language, process the information, and respond appropriately. They can set reminders, answer questions, play music, and control smart home devices.

### Machine Learning (ML)

**Definition:**
Machine Learning is a subset of AI that involves training algorithms to learn from and make predictions or decisions based on data. Instead of being explicitly programmed to perform a task, ML models are trained using large amounts of data and improve over time as they are exposed to more data.

**Example:**
A common example of ML is a recommendation system on streaming services like Netflix. The system learns from your viewing habits and those of other users to suggest movies and TV shows you might like. It uses data on what you’ve watched, rated, and interacted with to make these predictions.

### Deep Learning (DL)

**Definition:**
Deep Learning is a specialized subset of ML that uses neural networks with many layers (hence "deep") to learn increasingly abstract features directly from the data. These deep neural networks are capable of learning from large amounts of unstructured data such as images, audio, and text.

**Example:**
An example of Deep Learning is image recognition. Consider Google's image search functionality: when you upload a photo, Google's deep learning algorithms analyze the image to identify objects, people, and scenes, and then return visually similar images from the web. These models have been trained on vast datasets of images and can recognize patterns with high accuracy.

### Summary of Relationships
- **AI** is the broadest concept and includes any technique enabling computers to mimic human intelligence.
- **ML** is a subset of AI that uses data to teach machines how to learn and make decisions without being explicitly programmed for each task.
- **DL** is a further subset of ML that uses complex neural networks to analyze more complicated data structures like images and natural language.

### Visual Example

Imagine a task like identifying animals in photos:

1. **AI**: The system can identify animals in photos as part of a broader set of intelligent behaviors (e.g., also describing the scene, suggesting tags, etc.).
2. **ML**: The system has been trained on thousands of animal photos and can predict whether an animal is a dog, cat, bird, etc., based on features learned from the data.
3. **DL**: The system uses a deep neural network to analyze the intricate details of each photo. It can identify not just the type of animal but also specific breeds of dogs or cats with high accuracy, even in varying conditions like different lighting or angles.

This hierarchy shows how each level is a more specialized subset of the previous one: deep learning is one family of machine learning methods, and machine learning is one way of building AI systems.

Q2: What is supervised learning? List some examples of supervised learning.


### Supervised Learning

**Definition:**
Supervised learning is a type of machine learning where the model is trained on labeled data. This means that each training example is paired with an output label. The model learns to map inputs to outputs by being fed many examples of input-output pairs, and it makes predictions based on this training. The goal is to infer the function that best maps the inputs to the outputs.

**Process:**
1. **Training Phase**: The model is provided with a dataset containing inputs and corresponding correct outputs (labels). It learns the relationships between inputs and outputs.
2. **Testing Phase**: The trained model is then evaluated on a separate dataset to test its performance. The accuracy of its predictions is measured against the known outputs.

### Examples of Supervised Learning

1. **Regression:**
   - **Linear Regression**: Predicting house prices based on features like square footage, number of bedrooms, and location.

2. **Classification:**
   - **Logistic Regression**: Despite its name, a classification method. It estimates the probability of a binary outcome, such as whether a student will pass or fail based on study hours and previous scores.
   - **Support Vector Machines (SVM)**: Classifying emails as spam or not spam.
   - **Decision Trees**: Diagnosing whether a patient has a specific disease based on medical test results.
   - **K-Nearest Neighbors (KNN)**: Recognizing handwritten digits in images (e.g., the MNIST dataset).

3. **Neural Networks:**
   - **Convolutional Neural Networks (CNNs)**: Image classification tasks such as identifying objects in photos (e.g., recognizing cats vs. dogs).
   - **Recurrent Neural Networks (RNNs)**: Sentiment analysis on text data, such as determining if a movie review is positive or negative.

4. **Ensemble Methods:**
   - **Random Forests**: Combining multiple decision trees to improve classification or regression accuracy, such as predicting loan default risk.
   - **Gradient Boosting Machines (GBM)**: Boosting the performance of weak learners to create a strong predictive model, often used in winning machine learning competitions.
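
As a small illustration of the pass/fail example above, the following sketch fits a logistic regression classifier with scikit-learn. The data is synthetic, and the rule used to generate the labels (students who study more than roughly 5 hours tend to pass) is an assumption made purely for demonstration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic feature: hours studied by 200 hypothetical students
study_hours = rng.uniform(0, 10, size=200)

# Assumed labeling rule: noisy threshold around 5 hours (1 = pass, 0 = fail)
passed = (study_hours + rng.normal(0, 1.5, size=200) > 5).astype(int)

# Train on labeled (input, output) pairs
clf = LogisticRegression().fit(study_hours.reshape(-1, 1), passed)

# Predicted probability of passing for 2 and 8 hours of study
prob_pass = clf.predict_proba([[2.0], [8.0]])[:, 1]
print(prob_pass)
```

The model outputs a probability rather than a hard label, which is why logistic regression is used for classification even though its name says "regression".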

### Example Scenario: Predicting House Prices

**Dataset:**
- Inputs (features): Size of the house (square footage), number of bedrooms, number of bathrooms, location (neighborhood), age of the house.
- Output (label): Price of the house.

**Process:**
1. **Training Phase**: The model is trained using a dataset of houses where both the features and the prices are known. It learns how different features affect house prices.
2. **Testing Phase**: The model is tested with new houses where the prices are known but not given to the model. The model predicts the prices based on the features, and its predictions are compared to the actual prices to evaluate accuracy.
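
The two phases above can be sketched with scikit-learn as follows. The houses are synthetic, and the pricing formula used to generate the labels is an assumption for illustration only, not real market data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_houses = 500

# Synthetic features: square footage, number of bedrooms, age in years
sqft = rng.uniform(500, 3500, n_houses)
bedrooms = rng.integers(1, 6, n_houses)
age = rng.uniform(0, 50, n_houses)
X = np.column_stack([sqft, bedrooms, age])

# Synthetic label: an assumed linear pricing rule plus random noise
price = 150 * sqft + 10_000 * bedrooms - 500 * age + rng.normal(0, 10_000, n_houses)

# Hold out 20% of the houses for the testing phase
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# Training phase: learn how the features relate to price
model = LinearRegression().fit(X_train, y_train)

# Testing phase: predict prices the model has not seen and compare
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f"Mean absolute error on held-out houses: {mae:,.0f}")
```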

In summary, supervised learning relies on labeled datasets to train models to make accurate predictions or classifications based on input data. It is widely used in various practical applications, from finance to healthcare to technology.

Q3: What is unsupervised learning? List some examples of unsupervised learning.


### Unsupervised Learning

**Definition:**
Unsupervised learning is a type of machine learning where the model is trained on data without labeled responses. The goal is to infer the natural structure present within a set of data points. Unlike supervised learning, there are no output labels to guide the learning process. Instead, the model tries to learn patterns and relationships in the data.

### Examples of Unsupervised Learning

1. **Clustering:**
   - **K-Means Clustering**: Grouping customers into clusters based on purchasing behavior for market segmentation.
   - **Hierarchical Clustering**: Creating a hierarchy of clusters to organize data, such as building a taxonomy of animal species.
   - **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**: Identifying clusters of varying shapes and sizes, useful for geographic data and anomaly detection.

2. **Dimensionality Reduction:**
   - **Principal Component Analysis (PCA)**: Reducing the dimensionality of a dataset while retaining most of the variation, often used for data visualization and noise reduction.
   - **t-Distributed Stochastic Neighbor Embedding (t-SNE)**: Visualizing high-dimensional data in two or three dimensions, commonly used for visualizing the clustering of data points in a lower-dimensional space.
   - **Independent Component Analysis (ICA)**: Separating a multivariate signal into additive, independent components, such as separating different audio sources in a recording.

3. **Association Rule Learning:**
   - **Apriori Algorithm**: Finding frequent itemsets and generating association rules, commonly used for market basket analysis to identify products frequently bought together.
   - **Eclat Algorithm**: An efficient algorithm for finding frequent itemsets in a dataset, often used in text mining and bioinformatics.

4. **Anomaly Detection:**
   - **Isolation Forest**: Detecting anomalies in data by isolating outliers, useful in fraud detection and network security.
   - **Gaussian Mixture Models (GMM)**: Modeling the data distribution and identifying anomalies based on probabilistic thresholds, applied in finance for fraud detection.

### Example Scenario: Customer Segmentation

**Dataset:**
- Features: Purchase history, browsing behavior, demographic information (e.g., age, income, location).

**Process:**
1. **Clustering (e.g., K-Means)**: The algorithm groups customers into clusters based on their similarities in the provided features. For example, it might identify clusters of high-value customers, occasional buyers, and frequent discount shoppers.
2. **Result**: The business can use these clusters to tailor marketing strategies, improve customer service, and personalize recommendations.
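
A minimal sketch of this segmentation with scikit-learn's `KMeans` might look like the following. The three customer groups and their feature values are invented for illustration, and the choice of `n_clusters=3` is an assumption that would normally have to be justified from the data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic customers: [annual spend, purchases per year, share of discounted orders]
high_value = rng.normal([5000, 40, 0.1], [500, 5, 0.05], size=(100, 3))
occasional = rng.normal([500, 4, 0.2], [100, 1, 0.05], size=(100, 3))
discount = rng.normal([1500, 25, 0.8], [300, 4, 0.05], size=(100, 3))
customers = np.vstack([high_value, occasional, discount])

# Standardize so that spend (large numbers) does not dominate the distance
scaled = StandardScaler().fit_transform(customers)

# No labels are passed: the algorithm groups customers by similarity alone
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
segments = kmeans.fit_predict(scaled)

print(np.bincount(segments))  # number of customers in each segment
```

The cluster numbers themselves carry no meaning. Interpreting a segment as "high-value" or "discount shopper" is done afterwards by inspecting the customers assigned to it.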

### Benefits and Applications
Unsupervised learning is beneficial when:
- You have a large amount of unlabeled data.
- You want to explore the underlying structure of the data.
- You need to discover patterns or groupings that were not previously known.

**Applications include:**
- **Market Segmentation**: Grouping customers based on purchasing behavior.
- **Image Compression**: Reducing the size of images by identifying patterns.
- **Genomic Data Analysis**: Finding patterns in gene expression and other biological data.
- **Social Network Analysis**: Detecting communities and influential nodes in a network.

In summary, unsupervised learning helps in uncovering hidden patterns and structures in data without the need for labeled examples, making it a powerful tool for exploratory data analysis and feature extraction.

Q4: What is the difference between AI, ML, DL, and DS?


### AI vs. ML vs. DL vs. DS

- **Artificial Intelligence (AI)**: The broadest term. It covers any technique that lets a computer perform tasks normally associated with human intelligence, including rule-based systems that involve no learning at all.
- **Machine Learning (ML)**: A subset of AI in which the system learns patterns from data instead of following hand-written rules.
- **Deep Learning (DL)**: A subset of ML that uses neural networks with many layers, typically applied to unstructured data such as images, audio, and text.
- **Data Science (DS)**: Not a subset of AI but an overlapping discipline concerned with extracting insight from data. It combines statistics, programming, and domain knowledge, and covers data collection, cleaning, exploration, and visualization as well as modeling. ML and DL are among the tools a data scientist may use.

### Key Differences Summarized

- **Scope**: DL is contained in ML, and ML is contained in AI. DS cuts across all three and also includes work that involves no AI, such as reporting and descriptive statistics.
- **Goal**: AI, ML, and DL aim to build systems that act or predict. DS aims to answer questions and support decisions with data.
- **Example**: For a streaming service, the recommendation engine is an ML (possibly DL) system and therefore an instance of AI. Analyzing viewing data to decide which shows to commission is a DS task.

Q5: What are the main differences between supervised, unsupervised, and semi-supervised learning?


### Main Differences Between Supervised, Unsupervised, and Semi-Supervised Learning

1. **Definition and Approach:**
   - **Supervised Learning:**
     - **Definition**: Uses labeled data to train models. Each input comes with an associated output label.
     - **Approach**: The model learns to map inputs to the correct outputs using the labeled data.
   - **Unsupervised Learning:**
     - **Definition**: Uses unlabeled data to find patterns and structures in the data.
     - **Approach**: The model tries to infer the natural structure within the data without any guidance from labels.
   - **Semi-Supervised Learning:**
     - **Definition**: Uses a combination of a small amount of labeled data and a large amount of unlabeled data.
     - **Approach**: The model leverages the labeled data to guide its learning process and then expands its understanding using the unlabeled data.

2. **Training Data:**
   - **Supervised Learning**: Requires a dataset with both input features and corresponding output labels.
   - **Unsupervised Learning**: Requires only input features; there are no output labels.
   - **Semi-Supervised Learning**: Requires a mixture of a few labeled examples and many unlabeled examples.

3. **Objective:**
   - **Supervised Learning**: Predict outcomes or classify inputs based on learned relationships from the labeled data.
   - **Unsupervised Learning**: Discover hidden patterns, groupings, or data structures without predefined labels.
   - **Semi-Supervised Learning**: Improve learning accuracy by using the small set of labeled data to inform the learning from the larger set of unlabeled data.

4. **Applications:**
   - **Supervised Learning**:
     - Classification: Email spam detection, disease diagnosis.
     - Regression: Predicting house prices, forecasting stock prices.
   - **Unsupervised Learning**:
     - Clustering: Customer segmentation, social network analysis.
     - Dimensionality Reduction: Data visualization, noise reduction in data.
   - **Semi-Supervised Learning**:
     - Combining the strengths of supervised and unsupervised learning for tasks where obtaining labeled data is expensive or time-consuming.
     - Example: Improving image recognition models by using a small labeled dataset along with a large set of unlabeled images.

### Example Scenarios

1. **Supervised Learning Example:**
   - **Task**: Predicting house prices.
   - **Data**: Historical house prices with features like square footage, number of bedrooms, and location.
   - **Model**: Trained to predict the price of a house based on its features.

2. **Unsupervised Learning Example:**
   - **Task**: Customer segmentation.
   - **Data**: Purchase history and browsing behavior without any labels.
   - **Model**: Groups customers into segments based on their behavior patterns.

3. **Semi-Supervised Learning Example:**
   - **Task**: Classifying emails as spam or not spam.
   - **Data**: A small set of emails labeled as spam or not spam, and a large set of unlabeled emails.
   - **Model**: Trained using both the labeled and unlabeled emails to improve classification accuracy.
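
The semi-supervised scenario can be sketched with scikit-learn's `SelfTrainingClassifier`, which expects unlabeled examples to be marked with the label `-1`. To keep the example short, the emails are replaced by synthetic numeric features from `make_classification`, and the choice of 50 labeled examples out of 1,000 is an arbitrary assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Synthetic stand-in for vectorized emails (1 = spam, 0 = not spam)
X, y_true = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)

# Keep only 50 labels; mark everything else as unlabeled with -1
rng = np.random.default_rng(0)
y_partial = np.full_like(y_true, -1)
labeled_idx = rng.choice(len(y_true), size=50, replace=False)
y_partial[labeled_idx] = y_true[labeled_idx]

# Self-training: fit on labeled data, then repeatedly add confident
# predictions on unlabeled data as pseudo-labels and refit
self_training = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
self_training.fit(X, y_partial)

# Check accuracy on the examples whose labels were hidden during training
unlabeled = y_partial == -1
accuracy = (self_training.predict(X[unlabeled]) == y_true[unlabeled]).mean()
print(f"Accuracy on originally unlabeled examples: {accuracy:.2f}")
```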

### Key Differences Summarized

- **Data Requirements**:
  - Supervised: Requires fully labeled datasets.
  - Unsupervised: Requires only unlabeled data.
  - Semi-Supervised: Requires a small labeled dataset along with a larger unlabeled dataset.

- **Learning Goals**:
  - Supervised: Learn a function to map inputs to outputs.
  - Unsupervised: Discover underlying patterns or structures in the data.
  - Semi-Supervised: Enhance learning accuracy by leveraging both labeled and unlabeled data.

- **Common Algorithms**:
  - Supervised: Linear regression, logistic regression, decision trees, SVMs, neural networks.
  - Unsupervised: K-means clustering, hierarchical clustering, PCA, t-SNE.
  - Semi-Supervised: Self-training, co-training, semi-supervised SVMs, graph-based methods.

Understanding these differences helps in selecting the appropriate learning approach based on the available data and the specific problem at hand.

Q6: What is train, test and validation split? Explain the importance of each term.



### Train, Test, and Validation Split

In machine learning, splitting your dataset into training, testing, and sometimes validation sets is crucial for building robust models and evaluating their performance. Each split serves a distinct purpose in the modeling process.

### 1. Training Set

**Definition:**
The training set is the portion of the dataset used to train the machine learning model. It consists of input data along with the corresponding correct outputs (labels).

**Purpose:**
- **Model Training**: The model learns the relationships between the input features and the output labels.
- **Parameter Adjustment**: It helps in adjusting the model's parameters to minimize error and improve performance on the training data.

**Importance:**
- The training set is essential for the model to learn and develop a pattern-matching capability.
- A larger training set typically leads to a more accurate model because it has more data to learn from.

### 2. Validation Set

**Definition:**
The validation set is a separate portion of the dataset used to tune the model's hyperparameters and make decisions about the model's architecture or configuration. It is not used for training the model directly.

**Purpose:**
- **Hyperparameter Tuning**: Helps in adjusting hyperparameters like learning rate, number of layers in a neural network, etc.
- **Model Selection**: Assists in selecting the best model among various models or configurations by comparing their performance on the validation set.

**Importance:**
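- The validation set provides an estimate of performance on data the model was not trained on, which helps detect overfitting. Because hyperparameters are tuned against it, this estimate gradually becomes optimistic, which is why a separate test set is kept.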
- It allows for the iterative process of training and tuning without compromising the integrity of the test set.

### 3. Test Set

**Definition:**
The test set is the portion of the dataset used to evaluate the final performance of the trained model. It contains input data and corresponding labels that the model has never seen during training or validation.

**Purpose:**
- **Performance Evaluation**: Provides an unbiased assessment of the model's performance on unseen data.
- **Generalization Check**: Helps determine how well the model generalizes to new, unseen data.

**Importance:**
- The test set is critical for understanding the real-world performance of the model.
- It ensures that the model's performance metrics are not inflated by overfitting or by being tuned on the same data repeatedly.

### Splitting the Data

**Typical Splits:**
- **Training Set**: 60-80% of the data
- **Validation Set**: 10-20% of the data (if used)
- **Test Set**: 10-20% of the data

**Example Scenario:**

Suppose you have a dataset of 10,000 labeled images for a classification task:

1. **Training Set**: 70% (7,000 images)
   - The model learns from these images.
   
2. **Validation Set**: 15% (1,500 images)
   - Used to tune hyperparameters and select the best model configuration.
   
3. **Test Set**: 15% (1,500 images)
   - Used to evaluate the final model's performance.
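
One common way to produce this 70/15/15 split is to call scikit-learn's `train_test_split` twice. In the sketch below the 10,000 images are replaced by placeholder integers, and the ten class labels are an assumption added so that stratification can be shown.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholders: one row per "image" and an assumed 10-class label
X = np.arange(10_000).reshape(-1, 1)
y = np.arange(10_000) % 10

# First split: 70% training, 30% held back
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Second split: divide the held-back 30% evenly into validation and test
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.50, stratify=y_temp, random_state=0
)

print(len(X_train), len(X_val), len(X_test))  # 7000 1500 1500
```

Passing `stratify` keeps the class proportions the same in all three sets, which matters most when some classes are rare.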

### Summary

- **Training Set**: Used to train the model and adjust its parameters.
- **Validation Set**: Used to tune hyperparameters and select the best model without overfitting.
- **Test Set**: Used to evaluate the final model's performance and ensure it generalizes well to new data.

Each split plays a crucial role in developing, tuning, and validating a machine learning model, ensuring that it performs well both during development and in real-world applications.

Q7: How can unsupervised learning be used in anomaly detection?


### Unsupervised Learning for Anomaly Detection

Anomaly detection involves identifying rare items, events, or observations that raise suspicions by differing significantly from the majority of the data. Unsupervised learning is particularly useful for anomaly detection because it doesn't require labeled data, which is often hard to obtain for anomalies.

### How Unsupervised Learning is Applied in Anomaly Detection

1. **Clustering Algorithms:**
   - **K-Means Clustering**: This algorithm partitions the data into clusters and assigns every point to its nearest cluster center. Points that lie unusually far from their assigned center can be considered anomalies.
   - **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**: This algorithm identifies clusters based on the density of data points. Points in low-density regions, which do not belong to any cluster, are labeled as anomalies.

2. **Dimensionality Reduction Techniques:**
   - **Principal Component Analysis (PCA)**: PCA reduces the dimensionality of the data while preserving as much variability as possible. Anomalies can be detected by analyzing the residuals or reconstruction errors. Data points that cannot be well-reconstructed from the principal components are considered anomalies.
   - **Autoencoders (a type of neural network)**: These are trained to compress the data into a lower-dimensional representation and then reconstruct it. Anomalies are identified based on high reconstruction errors.

3. **Isolation- and Density-Based Methods:**
   - **Isolation Forest**: This algorithm works by randomly partitioning the data and creating an ensemble of trees. Points that require fewer splits to isolate are considered anomalies because they are less frequent and isolated from the rest of the data.
   - **Local Outlier Factor (LOF)**: LOF compares the local density of a point to that of its neighbors. Points that have a significantly lower density compared to their neighbors are considered anomalies.
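
As one concrete instance of the methods listed above, here is a minimal Local Outlier Factor sketch using scikit-learn. The points are synthetic, and `n_neighbors=20` and `contamination=0.01` are assumed settings rather than recommended values.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# 200 synthetic "normal" points plus 2 hand-placed points far from the rest
inliers = rng.normal(0, 1, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])
points = np.vstack([inliers, outliers])

# LOF compares each point's local density with that of its neighbors
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)
labels = lof.fit_predict(points)  # -1 = anomaly, 1 = normal

# scikit-learn stores the negated factor, so flip the sign:
# larger values now mean "more anomalous"
lof_scores = -lof.negative_outlier_factor_

print(points[labels == -1])
```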

### Example Scenario: Network Intrusion Detection

**Dataset:**
Network traffic data with features such as IP addresses, port numbers, packet sizes, and time stamps.

**Process:**
1. **Data Collection**: Collect network traffic data without labeling it as normal or anomalous.
2. **Feature Extraction**: Extract relevant features from the raw data to construct the dataset.
3. **Apply Unsupervised Learning Algorithm**:
   - **Clustering with DBSCAN**: Run DBSCAN on the network traffic data. Most of the network traffic should form dense clusters, representing normal behavior. Points that do not belong to any cluster (outliers) are flagged as potential intrusions or anomalies.
4. **Analysis of Anomalies**: Investigate the flagged anomalies to determine if they are indeed intrusions or other forms of network anomalies.
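
The DBSCAN step of this process could look roughly like the sketch below. The traffic is synthetic with only two made-up numeric features, and `eps=0.5` and `min_samples=5` are assumed values that would need tuning on real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic traffic: [packet size in bytes, connection duration in seconds]
normal_web = rng.normal([500, 1.0], [30, 0.1], size=(300, 2))
normal_bulk = rng.normal([1400, 5.0], [30, 0.3], size=(300, 2))
odd = np.array([[3000, 0.1], [50, 12.0], [2500, 9.0]])  # hand-placed unusual records
traffic = np.vstack([normal_web, normal_bulk, odd])

# Standardize so both features contribute comparably to the distance
scaled = StandardScaler().fit_transform(traffic)

# DBSCAN labels points in low-density regions as noise (-1)
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(scaled)

# Records that belong to no cluster are flagged for investigation
flagged = traffic[labels == -1]
print(flagged)
```

Real network data would also contain categorical fields such as IP addresses and ports, which need to be encoded numerically before a distance-based algorithm like DBSCAN can use them.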

### Practical Steps for Anomaly Detection Using Unsupervised Learning

1. **Data Preprocessing**:
   - Normalize or standardize the data to ensure features contribute equally to the analysis.
   - Handle missing values and reduce noise if necessary.

2. **Algorithm Selection**:
   - Choose an appropriate unsupervised learning algorithm based on the nature of the data and the type of anomalies expected.

3. **Model Training and Evaluation**:
   - Train the model on the entire dataset (since it's unsupervised).
   - Evaluate the model's performance by comparing detected anomalies with any known anomalies (if available) or using domain knowledge.

4. **Anomaly Scoring**:
   - Calculate anomaly scores based on the model's output. For instance, in PCA, the reconstruction error can be used as an anomaly score.
   - Set a threshold to classify points as normal or anomalous based on the anomaly score distribution.
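
The scoring and thresholding step can be illustrated with the PCA reconstruction error mentioned above. This is a sketch on synthetic data: the data is constructed to lie near a 2-D plane, five off-plane anomalies are injected by hand, and the 99th-percentile threshold is an assumed choice.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data lying close to a 2-D plane inside 5-D space
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
data = latent @ mixing + rng.normal(scale=0.05, size=(500, 5))

# Inject 5 anomalies by pushing the first 5 points off the plane
data[:5] += rng.normal(scale=3.0, size=(5, 5))

# Compress to 2 components, then map back to the original 5-D space
pca = PCA(n_components=2).fit(data)
reconstructed = pca.inverse_transform(pca.transform(data))

# Anomaly score: squared reconstruction error per point
scores = np.sum((data - reconstructed) ** 2, axis=1)

# Threshold chosen from the score distribution (top 1% flagged)
threshold = np.percentile(scores, 99)
is_anomaly = scores > threshold
print(np.flatnonzero(is_anomaly))
```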

### Example Implementation: Isolation Forest

**Algorithm**: Isolation Forest

**Steps**:
1. **Train the Model**: Fit the Isolation Forest model on the network traffic data.
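2. **Anomaly Score**: Each point is given an anomaly score based on how easy it is to isolate. In scikit-learn, `decision_function` returns lower values for more anomalous points, with negative values indicating anomalies.
3. **Threshold Setting**: Determine the cut-off that separates anomalies from normal points. In scikit-learn, the `contamination` parameter sets this cut-off from the expected fraction of anomalies.
4. **Detection**: Identify the points whose scores fall on the anomalous side of the threshold.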

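```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Placeholder data so the cell runs: replace 'network_data' with real,
# numeric network-traffic features (column names here are hypothetical)
rng = np.random.default_rng(42)
network_data = pd.DataFrame(
    rng.normal(size=(1000, 3)), columns=["packet_size", "duration", "port"]
)

model = IsolationForest(contamination=0.01, random_state=42)  # Assume 1% contamination
model.fit(network_data)

# Lower decision_function scores mean more anomalous (negative = anomaly)
anomaly_scores = model.decision_function(network_data)
anomalies = model.predict(network_data)

# Anomalies are marked as -1, normal points as 1
anomaly_data = network_data[anomalies == -1]
```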

**Outcome**:
- The model identifies points in the network traffic data that are significantly different from the majority, flagging them as potential anomalies for further investigation.

### Summary

Unsupervised learning techniques are powerful for anomaly detection as they can identify patterns and deviations in data without the need for labeled examples. By leveraging clustering, dimensionality reduction, and isolation- and density-based methods, unsupervised learning helps detect anomalies in various applications, from network security to fraud detection and beyond.

Q8: List down some commonly used supervised learning algorithms and unsupervised learning algorithms.



### Commonly Used Supervised Learning Algorithms

1. **Linear Regression**
   - Used for predicting a continuous target variable based on one or more input features.
   - Example: Predicting house prices based on size, location, and number of bedrooms.

2. **Logistic Regression**
   - Used for binary classification problems, predicting the probability of a binary outcome.
   - Example: Predicting whether an email is spam or not.

3. **Decision Trees**
   - Used for both classification and regression tasks, based on a tree-like model of decisions.
   - Example: Classifying whether a customer will buy a product based on their demographic data.

4. **Random Forests**
   - An ensemble method that combines multiple decision trees to improve accuracy and control overfitting.
   - Example: Predicting loan default risk.

5. **Support Vector Machines (SVM)**
   - Used for classification and regression by finding the hyperplane that best separates the data into classes.
   - Example: Image classification tasks.

6. **k-Nearest Neighbors (k-NN)**
   - A simple algorithm that classifies a data point based on the majority class among its k nearest neighbors.
   - Example: Recognizing handwritten digits.

7. **Naive Bayes**
   - A probabilistic classifier based on Bayes' theorem, assuming independence between features.
   - Example: Text classification and spam filtering.

8. **Gradient Boosting Machines (GBM)**
   - An ensemble technique that builds models sequentially to correct errors of the previous models.
   - Example: Predicting customer churn.

9. **Neural Networks**
   - Models inspired by the human brain, capable of learning complex patterns in data.
   - Example: Image and speech recognition.

10. **XGBoost**
    - An optimized implementation of gradient boosting designed for speed and performance.
    - Example: Prediction on tabular data, such as the tasks common in Kaggle competitions.

### Commonly Used Unsupervised Learning Algorithms

1. **K-Means Clustering**
   - Partitions the data into k clusters, where each data point belongs to the cluster with the nearest mean.
   - Example: Customer segmentation in marketing.

2. **Hierarchical Clustering**
   - Builds a hierarchy of clusters by either merging or splitting existing clusters iteratively.
   - Example: Creating a taxonomy of animals or biological species.

3. **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**
   - Clusters based on the density of data points, identifying outliers as noise.
   - Example: Geospatial data analysis.

4. **Principal Component Analysis (PCA)**
   - Reduces the dimensionality of the data while retaining most of the variation, used for data visualization and noise reduction.
   - Example: Visualizing high-dimensional datasets in 2D or 3D.

5. **t-Distributed Stochastic Neighbor Embedding (t-SNE)**
   - A technique for dimensionality reduction, particularly well-suited for visualizing high-dimensional data.
   - Example: Visualizing clusters of handwritten digits.

6. **Autoencoders**
   - Neural networks used to learn compressed representations of data, useful for anomaly detection and dimensionality reduction.
   - Example: Detecting anomalies in network traffic data.

7. **Independent Component Analysis (ICA)**
   - Decomposes a multivariate signal into additive, independent components.
   - Example: Separating mixed audio signals from different sources.

8. **Gaussian Mixture Models (GMM)**
   - Probabilistic models that assume the data is generated from a mixture of several Gaussian distributions.
   - Example: Speaker identification.

9. **Isolation Forest**
   - An ensemble method specifically designed for anomaly detection by isolating anomalies.
   - Example: Fraud detection in financial transactions.

10. **Self-Organizing Maps (SOM)**
    - A type of artificial neural network used to produce a low-dimensional representation of data, preserving the topological properties.
    - Example: Visualizing high-dimensional data like molecular structures.

### Summary

**Supervised Learning Algorithms**: Used when the data has labeled outputs and the goal is to predict these outputs from input features. Common algorithms include Linear Regression, Logistic Regression, Decision Trees, Random Forests, SVM, k-NN, Naive Bayes, GBM, Neural Networks, and XGBoost.

**Unsupervised Learning Algorithms**: Used when the data does not have labeled outputs and the goal is to find patterns or structure in the data. Common algorithms include K-Means Clustering, Hierarchical Clustering, DBSCAN, PCA, t-SNE, Autoencoders, ICA, GMM, Isolation Forest, and SOM.