### Imbalanced Data Metrics: Detailed Explanation

#### 1. Balanced Accuracy
- **Definition**: Balanced Accuracy is the unweighted mean of per-class recall, where the recall of a class is the fraction of that class's instances that are predicted correctly. Because every class contributes equally regardless of its size, the majority class cannot dominate the score.
  
  $$
  \text{Balanced Accuracy} = \frac{1}{N_{\text{classes}}} \sum_{i=1}^{N_{\text{classes}}} \frac{TP_i}{TP_i + FN_i}
  $$
  
  where $N_{\text{classes}}$ is the number of classes, $TP_i$ is the number of true positives for class $i$, and $FN_i$ is the number of false negatives for class $i$.
  
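- **When to Use**: Balanced Accuracy is used in place of plain accuracy on imbalanced datasets, where a classifier that always predicts the majority class would otherwise look strong.
- **Advantages**: Every class contributes equally, so poor performance on a minority class is not masked by the majority class. A classifier that always predicts one class scores only $1 / N_{\text{classes}}$.
- **Disadvantages**: Weights all classes equally regardless of size, so a few errors on a very small class can swing the score. It also does not reflect precision: on a heavily imbalanced binary problem, a classifier can score well while most of its positive predictions are wrong.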
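
As a minimal sketch of the formula above, the following pure-Python function averages per-class recall and compares the result with plain accuracy. The function name `balanced_accuracy` and the toy labels are illustrative assumptions, not part of any library.

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: average of TP_i / (TP_i + FN_i) over the classes in y_true."""
    support = Counter(y_true)  # TP_i + FN_i for each class i
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # TP_i
    return sum(hits[c] / support[c] for c in support) / len(support)

# Hypothetical toy data: 8 negatives, 2 positives; one positive is missed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
print(plain_acc, bal_acc)  # 0.9 0.75
```

Plain accuracy rewards the eight easy negatives, while the balanced score exposes the 50% recall on the minority class. In practice, scikit-learn's `balanced_accuracy_score` computes the same quantity.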

#### 2. Fowlkes-Mallows Index (FMI)
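- **Definition**: The Fowlkes-Mallows Index measures the similarity between two clusterings of the same data, typically a predicted clustering and a reference labeling. It is the geometric mean of pairwise precision and pairwise recall, computed over all pairs of points.

  $$
  \text{FMI} = \sqrt{\frac{TP}{TP + FP} \times \frac{TP}{TP + FN}} = \frac{TP}{\sqrt{(TP + FP)(TP + FN)}}
  $$

  where $TP$ is the number of pairs of points placed in the same cluster in both clusterings, $FP$ is the number of pairs placed together in the predicted clustering but not in the reference, and $FN$ is the number of pairs placed together in the reference but not in the predicted clustering. FMI ranges from 0 to 1, with 1 indicating identical clusterings.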
  
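- **When to Use**: FMI is used to compare a clustering against ground-truth labels or against another clustering, including cases where cluster sizes are unequal.
- **Advantages**: Combines pairwise precision and recall in a single score and does not depend on how the clusters are numbered.
- **Disadvantages**: Sensitive to the number of clusters and to the cluster size distribution, so scores are hard to compare across clusterings with different numbers of clusters.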
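
The pair-counting view of FMI can be sketched in a few lines of Python. This is an illustrative brute-force version that visits every pair of points (quadratic in the number of points); the function name `fowlkes_mallows` and the toy labelings are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt

def fowlkes_mallows(labels_ref, labels_pred):
    """FMI from pair counts: TP / sqrt((TP + FP) * (TP + FN))."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_ref)), 2):
        same_ref = labels_ref[i] == labels_ref[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_ref and same_pred:
            tp += 1  # pair grouped together in both clusterings
        elif same_pred:
            fp += 1  # together in the prediction only
        elif same_ref:
            fn += 1  # together in the reference only
    if tp == 0:
        return 0.0  # no agreeing pairs; also avoids dividing by zero
    return tp / sqrt((tp + fp) * (tp + fn))

# Hypothetical labelings of six points.
reference = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 2, 2]

fmi = fowlkes_mallows(reference, predicted)
print(round(fmi, 4))  # 0.4714 (TP=2, FP=1, FN=4)
```

Renaming clusters does not change the score: `fowlkes_mallows([0, 0, 1, 1], [1, 1, 0, 0])` returns 1.0. scikit-learn offers `fowlkes_mallows_score` for the same measure.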

#### 3. Balanced Error Rate (BER)
- **Definition**: Balanced Error Rate is the unweighted mean of the per-class error rates, where the error rate of a class is the fraction of that class's instances that are misclassified. It is the complement of Balanced Accuracy.

  $$
  \text{BER} = \frac{1}{N_{\text{classes}}} \sum_{i=1}^{N_{\text{classes}}} \frac{FN_i}{TP_i + FN_i} = 1 - \text{Balanced Accuracy}
  $$

  where $N_{\text{classes}}$ is the number of classes, $FN_i$ is the number of false negatives for class $i$ (instances of class $i$ assigned to another class), and $TP_i$ is the number of true positives for class $i$.
  
- **When to Use**: BER is used in place of the plain error rate on imbalanced datasets, when a lower-is-better counterpart to Balanced Accuracy is wanted.
- **Advantages**: Every class contributes equally, so errors on minority classes are not masked by the majority class.
- **Disadvantages**: Carries the same information as Balanced Accuracy, weights all classes equally regardless of size, and does not reflect precision.
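
A short sketch of BER, following the verbal definition above (misclassified instances of a class divided by the total number of instances in that class, averaged over classes). The function name `balanced_error_rate` and the toy labels are hypothetical.

```python
from collections import Counter

def balanced_error_rate(y_true, y_pred):
    """Mean per-class error rate: average of FN_i / (TP_i + FN_i) over the classes in y_true."""
    support = Counter(y_true)  # number of instances of each class
    misses = Counter(t for t, p in zip(y_true, y_pred) if t != p)  # FN_i
    return sum(misses[c] / support[c] for c in support) / len(support)

# Hypothetical toy data: 8 negatives, 2 positives; one positive is missed.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

plain_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
ber = balanced_error_rate(y_true, y_pred)
print(plain_error, ber)  # 0.1 0.25
```

The plain error rate of 10% hides the fact that half of the minority class is misclassified; BER averages the per-class error rates (0% and 50%) to 25%.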

#### 4. G-Mean (Geometric Mean)
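- **Definition**: G-Mean is the geometric mean of sensitivity (the true positive rate, $TP / (TP + FN)$) and specificity (the true negative rate, $TN / (TN + FP)$). Because the two rates are multiplied, the score is high only when the classifier performs well on both the positive and the negative class.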
  
  $$
  \text{G-Mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}
  $$
  
- **When to Use**: G-Mean is used to evaluate the overall performance of a classifier on imbalanced datasets, considering both sensitivity and specificity.
- **Advantages**: Provides a single metric for comparing classifier performance on imbalanced datasets.
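- **Disadvantages**: Depends on the decision threshold used to produce the predictions, drops to zero whenever either sensitivity or specificity is zero, and does not reflect precision.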
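
The binary G-Mean can be sketched directly from the confusion-matrix counts. The function name `g_mean`, the `positive` parameter, and the toy labels are assumptions for illustration; the sketch also assumes both classes are present in `y_true`.

```python
from math import sqrt

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumes y_true contains at least one positive and one negative instance.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt(sensitivity * specificity)

# Hypothetical toy data: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]  # TN=6, FP=2, FN=1, TP=1

score = g_mean(y_true, y_pred)
majority_only = g_mean(y_true, [0] * 10)  # always predict the majority class
print(round(score, 4), majority_only)  # 0.6124 0.0
```

Always predicting the majority class reaches 80% accuracy on this data but a G-Mean of zero, because sensitivity is zero.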

#### 5. F2 Score
- **Definition**: F2 Score is the $F_\beta$ score with $\beta = 2$: a weighted harmonic mean of precision and recall in which recall is treated as $\beta$ times as important as precision.

  $$
  F_\beta = (1 + \beta^2) \times \frac{\text{Precision} \times \text{Recall}}{\beta^2 \times \text{Precision} + \text{Recall}}
  $$

  where $\beta$ controls the relative importance of recall compared to precision. Setting $\beta = 2$ gives $F_2 = \frac{5 \times \text{Precision} \times \text{Recall}}{4 \times \text{Precision} + \text{Recall}}$; setting $\beta = 1$ recovers the F1 Score.
  
- **When to Use**: F2 Score is used when recall is more important than precision, such as in imbalanced classification problems where the focus is on detecting positive instances.
- **Advantages**: Provides a single metric that balances precision and recall, with more weight given to recall.
- **Disadvantages**: Ignores true negatives, and the recall weighting ($\beta = 2$) is a convention rather than a value derived from the problem. When false positives are costly, F1 or $F_{0.5}$ may be a better fit.
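
To make the effect of $\beta$ concrete, the sketch below evaluates the formula for two hypothetical classifiers whose precision and recall values are swapped. The function name `fbeta` and the numbers are illustrative assumptions.

```python
def fbeta(precision, recall, beta=2.0):
    """F-beta from precision and recall; beta > 1 weights recall more heavily."""
    if precision == 0 and recall == 0:
        return 0.0  # convention chosen for this sketch to avoid 0/0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical classifiers with swapped precision and recall.
high_recall = fbeta(precision=0.5, recall=0.8)     # F2 = 2.0 / 2.8
high_precision = fbeta(precision=0.8, recall=0.5)  # F2 = 2.0 / 3.7

f1_a = fbeta(0.5, 0.8, beta=1.0)
f1_b = fbeta(0.8, 0.5, beta=1.0)

print(round(high_recall, 4), round(high_precision, 4))  # 0.7143 0.5405
print(round(f1_a, 4), round(f1_b, 4))                   # 0.6154 0.6154
```

F1 cannot tell the two classifiers apart, while F2 prefers the one with higher recall. When working from label arrays rather than precomputed rates, scikit-learn's `fbeta_score` with `beta=2` serves the same purpose.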

#### 6. Precision-Recall AUC
- **Definition**: Precision-Recall AUC is the area under the precision-recall curve, the curve traced by precision and recall as the decision threshold sweeps across the classifier's scores. It summarizes performance over all thresholds in a single number.
  
- **When to Use**: Precision-Recall AUC is used to evaluate the overall performance of a classifier on imbalanced datasets, focusing on the trade-off between precision and recall.
- **Advantages**: Provides a single metric for comparing classifier performance across different thresholds, suitable for imbalanced datasets.
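- **Disadvantages**: The baseline depends on class prevalence (a random classifier scores roughly the fraction of positive instances), so values are not directly comparable across datasets with different class ratios. Because it aggregates over all thresholds, it also does not identify a single operating threshold.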
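
One common way to estimate the area under the precision-recall curve is average precision, a step-wise sum of precision weighted by each increase in recall. The sketch below implements that estimator in pure Python; it is one choice among several (trapezoidal integration is another), and the function name `average_precision` and the toy scores are assumptions. It also assumes binary labels in {0, 1}, at least one positive, and no tied scores.

```python
def average_precision(y_true, scores):
    """Step-wise area under the PR curve: sum of (R_n - R_{n-1}) * P_n.

    Assumes labels in {0, 1}, at least one positive, and no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at the cutoff that admits this positive
    return total / n_pos  # each positive raises recall by 1 / n_pos

# Hypothetical scores: 2 positives among 8 instances (prevalence 0.25).
y_true = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

ap = average_precision(y_true, scores)
prevalence = sum(y_true) / len(y_true)
print(round(ap, 4), prevalence)  # 0.8333 0.25
```

The score of about 0.83 should be read against the prevalence of 0.25, which is roughly what a random ranking would achieve on this data. scikit-learn's `average_precision_score` implements the same summary and handles tied scores.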

These metrics provide a range of tools for evaluating the performance of classifiers on imbalanced datasets. Depending on the specific goals of the classification task and the characteristics of the dataset, different metrics may be more appropriate for assessing classifier performance.