# Metrics Comparison Across Language Models

## Overview:
In this analysis, we compared performance metrics across different language models for various languages. The metrics include __accuracy (acc), F1 score (f1), precision, recall, and loss__. Each language model is represented by a folder, and individual language results are stored in respective subfolders.
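As an illustration of how these metrics relate to each other, here is a minimal pure-Python sketch of accuracy with support-weighted precision, recall, and F1. The weighted averaging is an assumption: the evaluation script is not part of this notebook. It is, however, consistent with the result tables, where recall always equals accuracy. The function name `weighted_metrics` and the toy labels are hypothetical.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed averaging)."""
    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    acc = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # weight each class by its share of the data
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"acc": acc, "f1": f1, "precision": precision, "recall": recall}

# Hypothetical toy data: four balanced classes, and a model that always predicts class 0.
y_true = [0, 1, 2, 3] * 10
y_pred = [0] * len(y_true)
result = weighted_metrics(y_true, y_pred)
print(result)
```

Under this averaging, weighted recall is always identical to accuracy. A model that collapses to a single class on four balanced classes gets accuracy 0.25, precision 0.0625, and F1 0.1, the same pattern as the AfroXLMR column for lua in the results.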

## Language Models Investigated:
- AfriBERTa
- AfroXLMR
- AngoBERTa (fine-tuned from XLMR with MAFT; the model I trained)
- AfroXLMR-61
- AfroXLMR-75

## Languages Explored:
For each language model, we report results on five languages: __Kimbundu (kmb), Umbundu (umb), Lua (lua), Chokwe (cjk), and Kikongo (kon)__. Each language is identified by its language code in the folder names and in the result tables.
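The result tables are labelled with folder-derived names such as `umb_` rather than full language names. The following sketch shows one way to map such a label back to a readable name. The helper `display_name` and the `LANGUAGE_NAMES` mapping are hypothetical and are not part of the notebook code; the names and codes are the ones listed above.

```python
# Language names as listed in this document, keyed by language code.
LANGUAGE_NAMES = {
    "kmb": "Kimbundu",
    "umb": "Umbundu",
    "lua": "Lua",
    "cjk": "Chokwe",
    "kon": "Kikongo",
}

def display_name(folder_label):
    """Turn a folder-derived label such as 'umb_' into 'Umbundu (umb)'."""
    code = folder_label.strip("_").lower()
    name = LANGUAGE_NAMES.get(code)
    # Fall back to the bare code when the language is not in the mapping.
    return f"{name} ({code})" if name else code

labels = ["umb_", "kmb_", "lua_", "kon_", "cjk_"]
names = [display_name(label) for label in labels]
print(names)
```

Unknown codes are returned unchanged, so the helper is safe to apply to any folder label.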

## Visualization:
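To facilitate comparison, each language gets a table with one row per metric and one column per model. The highest value in each row is highlighted in dark blue. For accuracy, F1 score, precision, and recall this marks the best model; for loss, where lower is better, it marks the weakest one.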
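Since taking the maximum of every row treats loss the same way as the other metrics, a direction-aware selection may be more useful. The sketch below picks the best model per metric with `max` for most metrics and `min` for loss. The names `LOWER_IS_BETTER` and `best_model_per_metric` are hypothetical; the values are the umb_ accuracy and loss figures from the results in this notebook.

```python
# Metrics where a lower value is better; all others are treated as higher-is-better.
LOWER_IS_BETTER = {"loss"}

def best_model_per_metric(results):
    """Map each metric to the model with the best value for that metric.

    `results` maps model name -> {metric name -> value}.
    """
    metrics = next(iter(results.values())).keys()
    best = {}
    for metric in metrics:
        pick = min if metric in LOWER_IS_BETTER else max
        # Iterating a dict yields its keys, so this selects a model name.
        best[metric] = pick(results, key=lambda model: results[model][metric])
    return best

# Accuracy and loss for umb_, copied from the results table.
umb = {
    "AfriBERTa":   {"acc": 0.514706, "loss": 1.717912},
    "AfroXLMR":    {"acc": 0.607843, "loss": 1.23611},
    "AngoBERTa":   {"acc": 0.52451,  "loss": 1.549545},
    "AfroXLMR-61": {"acc": 0.651961, "loss": 1.284542},
    "AfroXLMR-75": {"acc": 0.627451, "loss": 1.136076},
}
best = best_model_per_metric(umb)
print(best)
```

With a pandas `Styler`, the same idea can be expressed by applying `highlight_max` to the non-loss rows and `highlight_min` to the loss row through the `subset` argument.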

## Observations:
* __AfroXLMR-75__: Strongest model overall. It has the best score on every metric for kmb, lua, kon, and cjk, and the lowest loss and highest precision on all five languages. On umb it trails AfroXLMR-61 in accuracy, F1 score, and recall.
* __AngoBERTa__: Mixed results across languages. It beats AfriBERTa in accuracy on umb, lua, kon, and cjk, but trails AfroXLMR on every language except lua, where AfroXLMR drops to 0.25 accuracy. On kmb it has the lowest accuracy of all models. (Fine-tuning it for only 10 epochs was probably not enough.)
* __AfroXLMR-61__: Consistent performance. It ranks first or second in accuracy and F1 score on every language, with the best accuracy, F1 score, and recall on umb.
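To summarize the per-language tables in a single number per model, one option is an unweighted mean of accuracy across the five languages. This is only a rough sketch: the aggregation is not part of the original analysis, and the variable names are hypothetical. The accuracy values are copied from the result tables in the order umb, kmb, lua, kon, cjk.

```python
from statistics import mean

# Accuracy per model, in the order: umb, kmb, lua, kon, cjk (from the result tables).
ACCURACY = {
    "AfriBERTa":   [0.514706, 0.495098, 0.534314, 0.598039, 0.421569],
    "AfroXLMR":    [0.607843, 0.514706, 0.25,     0.730392, 0.504902],
    "AngoBERTa":   [0.52451,  0.431373, 0.647059, 0.705882, 0.470588],
    "AfroXLMR-61": [0.651961, 0.627451, 0.70098,  0.740196, 0.563725],
    "AfroXLMR-75": [0.627451, 0.686275, 0.75,     0.818627, 0.583333],
}

# Unweighted mean over languages, then models sorted from best to worst.
mean_acc = {model: mean(values) for model, values in ACCURACY.items()}
ranking = sorted(mean_acc, key=mean_acc.get, reverse=True)

for model in ranking:
    print(f"{model:12s} {mean_acc[model]:.4f}")
```

An average like this hides per-language differences. AngoBERTa's mean lands above AfroXLMR's mainly because of AfroXLMR's 0.25 accuracy on lua, even though AfroXLMR is ahead on the other four languages.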

## Conclusion:
Across the five languages, AfroXLMR-75 is the strongest model overall and AfroXLMR-61 is the runner-up, while AfriBERTa, AfroXLMR, and AngoBERTa trail on most languages. These findings can guide further exploration and optimization efforts for specific languages, such as fine-tuning AngoBERTa for more than 10 epochs.


In [97]:
import os
import pandas as pd
import seaborn as sns
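import matplotlib.pyplot as plt
from IPython.display import display  # available implicitly in notebooks; imported so the code also runs as a script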

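def read_txt_file(file_path):
    """Return the lines of a result file as a list of strings."""
    with open(file_path, 'r') as file:
        data = file.readlines()
    return data

def extract_metrics(data):
    """Parse 'key = value' lines into a dict mapping metric name to float."""
    metrics = {}
    for line in data:
        # Skip blank or malformed lines instead of failing on tuple unpacking
        if " = " not in line:
            continue
        key, value = line.strip().split(" = ", 1)
        metrics[key] = float(value)
    return metrics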

def compare_metrics(all_metrics, main_folders):
    """Display one styled table per language, with one column per model."""
    row_names = ['acc', 'f1', 'precision', 'recall', 'loss']
    for lang_folder in all_metrics:
        print(f"\nMetrics for {lang_folder}:")

        # Build a DataFrame with metrics as rows and models as columns.
        # Models without results for this language are skipped to avoid a KeyError.
        metrics_data = {main_folder: all_metrics[lang_folder][main_folder]
                        for main_folder in main_folders
                        if main_folder in all_metrics[lang_folder]}
        df = pd.DataFrame(metrics_data)

        # Highlight the highest value in each row. This includes the loss row,
        # where the highest value is the worst one.
        styled_df = df.loc[row_names].style.highlight_max(
            axis=1, props='color:white; font-weight:bold; background-color:darkblue;')

        # Render the styled table in the notebook
        display(styled_df)


def main():
    root_folder = './'  # Use the current directory
    all_metrics = {}

    #print("Searching for text files in the following directories:")

    main_folders = ['AfriBERTa', 'AfroXLMR', 'AngoBERTa', 'AfroXLMR-61', 'AfroXLMR-75']

    for main_folder in main_folders:
        main_folder_path = os.path.join(root_folder, main_folder)

        if os.path.isdir(main_folder_path):
            #print(f"Checking {main_folder} directory:")
            
            for lang_folder in os.listdir(main_folder_path):
                lang_folder_path = os.path.join(main_folder_path, lang_folder)

                if os.path.isdir(lang_folder_path):
                    # Remove 'xlmrbase' part from lang_folder
                    lang_folder_name = lang_folder.replace('xlmrbase', '')
                    txt_file_path = os.path.join(lang_folder_path, f'test_result__{lang_folder_name}.txt')

                    if os.path.exists(txt_file_path):
                        #print(f"Found text file for {lang_folder_name}: {txt_file_path}")

                        data = read_txt_file(txt_file_path)
                        metrics = extract_metrics(data)

                        # Store metrics for comparison
                        if lang_folder_name not in all_metrics:
                            all_metrics[lang_folder_name] = {}

                        all_metrics[lang_folder_name][main_folder] = metrics

    print("\nComparing metrics across all languages:")
    # Create a table for metrics across main folders
    compare_metrics(all_metrics, main_folders)

if __name__ == "__main__":
    main()




Comparing metrics across all languages:

Metrics for umb_:


| Metric | AfriBERTa | AfroXLMR | AngoBERTa | AfroXLMR-61 | AfroXLMR-75 |
|---|---|---|---|---|---|
| acc | 0.514706 | 0.607843 | 0.52451 | 0.651961 | 0.627451 |
| f1 | 0.508249 | 0.589364 | 0.463317 | 0.635736 | 0.598336 |
| precision | 0.515042 | 0.604763 | 0.507593 | 0.65234 | 0.685853 |
| recall | 0.514706 | 0.607843 | 0.52451 | 0.651961 | 0.627451 |
| loss | 1.717912 | 1.23611 | 1.549545 | 1.284542 | 1.136076 |



Metrics for kmb_:


| Metric | AfriBERTa | AfroXLMR | AngoBERTa | AfroXLMR-61 | AfroXLMR-75 |
|---|---|---|---|---|---|
| acc | 0.495098 | 0.514706 | 0.431373 | 0.627451 | 0.686275 |
| f1 | 0.48823 | 0.514042 | 0.407132 | 0.631764 | 0.685766 |
| precision | 0.493146 | 0.526324 | 0.471221 | 0.657572 | 0.705792 |
| recall | 0.495098 | 0.514706 | 0.431373 | 0.627451 | 0.686275 |
| loss | 1.64044 | 1.315374 | 1.676919 | 1.169983 | 1.105627 |



Metrics for lua_:


| Metric | AfriBERTa | AfroXLMR | AngoBERTa | AfroXLMR-61 | AfroXLMR-75 |
|---|---|---|---|---|---|
| acc | 0.534314 | 0.25 | 0.647059 | 0.70098 | 0.75 |
| f1 | 0.520611 | 0.1 | 0.629944 | 0.697776 | 0.74391 |
| precision | 0.520994 | 0.0625 | 0.657029 | 0.713043 | 0.753232 |
| recall | 0.534314 | 0.25 | 0.647059 | 0.70098 | 0.75 |
| loss | 1.616108 | 1.933473 | 1.246629 | 1.064259 | 0.919036 |



Metrics for kon_:


| Metric | AfriBERTa | AfroXLMR | AngoBERTa | AfroXLMR-61 | AfroXLMR-75 |
|---|---|---|---|---|---|
| acc | 0.598039 | 0.730392 | 0.705882 | 0.740196 | 0.818627 |
| f1 | 0.587 | 0.728276 | 0.694312 | 0.740094 | 0.815286 |
| precision | 0.594213 | 0.732624 | 0.70329 | 0.742453 | 0.815373 |
| recall | 0.598039 | 0.730392 | 0.705882 | 0.740196 | 0.818627 |
| loss | 1.42033 | 0.949699 | 0.977556 | 0.955773 | 0.797104 |



Metrics for cjk_:


| Metric | AfriBERTa | AfroXLMR | AngoBERTa | AfroXLMR-61 | AfroXLMR-75 |
|---|---|---|---|---|---|
| acc | 0.421569 | 0.504902 | 0.470588 | 0.563725 | 0.583333 |
| f1 | 0.40981 | 0.502933 | 0.440267 | 0.563951 | 0.574221 |
| precision | 0.411425 | 0.518841 | 0.48946 | 0.582862 | 0.587561 |
| recall | 0.421569 | 0.504902 | 0.470588 | 0.563725 | 0.583333 |
| loss | 2.011199 | 1.568881 | 1.607468 | 1.336898 | 1.33683 |
