# Project Echo - Experiment Benchmarking Framework

This notebook provides an interactive interface to the benchmarking framework. It allows you to run various experiments with different model architectures and augmentation strategies, and compare their performance.

## Overview

The benchmarking framework is designed to systematically evaluate different combinations of:
- Model architectures (EfficientNet, MobileNet, ResNet, etc.)
- Audio augmentation strategies
- Image/spectrogram augmentation strategies

Results are collected and visualized to help identify the best performing configurations for bat sound classification.

## 0.1 Install Required Libraries
The following cell is for installing the required libraries if you are running this notebook remotely, for example on a Vast.ai instance or Google Colab.
Start from a clean Python 3.9.21 kernel.
Setup details are in the README.
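
Before installing anything, it can help to check which packages are already available in the kernel. The sketch below is one way to do that. The package list is taken from the import cell later in this notebook, and `find_missing_packages` is a hypothetical helper, not part of the framework.

```python
import importlib.util
import sys

# Top-level packages imported later in this notebook.
REQUIRED_PACKAGES = ["tensorflow", "matplotlib", "pandas", "seaborn", "ipywidgets"]


def find_missing_packages(packages):
    """Return the packages that cannot be located in the current kernel."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = find_missing_packages(REQUIRED_PACKAGES)
if missing:
    # These can be installed with pip, e.g. `%pip install <name>` in a notebook cell.
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Any names reported as missing can then be installed following the README before the rest of the notebook is run.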

In [3]:
from ipywidgets import IntSlider
from IPython.display import display
slider = IntSlider()
display(slider)

IntSlider(value=0)

## 1. Import Required Libraries

In [4]:
# Import necessary libraries
import os
import sys
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from ipywidgets import widgets
from IPython.display import display, HTML, clear_output
import re
import importlib

# Add the current directory to path for imports (disabled in favour of the explicit path below)
# module_path = os.getcwd() # Gets the current working directory of the notebook
# if module_path not in sys.path:
#     sys.path.append(module_path)

# --- MODIFICATION FOR DOCKERIZED KERNEL ---
# This path MUST match the location of your 'Benchmarking_and_Experimentation'
# directory INSIDE THE DOCKER CONTAINER, based on your volume mount.

# Example: if the repository is mounted at '/workspace/Project-Echo' inside the
# container, point this at the 'Benchmarking_and_Experimentation' directory under
# that mount. The value below is a local macOS path; change it to match your environment.
actual_module_path_inside_container = "/Users/mankirat/Desktop/Deakin/DEAKIN25/Sem2/ProjectEcho/TechnicalTask1/Technical_Task2/Benchmarking_and_Experimentation"
# This is the directory containing your 'config' and 'utils' Python packages.

if not os.path.isdir(actual_module_path_inside_container):
    print(f"ERROR: The path '{actual_module_path_inside_container}' does NOT exist or is not a directory INSIDE THE CONTAINER.")
    print(f"Current CWD inside container (from kernel's perspective) is: {os.getcwd()}")
    # For debugging, you can list directories from the root of your mounted project:
    # mounted_project_root_in_container = "/workspace/Project-Echo" # Adjust if your mount is different
    # if os.path.exists(mounted_project_root_in_container):
    #     print(f"Contents of '{mounted_project_root_in_container}': {os.listdir(mounted_project_root_in_container)}")
else:
    if actual_module_path_inside_container not in sys.path:
        sys.path.insert(0, actual_module_path_inside_container) # Insert at the beginning for higher precedence
    print(f"Successfully added to sys.path: {actual_module_path_inside_container}")
    # You can verify the contents if needed:
    # print(f"Contents of '{actual_module_path_inside_container}': {os.listdir(actual_module_path_inside_container)}")
    # print(f"Checking for config dir: {os.path.exists(os.path.join(actual_module_path_inside_container, 'config'))}")
    # print(f"Checking for __init__.py in config: {os.path.exists(os.path.join(actual_module_path_inside_container, 'config', '__init__.py'))}")

# --- END MODIFICATION ---

# Import framework components
from config.experiment_configs import EXPERIMENTS





Successfully added to sys.path: /Users/mankirat/Desktop/Deakin/DEAKIN25/Sem2/ProjectEcho/TechnicalTask1/Technical_Task2/Benchmarking_and_Experimentation


In [5]:
# Import necessary libraries
import os
import sys
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from ipywidgets import widgets
from IPython.display import display, HTML, clear_output
import re
import importlib

# Add the current directory to path for imports
module_path = os.getcwd() # Gets the current working directory of the notebook
if module_path not in sys.path:
    sys.path.append(module_path)

# Import framework components
from config.experiment_configs import EXPERIMENTS





## 3. Available Experiments

Here you can view and select experiments to run. Each experiment represents a combination of model architecture and augmentation strategies.

## 2. Configuration

Set up the directories and options for benchmarking.

Update these values in the `system_config.py` file in the `config` folder.

The default directories are:

- `DATA_DIR = "D:\Echo\Audio_data"`: directory containing audio data
- `CACHE_DIR = "D:\Echo\Training_cache"`: directory for caching pipeline results
- `OUTPUT_DIR = "D:\Echo\results"`: directory to save experiment results
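
For orientation, `system_config.py` is expected to expose a dictionary named `SC` containing the keys read by the next cell. The sketch below is an assumption about its shape, not the actual file. The key names come from this notebook and the raw-string style matches what the update logic in Section 5 searches for; the paths are placeholders, and `check_directories` is a hypothetical helper.

```python
import os

# Hypothetical contents of config/system_config.py. Only the key names are
# taken from this notebook; the paths are placeholders.
SC = {
    'AUDIO_DATA_DIRECTORY': r"D:\Echo\Audio_data",
    'CACHE_DIRECTORY': r"D:\Echo\Training_cache",
    'OUTPUT_DIRECTORY': r"D:\Echo\results",
}


def check_directories(config):
    """Map each directory key to True if its path exists on this machine."""
    return {key: os.path.isdir(path) for key, path in config.items()}


for key, exists in check_directories(SC).items():
    print(f"{key}: {'found' if exists else 'missing'}")
```

Running a check like this before training makes a wrong path visible immediately, not partway through an experiment.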

In [6]:
# Import directories from system_config
from config.system_config import SC

# Get directory paths from system config
DATA_DIR = SC['AUDIO_DATA_DIRECTORY']
CACHE_DIR = SC['CACHE_DIRECTORY']
OUTPUT_DIR = SC['OUTPUT_DIRECTORY']

print("Using directories from system_config:")
print(f"Data Directory: {DATA_DIR}")
print(f"Cache Directory: {CACHE_DIR}")
print(f"Output Directory: {OUTPUT_DIR}")

# Create output directory if it doesn't exist
os.makedirs(OUTPUT_DIR, exist_ok=True)


print("Physical GPUs:", tf.config.list_physical_devices("GPU"))
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("GPU name:", tf.test.gpu_device_name())


# Configure GPU memory if available
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    print(f"GPU support enabled: {len(gpus)} GPU(s) found")
else:
    print("No GPU support found, running on CPU")

Using directories from system_config:
Data Directory: /Users/mankirat/Desktop/Deakin/DEAKIN25/Sem2/ProjectEcho/TechnicalTask1/Technical_Task2/Benchmarking_and_Experimentation/dataset
Cache Directory: /Users/mankirat/Desktop/Deakin/DEAKIN25/Sem2/ProjectEcho/TechnicalTask1/Technical_Task2/Benchmarking_and_Experimentation/cache
Output Directory: /Users/mankirat/Desktop/Deakin/DEAKIN25/Sem2/ProjectEcho/TechnicalTask1/Technical_Task2/Benchmarking_and_Experimentation/results
Physical GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Built with CUDA: False
Metal device set to: Apple M2
GPU name: /device:GPU:0
GPU support enabled: 1 GPU(s) found


2025-09-19 11:52:23.683296: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2025-09-19 11:52:23.685441: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


In [7]:
from config.experiment_configs import EXPERIMENTS
# Display available experiments
experiment_data = []
for exp in EXPERIMENTS:
    experiment_data.append({
        "name": exp["name"],
        "model": exp["model"],
        "audio_augmentation": exp["audio_augmentation"],
        "image_augmentation": exp["image_augmentation"],
        "epochs": exp["epochs"],
        "batch_size": exp["batch_size"]
    })

experiments_df = pd.DataFrame(experiment_data)
display(experiments_df)

| | name | model | audio_augmentation | image_augmentation | epochs | batch_size |
|---|---|---|---|---|---|---|
| 0 | baseline | EfficientNetV2B0 | none | none | 10 | 16 |
| 1 | noise_and_stretch_audio_aug | EfficientNetV2B0 | noise_and_stretch | none | 10 | 16 |
| 2 | basic_image_aug | EfficientNetV2B0 | none | basic_rotation | 10 | 16 |
| 3 | full_augmentation | EfficientNetV2B0 | advanced | combined | 10 | 16 |
| 4 | mobilenet_baseline | MobileNetV2 | none | none | 10 | 16 |
| 5 | mobilenet_full_aug | MobileNetV2 | advanced | combined | 10 | 16 |
| 6 | noise_and_pitch_small | ResNet50V2 | noise_and_pitch | none | 10 | 16 |
| 7 | spec_only_aggressive | InceptionV3 | none | aggressive | 10 | 16 |
| 8 | mobilenetv2_tf_basic | MobileNetV2_TF | none | none | 10 | 16 |
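
Each entry in `EXPERIMENTS` appears to be a dictionary with the six keys shown in the table. As a minimal sketch, assuming that structure, experiments can be filtered by model or looked up by name. The two sample entries copy rows from the table above; `experiments_for_model` and `find_experiment` are hypothetical helpers, not part of the framework.

```python
# Two sample entries copied from the table above; the real list is defined in
# config/experiment_configs.py.
SAMPLE_EXPERIMENTS = [
    {"name": "baseline", "model": "EfficientNetV2B0",
     "audio_augmentation": "none", "image_augmentation": "none",
     "epochs": 10, "batch_size": 16},
    {"name": "mobilenet_full_aug", "model": "MobileNetV2",
     "audio_augmentation": "advanced", "image_augmentation": "combined",
     "epochs": 10, "batch_size": 16},
]


def experiments_for_model(experiments, model_name):
    """Return the names of all experiments that use the given model."""
    return [exp["name"] for exp in experiments if exp["model"] == model_name]


def find_experiment(experiments, name):
    """Return the experiment dictionary with the given name, or None."""
    return next((exp for exp in experiments if exp["name"] == name), None)


print(experiments_for_model(SAMPLE_EXPERIMENTS, "MobileNetV2"))
print(find_experiment(SAMPLE_EXPERIMENTS, "baseline")["audio_augmentation"])
```

The same name-based lookup is what the runner in Section 5 uses to resolve the entries selected in the widget.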


## 4. Interactive Experiment Selection

Use the widgets below to select experiments and set directories.

In [8]:
# Create widgets for directory selection
data_dir_widget = widgets.Text(
    value=DATA_DIR,
    description='Data Directory:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='80%')
)

cache_dir_widget = widgets.Text(
    value=CACHE_DIR,
    description='Cache Directory:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='80%')
)

output_dir_widget = widgets.Text(
    value=OUTPUT_DIR,
    description='Output Directory:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='80%')
)

# Group directory widgets
dir_widgets_box = widgets.VBox([data_dir_widget, cache_dir_widget, output_dir_widget])

# Create widget for experiment selection
experiment_options = [(exp["name"], exp["name"]) for exp in EXPERIMENTS]
experiment_widget = widgets.SelectMultiple(
    options=experiment_options,
    description='Select Experiments:',
    disabled=False,
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%', height='200px')
)

# Buttons for actions
run_selected_button = widgets.Button(
    description='Run Selected Experiments',
    button_style='primary',
    tooltip='Run the selected experiments'
)

run_all_button = widgets.Button(
    description='Run All Experiments',
    tooltip='Run all experiments'
)

generate_report_button = widgets.Button(
    description='Generate Report Only',
    button_style='info',
    tooltip='Generate a report from existing results'
)

# Group buttons
buttons_box = widgets.HBox([run_selected_button, run_all_button, generate_report_button])

# Output area for logs
output_area = widgets.Output(layout={'border': '1px solid black', 'width': '90%', 'height': '300px'}) # Adjusted width and added height

# Main container for all control widgets
controls_box = widgets.VBox([
    widgets.HTML("<h3>Directory Configuration:</h3>"), # Optional title
    dir_widgets_box,
    widgets.HTML("<hr><h3>Experiment Selection:</h3>"), # Optional separator and title
    experiment_widget,
    widgets.HTML("<hr><h3>Actions:</h3>"), # Optional separator and title
    buttons_box
])

# Display main controls container and then the output area
display(controls_box)
display(output_area)

VBox(children=(HTML(value='<h3>Directory Configuration:</h3>'), VBox(children=(Text(value='/Users/mankirat/Des…

Output(layout=Layout(border_bottom='1px solid black', border_left='1px solid black', border_right='1px solid b…

## 5. Experiment Runner Functions

These functions handle the execution of experiments and report generation.
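
The runner below rewrites directory entries in `system_config.py` with a regular expression. The sketch here shows that step on a single line, assuming entries are written as `'KEY': r"..."`; `replace_config_path` is a hypothetical helper. It passes a function to `re.sub` instead of a replacement string, so backslashes in Windows paths are inserted literally and are not parsed as escape sequences.

```python
import re


def replace_config_path(line, key, new_path):
    """Replace the raw-string value of `key` in one line of a config file."""
    pattern = re.compile(r"('" + re.escape(key) + r"':\s*r\")[^\"]*(\")")
    # A callable replacement returns its text verbatim, so backslashes in
    # new_path are never interpreted as escapes or group references.
    return pattern.sub(lambda m: m.group(1) + new_path + m.group(2), line)


line = "    'CACHE_DIRECTORY': r\"D:\\Echo\\Training_cache\",\n"
print(replace_config_path(line, 'CACHE_DIRECTORY', r"E:\Echo\cache"), end="")
```

Lines that do not contain the key are returned unchanged, which lets a caller detect whether anything was updated by comparing the result with the input.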

In [1]:
from utils.optimised_engine_pipeline import train_model


def run_selected_experiments(b):

    # Clear any previous log output from the output widget
    output_area.clear_output(wait=True)
    with output_area:
        print("Starting experiment run...")
        # Get new directory paths from widgets
        new_data_dir = data_dir_widget.value
        new_cache_dir = cache_dir_widget.value
        
        # Define path to system_config.py (relative to notebook directory)
        # Assumes 'config' is a subdirectory of the notebook's directory
        config_file_path = os.path.join('config', 'system_config.py')
        
        try:
            print(f"Attempting to update {config_file_path}...")
            with open(config_file_path, 'r') as f:
                lines = f.readlines()
            
            new_lines = []
            config_updated = False
            for line in lines:
                if "'AUDIO_DATA_DIRECTORY':" in line:
                    # Use a callable replacement so backslashes in the path are inserted literally
                    new_line = re.sub(r"('AUDIO_DATA_DIRECTORY':\s*r\")[^\"]*(\")", lambda m: m.group(1) + new_data_dir + m.group(2), line)
                    if new_line != line:
                        config_updated = True
                    new_lines.append(new_line)
                elif "'CACHE_DIRECTORY':" in line: 
                    new_line = re.sub(r"('CACHE_DIRECTORY':\s*r\")[^\"]*(\")", lambda m: m.group(1) + new_cache_dir + m.group(2), line)
                    if new_line != line:
                        config_updated = True
                    new_lines.append(new_line)
                else:
                    new_lines.append(line)
            
            if config_updated:
                with open(config_file_path, 'w') as f:
                    f.writelines(new_lines)
                print(f"Successfully updated {config_file_path} with new directory paths.")
            else:
                print(f"{config_file_path} already up-to-date or keys not found.")
            
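            # Reload the system_config module to apply changes.
            # import_module returns the module object; the bare name 'config' is not bound in this notebook.
            importlib.reload(importlib.import_module('config.system_config'))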
            # Re-import SC if it's used directly in this notebook, or ensure train_model gets the fresh one.
            # from config.system_config import SC 
            print("System configuration reloaded.")
            
        except Exception as e:
            print(f"Error updating or reloading system_config.py: {e}")
            print("Proceeding with potentially outdated configuration.")
            # Decide if you want to return or proceed if config update fails
            # return 

        selected_experiments = list(experiment_widget.value)
        if not selected_experiments:
            print("No experiment selected. Please select at least one experiment.")
            return
        
        for exp_name in selected_experiments:
            exp_config = next((exp for exp in EXPERIMENTS if exp["name"] == exp_name), None)
            if exp_config is None:
                print(f"Experiment {exp_name} not found in EXPERIMENTS.")
                continue

            print(f"Running experiment: {exp_config['name']}")
            # Pass configuration values to the train_model function.
            # train_model will use the reloaded system_config.SC for DATA_DIR and CACHE_DIR
            model, history = train_model(
                model_name=exp_config['model'],
                epochs=exp_config.get('epochs'),
                batch_size=exp_config.get('batch_size')
            )
            print(f"Training completed for experiment: {exp_config['name']}")
            if model is not None:
                model.summary(print_fn=print) # Ensure summary prints to output_area
            print("-" * 40)

run_selected_button.on_click(run_selected_experiments)

NameError: name 'run_selected_button' is not defined

## 6. View Previous Results (From here down, notebook is under development)

If you've already run experiments, you can view and analyse the results here.

In [None]:
# Under development
def load_results(output_dir=OUTPUT_DIR):
    # Check if results directory exists
    if not os.path.exists(output_dir):
        print(f"Results directory does not exist: {output_dir}")
        return None
    
    # Look for comparison report CSV
    csv_files = [f for f in os.listdir(output_dir) if f.startswith("comparison_results_") and f.endswith(".csv")]
    
    if not csv_files:
        print("No comparison results found. Run experiments or generate a report first.")
        return None
    
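    # Load the latest CSV file (filenames are compared alphabetically, so this
    # assumes the suffix after "comparison_results_" sorts chronologically)
    latest_csv = max(csv_files)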
    csv_path = os.path.join(output_dir, latest_csv)
    results_df = pd.read_csv(csv_path)
    
    print(f"Loaded results from: {csv_path}")
    return results_df

# Load and display results if available
results_df = load_results()
if results_df is not None:
    display(results_df)
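
`load_results` picks the newest report with `max(csv_files)`, which compares filenames alphabetically and so relies on the filename suffix sorting in chronological order. If that cannot be guaranteed, a sketch like the following selects by modification time instead. `latest_report` is a hypothetical helper, and the filename pattern is the one used above.

```python
import glob
import os


def latest_report(output_dir):
    """Return the most recently modified comparison CSV, or None if absent."""
    pattern = os.path.join(output_dir, "comparison_results_*.csv")
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Compare files by modification time, not by filename.
    return max(candidates, key=os.path.getmtime)


print(latest_report("."))
```

The returned path can be passed straight to `pd.read_csv`.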

## 7. Visualise Results

Create various visualisations to compare experiment results.

In [2]:
# Under development
# Taken from previously developed notebooks in the Machine Learning course

def visualize_results(results_df):
    if results_df is None or len(results_df) == 0:
        print("No results available to visualize.")
        return
    
    # Set the figure size for better visibility
    plt.figure(figsize=(14, 8))
    
    # Create accuracy comparison bar chart
    plt.subplot(2, 2, 1)
    sns.barplot(x='Experiment', y='Test Accuracy', data=results_df)
    plt.title('Test Accuracy by Experiment')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    
    # Create F1 score comparison bar chart
    plt.subplot(2, 2, 2)
    sns.barplot(x='Experiment', y='F1 Score', data=results_df)
    plt.title('F1 Score by Experiment')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    
    # Training time comparison
    plt.subplot(2, 2, 3)
    sns.barplot(x='Experiment', y='Training Time (min)', data=results_df)
    plt.title('Training Time by Experiment (minutes)')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    
    # Model comparison
    plt.subplot(2, 2, 4)
    model_comparison = results_df.groupby('Model')['Test Accuracy'].mean().reset_index()
    sns.barplot(x='Model', y='Test Accuracy', data=model_comparison)
    plt.title('Average Accuracy by Model')
    plt.tight_layout()
    
    plt.tight_layout(pad=3.0)
    plt.show()
    
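    # Create a separate visualization for augmentation impact.
    # NOTE: 'Audio Augmentation' and 'Image Augmentation' are assumed column
    # names; adjust them to match the comparison report CSV.
    aug_columns = [c for c in ('Audio Augmentation', 'Image Augmentation') if c in results_df.columns]
    if not aug_columns:
        print("No augmentation columns found; skipping augmentation plot.")
        return
    aug_df = results_df.melt(id_vars=['Test Accuracy'], value_vars=aug_columns,
                             var_name='Augmentation Type', value_name='Augmentation')
    aug_df = aug_df.rename(columns={'Test Accuracy': 'Accuracy'})
    plt.figure(figsize=(12, 6))
    sns.barplot(x='Augmentation', y='Accuracy', hue='Augmentation Type', data=aug_df)
    plt.title('Accuracy by Augmentation Type')
    plt.xticks(rotation=45, ha='right')
    plt.tight_layout()
    plt.show()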

# Visualize results if available
if results_df is not None:
    visualize_results(results_df)

NameError: name 'results_df' is not defined

## 8. Experiment Analysis and Conclusions
(if required)


