## Example 3: Inference, Number of trainable parameters, Number of FLOPs, Number of MACs

* Perform inference on audio data with trained and saved baseline and ESO models
* Compare the number of trainable parameters between baseline and ESO classifiers
* Compare the number of FLOPs between baseline and ESO classifiers
* Compare the number of MACs between baseline and ESO classifiers

### Importing required libraries

To calculate the number of trainable parameters, FLOPs, and MACs, we use the
[calflops package](https://github.com/MrYxJ/calculate-flops.pytorch). calflops requires the [transformers](https://pypi.org/project/transformers/) package to be installed first.

In [1]:
import torch

from eso.model.model import Model
from eso.utils.preprocessing import Preprocessing
from eso import ESO

import time
import numpy as np
import pandas as pd

from pathlib import Path

from sklearn.metrics import f1_score

from calflops import calculate_flops



### Helper Functions for Inference on saved models

The aim of this experiment is to perform inference on audio data using the saved baseline and ESO models.

In [22]:
# Use GPU if available
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"The program will use: {DEVICE}")

The program will use: cpu


In [23]:
# CONSTANTS
POSITIVE_CLASS = "gibbon"
NEGATIVE_CLASS = "no-gibbon"

PREPROCESSING_ARGS = {
    "lowpass_cutoff": 2000,
    "downsample_rate": 4800,
    "nyquist_rate": 2400,
    "segment_duration": 4,
    "nb_negative_class": 20,
    "file_type": "svl",
    "audio_extension": ".wav",
    "n_fft": 1024,
    "hop_length": 256,
    "n_mels": 128,
    "f_min": 4000,
    "f_max": 9000,
}


Specify the paths to the saved models and the audio data:

In [24]:
RESULTS_PATH = Path('/home/aaron-joel/Documents/Examples/results')
# saved baseline CNN model path
BASELINE_CNN_STATE_PATH = RESULTS_PATH / 'baseline_cnn_state.pth'
# saved ESO CNN model path
ESO_CNN_STATE_PATH = RESULTS_PATH / 'chromosome_cnn_state.pth'
# saved path of best performing chromosome
CHROMOSOME_PKL_PATH = RESULTS_PATH / 'eso_chromosome.pkl'
# Audio data folder (a small dataset for demonstration purposes only)
SPECIES_FOLDER = Path('/home/aaron-joel/Documents/Examples/SmallData')
# Text file containing the names of audio files for testing
AUDIO_NAMES_TXT = SPECIES_FOLDER / 'DataFiles' / 'test.txt'

In [25]:
AUDIO_NAMES_TXT

PosixPath('/home/aaron-joel/Documents/Examples/SmallData/DataFiles/test.txt')
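
Because these paths are specific to one machine, it can help to confirm that they exist before running inference. The following is a minimal sketch using only `pathlib`; `find_missing` is a hypothetical helper and not part of the eso package, and the paths in the demo are placeholders.

```python
from pathlib import Path


def find_missing(paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [Path(p) for p in paths if not Path(p).exists()]


# Placeholder paths for illustration; substitute the constants defined above.
candidates = [Path("results/baseline_cnn_state.pth"), Path("results/eso_chromosome.pkl")]
for path in find_missing(candidates):
    print(f"Missing: {path}")
```

Running this check once at the top of the notebook gives a clearer error than a failure deep inside model loading.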

In [26]:
# Helper function for prediction
def _predict(model, X, batch_size=128, device=DEVICE):
    prediction_list = []
    
    # convert input X into float tensor
    X_tensor = torch.from_numpy(X).float()
    
    # Check that input has proper shape/reshape
    if len(X_tensor.shape) == 3:
        X_tensor = X_tensor.unsqueeze(1)
        
    # create dataloader object
    loader = torch.utils.data.DataLoader(dataset=X_tensor, batch_size=batch_size, shuffle=False)
    
    # put model on device and set it on eval mode
    model = model.to(device)
    model.eval()
    
    # Perform the inference
    with torch.no_grad():
        for batch in loader:
            batch = batch.to(device)
            pred = model(batch)
            prediction_list.append(pred.cpu())
    softmax_prediction = [i.detach().numpy() for i in prediction_list]
    return np.vstack(softmax_prediction)
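
To make the batching in `_predict` concrete: with `shuffle=False` the loader walks through the samples in order, and the final batch holds whatever remains. The sketch below reproduces that slicing in plain Python without PyTorch; `batch_slices` is a hypothetical helper. The sample count of 867 is the number of test segments reported in the dataset creation output later in this example.

```python
def batch_slices(n_samples, batch_size):
    """Yield (start, stop) index pairs covering range(n_samples) in order."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


slices = list(batch_slices(n_samples=867, batch_size=128))
print(len(slices))                    # 7 batches
print(slices[-1][1] - slices[-1][0])  # 99 samples in the final batch
```

Because the order is preserved, stacking the per-batch outputs with `np.vstack` yields predictions aligned row by row with the input.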

In [27]:
# Helper function for creating dataset for selected model given 
# audio data and preprocessing arguments
def _create_data(model):
    """
    Create the dataset for the model

    Args:
        model: str, 'baseline' or 'eso'

    Returns:
        X: np.array, shape (n_samples, n_features)
        Y: np.array, shape (n_samples, )
        dataset_creation_time: float, time it took to create the dataset
    """
    start_time = time.time()
    if model == 'baseline':
        apply_preprocessing = True
    elif model == 'eso':
        apply_preprocessing = False
    else:
        raise ValueError("Model must be either 'baseline' or 'eso'")
    print(f"--- Creating Dataset for model: {model} ---")
    
    # Instantiate a Preprocessing object and create the dataset for chosen model
    preprocessor = Preprocessing(**PREPROCESSING_ARGS,
                                 apply_preprocessing=apply_preprocessing,
                                 species_folder=SPECIES_FOLDER,
                                 positive_class=POSITIVE_CLASS,
                                 negative_class=NEGATIVE_CLASS)
    
    X, Y = preprocessor.create_dataset(verbose=False,
                                       file_names=AUDIO_NAMES_TXT,
                                       augmentation=False,
                                       annotation_folder="Annotations",
                                       sufix_file='.svl')
    
    dataset_creation_time = time.time() - start_time
    print(f"--- Dataset created in {dataset_creation_time} seconds ---")
    return (X, Y, dataset_creation_time)
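
The helper above times itself with `time.time()`. For short intervals, `time.perf_counter()` is generally a better clock because it is monotonic and has higher resolution. The following is a minimal sketch of a reusable timer; the `Timer` class is hypothetical and not part of the eso package, and the summation merely stands in for real work.

```python
import time


class Timer:
    """Context manager that records elapsed wall-clock time in seconds."""

    def __enter__(self):
        self.elapsed = None
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        return False  # do not suppress exceptions raised inside the block


with Timer() as timer:
    total = sum(range(100_000))  # stand-in for dataset creation

print(f"--- Finished in {timer.elapsed:.3f} seconds ---")
```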

In [28]:
# Use the _predict helper in this function to make model predictions
def model_prediction(X, model, batch_size=128, device=DEVICE):
    '''
    This function is used to make prediction given input data (X),
    and selected model.

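    Args:
        X: np.array shape (n_samples, n_features)
        model: str, 'baseline' or 'eso'
        batch_size: int, number of samples per inference batch
        device: torch.device, 'cpu' or 'cuda'

    Returns:
        Y_pred: np.array shape (n_samples, n_classes), per-class model outputs
        prediction_time: float, time in seconds to load the model and predict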
    '''
    # starting time
    start_time = time.time()
    # Model loading depends on selected model
    if model == 'baseline':
        baseline_model = Model.load_cnn(cnn_dict=BASELINE_CNN_STATE_PATH, device=device)
        Y_pred = _predict(model=baseline_model, X=X, batch_size=batch_size, device=device)
        prediction_time = time.time() - start_time
        print(f"--- Predicted Baseline in {prediction_time} seconds ---")
        return (Y_pred, prediction_time)
    elif model == 'eso':
        eso_model = Model.load_cnn(cnn_dict=ESO_CNN_STATE_PATH, device=device)
        # load eso chromosome from file to numpy and use it to create relevant dataset
        eso_chromosome = np.load(CHROMOSOME_PKL_PATH, allow_pickle=True)
        X = eso_chromosome._create_dataset(X)
        # Make prediction for 'eso-created dataset X'
        Y_pred = _predict(model=eso_model, X=X, batch_size=batch_size, device=device)
        prediction_time = time.time() - start_time
        print(f"--- Predicted ESO in {prediction_time} seconds ---")
        return (Y_pred, prediction_time)
    else:
        raise ValueError("Model must be either 'baseline' or 'eso'")

### Helper Functions for FLOPs, MACs, and Number of Trainable Parameters

In [29]:
def calc_flops(model, batch_size=1, device=DEVICE):
    '''
    This function takes the model type and batch_size as inputs and
    returns the number of flops, macs and trainable parameters.

    Args:
        model: str, 'baseline' or 'eso'
        device: torch.device,  'cpu' or 'cuda'
        batch_size: int, default to 1

    Returns:
        flops: number of floating-point operations (a count, not a per-second rate)
        macs: number of multiply-accumulate operations
        params: int, number of trainable parameters
    '''
    if model == 'baseline':
        baseline_model = Model.load_cnn(cnn_dict=BASELINE_CNN_STATE_PATH, device=device)
        input_shape = (batch_size, *baseline_model.input_shape)
        # Use calculate_flops from calflops to perform the calc
        flops, macs, params = calculate_flops(model=baseline_model,
                                              input_shape=input_shape,
                                              print_results=False,
                                              output_as_string=False)
        return (flops, macs, params)
    elif model == 'eso':
        eso_model = Model.load_cnn(cnn_dict=ESO_CNN_STATE_PATH, device=device)
        input_shape = (batch_size, *eso_model.input_shape)
        flops, macs, params = calculate_flops(model=eso_model,
                                              input_shape=input_shape,
                                              print_results=False,
                                              output_as_string=False)
        return (flops, macs, params)
    else:
        raise ValueError("Model must be either 'baseline' or 'eso'")
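
As a rough cross-check on what calflops reports, the parameter and MAC counts of a single convolutional layer can be worked out by hand. The sketch below uses the standard formulas for a 2D convolution with bias and a square kernel. The layer sizes are made up for illustration and are not the architecture of the baseline or ESO model. Counting one MAC as two FLOPs (a multiply and an add) is a common convention; the total FLOPs of a whole network are typically somewhat higher than twice its MACs, because activations, pooling, and bias additions contribute operations that are not MACs.

```python
def conv2d_counts(in_channels, out_channels, kernel_size, out_height, out_width):
    """Return (params, macs) for one Conv2d layer with bias and a square kernel."""
    weights_per_filter = in_channels * kernel_size * kernel_size
    params = out_channels * (weights_per_filter + 1)  # +1 for each filter's bias
    macs = out_channels * out_height * out_width * weights_per_filter
    return params, macs


# Hypothetical layer: 1 input channel, 8 filters of 3x3, 32x32 output map.
params, macs = conv2d_counts(1, 8, 3, 32, 32)
print(params)    # 80
print(macs)      # 73728
print(2 * macs)  # 147456 FLOPs if one MAC is counted as two FLOPs
```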

### Main function for running inference and calculating FLOPs, MACs, and number of parameters

In [32]:
def main():
    ## Inference with baseline
    
    # Step 1: create dataset
    X, Y, dataset_creation_time_base = _create_data('baseline')
    # Step 2: Make the predictions
    Y_pred_base, prediction_time_base = model_prediction(X=X, model='baseline', batch_size=128, device=DEVICE)
    
    # Step 3: Encoding (POSITIVE_CLASS ('gibbon') -> 1, NEGATIVE_CLASS ('no-gibbon') -> 0)
    Y[Y == POSITIVE_CLASS] = 1
    Y[Y == NEGATIVE_CLASS] = 0
    Y = Y.astype(int)
    
    # Step 4: Turn prob into preds (0, 1) and compute F1-score
    Y_pred_base = np.argmax(Y_pred_base, axis=1)
    f1_base = f1_score(y_true=Y, y_pred=Y_pred_base)
    
    # Step 5: Calculate FLOPs, MACs, Number of params
    flops_base, macs_base, params_base = calc_flops(model='baseline', batch_size=1, device=DEVICE)

    # Delete X, Y to save memory
    del X, Y

    ## Inference on ESO
    X, Y, dataset_creation_time_eso = _create_data('eso')
    Y_pred_eso, prediction_time_eso = model_prediction(X=X, model='eso', batch_size=128, device=DEVICE)

    # The labels are only used for performance metrics calculation here
    Y[Y == POSITIVE_CLASS] = 1
    Y[Y == NEGATIVE_CLASS] = 0
    Y = Y.astype(int)

    Y_pred_eso = np.argmax(Y_pred_eso, axis=1)
    f1_eso = f1_score(y_true=Y, y_pred=Y_pred_eso)

    flops_eso, macs_eso, params_eso = calc_flops(model='eso', batch_size=1, device=DEVICE)
    del X, Y

    ## Save results to Pandas DataFrame
    df = pd.DataFrame()
    df['Durations'] = ["Dataset Creation", "Prediction", "F1 Score", "Flops", "Macs", "Params"]
    df['Baseline'] = [dataset_creation_time_base, prediction_time_base, f1_base,
                      flops_base, macs_base, params_base]
    df['ESO'] = [dataset_creation_time_eso, prediction_time_eso, f1_eso, 
                 flops_eso, macs_eso, params_eso]

    # Percentage reduction relative to baseline (negative means the ESO value is higher)
    df['Reduction'] = ((df['Baseline'] - df['ESO']) / df['Baseline']) * 100
    
    # Store results
    df.to_csv('inference.csv')

    return df
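
Steps 3 and 4 of `main` can be traced on a toy example. The sketch below encodes string labels as 0/1, takes the argmax over made-up two-class outputs, and computes the F1 score by hand from true positives, false positives, and false negatives. It uses plain Python rather than NumPy and scikit-learn, and all values are illustrative. It assumes that column 1 of the model output corresponds to the positive class, as the encoding in `main` implies.

```python
POSITIVE_CLASS = "gibbon"

labels = ["gibbon", "no-gibbon", "gibbon", "no-gibbon", "gibbon"]
probs = [[0.2, 0.8], [0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.1, 0.9]]

# Encode labels: positive class -> 1, everything else -> 0
y_true = [1 if label == POSITIVE_CLASS else 0 for label in labels]
# Argmax over the two class outputs gives the predicted class index
y_pred = [row.index(max(row)) for row in probs]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

# F1 = 2TP / (2TP + FP + FN); taken as 0 here when the denominator is zero
f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
print(y_pred, round(f1, 3))  # [1, 0, 0, 1, 1] 0.667
```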

### Testing everything

In [33]:
df = main()

--- Creating Dataset for model: baseline ---
Found file HGSM3AB_0+1_20160305_055900
867
(array(['gibbon', 'no-gibbon'], dtype='<U9'), array([178, 689]))
--- Dataset created in 21.828219890594482 seconds ---
--- Predicted Baseline in 0.6067299842834473 seconds ---
--- Creating Dataset for model: eso ---
Found file HGSM3AB_0+1_20160305_055900
867
(array(['gibbon', 'no-gibbon'], dtype='<U9'), array([178, 689]))
--- Dataset created in 12.603198051452637 seconds ---
--- Predicted ESO in 0.17692184448242188 seconds ---


In [34]:
df

| | Durations | Baseline | ESO | Reduction |
|---|---|---|---|---|
| 0 | Dataset Creation | 21.82822 | 12.6032 | 42.261906 |
| 1 | Prediction | 0.60673 | 0.1769218 | 70.840102 |
| 2 | F1 Score | 0.8516746 | 0.8636364 | -1.404494 |
| 3 | Flops | 9013114.0 | 2943298.0 | 67.344272 |
| 4 | Macs | 4406336.0 | 1438784.0 | 67.347383 |
| 5 | Params | 132234.0 | 38538.0 | 70.856209 |
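
The `Reduction` column is the relative change with respect to the baseline, in percent, so positive values mean the ESO figure is smaller. For the F1 score, where higher is better, the negative value therefore indicates a small improvement rather than a loss. The arithmetic can be checked directly from the values in the table; `percent_reduction` is a helper name introduced here for illustration.

```python
def percent_reduction(baseline, eso):
    """Percentage by which `eso` is lower than `baseline`."""
    return (baseline - eso) / baseline * 100


params_reduction = percent_reduction(132234, 38538)
f1_reduction = percent_reduction(0.8516746, 0.8636364)

print(f"Params: {params_reduction:.2f}%")  # Params: 70.86%
print(f"F1:     {f1_reduction:.2f}%")      # F1:     -1.40%
```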


In [46]:
# Number of trainable parameters
df.loc[5]

Durations       Params
Baseline      132234.0
ESO            38538.0
Reduction    70.856209
Name: 5, dtype: object

In [47]:
# Number of FLOPs
df.loc[3]

Durations        Flops
Baseline     9013114.0
ESO          2943298.0
Reduction    67.344272
Name: 3, dtype: object

In [48]:
# Number of MACs
df.loc[4]

Durations         Macs
Baseline     4406336.0
ESO          1438784.0
Reduction    67.347383
Name: 4, dtype: object