# 📊 Model Efficiency Analysis

**Comprehensive efficiency analysis with automated table and chart generation**

This notebook analyzes model performance metrics including:
- CPU/GPU latency and throughput
- Memory usage (RAM/VRAM)
- Power consumption
- Model size and parameter count

Outputs publication-ready **LaTeX tables** and **efficiency plots**.

In [1]:
# Import the project's analysis utilities
import sys
from pathlib import Path
import pandas as pd

# Make the parent directory importable so the `utils` package resolves
sys.path.append(str(Path.cwd().parent))

from utils.enhanced_data_loader import EnhancedEfficiencyDataLoader
from utils.analysis_utils import (
    calculate_energy_metrics,
    calculate_inference_summary,
    create_efficiency_table,
    create_inference_plots,
    generate_efficiency_report,
)
from utils.training_analysis import TrainingAnalyzer
from utils.latex_table_generator import (
    create_corrected_latex_table,
    create_real_data_latex_table,
    generate_all_tables,
)

print("✅ All working modules imported successfully!")

✅ All working modules imported successfully!


## 🎯 Step 1: Load Data

First, load the experimental data:

In [130]:
# Load actual data using the working loader
BASE_PATH = Path.cwd().parent
loader = EnhancedEfficiencyDataLoader(BASE_PATH)
inference_data = loader.parse_all_data()
print(f"Loaded {len(inference_data)} inference records")

🔍 Scanning for experiment files...
🎯 Using inference files for //home/amma/LLM-TIME/efficiency_experiments/experiments/time_llm_inference_ohiot1dm/seed_831363_model_LLAMA_dim_4096_seq_6_context_6_pred_6_patch_6_epochs_0/patient_570/logs/real_performance_reports
🎯 Using inference files for //home/amma/LLM-TIME/efficiency_experiments/experiments/time_llm_inference_ohiot1dm/seed_831363_model_GPT2_dim_768_seq_6_context_6_pred_6_patch_6_epochs_0/patient_570/logs/real_performance_reports
🎯 Using inference files for //home/amma/LLM-TIME/efficiency_experiments/experiments/time_llm_inference_ohiot1dm/seed_831363_model_BERT_dim_768_seq_6_context_6_pred_6_patch_6_epochs_0/patient_570/logs/real_performance_reports
🎯 Using inference files for //home/amma/LLM-TIME/efficiency_experiments/experiments/distillation_inference_ohiot1dm/seed_831363_model_tinybert/patient_570/logs/real_performance_reports
🎯 Using inference files for //home/amma/LLM-TIME/efficiency_experiments/experiments/t
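
The run directories in the log above encode each experiment's settings as alternating key/value tokens (for example `model_LLAMA_dim_4096`). As a minimal sketch, assuming every name follows that alternating pattern and that values contain no underscores, a hypothetical helper `parse_experiment_dir` (not part of `EnhancedEfficiencyDataLoader`) could recover those settings:

```python
def parse_experiment_dir(name: str) -> dict:
    """Split a run directory name into its key/value settings.

    Assumes the name is a sequence of alternating key and value tokens
    joined by underscores, and that no value contains an underscore.
    """
    tokens = name.split("_")
    if len(tokens) % 2 != 0:
        raise ValueError(f"Expected alternating key/value tokens, got: {name!r}")

    settings = {}
    for key, value in zip(tokens[::2], tokens[1::2]):
        # Keep numeric settings as integers, everything else as text
        settings[key] = int(value) if value.isdigit() else value
    return settings


llama_run = parse_experiment_dir(
    "seed_831363_model_LLAMA_dim_4096_seq_6_context_6_pred_6_patch_6_epochs_0"
)
tinybert_run = parse_experiment_dir("seed_831363_model_tinybert")

print(llama_run["model"], llama_run["dim"])
print(tinybert_run)
```

Directory names that break the pattern raise a `ValueError` instead of being silently misparsed, so a real loader would need its own handling for them.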

## 🎯 Step 2: Calculate Metrics

Summarize the inference records into one row of performance metrics per model:

In [131]:
# Calculate inference summary using working function
inference_metrics = calculate_inference_summary(inference_data)
print("Inference summary calculated!")

Inference summary calculated!


## 🎯 Step 3: Load Additional Data

Load the training-efficiency records:

In [132]:
# Load training data using working analyzer
training_analyzer = TrainingAnalyzer(BASE_PATH)
training_data = training_analyzer.load_training_efficiency_data()
print(f"Loaded training data: {len(training_data)} records")

Loaded training data: 10 records


## 🎯 Step 4: Generate Plots and Tables

Create the inference plots, energy metrics, LaTeX tables, and summary report:

In [133]:
# Create inference plots with standardized model names (saves to file only, no display)
create_inference_plots(inference_data, save_path="outputs/clean_inference_plots.png")

📊 Plotting 6 unique models: ['Chronos-T5-Base', 'Chronos-T5-Tiny', 'Time-LLM-BERT', 'Time-LLM-GPT-2', 'Time-LLM-LLaMA', 'Time-LLM-TinyBERT (Distilled)']
📊 Dashboard plot saved to: outputs/clean_inference_plots_dashboard.png
📊 Detailed analysis plot saved to: outputs/clean_inference_plots_detailed.png


In [134]:
# Calculate energy metrics using working function (needs inference summary first)
energy_metrics = calculate_energy_metrics(inference_metrics)
print("Energy metrics calculated!")

Energy metrics calculated!


In [135]:
# Generate LaTeX tables using working functions
latex_tables = generate_all_tables(inference_data, training_data)
print("✅ LaTeX tables generated!")
for name, path in latex_tables.items():
    print(f"  {name}: {path}")

📊 Generating comprehensive standardized table...
✅ Table generated: /home/amma/LLM-TIME/notebooks/outputs/latex_tables/comprehensive_standardized_metrics.tex
✅ LaTeX tables generated!
  comprehensive_standardized: /home/amma/LLM-TIME/notebooks/outputs/latex_tables/comprehensive_standardized_metrics.tex
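
The internals of `generate_all_tables` are not shown in this notebook. As a rough illustration of what turning metric rows into a LaTeX table involves, here is a minimal sketch; `escape_latex` and `rows_to_latex` are hypothetical helpers, not functions from `utils.latex_table_generator`, and the sample rows reuse values from this notebook's efficiency table.

```python
def escape_latex(text: str) -> str:
    """Escape characters that commonly break LaTeX table cells."""
    for char in "&%_#":
        text = text.replace(char, "\\" + char)
    return text


def rows_to_latex(headers, rows, float_format="{:.2f}"):
    """Render headers and rows as a simple LaTeX tabular environment."""

    def format_cell(cell):
        if cell is None:
            return "--"  # placeholder for a missing measurement
        if isinstance(cell, float):
            return float_format.format(cell)
        return escape_latex(str(cell))

    # First column left-aligned (names), remaining columns right-aligned (numbers)
    column_spec = "l" + "r" * (len(headers) - 1)
    lines = [
        "\\begin{tabular}{" + column_spec + "}",
        "\\hline",
        " & ".join(escape_latex(h) for h in headers) + " \\\\",
        "\\hline",
    ]
    for row in rows:
        lines.append(" & ".join(format_cell(cell) for cell in row) + " \\\\")
    lines += ["\\hline", "\\end{tabular}"]
    return "\n".join(lines)


table = rows_to_latex(
    ["model_name", "avg_inference_time_ms", "model_size_mb"],
    [
        ("Chronos-T5-Base", 84.12, 768.18),
        ("Chronos-T5-Tiny", 37.16, 32.02),
        ("Time-LLM-LLaMA", None, 25339.14),
    ],
)
print(table)
```

Escaping underscores matters here because the column names use snake_case, which LaTeX would otherwise read as subscripts outside math mode.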


In [136]:
# Create efficiency comparison table using working function
comparison_table = create_efficiency_table(inference_data)
print(f"✅ Efficiency table created: {comparison_table.shape}")
comparison_table

✅ Efficiency table created: (44, 7)


| | model_name | avg_inference_time_ms | inference_peak_ram_mb | inference_avg_power_w | total_parameters | model_size_mb | edge_feasibility |
|---|---|---|---|---|---|---|---|
| 0 | Chronos-T5-Base | 84.12 | 843.79 | | 201374976 | 768.18 | feasible |
| 1 | Chronos-T5-Base | 84.12 | 843.79 | | 201374976 | 768.18 | feasible |
| 2 | Chronos-T5-Tiny | 37.16 | 842.53 | | 8394496 | 32.02 | feasible |
| 3 | Chronos-T5-Tiny | 37.16 | 842.53 | | 8394496 | 32.02 | feasible |
| 4 | Time-LLM-LLaMA | | 26104.16 | | 6642504366 | 25339.14 | requires_optimization |
| 5 | Time-LLM-LLaMA | | 26104.16 | | 6642504366 | 25339.14 | requires_optimization |
| 6 | Time-LLM-GPT-2 | | 2199.79 | | 317055766 | 1209.47 | feasible |
| 7 | Time-LLM-GPT-2 | | 2199.79 | | 317055766 | 1209.47 | feasible |
| 8 | Time-LLM-BERT | | 1968.51 | | 282363198 | 1077.13 | feasible |
| 9 | Time-LLM-BERT | | 1968.51 | | 282363198 | 1077.13 | feasible |
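
The efficiency table has one row per record (44 in total), so each model appears several times, and the rows shown above are exact repeats. A minimal sketch of collapsing such rows to one per model follows, using plain Python so it runs without the project's `utils` package; `one_row_per_model` is a hypothetical helper and keeps the first row it sees for each model.

```python
def one_row_per_model(rows):
    """Keep the first record seen for each model_name, preserving order."""
    unique = {}
    for row in rows:
        # setdefault only stores the row if the model has not been seen yet
        unique.setdefault(row["model_name"], row)
    return list(unique.values())


# A few rows copied from the table above (None marks a missing timing value)
records = [
    {"model_name": "Chronos-T5-Base", "avg_inference_time_ms": 84.12, "model_size_mb": 768.18},
    {"model_name": "Chronos-T5-Base", "avg_inference_time_ms": 84.12, "model_size_mb": 768.18},
    {"model_name": "Chronos-T5-Tiny", "avg_inference_time_ms": 37.16, "model_size_mb": 32.02},
    {"model_name": "Chronos-T5-Tiny", "avg_inference_time_ms": 37.16, "model_size_mb": 32.02},
    {"model_name": "Time-LLM-LLaMA", "avg_inference_time_ms": None, "model_size_mb": 25339.14},
    {"model_name": "Time-LLM-LLaMA", "avg_inference_time_ms": None, "model_size_mb": 25339.14},
]

deduplicated = one_row_per_model(records)
for row in deduplicated:
    print(row["model_name"], row["model_size_mb"])
```

On the DataFrame itself, `comparison_table.drop_duplicates(subset="model_name")` would have the same effect. Keeping only the first row is safe only when the repeated rows are identical, as they are in the excerpt above.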


In [137]:
# Generate comprehensive efficiency analysis report
summary_report = generate_efficiency_report(inference_data)
print("✅ Summary report generated!")
print("=" * 50)
print(summary_report)

✅ Summary report generated!
# 🚀 LLM Performance Analysis Report

## 📊 Dataset Overview
- **Total Records**: 44
- **Unique Models**: 6
- **Experiment Types**: 5
- **Records with Timing Data**: 28
- **Records with Power Data**: 20

## ⚡ Performance Summary
- **Fastest Model**: Time-LLM-TinyBERT (Distilled) (1271.3ms)
- **Most Memory Efficient**: Time-LLM-TinyBERT (Distilled) (1002.3MB)

## 📱 Edge Deployment Readiness
- 🟡 **Feasible**: 5 models
- 🔴 **Challenging**: 1 models

---
*Report generated by Enhanced LLM Efficiency Analysis Tool*
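
The overview counts in this report can be cross-checked against the per-model inference summary table shown later in this notebook. The sketch below copies the `total_records`, `records_with_timing`, `records_with_power`, and `edge_feasibility` values from that table by hand and re-derives the totals; it does not call `generate_efficiency_report`.

```python
from collections import Counter

# (total_records, records_with_timing, records_with_power, edge_feasibility),
# copied from the inference summary table in this notebook
per_model = {
    "Chronos-T5-Base": (7, 7, 3, "feasible"),
    "Chronos-T5-Tiny": (7, 7, 3, "feasible"),
    "Time-LLM-LLaMA": (8, 4, 4, "challenging"),
    "Time-LLM-GPT-2": (8, 4, 4, "feasible"),
    "Time-LLM-BERT": (10, 4, 4, "feasible"),
    "Time-LLM-TinyBERT (Distilled)": (4, 2, 2, "feasible"),
}

total_records = sum(row[0] for row in per_model.values())
records_with_timing = sum(row[1] for row in per_model.values())
records_with_power = sum(row[2] for row in per_model.values())
feasibility_counts = Counter(row[3] for row in per_model.values())

print(f"Total records: {total_records}")
print(f"Records with timing data: {records_with_timing}")
print(f"Records with power data: {records_with_power}")
print(f"Edge feasibility: {dict(feasibility_counts)}")
```

The sums agree with the report: 44 records, 28 with timing data, 20 with power data, and five feasible models against one challenging model.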


## 🎯 Step 5: Load Distillation Results and Review Tables

Load the distillation results, then inspect the loaded data and the computed metric tables:

In [138]:
# Load distillation results using working analyzer
distillation_data = training_analyzer.load_distillation_results()
print(f"Loaded distillation data: {len(distillation_data)} records")

📊 Found distillation results at: /home/amma/LLM-TIME/efficiency_experiments/distillation_experiments/pipeline_results.csv
Loaded distillation data: 1 records


In [139]:
# Show the loaded inference data
print(f"✅ Inference data loaded: {inference_data.shape}")
inference_data.head()

✅ Inference data loaded: (44, 49)


Unnamed: 0,experiment_type,model_name,mode,report_type,file_path,total_parameters,model_size_mb,model_dtype,model_architecture,avg_inference_time_ms,...,gpu_reserved_mb,nvidia_system_vram_mb,nvidia_avg_system_vram_mb,peak_memory_utilization_percent,average_memory_utilization_percent,average_temperature_celsius,gpu_measurement_count,nvidia_ml_measurement_count,system_average_ram_mb,ram_measurements_count
0,chronos_inference_ohiot1dm,chronos-t5-base,inference,efficiency_reports,/home/amma/LLM-TIME/efficiency_experiments/exp...,201374976,768.18457,torch.float32,ChronosModel,84.120291,...,,,,,,,,,,
1,chronos_inference_ohiot1dm,chronos-t5-base,inference,efficiency_reports,/home/amma/LLM-TIME/efficiency_experiments/exp...,201374976,768.18457,torch.float32,ChronosModel,84.120291,...,,,,,,,,,,
2,chronos_inference_ohiot1dm,chronos-t5-tiny,inference,efficiency_reports,/home/amma/LLM-TIME/efficiency_experiments/exp...,8394496,32.022461,torch.float32,ChronosModel,37.162438,...,,,,,,,,,,
3,chronos_inference_ohiot1dm,chronos-t5-tiny,inference,efficiency_reports,/home/amma/LLM-TIME/efficiency_experiments/exp...,8394496,32.022461,torch.float32,ChronosModel,37.162438,...,,,,,,,,,,
4,time_llm_inference_ohiot1dm,LLAMA,inference,efficiency_reports,/home/amma/LLM-TIME/efficiency_experiments/exp...,6642504366,25339.143242,torch.float32,Model,,...,,,,,,,,,,


In [140]:
# Show energy metrics table
print(f"✅ Energy metrics: {energy_metrics.shape}")
energy_metrics

✅ Energy metrics: (6, 5)


| | model_name | avg_power_w | energy_per_prediction_wh | daily_energy_moderate_wh | carbon_per_prediction_g |
|---|---|---|---|---|---|
| 0 | Chronos-T5-Base | 141.282153 | 2.272605 | 2272.605198 | 1.136303 |
| 1 | Chronos-T5-Tiny | 109.081088 | 0.83805 | 838.049538 | 0.419025 |
| 2 | Time-LLM-LLaMA | 185.961041 | 25.226457 | 25226.457354 | 12.613229 |
| 3 | Time-LLM-GPT-2 | 175.197675 | 15.17055 | 15170.550003 | 7.585275 |
| 4 | Time-LLM-BERT | 164.152812 | 9.49436 | 9494.360427 | 4.74718 |
| 5 | Time-LLM-TinyBERT (Distilled) | 55.2217 | 0.019501 | 19.500974 | 0.00975 |
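
The energy columns above are consistent with a simple calculation: energy per prediction is average power multiplied by the average inference time from the inference summary, converted to watt-hours. The daily figure equals 1000 predictions, and the carbon figure equals 0.5 g per Wh. Both constants are inferred here from the ratios between columns, not read from the source of `calculate_energy_metrics`. The sketch below, with the hypothetical name `estimate_energy`, reproduces two rows under those assumptions.

```python
def estimate_energy(avg_power_w, avg_inference_time_ms,
                    predictions_per_day=1000, carbon_g_per_wh=0.5):
    """Estimate per-prediction energy, daily energy, and carbon.

    predictions_per_day and carbon_g_per_wh are assumptions inferred from
    the ratios between columns in the table, not confirmed constants.
    """
    seconds = avg_inference_time_ms / 1000.0
    energy_wh = avg_power_w * seconds / 3600.0  # W * s = J, and 3600 J = 1 Wh
    return {
        "energy_per_prediction_wh": energy_wh,
        "daily_energy_moderate_wh": energy_wh * predictions_per_day,
        "carbon_per_prediction_g": energy_wh * carbon_g_per_wh,
    }


# avg_power_w from the energy table, avg_inference_time_ms from the inference summary
chronos_base = estimate_energy(141.282153, 57908.08334)
tinybert = estimate_energy(55.2217, 1271.302853)

for name, metrics in [("Chronos-T5-Base", chronos_base), ("TinyBERT", tinybert)]:
    print(name, {key: round(value, 6) for key, value in metrics.items()})
```

Because energy scales linearly with latency, a model's energy cost here is driven as much by its average inference time as by its power draw.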


In [141]:
# Show inference summary table
print(f"✅ Inference summary: {inference_metrics.shape}")
inference_metrics

✅ Inference summary: (6, 31)


Unnamed: 0,model_name,total_records,records_with_timing,records_with_memory,records_with_power,avg_inference_time_ms,min_inference_time_ms,max_inference_time_ms,std_inference_time_ms,throughput_predictions_per_sec,...,estimated_gpu_latency_ms,peak_power_usage_watts,peak_gpu_utilization_percent,average_gpu_utilization_percent,nvidia_system_vram_mb,nvidia_avg_system_vram_mb,total_parameters,model_size_mb,model_architecture,edge_feasibility
0,Chronos-T5-Base,7,7,7,3,57908.08334,82.726276,390011.057777,146485.441696,0.017269,...,20.14,220.465,48.333333,19.053384,6678.53125,5188.678862,201374976,768.18457,ChronosModel,feasible
1,Chronos-T5-Tiny,7,7,7,3,27658.124667,35.835989,179945.969535,67227.326547,0.036156,...,0.84,158.254667,41.666667,4.633381,1919.53125,1556.946576,8394496,32.022461,ChronosModel,feasible
2,Time-LLM-LLaMA,8,4,8,4,488356.303489,91435.458492,885277.148487,458324.713413,0.002048,...,664.25,208.105,79.5,45.554722,34006.875,33654.572793,6642504366,25339.143242,Model,challenging
3,Time-LLM-GPT-2,8,4,8,4,311727.767871,2425.929184,621029.606558,357150.999654,0.003208,...,31.71,200.021,47.0,36.444844,6526.875,6433.397086,317055766,1209.471764,Model,feasible
4,Time-LLM-BERT,10,4,10,4,208218.775673,2250.552981,414186.998365,237831.617631,0.004803,...,28.24,201.442,50.0,30.245594,6547.875,6455.541232,282363198,1077.130119,Model,feasible
5,Time-LLM-TinyBERT (Distilled),4,2,4,2,1271.302853,1271.302853,1271.302853,0.0,0.786595,...,4.5,77.877,14.0,1.7,1199.875,1120.14375,44998814,171.656853,Model,feasible
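
In the summary above, `throughput_predictions_per_sec` matches `1000 / avg_inference_time_ms` for every model (for example, 1000 / 1271.302853 ≈ 0.786595). A minimal sketch of such a per-model summary follows; `summarize_latencies` is a hypothetical stand-in for what `calculate_inference_summary` might compute. The two TinyBERT latencies are implied by the table, since two timed records with identical minimum, maximum, and mean must be equal.

```python
import statistics


def summarize_latencies(latencies_ms):
    """Summarize one model's per-record inference times (milliseconds)."""
    avg_ms = statistics.fmean(latencies_ms)
    # Sample standard deviation needs at least two values
    std_ms = statistics.stdev(latencies_ms) if len(latencies_ms) > 1 else 0.0
    return {
        "records_with_timing": len(latencies_ms),
        "avg_inference_time_ms": avg_ms,
        "min_inference_time_ms": min(latencies_ms),
        "max_inference_time_ms": max(latencies_ms),
        "std_inference_time_ms": std_ms,
        # Predictions per second if each prediction takes avg_ms
        "throughput_predictions_per_sec": 1000.0 / avg_ms,
    }


tinybert_summary = summarize_latencies([1271.302853, 1271.302853])
print(tinybert_summary)
```

A mean-based throughput is sensitive to outliers. For Chronos-T5-Base the table reports a minimum of about 82.7 ms but a maximum of about 390011 ms, so the average of about 57908 ms says little about a typical prediction.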


## 📊 Quick Data Inspection

Preview the training and distillation data:

In [142]:
# Show training data with pandas display
print("✅ Training data:")
if isinstance(training_data, pd.DataFrame) and not training_data.empty:
    display(training_data.head())
else:
    print(f"Training data: {type(training_data)}")

print("✅ Distillation data:")
if isinstance(distillation_data, pd.DataFrame) and not distillation_data.empty:
    display(distillation_data.head())
else:
    print(f"Distillation data: {type(distillation_data)}")

✅ Training data:


Unnamed: 0,experiment_type,file_path,model_name,training_time_hours,training_time_minutes,epochs_completed,final_train_loss,final_val_loss,best_train_loss,best_val_loss,peak_ram_mb,avg_power_w,peak_power_w,peak_gpu_mb,avg_gpu_util_percent
0,training,/home/amma/LLM-TIME/efficiency_experiments/exp...,chronos-t5-base,,,,,,,,1531.207031,207.952649,377.355,776.30957,51.754746
1,training,/home/amma/LLM-TIME/efficiency_experiments/exp...,chronos-t5-tiny,,,,,,,,1531.285156,127.787712,259.712,40.147461,12.288202
2,training,/home/amma/LLM-TIME/efficiency_experiments/exp...,BERT,,,,,,,,2742.804688,134.960307,230.475,1832.453125,10.80292
3,training,/home/amma/LLM-TIME/efficiency_experiments/exp...,GPT2,,,,,,,,2730.183594,167.7336,292.939,2206.436523,25.664516
4,training,/home/amma/LLM-TIME/efficiency_experiments/exp...,LLAMA,,,,,,,,1907.636719,300.858576,356.82,39177.378906,76.19697


✅ Distillation data:


Unnamed: 0,timestamp,pipeline_id,pipeline_dir,patient_ids,dataset_name,seed,teacher_model,student_model,learning_rate,batch_size,...,distilled_mape,distillation_time,distillation_status,teacher_to_distilled_rmse_improvement,teacher_to_distilled_rmse_improvement_pct,student_to_distilled_rmse_improvement,student_to_distilled_rmse_improvement_pct,pipeline_status,total_runtime,notes
0,2025-10-21 08:02:17,patient_570,distillation_experiments/pipeline_runs/pipelin...,570,ohiot1dm,831363,bert-base-uncased,prajjwal1/bert-tiny,0.001,32,...,5.579967,,completed,0.097586,0.572448,-0.306728,-1.843003,SUCCESS,451.0,Patient 570 complete 3-phase pipeline run
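
The `*_rmse_improvement` and `*_rmse_improvement_pct` columns read most naturally as the drop in RMSE from a baseline model to the distilled model, in absolute terms and as a percentage of the baseline. Under that assumed definition a negative value, as in `student_to_distilled_rmse_improvement`, means the distilled model has the higher error. The sketch below uses a hypothetical helper `rmse_improvement` and made-up RMSE values, because the actual RMSE columns are elided in the output above.

```python
def rmse_improvement(baseline_rmse, distilled_rmse):
    """Return (absolute, percent) RMSE improvement of distilled over baseline.

    Assumed convention: positive means the distilled model has lower error.
    """
    absolute = baseline_rmse - distilled_rmse
    percent = 100.0 * absolute / baseline_rmse
    return absolute, percent


# Illustrative values only, not results from this experiment
better_abs, better_pct = rmse_improvement(baseline_rmse=20.0, distilled_rmse=19.0)
worse_abs, worse_pct = rmse_improvement(baseline_rmse=16.0, distilled_rmse=16.4)

print(f"Distilled better: {better_abs:.3f} ({better_pct:.2f}%)")
print(f"Distilled worse:  {worse_abs:.3f} ({worse_pct:.2f}%)")
```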


In [143]:
# Debug TinyBERT GPU values
print("Debugging TinyBERT GPU measurement extraction...")

# inference_data is a DataFrame, so select rows by model_name;
# iterating over .items() would yield columns, not models.
is_tinybert = inference_data["model_name"].str.contains("tinybert", case=False, na=False)
tinybert_data = inference_data[is_tinybert]

if tinybert_data.empty:
    print("No TinyBERT records found in inference_data")
else:
    print(f"Found {len(tinybert_data)} TinyBERT records")

    # Recompute the summary for the TinyBERT rows only
    # (calculate_inference_summary is already imported from utils.analysis_utils)
    summary = calculate_inference_summary(tinybert_data)

    # Compare summarized and raw values for the GPU memory columns
    vram_columns = ["gpu_reserved_mb", "nvidia_system_vram_mb", "nvidia_avg_system_vram_mb"]
    for column in vram_columns:
        if column in summary.columns:
            print(f"Summary {column}: {summary[column].tolist()}")
        if column in tinybert_data.columns:
            print(f"Raw {column}: {tinybert_data[column].dropna().tolist()}")

Debugging TinyBERT GPU measurement extraction...


In [None]:
# Reload the module to ensure we get the updated version
import importlib
from utils import latex_table_generator
importlib.reload(latex_table_generator)
from utils.latex_table_generator import generate_all_tables

print("üîÑ Reloaded latex_table_generator module")

üîÑ Reloaded latex_table_generator module
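
The reload cell re-imports `generate_all_tables` after calling `importlib.reload` because names bound with `from ... import` keep pointing at the old objects. The self-contained sketch below demonstrates this with a throwaway module written to a temporary directory; `reload_demo_module` is a made-up name used only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so the edited source is always re-read in this demo
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp_dir:
    module_file = Path(tmp_dir) / "reload_demo_module.py"
    module_file.write_text("def version():\n    return 'old'\n")

    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        import reload_demo_module
        from reload_demo_module import version

        # Simulate editing the file while the kernel keeps running
        module_file.write_text("def version():\n    return 'new version'\n")
        importlib.reload(reload_demo_module)

        stale = version()                       # name bound before the reload
        fresh = reload_demo_module.version()    # looked up on the reloaded module
        from reload_demo_module import version  # re-import rebinds the name
        rebound = version()
    finally:
        # Clean up so the demo leaves no trace in the running session
        sys.path.remove(tmp_dir)
        sys.modules.pop("reload_demo_module", None)

print(stale, fresh, rebound, sep=" | ")
```

The previously imported `version` still returns the old value after the reload. Only attribute access on the module, or a fresh `from ... import`, picks up the edited code.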


## 🎯 That's it!

**No more 79-cell nightmare!** 

✅ Clean, organized functions  
✅ One-click complete analysis  
✅ Real experimental data  
✅ CPU + GPU metrics  
✅ LaTeX tables ready for publication  

**All outputs saved to:** `/home/amma/LLM-TIME/notebooks/outputs/`