# OpenMATB Task Performance Analysis

This notebook extracts performance data for each task type (e.g., sysmon, resman, communication) from OpenMATB log files. For each task, it creates a table with users as rows and each performance field as a column.

## Outline
1. Import Required Libraries
2. Locate and Load Performance Files
3. Extract and Clean Performance Data
4. Create Separate Tables for Each Task Type
5. Display and Export Results
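
As a minimal sketch of the target table shape, the toy example below pivots a few hand-made rows. The column names (`user`, `module`, `address`, `value`) mirror those used later in this notebook; the user IDs, field names, and values are made up for illustration and do not come from real logs.

```python
import pandas as pd

# Hypothetical long-format rows: one row per logged performance value.
toy_rows = pd.DataFrame({
    'user': ['u01', 'u01', 'u02', 'u02'],
    'module': ['sysmon', 'sysmon', 'sysmon', 'sysmon'],
    'address': ['field_a', 'field_b', 'field_a', 'field_b'],
    'value': [1.0, 2.0, 3.0, 4.0],
})

# Wide format: users as rows, performance fields as columns.
toy_table = toy_rows.pivot_table(index='user', columns='address',
                                 values='value', aggfunc='first')
print(toy_table)
```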

In [None]:
# 1. Import Required Libraries
import glob
import os

import pandas as pd
from IPython.display import display  # explicit import so display() also works outside a notebook

In [None]:
# 2. Locate and Load Performance Files
csv_files = glob.glob(os.path.join('sessions', '*', '*.csv'))
print(f'Found {len(csv_files)} session files.')

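# Load all files into a single DataFrame, tagging each row with its source file
if not csv_files:
    # pd.concat raises an unhelpful ValueError on an empty list, so fail early
    raise FileNotFoundError("No CSV files found under 'sessions/*/'.")
all_dfs = []
for file in csv_files:
    df = pd.read_csv(file)
    df['session_file'] = file
    all_dfs.append(df)
raw_df = pd.concat(all_dfs, ignore_index=True)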
print(f'Total rows loaded: {len(raw_df)}')
raw_df.head()
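
The loading loop above stops at the first file pandas cannot parse. A more forgiving variant might look like the sketch below, which skips zero-byte files and returns an empty DataFrame when nothing could be loaded. `load_session_files` is a hypothetical helper name, not part of pandas or OpenMATB.

```python
import pandas as pd

def load_session_files(paths):
    """Read each CSV, tag rows with their source path, and concatenate."""
    frames = []
    for path in paths:
        try:
            frame = pd.read_csv(path)
        except pd.errors.EmptyDataError:
            # A zero-byte file has no header to parse; skip it.
            print(f'Skipping empty file: {path}')
            continue
        frame['session_file'] = path
        frames.append(frame)
    # pd.concat raises ValueError on an empty list, so guard against it.
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

With this helper, the cell above would reduce to `raw_df = load_session_files(csv_files)`.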

In [None]:
# 3. Extract and Clean Performance Data
# Only keep performance rows
perf_df = raw_df[raw_df['type'] == 'performance'].copy()

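# Extract the user ID from the filename: the second underscore-separated token
# of the file stem (this assumes the user ID sits in that position), else 'unknown'.
# The extension is stripped first so 'a_b.csv' yields 'b' rather than 'b.csv'.
def extract_user(path):
    parts = os.path.splitext(os.path.basename(path))[0].split('_')
    return parts[1] if len(parts) > 1 else 'unknown'

perf_df['user'] = perf_df['session_file'].apply(extract_user)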

# Preview
perf_df.head()
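
The filtering step above and the pivot in the next cell rely on the columns `type`, `module`, `address`, and `value` being present in the loaded data. A minimal sketch of a check that fails early with a readable message is shown below; `require_columns` is a hypothetical helper, and the toy frame stands in for `raw_df`.

```python
import pandas as pd

def require_columns(df, required):
    """Raise a KeyError listing any required columns missing from df."""
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise KeyError(f'Missing expected columns: {missing}')
    return df

# Toy stand-in for raw_df, with made-up contents.
toy_raw = pd.DataFrame({
    'type': ['performance'],
    'module': ['sysmon'],
    'address': ['field_a'],
    'value': ['1.0'],
})
checked = require_columns(toy_raw, ['type', 'module', 'address', 'value'])
```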

In [None]:
# 4. Create Separate Tables for Each Task Type
# For each module (task), pivot so each performance field is a column.
# Note: aggfunc='first' keeps only the first logged value per user and field.

task_tables = {}
for module in perf_df['module'].unique():
    module_df = perf_df[perf_df['module'] == module]
    pivot = module_df.pivot_table(index='user', columns='address', values='value', aggfunc='first')
    task_tables[module] = pivot
    print(f'Performance table for task: {module}')
    display(pivot)
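
Because `aggfunc='first'` discards every value after the first one for a given user and field, repeated measurements are not summarized. One possible alternative, sketched below on made-up rows, converts `value` to numbers and averages repeats. Whether a mean is the right summary depends on the field, so treat this as an assumption rather than the notebook's method.

```python
import pandas as pd

# Hypothetical rows: each user has two logged values for the same field.
rows = pd.DataFrame({
    'user': ['u01', 'u01', 'u02', 'u02'],
    'address': ['field_a', 'field_a', 'field_a', 'field_a'],
    'value': ['1.0', '3.0', '5.0', 'n/a'],  # values may arrive as strings
})

# Coerce to numbers; entries that cannot be parsed become NaN.
rows['value_num'] = pd.to_numeric(rows['value'], errors='coerce')

# Average repeated measurements instead of keeping only the first one.
# NaN entries are ignored by the mean.
mean_table = rows.pivot_table(index='user', columns='address',
                              values='value_num', aggfunc='mean')
print(mean_table)
```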

In [None]:
# 5. Display and Export Results
# Optionally, export each table to CSV for further analysis
for module, table in task_tables.items():
    out_path = f'{module}_performance_table.csv'
    table.to_csv(out_path)
    print(f'Exported {module} table to {out_path}')
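
The export loop above writes into the current working directory. A small variation, sketched here, collects the files in a dedicated folder instead. `export_tables` and the output folder name are hypothetical choices, not something the notebook prescribes.

```python
import os
import pandas as pd

def export_tables(tables, out_dir):
    """Write each table to <out_dir>/<module>_performance_table.csv."""
    os.makedirs(out_dir, exist_ok=True)  # create the folder if needed
    written = {}
    for module, table in tables.items():
        out_path = os.path.join(out_dir, f'{module}_performance_table.csv')
        table.to_csv(out_path)
        written[module] = out_path
    return written
```

It could then be called as `export_tables(task_tables, 'results')`, where `'results'` is an arbitrary folder name.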