## Problem 1 — Table 1 (auto-generated)

This notebook parses the `--- Table 1 Results ---` blocks produced by `main.py` in the `*_out` log files and formats them into Table 1.
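
The parser in the next cell matches on line prefixes inside the last `--- Table 1 Results ---` block. As a minimal sketch of the shape it expects, the snippet below splits a synthetic block into label/value pairs. The field labels mirror the prefixes the parser checks and the values are taken from the VGG11 row of Table 1. The exact layout of the real `main.py` output is an assumption, and `SAMPLE_LOG` and `split_key_values` are hypothetical names used only for illustration.

```python
# Synthetic log text: labels mirror the prefixes the notebook's parser
# matches on; the exact layout of main.py's real output is assumed.
SAMPLE_LOG = """\
epoch 100 done
--- Table 1 Results ---
Model: vgg11
Training accuracy: 99.07
Test accuracy: 76.08
Total time for training: 1468.63
Number of trainable parameters: 9,750,922
FLOPs: 306.6M
GPU memory during training: 939.0
"""

MARKER = '--- Table 1 Results ---'


def split_key_values(text: str) -> dict:
    """Return {label: raw value string} for the last marker block in text."""
    idx = text.rfind(MARKER)
    if idx == -1:
        return {}
    fields = {}
    # Skip the marker line itself, then split each 'Label: value' line once.
    for line in text[idx:].splitlines()[1:]:
        if ':' not in line:
            continue
        key, value = line.split(':', 1)
        fields[key.strip()] = value.strip()
    return fields


fields = split_key_values(SAMPLE_LOG)
print(fields['Model'], fields['FLOPs'])
```

Using `rfind` means that if a log contains several result blocks (for example from a restarted run), only the most recent one is read, which matches the behavior of `parse_table1_block`.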

In [None]:
import os
import re
import pandas as pd

ROOT = os.path.abspath('.')

LOG_FILES = [
    os.path.join(ROOT, 'vgg11_out'),
    os.path.join(ROOT, 'vgg16_out'),
    os.path.join(ROOT, 'mobilenet_out'),
]

def _parse_flops_to_m(flops_str: str):
    # e.g. '306.6M' or '0.123G' (just in case)
    m = re.match(r'^\s*([0-9]*\.?[0-9]+)\s*([KMG]?)\s*$', flops_str.strip(), re.IGNORECASE)
    if not m:
        return None
    val = float(m.group(1))
    unit = m.group(2).upper()
    if unit == '':
        # assume raw FLOPs
        return val / 1e6
    if unit == 'K':
        return val / 1e3
    if unit == 'M':
        return val
    if unit == 'G':
        return val * 1e3
    return None

def parse_table1_block(text: str):
    # Find the last Table 1 block in the file
    idx = text.rfind('--- Table 1 Results ---')
    if idx == -1:
        return None
    block = text[idx:].splitlines()
    out = {}
    for line in block[:30]:
        line = line.strip()
        if line.startswith('Model:'):
            out['Model'] = line.split(':', 1)[1].strip()
        elif line.startswith('Training accuracy'):
            out['Training accuracy [%]'] = float(line.split(':', 1)[1].strip())
        elif line.startswith('Test accuracy'):
            out['Test accuracy [%]'] = float(line.split(':', 1)[1].strip())
        elif line.startswith('Total time for training'):
            out['Total time for training [s]'] = float(line.split(':', 1)[1].strip())
        elif line.startswith('Number of trainable parameters'):
            out['Number of trainable parameters'] = int(line.split(':', 1)[1].strip().replace(',', ''))
        elif line.startswith('FLOPs:'):
            flops_raw = line.split(':', 1)[1].strip()
            out['FLOPs [M]'] = _parse_flops_to_m(flops_raw)
            out['FLOPs (raw)'] = flops_raw
        elif line.startswith('GPU memory during training'):
            out['GPU memory during training [MB]'] = float(line.split(':', 1)[1].strip())
    return out if out.get('Model') else None

rows = []
missing = []
for path in LOG_FILES:
    if not os.path.exists(path):
        missing.append(os.path.basename(path))
        continue
    with open(path, 'r', errors='ignore') as f:
        metrics = parse_table1_block(f.read())
    if metrics is None:
        missing.append(os.path.basename(path) + ' (no Table 1 block found)')
    else:
        rows.append(metrics)

if missing:
    print('Missing / incomplete logs:', ', '.join(missing))

df_raw = pd.DataFrame(rows)

# Keep just the Table 1 columns, in the assignment order
cols = [
    'Model',
    'Training accuracy [%]',
    'Test accuracy [%]',
    'Total time for training [s]',
    'Number of trainable parameters',
    'FLOPs [M]',
    'GPU memory during training [MB]',
]
df_raw = df_raw[[c for c in cols if c in df_raw.columns]].copy()

# Clean model names for consistency
if 'Model' in df_raw.columns:
    df_raw['Model'] = df_raw['Model'].str.strip().str.upper().replace({'MOBILENET': 'MOBILENET-V1'})

# Display-friendly formatting (keep df_raw numeric for CSV)
df = df_raw.copy()

# Formatting helpers for display
if 'Number of trainable parameters' in df.columns:
    # Guard against NaN (a log missing this field), which int() cannot convert
    df['Number of trainable parameters'] = df['Number of trainable parameters'].map(lambda x: '' if pd.isna(x) else f'{int(x):,}')
if 'FLOPs [M]' in df.columns:
    df['FLOPs [M]'] = df['FLOPs [M]'].map(lambda x: '' if pd.isna(x) else f'{float(x):.1f}M')
# Accuracies, training time, and GPU memory all use two decimals
for c in ['Training accuracy [%]', 'Test accuracy [%]', 'Total time for training [s]', 'GPU memory during training [MB]']:
    if c in df.columns:
        df[c] = df[c].map(lambda x: f'{float(x):.2f}')

df

# Also export a machine-readable CSV for your report workflow
out_csv = os.path.join(ROOT, 'table1.csv')
df_raw.to_csv(out_csv, index=False)
print('Saved:', out_csv)

# Display the table explicitly
from IPython.display import display
display(df)

Saved: /Users/slrpz/Downloads/ECE361E/HW3_files/table1.csv


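|   | Model | Training accuracy [%] | Test accuracy [%] | Total time for training [s] | Number of trainable parameters | FLOPs [M] | GPU memory during training [MB] |
|---|-------|-----------------------|-------------------|-----------------------------|--------------------------------|-----------|---------------------------------|
| 0 | VGG11 | 99.07 | 76.08 | 1468.63 | 9750922 | 306.6M | 939.0 |
| 1 | VGG16 | 98.35 | 78.46 | 1622.58 | 15245130 | 627.5M | 2037.0 |
| 2 | MOBILENET-V1 | 99.37 | 78.47 | 1754.2 | 3217226 | 96.0M | 1263.0 |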



### Problem 1 – Question 3: VGG11 vs VGG16 comparison

- Accuracy vs. epochs: VGG11 reaches higher accuracy in the very early epochs, but VGG16 quickly catches up and achieves the better final test accuracy. After 100 epochs, VGG11 ends at 76.08% test accuracy and VGG16 at 78.46%, a gain of about 2.4 percentage points for VGG16.
- Training accuracy and overfitting: Both models achieve very high training accuracy (VGG11: 99.07%, VGG16: 98.35%). The gap between training and test accuracy indicates overfitting in both models, but the deeper VGG16 does not overfit more: its gap (19.89 percentage points) is slightly smaller than VGG11's (22.99 percentage points).
- Training time: VGG16 takes longer to train (1622.58 s) than VGG11 (1468.63 s), roughly 10% more wall‑clock time for the same number of epochs on the same hardware.
- Model size and FLOPs: VGG16 is substantially heavier:
  - Parameters: VGG11 has 9,750,922 trainable parameters, while VGG16 has 15,245,130 (about 1.6× more).
  - FLOPs: VGG11 requires 306.6M FLOPs per forward pass, while VGG16 requires 627.5M FLOPs (about 2× more compute).
- GPU memory usage: VGG16 also uses more GPU memory during training (2037 MB) than VGG11 (939 MB), about 2.2× more, which matters if GPU memory is a bottleneck.
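
The ratios and gaps quoted in the bullets above can be reproduced directly from the Table 1 values. The sketch below hard-codes those values rather than reading `table1.csv`, so it runs on its own. The dictionary layout and the helper names `ratio` and `generalization_gap` are illustrative choices, not part of the original notebook.

```python
# Values copied from Table 1 (VGG11 and VGG16 rows only).
table1 = {
    'VGG11': {'train_acc': 99.07, 'test_acc': 76.08, 'time_s': 1468.63,
              'params': 9_750_922, 'flops_m': 306.6, 'gpu_mb': 939.0},
    'VGG16': {'train_acc': 98.35, 'test_acc': 78.46, 'time_s': 1622.58,
              'params': 15_245_130, 'flops_m': 627.5, 'gpu_mb': 2037.0},
}


def ratio(metric: str) -> float:
    """VGG16 value divided by VGG11 value for one metric."""
    return table1['VGG16'][metric] / table1['VGG11'][metric]


def generalization_gap(model: str) -> float:
    """Training accuracy minus test accuracy, in percentage points."""
    row = table1[model]
    return row['train_acc'] - row['test_acc']


for metric in ('time_s', 'params', 'flops_m', 'gpu_mb'):
    print(f'{metric}: VGG16 / VGG11 = {ratio(metric):.2f}x')

test_gain = table1['VGG16']['test_acc'] - table1['VGG11']['test_acc']
print(f'test accuracy gain: {test_gain:.2f} points')
for model in table1:
    print(f'{model} train-test gap: {generalization_gap(model):.2f} points')
```

Keeping the ratios in code makes it easy to re-check the prose if the models are retrained and the table changes.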

VGG16 provides modestly better test accuracy (about 2.4 percentage points) but at the cost of roughly 2× the FLOPs, 1.6× the parameters, about 2.2× the GPU memory during training, and roughly 10% longer training time. If maximum accuracy is the only goal and compute and memory are plentiful, VGG16 is preferable. However, in edge or resource-constrained settings, where training and inference cost matter as much as accuracy, VGG11 is more attractive because it is significantly cheaper while giving up only a modest amount of test accuracy. In this homework context, where we care about efficiency on edge devices, we would generally prefer VGG11.
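
One way to make this decision rule concrete is to pick the most accurate model that fits a given compute and memory budget. The following is a minimal sketch under stated assumptions: the budgets are hypothetical numbers chosen only to exercise each case, `pick_model` is an illustrative helper rather than anything from `main.py`, and training-time GPU memory is used as a stand-in for the memory constraint because that is the only memory figure Table 1 reports.

```python
from typing import Optional

# Values copied from Table 1; only the two models compared in this question.
MODELS = {
    'VGG11': {'test_acc': 76.08, 'flops_m': 306.6, 'gpu_mb': 939.0},
    'VGG16': {'test_acc': 78.46, 'flops_m': 627.5, 'gpu_mb': 2037.0},
}


def pick_model(max_flops_m: float, max_gpu_mb: float) -> Optional[str]:
    """Most accurate model whose FLOPs and training memory fit the budget."""
    feasible = [
        name for name, m in MODELS.items()
        if m['flops_m'] <= max_flops_m and m['gpu_mb'] <= max_gpu_mb
    ]
    if not feasible:
        return None  # neither model fits the budget
    return max(feasible, key=lambda name: MODELS[name]['test_acc'])


# Hypothetical budgets, chosen only to exercise each branch.
generous = pick_model(max_flops_m=1000.0, max_gpu_mb=4096.0)
tight = pick_model(max_flops_m=400.0, max_gpu_mb=1024.0)
too_small = pick_model(max_flops_m=100.0, max_gpu_mb=512.0)
print(generous, tight, too_small)
```

With a generous budget the rule selects VGG16; once either limit drops below VGG16's requirements it falls back to VGG11, which mirrors the recommendation above.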