# Performance Analysis

Visualize PKI performance test results from `data/perf-metrics/`.

## How Performance Tests Work

The `scripts/perf-test.py` orchestrator uses an **in-container batch execution strategy** to minimize overhead:

1. Generates a shell script with all certificate operations (issue + revoke + CRL)
2. Copies it into the CA container via `sudo podman cp`
3. Runs it with a single `sudo podman exec` call
4. Parses structured timing data from stdout
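
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the actual `scripts/perf-test.py`: the container name `ca`, the in-container path `/tmp/perf-batch.sh`, the `issue-cert` command, and the `TIMING <op> <serial> <ms>` output format are all assumptions made for the example, and the nanosecond timestamps assume GNU `date` inside the container.

```python
import subprocess
import tempfile

CONTAINER = "ca"                      # hypothetical container name
REMOTE_SCRIPT = "/tmp/perf-batch.sh"  # hypothetical path inside the container


def build_batch_script(count: int) -> str:
    """Step 1: emit one shell script covering every operation.

    `issue-cert` is a placeholder command; each call is timed in-container
    and reported on stdout as one structured line.
    """
    lines = ["#!/bin/sh", "set -eu"]
    for n in range(1, count + 1):
        lines += [
            "start=$(date +%s%N)",
            f"issue-cert --serial {n} >/dev/null",
            "end=$(date +%s%N)",
            f'echo "TIMING issue {n} $(( (end - start) / 1000000 ))"',
        ]
    return "\n".join(lines) + "\n"


def parse_timings(stdout: str) -> list[tuple[str, int, float]]:
    """Step 4: pull (operation, serial, milliseconds) out of stdout."""
    timings = []
    for line in stdout.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "TIMING":
            timings.append((parts[1], int(parts[2]), float(parts[3])))
    return timings


def run_batch(count: int) -> list[tuple[str, int, float]]:
    """Steps 2-3: copy the script in once, then execute it with one call."""
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as tmp:
        tmp.write(build_batch_script(count))
    subprocess.run(
        ["sudo", "podman", "cp", tmp.name, f"{CONTAINER}:{REMOTE_SCRIPT}"],
        check=True,
    )
    result = subprocess.run(
        ["sudo", "podman", "exec", CONTAINER, "sh", REMOTE_SCRIPT],
        check=True, capture_output=True, text=True,
    )
    return parse_timings(result.stdout)
```

The point of the design is that the fixed cost of `podman exec` is paid once per test run rather than once per certificate.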

### Metrics Captured

| Metric | Description |
|--------|-------------|
| `issued` | Total certificates issued per PKI type |
| `revoked` | Total certificates revoked per PKI type |
| `issue_rate` | Issuance throughput (certs/sec) |
| `revoke_rate` | Revocation throughput (certs/sec) |
| `issue_p50`, `issue_p90`, `issue_p95`, `issue_p99` | Issuance latency percentiles (ms) |
| `total_duration` | Total test duration (seconds) |
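
To make the rate and percentile rows concrete, here is one way such values could be derived from raw per-certificate latencies. It is a sketch only: the nearest-rank percentile method and the `summarize` helper are assumptions, and the orchestrator may compute its figures differently.

```python
import math


def percentile(sorted_ms: list[float], pct: int) -> float:
    """Nearest-rank percentile of an ascending list of latencies (ms)."""
    if not sorted_ms:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def summarize(latencies_ms: list[float], wall_seconds: float) -> dict:
    """Build a metrics dict using the key names from the table above."""
    ordered = sorted(latencies_ms)
    summary = {
        "issued": len(ordered),
        "issue_rate": len(ordered) / wall_seconds if wall_seconds > 0 else 0.0,
    }
    for pct in (50, 90, 95, 99):
        summary[f"issue_p{pct}"] = percentile(ordered, pct)
    return summary


# Synthetic latencies of 1..100 ms issued over 4 seconds of wall time.
sample = summarize([float(ms) for ms in range(1, 101)], wall_seconds=4.0)
print(sample)
```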

### Default Certificate Distribution

| PKI Type | Share |
|----------|-------|
| RSA-4096 | 40% |
| ECC P-384 | 30% |
| ML-DSA-87 | 30% |
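
As a worked example of the default shares, the sketch below splits a requested certificate count across the three PKI types. The `split_count` helper and its largest-remainder rounding are assumptions for illustration, as is the mapping of the table rows to the `rsa`, `ecc`, and `pqc` names used on the command line; how the orchestrator actually rounds is not documented here.

```python
DEFAULT_SHARES = {"rsa": 40, "ecc": 30, "pqc": 30}  # percentages from the table


def split_count(total: int, shares: dict[str, int] = DEFAULT_SHARES) -> dict[str, int]:
    """Split `total` by percentage share; leftovers go to the largest remainders."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100")
    # Integer arithmetic keeps the split exact for any total.
    counts = {k: total * pct // 100 for k, pct in shares.items()}
    remainders = {k: total * pct % 100 for k, pct in shares.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(shares, key=lambda k: remainders[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


print(split_count(10000))
print(split_count(101))
```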

### Running a Test

```bash
# Quick test (100 certs, RSA only)
./lab perf-test --count 100 --pki-types rsa

# Full test (10K certs across all PKI types, 10% revocation)
./lab perf-test --count 10000 --revoke-pct 10 --pki-types rsa,ecc,pqc
```

Results are written to `data/perf-metrics/latest.json` (and a timestamped copy).

## Configuration

Performance metrics are mounted read-only at `/home/jovyan/perf-metrics/` inside the Jupyter container.

In [None]:
import json
from pathlib import Path
from datetime import datetime

import pandas as pd
from IPython.display import display

METRICS_DIR = Path("/home/jovyan/perf-metrics")

print(f"Metrics directory: {METRICS_DIR}")
if METRICS_DIR.exists():
    files = sorted(METRICS_DIR.glob("*.json"))
    print(f"Found {len(files)} result file(s): {[f.name for f in files]}")
else:
    print("Metrics directory not found. Run ./lab perf-test to generate data.")
    files = []

## Load Latest Results

Parse `latest.json` and display a summary table.

In [None]:
latest_file = METRICS_DIR / "latest.json"
data = None

if latest_file.exists():
    with open(latest_file) as f:
        data = json.load(f)

    # Display top-level summary
    if isinstance(data, dict):
        # Show metadata
        for key in ["timestamp", "total_duration", "total_issued", "total_revoked"]:
            if key in data:
                print(f"{key}: {data[key]}")

        # Per-PKI summary
        pki_results = data.get("results", data.get("pki_types", {}))
        if isinstance(pki_results, dict):
            rows = []
            for pki, metrics in pki_results.items():
                if isinstance(metrics, dict):
                    rows.append({"pki_type": pki, **metrics})
            if rows:
                print("\nPer-PKI Summary:")
                display(pd.DataFrame(rows).set_index("pki_type"))
            else:
                print("\nNo per-PKI results found in latest.json.")
        elif isinstance(pki_results, list):
            print("\nPer-PKI Summary:")
            display(pd.DataFrame(pki_results))
        else:
            print(json.dumps(data, indent=2))
    else:
        print(json.dumps(data, indent=2))
else:
    print("No latest.json found. Run ./lab perf-test to generate data.")

## Throughput Comparison

Issuance and revocation rates (certs/sec) per PKI type.

In [None]:
if data and isinstance(data, dict):
    pki_results = data.get("results", data.get("pki_types", {}))
    if isinstance(pki_results, dict):
        rows = []
        for pki, metrics in pki_results.items():
            if isinstance(metrics, dict):
                rows.append({
                    "PKI Type": pki.upper(),
                    "Issue Rate (certs/s)": metrics.get("issue_rate", 0),
                    "Revoke Rate (certs/s)": metrics.get("revoke_rate", 0),
                })
        if rows:
            df = pd.DataFrame(rows).set_index("PKI Type")
            print("Throughput Comparison:")
            display(df)

            # Simple text bar chart
            print("\nIssuance Rate:")
            max_rate = max(r["Issue Rate (certs/s)"] for r in rows) or 1
            for r in rows:
                bar_len = int(40 * r["Issue Rate (certs/s)"] / max_rate)
                bar = "█" * bar_len
                print(f"  {r['PKI Type']:6s} {bar} {r['Issue Rate (certs/s)']:.1f}/s")

            print("\nRevocation Rate:")
            max_rate = max(r["Revoke Rate (certs/s)"] for r in rows) or 1
            for r in rows:
                bar_len = int(40 * r["Revoke Rate (certs/s)"] / max_rate)
                bar = "█" * bar_len
                print(f"  {r['PKI Type']:6s} {bar} {r['Revoke Rate (certs/s)']:.1f}/s")
        else:
            print("No per-PKI throughput data found.")
    else:
        print("Unexpected results format.")
else:
    print("No data loaded. Run the cell above first.")

## Latency Percentiles

Issuance latency distribution (p50 / p90 / p95 / p99) per PKI type.

In [None]:
if data and isinstance(data, dict):
    pki_results = data.get("results", data.get("pki_types", {}))
    if isinstance(pki_results, dict):
        rows = []
        percentile_keys = ["issue_p50", "issue_p90", "issue_p95", "issue_p99"]
        for pki, metrics in pki_results.items():
            if isinstance(metrics, dict):
                row = {"PKI Type": pki.upper()}
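                for pk in percentile_keys:
                    label = pk.replace("issue_", "").upper()
                    val = metrics.get(pk)
                    # Show "n/a" rather than a misleading 0.0 ms when a percentile is missing
                    row[label] = f"{val:.1f} ms" if isinstance(val, (int, float)) else "n/a"
                rows.append(row)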

        if rows:
            df = pd.DataFrame(rows).set_index("PKI Type")
            print("Issuance Latency Percentiles:")
            display(df)
        else:
            print("No latency percentile data found.")
    else:
        print("Unexpected results format.")
else:
    print("No data loaded.")

## Phase Breakdown

Duration breakdown by phase (issuance, revocation, CRL generation) per PKI type.
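
A raw duration table can hide where the time actually goes, so it is sometimes handy to express each phase as a share of the summed phase time. The sketch below does that for a single metrics dict. The `phase_shares` helper is hypothetical, and it assumes the `issue_duration`, `revoke_duration`, and `crl_duration` keys hold seconds.

```python
PHASES = ("issue_duration", "revoke_duration", "crl_duration")


def phase_shares(metrics: dict) -> dict[str, float]:
    """Return each phase's percentage of the summed phase time."""
    durations = {
        p: float(metrics[p])
        for p in PHASES
        if isinstance(metrics.get(p), (int, float))
    }
    total = sum(durations.values())
    if total <= 0:
        return {}
    return {p: round(100 * d / total, 1) for p, d in durations.items()}


# Made-up numbers, purely to show the output shape.
example = {"issue_duration": 8.0, "revoke_duration": 1.5, "crl_duration": 0.5}
print(phase_shares(example))
```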

In [None]:
if data and isinstance(data, dict):
    pki_results = data.get("results", data.get("pki_types", {}))
    if isinstance(pki_results, dict):
        rows = []
        for pki, metrics in pki_results.items():
            if isinstance(metrics, dict):
                row = {"PKI Type": pki.upper()}
                for phase in ["issue_duration", "revoke_duration", "crl_duration", "total_duration"]:
                    label = phase.replace("_duration", "").replace("_", " ").title()
                    val = metrics.get(phase, metrics.get(phase.replace("_duration", "_time"), 0))
                    row[label] = f"{val:.2f}s" if isinstance(val, (int, float)) else str(val)
                rows.append(row)

        if rows:
            df = pd.DataFrame(rows).set_index("PKI Type")
            print("Phase Breakdown:")
            display(df)
        else:
            print("No phase duration data found.")
    else:
        print("Unexpected results format.")
else:
    print("No data loaded.")

## Historical Trend

If multiple timestamped result files exist, tabulate issuance and revocation throughput across runs, ordered by file modification time.
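
With one row per run and PKI type, a simple trend signal is the change in issuance rate between consecutive runs. The following sketch computes that in plain Python. The `rate_deltas` helper and the sample rows are illustrative assumptions, not real measurements.

```python
from collections import defaultdict


def rate_deltas(rows: list[dict]) -> dict[str, list[float]]:
    """Percent change in issue_rate between consecutive runs, per PKI type.

    `rows` must already be in chronological order.
    """
    series = defaultdict(list)
    for row in rows:
        series[row["pki_type"]].append(row["issue_rate"])
    deltas = {}
    for pki, rates in series.items():
        deltas[pki] = [
            round(100 * (cur - prev) / prev, 1)
            for prev, cur in zip(rates, rates[1:])
            if prev  # skip pairs whose baseline rate is zero
        ]
    return deltas


# Synthetic rows in the same shape as the trend table.
rows = [
    {"timestamp": "run-1", "pki_type": "RSA", "issue_rate": 20.0},
    {"timestamp": "run-1", "pki_type": "ECC", "issue_rate": 50.0},
    {"timestamp": "run-2", "pki_type": "RSA", "issue_rate": 22.0},
    {"timestamp": "run-2", "pki_type": "ECC", "issue_rate": 45.0},
]
print(rate_deltas(rows))
```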

In [None]:
if METRICS_DIR.exists():
    result_files = sorted(
        [f for f in METRICS_DIR.glob("*.json") if f.name != "latest.json"],
        key=lambda f: f.stat().st_mtime,
    )

    if len(result_files) >= 2:
        trend_rows = []
        for rf in result_files:
            try:
                with open(rf) as f:
                    rd = json.load(f)
                ts = rd.get("timestamp", rf.stem)
                pki_results = rd.get("results", rd.get("pki_types", {}))
                if isinstance(pki_results, dict):
                    for pki, metrics in pki_results.items():
                        if isinstance(metrics, dict):
                            trend_rows.append({
                                "timestamp": ts,
                                "file": rf.name,
                                "pki_type": pki.upper(),
                                "issue_rate": metrics.get("issue_rate", 0),
                                "revoke_rate": metrics.get("revoke_rate", 0),
                            })
            except (json.JSONDecodeError, OSError, AttributeError):
                # Skip unreadable, malformed, or non-dict result files
                continue

        if trend_rows:
            df = pd.DataFrame(trend_rows)
            print(f"Historical results from {len(result_files)} test runs:")
            display(df.set_index(["timestamp", "pki_type"]))
        else:
            print("Could not parse historical data.")
    else:
        print(f"Only {len(result_files)} timestamped result file(s) found.")
        print("Run ./lab perf-test multiple times to build historical data.")
else:
    print("Metrics directory not found.")

## Run Performance Test

Performance tests must be run from the lab host terminal (they use `sudo podman exec`). After running a test, re-execute the cells above to see updated results.

```bash
# Quick test (100 certs, RSA)
./lab perf-test --count 100 --pki-types rsa

# Multi-PKI test (1000 certs, 10% revocation)
./lab perf-test --count 1000 --revoke-pct 10 --pki-types rsa,ecc,pqc

# Large-scale test
./lab perf-test --count 10000 --revoke-pct 10 --pki-types rsa,ecc,pqc
```

In [None]:
# After running a perf test, re-run this cell to reload
latest_file = METRICS_DIR / "latest.json"
if latest_file.exists():
    mtime = datetime.fromtimestamp(latest_file.stat().st_mtime)
    print(f"latest.json last modified: {mtime.strftime('%Y-%m-%d %H:%M:%S')}")
    with open(latest_file) as f:
        data = json.load(f)
    print("Data reloaded. Re-run the analysis cells above.")
else:
    print("No results file found yet.")