## CT Pedestal Computation (Optional Feature)

This notebook supports computing the pedestal location from CT markers and a tracked nose landmark.

### How It Works

1. **CT Offset**: The offset vector from F_11 (nose reference) to F_9 (pedestal) in CT space
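2. **Pedestal Computation**: For each frame, the pedestal location is computed as: `pedestal_position = tracked_nose_position + ct_offset`. The offset is applied as a pure translation, so it must be expressed in the same axes and units (mm) as the 3D tracks.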
3. **Integration**: The pedestal trajectory is automatically added to the 3D data and included in reprojection error calculations
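
The per-frame addition described above can be illustrated with a small NumPy sketch. The offset is the example value from the Configuration section; the nose trajectory is made up for illustration and is not real tracking data.

```python
import numpy as np

# Example offset from the configuration example (F_9 - F_11, in mm).
ct_offset = np.array([-1.21, -43.64, 52.76])

# Hypothetical nose trajectory for three frames, shape (T, 3), in mm.
nose_traj = np.array([
    [10.0, 20.0, 30.0],
    [11.0, 20.5, 29.0],
    [np.nan, np.nan, np.nan],  # a frame where the nose was not triangulated
])

# Broadcasting adds the same (3,) offset to every row (frame).
pedestal_traj = nose_traj + ct_offset

print(pedestal_traj.shape)
print(pedestal_traj[0])
```

Frames where the nose is NaN stay NaN in the pedestal trajectory, so missing nose detections propagate to the pedestal rather than producing a spurious position.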

### Configuration

**Option 1 (Recommended)**: Directly specify the CT offset vector
- Set `ct_offset = [dx, dy, dz]` in mm (this is F_9 - F_11 from your CT markers)
- Set `nose_landmark_name` to the name of the tracked landmark (e.g., "NostrilsTop_Center")
- Example: `ct_offset = [-1.21, -43.64, 52.76]`  # from your CT markers

**Option 2**: Load from CT marker files
- Set `ct_dir` to the directory containing `F_9.mrk.json` and `F_11.mrk.json`
- Set `nose_landmark_name` to the name of the tracked landmark
- The offset will be computed automatically from the files

If `nose_landmark_name` is `None`, pedestal computation is skipped.
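
One possible way to resolve the two options into a single offset vector is sketched below. `resolve_ct_offset` and `read_marker_position` are hypothetical helpers, not functions from this notebook, and the rule that an explicit `ct_offset` takes precedence over `ct_dir` is an assumption. The marker positions written to the temporary directory are made up.

```python
import json
import os
import tempfile

import numpy as np


def read_marker_position(path):
    """Return the first control point of a Slicer-style .mrk.json file as a (3,) array."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return np.asarray(data["markups"][0]["controlPoints"][0]["position"], dtype=float)


def resolve_ct_offset(nose_landmark_name, ct_offset=None, ct_dir=None):
    """Hypothetical helper: return the (3,) offset F_9 - F_11, or None to skip the pedestal."""
    if nose_landmark_name is None:
        return None  # pedestal computation is skipped
    if ct_offset is not None:
        # Option 1: explicit vector (assumed to take precedence over ct_dir).
        offset = np.asarray(ct_offset, dtype=float)
        if offset.shape != (3,):
            raise ValueError(f"ct_offset must have 3 components, got shape {offset.shape}")
        return offset
    if ct_dir:
        # Option 2: compute the offset from the two marker files.
        f9 = read_marker_position(os.path.join(ct_dir, "F_9.mrk.json"))
        f11 = read_marker_position(os.path.join(ct_dir, "F_11.mrk.json"))
        return f9 - f11
    return None


# Option 1: direct vector.
opt1 = resolve_ct_offset("NostrilsTop_Center", ct_offset=[-1.21, -43.64, 52.76])

# Option 2: marker files with made-up positions in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, pos in [("F_9", [1.0, 2.0, 3.0]), ("F_11", [0.5, 4.0, 1.0])]:
        doc = {"markups": [{"coordinateSystem": "LPS", "controlPoints": [{"position": pos}]}]}
        with open(os.path.join(tmp, f"{name}.mrk.json"), "w", encoding="utf-8") as f:
            json.dump(doc, f)
    opt2 = resolve_ct_offset("NostrilsTop_Center", ct_dir=tmp)

# No nose landmark configured: nothing to compute.
skipped = resolve_ct_offset(None, ct_offset=[0.0, 0.0, 0.0])

print(opt1, opt2, skipped)
```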

### Reprojection Error Note

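The pedestal is derived from the nose landmark, so it has no corresponding 2D tracking data. Projecting the computed 3D position through each camera's `P` matrix shows where the pedestal would appear in that view, which is useful for visualization, but there is no tracked 2D position to compare against, so no true reprojection error exists for it. In the code below, when a 2D file has fewer joints than the 3D array, the 3D joints are trimmed to the 2D joint count, which drops the appended pedestal from the error tables.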
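
As a rough illustration of projecting a computed pedestal position into one camera, the sketch below applies a 3x4 projection matrix to a 3D point. The matrix `P` and the pedestal coordinates are invented for the example; a real `P` would come from `calibration.json`. `project_points` is a hypothetical helper name.

```python
import numpy as np


def project_points(P, xyz):
    """Project (N, 3) world points to (N, 2) pixel coordinates with a 3x4 matrix P."""
    xyz = np.asarray(xyz, dtype=float)
    xyz1 = np.hstack([xyz, np.ones((xyz.shape[0], 1))])  # homogeneous (N, 4)
    uvw = xyz1 @ P.T                                      # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]                       # divide by w


# Made-up pinhole camera: focal length 1000 px, principal point (320, 240), no rotation.
P = np.array([
    [1000.0, 0.0, 320.0, 0.0],
    [0.0, 1000.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Made-up pedestal position for one frame, in the camera/world frame (mm).
pedestal_xyz = np.array([[8.79, -23.64, 500.0]])

uv = project_points(P, pedestal_xyz)
print(uv)
```

The resulting pixel coordinates can be overlaid on the video frame to check visually whether the computed pedestal lands where expected.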


In [None]:
"""
PHASE 2 (Reprojection Error QC)
--------------------------------
For every frame t, every joint j, every camera c:
    1. Take 3D point (X,Y,Z) in world / ref camera coordinates.
    2. Project into camera c using that camera's 3x4 P matrix.
    3. Compare predicted 2D (px) vs actual tracked 2D (px).
    4. Euclidean distance in pixels = reprojection error.

We report median and p95 reprojection error per (camera, joint),
plus an overall summary per joint across cameras.
"""

import os, json, re
import numpy as np
import pandas as pd
import h5py

# ===================== USER PATHS (EDIT THESE) =====================

# 3D file (either .h5 with "tracks" or .npz with (T,J,3))
path_3d    = r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\points3d.h5"

# Matching 2D tracking files (the *.inference.analysis.h5 per camera)
# IMPORTANT: keys here MUST match the camera names in calibration["P"]
cam2d_files = {
    "cam-bottomleft.mp4":   r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-bottomleft.inference.analysis.h5",
    "cam-bottomright.mp4":  r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-bottomright.inference.analysis.h5",
    "cam-topleft.mp4":      r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-topleft.inference.analysis.h5",
    "cam-topright.mp4":     r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-topright.inference.analysis.h5",
}

# calibration.json that has P[camera] = 3x4 matrix
path_calib = r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\calibration.json"

# Output folder
out_root   = r"C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\out_3d_eval"
os.makedirs(out_root, exist_ok=True)

# ========== CT PEDESTAL CONFIGURATION (OPTIONAL) ==========
# If you want to include pedestal location computed from CT markers:
# 1. Set ct_dir to the directory containing F_9.mrk.json and F_11.mrk.json
# 2. Set nose_landmark_name to the joint name of the tracked nose landmark exactly as it
#    appears in joint_names (e.g., "NostrilsTop_Center", "Nose_Tip").
#    Note: for .h5 3D input, load_3d_tracks_any names joints "J1".."JN", so use that form there.
# 3. The pedestal location will be computed as: pedestal = nose_position + (F_9 - F_11) offset
#    This uses the CT-derived offset between F_9 (pedestal) and F_11 (nose reference)
# 
# If you don't want pedestal, leave ct_dir as None or empty string
ct_dir = None  # e.g., r"/path/to/CT" or None to skip pedestal
nose_landmark_name = None  # e.g., "NostrilsTop_Center" or None to skip pedestal
# =========================================================


# ===================== HELPERS =====================

def load_3d_tracks_any(path_3dfile):
    """
    Returns:
        pts3d: (T,J,3) float array in the reference/world frame
        joint_names: list[str] length J
    Assumptions:
    - For .h5: dataset "tracks" with shape (T,K,J,3) -> take K=0
    - For .npz: either one big (T,J,3) or per-joint arrays
    - Optionally computes and appends pedestal from CT markers if configured
    """
    ext = os.path.splitext(path_3dfile)[1].lower()

    if ext in [".h5", ".hdf5"]:
        with h5py.File(path_3dfile, "r") as f:
            if "tracks" not in f:
                raise RuntimeError("3D H5: expected dataset 'tracks' not found.")
            raw = np.array(f["tracks"])  # expected (T,K,J,3)
        if raw.ndim != 4 or raw.shape[-1] != 3:
            raise RuntimeError(f"3D H5 'tracks' has shape {raw.shape}, expected (T,K,J,3).")

        T,K,J,_ = raw.shape
        pts3d = raw[:,0,:,:]  # (T,J,3)
        joint_names = [f"J{i+1}" for i in range(J)]

    elif ext == ".npz":
        z = np.load(path_3dfile, allow_pickle=True)
        keys = list(z.keys())

        # Try big (T,J,3)
        cand = [(k, z[k].shape) for k in keys
                if hasattr(z[k], "ndim")
                and z[k].ndim == 3
                and z[k].shape[-1] == 3]
        if cand:
            k_big, shp = sorted(cand, key=lambda kv: kv[1][1], reverse=True)[0]
            XYZ = np.array(z[k_big])  # (T,J,3)
            T,J,_ = XYZ.shape
            if "nodes" in z and len(z["nodes"]) == J:
                joint_names = [str(s) for s in z["nodes"]]
            else:
                joint_names = [f"J{i+1}" for i in range(J)]
            pts3d = XYZ
        else:
            # Else stitch separate arrays
            joint_blocks = []
            joint_names  = []
            for k in keys:
                arr = z[k]
                if not hasattr(arr,"ndim"):
                    continue
                if arr.ndim == 2 and arr.shape[1] == 3:
                    joint_blocks.append(np.array(arr)) # (T,3)
                    joint_names.append(k)
                elif arr.ndim == 3 and arr.shape[-1] == 3:
                    # e.g. (T,JJ,3)
                    T,JJ,_ = arr.shape
                    for jsub in range(JJ):
                        joint_blocks.append(np.array(arr)[:,jsub,:])
                        joint_names.append(f"{k}{jsub+1}")
            if not joint_blocks:
                raise RuntimeError("3D NPZ: no suitable 3D arrays found.")
            pts3d = np.stack(joint_blocks, axis=1)  # (T,J,3)
    else:
        raise RuntimeError(f"Unsupported 3D file extension: {ext}")

    # ========== ADD PEDESTAL IF CONFIGURED ==========
    # If CT directory and nose landmark are specified, compute pedestal trajectory
    if ct_dir and nose_landmark_name and os.path.exists(ct_dir):
        try:
            # Load CT offset
            ct_offset_data = compute_ct_offset(ct_dir)
            ct_offset = ct_offset_data["offset"]
            print(f"[pedestal] Loaded CT offset: {ct_offset} mm (F_9 - F_11)")
            
            # Find nose landmark in loaded joints
            nose_idx = None
            if nose_landmark_name in joint_names:
                nose_idx = joint_names.index(nose_landmark_name)
            else:
                # Try case-insensitive match
                matching_indices = [i for i, n in enumerate(joint_names) if n.lower() == nose_landmark_name.lower()]
                if matching_indices:
                    nose_idx = matching_indices[0]
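                    # Do not rebind nose_landmark_name here: assigning to it would make it a
                    # function-local name, and the earlier read in `if ct_dir and ...` would
                    # then raise UnboundLocalError whenever ct_dir is set.
                    print(f"[pedestal] Matched nose landmark: {joint_names[nose_idx]}")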
                else:
                    print(f"[pedestal] WARNING: Nose landmark '{nose_landmark_name}' not found in joints.")
                    print(f"[pedestal] Available joints: {joint_names}")
                    print(f"[pedestal] Skipping pedestal computation.")
                    return pts3d, joint_names
            
            # Get nose trajectory
            nose_traj = pts3d[:, nose_idx, :]  # (T, 3)
            
            # Compute pedestal trajectory
            pedestal_traj = compute_pedestal_trajectory(nose_traj, ct_offset)  # (T, 3)
            
            # Append pedestal to pts3d
            pedestal_traj_expanded = pedestal_traj[:, np.newaxis, :]  # (T, 1, 3)
            pts3d = np.concatenate([pts3d, pedestal_traj_expanded], axis=1)  # (T, J+1, 3)
            
            # Append pedestal name to joint_names
            joint_names.append("Pedestal")
            print(f"[pedestal] Added pedestal trajectory computed from '{nose_landmark_name}'")
            print("[pedestal] Pedestal appended as the last 3D joint "
                  "(trimmed from reprojection stats if the 2D files have fewer joints)")
        except Exception as e:
            print(f"[pedestal] ERROR: Failed to compute pedestal: {e}")
            print(f"[pedestal] Continuing without pedestal...")
    # ================================================

    return pts3d, joint_names


def load_2d_from_h5_analysis(path_2dfile):
    """
    YOUR FORMAT (from cam-*.inference.analysis.h5 dump):

    tracks shape = (1, 2, 20, 13107)
      axis 0: track index (we'll take 0)
      axis 1: coord = [x,y]
      axis 2: joint index (0..19)
      axis 3: frame index (0..T-1)

    We want pts2d[frame, joint, xy] = shape (T, J, 2).

    Steps:
      raw = tracks[0]            -> (2, J, T)
      swap axes -> (T, J, 2)
    Also returns node_names list for debugging.
    """
    with h5py.File(path_2dfile, "r") as f:
        if "tracks" not in f:
            raise RuntimeError(f"{path_2dfile}: no 'tracks' dataset")
        raw = np.array(f["tracks"])        # (1,2,J,T)
        node_names_ds = np.array(f["node_names"])  # (J,)

    if raw.ndim != 4:
        raise RuntimeError(f"{path_2dfile}: 'tracks' shape {raw.shape}, expected (1,2,J,T)")

    # squeeze first dim: (2,J,T)
    raw2 = raw[0]  # (2, J, T)

    # Now we want (T,J,2):
    # current axes: (coord=0, joint=1, frame=2)
    # move them to (frame, joint, coord) = (2,1,0)
    pts2d = np.moveaxis(raw2, [0,1,2], [2,1,0])  # -> (T,J,2)

    # decode node names to strings
    node_names = []
    for n in node_names_ds:
        if isinstance(n, (bytes, bytearray)):
            node_names.append(n.decode("utf-8"))
        else:
            node_names.append(str(n))

    return pts2d, node_names


# ========== CT MARKER LOADING FUNCTIONS ==========
def load_ct_marker(mrk_json_path):
    """
    Load a CT marker file (.mrk.json) and extract 3D position.
    
    CT markers are stored in Slicer format with positions in LPS
    (Left-Posterior-Superior) coordinate system.
    
    Args:
        mrk_json_path: Path to the .mrk.json file
        
    Returns:
        Dictionary with:
            - "name": marker name (from filename or label)
            - "position": [x, y, z] in mm
            - "coordinate_system": "LPS"
    """
    if not os.path.exists(mrk_json_path):
        raise FileNotFoundError(f"CT marker file not found: {mrk_json_path}")
    
    with open(mrk_json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    
    # Extract marker name from filename if not in data
    marker_name = os.path.splitext(os.path.basename(mrk_json_path))[0]
    
    # Parse Slicer markups format
    if "markups" not in data or len(data["markups"]) == 0:
        raise ValueError(f"No markups found in {mrk_json_path}")
    
    markup = data["markups"][0]
    
    if "controlPoints" not in markup or len(markup["controlPoints"]) == 0:
        raise ValueError(f"No control points found in {mrk_json_path}")
    
    # Get first control point position
    control_point = markup["controlPoints"][0]
    position = control_point.get("position", None)
    
    if position is None:
        raise ValueError(f"No position found in control point of {mrk_json_path}")
    
    # Extract label if available
    label = control_point.get("label", marker_name)
    if label and label != marker_name:
        marker_name = label.split("-")[0] if "-" in label else label
    
    # Get coordinate system from markup
    coord_system = markup.get("coordinateSystem", "LPS")
    
    return {
        "name": marker_name,
        "position": list(position),  # [x, y, z] in mm
        "coordinate_system": coord_system
    }


def compute_ct_offset(ct_dir, f9_filename="F_9.mrk.json", f11_filename="F_11.mrk.json"):
    """
    Load F_9 and F_11 marker files and compute offset vector.
    
    The offset is computed as: offset = F_9_position - F_11_position
    This offset can be applied to a tracked nose landmark to get pedestal location.
    
    How it works:
    - F_9 is the pedestal location in CT space
    - F_11 is the nose reference point in CT space
    - The offset (F_9 - F_11) represents the vector from nose to pedestal in CT space
    - When applied to the tracked nose position in video space, it gives the pedestal location
    
    Args:
        ct_dir: Directory containing CT marker files
        f9_filename: Filename for F_9 marker (pedestal)
        f11_filename: Filename for F_11 marker (nose reference)
        
    Returns:
        Dictionary with:
            - "offset": [dx, dy, dz] in mm (F_9 - F_11)
            - "coordinate_system": "LPS"
            - "f9_position": F_9 position
            - "f11_position": F_11 position
    """
    f9_path = os.path.join(ct_dir, f9_filename)
    f11_path = os.path.join(ct_dir, f11_filename)
    
    f9_data = load_ct_marker(f9_path)
    f11_data = load_ct_marker(f11_path)
    
    f9_pos = np.array(f9_data["position"])
    f11_pos = np.array(f11_data["position"])
    
    offset = f9_pos - f11_pos  # F_9 - F_11
    
    return {
        "offset": offset.tolist(),
        "coordinate_system": f9_data["coordinate_system"],
        "f9_position": f9_data["position"],
        "f11_position": f11_data["position"]
    }


def compute_pedestal_trajectory(nose_trajectory, ct_offset):
    """
    Compute pedestal trajectory from nose landmark trajectory and CT offset.
    
    For each frame: pedestal_position = nose_position + ct_offset
    
    Args:
        nose_trajectory: Array of shape (T, 3) where T is number of frames,
                       each row is [x, y, z] position of nose landmark
        ct_offset: Offset vector [dx, dy, dz] from CT markers (F_9 - F_11)
        
    Returns:
        Array of shape (T, 3) with pedestal positions for each frame
    """
    if nose_trajectory.ndim != 2 or nose_trajectory.shape[1] != 3:
        raise ValueError(f"Expected nose_trajectory shape (T, 3), got {nose_trajectory.shape}")
    
    offset = np.array(ct_offset)
    if offset.shape != (3,):
        raise ValueError(f"Expected ct_offset shape (3,), got {offset.shape}")
    
    # For each frame: pedestal = nose + offset
    pedestal = nose_trajectory + offset
    
    return pedestal
# ================================================


def load_calibration(calib_json_path):
    """
    calibration.json structure includes:
      "P": { "cam-name.mp4": [[3x4], ...], ... }
      Optionally: "pedestal_config": { "ct_offset": [...], "nose_landmark_name": "..." }
      (this function does not read "pedestal_config")

    We only need P[camera] to project:
        [u,v,w]^T = P @ [X,Y,Z,1]^T
        x_pred = u/w
        y_pred = v/w
    """
    with open(calib_json_path, "r") as f:
        calib = json.load(f)

    if "P" not in calib:
        raise RuntimeError("calibration.json missing top-level 'P' block")

    cam_models = {}
    for cam_name, P_list in calib["P"].items():
        P = np.array(P_list, dtype=float)  # (3,4)
        if P.shape != (3,4):
            raise RuntimeError(f"P for {cam_name} has shape {P.shape}, expected (3,4)")
        cam_models[cam_name] = {"P": P}
    return cam_models


def reprojection_error_allcams(pts3d, joint_names_3d, cam_models, cam2d_files):
    """
    Compute reprojection error in pixels for each camera/joint.

    Inputs:
        pts3d           (T,J,3)
        joint_names_3d  list[str] len J for the 3D data
        cam_models      {camera: {"P":(3,4)}}
        cam2d_files     {camera: path_to_2d_h5}

    Returns:
        rows: list of dict rows with:
            camera, joint, median_reproj_px, p95_reproj_px, n_obs
    """
    T, J, _ = pts3d.shape
    rows = []

    for cam_name, h5path in cam2d_files.items():
        if cam_name not in cam_models:
            print(f"[warn] camera {cam_name} not in calibration; skipping")
            continue

        print(f"[step] loading 2D for {cam_name} from {h5path}")
        pts2d, node_names_2d = load_2d_from_h5_analysis(h5path)  # (T, J2, 2)
        # pts2d[frame, joint, xy], xy=(x,y) in pixels

        # Sanity: same frame count?
        if pts2d.shape[0] != T:
            raise RuntimeError(
                f"Frame mismatch {cam_name}: 3D has {T}, 2D has {pts2d.shape[0]}"
            )

        # Handle joint count mismatch (J2 could be different from J)
        J2 = pts2d.shape[1]
        if J2 < J:
            print(f"[warn] {cam_name}: 2D joints={J2} < 3D joints={J}; trimming 3D.")
            J_use = J2
        else:
            J_use = J

        P = cam_models[cam_name]["P"]  # (3,4)

        all_err = [[] for _ in range(J_use)]  # collect per-joint pixel errors

        # Loop frames
        for t in range(T):
            xyz = pts3d[t, :J_use, :]          # (J_use,3)
            ones = np.ones((J_use,1), float)
            xyz1 = np.concatenate([xyz, ones], axis=1)  # (J_use,4)

            proj = (P @ xyz1.T).T              # (J_use,3)
            u = proj[:,0] / proj[:,2]
            v = proj[:,1] / proj[:,2]
            pred2d = np.stack([u,v], axis=1)   # (J_use,2)

            gt2d = pts2d[t, :J_use, :]         # (J_use,2)

            diff = pred2d - gt2d               # (J_use,2)
            err = np.sqrt(np.sum(diff**2, axis=1))  # pixel distance
            good = np.isfinite(err)

            # Note: a "Pedestal" joint appended by load_3d_tracks_any has no 2D track.
            # When the 2D file has fewer joints than the 3D array, it is trimmed away by
            # J_use above, so it normally contributes no observations here.
            for j_idx in np.where(good)[0]:
                all_err[j_idx].append(err[j_idx])

        # summarize per joint for this camera
        for j_idx in range(J_use):
            errs = np.array(all_err[j_idx], float)
            errs = errs[np.isfinite(errs)]
            if errs.size == 0:
                med = np.nan
                p95 = np.nan
                nobs = 0
            else:
                med = float(np.median(errs))
                p95 = float(np.percentile(errs, 95))
                nobs = int(errs.size)

            # use 3D joint name if available
            if j_idx < len(joint_names_3d):
                joint_label = joint_names_3d[j_idx]
            else:
                joint_label = f"J{j_idx+1}"

            rows.append(dict(
                camera=cam_name,
                joint=joint_label,
                median_reproj_px=med,
                p95_reproj_px=p95,
                n_obs=nobs,
            ))

    return rows


# ===================== MAIN =====================

print("[step] loading 3D points ...")
pts3d, joint_names_3d = load_3d_tracks_any(path_3d)
print(f"[info] 3D shape: {pts3d.shape}, joints={joint_names_3d}")

print("[step] loading calibration ...")
cam_models = load_calibration(path_calib)
print("[info] cameras in calib:", list(cam_models.keys()))

print("[step] computing reprojection errors ...")
rows = reprojection_error_allcams(pts3d, joint_names_3d, cam_models, cam2d_files)

df = pd.DataFrame(
    rows,
    columns=["camera","joint","median_reproj_px","p95_reproj_px","n_obs"]
)

out_csv = os.path.join(out_root, "reprojection_error_by_cam_and_joint.csv")
df.to_csv(out_csv, index=False)
print("[OK] wrote reprojection CSV:", out_csv)

# Per-joint (across cameras) summary for quick view:
df_joint_summary = (
    df.groupby("joint")
      .agg(
          median_px_overall=("median_reproj_px","median"),  # median of the per-camera medians
          p95_px_overall   =("p95_reproj_px","median"),     # median of the per-camera p95 values
          total_obs        =("n_obs","sum"),
      )
      .reset_index()
)

out_csv2 = os.path.join(out_root, "reprojection_error_by_joint_overall.csv")
df_joint_summary.to_csv(out_csv2, index=False)
print("[OK] wrote joint summary CSV:", out_csv2)
print("[DONE]")


[step] loading 3D points ...
[info] 3D shape: (13107, 20, 3), joints=['J1', 'J2', 'J3', 'J4', 'J5', 'J6', 'J7', 'J8', 'J9', 'J10', 'J11', 'J12', 'J13', 'J14', 'J15', 'J16', 'J17', 'J18', 'J19', 'J20']
[step] loading calibration ...
[info] cameras in calib: ['cam-topleft.mp4', 'cam-topright.mp4', 'cam-bottomleft.mp4', 'cam-bottomright.mp4']
[step] computing reprojection errors ...
[step] loading 2D for cam-bottomleft.mp4 from C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-bottomleft.inference.analysis.h5
[step] loading 2D for cam-bottomright.mp4 from C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-bottomright.inference.analysis.h5
[step] loading 2D for cam-topleft.mp4 from C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-topleft.inference.analysis.h5
[step] loading 2D for cam-topright.mp4 from C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\cam-topright.inference.analysis.h5
[OK] wrote reprojection CSV: C:\Users\Lenovo\Desktop\Phase 2 evaluation reproj\out_3d_eval\re