
# Chapter 2 – Parse

*(from “Biomechanics Data in Python: A Beginner’s Guide” by **Hossein Mokhtarzadeh, PhD**)*

Visit **[PoseIQ.com](https://poseiq.com)** for tools and demos.  
Follow and learn more:
- Udemy courses: **[Hossein Mokhtarzadeh](https://www.udemy.com/user/hossein-mokhtarzadeh/)**  
- Amazon book: **[AI Mastery Series](https://www.amazon.com.au/dp/B0DJSJ14VP)**  
- LinkedIn: **[PoseIQ™](https://www.linkedin.com/company/poseiq/)**  
- Demos (free and paid): **[PoseIQ Demos](https://sites.google.com/view/arengs/solutions/poseiq-demos)**

---

## From Chapter 1 to Chapter 2
In Chapter 1 we focused on **bringing biomechanical data into Python** (C3D, TRC, CSV).  
Here in Chapter 2, we **parse** that raw input into clean, labeled, analysis-ready signals.

**Parsing goals**
- Extract time, markers (e.g., Heel, Hip), and force-plate channels (e.g., vertical GRF)
- Normalize units (mm → m; N → body weight)
- Rename columns so they’re human-readable
- Reduce noise (e.g., Butterworth low-pass filter)


## Step 0 – Extract C3D file from Chapter 1

In [None]:
!pip install ezc3d pandas numpy scipy
import urllib.request, zipfile, os

os.makedirs("sample_data/c3d_zip", exist_ok=True)

url = "https://www.c3d.org/data/Sample01.zip"
zip_path = "sample_data/c3d_zip/sample_data.zip"
urllib.request.urlretrieve(url, zip_path)

with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall("sample_data/c3d_zip")

print("Extracted files in c3d_zip:", os.listdir("sample_data/c3d_zip"))

# Step 0 (continued) – Download sample TRC and CSV data
import os, urllib.request, pathlib

base_dir = pathlib.Path("sample_data/online")
base_dir.mkdir(parents=True, exist_ok=True)

# Public example TRC (OpenSim subject01 gait trial)
trc_url = "https://raw.githubusercontent.com/opensim-org/opensim-models/master/Pipelines/Gait2354_Simbody/subject01_walk1.trc"
trc_path = base_dir / "subject01_walk1.trc"

# Public example CSV (OptiTrack export in a robotics dataset)
csv_url = "https://raw.githubusercontent.com/JuSquare/ODA_Dataset/master/dataset/10/optitrack.csv"
csv_path = base_dir / "optitrack.csv"

urllib.request.urlretrieve(trc_url, trc_path.as_posix())
urllib.request.urlretrieve(csv_url, csv_path.as_posix())

print("Downloaded:", trc_path, "and", csv_path)


# Step 0 (continued) – Load a C3D file
import ezc3d, glob, os

# Prefer a C3D from the downloaded zip; otherwise try any .c3d in subfolders
c3d_candidates = glob.glob("sample_data/c3d_zip/*.c3d") + glob.glob("sample_data/**/*.c3d", recursive=True)
assert c3d_candidates, "No C3D file found!"
c3d_path = c3d_candidates[0]

c3d = ezc3d.c3d(c3d_path)
points = c3d["data"]["points"]             # shape: 4 x n_markers x n_frames
labels = list(c3d["parameters"]["POINT"]["LABELS"]["value"])

print("Loaded C3D:", c3d_path)
print("Point array shape:", points.shape)
print("Markers (first 8):", labels[:8], "...")

# Step 0 (continued) – Load CSV/TRC if present, else build DataFrame from C3D
import pandas as pd
import numpy as np
import glob, os

# Only look inside our dedicated 'online' folder to avoid accidental matches in Colab
trc_list = glob.glob("sample_data/online/*.trc")
csv_list = glob.glob("sample_data/online/*.csv")

if trc_list:
    trc_path = trc_list[0]
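    # TRC: tab-delimited; in typical OpenSim exports lines 1-3 hold file metadata,
    # line 4 the marker names, and line 5 an X1/Y1/Z1 sub-header.
    # Skip the metadata and the sub-header so the sub-header is not read as a data row.
    df_trc = pd.read_csv(trc_path, sep="\t", skiprows=[0, 1, 2, 4])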
    print("Loaded TRC from online folder:", os.path.basename(trc_path))
    display(df_trc.head())

elif csv_list:
    csv_path = csv_list[0]
    df_csv = pd.read_csv(csv_path)
    print("Loaded CSV from online folder:", os.path.basename(csv_path))
    display(df_csv.head())

else:
    # Build DataFrame directly from the C3D markers
    n_markers, n_frames = points.shape[1], points.shape[2]
    xyz = points[:3, :, :].transpose(2, 1, 0)   # frames × markers × (X,Y,Z)

    axes = ["X", "Y", "Z"]
    cols = pd.MultiIndex.from_product([labels, axes], names=["marker", "axis"])

    df_markers = pd.DataFrame(xyz.reshape(n_frames, n_markers*3), columns=cols)
    df_markers.insert(0, "frame", np.arange(n_frames))

    print("No TRC/CSV in online folder - built DataFrame from C3D points.")
    display(df_markers.head())
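
A TRC header names each marker once and leaves the following two fields blank, so the columns come out unlabeled after a plain `read_csv`. The sketch below shows one way to build `Marker_X/Y/Z` column names. It assumes the common layout (three metadata lines, a marker-name line, an `X1/Y1/Z1` sub-header, then data). The helper name `read_trc` and the two-marker demo text are illustrative and are not part of the chapter's dataset.

```python
import io
import pandas as pd

def read_trc(source):
    """Read a TRC file (path or file-like object) into a DataFrame.

    Assumed layout: 3 metadata lines, 1 marker-name line,
    1 X1/Y1/Z1 sub-header line, then tab-separated data rows.
    """
    if hasattr(source, "read"):
        lines = source.read().splitlines()
    else:
        with open(source) as fh:
            lines = fh.read().splitlines()

    # Marker-name line: "Frame#", "Time", then each marker name followed by blank fields
    names = [n.strip() for n in lines[3].split("\t") if n.strip()]
    marker_names = names[2:]
    columns = ["Frame", "Time"] + [f"{m}_{ax}" for m in marker_names for ax in "XYZ"]

    data = []
    for line in lines[5:]:                 # everything after the sub-header
        if not line.strip():
            continue                       # skip blank lines
        fields = line.split("\t")[: len(columns)]
        fields += [""] * (len(columns) - len(fields))   # pad short rows
        data.append([float(v) if v.strip() else float("nan") for v in fields])
    return pd.DataFrame(data, columns=columns)

# Tiny synthetic TRC text (hypothetical markers "Heel" and "Hip")
demo = "\n".join([
    "PathFileType\t4\t(X/Y/Z)\tdemo.trc",
    "DataRate\tCameraRate\tNumFrames\tNumMarkers\tUnits",
    "100\t100\t2\t2\tmm",
    "Frame#\tTime\tHeel\t\t\tHip\t\t",
    "\t\tX1\tY1\tZ1\tX2\tY2\tZ2",
    "",
    "1\t0.00\t1.0\t2.0\t3.0\t4.0\t5.0\t6.0",
    "2\t0.01\t1.1\t2.1\t3.1\t4.1\t5.1\t6.1",
])
df_demo = read_trc(io.StringIO(demo))
print(df_demo)
```

Real TRC exports sometimes differ in small ways (extra trailing tabs, missing blank line), so check the first few lines of your file before relying on this layout.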



# Optional – Load a single .c3d directly from a URL
from urllib.request import urlretrieve

# Replace with any direct .c3d URL you have:
example_c3d_url = "https://example.com/path/to/file.c3d"
local_c3d = "sample_data/online/input_url_file.c3d"

# Uncomment when you have a working URL
# urlretrieve(example_c3d_url, local_c3d)
# c3d_direct = ezc3d.c3d(local_c3d)
# print('Loaded from URL path:', local_c3d)


## Step 1 – Extract Time Signals

In [None]:

import numpy as np

point_rate = float(c3d["parameters"]["POINT"]["RATE"]["value"][0])
n_frames   = int(c3d["data"]["points"].shape[2])
assert point_rate > 0 and n_frames > 0

time = np.arange(n_frames, dtype=float) / point_rate
print("Time vector length:", len(time))
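
Force-plate (analog) channels are often sampled faster than markers, so they need their own time vector. The sketch below builds both from the two sampling rates. The rates and frame count are illustrative values, not read from a file. In a real C3D the analog rate is normally stored under `ANALOG:RATE`, which is worth confirming for your data.

```python
import numpy as np

def make_time_vectors(n_frames, point_rate, analog_rate):
    """Return (point_time, analog_time) in seconds, both starting at 0."""
    if point_rate <= 0 or analog_rate <= 0:
        raise ValueError("sampling rates must be positive")
    ratio = int(round(analog_rate / point_rate))   # analog samples per marker frame
    point_time = np.arange(n_frames) / point_rate
    analog_time = np.arange(n_frames * ratio) / analog_rate
    return point_time, analog_time

# Illustrative numbers: 5 marker frames at 100 Hz, analog at 1000 Hz
t_pts, t_ana = make_time_vectors(n_frames=5, point_rate=100.0, analog_rate=1000.0)
print(len(t_pts), len(t_ana))   # 5 50
```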


## Step 2 – Extract Marker Positions

In [None]:

import pandas as pd

marker_labels = list(c3d["parameters"]["POINT"]["LABELS"]["value"])
target_label  = "RFT3"   # change to your actual heel label

if target_label in marker_labels:
    heel_idx = marker_labels.index(target_label)
else:
    print(f"Warning: {target_label} not found. Using first marker:", marker_labels[0])
    heel_idx = 0
    target_label = marker_labels[0]

heel_mm = c3d["data"]["points"][:3, heel_idx, :].T
df_markers_mm = pd.DataFrame(heel_mm, columns=["Heel_X_mm", "Heel_Y_mm", "Heel_Z_mm"])
df_markers_mm.head()
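
Markers can drop out for a few frames when they are hidden from the cameras, and in many files those frames show up as NaN. A minimal sketch of filling short gaps by linear interpolation follows. The helper name `fill_short_gaps` and the `max_gap` default are assumptions for illustration, and the data is synthetic.

```python
import numpy as np
import pandas as pd

def fill_short_gaps(df, max_gap=5):
    """Linearly interpolate NaN runs that lie between valid samples.

    At most `max_gap` consecutive NaNs are filled per run; longer runs are
    only partially filled, and leading/trailing NaNs are left untouched.
    """
    return df.interpolate(method="linear", limit=max_gap, limit_area="inside")

# Synthetic heel height with a two-frame gap and a missing last frame
demo = pd.DataFrame({"Heel_Z_mm": [10.0, np.nan, np.nan, 40.0, np.nan]})
filled = fill_short_gaps(demo)
print(filled["Heel_Z_mm"].tolist())   # [10.0, 20.0, 30.0, 40.0, nan]
```

Filling gaps before filtering matters because a single NaN can spread through the whole filtered signal.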


## Step 3 – Extract Force Plate Channels

In [None]:

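analog_labels = list(c3d["parameters"]["ANALOG"]["LABELS"]["value"])
wanted = "FZ1"   # change to your vertical-force channel label

if wanted in analog_labels:
    fz_idx = analog_labels.index(wanted)
    analog = c3d["data"]["analogs"]              # shape: 1 x n_channels x n_analog_samples
    fz_raw = analog[0, fz_idx, :]
    # Analog data is usually sampled faster than markers,
    # so average the block of analog samples that belongs to each marker frame
    ratio = max(fz_raw.size // n_frames, 1)      # analog samples per marker frame
    fz_per_frame = fz_raw[:n_frames * ratio].reshape(-1, ratio).mean(axis=1)   # one value per frame
else:
    print(f"Warning: {wanted} not found. Available:", analog_labels[:10], "...")
    fz_per_frame = np.full(n_frames, np.nan, dtype=float)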

print("Fz length:", len(fz_per_frame))
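
When the analog rate is a multiple of the marker rate, "one value per frame" means averaging the block of analog samples that falls inside each marker frame. The sketch below shows that reshaping step on synthetic numbers. The helper name `average_per_frame` is illustrative.

```python
import numpy as np

def average_per_frame(analog_channel, n_frames):
    """Average a 1-D analog channel into n_frames equal-sized blocks."""
    analog_channel = np.asarray(analog_channel, dtype=float)
    ratio = analog_channel.size // n_frames        # analog samples per marker frame
    if ratio < 1:
        raise ValueError("fewer analog samples than marker frames")
    trimmed = analog_channel[: n_frames * ratio]   # drop any leftover samples
    return trimmed.reshape(n_frames, ratio).mean(axis=1)

# Synthetic channel: 3 marker frames x 4 analog samples each
samples = np.arange(12, dtype=float)
per_frame = average_per_frame(samples, n_frames=3)
print(per_frame)   # [1.5 5.5 9.5]
```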


## Step 4 – Normalize Units

In [None]:

g = 9.81
body_mass_kg = 80.0

df_markers_m = df_markers_mm.copy() / 1000.0
df_markers_m.columns = ["Heel_X", "Heel_Y", "Heel_Z"]

fz_bw = fz_per_frame / (body_mass_kg * g)
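
As a quick worked check of the two conversions: with a body mass of 80 kg and g = 9.81 m/s², one body weight is 80 × 9.81 = 784.8 N, so a 900 N vertical force is about 1.147 body weights. The helper names below are illustrative, not part of any library.

```python
G = 9.81  # m/s^2, same value as used in this step

def mm_to_m(values_mm):
    """Convert millimetres to metres (works on floats, arrays, and DataFrames)."""
    return values_mm / 1000.0

def newtons_to_bw(force_n, body_mass_kg, g=G):
    """Express a force in multiples of body weight."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return force_n / (body_mass_kg * g)

print(f"{mm_to_m(1250.0):.3f} m")               # 1.250 m
print(f"{newtons_to_bw(900.0, 80.0):.3f} BW")   # 1.147 BW
```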


## Step 5 – Filter Noise

In [None]:

from scipy.signal import butter, filtfilt

cutoff_hz = 6.0
nyq = point_rate / 2.0
b, a = butter(4, cutoff_hz / nyq, btype="low")

heel_filt = filtfilt(b, a, df_markers_m[["Heel_X","Heel_Y","Heel_Z"]].values, axis=0)
df_markers_filt = pd.DataFrame(heel_filt, columns=["Heel_X","Heel_Y","Heel_Z"])
df_markers_filt.head()
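
The same filter can be wrapped in a small function and tried on a synthetic signal to see what it does. `filtfilt` runs the filter forward and then backward, which cancels the phase lag a single pass would introduce. The function name `lowpass`, the 1 Hz test signal, and the noise level are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(data, cutoff_hz, fs_hz, order=4):
    """Zero-phase Butterworth low-pass filter applied along axis 0."""
    nyq = fs_hz / 2.0
    if not 0 < cutoff_hz < nyq:
        raise ValueError("cutoff must be between 0 and the Nyquist frequency")
    b, a = butter(order, cutoff_hz / nyq, btype="low")
    return filtfilt(b, a, data, axis=0)

# Synthetic test: a 1 Hz sine sampled at 100 Hz with added random noise
fs = 100.0
t = np.arange(0, 2, 1 / fs)
clean = np.sin(2 * np.pi * 1.0 * t)
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(t.size)

smooth = lowpass(noisy, cutoff_hz=6.0, fs_hz=fs)
err_noisy = np.std(noisy - clean)
err_smooth = np.std(smooth - clean)
print("error before:", err_noisy, "after:", err_smooth)
```

The input should not contain NaN values, since they propagate through the filter output.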


## Step 6 – Organize into a Clean DataFrame

In [None]:

def ensure_same_length(*arrays):
    L = min(len(a) for a in arrays)
    return [np.asarray(a)[:L] for a in arrays], L

(arrs, L) = ensure_same_length(
    time,
    df_markers_filt["Heel_X"].values,
    df_markers_filt["Heel_Y"].values,
    df_markers_filt["Heel_Z"].values,
    fz_bw
)
time_a, hx_a, hy_a, hz_a, fz_a = arrs
print("Aligned length:", L)

df = pd.DataFrame({
    "Time": time_a,
    "Heel_X": hx_a,
    "Heel_Y": hy_a,
    "Heel_Z": hz_a,
    "Fz_BW": fz_a
})
df.head()



---

## Notes & Tips for Beginners
- **Check labels**: Always print a slice of the marker and analog labels before indexing; they vary by lab.
- **Units**: Marker units come from the file (commonly mm; check `POINT:UNITS`), and analog channels may be in volts or newtons depending on the file. Convert early and be consistent.
- **Forces**: Decide whether you need per-subframe precision or just per-frame averages. For gait, per-frame averages are usually fine.
- **Filtering**: Use a Butterworth low-pass filter for kinematics; forces often use a higher cutoff (15–25 Hz for walking).  
- **Save your parsed DataFrame**: `df.to_csv("parsed_ch2.csv", index=False)` to reuse later.
- **Plot often**: Heel_Z should oscillate smoothly, and Fz_BW should peak near 1.0 for each leg in walking.
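
Several of these tips can be turned into a quick automated check after saving. The sketch below writes a small DataFrame to CSV, reloads it, and reports basic problems. The helper name `check_parsed` and the three-row demo values are illustrative. The column names match the DataFrame built in Step 6.

```python
import os
import tempfile
import numpy as np
import pandas as pd

EXPECTED_COLS = ["Time", "Heel_X", "Heel_Y", "Heel_Z", "Fz_BW"]

def check_parsed(df, expected_cols=EXPECTED_COLS):
    """Return a list of human-readable problems (empty list means no problems found)."""
    problems = []
    missing = [c for c in expected_cols if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if "Time" in df.columns and not df["Time"].is_monotonic_increasing:
        problems.append("Time is not monotonically increasing")
    nan_cols = [c for c in df.columns if df[c].isna().any()]
    if nan_cols:
        problems.append(f"NaN values in: {nan_cols}")
    return problems

# Synthetic stand-in for the parsed DataFrame
demo = pd.DataFrame({
    "Time":   [0.00, 0.01, 0.02],
    "Heel_X": [0.10, 0.11, 0.12],
    "Heel_Y": [0.50, 0.50, 0.51],
    "Heel_Z": [0.05, 0.06, 0.05],
    "Fz_BW":  [0.0, 0.6, 1.0],
})

# Save and reload in a temporary folder to confirm the round trip
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "parsed_ch2.csv")
    demo.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

print(check_parsed(reloaded))                        # []
print(np.allclose(demo.values, reloaded.values))     # True
```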

