# Section (e) — Fine-tune a pretrained Time-Series Foundation Model (MOMENT) on HAR

This notebook fine-tunes **MOMENT** (AutonLab/MOMENT-1-*) for **18-class activity classification** on your dataset, **separately for each sensor type** (Smartwatch vs Vicon), matching your existing project structure.

It follows your project’s key assumptions:
- Input is a multivariate 3-axis signal (x, y, z).
- Sequences are padded or truncated to fixed lengths (you used **3000** for Type1 (Vicon) and **1169** for Type2 (Smartwatch) in `only_1Dcnn.ipynb`).
- Min/max normalization uses the ranges defined in `models_utils/GLOBALS.py`.

---
## 0) One-time setup (IMPORTANT)

Your current `models_utils/GLOBALS.py` has **Windows absolute paths**. Update these two lines so they point to your local repo:
- `BASE_DIR`
- `files_directory`

Then restart the kernel and run again.
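
One way to avoid editing absolute paths on every machine is to derive them from an environment variable with a fallback. The following is a minimal sketch, assuming a hypothetical `HAR_PROJECT_ROOT` variable and the `data/train.csv` and `data/unlabeled/unlabeled` layout used in the next cell. The helper name `resolve_paths` is illustrative and is not part of `GLOBALS.py`.

```python
import os
from pathlib import Path


def resolve_paths(default_root: str = ".") -> dict:
    """Build project paths from HAR_PROJECT_ROOT (hypothetical variable), else default_root."""
    root = Path(os.environ.get("HAR_PROJECT_ROOT", default_root)).expanduser().resolve()
    data_dir = root / "data"
    return {
        "BASE_DIR": root,
        "TRAIN_CSV": data_dir / "train.csv",
        "files_directory": data_dir / "unlabeled" / "unlabeled",
    }


paths = resolve_paths()
for name, path in paths.items():
    # Report missing locations up front instead of failing later inside the dataset code
    print(f"{name}: {path} (exists: {path.exists()})")
```

With this pattern, only the environment variable changes between machines and the notebook cells stay identical.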


In [12]:
import os, sys

# Add Source Code folder to sys.path
BASE_DIR = r"C:\Users\husseien\Desktop\340915149_322754953\Source Code"

if BASE_DIR not in sys.path:
    sys.path.insert(0, BASE_DIR)

# Optional: change working directory (recommended)
if os.getcwd() != BASE_DIR:
    os.chdir(BASE_DIR)

# Now you can import setup_paths
import setup_paths  # this will run the code in setup_paths.py


import os

# Base data folder
DATA_DIR = r"C:\Users\husseien\Desktop\340915149_322754953\Source Code\data"

# Path to labeled train.csv
TRAIN_CSV = os.path.join(DATA_DIR, "train.csv")

# Path to folder containing all raw CSV files
FILES_DIRECTORY = os.path.join(DATA_DIR, "unlabeled", "unlabeled")

# Check
print("TRAIN_CSV exists:", os.path.exists(TRAIN_CSV))
print("FILES_DIRECTORY exists:", os.path.exists(FILES_DIRECTORY))
print("Example CSVs:", os.listdir(FILES_DIRECTORY)[:5])

# ===============================
# 1) Imports + paths
# ===============================
import os, sys
from pathlib import Path
import pandas as pd
import torch
from torch.utils.data import DataLoader, Subset
from sklearn.model_selection import train_test_split
from torch import nn
from tqdm.auto import tqdm

# If this notebook is inside your repo, set PROJECT_ROOT accordingly.
# Example: PROJECT_ROOT = Path(r"C:\Users\...\Source Code")
PROJECT_ROOT = Path.cwd()

# Ensure your repo modules are importable
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

# ---- Import your project globals AFTER you fix GLOBALS.py paths ----
from models_utils.GLOBALS import (
    device,
    TRAIN_CSV,
    files_directory,
    min_values_type1, max_values_type1,
    min_values_type2, max_values_type2,
    activity_id_mapping, id_activity_mapping
)

from models_utils.Datasets import TrainDataframeWithLabels
print("Using TRAIN_CSV:", TRAIN_CSV)
print("Using files_directory:", files_directory)
print("Device:", device)


TRAIN_CSV exists: True
FILES_DIRECTORY exists: True
Example CSVs: ['0.csv', '1.csv', '10.csv', '100.csv', '1000.csv']
Using TRAIN_CSV: C:\Users\husseien\Desktop\340915149_322754953\Source Code\data\train.csv
Using files_directory: C:\Users\husseien\Desktop\340915149_322754953\Source Code\data\unlabeled\unlabeled
Device: cuda


In [13]:

# ===============================
# 2) Install & load MOMENT
# ===============================
# MOMENT model card / usage: https://huggingface.co/AutonLab/MOMENT-1-large
# Install (run once):
# !pip install -q momentfm

from momentfm import MOMENTPipeline

def build_moment_classifier(num_classes: int, n_channels: int = 3, model_name: str = "AutonLab/MOMENT-1-small"):
    """Create a MOMENT classification pipeline and initialize it."""
    model = MOMENTPipeline.from_pretrained(
        model_name,
        model_kwargs={
            "task_name": "classification",
            "n_channels": n_channels,
            "num_class": num_classes,
        },
    )
    model.init()
    return model

NUM_CLASSES = len(activity_id_mapping)  # should be 18
print("NUM_CLASSES:", NUM_CLASSES)


NUM_CLASSES: 18


In [14]:

# ===============================
# 3) Data split exactly like your CNN training (stratified 80/20)
# ===============================
train_df = pd.read_csv(TRAIN_CSV).reset_index(drop=True)

# Your dataset has two sensor types; in your code, Type1 vs Type2 is determined by file shape.
# train.csv already has a 'sensor' column, so we reuse it here instead of inspecting each file.
smartwatch_df = train_df[train_df["sensor"] == "smartwatch"].reset_index(drop=True)
vicon_df      = train_df[train_df["sensor"] == "vicon"].reset_index(drop=True)

print("Rows smartwatch:", len(smartwatch_df))
print("Rows vicon:", len(vicon_df))

# Match your only_1Dcnn.ipynb constants
TARGET_SIZE_TYPE1 = 3000
TARGET_SIZE_TYPE2 = 1169


Rows smartwatch: 36186
Rows vicon: 14062


In [15]:

def make_split(df: pd.DataFrame, test_size=0.2, seed=42):
    labels = df["activity"].tolist()
    train_idx, val_idx = train_test_split(
        range(len(df)),
        test_size=test_size,
        stratify=labels,
        random_state=seed,
    )
    return list(train_idx), list(val_idx)

smart_train_idx, smart_val_idx = make_split(smartwatch_df, test_size=0.2, seed=42)
vic_train_idx, vic_val_idx     = make_split(vicon_df, test_size=0.2, seed=42)
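
To confirm that stratification preserves class proportions, a quick check on toy labels can help. This is a sketch on made-up data: `toy_labels` is hypothetical and not drawn from `train.csv`.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy, imbalanced labels (hypothetical): 80 samples of class "a", 20 of class "b"
toy_labels = ["a"] * 80 + ["b"] * 20

train_idx, val_idx = train_test_split(
    list(range(len(toy_labels))),
    test_size=0.2,
    stratify=toy_labels,
    random_state=42,
)

# Count how many samples of each class landed in each split
train_counts = Counter(toy_labels[i] for i in train_idx)
val_counts = Counter(toy_labels[i] for i in val_idx)
print("train:", dict(train_counts))
print("val:  ", dict(val_counts))
```

Both splits should show the same 4:1 class ratio as the full label list. The same check can be run on `smartwatch_df["activity"]` with the indices returned by `make_split`.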


In [16]:

# ===============================
# 4) Datasets + loaders
# ===============================
BATCH_SIZE = 32
NUM_WORKERS = 0  # set >0 if your OS supports it well
PIN_MEMORY = torch.cuda.is_available()

smartwatch_dataset = TrainDataframeWithLabels(smartwatch_df, data_type="2", max_sequence_length=TARGET_SIZE_TYPE2)
vicon_dataset      = TrainDataframeWithLabels(vicon_df,      data_type="1", max_sequence_length=TARGET_SIZE_TYPE1)

smart_train_loader = DataLoader(Subset(smartwatch_dataset, smart_train_idx), batch_size=BATCH_SIZE, shuffle=True,
                               num_workers=NUM_WORKERS, pin_memory=PIN_MEMORY)
smart_val_loader   = DataLoader(Subset(smartwatch_dataset, smart_val_idx), batch_size=BATCH_SIZE, shuffle=False,
                               num_workers=NUM_WORKERS, pin_memory=PIN_MEMORY)

vic_train_loader = DataLoader(Subset(vicon_dataset, vic_train_idx), batch_size=BATCH_SIZE, shuffle=True,
                              num_workers=NUM_WORKERS, pin_memory=PIN_MEMORY)
vic_val_loader   = DataLoader(Subset(vicon_dataset, vic_val_idx), batch_size=BATCH_SIZE, shuffle=False,
                              num_workers=NUM_WORKERS, pin_memory=PIN_MEMORY)

print("Smartwatch batches:", len(smart_train_loader), len(smart_val_loader))
print("Vicon batches:", len(vic_train_loader), len(vic_val_loader))


Smartwatch batches: 905 227
Vicon batches: 352 88


In [17]:

# ===============================
# 5) Normalization helpers (same as your code)
# ===============================
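def normalize_batch(x: torch.Tensor, data_type: str) -> torch.Tensor:
    """Min/max-normalize a batch of shape (bs, seq_len, 3) with the per-type global ranges."""
    if data_type == "1":
        lo, hi = min_values_type1, max_values_type1
    else:
        lo, hi = min_values_type2, max_values_type2
    # Match x's device and dtype; this is a no-op if the globals are already matching tensors.
    lo = torch.as_tensor(lo, dtype=x.dtype, device=x.device)
    hi = torch.as_tensor(hi, dtype=x.dtype, device=x.device)
    return (x - lo) / (hi - lo + 1e-6)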
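
The broadcasting in `normalize_batch` relies on the per-axis minima and maxima having shape `(3,)`, which is an assumption about `GLOBALS.py`. A small self-contained check with made-up ranges (`toy_min` and `toy_max` are hypothetical):

```python
import torch

# Hypothetical per-axis ranges; the real ones come from models_utils/GLOBALS.py
toy_min = torch.tensor([-2.0, -1.0, 0.0])
toy_max = torch.tensor([2.0, 1.0, 4.0])

# Batch of 2 sequences, 5 time steps, 3 axes, drawn uniformly inside the ranges
x = toy_min + (toy_max - toy_min) * torch.rand(2, 5, 3)

# (2, 5, 3) combined with (3,) broadcasts over the last (axis) dimension
x_norm = (x - toy_min) / (toy_max - toy_min + 1e-6)

print(x_norm.shape)                               # same shape as the input
print(float(x_norm.min()), float(x_norm.max()))   # values fall inside [0, 1]
```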


In [18]:

# ===============================
# 6) Training loop for MOMENT classification
# ===============================
def run_epoch(model, loader, data_type: str, optimizer=None):
    """Run one pass over `loader`; trains if an optimizer is given, otherwise evaluates."""
    is_train = optimizer is not None
    model.train(is_train)  # train mode with an optimizer, eval mode without

    criterion = nn.CrossEntropyLoss()
    total_loss = 0.0
    total = 0
    correct = 0

    pbar = tqdm(loader, desc="train" if is_train else "val", leave=False)
    for x, y in pbar:
        # Your dataset returns x as (seq_len, 3) per sample, so batch is (bs, seq_len, 3)
        x = x.to(device)
        y = y.to(device)

        x = normalize_batch(x, data_type)

        # The MOMENT model card documents channels-first input, (bs, channels, seq_len),
        # with a 512-step context, so longer sequences may need resampling or windowing first.
        # If your momentfm version expects a different layout, this is the ONLY line to adjust.
        x = x.permute(0, 2, 1).float()

        # Track gradients only while training; validation then uses less memory
        with torch.set_grad_enabled(is_train):
            logits = model(x_enc=x).logits  # (bs, num_classes)
            loss = criterion(logits, y)

        if is_train:
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        total_loss += loss.item() * y.size(0)
        total += y.size(0)
        correct += (logits.argmax(dim=1) == y).sum().item()

        pbar.set_postfix(loss=total_loss / max(total, 1), acc=100 * correct / max(total, 1))

    return total_loss / max(total, 1), correct / max(total, 1)


def finetune_moment(
    train_loader,
    val_loader,
    data_type: str,
    model_name: str,
    epochs: int = 10,
    lr: float = 3e-5,
    weight_decay: float = 0.01,
    freeze_backbone_epochs: int = 2,
    save_path: str = "moment_finetuned.pth",
):
    model = build_moment_classifier(num_classes=NUM_CLASSES, n_channels=3, model_name=model_name)
    model.to(device)

    # Two-phase training:
    # (1) linear-probe (freeze backbone) to stabilize
    # (2) unfreeze and fine-tune end-to-end
    def set_backbone_trainable(trainable: bool):
        # The classification head always stays trainable; all other parameters follow `trainable`.
        # The head is detected by parameter name. If your momentfm version names things differently,
        # print(model) and adjust the substrings below.
        for name, param in model.named_parameters():
            is_head = "head" in name.lower() or "classifier" in name.lower()
            param.requires_grad = trainable or is_head

    best_val_acc = 0.0
    best_state = None

    # Phase 1: freeze
    set_backbone_trainable(False)
    optimizer = torch.optim.AdamW(filter(lambda p: p.requires_grad, model.parameters()), lr=lr, weight_decay=weight_decay)

    for epoch in range(1, epochs + 1):
        if epoch == freeze_backbone_epochs + 1:
            set_backbone_trainable(True)
            optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)

        train_loss, train_acc = run_epoch(model, train_loader, data_type=data_type, optimizer=optimizer)
        val_loss, val_acc     = run_epoch(model, val_loader,   data_type=data_type, optimizer=None)

        print(f"Epoch {epoch:02d} | train loss {train_loss:.4f} acc {train_acc*100:.2f}% | "
              f"val loss {val_loss:.4f} acc {val_acc*100:.2f}%")

        if val_acc > best_val_acc:
            best_val_acc = val_acc
            best_state = {k: v.detach().cpu() for k, v in model.state_dict().items()}
            torch.save(best_state, save_path)
            print(f"  ✅ saved best to: {save_path} (val acc {best_val_acc*100:.2f}%)")

    return best_val_acc
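
The freeze-then-unfreeze logic can be checked on a toy module before running it on MOMENT. This is a sketch that assumes the classification parameters contain `head` in their names. The `ToyModel` class is hypothetical and only stands in for the real pipeline.

```python
import torch
from torch import nn


class ToyModel(nn.Module):
    """Hypothetical stand-in: an 'encoder' backbone plus a 'head' classifier."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 4)
        self.head = nn.Linear(4, 2)

    def forward(self, x):
        return self.head(torch.relu(self.encoder(x)))


def set_backbone_trainable(model: nn.Module, trainable: bool) -> None:
    # Head parameters always stay trainable; everything else follows the flag
    for name, param in model.named_parameters():
        param.requires_grad = trainable or "head" in name.lower()


def count_trainable(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


toy = ToyModel()
set_backbone_trainable(toy, False)
frozen_count = count_trainable(toy)  # head only: 4*2 weights + 2 biases
set_backbone_trainable(toy, True)
full_count = count_trainable(toy)    # encoder (8*4 + 4) plus the head
print(frozen_count, full_count)
```

If the frozen count came out as 0 on the real model, no parameter name matched the head substrings and the phase-1 optimizer would receive an empty parameter list.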


In [19]:

# ===============================
# 7) Run fine-tuning (choose model size)
# ===============================
# Options: "AutonLab/MOMENT-1-small", "AutonLab/MOMENT-1-base", "AutonLab/MOMENT-1-large"
# Start with small (faster), then try base if you have GPU memory.
MODEL_NAME = "AutonLab/MOMENT-1-small"

print("Fine-tuning on SMARTWATCH (Type2, len=1169)")
best_smart = finetune_moment(
    train_loader=smart_train_loader,
    val_loader=smart_val_loader,
    data_type="2",
    model_name=MODEL_NAME,
    epochs=10,
    lr=3e-5,
    freeze_backbone_epochs=2,
    save_path="moment_smartwatch_best.pth",
)

print("\nFine-tuning on VICON (Type1, len=3000)")
best_vicon = finetune_moment(
    train_loader=vic_train_loader,
    val_loader=vic_val_loader,
    data_type="1",
    model_name=MODEL_NAME,
    epochs=10,
    lr=3e-5,
    freeze_backbone_epochs=2,
    save_path="moment_vicon_best.pth",
)

print("\nBest val accuracies:")
print("  smartwatch:", best_smart)
print("  vicon     :", best_vicon)


Fine-tuning on SMARTWATCH (Type2, len=1169)



TypeError: new(): invalid data type 'str'
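
A `TypeError` of this form typically indicates that a Python string reached a tensor constructor, for example an activity name used as a label without first being mapped to an integer id. Whether that is the cause here depends on `TrainDataframeWithLabels`, which this notebook does not show. A minimal sketch of the conversion, with a hypothetical `toy_activity_ids` mapping standing in for `activity_id_mapping`:

```python
import torch

# Hypothetical mapping; the project's real one is activity_id_mapping in GLOBALS.py
toy_activity_ids = {"walking": 0, "sitting": 1, "running": 2}


def label_to_tensor(label) -> torch.Tensor:
    """Return a long tensor class id, mapping string labels through the lookup table."""
    if isinstance(label, str):
        label = toy_activity_ids[label]
    return torch.tensor(int(label), dtype=torch.long)


print(label_to_tensor("sitting"))  # string label mapped to its integer id
print(label_to_tensor(2))          # numeric label passed through unchanged
```

Printing one raw sample from the dataset (`smartwatch_dataset[0]`) is a quick way to see which field is still a string.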


## 8) Compare to sections (c–d)

To compare fairly, use the **same split** (the stratified indices computed in this notebook) and report:
- Validation Accuracy (or Macro-F1 if you used it in c–d)
- Training vs Validation curves (loss/acc)

**What to report in the writeup (Section e):**
1. **Baseline (c–d)**: paste the best validation accuracy you got with your best CNN/LSTM model.
2. **Pretrained MOMENT**:
   - Smartwatch best val acc = `best_smart`
   - Vicon best val acc      = `best_vicon`
3. A short explanation:
   - MOMENT may help most when labels are limited or classes are imbalanced, since pretraining provides representations that adapt with fewer labeled examples.
   - If it underperforms your CNN, likely causes are a small batch size, too few epochs, or an input layout that does not match what MOMENT expects.
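
On the input-layout point: the MOMENT model card describes input of shape `(batch, channels, 512)`, while this notebook's batches are `(batch, seq_len, 3)` with `seq_len` of 1169 or 3000. One possible adaptation is to transpose and linearly resample each sequence to 512 steps. This is a sketch of one option, not the only one, and the helper name `to_moment_layout` is hypothetical.

```python
import torch
import torch.nn.functional as F


def to_moment_layout(x: torch.Tensor, target_len: int = 512) -> torch.Tensor:
    """Convert (bs, seq_len, channels) to (bs, channels, target_len) by linear resampling."""
    x = x.permute(0, 2, 1).float()  # channels-first: (bs, channels, seq_len)
    if x.shape[-1] != target_len:
        # Linear interpolation along the time axis; this changes the effective sampling rate
        x = F.interpolate(x, size=target_len, mode="linear", align_corners=False)
    return x


batch = torch.randn(4, 1169, 3)  # dummy smartwatch-sized batch
out = to_moment_layout(batch)
print(out.shape)
```

Resampling compresses the whole recording into one context window. Cropping or sliding windows keep the original sampling rate instead, so it is worth comparing the options on the validation split.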

If you want, we can add:
- confusion matrix
- macro-F1
- a submission generator (like your `get_results` flow)
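
For the first two items, scikit-learn already provides the metrics. A minimal sketch on made-up labels follows. `y_true` and `y_pred` are illustrative here; in practice they would be collected over the validation loader.

```python
from sklearn.metrics import confusion_matrix, f1_score

# Hypothetical labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Macro-F1 averages per-class F1 scores, so rare classes count as much as common ones
macro_f1 = f1_score(y_true, y_pred, average="macro")

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

print(f"macro-F1: {macro_f1:.3f}")
print(cm)
```

For the 18-class task, passing `labels=list(range(NUM_CLASSES))` keeps the matrix shape fixed even if a class is missing from a split.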
