# PocaFoldAS training (CLI runner)


This notebook wraps the standard training script `scripts/run_training.py` and drives it from a config file; it does not generate synthetic data. Set the config path and experiment name, optionally enable Weights & Biases, then launch training.


## 1. Project root


In [6]:

import os
import pathlib

PROJECT_ROOT = None
for candidate in [pathlib.Path.cwd(), *pathlib.Path.cwd().parents]:
    if (candidate / 'setup.py').exists():
        PROJECT_ROOT = candidate
        break
if PROJECT_ROOT is None:
    raise RuntimeError('Could not locate the repository root. Please run this notebook from within the smlm project.')

os.chdir(PROJECT_ROOT)
print(f'Working directory set to: {PROJECT_ROOT}')



Working directory set to: /home/dim26fa/coding/coding/smlm


## 2. Select config and optional Weights & Biases
Set `USE_WANDB` to `True` to log the run to Weights & Biases. The source config is never modified: the notebook writes a copy under `artifacts/notebook_configs/` with `train.use_wandb` and `train.log_dir` set accordingly.


In [7]:
import os
import pathlib
import yaml

CONFIG_PATH = pathlib.Path('configs/config_demo_data.yaml')  # point to your training config
EXP_NAME = 'demo_notebook_run'
USE_WANDB = False  # toggle logging

def choose_log_dir():
    env_path = os.getenv("TRAIN_LOG_DIR")
    if env_path:
        return pathlib.Path(env_path)
    container_default = pathlib.Path('/workspace/smlm_logs')
    if container_default.parent.exists() and os.access(container_default.parent, os.W_OK):
        return container_default
    return pathlib.Path('output/notebook_logs')

LOG_DIR = choose_log_dir()

if not CONFIG_PATH.exists():
    raise FileNotFoundError(f'Config file not found: {CONFIG_PATH}')

# Make a copy with the desired wandb flag and log dir so we do not overwrite the source config.
# `or {}` guards against an empty YAML file, for which safe_load returns None.
cfg = yaml.safe_load(CONFIG_PATH.read_text()) or {}
cfg.setdefault('train', {})
cfg['train']['use_wandb'] = bool(USE_WANDB)
cfg['train']['log_dir'] = str(LOG_DIR)

# If using the bundled demo data, set root_folder/classes to match iso/aniso layout
root_folder = pathlib.Path(cfg.get('dataset', {}).get('root_folder', ''))
if root_folder.name == 'tetrahedron_seed1121_train' and (root_folder / 'iso').exists():
    cfg['dataset']['root_folder'] = str(root_folder.parent)
    cfg['dataset']['classes'] = [root_folder.name]

LOG_DIR.mkdir(parents=True, exist_ok=True)

copy_dir = pathlib.Path('artifacts/notebook_configs')
copy_dir.mkdir(parents=True, exist_ok=True)
CONFIG_COPY = copy_dir / CONFIG_PATH.name
CONFIG_COPY.write_text(yaml.safe_dump(cfg))
print('Using config copy:', CONFIG_COPY.resolve())
print('Log dir:', LOG_DIR.resolve())

if USE_WANDB:
    try:
        import wandb  # type: ignore
        wandb.login()
    except Exception as exc:  # pragma: no cover - convenience in notebook
        print(f'wandb login skipped: {exc}')


Using config copy: /home/dim26fa/coding/coding/smlm/artifacts/notebook_configs/config_demo_data.yaml
Log dir: /home/dim26fa/coding/coding/smlm/output/notebook_logs
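The override step in the cell above can also be isolated into a pure function on the parsed config dict, which makes it easy to confirm that the source config is left untouched. This is a minimal sketch, not part of the repository: the helper name `apply_notebook_overrides` and the `epochs` key in the sample dict are illustrative assumptions.

```python
import copy

def apply_notebook_overrides(cfg, use_wandb, log_dir):
    """Return a patched deep copy of cfg; the input dict is not mutated.

    Hypothetical helper mirroring the cell above: it sets
    train.use_wandb and train.log_dir and nothing else.
    """
    out = copy.deepcopy(cfg) if cfg else {}  # tolerate None or an empty config
    train = out.setdefault('train', {})
    train['use_wandb'] = bool(use_wandb)
    train['log_dir'] = str(log_dir)
    return out

# Illustrative values only; the 'epochs' key is an assumption, not the real schema.
source_cfg = {'train': {'use_wandb': True, 'epochs': 200}}
patched = apply_notebook_overrides(source_cfg, use_wandb=False,
                                   log_dir='output/notebook_logs')
print('patched:', patched['train'])
print('source :', source_cfg['train'])
```

Because the function works on a deep copy, nested sections such as `train` can be edited freely without leaking changes back into the dict that was loaded from disk.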


### Note on log folder in containers
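- The log dir is resolved in this order: `TRAIN_LOG_DIR` if it is set; otherwise `/workspace/smlm_logs` when `/workspace` exists and is writable (the container case); otherwise `output/notebook_logs` under the project root, as in the run above.
- To override the default, set `TRAIN_LOG_DIR` before running the notebook (e.g., `/tmp/smlm_logs`).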
- To persist logs on the host, bind-mount a host folder: `-v /host/logs:/workspace/smlm_logs` (or match your `TRAIN_LOG_DIR`).
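The log dir lookup can be sketched as a standalone function with the environment and both default paths passed in, so each branch can be exercised without touching the real environment. The name `resolve_log_dir` and its parameters are assumptions made for illustration; the logic follows `choose_log_dir` from the cell above.

```python
import os
import pathlib

def resolve_log_dir(env=None,
                    container_default='/workspace/smlm_logs',
                    fallback='output/notebook_logs'):
    """Pick a log dir: TRAIN_LOG_DIR, then the container default, then the fallback."""
    env = os.environ if env is None else env
    env_path = env.get('TRAIN_LOG_DIR')
    if env_path:
        # An explicit override always wins.
        return pathlib.Path(env_path)
    default = pathlib.Path(container_default)
    # Use the container path only if its parent exists and is writable.
    if default.parent.exists() and os.access(default.parent, os.W_OK):
        return default
    return pathlib.Path(fallback)

# Explicit override via the environment mapping.
print(resolve_log_dir(env={'TRAIN_LOG_DIR': '/tmp/smlm_logs'}))
# No override and no usable container parent: falls back to the relative path.
print(resolve_log_dir(env={}, container_default='/nonexistent_parent_dir_xyz/smlm_logs'))
```

Note that the fallback is a relative path, so it resolves against the current working directory, which section 1 sets to the project root.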



## 3. Launch training


In [8]:

import shlex
import subprocess
import sys

cmd = [
    sys.executable, 'scripts/run_training.py',
    '--config', str(CONFIG_COPY),
    '--exp_name', EXP_NAME,
    '--fixed_alpha', '0.001',
]
print('Running:', shlex.join(cmd))  # shlex.join requires Python 3.8+
result = subprocess.run(cmd, cwd=PROJECT_ROOT)
print('Exit code:', result.returncode)
if result.returncode != 0:
    raise RuntimeError('Training failed (see logs above).')



Running: /home/dim26fa/miniforge3/envs/smlmnew/bin/python scripts/run_training.py --config artifacts/notebook_configs/config_demo_data.yaml --exp_name demo_notebook_run --fixed_alpha 0.001
Loading Data...
Dataset loaded!
Device: cuda
Scheduler activated
Train Epoch [001/200]: L1 Chamfer Distance = 169.294159
Validate Epoch [001/200]: L1 Chamfer Distance = 173.639016
PCA model saved to output/notebook_logs/logs_pcn_20251219_002217/pca_model_epoch_1.joblib
Train Epoch [002/200]: L1 Chamfer Distance = 164.840607
Validate Epoch [002/200]: L1 Chamfer Distance = 270.886662
PCA model saved to output/notebook_logs/logs_pcn_20251219_002217/pca_model_epoch_2.joblib


Traceback (most recent call last):
  File "/home/dim26fa/coding/coding/smlm/scripts/run_training.py", line 27, in <module>
    main()
  File "/home/dim26fa/coding/coding/smlm/scripts/run_training.py", line 23, in main
    train(args.config, exp_name=args.exp_name, fixed_alpha=args.fixed_alpha)
  File "/home/dim26fa/coding/coding/smlm/scripts/train_pocafoldas.py", line 339, in train
    p, c, label = p.to(device), c.to(device), label.to(device)
KeyboardInterrupt


KeyboardInterrupt: 
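The per-epoch lines in the output above follow a regular pattern, so the Chamfer distances can be pulled out with a small regex. This is a minimal sketch that assumes the training output has been captured into a string (the launch cell above does not capture it) and that the line format stays exactly as printed in this run; `parse_chamfer_log` is a hypothetical helper, not part of the repository.

```python
import re

# Matches lines such as:
#   Train Epoch [001/200]: L1 Chamfer Distance = 169.294159
LINE_RE = re.compile(
    r'^(Train|Validate) Epoch \[(\d+)/(\d+)\]: L1 Chamfer Distance = ([0-9.]+)$'
)

def parse_chamfer_log(text):
    """Return (phase, epoch, distance) tuples for every matching log line."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            phase, epoch, _total, value = match.groups()
            records.append((phase, int(epoch), float(value)))
    return records

# Sample copied from the run above; non-matching lines are skipped.
sample = """Train Epoch [001/200]: L1 Chamfer Distance = 169.294159
Validate Epoch [001/200]: L1 Chamfer Distance = 173.639016
PCA model saved to output/notebook_logs/logs_pcn_20251219_002217/pca_model_epoch_1.joblib
Train Epoch [002/200]: L1 Chamfer Distance = 164.840607
Validate Epoch [002/200]: L1 Chamfer Distance = 270.886662
"""

for phase, epoch, value in parse_chamfer_log(sample):
    print(f'{phase:8s} epoch {epoch}: {value:.3f}')
```

On the two epochs logged before the run was interrupted, the training distance decreases while the validation distance increases, which is the kind of divergence this sort of parsing makes easy to spot over a longer run.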