# Path Templates

This tutorial covers the template system in `projio` for working with common file patterns like multi-file datasets.

## What are Templates?

Templates define patterns for groups of related files. For example, a "filtered matrix" template might include:

- `matrix.mtx` - the actual matrix data
- `barcodes.tsv.gz` - cell barcodes
- `features.tsv.gz` - gene features

Instead of managing these paths individually, templates let you resolve all of them at once.
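
As a rough mental model (not `projio`'s actual implementation), a template can be thought of as a mapping from logical names to relative file names that is joined onto one root directory in a single step. The helper `resolve_group` and the `/data/run1` root below are hypothetical and exist only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical file group using two of the files listed above.
FILE_GROUP = {
    "barcodes": "barcodes.tsv.gz",
    "features": "features.tsv.gz",
}


def resolve_group(root, pattern):
    """Join every relative file name in `pattern` onto `root`."""
    base = PurePosixPath(root)
    return {name: base / rel for name, rel in pattern.items()}


paths = resolve_group("/data/run1", FILE_GROUP)
for name, path in paths.items():
    print(f"{name}: {path}")
```

The point of the sketch is that callers ask for the group by name and receive every path together, rather than assembling each path by hand.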

## Built-in Templates

In [1]:
import tempfile
from project_io import ProjectIO

tmp = tempfile.mkdtemp()
io = ProjectIO(root=tmp, use_datestamp=False, auto_create=False)

# List registered templates
print("Registered templates:")
for name in io.templates:
    print(f"  - {name}")

Registered templates:
  - checkpoints
  - tensorboard
  - filtered_matrix
  - notebook_outputs


## Using the Filtered Matrix Template

The `filtered_matrix` template groups the files of a single-cell RNA-seq filtered matrix. `template_path` returns a mapping from logical file name to resolved path:

In [2]:
# Resolve the filtered_matrix template
paths = io.template_path('filtered_matrix')

print("Filtered matrix paths:")
for name, path in paths.items():
    print(f"  {name}: {path}")

Filtered matrix paths:
  barcodes: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/barcodes.tsv.gz
  matrix: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/matrix.mtx
  features: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/features.tsv.gz
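
Resolved paths may not exist on disk yet, so it can be useful to check a group before loading it. The following is a minimal sketch that uses a plain dictionary as a stand-in for the mapping returned by `template_path`; `missing_files` is a hypothetical helper, not part of `projio`.

```python
import tempfile
from pathlib import Path


def missing_files(paths):
    """Return the names whose resolved path does not exist on disk."""
    return sorted(name for name, path in paths.items() if not Path(path).exists())


# Stand-in for a resolved template: plain name -> path pairs.
root = Path(tempfile.mkdtemp())
paths = {
    "barcodes": root / "barcodes.tsv.gz",
    "features": root / "features.tsv.gz",
}
paths["barcodes"].touch()  # pretend only one file has been written so far

print(missing_files(paths))
```

Wrapping each value in `Path` keeps the helper usable whether the mapping holds strings or path objects.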


### With Subdirectory

In [3]:
# Place under a sample-specific subdirectory
paths = io.template_path('filtered_matrix', subdir='sample_001')

print("With subdirectory:")
for name, path in paths.items():
    print(f"  {name}: {path}")

With subdirectory:
  barcodes: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpx8s6hrol/barcodes.tsv.gz
  matrix: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpx8s6hrol/matrix.mtx
  features: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmpx8s6hrol/features.tsv.gz


### With Variant

Templates can have variants for different formats or versions:

In [3]:
# Request a specific variant. It only changes the result when the template's
# pattern references {variant}; for filtered_matrix the output below is unchanged
paths = io.template_path('filtered_matrix', variant='v3')

print("With variant:")
for name, path in paths.items():
    print(f"  {name}: {path}")

With variant:
  barcodes: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/barcodes.tsv.gz
  matrix: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/matrix.mtx
  features: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/features.tsv.gz


## TensorBoard Template

In [4]:
# TensorBoard template for run directories
# Note: tensorboard template requires a 'run' parameter
tb_path = io.template_path('tensorboard', run='experiment_1')

print(f"TensorBoard path: {tb_path}")

TensorBoard path: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/lightning/tensorboard/experiment_1
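
Unlike the mapping-based templates, this call returns a single path. One way to picture it, assuming the template is a sequence of path parts with a `{run}` placeholder (the actual built-in definition is not shown in this tutorial), is to format each part and join the results. `join_parts`, `tb_parts`, and the `/tmp/project` root are hypothetical.

```python
from pathlib import PurePosixPath


def join_parts(root, parts, **fields):
    """Format each part with `fields`, then join them into one path."""
    return PurePosixPath(root).joinpath(*(part.format(**fields) for part in parts))


# Hypothetical sequence pattern; the built-in definition may differ.
tb_parts = ["lightning", "tensorboard", "{run}"]

tb_path = join_parts("/tmp/project", tb_parts, run="experiment_1")
print(tb_path)

# In this sketch a missing placeholder value surfaces as a KeyError.
try:
    join_parts("/tmp/project", tb_parts)
except KeyError as exc:
    print(f"missing placeholder: {exc}")
```

The library may report a missing `run` differently; the sketch only shows why a value for it is needed.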


## Creating Custom Templates

You can register your own templates using `TemplateSpec`:

In [5]:
from project_io.funcs import TemplateSpec

# Define a template for model artifacts
model_template = TemplateSpec(
    name='model_artifacts',
    base='outputs',
    pattern={
        'weights': 'model/weights.pt',
        'config': 'model/config.json',
        'vocab': 'model/vocab.txt',
        'metrics': 'model/metrics.json'
    }
)

# Register it
io.register_template(model_template)

# Now use it
paths = io.template_path('model_artifacts')

print("Model artifacts:")
for name, path in paths.items():
    print(f"  {name}: {path}")

Model artifacts:
  weights: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/weights.pt
  config: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/config.json
  vocab: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/vocab.txt
  metrics: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/metrics.json


### Multiple Files with Mapping Patterns

Use mapping patterns when you need named paths for multiple files:

In [6]:
# Template with a mapping pattern for multiple files
# (sequence patterns are for path parts, mappings are for named files)
log_template = TemplateSpec(
    name='training_logs',
    base='logs',
    pattern={
        'train': 'train.log',
        'val': 'val.log',
        'test': 'test.log'
    }
)

io.register_template(log_template)

# Returns a dict of paths
paths = io.template_path('training_logs')

print("Training logs:")
for name, path in paths.items():
    print(f"  {name}: {path}")

Training logs:
  train: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/logs/train.log
  val: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/logs/val.log
  test: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/logs/test.log


### Format Placeholders

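Patterns can include Python format-style placeholders such as `{variant}` or `{epoch:03d}`. Their values are passed as keyword arguments to `template_path`: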

In [7]:
# Template with placeholders
epoch_template = TemplateSpec(
    name='epoch_checkpoint',
    base='lightning_root',
    pattern={
        'checkpoint': 'checkpoints/{variant}/epoch_{epoch:03d}.ckpt',
        'state': 'checkpoints/{variant}/epoch_{epoch:03d}_state.pt'
    }
)

io.register_template(epoch_template)

# Use with format kwargs
paths = io.template_path('epoch_checkpoint', variant='run_001', epoch=42)

print("Epoch checkpoint paths:")
for name, path in paths.items():
    print(f"  {name}: {path}")

Epoch checkpoint paths:
  checkpoint: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/lightning_root/checkpoints/run_001/epoch_042.ckpt
  state: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/lightning_root/checkpoints/run_001/epoch_042_state.pt
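
The placeholder syntax matches Python's format-string mini-language. Assuming the library formats patterns with `str.format` or something equivalent, the behaviour of `{epoch:03d}` can be checked in isolation:

```python
pattern = "checkpoints/{variant}/epoch_{epoch:03d}.ckpt"

padded = pattern.format(variant="run_001", epoch=42)
wide = pattern.format(variant="run_001", epoch=1234)  # width 3 is a minimum

print(padded)
print(wide)

# The `d` format code expects an integer, so a string epoch is rejected.
try:
    pattern.format(variant="run_001", epoch="42")
except ValueError as exc:
    print(f"ValueError: {exc}")
```

Zero-padding keeps checkpoint files in epoch order when listed alphabetically, as long as the epoch count stays within the chosen width.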


## Templates with Datestamps

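Templates respect the instance's datestamp configuration. With `use_datestamp=True` and `datestamp_in='dirs'`, a dated directory is inserted between the base directory and the pattern: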

In [8]:
from datetime import datetime

# Create a new IO instance with datestamp enabled
io_dated = ProjectIO(
    root=tmp,
    use_datestamp=True,
    datestamp_in='dirs',
    auto_create=False
)

# Define and register template for this instance
dated_template = TemplateSpec(
    name='dated_artifacts',
    base='outputs',
    pattern={
        'weights': 'model/weights.pt',
        'config': 'model/config.json',
    }
)
io_dated.register_template(dated_template)

# Pass a specific timestamp to get consistent datestamp in paths
fixed_time = datetime(2024, 3, 15)
paths = io_dated.template_path('dated_artifacts', timestamp=fixed_time)

print("With datestamp:")
for name, path in paths.items():
    print(f"  {name}: {path}")

With datestamp:
  weights: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/2024_03_15/model/weights.pt
  config: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/2024_03_15/model/config.json
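
The `2024_03_15` directory in this output is what `strftime` produces for the format `%Y_%m_%d`. Whether the library uses exactly that format string internally is an assumption inferred from the output; the sketch below only reproduces the visible layout with the standard library.

```python
from datetime import datetime
from pathlib import PurePosixPath

fixed_time = datetime(2024, 3, 15)
stamp = fixed_time.strftime("%Y_%m_%d")  # assumed format, inferred from the output

# The dated directory sits between the base directory and the pattern.
dated = PurePosixPath("outputs") / stamp / "model/weights.pt"
print(dated)
```

Passing a fixed `timestamp`, as the cell above does, makes the resolved paths reproducible across runs on different days.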


### Override Datestamp for Templates

In [9]:
# Skip datestamp for this template call
paths = io_dated.template_path('dated_artifacts', datestamp=False)

print("Without datestamp:")
for name, path in paths.items():
    print(f"  {name}: {path}")

Without datestamp:
  weights: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/weights.pt
  config: /private/var/folders/f7/7pcpvrhn0p9gw509gyzmh8fxrwyskv/T/tmp7zdhzm_k/outputs/model/config.json
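
Datestamp behaviour can be set in up to three places: on the instance (`use_datestamp`), on the template (the `datestamp` field of `TemplateSpec`), and on the individual call (`datestamp=`). The sketch below shows one plausible precedence rule, where the most specific explicit setting wins. The ordering between the template and call levels is an assumption, and `effective_datestamp` is a hypothetical function rather than library code.

```python
def effective_datestamp(instance_default, template_override=None, call_override=None):
    """Return the first explicit setting: per-call, then per-template, then instance."""
    for choice in (call_override, template_override):
        if choice is not None:  # None means "no opinion at this level"
            return choice
    return instance_default


# Instance has datestamps on and the call switches them off, as in the cell above.
print(effective_datestamp(True, call_override=False))
# Nothing overrides the instance default.
print(effective_datestamp(True))
```

Using `None` as the "inherit" value is what lets `False` act as an explicit opt-out rather than being mistaken for an unset option.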


## Template Specification Reference

The `TemplateSpec` dataclass has these fields:

| Field | Type | Description |
|-------|------|-------------|
| `name` | str | Template identifier |
| `base` | str | Base kind (outputs, inputs, lightning_root, etc.) |
| `pattern` | list or dict | File patterns to resolve |
| `description` | str | Human-readable description |
| `datestamp` | bool or None | Override datestamp behavior |
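
For orientation, the table can be mirrored as a small dataclass. This is an illustrative stand-in named `TemplateSpecSketch`, not the library's class. The default values for `description` and `datestamp` are assumptions, based only on the fact that the examples in this tutorial omit both fields.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence, Union


@dataclass(frozen=True)
class TemplateSpecSketch:
    """Illustrative stand-in that mirrors the fields in the table above."""

    name: str
    base: str
    pattern: Union[Sequence[str], Mapping[str, str]]
    description: str = ""  # assumed default
    datestamp: Optional[bool] = None  # None is assumed to mean "inherit"


spec = TemplateSpecSketch(
    name="training_logs",
    base="logs",
    pattern={"train": "train.log", "val": "val.log", "test": "test.log"},
)
print(spec.name, spec.base, spec.datestamp)
```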

## Best Practices

1. **Use templates for multi-file datasets** that are always used together

2. **Choose meaningful names** that describe what the template represents

3. **Use dict patterns** when you need to access individual files by key

4. **Use list patterns** when the template resolves to a single path built from path parts

5. **Include placeholders** for values that change between uses (like epoch numbers)
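
When a pattern contains placeholders, it helps to know which keyword arguments a caller must supply. Assuming patterns follow Python's format-string syntax, the standard library can list the field names. `placeholder_names` is a hypothetical helper, not part of `projio`.

```python
from string import Formatter


def placeholder_names(pattern):
    """Return the named fields used in a format-style pattern, in order."""
    return [field for _, field, _, _ in Formatter().parse(pattern) if field]


names = placeholder_names("checkpoints/{variant}/epoch_{epoch:03d}.ckpt")
print(names)
print(placeholder_names("model/weights.pt"))
```

A check like this can run when a template is defined, so that a missing value is caught before any path is resolved.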

## Next Steps

- Check out [advanced features](05_advanced.ipynb) like producer tracking, dry-run mode, and gitignore integration