# Flow SDK Frontends - Multiple Ways to Submit Tasks

Flow SDK provides multiple interfaces (frontends) for submitting tasks, catering to different workflows and user preferences. This notebook explores all available options.

## What You'll Learn

1. **Python API** - Direct programmatic interface
2. **YAML Configuration** - Declarative task definitions
3. **SLURM Adapter** - For HPC users familiar with SLURM
4. **CLI Commands** - Command-line interface
5. **Submitit Integration** - Facebook's job submission library
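6. **When to Use Each** - Choosing the right frontend
7. **Same Task, Different Frontends** - One training job implemented four ways
8. **Combining Frontends** - YAML templates customized from Python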

In [None]:
# Import required modules
import os
import tempfile

import yaml

from flow import Flow, TaskConfig

## 1. Python API - Direct Programmatic Interface

The Python API is the most flexible frontend: tasks are ordinary Python objects, so they can be generated, parameterized, and submitted from code.

In [None]:
# Basic Python API usage
config = TaskConfig(
    name="python-api-demo",
    instance_type="gpu.nvidia.t4",
    command="nvidia-smi && echo 'Task completed'",
)

# Submit task
with Flow() as flow:
    task = flow.run(config, wait=True)
    print(f"Task {task.id} completed with status: {task.status}")

In [None]:
# Advanced Python API features
# Dynamic task generation
def create_training_task(model_name, epochs, batch_size):
    """Create a training task with dynamic parameters."""
    return TaskConfig(
        name=f"train-{model_name}",
        instance_type="gpu.nvidia.a10g",
        environment={
            "MODEL_NAME": model_name,
            "EPOCHS": str(epochs),
            "BATCH_SIZE": str(batch_size),
        },
        volumes=[{"name": "model-checkpoints", "size_gb": 100, "mount_path": "/checkpoints"}],
        command="""
            echo "Training $MODEL_NAME for $EPOCHS epochs"
            echo "Batch size: $BATCH_SIZE"
            # python train.py --model $MODEL_NAME --epochs $EPOCHS --batch-size $BATCH_SIZE
        """,
    )


# Submit multiple tasks programmatically
models = [("resnet50", 10, 32), ("efficientnet", 20, 16), ("vit", 15, 8)]

with Flow() as flow:
    tasks = []
    for model, epochs, batch_size in models:
        config = create_training_task(model, epochs, batch_size)
        task = flow.run(config, wait=False)
        tasks.append(task)
        print(f"Submitted {model} training: {task.id}")

    # Wait for all tasks
    print("\nWaiting for all tasks to complete...")
    for task in tasks:
        task.wait()
        print(f"Task {task.id} completed with status: {task.status}")
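The cell above converts each environment value with `str()` by hand. A small helper can do the same for any set of parameters. This is a minimal sketch in plain Python that makes no calls into Flow SDK; the name `env_from_params` is hypothetical.

```python
def env_from_params(**params):
    """Turn keyword arguments into an environment mapping.

    Keys are upper-cased and values stringified, mirroring the manual
    str(epochs) / str(batch_size) conversions in the cell above.
    """
    env = {}
    for key, value in params.items():
        if value is None:
            continue  # skip unset parameters instead of exporting "None"
        env[key.upper()] = str(value)
    return env


env = env_from_params(model_name="resnet50", epochs=10, batch_size=32)
print(env)
```

The resulting dictionary has the same shape as the `environment` argument built in `create_training_task`.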

## 2. YAML Configuration - Declarative Task Definitions

YAML files provide a declarative way to define tasks, perfect for reproducibility and version control.

In [None]:
# Create a YAML configuration
yaml_config = """
name: yaml-demo-task
instance_type: gpu.nvidia.t4
max_price_per_hour: 5.0

# Environment variables
environment:
  CUDA_VISIBLE_DEVICES: "0"
  PYTHONPATH: "/workspace/src"
  LOG_LEVEL: "INFO"

# Storage volumes
volumes:
  - name: data-volume
    size_gb: 50
    mount_path: /data
  - name: output-volume
    size_gb: 20
    mount_path: /output

# Open ports for services
ports:
  - 8888  # Jupyter
  - 6006  # TensorBoard

# Task command
command: |
  echo "=== YAML Task Demo ==="
  echo "Instance: $(hostname)"
  echo "GPU: $(nvidia-smi --query-gpu=name --format=csv,noheader)"
  echo "Environment:"
  env | grep -E "^(CUDA_|PYTHON|LOG_)" | sort
  echo "Volumes:"
  df -h | grep -E "(Filesystem|/data|/output)"
"""

# Save to file
with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
    f.write(yaml_config)
    yaml_file = f.name

print(f"Created YAML config: {yaml_file}")
print("\nContent:")
print(yaml_config)

In [None]:
# Load and submit YAML configuration
from flow.models import TaskConfig

# Load the configuration from the file written in the previous cell
config = TaskConfig.from_yaml(yaml_file)
print("Loaded configuration:")
print(f"  Name: {config.name}")
print(f"  Instance: {config.instance_type}")
print(f"  Volumes: {len(config.volumes)}")
print(f"  Ports: {config.ports}")

# Submit the task
with Flow() as flow:
    task = flow.run(config, wait=True)
    print(f"\nTask completed: {task.status}")
    if task.status.value == "completed":
        print("\nOutput:")
        print(task.logs())

# Clean up
os.unlink(yaml_file)

In [None]:
# Advanced YAML with multiple task configurations
multi_task_yaml = """
# Base configuration shared by all tasks
defaults:
  instance_type: gpu.nvidia.a10g
  max_price_per_hour: 15.0
  volumes:
    - name: shared-data
      size_gb: 100
      mount_path: /data

# Task definitions
tasks:
  # Data preprocessing task
  - name: preprocess-data
    instance_type: cpu.large  # Override default
    command: |
      echo "Preprocessing data..."
      # python preprocess.py --input /data/raw --output /data/processed
  
  # Training task
  - name: train-model
    environment:
      MODEL_TYPE: transformer
      BATCH_SIZE: "32"
    command: |
      echo "Training $MODEL_TYPE model"
      # python train.py --data /data/processed --model $MODEL_TYPE
  
  # Evaluation task
  - name: evaluate-model
    command: |
      echo "Evaluating model performance"
      # python evaluate.py --model /data/model.pt --test /data/test
"""

# Parse multi-task YAML
config_dict = yaml.safe_load(multi_task_yaml)
print("Multi-task configuration:")
print(f"  Defaults: {list(config_dict['defaults'].keys())}")
print(f"  Tasks: {[t['name'] for t in config_dict['tasks']]}")
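The cell above only parses the multi-task YAML; it does not show how `defaults` would be combined with each entry under `tasks`. One possible merge is sketched below, assuming that task-level keys override defaults and that mappings such as `environment` are merged key by key. The `defaults`/`tasks` layout is this notebook's own convention and `apply_defaults` is a hypothetical helper, not part of Flow SDK. Dict literals stand in for the parsed YAML so the example runs without PyYAML.

```python
import copy


def apply_defaults(defaults, task):
    """Return a new task dict: defaults first, task-level keys override."""
    merged = copy.deepcopy(defaults)
    for key, value in task.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge mappings such as "environment" key by key
            merged[key] = {**merged[key], **value}
        else:
            # Scalars and lists (e.g. "volumes") are replaced wholesale
            merged[key] = copy.deepcopy(value)
    return merged


# Same shape as yaml.safe_load(multi_task_yaml), written out as literals
parsed = {
    "defaults": {
        "instance_type": "gpu.nvidia.a10g",
        "max_price_per_hour": 15.0,
        "volumes": [{"name": "shared-data", "size_gb": 100, "mount_path": "/data"}],
    },
    "tasks": [
        {"name": "preprocess-data", "instance_type": "cpu.large", "command": "echo preprocess"},
        {"name": "train-model", "environment": {"MODEL_TYPE": "transformer"}, "command": "echo train"},
    ],
}

resolved = [apply_defaults(parsed["defaults"], task) for task in parsed["tasks"]]
for task in resolved:
    print(task["name"], "->", task["instance_type"])
```

Each resolved dictionary could then be passed to `TaskConfig(**task)`, provided its keys match the fields `TaskConfig` accepts.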

## 3. SLURM Adapter - For HPC Users

The SLURM adapter allows users familiar with SLURM to use Flow SDK with minimal changes to their workflow.

In [None]:
# SLURM-style job script
slurm_script = """#!/bin/bash
#SBATCH --job-name=slurm-flow-demo
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --mem=32G
#SBATCH --time=01:00:00
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err

# Job commands
echo "SLURM Job ID: $SLURM_JOB_ID"
echo "Node: $SLURM_NODELIST"
echo "GPUs: $CUDA_VISIBLE_DEVICES"

# Load modules (simulated)
echo "Loading CUDA module..."
# module load cuda/11.8

# Run application
nvidia-smi
python3 -c "import torch; print(f'PyTorch CUDA: {torch.cuda.is_available()}')"
"""

# Convert SLURM script to Flow TaskConfig
from flow.frontends.slurm import SlurmAdapter

# Parse SLURM directives
adapter = SlurmAdapter()
config = adapter.parse_script(slurm_script)

print("Converted SLURM job to Flow config:")
print(f"  Name: {config.name}")
print(f"  CPUs: {config.cpu_count}")
print(f"  Memory: {config.memory_gb}GB")
print(f"  GPUs: {config.gpu_count}")
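To make the conversion less opaque, here is a rough sketch of how `#SBATCH` directives could be read from a script and mapped to CPU, memory, and GPU counts. It is not the implementation of `SlurmAdapter`; the function names `parse_sbatch`, `parse_mem_gb`, and `parse_gpu_count` are hypothetical, and only a few directive forms are handled.

```python
import re

# Matches "#SBATCH --key=value", "#SBATCH --key value" and bare "#SBATCH --flag"
SBATCH_RE = re.compile(r"^#SBATCH\s+--([\w-]+)(?:[=\s]+(.*\S))?\s*$")


def parse_sbatch(script):
    """Collect #SBATCH long-option directives into a dict of strings."""
    directives = {}
    for line in script.splitlines():
        match = SBATCH_RE.match(line.strip())
        if match:
            key, value = match.groups()
            directives[key] = value if value is not None else ""
    return directives


def parse_mem_gb(mem):
    """Convert a --mem value such as '32G' or '4096M' to gigabytes.

    SLURM treats a bare number as megabytes.
    """
    units = {"K": 1 / 1024**2, "M": 1 / 1024, "G": 1, "T": 1024}
    suffix = mem[-1].upper()
    if suffix in units:
        return float(mem[:-1]) * units[suffix]
    return float(mem) / 1024


def parse_gpu_count(gres):
    """Read the count from --gres values like 'gpu:1' or 'gpu:a10g:1'."""
    parts = gres.split(":")
    if parts[0] != "gpu":
        return 0
    return int(parts[-1]) if parts[-1].isdigit() else 1


script = """#!/bin/bash
#SBATCH --job-name=slurm-flow-demo
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --mem=32G
nvidia-smi
"""

directives = parse_sbatch(script)
print(directives["job-name"], int(directives["cpus-per-task"]),
      parse_mem_gb(directives["mem"]), parse_gpu_count(directives["gres"]))
```

Directives without a Flow counterpart, such as `--partition`, would have to be ignored or mapped by convention.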

In [None]:
# SLURM array jobs in Flow
array_script = """#!/bin/bash
#SBATCH --job-name=array-demo
#SBATCH --array=0-9
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8G

echo "Array Task ID: $SLURM_ARRAY_TASK_ID"
echo "Processing dataset chunk $SLURM_ARRAY_TASK_ID"

# Process different data chunks based on array ID
python process_chunk.py --chunk-id $SLURM_ARRAY_TASK_ID --total-chunks 10
"""

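# Emulate the SLURM array with one Flow task per index.
# array_script above is shown for reference only; it is not parsed here.
print("Converting SLURM array job to Flow tasks:\n")

# Base configuration shared by every array task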
base_config = TaskConfig(name="array-demo", instance_type="cpu.medium", cpu_count=2, memory_gb=8)

# Submit array tasks
with Flow() as flow:
    tasks = []
    for i in range(10):
        # Create task with array index
        config = base_config.model_copy()
        config.name = f"{base_config.name}-{i}"
        config.environment = {"ARRAY_TASK_ID": str(i)}
        config.command = f"""
            echo "Array Task ID: {i}"
            echo "Processing dataset chunk {i}"
            # python process_chunk.py --chunk-id {i} --total-chunks 10
        """

        task = flow.run(config, wait=False)
        tasks.append(task)
        print(f"Submitted array task {i}: {task.id}")
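The loop above hard-codes `range(10)` for `--array=0-9`. SLURM array specifications can also contain comma-separated lists, stepped ranges such as `0-15:4`, and a `%N` concurrency limit. The sketch below expands such a value into explicit indices that could drive the same loop; `expand_array_spec` is a hypothetical helper, not a Flow SDK function.

```python
def expand_array_spec(spec):
    """Expand a SLURM --array value into a sorted list of task indices.

    Handles comma-separated items, ranges ("0-9"), steps ("0-15:4") and
    ignores a trailing concurrency limit ("%4").
    """
    spec = spec.split("%", 1)[0]  # "%N" only throttles concurrency
    indices = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        bounds, _, step = item.partition(":")
        start, sep, end = bounds.partition("-")
        if sep:
            # Inclusive range, optionally with a step
            indices.update(range(int(start), int(end) + 1, int(step or 1)))
        else:
            indices.add(int(start))
    return sorted(indices)


for spec in ["0-9", "0-15:4", "1,3,5-7%2"]:
    print(spec, "->", expand_array_spec(spec))
```

Replacing `range(10)` with `expand_array_spec("0-9")` keeps the submission loop unchanged while following the array specification from the script.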

## 4. CLI Commands - Command Line Interface

Flow SDK provides a comprehensive CLI for users who prefer working from the terminal.

In [None]:
# Demonstrate CLI commands (executed via subprocess for notebook compatibility)
import subprocess

# Show available CLI commands
result = subprocess.run(["flow", "--help"], capture_output=True, text=True)
print("Flow CLI Commands:")
print(result.stdout[:1000])  # First 1000 chars

In [None]:
# Example CLI commands (kept in a raw string so the line-continuation backslashes are printed)
cli_examples = r"""
# Submit a simple task
flow run --instance-type gpu.nvidia.t4 --command "nvidia-smi"

# Submit from YAML file
flow submit config.yaml

# Submit with environment variables
flow run --instance-type cpu.large \
  --env MODEL=bert --env EPOCHS=10 \
  --command "python train.py"

# List running tasks
flow list --status running

# Get task logs
flow logs task-123

# Stream logs in real-time
flow logs task-123 --follow

# Cancel a task
flow cancel task-123

# Find available instances
flow instances --max-price 10 --gpu-type a100

# Manage volumes
flow volume create --size 100 --name ml-data
flow volume list
flow volume delete ml-data
"""

print("Common CLI Commands:")
print(cli_examples)

## 5. Submitit Integration - Facebook's Job Submission Library

For users already using Submitit, Flow SDK provides a compatible adapter.

In [None]:
# Submitit-style job submission
# Note: This is a conceptual example

submitit_example = """
import submitit
from flow.frontends.submitit import FlowExecutor

def train_model(learning_rate, batch_size):
    '''Function to be executed on remote instance.'''
    import torch
    print(f"Training with LR={learning_rate}, BS={batch_size}")
    # Training logic here
    return {"accuracy": 0.95, "loss": 0.05}

# Create Flow-backed executor
executor = FlowExecutor(folder="logs")
executor.update_parameters(
    instance_type="gpu.nvidia.a10g",
    timeout_min=60,
    cpus_per_task=4,
    gpus_per_node=1,
)

# Submit jobs
jobs = []
for lr in [0.001, 0.01, 0.1]:
    for bs in [16, 32, 64]:
        job = executor.submit(train_model, lr, bs)
        jobs.append(job)

# Wait for results
for job in jobs:
    result = job.result()
    print(f"Job {job.job_id}: {result}")
"""

print("Submitit Integration Example:")
print(submitit_example)
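The submit-then-collect pattern in the example can be rehearsed locally before any remote instance is involved. The sketch below uses `concurrent.futures.ThreadPoolExecutor` from the standard library purely as a stand-in: like the executor in the example, its `submit()` returns a handle whose `result()` blocks until the call finishes. The placeholder `train_model` only echoes its inputs, and nothing here touches Flow SDK or Submitit.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def train_model(learning_rate, batch_size):
    """Placeholder for the real training function: echoes its inputs."""
    return {"learning_rate": learning_rate, "batch_size": batch_size}


learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64]

# itertools.product replaces the nested for-loops over both lists
grid = list(itertools.product(learning_rates, batch_sizes))

# Local stand-in executor: submit() returns a future, result() waits for it
with ThreadPoolExecutor(max_workers=4) as executor:
    jobs = [executor.submit(train_model, lr, bs) for lr, bs in grid]
    results = [job.result() for job in jobs]

for (lr, bs), result in zip(grid, results):
    print(f"lr={lr}, bs={bs} -> {result}")
```

Collecting results in submission order keeps each result aligned with its position in `grid`, which makes it easy to tabulate a sweep afterwards.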

## 6. Comparing Frontends - When to Use Each

Let's compare the different frontends to help you choose the right one for your use case.

In [None]:
# Frontend comparison matrix
import pandas as pd

comparison_data = {
    "Frontend": ["Python API", "YAML", "SLURM", "CLI", "Submitit"],
    "Best For": [
        "Dynamic workflows, integration",
        "Reproducible experiments",
        "HPC users, batch jobs",
        "Quick tasks, scripting",
        "Hyperparameter sweeps",
    ],
    "Flexibility": ["High", "Medium", "Medium", "Low", "High"],
    "Learning Curve": ["Medium", "Low", "Low*", "Low", "Medium"],
    "Version Control": ["Code", "Excellent", "Good", "Scripts", "Code"],
    "Dynamic Config": ["Yes", "No", "Limited", "Limited", "Yes"],
}

df = pd.DataFrame(comparison_data)
print("Frontend Comparison:")
print(df.to_string(index=False))
print("\n* Low learning curve for users already familiar with SLURM")

In [None]:
# Decision tree for choosing a frontend
decision_guide = """
Choosing the Right Frontend:

1. Do you need dynamic task generation?
   YES → Python API or Submitit
   NO  → Continue to 2

2. Are you migrating from SLURM?
   YES → SLURM Adapter
   NO  → Continue to 3

3. Do you need version-controlled configs?
   YES → YAML
   NO  → Continue to 4

4. Are you running one-off tasks?
   YES → CLI
   NO  → Python API

Special Cases:
- Hyperparameter sweeps → Submitit
- CI/CD pipelines → YAML or CLI
- Jupyter notebooks → Python API
- Shell scripts → CLI
"""

print(decision_guide)
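The same guide can be written as a function, which is handy when a tool or template has to pick a frontend automatically. This is one reading of the guide: the questions are asked in the order listed, and the `sweep` flag, borrowed from the "Special Cases" list, breaks the tie between Python API and Submitit. The function name and parameters are hypothetical.

```python
def choose_frontend(dynamic=False, sweep=False, from_slurm=False,
                    versioned_configs=False, one_off=False):
    """Walk the four questions of the decision guide in order."""
    if dynamic:
        # Question 1: dynamic task generation; sweeps go to Submitit
        return "Submitit" if sweep else "Python API"
    if from_slurm:
        # Question 2: migrating from SLURM
        return "SLURM Adapter"
    if versioned_configs:
        # Question 3: version-controlled configs
        return "YAML"
    if one_off:
        # Question 4: one-off tasks
        return "CLI"
    return "Python API"


print(choose_frontend(from_slurm=True))
print(choose_frontend(dynamic=True, sweep=True))
print(choose_frontend(one_off=True))
```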

## 7. Real-World Examples - Same Task, Different Frontends

Let's implement the same ML training task using different frontends to see the differences.

In [None]:
# Task: Train a model with specific parameters
# Requirements: GPU instance, 50GB storage, environment variables

task_description = """
Task Requirements:
- Train a ResNet50 model
- Use A10G GPU
- 50GB storage for datasets
- Set learning rate to 0.001
- Run for 10 epochs
"""

print(task_description)
print("\nImplementations:")

In [None]:
# 1. Python API Implementation
print("1. Python API:")
print("-" * 50)

python_impl = """
from flow import Flow, TaskConfig

config = TaskConfig(
    name="train-resnet50",
    instance_type="gpu.nvidia.a10g",
    volumes=[{
        "name": "training-data",
        "size_gb": 50,
        "mount_path": "/data"
    }],
    environment={
        "MODEL": "resnet50",
        "LEARNING_RATE": "0.001",
        "EPOCHS": "10"
    },
    command="python train.py --model $MODEL --lr $LEARNING_RATE --epochs $EPOCHS"
)

with Flow() as flow:
    task = flow.run(config)
"""

print(python_impl)

In [None]:
# 2. YAML Implementation
print("\n2. YAML Configuration:")
print("-" * 50)

yaml_impl = """
# train-resnet50.yaml
name: train-resnet50
instance_type: gpu.nvidia.a10g

volumes:
  - name: training-data
    size_gb: 50
    mount_path: /data

environment:
  MODEL: resnet50
  LEARNING_RATE: "0.001"
  EPOCHS: "10"

command: |
  python train.py --model $MODEL --lr $LEARNING_RATE --epochs $EPOCHS
"""

print(yaml_impl)
print("\n# Submit with: flow submit train-resnet50.yaml")

In [None]:
# 3. SLURM Implementation
print("\n3. SLURM Script:")
print("-" * 50)

slurm_impl = """#!/bin/bash
#SBATCH --job-name=train-resnet50
#SBATCH --partition=gpu
#SBATCH --gres=gpu:a10g:1
#SBATCH --mem=32G
#SBATCH --time=02:00:00

export MODEL=resnet50
export LEARNING_RATE=0.001
export EPOCHS=10

# Note: Storage handled by Flow automatically
python train.py --model $MODEL --lr $LEARNING_RATE --epochs $EPOCHS
"""

print(slurm_impl)
print("\n# Submit with: flow slurm submit train.sh")

In [None]:
# 4. CLI Implementation
print("\n4. CLI Command:")
print("-" * 50)

# Raw string: the trailing backslashes and the \$ escapes are printed exactly as written
cli_impl = r"""
flow run \
  --name train-resnet50 \
  --instance-type gpu.nvidia.a10g \
  --volume training-data:50:/data \
  --env MODEL=resnet50 \
  --env LEARNING_RATE=0.001 \
  --env EPOCHS=10 \
  --command "python train.py --model \$MODEL --lr \$LEARNING_RATE --epochs \$EPOCHS"
"""

print(cli_impl)
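When the CLI is driven from Python rather than typed into a shell, passing an argument list to `subprocess.run` bypasses the shell entirely, so `$MODEL` reaches the CLI literally and needs no backslash. The sketch below assembles such a list using only flags that appear in this notebook's examples (`--name`, `--instance-type`, `--env`, `--command`); they have not been checked against the CLI itself, and `build_run_args` is a hypothetical helper.

```python
import shlex


def build_run_args(name, instance_type, command, env=None):
    """Assemble an argv list for `flow run` using flags shown in this notebook."""
    args = ["flow", "run", "--name", name, "--instance-type", instance_type]
    for key, value in (env or {}).items():
        args += ["--env", f"{key}={value}"]
    args += ["--command", command]
    return args


args = build_run_args(
    name="train-resnet50",
    instance_type="gpu.nvidia.a10g",
    env={"MODEL": "resnet50", "LEARNING_RATE": 0.001, "EPOCHS": 10},
    command="python train.py --model $MODEL --lr $LEARNING_RATE --epochs $EPOCHS",
)

# shlex.join quotes each argument so the line can be pasted into a POSIX shell
print(shlex.join(args))
# subprocess.run(args, check=True)  # would submit; requires the flow CLI
```

Printing the joined command first gives a quick way to review what would be submitted.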

## 8. Advanced Patterns - Combining Frontends

Frontends can be combined: for example, keep shared settings in a YAML template and apply per-experiment overrides from Python.

In [None]:
# Hybrid approach: YAML template + Python customization
base_yaml = """
name: base-training-template
instance_type: gpu.nvidia.a10g
max_price_per_hour: 20.0

volumes:
  - name: shared-data
    size_gb: 100
    mount_path: /data

environment:
  PYTHONPATH: /workspace/src
  CUDA_VISIBLE_DEVICES: "0"

command: python train.py
"""

# Load base config
base_config = yaml.safe_load(base_yaml)

# Customize for different experiments
experiments = [
    {"model": "resnet50", "lr": 0.001, "batch_size": 32},
    {"model": "efficientnet", "lr": 0.0001, "batch_size": 16},
    {"model": "vit", "lr": 0.0005, "batch_size": 8},
]

print("Hybrid Approach - Base + Customization:\n")

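from copy import deepcopy

for exp in experiments:
    # Deep-copy the template: dict.copy() is shallow, so updating
    # config_dict["environment"] would also modify base_config
    config_dict = deepcopy(base_config)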
    config_dict["name"] = f"train-{exp['model']}"
    config_dict["environment"].update(
        {
            "MODEL": exp["model"],
            "LEARNING_RATE": str(exp["lr"]),
            "BATCH_SIZE": str(exp["batch_size"]),
        }
    )

    # Convert to TaskConfig
    config = TaskConfig(**config_dict)
    print(f"Generated config for {exp['model']}:")
    print(f"  LR: {exp['lr']}, BS: {exp['batch_size']}")
    print()

## Summary and Best Practices

### Quick Reference Guide

| Use Case | Recommended Frontend | Example |
|----------|---------------------|----------|
| Quick test | CLI | `flow run --instance-type cpu.small --command "echo hello"` |
| Reproducible experiments | YAML | `flow submit experiment.yaml` |
| Dynamic workflows | Python API | `with Flow() as flow: flow.run(config)` |
| HPC migration | SLURM | `flow slurm submit job.sh` |
| Parameter sweeps | Submitit | `executor.submit(func, param1, param2)` |

### Best Practices

1. **Start Simple**: Use CLI for exploration, then move to YAML/Python
2. **Version Control**: Store YAML configs in git for reproducibility
3. **Modularize**: Use Python API for reusable components
4. **Document**: Include comments in YAML and docstrings in Python
5. **Test Locally**: Use the `--dry-run` flag to preview a task before submitting it

### Next Steps

- **Notebook 4**: Advanced Features - Catalogs, storage, multi-node
- **Notebook 5**: Real-World Examples - Complete ML workflows

Choose the frontend that best fits your workflow and start building!