# Amplifierd API - Collection Management

This notebook demonstrates collection discovery and management operations.

## Overview

Collections in amplifier provide:
- Profile manifests (schema v2 required)
- Agent definitions
- Context files
- Module packages

**Collection Registry**:
- Managed in `share/collections.yaml` (YAML format)
- Tracks installation metadata and timestamps
- Supports multiple source types

**Collection Source Formats**:

**Git Repositories** (with subdirectory support):
```
git+https://github.com/org/repo@main
git+https://github.com/org/repo@main#subdirectory=collections/foo
git+https://github.com/org/repo@v1.0.0
git+https://github.com/org/repo@commit-hash#subdirectory=path
```

**Local Paths**:
```
/absolute/path/to/collection               # Absolute path
./relative/path/to/collection              # Relative to daemon CWD
```

**HTTP URLs**:
```
https://example.com/path/to/collection
```
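To make the git source syntax concrete, here is a minimal sketch that splits a `git+URL@ref#subdirectory=path` string into its parts. The `GitSource` class and `parse_git_source` function are hypothetical names introduced for illustration; the daemon's own parser is not shown in this notebook and may behave differently (this sketch does not handle refs that contain `/`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GitSource:
    """Hypothetical container for the parts of a git source string."""

    url: str                     # repository URL without the git+ prefix
    ref: Optional[str]           # branch, tag, or commit; None if omitted
    subdirectory: Optional[str]  # path inside the repo; None for the full repo


def parse_git_source(source: str) -> GitSource:
    """Split 'git+URL@ref#subdirectory=path' into its parts (illustrative only)."""
    if not source.startswith("git+"):
        raise ValueError(f"not a git source: {source!r}")
    rest = source[len("git+"):]

    # The fragment after '#' carries the optional subdirectory.
    rest, _, fragment = rest.partition("#")
    subdirectory = None
    if fragment.startswith("subdirectory="):
        subdirectory = fragment[len("subdirectory="):] or None

    # Treat an '@' as the ref separator only if it comes after the last '/',
    # so a 'user@host' authority is not mistaken for a ref.
    # Limitation: refs that contain '/' are not recognised by this sketch.
    last_slash = rest.rfind("/")
    at = rest.rfind("@")
    if at > last_slash:
        url, ref = rest[:at], rest[at + 1:] or None
    else:
        url, ref = rest, None
    return GitSource(url=url, ref=ref, subdirectory=subdirectory)


parsed = parse_git_source("git+https://github.com/org/repo@main#subdirectory=collections/foo")
print(parsed)
```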

## Directory Structure

The registry, compiled profiles, and download caches all live under `$AMPLIFIERD_HOME`:

```
$AMPLIFIERD_HOME/
├── share/
│   ├── collections.yaml      # Registry with metadata
│   └── profiles/             # Compiled profiles
│       └── {collection}/
│           └── {profile}/
│               ├── {profile}.md
│               ├── agents/
│               ├── context/
│               └── orchestrator/
└── cache/
    ├── git/
    │   ├── {commit}/         # Full repo cache
    │   └── {commit}_{hash}/  # Subdirectory cache
    └── fsspec/
        └── http/             # HTTP URL cache
```
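The layout above can be expressed as a few path helpers, which is handy when inspecting the daemon's files from a notebook. This is a sketch: the function names are hypothetical, and the `home` argument is assumed to be the value of `$AMPLIFIERD_HOME`.

```python
from pathlib import Path


def registry_path(home: Path) -> Path:
    """Location of the collection registry."""
    return home / "share" / "collections.yaml"


def compiled_profile_dir(home: Path, collection: str, profile: str) -> Path:
    """Directory holding one compiled profile."""
    return home / "share" / "profiles" / collection / profile


def profile_manifest(home: Path, collection: str, profile: str) -> Path:
    """The manifest kept inside the compiled profile directory."""
    return compiled_profile_dir(home, collection, profile) / f"{profile}.md"


# Example root; in practice use Path(os.environ["AMPLIFIERD_HOME"]).
home = Path("/srv/amplifierd")
print(registry_path(home))
print(profile_manifest(home, "foundation", "base"))
```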

This notebook covers both **read operations** (discovery) and **write operations** (mounting/unmounting).

In [None]:
import json
from typing import Any

import requests

BASE_URL = "http://127.0.0.1:8420"
API_BASE = f"{BASE_URL}/api/v1"


def print_response(response: requests.Response, title: str = "") -> Any:
    """Print the status and body of a response; return the parsed JSON, or None."""
    if title:
        print(f"\n{'=' * 60}")
        print(f"{title}")
        print(f"{'=' * 60}")
    print(f"Status: {response.status_code} {response.reason}")
    if response.content:
        try:
            data = response.json()
            print(json.dumps(data, indent=2))
            return data
        except ValueError:
            # requests raises a ValueError subclass when the body is not valid JSON
            print(response.text)
            return None
    return None


print("✓ Setup complete")

## Discovery Operations

### List All Collections

Get all registered collections with their metadata:

In [None]:
response = requests.get(f"{API_BASE}/collections/")
collections = print_response(response, "LIST COLLECTIONS")

if response.ok and collections:
    print(f"\n✓ Found {len(collections)} collection(s)")
    for collection in collections:
        print(f"\n  - {collection['id']}")
        print(f"    Source: {collection['source']}")
        print(f"    Installed: {collection.get('installed_at', 'unknown')}")
        if collection.get("profiles_count"):
            print(f"    Profiles: {collection['profiles_count']}")

### Get Collection Details

Retrieve detailed information about a specific collection:

In [None]:
# Get details for a collection (change ID as needed)
if collections:
    collection_id = collections[0]["id"]

    response = requests.get(f"{API_BASE}/collections/{collection_id}")
    details = print_response(response, f"GET COLLECTION: {collection_id}")

    if details:
        print(f"\n✓ Collection: {details['id']}")
        print(f"  Source: {details['source']}")
        print(f"  Installed: {details.get('installed_at', 'unknown')}")

        # Show discovered profiles
        profiles = details.get("profiles", [])
        print(f"\n  Profiles discovered: {len(profiles)}")
        for profile in profiles[:5]:  # Show first 5
            print(f"    - {profile}")

        # Show cache info if available
        if "cache_path" in details:
            print(f"\n  Cache: {details['cache_path']}")
else:
    print("No collections available. Try mounting a collection first.")

## Write Operations

### Mount a Collection

Register and mount a new collection from a source:

In [None]:
# Mount a collection - Examples:
# Git repo: "git+https://github.com/org/repo@main"
# Git subdirectory: "git+https://github.com/org/repo@main#subdirectory=collections/foo"
# Local path: "/path/to/collection" or "./relative/path"
# HTTP URL: "https://example.com/collections/mycollection"

collection_source = "git+https://github.com/example/collections@main#subdirectory=foundation"
collection_id = "example-foundation"

response = requests.post(f"{API_BASE}/collections/{collection_id}/mount", json={"source": collection_source})
result = print_response(response, f"MOUNT COLLECTION: {collection_id}")

if response.ok:
    print(f"\n✓ Collection mounted: {result['id']}")
    print(f"  Source: {result['source']}")
    if result.get("profiles_discovered"):
        print(f"  Profiles discovered: {result['profiles_discovered']}")
elif response.status_code == 409:
    print(f"\nℹ Collection '{collection_id}' already mounted")
else:
    print(f"\n✗ Failed to mount collection: {response.status_code}")

### Understanding Collection Sources

Collections support multiple source formats with automatic type detection:

**Git with Subdirectory**:
```python
# Full repo
source = "git+https://github.com/org/repo@main"

# Specific subdirectory (more efficient)
source = "git+https://github.com/org/repo@main#subdirectory=collections/foundation"

# Specific version
source = "git+https://github.com/org/repo@v1.0.0#subdirectory=collections/core"
```

**Local Paths**:
```python
# Absolute path
source = "/home/user/my-collection"

# Relative to daemon CWD
source = "./local-collections/foo"
```

**HTTP URLs**:
```python
source = "https://example.com/collections/mycollection"
```
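The type detection mentioned above can be approximated from the string prefix alone. The sketch below is an assumption about how such detection might work, based only on the formats listed in this notebook; `classify_source` is a hypothetical helper, not part of the amplifierd API.

```python
def classify_source(source: str) -> str:
    """Guess the source type from its prefix (illustrative, not the daemon's logic)."""
    if source.startswith("git+"):
        return "git"
    if source.startswith(("http://", "https://")):
        return "http"
    if source.startswith(("/", "./")):
        return "local"
    raise ValueError(f"unrecognised collection source: {source!r}")


for example in (
    "git+https://github.com/org/repo@main",
    "https://example.com/collections/mycollection",
    "/home/user/my-collection",
    "./local-collections/foo",
):
    print(f"{classify_source(example):5}  {example}")
```

The `git+` check comes first because a git source also contains `https://` after its prefix.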

In [None]:
# Demonstrate different source types
source_examples = {
    "Git full repo": {
        "source": "git+https://github.com/org/repo@main",
        "cache": "cache/git/{commit}/",
        "notes": "Clones entire repository",
    },
    "Git subdirectory": {
        "source": "git+https://github.com/org/repo@main#subdirectory=collections/core",
        "cache": "cache/git/{commit}_{hash}/",
        "notes": "Only fetches specified subdirectory (sparse checkout)",
    },
    "Local absolute": {
        "source": "/home/user/collections/local",
        "cache": "None (used directly)",
        "notes": "Direct access to local filesystem",
    },
    "Local relative": {
        "source": "./my-collection",
        "cache": "None (used directly)",
        "notes": "Relative to daemon working directory",
    },
    "HTTP URL": {
        "source": "https://example.com/collections/remote",
        "cache": "cache/fsspec/http/",
        "notes": "Downloads and caches from HTTP(S)",
    },
}

print("Collection Source Types:\n")
for source_type, info in source_examples.items():
    print(f"{source_type}:")
    print(f"  Source: {info['source']}")
    print(f"  Cache: {info['cache']}")
    print(f"  Notes: {info['notes']}")
    print()

### Unmount a Collection

Remove a collection from the registry:

In [None]:
# Unmount a collection (change ID as needed)
collection_to_remove = "example-foundation"

response = requests.delete(f"{API_BASE}/collections/{collection_to_remove}")
result = print_response(response, f"UNMOUNT COLLECTION: {collection_to_remove}")

if response.ok:
    print(f"\n✓ Collection unmounted: {collection_to_remove}")
    print("  Note: Compiled profiles remain until cleanup")
elif response.status_code == 404:
    print(f"\n✗ Collection '{collection_to_remove}' not found")
else:
    print(f"\n✗ Failed to unmount collection: {response.status_code}")

## Git Subdirectory Support

Collections support efficient git subdirectory mounting using sparse checkout:

**Syntax**: `git+URL@ref#subdirectory=path`

**Benefits**:
- Only downloads specified subdirectory
- Faster cloning for large repositories
- Reduced cache storage
- Multiple collections from same repo

**Example Structure**:
```
github.com/org/mono-repo@main
├── collections/
│   ├── foundation/     # git+...@main#subdirectory=collections/foundation
│   ├── developer/      # git+...@main#subdirectory=collections/developer
│   └── enterprise/     # git+...@main#subdirectory=collections/enterprise
├── docs/
└── other/
```

**Cache Behavior**:
- Full repo: `cache/git/{commit}/`
- Subdirectory: `cache/git/{commit}_{subdirectory_hash}/`
- Commit hash ensures immutable caching
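To illustrate the cache naming scheme, the sketch below derives a directory name from a commit and an optional subdirectory. How the daemon computes `{subdirectory_hash}` is not documented here, so the choice of a truncated SHA-256 of the path is purely an assumption, and `cache_dir_name` is a hypothetical helper.

```python
import hashlib
from typing import Optional


def cache_dir_name(commit: str, subdirectory: Optional[str] = None) -> str:
    """Return '{commit}' for a full repo or '{commit}_{hash}' for a subdirectory."""
    if subdirectory is None:
        return commit
    # ASSUMPTION: the hash is the first 8 hex digits of SHA-256 over the
    # subdirectory path with leading/trailing slashes removed.
    normalized = subdirectory.strip("/")
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:8]
    return f"{commit}_{digest}"


print(cache_dir_name("0123abcd"))
print(cache_dir_name("0123abcd", "collections/foundation"))
```

Because the name depends only on the commit and the path, mounting the same subdirectory at the same commit always maps to the same cache directory.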

In [None]:
# Example: Mount multiple collections from same repo
mono_repo_url = "git+https://github.com/org/mono-repo@main"

collections_to_mount = [
    {"id": "foundation", "source": f"{mono_repo_url}#subdirectory=collections/foundation"},
    {"id": "developer", "source": f"{mono_repo_url}#subdirectory=collections/developer"},
    {"id": "enterprise", "source": f"{mono_repo_url}#subdirectory=collections/enterprise"},
]

print("Example: Multiple collections from mono-repo\n")
for coll in collections_to_mount:
    print(f"Collection: {coll['id']}")
    print(f"  Source: {coll['source']}")
    print("  Cache: cache/git/{commit}_{hash}/")
    print()

## Profile Auto-Discovery and Compilation

When a collection is mounted or synced:

**Auto-Discovery**:
1. Scans collection for `profiles/*.md` files
2. Validates schema version (must be v2)
3. Registers discovered profiles

**Auto-Compilation**:
1. Resolves all refs (agents, context, modules)
2. Downloads and caches assets
3. Creates complete profile structure
4. Preserves original manifest

**Profile Structure After Compilation**:
```
share/profiles/{collection}/{profile}/
  {profile}.md        # Original manifest preserved
  agents/             # Resolved agent files
    researcher.md
    coder.md
  context/            # Resolved context directories
    README.md
    guidelines.md
  orchestrator/       # Orchestrator module
  tools/              # Tool modules
  providers/          # Provider modules
  hooks/              # Hook modules
```
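The auto-discovery steps above can be sketched with the standard library: scan `profiles/*.md`, read the frontmatter, and keep only manifests that declare schema version 2. This is an illustration, not the daemon's implementation; the function names are hypothetical, and the frontmatter is assumed to be a leading block delimited by `---` lines.

```python
import re
from pathlib import Path
from typing import Optional

_SCHEMA_LINE = re.compile(r"^schema_version:\s*(\d+)\s*$", re.MULTILINE)


def read_schema_version(manifest: Path) -> Optional[int]:
    """Return schema_version from the leading '---' frontmatter block, if present."""
    text = manifest.read_text(encoding="utf-8")
    if not text.startswith("---"):
        return None
    # Frontmatter is the text between the first two '---' delimiters.
    parts = text.split("---", 2)
    if len(parts) < 3:
        return None
    match = _SCHEMA_LINE.search(parts[1])
    return int(match.group(1)) if match else None


def discover_profiles(collection_dir: Path) -> list[str]:
    """Names of profiles/*.md manifests that declare schema_version 2."""
    profiles_dir = collection_dir / "profiles"
    return sorted(
        path.stem
        for path in profiles_dir.glob("*.md")
        if read_schema_version(path) == 2
    )
```

Calling `discover_profiles(Path("/home/user/my-collection"))` on a local checkout is a quick way to preview which profiles a mount would be expected to register.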

## Ref Resolution and Caching

Each kind of profile ref is resolved differently and maps to its own cache location:

In [None]:
# Ref resolution examples
ref_resolution = {
    "Git refs": {
        "format": "git+URL@ref#subdirectory=path",
        "cache": "cache/git/{commit}/ or cache/git/{commit}_{hash}/",
        "examples": [
            "git+https://github.com/org/repo@main",
            "git+https://github.com/org/repo@main#subdirectory=agents/",
            "git+https://github.com/org/repo@v1.0.0#subdirectory=context/",
        ],
    },
    "HTTP refs": {
        "format": "https://... or http://...",
        "cache": "cache/fsspec/http/",
        "examples": ["https://raw.githubusercontent.com/org/repo/main/agent.md", "https://example.com/assets/context/"],
    },
    "Local refs": {
        "format": "/absolute or ./relative",
        "cache": "None (used directly)",
        "examples": ["/home/user/agents/researcher.md", "./local-context/"],
    },
}

print("Ref Resolution and Caching:\n")
for ref_type, info in ref_resolution.items():
    print(f"{ref_type}:")
    print(f"  Format: {info['format']}")
    print(f"  Cache: {info['cache']}")
    print("  Examples:")
    for ex in info["examples"]:
        print(f"    - {ex}")
    print()

## Complete Collection Workflow

The function below chains the operations covered so far: list the registry, mount a collection, then inspect what is mounted:

In [None]:
def collection_workflow():
    """Complete workflow: list, mount, and explore collections."""

    # 1. List current collections
    print("1. Listing current collections...")
    response = requests.get(f"{API_BASE}/collections/")
    if not response.ok:
        print("✗ Failed to list collections")
        return

    collections = response.json()
    print(f"✓ Found {len(collections)} collection(s)")

    # 2. Mount a new collection (returns 409 if it is already mounted)
    print("\n2. Mounting example collection...")
    response = requests.post(
        f"{API_BASE}/collections/example/mount", json={"source": "git+https://github.com/example/collections@main"}
    )
    if response.ok:
        result = response.json()
        print(f"✓ Collection mounted: {result['id']}")
        print(f"  Profiles discovered: {result.get('profiles_discovered', 0)}")
        # Refresh the list so that step 3 also sees the new collection
        refreshed = requests.get(f"{API_BASE}/collections/")
        if refreshed.ok:
            collections = refreshed.json()
    elif response.status_code == 409:
        print("ℹ Collection already mounted")
    else:
        print(f"✗ Failed to mount collection: {response.status_code}")

    # 3. Explore collections
    print("\n3. Exploring collections...")
    for collection in collections[:3]:  # First 3
        coll_id = collection["id"]
        response = requests.get(f"{API_BASE}/collections/{coll_id}")
        if response.ok:
            details = response.json()
            profiles = details.get("profiles", [])
            print(f"\n  {coll_id}:")
            print(f"    Source: {details['source']}")
            print(f"    Profiles: {len(profiles)}")
            if profiles:
                print(f"    Example: {profiles[0]}")

    print("\n✓ Workflow complete")


collection_workflow()

## Collections Registry Format

The `share/collections.yaml` file tracks mounted collections:

```yaml
collections:
  foundation:
    source: git+https://github.com/org/repo@main#subdirectory=collections/foundation
    installed_at: '2025-11-25T15:36:00'
  
  developer:
    source: git+https://github.com/org/repo@main#subdirectory=collections/developer
    installed_at: '2025-11-25T15:40:00'
  
  local-collection:
    source: /home/user/my-collection
    installed_at: '2025-11-25T16:00:00'
```

**Key Features**:
- Simple YAML structure
- Tracks installation timestamp
- Supports all source types
- No duplicate entries (each collection ID is a unique mapping key)
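Once the registry is loaded into a dictionary (for example with PyYAML's `yaml.safe_load`), its entries are easy to work with. The sketch below uses a dict literal that mirrors the sample above rather than reading the file, and `collections_by_install_time` is a hypothetical helper.

```python
from datetime import datetime

# Mirrors the sample registry above, as it might look after loading the YAML.
registry = {
    "collections": {
        "foundation": {
            "source": "git+https://github.com/org/repo@main#subdirectory=collections/foundation",
            "installed_at": "2025-11-25T15:36:00",
        },
        "developer": {
            "source": "git+https://github.com/org/repo@main#subdirectory=collections/developer",
            "installed_at": "2025-11-25T15:40:00",
        },
        "local-collection": {
            "source": "/home/user/my-collection",
            "installed_at": "2025-11-25T16:00:00",
        },
    }
}


def collections_by_install_time(registry: dict) -> list[tuple[str, datetime]]:
    """Return (collection_id, installed_at) pairs, oldest first."""
    entries = registry.get("collections") or {}
    pairs = (
        (collection_id, datetime.fromisoformat(meta["installed_at"]))
        for collection_id, meta in entries.items()
    )
    return sorted(pairs, key=lambda pair: pair[1])


for collection_id, installed_at in collections_by_install_time(registry):
    print(f"{installed_at:%Y-%m-%d %H:%M}  {collection_id}")
```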

## Summary

### Discovery Operations
- ✓ List all registered collections
- ✓ Get collection details with metadata
- ✓ View discovered profiles

### Write Operations
- ✓ Mount collections from various sources
- ✓ Unmount collections
- ✓ Auto-discovery of profiles on mount
- ✓ Auto-compilation of schema v2 profiles

## API Endpoints Reference

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/api/v1/collections/` | List all collections |
| GET | `/api/v1/collections/{id}` | Get collection details |
| POST | `/api/v1/collections/{id}/mount` | Mount a new collection |
| DELETE | `/api/v1/collections/{id}` | Unmount collection |
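The endpoints in the table can be wrapped in small URL builders so that collection IDs are always encoded consistently. This is a sketch with hypothetical helper names; `API_BASE` repeats the value from the setup cell.

```python
from urllib.parse import quote

API_BASE = "http://127.0.0.1:8420/api/v1"  # same value as in the setup cell


def collections_url() -> str:
    """URL for listing all collections (GET)."""
    return f"{API_BASE}/collections/"


def collection_url(collection_id: str) -> str:
    """URL for one collection (GET for details, DELETE to unmount)."""
    # Percent-encode the ID so unusual characters cannot alter the path.
    return f"{API_BASE}/collections/{quote(collection_id, safe='')}"


def mount_url(collection_id: str) -> str:
    """URL for mounting a collection (POST with a JSON body containing 'source')."""
    return f"{collection_url(collection_id)}/mount"


print(collections_url())
print(collection_url("example-foundation"))
print(mount_url("example-foundation"))
```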

## Important Notes

### Schema v2 Profiles Required

The daemon only accepts schema v2 profiles:
- Must have `schema_version: 2` in frontmatter
- No `extends` field; profiles must be fully resolved
- Agents given as a list of refs
- Context given as a list of refs
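The rules above can be checked before mounting with a small validator over already-parsed frontmatter. This is a sketch under stated assumptions: the key names `agents` and `context` are inferred from the list above, missing keys are treated as empty lists, and `validate_profile_frontmatter` is a hypothetical helper rather than the daemon's validator.

```python
def validate_profile_frontmatter(frontmatter: dict) -> list[str]:
    """Return a list of problems; an empty list means the rules above are met."""
    problems = []
    if frontmatter.get("schema_version") != 2:
        problems.append("schema_version must be 2")
    if "extends" in frontmatter:
        problems.append("'extends' is not allowed; profiles must be fully resolved")
    # ASSUMPTION: the keys are named 'agents' and 'context' and may be omitted.
    for key in ("agents", "context"):
        value = frontmatter.get(key, [])
        if not isinstance(value, list) or not all(isinstance(ref, str) for ref in value):
            problems.append(f"'{key}' must be a list of refs")
    return problems


print(validate_profile_frontmatter({"schema_version": 2, "agents": ["./agents/researcher.md"]}))
print(validate_profile_frontmatter({"schema_version": 1, "extends": "base"}))
```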

### Git Subdirectory Support

Use `#subdirectory=path` for efficient mounting:
- Only clones specified subdirectory
- Faster for large repositories
- Multiple collections from same repo
- Separate caching per subdirectory

### Caching Strategy

**Git sources**:
- Cached by commit hash (immutable)
- `cache/git/{commit}/` for full repo
- `cache/git/{commit}_{hash}/` for subdirectory

**HTTP sources**:
- Cached in `cache/fsspec/http/`
- Content-based caching

**Local sources**:
- No caching (used directly)
- Changes reflected immediately

### Configuration Files

Collection registry:
- `share/collections.yaml` - YAML-based registry

Compiled profiles:
- `share/profiles/{collection}/{profile}/` - Complete profile structures

Cache locations:
- `cache/git/` - Git repository cache
- `cache/fsspec/http/` - HTTP URL cache

## Troubleshooting

### Collection Not Discovering Profiles

If a mounted collection shows no profiles:
1. Check that profile files have `schema_version: 2` in frontmatter
2. Ensure profiles are `*.md` files in `profiles/` directory
3. Verify source path is correct (especially subdirectory)
4. Check daemon logs for validation errors

### Git Collection Mounting Fails

If mounting a git collection fails:
1. Verify git URL is accessible (test with `git clone`)
2. Check that ref (branch/tag/commit) exists
3. For subdirectory: verify path exists in repo
4. Check network connectivity and authentication
5. Review daemon logs for git errors
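Steps 1 and 2 can be checked from Python with `git ls-remote`, which contacts the remote without cloning it. The helper names below are hypothetical, and the function expects the bare repository URL (without `@ref` or `#subdirectory=...`). Note that `ls-remote` lists branches and tags, so an arbitrary commit hash will generally not match.

```python
import subprocess


def ls_remote_command(repo_url: str, ref: str) -> list[str]:
    """Build the git command that checks whether `ref` exists on the remote."""
    # Drop the 'git+' prefix used in collection sources; git itself does not accept it.
    url = repo_url[len("git+"):] if repo_url.startswith("git+") else repo_url
    # --exit-code makes git return a non-zero status when no ref matches.
    return ["git", "ls-remote", "--exit-code", url, ref]


def ref_exists(repo_url: str, ref: str, timeout: float = 30.0) -> bool:
    """True if the remote is reachable and advertises a matching branch or tag."""
    result = subprocess.run(
        ls_remote_command(repo_url, ref),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0


print(" ".join(ls_remote_command("git+https://github.com/org/repo", "main")))
```

A `False` result does not distinguish between a missing ref, a network problem, and an authentication failure; running the printed command in a terminal shows git's own error message.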

### Local Collection Not Found

If local collection mounting fails:
1. Verify path exists and is accessible
2. Check permissions (daemon must read collection directory)
3. For relative paths: verify daemon working directory
4. Ensure collection has proper structure (profiles/, agents/, etc.)
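The checks above can be run as a quick preflight before mounting a local path. This sketch only verifies what the list describes (existence, read access, and a `profiles/` directory); `check_local_collection` is a hypothetical helper. Relative paths are resolved against this notebook's working directory, which may differ from the daemon's.

```python
import os
from pathlib import Path


def check_local_collection(path: str) -> list[str]:
    """Return a list of problems found with a local collection path."""
    root = Path(path).expanduser()
    if not root.is_absolute():
        # Resolved against the current process, not the daemon's working directory.
        root = Path.cwd() / root
    if not root.is_dir():
        return [f"{root} does not exist or is not a directory"]

    problems = []
    if not os.access(root, os.R_OK | os.X_OK):
        problems.append(f"{root} is not readable by the current user")
    if not (root / "profiles").is_dir():
        problems.append(f"{root} has no profiles/ directory")
    return problems


print(check_local_collection("./my-collection"))
```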

## Next Steps

Continue to:
- **03-profile-management.ipynb** - Profile discovery and activation
- **05-module-management.ipynb** - Module discovery