# Configuration

RadiObject provides a unified configuration system with two levels of access:

| Prefix | Purpose | Functions |
|--------|---------|----------|
| `radi_*` | RadiObject configuration | `radi_cfg()`, `radi_reset()` |
| `tdb_*` | TileDB internals | `tdb_ctx()`, `tdb_cfg()` |

**Key concepts:**
- **Write settings** (tile, compression, orientation) are **immutable after ingestion**
- **Read settings** (memory, concurrency) affect all read operations
- **S3 settings** configure cloud storage access

This notebook covers:
- Basic configuration with `configure()`
- Inspecting configuration with `radi_cfg()`
- Resetting configuration with `radi_reset()`
- Accessing TileDB internals with `tdb_ctx()` and `tdb_cfg()`
- Tile orientation strategies
- Compression options
- Read and S3 settings
- Configuration recipes for common workloads

**Prerequisites:** [01_radi_object.ipynb](./01_radi_object.ipynb)

## API Overview

```python
from radiobject import (
    # Configuration function
    configure,
    
    # RadiObject config (radi_* prefix)
    radi_cfg,    # Get current RadiObject configuration
    radi_reset,  # Reset configuration to defaults
    
    # TileDB internals (tdb_* prefix)
    tdb_ctx,     # Get TileDB context (lazy, cached)
    tdb_cfg,     # Get raw TileDB Config object
    
    # Configuration classes
    WriteConfig, ReadConfig, S3Config,
    TileConfig, CompressionConfig, OrientationConfig,
    
    # Enums
    SliceOrientation, Compressor,
)
```

In [None]:
import os
import shutil
import tempfile
import time
from pathlib import Path

import numpy as np

from radiobject import (
    CompressionConfig,
    Compressor,
    ReadConfig,
    S3Config,
    SliceOrientation,
    TileConfig,
    WriteConfig,
    configure,
    radi_cfg,
    radi_reset,
    tdb_cfg,
    tdb_ctx,
)
from radiobject.data import get_brats_uri
from radiobject.volume import Volume

BRATS_URI = get_brats_uri()
TEMP_DIR = tempfile.mkdtemp(prefix="config_tutorial_")
print(f"Working directory: {TEMP_DIR}")

## Basic Configuration

Use `configure()` to set RadiObject configuration. Settings are organized into three categories:

| Category | Class | Affects |
|----------|-------|--------|
| `write=` | `WriteConfig` | New arrays only (immutable after creation) |
| `read=` | `ReadConfig` | All read operations |
| `s3=` | `S3Config` | Cloud storage access |

In [None]:
# Configure for axial slice access with ZSTD compression
configure(
    write=WriteConfig(
        tile=TileConfig(orientation=SliceOrientation.AXIAL),
        compression=CompressionConfig(algorithm=Compressor.ZSTD, level=3),
    )
)

# View current configuration using radi_cfg()
config = radi_cfg()
print(f"Tile orientation: {config.write.tile.orientation}")
print(f"Compression: {config.write.compression.algorithm}, level={config.write.compression.level}")

## Inspecting Configuration: `radi_cfg()`

`radi_cfg()` returns the current `RadiObjectConfig` object with all settings:

In [None]:
config = radi_cfg()

# Explore the configuration structure
print("=== Write Settings (immutable after ingestion) ===")
print(f"  Tile orientation: {config.write.tile.orientation.value}")
print(
    f"  Compression: {config.write.compression.algorithm.value} (level {config.write.compression.level})"
)
print(f"  Auto-detect orientation: {config.write.orientation.auto_detect}")
print(f"  Reorient on load: {config.write.orientation.reorient_on_load}")

print("\n=== Read Settings (affects all reads) ===")
print(f"  Memory budget: {config.read.memory_budget_mb} MB")
print(f"  Concurrency: {config.read.concurrency} threads")
print(f"  Max workers: {config.read.max_workers}")

print("\n=== S3 Settings ===")
print(f"  Region: {config.s3.region}")
print(f"  Max parallel ops: {config.s3.max_parallel_ops}")
print(f"  Multipart size: {config.s3.multipart_part_size_mb} MB")

## Resetting Configuration: `radi_reset()`

`radi_reset()` restores all settings to defaults and clears the cached TileDB context:

In [None]:
# Make some changes
configure(
    write=WriteConfig(tile=TileConfig(orientation=SliceOrientation.ISOTROPIC)),
    read=ReadConfig(memory_budget_mb=2048, concurrency=8),
)
print(
    f"Before reset: tile={radi_cfg().write.tile.orientation.value}, memory={radi_cfg().read.memory_budget_mb}MB"
)

# Reset to defaults
radi_reset()
print(
    f"After reset:  tile={radi_cfg().write.tile.orientation.value}, memory={radi_cfg().read.memory_budget_mb}MB"
)

## TileDB Internals: `tdb_ctx()` and `tdb_cfg()`

For advanced use cases, access the underlying TileDB objects:

| Function | Returns | Use Case |
|----------|---------|----------|
| `tdb_ctx()` | `tiledb.Ctx` | Pass to TileDB operations, inspect thread pools |
| `tdb_cfg()` | `tiledb.Config` | Inspect/modify raw TileDB parameters |

In [None]:
# Get the TileDB context (lazily created, cached)
ctx = tdb_ctx()
print(f"TileDB Context: {type(ctx).__name__}")

# The context is cached - same object on repeated calls
ctx2 = tdb_ctx()
print(f"Same context object: {ctx is ctx2}")

In [None]:
# Get the raw TileDB Config object
tdb_config = tdb_cfg()

# Inspect TileDB parameters directly
print("=== TileDB Configuration Parameters ===")
params = [
    "sm.compute_concurrency_level",
    "sm.io_concurrency_level",
    "sm.memory_budget",
    "vfs.s3.region",
    "vfs.s3.max_parallel_ops",
]
for param in params:
    try:
        value = tdb_config[param]
        print(f"  {param}: {value}")
    except KeyError:
        print(f"  {param}: (not set)")
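
The lazy, cached behavior of `tdb_ctx()` and the cache clearing done by `radi_reset()` can be pictured with a small stand-alone model. This is only a sketch of the general pattern, not RadiObject's implementation: `FakeContext`, `get_ctx`, `reset`, and the `_DEFAULTS` dictionary are hypothetical names invented for illustration.

```python
from typing import Optional


class FakeContext:
    """Hypothetical stand-in for an expensive-to-build object such as a TileDB context."""

    def __init__(self, settings: dict) -> None:
        self.settings = dict(settings)


# Hypothetical defaults, used only by this sketch
_DEFAULTS = {"memory_budget_mb": 1024, "concurrency": 4}
_settings: dict = dict(_DEFAULTS)
_cached_ctx: Optional[FakeContext] = None


def get_ctx() -> FakeContext:
    """Build the context on first use, then keep returning the same object."""
    global _cached_ctx
    if _cached_ctx is None:
        _cached_ctx = FakeContext(_settings)
    return _cached_ctx


def reset() -> None:
    """Restore default settings and drop the cached context."""
    global _cached_ctx
    _settings.clear()
    _settings.update(_DEFAULTS)
    _cached_ctx = None


first = get_ctx()
print(f"Same object on repeated calls: {get_ctx() is first}")
reset()
print(f"New object after reset: {get_ctx() is not first}")
```

Under this pattern, a reference obtained before a reset keeps pointing at the old object, so it is usually safer to call the accessor again than to hold on to a context across a reset.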

### Advanced: Direct TileDB Context Creation

For specialized scenarios (isolated testing, custom thread pools), create contexts directly:

In [None]:
import tiledb

# Create a custom TileDB context with specific settings
custom_config = tiledb.Config(
    {
        "sm.compute_concurrency_level": "2",
        "sm.io_concurrency_level": "2",
        "sm.memory_budget": str(512 * 1024 * 1024),  # 512 MB
    }
)
custom_ctx = tiledb.Ctx(custom_config)

# Use custom context with RadiObject classes
test_data = np.random.randn(64, 64, 32).astype(np.float32)
uri = str(Path(TEMP_DIR) / "custom_ctx_vol")

# Volume.from_numpy accepts a ctx parameter for custom context injection
vol = Volume.from_numpy(uri, test_data, ctx=custom_ctx)
print(f"Created volume with custom context: {vol}")

## Tile Orientations

Choose tile orientation based on your primary access pattern:

| Orientation | Tile Shape (240 x 240 x 155 volume) | Best For |
|-------------|------------|----------|
| `AXIAL` | 240 x 240 x 1 | Slice-by-slice viewing (radiology review) |
| `SAGITTAL` | 1 x 240 x 155 | Sagittal plane analysis |
| `CORONAL` | 240 x 1 x 155 | Coronal plane analysis |
| `ISOTROPIC` | 64 x 64 x 64 | 3D ROI extraction (ML training) |

```
    AXIAL (XY slices)          ISOTROPIC (64^3 cubes)
    +--------------+           +--------------+
    |==============| Z=0       | +--+--+--+   |
    |==============| Z=1       | +--+--+--+   |
    |==============| Z=2       | +--+--+--+   |
    |      ...     |           |    ...       |
    +--------------+           +--------------+
    
    Reading Z=77:              Reading 64^3 ROI:
    Reads 1 tile (fast)        Reads 1-8 tiles
```
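
The tile counts in the diagram follow from simple grid arithmetic. The sketch below counts how many tiles a read region intersects, assuming a regular tile grid and the tile shapes listed in the table above for a 240 x 240 x 155 volume. `tiles_touched` is a hypothetical helper written for this illustration, not part of RadiObject.

```python
from math import prod


def tiles_touched(tile, lo, hi):
    """Count tiles intersected by the half-open box [lo, hi) on a regular grid."""
    return prod((h - 1) // t - l // t + 1 for t, l, h in zip(tile, lo, hi))


shape = (240, 240, 155)
# Tile shapes taken from the orientation table (assumed, not queried from the library)
layouts = {
    "axial": (240, 240, 1),
    "sagittal": (1, 240, 155),
    "isotropic": (64, 64, 64),
}

# One full axial slice at Z=77
slice_lo, slice_hi = (0, 0, 77), (shape[0], shape[1], 78)
for name, tile in layouts.items():
    count = tiles_touched(tile, slice_lo, slice_hi)
    print(f"{name:10s}: axial slice Z=77 touches {count} tile(s)")

# A 64^3 ROI on the isotropic layout, aligned and unaligned to the tile grid
iso = layouts["isotropic"]
aligned = tiles_touched(iso, (0, 0, 0), (64, 64, 64))
unaligned = tiles_touched(iso, (10, 10, 10), (74, 74, 74))
print(f"isotropic : 64^3 ROI touches {aligned} (aligned) to {unaligned} (unaligned) tiles")
```

The same axial slice that needs one axial tile crosses 16 isotropic tiles and 240 sagittal tiles, which is why the layout should match the dominant access pattern.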

In [None]:
# See how tile extents are computed for different orientations
shape = (240, 240, 155)

for orient in SliceOrientation:
    tile_cfg = TileConfig(orientation=orient)
    extents = tile_cfg.extents_for_shape(shape)
    print(f"{orient.value:10s} -> tile extents: {extents}")

In [None]:
# Create test volumes with different tile orientations
test_data = np.random.randn(240, 240, 155).astype(np.float32)

volumes = {}
for orient in [SliceOrientation.AXIAL, SliceOrientation.SAGITTAL, SliceOrientation.ISOTROPIC]:
    configure(write=WriteConfig(tile=TileConfig(orientation=orient)))
    uri = str(Path(TEMP_DIR) / f"vol_{orient.value}")
    volumes[orient.value] = Volume.from_numpy(uri, test_data)
    print(f"Created {orient.value}: {volumes[orient.value]}")

In [None]:
# Benchmark: read axial slices from each volume
z_indices = range(0, 150, 3)  # Every 3rd slice
n_reads = len(z_indices)  # Derived from the loop so the printed count cannot drift
results = {}

for name, vol in volumes.items():
    start = time.perf_counter()
    for z in z_indices:
        _ = vol.axial(z)
    elapsed = time.perf_counter() - start
    results[name] = elapsed
    print(f"{name:10s}: {elapsed*1000:.1f}ms for {n_reads} axial reads")

print(f"\nAxial-tiled is {results['isotropic']/results['axial']:.1f}x faster for axial reads")
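
A single timed pass like the one above can be skewed by cold caches or background activity. One common remedy is to warm up once and report the median of several runs. The helper below is a generic sketch using only the standard library. `median_time` and `fake_read` are hypothetical names, and `fake_read` is a placeholder workload rather than a real volume read.

```python
import statistics
import time
from typing import Callable


def median_time(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock time of fn() in seconds over `repeats` runs."""
    for _ in range(warmup):
        fn()  # Warm caches so the first timed run is not an outlier
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_read() -> int:
    """Placeholder workload standing in for a batch of slice reads."""
    return sum(range(10_000))


elapsed = median_time(fake_read)
print(f"median: {elapsed * 1e3:.3f} ms")
```

In this notebook, the workload could be a lambda such as `lambda: [vol.axial(z) for z in range(0, 150, 3)]`, timed once per volume.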

## Compression Options

| Compressor | Speed | Ratio | Best For |
|------------|-------|-------|----------|
| `NONE` | Fastest | 1.0x | Debugging, uncompressed formats |
| `LZ4` | Very fast | 1.0-1.2x | Real-time processing |
| `ZSTD` | Fast | 1.1-1.3x | **Default** - balanced |
| `GZIP` | Slower | 1.1-1.4x | Archival, max compression |

In [None]:
# Compare on-disk sizes. Gaussian noise is close to incompressible, so expect ratios near 1.0x here
test_data = np.random.randn(120, 120, 60).astype(np.float32)
uncompressed_size = test_data.nbytes

for compressor in [Compressor.NONE, Compressor.LZ4, Compressor.ZSTD, Compressor.GZIP]:
    configure(
        write=WriteConfig(
            compression=CompressionConfig(algorithm=compressor, level=3),
            tile=TileConfig(orientation=SliceOrientation.AXIAL),
        )
    )
    uri = str(Path(TEMP_DIR) / f"vol_{compressor.value}")
    vol = Volume.from_numpy(uri, test_data)

    # Get directory size
    total_size = sum(
        os.path.getsize(os.path.join(dp, f))
        for dp, dn, filenames in os.walk(uri)
        for f in filenames
    )
    ratio = uncompressed_size / total_size if total_size > 0 else 0
    print(f"{compressor.value:6s}: {total_size/1024:.1f} KB (ratio: {ratio:.2f}x)")
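
The ratios measured above depend heavily on the data. The sketch below illustrates this with the standard library's `zlib` (DEFLATE, the algorithm behind gzip) as a stand-in, since LZ4 and ZSTD are not assumed to be importable here. It compares pure Gaussian noise with a mostly-zero array, a rough proxy for a scan with empty background. `compression_ratio` is a hypothetical helper, and the numbers it prints are not RadiObject measurements.

```python
import random
import zlib
from array import array


def compression_ratio(values, level: int = 3) -> float:
    """Ratio of raw float32 bytes to zlib-compressed bytes."""
    raw = array("f", values).tobytes()
    return len(raw) / len(zlib.compress(raw, level))


rng = random.Random(0)
n = 50_000

# Case 1: pure noise, similar to np.random.randn test data
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Case 2: 90% zero background with a small noisy region
sparse = [0.0] * n
for i in range(n // 10):
    sparse[i] = rng.gauss(0.0, 1.0)

noise_ratio = compression_ratio(noise)
sparse_ratio = compression_ratio(sparse)
print(f"noise : {noise_ratio:.2f}x")
print(f"sparse: {sparse_ratio:.2f}x")
```

When choosing a compressor for real data, it is worth repeating the size comparison on a representative volume rather than on synthetic noise.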

## Read Configuration

Read settings affect all data access operations:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `memory_budget_mb` | 1024 | TileDB memory budget for I/O buffers |
| `concurrency` | 4 | TileDB internal thread pool size |
| `max_workers` | 4 | Python ThreadPoolExecutor workers |

In [None]:
# Configure for high-memory workload
configure(
    read=ReadConfig(
        memory_budget_mb=2048,  # 2GB memory budget
        concurrency=8,  # 8 parallel I/O threads
        max_workers=4,  # 4 parallel volume operations
    )
)

config = radi_cfg()
print(f"Memory budget: {config.read.memory_budget_mb} MB")
print(f"I/O concurrency: {config.read.concurrency} threads")
print(f"Max workers: {config.read.max_workers}")
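
The table describes `max_workers` as the number of Python `ThreadPoolExecutor` workers. The sketch below shows what that kind of setting controls in general: how many I/O-bound tasks run at once. `load_volume` and `load_all` are hypothetical stand-ins that simulate I/O with `time.sleep`. They do not call RadiObject.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_volume(volume_id: int) -> int:
    """Hypothetical stand-in for an I/O-bound volume read."""
    time.sleep(0.05)  # Simulated storage latency
    return volume_id


def load_all(volume_ids, max_workers: int):
    """Load volumes with at most `max_workers` reads in flight; order is preserved."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_volume, volume_ids))


ids = list(range(8))
for workers in (1, 4):
    start = time.perf_counter()
    loaded = load_all(ids, max_workers=workers)
    elapsed = time.perf_counter() - start
    print(f"max_workers={workers}: {elapsed:.2f}s for {len(loaded)} volumes")
```

Threads help here because the simulated work waits on I/O. For real reads, the benefit depends on storage latency and on how much parallelism TileDB already applies internally through `concurrency`.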

## S3 Configuration

RadiObject exposes the same API for local and S3 URIs:

```python
# Local
radi = RadiObject("/data/study")

# S3 (same API!)
radi = RadiObject("s3://bucket/study")
```

Configure S3 for optimal performance:

In [None]:
configure(
    s3=S3Config(
        region="us-east-2",
        max_parallel_ops=16,  # Parallel S3 requests
        multipart_part_size_mb=50,  # Chunk size for uploads
    )
)

print("S3 configuration:")
print(f"  Region: {radi_cfg().s3.region}")
print(f"  Max parallel ops: {radi_cfg().s3.max_parallel_ops}")
print(f"  Multipart size: {radi_cfg().s3.multipart_part_size_mb} MB")
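
Part size and parallelism together determine how much memory uploads can hold. TileDB's configuration reference notes that `vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops` bytes are buffered before multipart uploads are issued. Assuming `S3Config` maps its two fields directly onto those parameters (an assumption, not confirmed in this notebook), the buffer size can be estimated as below. `write_buffer_mb` is a hypothetical helper.

```python
def write_buffer_mb(part_size_mb: int, max_parallel_ops: int) -> int:
    """Estimated upload buffer in MB: one part per parallel operation."""
    return part_size_mb * max_parallel_ops


# (multipart_part_size_mb, max_parallel_ops) pairs used in this notebook
settings = {
    "example above": (50, 16),
    "high-bandwidth recipe": (100, 32),
}

for name, (part_mb, ops) in settings.items():
    print(f"{name:22s}: {write_buffer_mb(part_mb, ops)} MB buffered per upload")
```

If the assumption holds, the high-bandwidth settings trade roughly four times the buffer memory for more data in flight, which is worth checking against the memory available on the instance.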

## Configuration Recipes

### Radiology Viewer (2D Slice Access)

In [None]:
# Optimal for slice-by-slice viewing
radi_reset()
configure(
    write=WriteConfig(
        tile=TileConfig(orientation=SliceOrientation.AXIAL),
        compression=CompressionConfig(algorithm=Compressor.LZ4),  # Fast decompression
    ),
    read=ReadConfig(memory_budget_mb=512),  # Lower memory for interactive use
)
print("Configured for radiology viewer")

### ML Training (3D Patch Extraction)

In [None]:
# Optimal for random 3D patch access
radi_reset()
configure(
    write=WriteConfig(
        tile=TileConfig(orientation=SliceOrientation.ISOTROPIC),
        compression=CompressionConfig(algorithm=Compressor.ZSTD, level=3),
    ),
    read=ReadConfig(
        memory_budget_mb=2048,  # More memory for training
        concurrency=8,
    ),
)
print("Configured for ML training")

### S3 Cloud Storage (High Bandwidth)

In [None]:
# Tuned for high-bandwidth EC2 instances (p4d, p5) reading from S3
radi_reset()
configure(
    read=ReadConfig(
        max_workers=8,  # More parallel volume operations
        concurrency=2,  # Fewer threads per operation
    ),
    s3=S3Config(
        max_parallel_ops=32,  # Maximize S3 parallelism
        multipart_part_size_mb=100,  # Larger parts for bandwidth
    ),
)
print("Configured for high-bandwidth S3")

In [None]:
# Cleanup
radi_reset()
shutil.rmtree(TEMP_DIR)
print(f"Cleaned up: {TEMP_DIR}")

## Summary

| Function | Purpose |
|----------|--------|
| `configure(write=, read=, s3=)` | Set RadiObject configuration |
| `radi_cfg()` | Get current RadiObjectConfig |
| `radi_reset()` | Reset to defaults |
| `tdb_ctx()` | Get TileDB context (for advanced use) |
| `tdb_cfg()` | Get raw TileDB Config |

### Quick Reference

```python
from radiobject import (
    configure, radi_cfg, radi_reset, tdb_ctx, tdb_cfg,
    WriteConfig, ReadConfig, S3Config,
    TileConfig, CompressionConfig,
    SliceOrientation, Compressor,
)

# Configure for slice viewing
configure(write=WriteConfig(tile=TileConfig(orientation=SliceOrientation.AXIAL)))

# Configure for ML training
configure(write=WriteConfig(tile=TileConfig(orientation=SliceOrientation.ISOTROPIC)))

# Inspect current config
print(radi_cfg().write.tile.orientation)

# Reset to defaults
radi_reset()

# Access TileDB internals
ctx = tdb_ctx()     # For passing to TileDB operations
cfg = tdb_cfg()     # For inspecting raw parameters
```