Remote Access Foundation

# Epic: Remote Access Foundation (AimX v1)

## Summary

AimDB currently only runs in-process and has no standardized way for external tools to introspect or interact with it. Before implementing the CLI, MCP adapter, dashboards, or remote admin tools, we need a **remote access layer** that exposes introspection APIs to external processes via a secure, stable protocol.

This issue introduces **AimX v1** (Aim eXchange), a minimal request/stream protocol over **Unix domain sockets** (and WebSockets via gateway later) that allows clients to:

* Discover running AimDB instances
* List records & metadata
* Read record snapshots
* Subscribe to record updates (stream)
* *(Optional, opt-in)* Write to specific config records

This provides the foundation for:

* `aimdb-cli` - Command-line introspection and management
* IDE / LLM / MCP tooling - Editor integration and AI assistance
* Local dashboards - Real-time visualization
* Future remote gateways - Distributed access
* Distributed inspection tooling - Multi-instance monitoring

---

## Goals

* Introduce `aimdb-remote` subsystem in `aimdb-core`
* Provide **one async supervisor task** that accepts connections
* Spawn **per-connection handler tasks**
* Define **AimX v1 protocol** with versioning
* Establish a read-only-by-default security policy
* Provide clean, documentable record read APIs for external access
* Support record subscriptions with bounded per-client queues
* Implement graceful degradation and error handling

---

## Non-Goals

* ❌ No CLI implementation yet (separate issue after this)
* ❌ No binary protocol (JSON/NDJSON only in v1)
* ❌ No TCP/WebSocket transport yet (future gateway work)
* ❌ No dashboards yet (future)
* ❌ No write access except for explicitly opted-in records
* ❌ No global control operations (shutdown, config reload, etc.)

---

## 📐 Deliverables

### 1) Design Document

Create `docs/design/remote-access/aimx-v1.md`:

Must include:

#### Protocol Primitives
* `hello` - Client announces version and capabilities
* `welcome` - Server responds with version and permissions
* `request` - Client sends command
* `response` - Server replies (success or error)
* `event` - Server pushes subscription updates

#### Request Methods
* `record.list` - List all records with metadata
* `record.get` - Read current snapshot of a record
* `record.subscribe` - Stream updates for a record
* `record.unsubscribe` - Stop streaming updates

#### Protocol Details
* **Framing**: NDJSON (newline-delimited JSON)
* **Versioning**: Semantic versioning with negotiation
* **Error codes**: Typed error enum with protocol-level codes
* **Auth model**: Optional token + UDS permissions
* **Subscription semantics**: Bounded queues with drop policy
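
The negotiation rule itself is not fixed in this issue. As a rough sketch, assuming `major.minor` version strings (as in the example session) and assuming that compatibility means equal major versions with the session running at the lower minor, the check could look like this. `parse_version` and `negotiate` are hypothetical names, not part of any existing API.

```rust
/// Parse a "major.minor" version string such as "1.0".
/// Returns None for anything that does not match that shape.
fn parse_version(s: &str) -> Option<(u32, u32)> {
    let (major, minor) = s.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Assumed rule (not from the spec): majors must match, and the
/// session runs at the lower of the two minor versions.
fn negotiate(client: &str, server: &str) -> Result<String, String> {
    let (c_major, c_minor) =
        parse_version(client).ok_or_else(|| format!("bad client version: {client}"))?;
    let (s_major, s_minor) =
        parse_version(server).ok_or_else(|| format!("bad server version: {server}"))?;
    if c_major != s_major {
        return Err(format!("version mismatch: client {client}, server {server}"));
    }
    Ok(format!("{}.{}", s_major, c_minor.min(s_minor)))
}
```

A failed negotiation would map naturally onto the `VersionMismatch` error variant defined later in this issue.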

#### Example Session
```json
→ {"hello": {"version": "1.0", "client": "aimdb-cli/0.1.0"}}
← {"welcome": {"version": "1.0", "server": "aimdb/0.3.0", "permissions": ["read"]}}

→ {"id": 1, "method": "record.list"}
← {"id": 1, "result": [{"name": "SensorData", "type_id": "...", ...}]}

→ {"id": 2, "method": "record.get", "params": {"name": "SensorData"}}
← {"id": 2, "result": {"value": {...}, "timestamp": ...}}

→ {"id": 3, "method": "record.subscribe", "params": {"name": "SensorData"}}
← {"id": 3, "result": {"subscription_id": "sub-123"}}
← {"event": {"subscription_id": "sub-123", "data": {...}}}
← {"event": {"subscription_id": "sub-123", "data": {...}}}

→ {"id": 4, "method": "record.unsubscribe", "params": {"subscription_id": "sub-123"}}
← {"id": 4, "result": {}}
```
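
To make the NDJSON framing in the session above concrete, here is a minimal, dependency-free sketch of writing and reading one frame per line. The helper names (`send_frame`, `read_frame`, `demo_roundtrip`) are hypothetical, and the demo uses a connected socket pair in place of a real AimDB server.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;

/// Write one NDJSON frame: the JSON text followed by a single '\n'.
/// A raw newline inside the payload would break framing, so reject it.
fn send_frame(stream: &mut impl Write, json: &str) -> io::Result<()> {
    if json.contains('\n') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "frame contains a raw newline",
        ));
    }
    stream.write_all(json.as_bytes())?;
    stream.write_all(b"\n")?;
    stream.flush()
}

/// Read one NDJSON frame; returns Ok(None) on clean EOF.
fn read_frame(reader: &mut impl BufRead) -> io::Result<Option<String>> {
    let mut line = String::new();
    if reader.read_line(&mut line)? == 0 {
        return Ok(None);
    }
    let trimmed = line.trim_end_matches(|c| c == '\n' || c == '\r');
    Ok(Some(trimmed.to_string()))
}

/// Round-trip a hello frame through a socket pair (no server required).
fn demo_roundtrip() -> io::Result<Option<String>> {
    let (mut client, server) = UnixStream::pair()?;
    send_frame(
        &mut client,
        r#"{"hello": {"version": "1.0", "client": "sketch"}}"#,
    )?;
    read_frame(&mut BufReader::new(server))
}
```

A real client would parse each frame with a JSON library; this sketch only covers the line-delimited transport.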

---

### 2) Builder Integration

Add `.with_remote_access(cfg)` to `AimDbBuilder`:

```rust
use aimdb_core::remote::{AimxConfig, SecurityPolicy};

let db = AimDbBuilder::new()
    .with_runtime(tokio_adapter)
    .with_remote_access(
        AimxConfig::uds_default()
            .socket_path("/var/run/aimdb/aimdb.sock")
            .security_policy(SecurityPolicy::ReadOnly)
            .max_connections(16)
            .subscription_queue_size(100)
    )
    .build()?;
```

**Optional write access**:
```rust
.with_remote_access(
    AimxConfig::uds_default()
        .security_policy(SecurityPolicy::ReadWrite)
        .allow_write_to("ConfigRecord")  // explicit opt-in per record
)
```

---

### 3) Runtime Task Architecture

* **RemoteSupervisor** - Single task per AimDB instance
  * Binds to UDS path
  * Accepts incoming connections
  * Spawns `ConnectionHandler` tasks
  * Tracks active connections
  * Handles graceful shutdown & socket cleanup

* **ConnectionHandler** - One task per client
  * Protocol handshake (`hello`/`welcome`)
  * Request routing
  * Subscription management
  * Bounded queue per subscription
  * Drop policy when queue full
  * Connection cleanup on disconnect
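
The supervisor/handler split described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: it uses OS threads from `std` in place of async tasks to stay dependency-free, the handler replies with a fixed placeholder `welcome` frame instead of routing requests, and `max_accepts` exists only so the demo terminates. All function names are hypothetical.

```rust
use std::fs;
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;
use std::thread;

/// Bind the socket, clearing a stale file left by a previous run.
/// (Assumption: no other live instance owns this path.)
fn bind_socket(path: &Path) -> io::Result<UnixListener> {
    let _ = fs::remove_file(path);
    UnixListener::bind(path)
}

/// Per-connection handler. Placeholder logic: every received line is
/// answered with a fixed welcome frame; real code would parse and route.
fn handle_connection(stream: UnixStream) -> io::Result<()> {
    let mut writer = stream.try_clone()?;
    for line in BufReader::new(stream).lines() {
        let _request = line?;
        writer.write_all(b"{\"welcome\": {\"version\": \"1.0\", \"permissions\": [\"read\"]}}\n")?;
    }
    Ok(()) // loop ends when the client disconnects
}

/// Supervisor: accept up to `max_accepts` clients, one handler thread each,
/// then remove the socket file as part of shutdown.
fn run_supervisor(listener: UnixListener, max_accepts: usize) -> io::Result<()> {
    for stream in listener.incoming().take(max_accepts) {
        let stream = stream?;
        thread::spawn(move || {
            let _ = handle_connection(stream);
        });
    }
    if let Some(path) = listener.local_addr()?.as_pathname() {
        let _ = fs::remove_file(path);
    }
    Ok(())
}
```

Because each handler owns its stream and state, a disconnect only ends that handler's loop and does not affect other clients.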

---

### 4) Public Rust Types

#### Core Configuration
```rust
pub struct AimxConfig {
    socket_path: PathBuf,
    security_policy: SecurityPolicy,
    max_connections: usize,
    subscription_queue_size: usize,
    auth_token: Option<String>,
}

pub enum SecurityPolicy {
    ReadOnly,
    ReadWrite,  // requires explicit record opt-in
}
```

#### Internal Handle
```rust
pub(crate) struct DbRemoteHandle {
    // Access to record registry
    // Metadata queries
    // Snapshot reads
    // Subscription setup
}
```

#### Error Types
```rust
pub enum RemoteError {
    ProtocolError(String),      // Malformed message
    VersionMismatch { client: String, server: String },
    NotFound(String),           // Record doesn't exist
    PermissionDenied(String),   // Write attempted in read-only mode
    QueueFull(String),          // Subscription queue overflow
    InternalError(String),      // DB operation failed
}
```
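
One way these variants might map onto protocol-level error responses is sketched below. The code strings and the `{"id": ..., "error": {...}}` shape are assumptions made by analogy with the `result` responses in the example session; the issue does not define them. The hand-rolled `escape_json` helper stands in for a JSON library to keep the sketch self-contained.

```rust
#[derive(Debug)]
pub enum RemoteError {
    ProtocolError(String),
    VersionMismatch { client: String, server: String },
    NotFound(String),
    PermissionDenied(String),
    QueueFull(String),
    InternalError(String),
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl RemoteError {
    /// Hypothetical wire codes; concrete values are not fixed by this issue.
    pub fn code(&self) -> &'static str {
        match self {
            RemoteError::ProtocolError(_) => "protocol_error",
            RemoteError::VersionMismatch { .. } => "version_mismatch",
            RemoteError::NotFound(_) => "not_found",
            RemoteError::PermissionDenied(_) => "permission_denied",
            RemoteError::QueueFull(_) => "queue_full",
            RemoteError::InternalError(_) => "internal_error",
        }
    }

    pub fn message(&self) -> String {
        match self {
            RemoteError::ProtocolError(m) => format!("protocol error: {m}"),
            RemoteError::VersionMismatch { client, server } => {
                format!("version mismatch: client {client}, server {server}")
            }
            RemoteError::NotFound(n) => format!("record not found: {n}"),
            RemoteError::PermissionDenied(n) => format!("permission denied: {n}"),
            RemoteError::QueueFull(n) => format!("subscription queue full: {n}"),
            RemoteError::InternalError(m) => format!("internal error: {m}"),
        }
    }

    /// Render a single-line error response for request `id` (no trailing newline).
    pub fn to_response(&self, id: u64) -> String {
        format!(
            "{{\"id\": {}, \"error\": {{\"code\": \"{}\", \"message\": \"{}\"}}}}",
            id,
            self.code(),
            escape_json(&self.message())
        )
    }
}
```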

---

### 5) Record Metadata

Expose metadata needed for listing & policy enforcement:

```rust
// Note: `TypeId` and `Instant` do not implement `Serialize`, so this struct
// cannot derive it directly; convert to a wire-friendly form before sending.
#[derive(Debug, Clone)]
pub struct RecordMetadata {
    pub type_id: TypeId,
    pub name: &'static str,
    pub buffer_kind: BufferType,
    pub producer_count: usize,
    pub consumer_count: usize,
    pub writable: bool,         // explicitly opt-in via builder
    pub created_at: Instant,
    pub last_update: Option<Instant>,
}
```

This requires enhancing the internal record registry to track:
- Record names (via type registration)
- Creation timestamps
- Last update times
- Write permissions

---

### 6) Read Path Implementation

Add **internal** APIs to `AimDbInner`:

```rust
impl AimDbInner {
    /// List all registered records with metadata
    pub(crate) async fn list_records(&self) -> Vec<RecordMetadata> {
        // Iterate registry, collect metadata
    }

    /// Read current snapshot of a record by TypeId
    pub(crate) async fn read_record_snapshot(
        &self, 
        type_id: TypeId
    ) -> DbResult<serde_json::Value> {
        // Lookup record, serialize latest value
    }

    /// Subscribe to record updates
    pub(crate) async fn subscribe_record(
        &self,
        type_id: TypeId,
        queue_size: usize,
    ) -> DbResult<RecordSubscription> {
        // Create bounded channel, spawn consumer task
    }
}

pub struct RecordSubscription {
    // Bounded receiver (e.g. `tokio::sync::mpsc::Receiver`), sized by `queue_size`
    pub rx: mpsc::Receiver<serde_json::Value>,
    pub unsubscribe: oneshot::Sender<()>,
}
```

---

### 7) Security & Permissions

#### Primary Security Mechanisms
1. **UDS File Permissions** - Operating system level access control
   ```bash
   chmod 600 /var/run/aimdb/aimdb.sock  # Option A: owner only
   chmod 660 /var/run/aimdb/aimdb.sock  # Option B: owner + group
   ```

2. **Optional Auth Token** - Simple token-based auth
   ```rust
   .with_remote_access(
       AimxConfig::uds_default()
           .auth_token("secret-token-here")
   )
   ```

3. **Read-Only Default** - All records read-only unless opted in
   ```rust
   .allow_write_to("AdminConfigRecord")
   ```

#### Permission Announcement
The server announces its capabilities in the `welcome` message. `writable_records` is empty in read-only mode:
```json
{
  "welcome": {
    "version": "1.0",
    "server": "aimdb/0.3.0",
    "permissions": ["read"],
    "writable_records": []
  }
}
```
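
The interaction between the security policy and the per-record opt-in could be enforced along these lines. This is a sketch under assumptions: `WritePolicy` is a hypothetical helper, and the rule encoded here (a write needs both `ReadWrite` and an explicit `allow_write_to` entry) is inferred from the "requires explicit record opt-in" comment on `SecurityPolicy::ReadWrite`.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SecurityPolicy {
    ReadOnly,
    ReadWrite,
}

/// Hypothetical helper combining the policy with the opt-in list.
pub struct WritePolicy {
    policy: SecurityPolicy,
    writable: HashSet<String>,
}

impl WritePolicy {
    pub fn new(policy: SecurityPolicy) -> Self {
        Self { policy, writable: HashSet::new() }
    }

    /// Builder-style opt-in for a single record name.
    pub fn allow_write_to(mut self, name: &str) -> Self {
        self.writable.insert(name.to_string());
        self
    }

    /// A write is allowed only under ReadWrite AND for an opted-in record.
    pub fn can_write(&self, record: &str) -> bool {
        self.policy == SecurityPolicy::ReadWrite && self.writable.contains(record)
    }

    /// Names to announce as `writable_records`; empty in read-only mode.
    pub fn writable_records(&self) -> Vec<&str> {
        if self.policy != SecurityPolicy::ReadWrite {
            return Vec::new();
        }
        let mut names: Vec<&str> = self.writable.iter().map(String::as_str).collect();
        names.sort_unstable(); // stable output order for the welcome message
        names
    }
}
```

A rejected write would then surface to the client as `PermissionDenied`.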

---

## 🚧 Implementation Phases

| Phase | Task                              | Done When                                 | Estimated Effort |
| ----- | --------------------------------- | ----------------------------------------- | ---------------- |
| **A** | Create design document            | PR merged to `docs/design/`               | 4 hours          |
| **B** | Add builder config types          | `AimxConfig` compiles and is documented   | 2 hours          |
| **C** | Enhance record metadata           | `RecordMetadata` available internally     | 4 hours          |
| **D** | Implement supervisor + UDS        | `nc -U /tmp/aimdb.sock` connects          | 6 hours          |
| **E** | Implement protocol handshake      | `hello`/`welcome` exchange works          | 3 hours          |
| **F** | Add `record.list`                 | Can list all records via socket           | 3 hours          |
| **G** | Add `record.get`                  | Can read record snapshot as JSON          | 4 hours          |
| **H** | Add `record.subscribe`            | Streaming updates work                    | 6 hours          |
| **I** | Documentation & example           | `/examples/remote-access-demo/`           | 4 hours          |

**Total Estimated Effort**: ~36 hours (~1 week)

---

## 🔒 Security Principles

1. **Read-only by default** - No writes unless explicitly enabled
2. **Explicit opt-in for writes** - Per-record basis via builder
3. **UDS permissions primary** - Leverage OS-level security
4. **Auth optional but recommended** - Token-based for additional layer
5. **Permission-scoped channels** - Each connection has clear capabilities
6. **No global control ops in v1** - Introspection only, no shutdown/reload
7. **Bounded resources** - Connection limits, queue limits prevent DoS

---

## 🧪 Testing Strategy

### Unit Tests
* Socket startup & cleanup
* Handshake protocol (valid & invalid)
* Version negotiation
* Error response formatting
* Permission checks
* Subscription queue overflow behavior

### Integration Tests
```rust
#[tokio::test]
async fn test_remote_access_flow() {
    // 1. Spawn AimDB with remote access enabled
    // 2. Connect via UDS
    // 3. Send hello, verify welcome
    // 4. List records
    // 5. Get record snapshot
    // 6. Subscribe, verify events received
    // 7. Unsubscribe
    // 8. Disconnect
}

#[tokio::test]
async fn test_read_only_enforcement() {
    // Verify writes rejected in read-only mode
}

#[tokio::test]
async fn test_subscription_backpressure() {
    // Verify bounded queue + drop policy works
}
```

### Fuzzing Tests
* Invalid JSON
* Malformed requests (missing fields, wrong types)
* Connection drops mid-request
* Rapid connect/disconnect cycles
* Subscription spam

### Manual Testing
```bash
# Start AimDB with remote access
cargo run --example remote-access-demo

# In another terminal
nc -U /tmp/aimdb.sock
{"hello": {"version": "1.0", "client": "manual-test"}}
# Should see welcome response

{"id": 1, "method": "record.list"}
# Should see record list
```

---

## ✅ Success Criteria

- [x] AimDB exposes a single UDS endpoint per instance
- [x] CLI can be implemented with zero internal hooks (just a client)
- [x] Zero overhead when no clients connected
- [x] Minimal overhead when clients connected (bounded queues)
- [x] Protocol fully documented with examples
- [x] Versioned and evolvable protocol
- [x] Works with Tokio runtime
- [x] Type-safe throughout (no breaking of core guarantees)
- [x] Record model unchanged (remote access is a view layer)
- [x] Security model clear and enforceable
- [x] Graceful shutdown & cleanup
- [x] Comprehensive tests (unit, integration, fuzzing)

---

## 📎 Future Dependent Issues

After this epic is complete, the following become unblocked:

1. **✅ Issue**: *Implement `aimdb-cli` using AimX client*
   - Pure client implementation
   - No special privileges
   - Uses public protocol only

2. **MCP Integration**: *Build Model Context Protocol adapter*
   - LLM/IDE integration
   - Read-only introspection
   - Schema discovery

3. **WebSocket Gateway**: *TCP/WebSocket transport layer*
   - Remote access (not just local)
   - TLS encryption
   - Authentication/authorization

4. **Metrics Endpoint**: *Prometheus/OpenMetrics exporter*
   - Uses `record.list` + `record.subscribe`
   - Standard metrics format

5. **Remote Write Extensions**: *Admin mode for writes*
   - Explicit admin API
   - Audit logging
   - Fine-grained permissions

6. **Introspection UI**: *Web-based dashboard*
   - Connects via WebSocket gateway
   - Real-time visualization
   - Record inspection

---

## 🎯 Design Considerations

### Why NDJSON?
- Human-readable for debugging
- Streaming friendly (line-delimited)
- JSON parsing widely available
- Easy to test with `nc -U` or `socat` (plain `telnet` does not connect to Unix domain sockets)
- Binary protocol can be added later if needed

### Why UDS First?
- Simplest security model (file permissions)
- Zero network configuration
- Perfect for local CLI/tools
- Fast (no TCP overhead)
- Gateway can add TCP/WS later

### Why Read-Only Default?
- Introspection is expected to cover the vast majority (~90%) of use cases
- Writes are high-risk (data integrity)
- Explicit opt-in forces conscious decision
- Easier to audit/secure

### Why Per-Connection Tasks?
- Clean isolation
- Independent subscription state
- Easy cleanup on disconnect
- No shared mutable state between clients

---

## 📚 Documentation Requirements

### User-Facing Docs
- [ ] Quickstart guide: "Enable Remote Access"
- [ ] Protocol specification with examples
- [ ] Security best practices
- [ ] Troubleshooting common issues

### Developer Docs
- [ ] Architecture overview
- [ ] Adding new protocol methods
- [ ] Testing remote access features
- [ ] Performance considerations

### Examples
- [ ] `examples/remote-access-basic/` - Minimal setup
- [ ] `examples/remote-access-client/` - Simple Rust client
- [ ] `examples/remote-access-secure/` - Auth token + permissions

---

## 🚀 Migration Path

### Phase 1: Foundation (This Issue)
- Protocol definition
- Basic implementation
- Core features (list, get, subscribe)

### Phase 2: Tooling
- CLI implementation
- Testing tools
- Example clients

### Phase 3: Production
- Performance tuning
- Security hardening
- Observability

### Phase 4: Advanced
- WebSocket gateway
- Remote writes
- Distributed features

---

## ✅ Design Decisions

### 1. Subscription Drop Policy
**Decision: Drop oldest messages when queue full**
- Prioritizes real-time data (newest is most valuable)
- Familiar streaming pattern (ring buffers; Kafka's retention similarly discards the oldest data first)
- Client always sees most recent state
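
A minimal sketch of the drop-oldest policy, assuming a per-subscription `VecDeque` with a fixed capacity. `DropOldestQueue` is a hypothetical name, and the `dropped` counter is an addition that could feed the metrics record or a lag notification.

```rust
use std::collections::VecDeque;

/// Bounded FIFO that evicts the oldest item when full.
pub struct DropOldestQueue<T> {
    buf: VecDeque<T>,
    capacity: usize,
    dropped: u64,
}

impl<T> DropOldestQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { buf: VecDeque::with_capacity(capacity), capacity, dropped: 0 }
    }

    /// Push an item; if the queue is full, evict and return the oldest one.
    pub fn push(&mut self, item: T) -> Option<T> {
        let evicted = if self.buf.len() == self.capacity {
            self.dropped += 1;
            self.buf.pop_front()
        } else {
            None
        };
        self.buf.push_back(item);
        evicted
    }

    /// Pop the oldest remaining item, if any.
    pub fn pop(&mut self) -> Option<T> {
        self.buf.pop_front()
    }

    /// Total number of items evicted so far.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }
}
```

With capacity 2, pushing 1, 2, 3 evicts 1 and leaves 2 and 3 queued, so a slow client still receives the most recent values.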

### 2. Connection Limits
**Decision: Global limit across all AimDB instances**
- Simpler implementation (single atomic counter)
- Better resource control system-wide
- Can add per-instance limits later if needed
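
The single atomic counter could be wrapped in an RAII permit so that a slot is released automatically when a connection handler exits. This is a sketch; `ConnectionLimiter` and `ConnectionPermit` are hypothetical names, and sharing one limiter across instances (for example via an `Arc` or a static) is assumed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

pub struct ConnectionLimiter {
    active: AtomicUsize,
    max: usize,
}

/// Held by a connection handler; releases its slot on drop.
pub struct ConnectionPermit {
    limiter: Arc<ConnectionLimiter>,
}

impl ConnectionLimiter {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { active: AtomicUsize::new(0), max })
    }

    /// Try to claim a slot; returns None when the limit is reached.
    pub fn try_acquire(limiter: &Arc<Self>) -> Option<ConnectionPermit> {
        // Atomically increment only while below the limit.
        limiter
            .active
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < limiter.max { Some(n + 1) } else { None }
            })
            .ok()?;
        Some(ConnectionPermit { limiter: Arc::clone(limiter) })
    }

    pub fn active(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }
}

impl Drop for ConnectionPermit {
    fn drop(&mut self) {
        self.limiter.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

The supervisor would call `try_acquire` before spawning a handler and refuse the connection when it returns `None`.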

### 3. Auth Token Format
**Decision: Simple bearer token string for v1**
- KISS principle - don't over-engineer early
- Sufficient for local UDS security
- Easy to implement and test
- Clear upgrade path to JWT in v2 without breaking protocol

### 4. Record Resolution
**Decision: Support both TypeId and human-readable names**
- TypeId for efficiency (direct hash lookup)
- Name for convenience (CLI, humans)
- Server resolves name → TypeId internally
- Both exposed in `record.list` metadata
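
Name-to-`TypeId` resolution can be as small as a map populated at registration time. The sketch below assumes names are `&'static str` (matching `RecordMetadata::name`); `RecordIndex` and the placeholder `SensorData` type are hypothetical.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Hypothetical index mapping human-readable record names to TypeIds.
#[derive(Default)]
pub struct RecordIndex {
    by_name: HashMap<&'static str, TypeId>,
}

impl RecordIndex {
    /// Register record type `T` under `name`; a later registration
    /// with the same name replaces the earlier one.
    pub fn register<T: 'static>(&mut self, name: &'static str) {
        self.by_name.insert(name, TypeId::of::<T>());
    }

    /// Resolve a name received over the wire to its TypeId.
    pub fn resolve(&self, name: &str) -> Option<TypeId> {
        self.by_name.get(name).copied()
    }
}

/// Placeholder record type used only for illustration.
pub struct SensorData;
```

A `None` from `resolve` corresponds to the `NotFound` error variant.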

### 5. Error Recovery
**Decision: Document auto-reconnect with exponential backoff**
- Provide client implementation guidance
- Recommend 100ms → 30s backoff range
- Server may hint `retry_after_ms` in error responses
- Prevents reconnect storms
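
A client-side delay calculation for the recommended 100ms → 30s range might look like this. It is a sketch with assumptions: the delay doubles per attempt, a server-provided `retry_after_ms` is treated as a lower bound, and jitter (commonly added in practice) is left out to keep the function deterministic. `backoff_delay` is a hypothetical name.

```rust
use std::time::Duration;

const BASE_DELAY: Duration = Duration::from_millis(100);
const MAX_DELAY: Duration = Duration::from_secs(30);

/// Delay before reconnect attempt number `attempt` (0-based).
/// Doubles from 100ms and is capped at 30s; an optional server hint
/// raises the delay if it is larger than the computed value.
pub fn backoff_delay(attempt: u32, retry_after_ms: Option<u64>) -> Duration {
    let exp = attempt.min(16); // keep the shift well inside u32 range
    let delay = BASE_DELAY.saturating_mul(1u32 << exp).min(MAX_DELAY);
    match retry_after_ms {
        Some(ms) => delay.max(Duration::from_millis(ms)),
        None => delay,
    }
}
```

Attempts 0, 1, 2, 3 yield 100ms, 200ms, 400ms, 800ms; from attempt 9 onward the delay stays at the 30s cap.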

### 6. Metrics Exposure
**Decision: Expose AimX metrics as AimDB records**
- Dogfooding - AimX uses AimDB's own features
- Consistent API - subscribe to metrics like any record
- Real-time monitoring built-in
- Metrics record: `aimx.metrics` with connection count, requests/sec, etc.

---

## 🎉 Why This Matters

This is a **critical enabler** for:

- **Production operability** - Inspect running systems without redeployment
- **Developer UX** - Build tools on stable protocol, not internal APIs
- **Distributed systems** - Foundation for multi-instance inspection
- **AI/LLM integration** - MCP adapter for intelligent tooling
- **Observability** - Metrics, dashboards, monitoring

After this issue, **AimDB becomes externally inspectable** without compromising its core design principles or security model.


Remote Access Foundation #38
