Merged
1 change: 1 addition & 0 deletions .changeset/config.json
@@ -50,6 +50,7 @@
"@objectstack/service-cache",
"@objectstack/service-cluster",
"@objectstack/service-cluster-redis",
"@objectstack/service-external-datasource",
"@objectstack/service-feed",
"@objectstack/service-i18n",
"@objectstack/service-job",
21 changes: 21 additions & 0 deletions .changeset/external-datasource-federation-p1.md
@@ -0,0 +1,21 @@
---
"@objectstack/spec": minor
"@objectstack/driver-sql": minor
---

External Datasource Federation (ADR-0015) — Phase 1.

Adds the spec foundation and the DDL gate for federating mature external
databases without ObjectStack ever mutating their schema:

- `Datasource.schemaMode` (`managed` | `external` | `validate-only`) and
`Datasource.external` settings, with a cross-field invariant.
- `Object.external` binding (remote table/schema, writability, column map).
- Shared error contract: `ExternalSchemaMismatchError`,
`ExternalWriteForbiddenError`, `ExternalSchemaModeViolationError`
(stable `code`s) + structured `SchemaDiffEntry` rendering.
- `driver-sql` DDL gate: schema-mutating DDL (`initObjects`/`syncSchema`/
`dropTable`) is rejected when `schemaMode !== 'managed'`.

All changes are additive and backward-compatible (`schemaMode` defaults to
`'managed'`).
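
The cross-field invariant can be pictured with a small sketch. It assumes the reading given in the plan document (external settings are present if and only if the mode is non-managed). `DatasourceLike` and `checkSchemaModeInvariant` are hypothetical names used only for illustration; the shipped check is a Zod `superRefine` and may differ in detail.

```typescript
// Hypothetical, simplified shapes; the real schema lives in
// packages/spec/src/data/datasource.zod.ts and carries more fields.
type SchemaMode = 'managed' | 'external' | 'validate-only';

interface DatasourceLike {
  name: string;
  schemaMode?: SchemaMode; // defaults to 'managed'
  external?: { allowWrites?: boolean };
}

/** Returns the invariant violations (empty when the definition is consistent). */
function checkSchemaModeInvariant(ds: DatasourceLike): string[] {
  const mode: SchemaMode = ds.schemaMode ?? 'managed';
  const issues: string[] = [];
  if (mode === 'managed' && ds.external !== undefined) {
    issues.push(`'${ds.name}': external settings require a non-managed schemaMode`);
  }
  if (mode !== 'managed' && ds.external === undefined) {
    issues.push(`'${ds.name}': schemaMode '${mode}' requires external settings`);
  }
  return issues;
}
```

A definition with no `schemaMode` and no `external` block passes untouched, which is what keeps existing datasources backward-compatible.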
11 changes: 11 additions & 0 deletions .changeset/external-datasource-federation-p2-cli.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
"@objectstack/cli": minor
---

External Datasource Federation (ADR-0015) — CLI surface.

New `os datasource` command group: `list-tables` (list remote tables),
`introspect` (generate a reviewable `*.object.ts` draft from a remote table),
and `validate` (validate federated objects against the remote schema; exits
non-zero on mismatch). Backed by the `/api/v1/datasources/:name/external/*`
REST routes.
11 changes: 11 additions & 0 deletions .changeset/external-datasource-federation-p2-rest.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
"@objectstack/rest": minor
---

External Datasource Federation (ADR-0015) — REST surface.

Adds `registerExternalDatasourceRoutes`, mounting
`/api/v1/datasources/:name/external/*` — `GET tables`,
`POST tables/:remote/draft`, `POST refresh-catalog`, `POST validate` — served by
the `external-datasource` service and wired into the REST API plugin. Routes
return `503 external_service_unavailable` when the service is not registered,
so they are safe to mount unconditionally.
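
A rough sketch of why unconditional mounting is safe: the handler looks the service up per request and answers 503 when it is absent. The `RouteResult` shape, the `{ error: ... }` body, and the synchronous service interface are assumptions made for brevity, not the actual route implementation.

```typescript
// Hypothetical stand-ins for the real service contract and HTTP layer.
interface ExternalServiceLike {
  listRemoteTables(datasource: string): string[];
}

interface RouteResult {
  status: number;
  body: unknown;
}

/** Handles `GET tables`; the service is resolved at request time, not at mount time. */
function handleListTables(
  getService: () => ExternalServiceLike | undefined,
  datasource: string,
): RouteResult {
  const service = getService();
  if (!service) {
    // Service not registered: report unavailability instead of failing at mount.
    return { status: 503, body: { error: 'external_service_unavailable' } };
  }
  return { status: 200, body: { tables: service.listRemoteTables(datasource) } };
}
```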
27 changes: 27 additions & 0 deletions .changeset/external-datasource-federation-p2.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
---
"@objectstack/spec": minor
"@objectstack/service-external-datasource": minor
---

External Datasource Federation (ADR-0015) — Phase 2 (service core).

Adds the federation service contract, the type-compatibility matrix, and a
new service package that introspects, drafts, and validates federated
objects:

- `@objectstack/spec`:
  - `data/type-compat.ts` — dialect-aware SQL↔field-type matrix
    (`canonicalizeSqlType`, `suggestFieldType`, `isCompatible`) for
    postgres/mysql/sqlite/snowflake/bigquery/mongo.
  - `contracts/external-datasource-service.ts` — `IExternalDatasourceService`
    plus `RemoteTable`, `GenerateDraftOpts`, `ObjectDraft`,
    `SchemaValidationResult`/`Report`.
- `@objectstack/service-external-datasource` (new): implements the service —
  `listRemoteTables`, `generateObjectDraft` (renders a reviewable
  `*.object.ts` with `// REVIEW:` markers), `validateObject`/`validateAll`
  (structured `SchemaDiffEntry` diffs), and `refreshCatalog`. Decoupled from
  the kernel via injected I/O; kernel plugin registers it as the
  `external-datasource` service.

REST routes and the `os datasource` CLI commands follow in a subsequent
slice.
12 changes: 12 additions & 0 deletions .changeset/external-datasource-federation-p3-boot-gate.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
---
"@objectstack/runtime": minor
---

External Datasource Federation (ADR-0015) — boot-validation gate (Gate 2).

Adds `ExternalValidationPlugin` (`createExternalValidationPlugin`) which, on
`kernel:ready`, validates every federated object against its remote table via
the `external-datasource` service and applies the datasource's
`external.validation.onMismatch` policy: `fail` (throws
`ExternalSchemaMismatchError`, aborting boot — the default), `warn` (logs the
diff), or `ignore`. No-op when federation is unused.
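
The three `onMismatch` outcomes can be sketched as a single policy function. `ValidationReportLike`, `SketchMismatchError`, and `applyMismatchPolicy` are hypothetical names; the real plugin throws `ExternalSchemaMismatchError` and works on the service's structured reports.

```typescript
type OnMismatch = 'fail' | 'warn' | 'ignore';

// Hypothetical, flattened report; the real one carries SchemaDiffEntry items.
interface ValidationReportLike {
  object: string;
  ok: boolean;
  message: string;
}

// Stand-in for ExternalSchemaMismatchError.
class SketchMismatchError extends Error {
  readonly name = 'SketchMismatchError';
}

/** Applies the policy to a batch of reports; 'fail' is the default. */
function applyMismatchPolicy(
  reports: ValidationReportLike[],
  policy: OnMismatch = 'fail',
  warn: (msg: string) => void = (msg) => console.warn(msg),
): void {
  const failed = reports.filter((r) => !r.ok);
  if (failed.length === 0 || policy === 'ignore') return; // nothing to report
  const summary = failed.map((r) => `${r.object}: ${r.message}`).join('; ');
  if (policy === 'warn') {
    warn(`external schema mismatch: ${summary}`);
    return;
  }
  throw new SketchMismatchError(summary); // aborts boot in the 'fail' case
}
```

With no federated objects the report list is empty, so the function returns immediately, matching the "no-op when federation is unused" behaviour.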
14 changes: 14 additions & 0 deletions .changeset/external-datasource-federation-p3-spec.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
---
"@objectstack/spec": minor
---

External Datasource Federation (ADR-0015) — Phase 3 spec: `external_catalog`
metadata type.

- Registers `external_catalog` in `MetadataTypeSchema` and
`DEFAULT_METADATA_TYPE_REGISTRY` (system domain, `allowRuntimeCreate: true`,
not org-overridable).
- Adds `data/external-catalog.zod.ts` — `ExternalCatalogSchema` /
`ExternalTableSchema` / `ExternalColumnSchema` for persisting a cached
remote-schema snapshot of a federated datasource (consumed by
`refreshCatalog`, the boot-validation gate, and Studio's schema browser).
12 changes: 12 additions & 0 deletions .changeset/external-datasource-federation-p4-ai.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
---
"@objectstack/service-ai": minor
---

External Datasource Federation (ADR-0015) — Phase 4: AI awareness.

`SchemaRetriever.renderSnippet` now annotates federated objects in the
auto-injected schema context, e.g.
`### wh_order — Warehouse Order [external, read-only, datasource=warehouse]`,
so the LLM knows an object comes from a customer's production database and must
not propose schema changes or unsafe writes. `ObjectShape` gains `datasource`
+ `external` (read from object metadata). Managed objects are unannotated.
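
As an illustration of the annotation format, the sketch below renders only the heading line. `ObjectShapeLike` and `renderHeading` are hypothetical; the `writable` label and the un-annotated heading format for managed objects are assumptions, since the changeset shows only the read-only external example.

```typescript
// Hypothetical subset of ObjectShape.
interface ObjectShapeLike {
  name: string;
  label: string;
  datasource?: string;
  external?: { writable?: boolean };
}

/** Renders the heading line of a schema snippet for the LLM context. */
function renderHeading(shape: ObjectShapeLike): string {
  // \u2014 is the dash used in the example heading from the changeset.
  const base = `### ${shape.name} \u2014 ${shape.label}`;
  if (!shape.external) return base; // managed objects stay unannotated
  const access = shape.external.writable ? 'writable' : 'read-only';
  return `${base} [external, ${access}, datasource=${shape.datasource ?? 'unknown'}]`;
}
```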
15 changes: 15 additions & 0 deletions .changeset/external-datasource-federation-p6-write-gate.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
---
"@objectstack/objectql": minor
---

External Datasource Federation (ADR-0015) — write gate (Gate 3) + introspection plumbing.

- Write gate: ObjectQL `insert`/`update`/`delete` now block writes to a
federated datasource (`schemaMode !== 'managed'`) unless BOTH
`datasource.external.allowWrites` and `object.external.writable` are true,
throwing `ExternalWriteForbiddenError` (code `EXTERNAL_WRITE_FORBIDDEN`).
Managed datasources (and objects without a datasource definition) are
unaffected. New `registerDatasourceDef()` records declarative datasource
ownership; manifests carrying `datasources` are indexed during `registerApp`.
- `engine.introspectDatasource(name)` delegates to the named driver's
`introspectSchema()`, wiring the external-datasource service end-to-end.
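
The double opt-in can be sketched as one guard called before each mutation. The shapes and the names `assertWritable` and `SketchWriteForbiddenError` are hypothetical; only the error code string is taken from the text above.

```typescript
type SchemaMode = 'managed' | 'external' | 'validate-only';

// Hypothetical, reduced shapes of the datasource and object definitions.
interface DatasourceDefLike {
  schemaMode?: SchemaMode;
  external?: { allowWrites?: boolean };
}
interface ObjectDefLike {
  name: string;
  external?: { writable?: boolean };
}

// Stand-in for ExternalWriteForbiddenError.
class SketchWriteForbiddenError extends Error {
  readonly code = 'EXTERNAL_WRITE_FORBIDDEN';
}

/** Throws unless the write is allowed; call before insert/update/delete. */
function assertWritable(obj: ObjectDefLike, ds: DatasourceDefLike | undefined): void {
  if (!ds) return; // no datasource definition: unaffected
  if ((ds.schemaMode ?? 'managed') === 'managed') return; // managed: unaffected
  const allowed = ds.external?.allowWrites === true && obj.external?.writable === true;
  if (!allowed) {
    throw new SketchWriteForbiddenError(`writes to '${obj.name}' are not enabled on both levels`);
  }
}
```

Requiring both flags means a datasource-wide switch alone never makes a table writable; each object has to opt in as well.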
147 changes: 147 additions & 0 deletions docs/plans/external-datasource-federation-impl.md
@@ -0,0 +1,147 @@
# Plan: External Datasource Federation (ADR-0015) — Implementation

> Implementation plan and progress tracker for
> [ADR-0015 — External Datasource Federation](../adr/0015-external-datasource-federation.md).
> The ADR is the design source of truth; this document scopes the work
> against the **current** codebase and records what has shipped.

## Context

ObjectStack today owns its `default` datasource and freely runs DDL.
ADR-0015 adds the ability to *federate* a mature external database
(Postgres, Snowflake, BigQuery, …) so the AI/REST/View stack can query it
live, **without ObjectStack ever mutating the remote schema**.

The decisive design choice — a federated object stays a normal `Object`,
its remote-ness expressed by the **datasource it points to**
(`schemaMode !== 'managed'`) plus an optional **`object.external`**
sub-record — means almost the entire downstream stack (ObjectQL, REST,
Views, AI tools, RBAC, audit) works unchanged. Behavioural differences are
enforced by three runtime gates.

## Current-state assessment (greenfield)

A repo-wide grep confirmed **zero** prior implementation of `schemaMode`,
`object.external`, `external_catalog`, `IExternalDatasourceService`,
`type-compat`, or the three error classes. The supporting infrastructure
already exists and is reused:

| Already present (reused) | Location |
|:--|:--|
| Driver `introspectSchema()` (dialect-aware) | `packages/plugins/driver-sql/src/sql-driver.ts` |
| Per-object datasource routing | `packages/objectql/src/engine.ts`, `Object.datasource` |
| `kernel:ready` hook pattern for plugins | `packages/runtime/src/*-plugin.ts` |
| Metadata type registry | `packages/spec/src/kernel/metadata-plugin.zod.ts` (`DEFAULT_METADATA_TYPE_REGISTRY`) |
| Error formatting helpers | `packages/spec/src/shared/error-map.zod.ts` |
| oclif CLI command groups (e.g. `data/`) | `packages/cli/src/commands/` |
| Service package template + DI | `packages/services/service-*` |

## The three runtime gates

| Gate | Layer | Where | Enforces |
|:--|:--|:--|:--|
| **1. DDL** | driver | `sql-driver` `initObjects`/`dropTable` (+ future `applyMigrations`) | No DDL when `schemaMode !== 'managed'`. |
| **2. Boot validation** | runtime | new `external-validation-plugin` on `kernel:ready` | Federated object must match remote table (fail/warn/ignore). |
| **3. Write** | data engine | `IDataEngine.insert/update/delete` | Writes need `datasource.external.allowWrites` **and** `object.external.writable`. |

## Phased rollout

| Phase | Scope | Status |
|:-----:|:--|:--|
| **P1** | Spec changes (`schemaMode`, `object.external`, error classes) + DDL gate in `driver-sql` + tests | ✅ **Done** (this branch) |
| **P2** | `IExternalDatasourceService` impl + type-compat matrix + CLI `introspect`/`validate` | 🟡 **Service core done** (matrix + contract + service); REST routes + CLI pending |
| **P3** | Boot-validation plugin in `@objectstack/runtime` + `external_catalog` metadata type + caching | ⬜ Todo |
| **P4** | `SchemaRetriever` annotation + agent prompt + AI safety nets (LIMIT injection, timeout) | ⬜ Todo |
| **P5** | Studio UI in `../objectui` (wizard, schema browser, mapping editor, validation panel) | ⬜ Todo |
| **P6** | Write gate + `allowWrites`/`writable` double opt-in + tests | ⬜ Todo |
| **P7** | Additional drivers (Snowflake / BigQuery / MySQL) | ⬜ Todo |

**MVP = P1–P4**: connect a read-only Postgres replica, register a few
tables, let AI Data Chat query them safely.

## P1 — delivered in this change

Spec is additive and backward-compatible (defaults preserve current
behaviour).

1. **`packages/spec/src/data/datasource.zod.ts`**
   - `SchemaModeSchema` enum (`managed` | `external` | `validate-only`),
     exported `SchemaMode` type.
   - `ExternalDatasourceSettingsSchema` (label, allowedSchemas,
     `allowWrites`, validation policy, `credentialsRef`, `queryTimeoutMs`,
     `requirePermission`).
   - `Datasource.schemaMode` (default `'managed'`) + `Datasource.external`,
     with a `superRefine` enforcing the cross-field invariant (external
     settings ⇔ non-managed mode).

2. **`packages/spec/src/data/object.zod.ts`**
   - `ObjectExternalBindingSchema` (remoteName, remoteSchema, `writable`,
     columnMap, introspectedAt, ignoreColumns) + `Object.external`.
   - The object↔datasource cross-artefact invariant is intentionally
     enforced at metadata-load time (P3), not in Zod.

3. **`packages/spec/src/shared/external-errors.ts`** (new)
   - `ExternalSchemaMismatchError` / `ExternalWriteForbiddenError` /
     `ExternalSchemaModeViolationError`, each with a stable `code`.
   - `SchemaDiffEntry` type + pure `renderDiffMessage()` (P2/P3 consume it).

4. **DDL gate — `packages/plugins/driver-sql/src/sql-driver.ts`**
   - `SqlDriverConfig` gains an optional `schemaMode` (stripped before Knex).
   - `assertSchemaMutable()` choke-point throws
     `ExternalSchemaModeViolationError` when `schemaMode !== 'managed'`;
     called from `initObjects` (covers `syncSchema`) and `dropTable`.

5. **Tests** — Zod refinements (datasource modes, external settings,
   object binding), error classes + diff rendering, and the DDL gate
   (managed allows DDL; external/validate-only block create/alter/drop;
   `schemaMode` not leaked to Knex).
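
The DDL gate in item 4 can be sketched as follows. `SketchSqlDriver`, its config shape, and `SketchSchemaModeViolationError` are hypothetical stand-ins; the real driver wraps Knex and throws `ExternalSchemaModeViolationError`.

```typescript
type SchemaMode = 'managed' | 'external' | 'validate-only';

// Hypothetical reduced config; the real SqlDriverConfig extends the Knex config.
interface SketchDriverConfig {
  client: string;
  schemaMode?: SchemaMode;
}

// Stand-in for ExternalSchemaModeViolationError.
class SketchSchemaModeViolationError extends Error {
  readonly name = 'SketchSchemaModeViolationError';
}

class SketchSqlDriver {
  readonly knexConfig: Record<string, unknown>;
  readonly tables: string[] = [];
  private readonly schemaMode: SchemaMode;

  constructor(config: SketchDriverConfig) {
    // Strip schemaMode so it is never forwarded to the query builder.
    const { schemaMode, ...rest } = config;
    this.schemaMode = schemaMode ?? 'managed';
    this.knexConfig = rest;
  }

  /** Single choke-point every schema-mutating method must call first. */
  private assertSchemaMutable(operation: string): void {
    if (this.schemaMode !== 'managed') {
      throw new SketchSchemaModeViolationError(
        `${operation} is not allowed when schemaMode is '${this.schemaMode}'`,
      );
    }
  }

  initObjects(names: string[]): void {
    this.assertSchemaMutable('initObjects');
    this.tables.push(...names); // the real driver would issue CREATE/ALTER here
  }

  dropTable(name: string): void {
    this.assertSchemaMutable('dropTable');
    const index = this.tables.indexOf(name);
    if (index >= 0) this.tables.splice(index, 1);
  }
}
```

Routing every mutating method through one private assertion keeps the rule in a single place, so a future `applyMigrations` would only need to add the same call.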

## P2 — delivered in this change (service core)

1. **`packages/spec/src/data/type-compat.ts`** — pure, dialect-aware matrix
(`canonicalizeSqlType` → `suggestFieldType` / `isCompatible`) covering
postgres/mysql/sqlite/snowflake/bigquery/mongo. Returns `true` / `'lossy'`
/ `false`. Independently unit-tested.

2. **`packages/spec/src/contracts/external-datasource-service.ts`** —
`IExternalDatasourceService` + `RemoteTable`, `GenerateDraftOpts`,
`ObjectDraft`, `SchemaValidationResult`/`Report`. Reuses the existing
`IntrospectedSchema` from `schema-diff-service.ts` and `SchemaDiffEntry`
from `external-errors.ts`.

3. **`packages/services/service-external-datasource`** (new package) —
   `ExternalDatasourceService` implements the contract:
   - `listRemoteTables` (schema-qualified, `allowedSchemas`-filtered),
   - `generateObjectDraft` (type-compat mapping → reviewable `*.object.ts`
     source with `// REVIEW:` markers on lossy/unknown columns),
   - `validateObject` / `validateAll` (structured diffs: `missing_table`,
     `missing_column`, `type_mismatch`; lossy = warning, hard mismatch =
     error; honours `columnMap` + `ignoreColumns`),
   - `refreshCatalog` (snapshot shape; persistence lands with P3's
     `external_catalog` type).

   The service takes injected I/O (`introspect` / `getDatasource` /
   `getObject` / `listObjects`) so it is decoupled and fully unit-tested; the
   `ExternalDatasourceServicePlugin` wires the live `IDataEngine` +
   `IMetadataService` and registers it as the `external-datasource` service.
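
A minimal sketch of how validation could turn the `true` / `'lossy'` / `false` result into structured diffs. Everything here is illustrative: the tiny `compat` table, the direction of `columnMap` (field name to remote column), treating `ignoreColumns` as remote column names, and rating a missing column as an error are all assumptions, not the shipped behaviour.

```typescript
type Compat = true | 'lossy' | false;
type DiffKind = 'missing_table' | 'missing_column' | 'type_mismatch';

// Hypothetical, reduced form of SchemaDiffEntry.
interface DiffEntryLike {
  kind: DiffKind;
  severity: 'error' | 'warning';
  column?: string;
}

// Toy matrix: canonical SQL type -> field type -> compatibility.
const MATRIX: Record<string, Record<string, Compat>> = {
  integer: { number: true, text: 'lossy' },
  text: { text: true },
  timestamp: { datetime: true, text: 'lossy' },
};

function compat(sqlType: string, fieldType: string): Compat {
  return MATRIX[sqlType]?.[fieldType] ?? false;
}

function validateObjectSketch(
  fields: Record<string, string>, // field name -> field type
  remote: Record<string, string> | undefined, // remote column -> canonical SQL type
  opts: { columnMap?: Record<string, string>; ignoreColumns?: string[] } = {},
): DiffEntryLike[] {
  if (!remote) return [{ kind: 'missing_table', severity: 'error' }];
  const diffs: DiffEntryLike[] = [];
  for (const [field, fieldType] of Object.entries(fields)) {
    const column = opts.columnMap?.[field] ?? field;
    if (opts.ignoreColumns?.includes(column)) continue;
    const sqlType = remote[column];
    if (sqlType === undefined) {
      diffs.push({ kind: 'missing_column', severity: 'error', column });
      continue;
    }
    const result = compat(sqlType, fieldType);
    if (result === 'lossy') diffs.push({ kind: 'type_mismatch', severity: 'warning', column });
    else if (result === false) diffs.push({ kind: 'type_mismatch', severity: 'error', column });
  }
  return diffs;
}
```

Because the function receives the remote columns as plain data, it can be unit-tested without a database, which mirrors the injected-I/O design described above.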

### Remaining P2 slice (next)

- **REST routes** under `/api/v1/datasources/:name/external/*` (ADR §6.2).
- **CLI** `os datasource list-tables | introspect | validate` (ADR §6.3) —
thin oclif commands over the REST routes.
- Driver introspection plumbing: expose
`getDatasourceDriver(name)` / `introspectDatasource(name)` on the data
engine so the plugin's default `introspect` works end-to-end.

### Follow-up notes / open items for later phases

- **DDL gate plumbing (P3)**: the runtime must inject `Datasource.schemaMode`
into `SqlDriverConfig` when constructing drivers. P1 wires the driver
side and defaults to `'managed'`; the runtime wiring lands with the
boot-validation plugin.
- **`applyMigrations` gate**: `ISchemaDiffService.applyMigrations` also
needs the gate (per ADR §5.1) when the migration runner ships.
- **Lint rule** preventing plugins from bypassing the gate via raw `knex`
(ADR §12 risk row) — defer to P2/P3 alongside the service.
- **error-map envelope**: map the three `code`s into the shared error
envelope when P6 surfaces them over REST.
85 changes: 85 additions & 0 deletions packages/cli/src/commands/datasource/introspect.ts
@@ -0,0 +1,85 @@
// Copyright (c) 2025 ObjectStack. Licensed under the Apache-2.0 license.

import { Args, Command, Flags } from '@oclif/core';
import { writeFile } from 'node:fs/promises';
import { resolve, isAbsolute, sep } from 'node:path';

/** Resolve server URL + token from flags then env (mirrors createApiClient). */
function resolveTarget(flags: { url?: string; token?: string }): { url: string; token?: string } {
  const url = flags.url || process.env.OS_CLOUD_URL || 'http://localhost:3000';
  const token = flags.token || process.env.OS_TOKEN;
  return { url, token };
}

/**
 * `os datasource introspect <name> --table <remote>` — generate an Object
 * draft (`*.object.ts`) from a remote table (ADR-0015).
 * POST /api/v1/datasources/:name/external/tables/:remote/draft.
 */
export default class DatasourceIntrospect extends Command {
  static override description = 'Generate an Object draft from a remote table on an external datasource';

  static override examples = [
    '$ os datasource introspect warehouse --table fact_orders',
    '$ os datasource introspect warehouse --table fact_orders --out objects/wh_order.object.ts',
  ];

  static override args = {
    name: Args.string({ description: 'Datasource name', required: true }),
  };

  static override flags = {
    url: Flags.string({ char: 'u', description: 'Server URL', env: 'OS_CLOUD_URL' }),
    token: Flags.string({ char: 't', description: 'Authentication token', env: 'OS_TOKEN' }),
    table: Flags.string({ char: 'T', description: 'Remote table name', required: true }),
    out: Flags.string({ char: 'o', description: 'Write the generated source to this file (under the current working directory)' }),
  };

  async run(): Promise<void> {
    const { args, flags } = await this.parse(DatasourceIntrospect);
    const { url, token } = resolveTarget(flags);

    // Encode both path segments so names with reserved characters stay one segment.
    const res = await fetch(
      `${url}/api/v1/datasources/${encodeURIComponent(args.name)}/external/tables/${encodeURIComponent(flags.table)}/draft`,
      {
        method: 'POST',
        headers: {
          'content-type': 'application/json',
          ...(token ? { authorization: `Bearer ${token}` } : {}),
        },
        body: '{}',
      },
    );
    const body = (await res.json()) as {
      draft?: { source?: string; review?: Array<{ column: string; note: string }> };
      error?: string;
    };
    if (body.error) this.error(body.error);

    const draft = body.draft;
    if (!draft?.source) {
      this.error(`Failed to generate draft for '${flags.table}' on '${args.name}'.`);
      return;
    }

    if (flags.out) {
      // Constrain the output path to the current working directory: the file
      // body is server-generated TypeScript, so refuse to write it outside the
      // project tree. This rejects absolute paths and `..` traversal in `--out`.
      // `sep` keeps the prefix check correct on Windows as well as POSIX.
      const target = resolve(process.cwd(), flags.out);
      if (isAbsolute(flags.out) || !target.startsWith(process.cwd() + sep)) {
        this.error(`--out must be a relative path within the current directory: ${flags.out}`);
        return;
      }
      await writeFile(target, draft.source, 'utf8');
      // Review note (CodeQL code scanning, severity Medium): "Network data
      // written to file". The write above depends on untrusted data.
      this.log(`Wrote ${flags.out}`);
    } else {
      this.log(draft.source);
    }

    for (const r of draft.review ?? []) {
      this.warn(`REVIEW: column '${r.column}' — ${r.note}`);
    }
  }
}
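
The containment check on `--out` can also be written with `path.relative`, which avoids string-prefix comparisons altogether. This is a hedged alternative sketch, not what the command ships; `resolveWithin` is a hypothetical helper, and it does not resolve symlinks.

```typescript
import { isAbsolute, relative, resolve, sep } from 'node:path';

/**
 * Returns the resolved path when `candidate` stays strictly inside `baseDir`,
 * otherwise undefined. Hypothetical helper for illustration only.
 */
function resolveWithin(baseDir: string, candidate: string): string | undefined {
  const base = resolve(baseDir);
  const target = resolve(base, candidate);
  const rel = relative(base, target);
  const escapes =
    rel === '' || // the base directory itself is not a file target
    rel === '..' ||
    rel.startsWith('..' + sep) || // climbs out of the base directory
    isAbsolute(rel); // e.g. a different drive on Windows
  return escapes ? undefined : target;
}
```

Comparing against `'..' + sep` rather than a bare `'..'` prefix keeps legitimate file names such as `..draft.ts` usable.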