23 changes: 23 additions & 0 deletions .changeset/seed-identity-binding.md
@@ -0,0 +1,23 @@
---
"@objectstack/spec": minor
"@objectstack/runtime": minor
---

Seed data: first-class identity binding + loud failures (fixes #1389)

Records seeded via `defineDataset` / `defineStack({ data })` can now bind to a
platform user with `` cel`os.user.id` `` (and to the org with `` cel`os.org.id` ``),
which previously never resolved at boot.

- **`os.user` / `os.org` now actually resolve.** The runtime provisions a
deterministic, non-loginable system user (`usr_system`, role `system`)
*before* any seed runs and binds it to `os.user`, so identity-derived seed
values resolve even on a fresh boot — before the first human sign-up. The
human login admin remains a separate better-auth identity and need not own
seed data. Exposed as the canonical `SystemUserId.SYSTEM` constant.
- **New `SeedLoaderConfig.identity`** carries the `os.user` / `os.org` subject
into CEL evaluation (`@objectstack/spec`).
- **Failures are loud, not silent.** A record whose CEL value can't resolve
(e.g. a required `` cel`os.user.id` `` with no identity), or that fails to
write, is now counted as an error, marks the load unsuccessful, and logs an
actionable message, instead of being silently dropped.
78 changes: 78 additions & 0 deletions content/docs/guides/seed-data.mdx
@@ -203,6 +203,84 @@ export const SeedData = [accountsSeed, contactsSeed];

---

## Dynamic Values (CEL)

Any field value may be a **CEL expression** evaluated at install time against a
single per-load pinned `now`. This is the only correct way to author time-based
or identity-derived seed values — a literal `new Date()` would ship the package
author's clock to every customer and break build determinism.

```typescript
import { defineDataset, cel } from '@objectstack/spec';

defineDataset(Opportunity, {
  records: [{
    name: 'Acme Q3 Renewal',
    close_date: cel`daysFromNow(45)`,
    created_at: cel`now()`,
    owner_id: cel`os.user.id`, // the seed identity
    organization_id: cel`os.org.id`,
  }],
});
```

Available in the seed CEL context:

- **Functions:** `now()`, `today()`, `daysFromNow(n)`, `daysAgo(n)`,
`isBlank(v)`, `coalesce(v, fallback)`
- **Scope:** `os.user`, `os.org`, `os.env`
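
To illustrate why a single pinned `now` matters, here is a minimal sketch of how the time helpers could be derived from one timestamp captured per load. `makeSeedClock` and the `SeedClock` shape are hypothetical names for illustration, and the ISO string return formats are an assumption; this is not the runtime's actual implementation.

```typescript
// Hypothetical sketch: every time helper closes over ONE timestamp captured
// at the start of a load, so all records in that load agree on "now".
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface SeedClock {
  now(): string;
  today(): string;
  daysFromNow(n: number): string;
  daysAgo(n: number): string;
}

function makeSeedClock(pinned: Date): SeedClock {
  const base = pinned.getTime();
  const iso = (ms: number): string => new Date(ms).toISOString();
  return {
    now: () => iso(base),
    // Date-only form (YYYY-MM-DD) in UTC; the real format is an assumption.
    today: () => iso(base).slice(0, 10),
    daysFromNow: (n) => iso(base + n * MS_PER_DAY),
    daysAgo: (n) => iso(base - n * MS_PER_DAY),
  };
}

// Usage: pin the clock once, then evaluate every record against it.
const clock = makeSeedClock(new Date('2024-01-01T00:00:00.000Z'));
console.log(clock.now(), clock.daysFromNow(45));
```

Because the helpers never call `Date.now()` themselves, two records evaluated milliseconds apart still receive identical values, which is what makes a load reproducible.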

### Binding records to a user (`os.user`)

Many objects have a **required** owner lookup — `owner_id`, `created_by`,
`assigned_to`. To seed such a record, bind it to a user with `` cel`os.user.id` ``.
This is the single canonical convention; there is no `currentUser()`, `@admin`,
or similar special syntax.

```typescript
defineDataset(Project, {
  externalId: 'code',
  records: [{
    code: 'bootstrap',
    name: 'Bootstrap Project',
    owner_id: cel`os.user.id`, // ← bound to the seed identity
  }],
});
```

**Where does `os.user` come from?** On a fresh boot there are no human users yet
— seeding runs *before* the first sign-up. So the runtime provisions a
deterministic, non-loginable **system user** (`usr_system`, role `system`)
*before* any seed runs and binds it to `os.user`. It owns seeded data the way
Salesforce's "Automated Process" user does — it has no credential and **cannot
sign in**.

- The **human login admin** is created separately (CLI sign-up / first-signup
promotion) through better-auth and need **not** be the seed owner.
- `os.org.id` resolves to the current organization; during a per-tenant replay
it is that tenant's id, falling back to the load's `organizationId`.
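
The `os.org.id` fallback described above can be sketched as follows. The `SeedIdentity` and `SeedScopeInput` shapes and the `buildSeedScope` helper are hypothetical and only approximate the `identity` / `organizationId` config fields; the real types live in `@objectstack/spec`.

```typescript
// Hypothetical shapes approximating the loader's identity config.
interface SeedIdentity {
  user?: { id: string; role?: string; email?: string };
  org?: { id: string };
}

interface SeedScopeInput {
  identity?: SeedIdentity;
  organizationId?: string;
}

interface SeedScope {
  user?: { id: string; role?: string; email?: string };
  org?: { id: string };
}

// Build the `os` scope handed to CEL evaluation: an explicit identity.org
// wins; otherwise fall back to the load's organizationId.
function buildSeedScope(input: SeedScopeInput): SeedScope {
  const scope: SeedScope = {};
  const user = input.identity?.user;
  if (user !== undefined) scope.user = user;
  const orgId = input.identity?.org?.id ?? input.organizationId;
  if (orgId !== undefined) scope.org = { id: orgId };
  return scope;
}

// Usage: a per-tenant replay that supplies only organizationId.
console.log(buildSeedScope({ organizationId: 'org_123' }));
```

Leaving `user` or `org` absent, rather than substituting a placeholder, is what lets an unresolved `os.user.id` be reported as an error instead of writing a bogus value.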

This ordering guarantee means `` cel`os.user.id` `` / `` cel`os.org.id` `` always
resolve at boot — you never have to sequence seeds around user creation.
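
A minimal sketch of the "provision first, then seed" ordering, assuming an idempotent lookup-then-insert keyed on the stable id. `InMemoryUsers`, `ensureSystemUser`, and `boot` are hypothetical stand-ins and are synchronous for brevity; the runtime's real data calls are asynchronous and go through its own engine.

```typescript
// Hypothetical in-memory stand-in for the sys_user table.
interface UserRow {
  id: string;
  name: string;
  email: string;
  role: string;
}

class InMemoryUsers {
  private readonly rows = new Map<string, UserRow>();
  insertCount = 0;

  findById(id: string): UserRow | undefined {
    return this.rows.get(id);
  }

  insert(row: UserRow): void {
    if (this.rows.has(row.id)) throw new Error(`duplicate id: ${row.id}`);
    this.rows.set(row.id, row);
    this.insertCount += 1;
  }
}

const SYSTEM_USER_ID = 'usr_system';

// Look up by the stable id, insert once, reuse thereafter.
function ensureSystemUser(users: InMemoryUsers): { user: { id: string; role: string } } {
  if (users.findById(SYSTEM_USER_ID) === undefined) {
    users.insert({
      id: SYSTEM_USER_ID,
      name: 'System',
      email: 'system@objectstack.local',
      role: 'system',
    });
  }
  return { user: { id: SYSTEM_USER_ID, role: 'system' } };
}

// Boot order: identity first, then seeds that reference it.
function boot(users: InMemoryUsers): string {
  const identity = ensureSystemUser(users);
  return identity.user.id; // what `os.user.id` resolves to in this sketch
}
```

Running `boot` repeatedly against the same store inserts the system user only once, so restarts do not create duplicate owners.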

### Failure is loud, not silent

If a record uses a CEL value that cannot be resolved (e.g. `` cel`os.user.id` ``
when the system identity could not be provisioned), the record is **not silently
dropped**. The loader counts it as an error, marks the load unsuccessful, and
logs an actionable message:

```
[SeedLoader] Cannot resolve dynamic seed values for project record #0:
... Records using cel`os.user.id` / cel`os.org.id` require a seed identity —
ensure a system/admin user exists before seeding.
```

Write failures (e.g. a required field still missing after resolution) are
surfaced the same way. Tooling should check `result.success` / `result.errors`.
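
As a rough sketch of what such a check could look like, the helper below turns a load result into report lines and caps how many individual errors are listed. `SeedResult` is a hypothetical subset of the loader result (only `success`, `errors`, and `summary.totalErrored` are assumed), and `reportSeedResult` is an illustrative name, not an API of the package.

```typescript
// Hypothetical subset of the loader result; field names are assumptions
// modelled on `result.success` / `result.errors` mentioned above.
interface SeedError {
  message: string;
}

interface SeedResult {
  success: boolean;
  errors: SeedError[];
  summary: { totalErrored: number };
}

// Returns an empty list on success; otherwise a headline plus up to
// `maxShown` individual error messages and a count of the remainder.
function reportSeedResult(result: SeedResult, maxShown = 20): string[] {
  if (result.success) return [];
  const lines = [`seed load failed: ${result.summary.totalErrored} record(s) errored`];
  for (const e of result.errors.slice(0, maxShown)) {
    lines.push(`  - ${e.message}`);
  }
  const hidden = result.errors.length - maxShown;
  if (hidden > 0) lines.push(`  ...and ${hidden} more error(s)`);
  return lines;
}

// Usage: fail a CI step when any report lines are produced.
const report = reportSeedResult({
  success: false,
  errors: [{ message: 'Cannot resolve dynamic seed values for project record #0' }],
  summary: { totalErrored: 1 },
});
if (report.length > 0) console.error(report.join('\n'));
```

Capping the list keeps logs readable when one root cause (such as a missing seed identity) fails many records at once.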

---

## Organising Multiple Datasets

For applications with several objects, co-locate seed files under `src/data/` and
115 changes: 109 additions & 6 deletions packages/runtime/src/app-plugin.ts
@@ -5,6 +5,7 @@ import { readEnvWithDeprecation } from '@objectstack/types';
import { SeedLoaderService } from './seed-loader.js';
import { loadDisabledPackageIds } from './package-state-store.js';
import type { IMetadataService, II18nService } from '@objectstack/spec/contracts';
import { SystemUserId } from '@objectstack/spec/system';
import { QuickJSScriptRunner } from './sandbox/quickjs-runner.js';
import { hookBodyRunnerFactory, actionBodyRunnerFactory } from './sandbox/body-runner.js';

@@ -482,6 +483,13 @@ export class AppPlugin implements Plugin {
object: d.object,
}));

// Resolve the seed identity (os.user / os.org) BEFORE any seed
// runs. Deterministically ensures a non-loginable system user
// exists so identity-derived seed values (e.g.
// `owner_id: cel`os.user.id``) resolve at boot — before the
// first human sign-up. See ensureSeedIdentity().
const seedIdentity = await this.ensureSeedIdentity(ql, ctx.logger);

// Stash datasets on a kernel service so SecurityPlugin's
// sys_organization insert hook can replay them per-tenant
// (Salesforce-sandbox style: every new org gets its own
@@ -529,6 +537,11 @@ export class AppPlugin implements Plugin {
defaultMode: 'upsert',
multiPass: true,
organizationId,
// Bind os.user (system identity) and os.org (this
// tenant) so identity-derived seed values resolve
// per-org. org.id falls back to organizationId
// inside the loader when identity.org is absent.
identity: seedIdentity,
},
});
const result = await seedLoader.load(request);
@@ -571,14 +584,38 @@ export class AppPlugin implements Plugin {
 const { SeedLoaderRequestSchema } = await import('@objectstack/spec/data');
 const request = SeedLoaderRequestSchema.parse({
   datasets: normalizedDatasets,
-  config: { defaultMode: 'upsert', multiPass: true },
+  config: { defaultMode: 'upsert', multiPass: true, identity: seedIdentity },
 });
 const result = await seedLoader.load(request);
-ctx.logger.info('[Seeder] Seed loading complete', {
-  inserted: result.summary.totalInserted,
-  updated: result.summary.totalUpdated,
-  errors: result.errors.length,
-});
+const { totalInserted, totalUpdated, totalSkipped, totalErrored } = result.summary;
+if (result.success) {
+  ctx.logger.info('[Seeder] Seed loading complete', {
+    inserted: totalInserted,
+    updated: totalUpdated,
+    skipped: totalSkipped,
+    errored: totalErrored,
+  });
+} else {
+  // LOUD FAILURE: dropped records were previously
+  // invisible (the summary only logged errors.length and
+  // omitted totalErrored). Report the count AND each
+  // actionable reason so broken seeds can't pass silently.
+  ctx.logger.warn(
+    `[Seeder] Seed loading completed with ${totalErrored} dropped record(s) and ${result.errors.length} error(s) for ${appId}`,
+    {
+      inserted: totalInserted,
+      updated: totalUpdated,
+      skipped: totalSkipped,
+      errored: totalErrored,
+    },
+  );
+  for (const e of result.errors.slice(0, 20)) {
+    ctx.logger.warn(`[Seeder] ✗ ${e.message}`);
+  }
+  if (result.errors.length > 20) {
+    ctx.logger.warn(`[Seeder] …and ${result.errors.length - 20} more error(s)`);
+  }
+}
} else {
// Fallback: basic insert when metadata service is not available
ctx.logger.debug('[Seeder] No metadata service; using basic insert fallback');
@@ -633,6 +670,72 @@ export class AppPlugin implements Plugin {
this.emitCatalogEvent(ctx, 'app:unregistered', sys);
}

/**
* Resolve the identity bound to `os.user` / `os.org` for seed CEL values.
*
* On a fresh boot there are zero users until the first human sign-up
* (which the SeedLoader runs *before*), so identity-derived seeds like
* `owner_id: cel`os.user.id`` had nothing to resolve against and were
* dropped silently. To make seeds deterministic and self-sufficient we
* upsert a single non-loginable **system user** (`usr_system`) and bind
* it as `os.user`.
*
* Why a dedicated system user rather than the login admin:
* - `sys_user` is better-auth-managed and schema-locked (ADR-0010); the
* password lives in `sys_account`, so a *loginable* admin can only be
* minted through better-auth (the CLI does this via HTTP sign-up after
* boot). A raw insert here would bypass those invariants.
* - `usr_system` is an owner identity only (no credential row), analogous
* to Salesforce's "Automated Process" user. The human admin is created
* independently and need not be the seed owner.
*
* Idempotent: matches by the stable id, inserts once, reuses thereafter.
* Failures are non-fatal (logged) — records that actually need `os.user`
* then fail loudly in the loader with an actionable message.
*/
private async ensureSeedIdentity(
ql: any,
logger: PluginContext['logger'],
): Promise<{ user: { id: string; role: string; email: string } }> {
// Deterministic, non-loginable service identity that owns seeded data.
const SYSTEM_USER_ID = SystemUserId.SYSTEM;
const SYSTEM_USER_EMAIL = 'system@objectstack.local';
const identity = { user: { id: SYSTEM_USER_ID, role: 'system', email: SYSTEM_USER_EMAIL } };
const opts = { context: { isSystem: true } } as any;

try {
const existing = await (ql as any).find(
'sys_user',
{ where: { id: SYSTEM_USER_ID }, limit: 1 },
opts,
);
if (Array.isArray(existing) && existing.length > 0) {
return identity;
}
await (ql as any).insert(
'sys_user',
{
id: SYSTEM_USER_ID,
name: 'System',
email: SYSTEM_USER_EMAIL,
email_verified: true,
role: 'system',
},
opts,
);
logger.info(
`[Seeder] Provisioned deterministic system user (${SYSTEM_USER_ID}) as seed owner — binds os.user for identity-derived seed values`,
);
} catch (err: any) {
// Non-fatal: identity-dependent records will fail loudly in the
// loader; identity-free records still seed normally.
logger.warn('[Seeder] Failed to ensure system seed user; os.user-dependent seeds may be dropped', {
error: err?.message ?? String(err),
});
}
return identity;
}

/**
* Emit a kernel hook so the control-plane `AppCatalogService` can
* upsert / delete the corresponding `sys_app` row. Silently no-ops
132 changes: 132 additions & 0 deletions packages/runtime/src/seed-loader.test.ts
@@ -1015,6 +1015,138 @@ describe('SeedLoaderService', () => {
// load — edge cases
// ========================================================================

describe('load — seed identity (os.user / os.org)', () => {
// CEL Expression envelope helper — same shape the spec persists.
const cel = (source: string) => ({ dialect: 'cel', source });

const baseConfig = (extra: Record<string, any> = {}): SeedLoaderConfig => ({
dryRun: false,
haltOnError: false,
multiPass: false,
defaultMode: 'upsert',
batchSize: 1000,
transaction: false,
...extra,
} as SeedLoaderConfig);

it('resolves cel`os.user.id` from config.identity', async () => {
const metadata = createMockMetadata({
note: { name: 'note', fields: { name: { type: 'text' }, author: { type: 'text' } } },
});
const engine = createMockEngine();
const loader = new SeedLoaderService(engine, metadata, logger);

const result = await loader.load({
datasets: [
{
object: 'note',
externalId: 'name',
mode: 'insert',
env: ['prod', 'dev', 'test'],
records: [{ name: 'N1', author: cel('os.user.id') }],
},
],
config: baseConfig({
identity: { user: { id: 'usr_system', role: 'system', email: 'system@objectstack.local' } },
}),
});

expect(result.success).toBe(true);
expect(result.summary.totalErrored).toBe(0);
expect(engine.insert).toHaveBeenCalledWith(
'note',
expect.objectContaining({ name: 'N1', author: 'usr_system' }),
expect.anything(),
);
});

it('fails loudly (no raw envelope written) when os.user is unbound', async () => {
const metadata = createMockMetadata({
note: { name: 'note', fields: { name: { type: 'text' }, author: { type: 'text' } } },
});
const engine = createMockEngine();
const loader = new SeedLoaderService(engine, metadata, logger);

const result = await loader.load({
datasets: [
{
object: 'note',
externalId: 'name',
mode: 'insert',
env: ['prod', 'dev', 'test'],
records: [{ name: 'N1', author: cel('os.user.id') }],
},
],
// No identity → os.user unbound → record must be dropped, not written.
config: baseConfig(),
});

expect(result.success).toBe(false);
expect(result.summary.totalErrored).toBe(1);
expect(result.errors).toHaveLength(1);
expect(result.errors[0].message).toContain('os.user');
// Critically: the unresolved Expression envelope is NEVER persisted.
expect(engine.insert).not.toHaveBeenCalled();
});

it('falls back os.org.id to organizationId during per-tenant replay', async () => {
const metadata = createMockMetadata({
note: { name: 'note', fields: { name: { type: 'text' }, org_label: { type: 'text' } } },
});
const engine = createMockEngine();
const loader = new SeedLoaderService(engine, metadata, logger);

const result = await loader.load({
datasets: [
{
object: 'note',
externalId: 'name',
mode: 'insert',
env: ['prod', 'dev', 'test'],
records: [{ name: 'N1', org_label: cel('os.org.id') }],
},
],
config: baseConfig({ organizationId: 'org_123' }),
});

expect(result.success).toBe(true);
expect(engine.insert).toHaveBeenCalledWith(
'note',
expect.objectContaining({ org_label: 'org_123' }),
expect.anything(),
);
});

it('surfaces write failures in result.errors (loud, not just counted)', async () => {
const metadata = createMockMetadata({
account: { name: 'account', fields: { name: { type: 'text' } } },
});
const engine = createMockEngine();
engine.insert = vi.fn(async () => {
throw new Error('boom');
});
const loader = new SeedLoaderService(engine, metadata, logger);

const result = await loader.load({
datasets: [
{
object: 'account',
externalId: 'name',
mode: 'insert',
env: ['prod', 'dev', 'test'],
records: [{ name: 'Acme' }],
},
],
config: baseConfig({ defaultMode: 'insert' }),
});

expect(result.success).toBe(false);
expect(result.summary.totalErrored).toBe(1);
expect(result.errors.length).toBeGreaterThan(0);
expect(result.errors[0].message).toContain('Failed to write');
});
});

describe('load — edge cases', () => {
it('should handle records with no matching externalId field', async () => {
const metadata = createMockMetadata({