42 changes: 42 additions & 0 deletions .changeset/negation-and-evaluation-retirement.md
@@ -0,0 +1,42 @@
---
"@_linked/core": minor
---

### New: `.none()` collection quantifier

Added `.none()` on `QueryShapeSet` for filtering where no elements match a condition:

```typescript
// "People who have NO friends that play chess"
Person.select(p => p.name)
.where(p => p.friends.none(f => f.hobby.equals('Chess')))
```

Generates `FILTER(NOT EXISTS { ... })` in SPARQL. Equivalent to `.some(fn).not()`.

### Changed: `.equals()` now returns `ExpressionNode` (was `Evaluation`)

`.equals()` on query proxies now returns `ExpressionNode` instead of `Evaluation`, enabling `.not()` chaining:

```typescript
// Now works — .equals() chains with .not()
.where(p => p.name.equals('Alice').not())
.where(p => Expr.not(p.name.equals('Alice')))
```

### Changed: `.some()` / `.every()` now return `ExistsCondition` (was `SetEvaluation`)

`.some()` and `.every()` on collections now return `ExistsCondition` which supports `.not()`:

```typescript
.where(p => p.friends.some(f => f.name.equals('Alice')).not()) // same as .none()
```

### Breaking: `Evaluation` class removed

The `Evaluation` class and related types (`SetEvaluation`, `WhereMethods`, `WhereEvaluationPath`) have been removed. Code that imported or depended on these types must migrate to `ExpressionNode` / `ExistsCondition`. The `WhereClause` type now accepts `ExpressionNode | ExistsCondition | callback`.

### New exports

- `ExistsCondition` — from `@_linked/core/expressions/ExpressionNode`
- `isExistsCondition()` — type guard for ExistsCondition
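A minimal sketch of how the new type guard might be used. The shapes below are assumptions for illustration — the real `ExpressionNode` and `ExistsCondition` definitions live in `@_linked/core/expressions/ExpressionNode` and will differ:

```typescript
// Hypothetical minimal shapes — stand-ins for the real exports.
interface ExpressionNode {
  kind: 'expression';
}

interface ExistsCondition {
  kind: 'exists';
  negated: boolean;
}

// Narrows a where-clause value so callers can branch on condition type.
function isExistsCondition(
  value: ExpressionNode | ExistsCondition,
): value is ExistsCondition {
  return value.kind === 'exists';
}

const cond: ExpressionNode | ExistsCondition = { kind: 'exists', negated: true };
if (isExistsCondition(cond)) {
  // Safe: cond is narrowed to ExistsCondition here.
  console.log(cond.negated);
}
```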
12 changes: 5 additions & 7 deletions README.md
@@ -1,15 +1,16 @@
# @_linked/core
Core Linked package for the query DSL, SHACL shape decorators/metadata, and package registration.

Linked core gives you a type-safe, schema-parameterized query language and SHACL-driven Shape classes for linked data. It compiles queries into a normalized [Intermediate Representation (IR)](./documentation/intermediate-representation.md) that can be executed by any store.
A type-safe graph query builder and OGM for linked data — like Drizzle or Prisma, but for RDF and SPARQL.

Linked gives you a schema-parameterized query language and SHACL-driven Shape classes for graph data. It compiles queries into a normalized [Intermediate Representation (IR)](./documentation/intermediate-representation.md) that can be executed by any store — SPARQL endpoints, in-memory RDF stores, or custom backends.

## Linked core offers

- **Schema-Parameterized Query DSL**: TypeScript-embedded queries driven by your Shape definitions.
- **Fully Inferred Result Types**: The TypeScript return type of every query is automatically inferred from the selected paths — no manual type annotations needed. Select `p.name` and get `{id: string; name: string}[]`. Select `p.friends.name` and get nested result types. This works for all operations: select, create, update, and delete.
- **Dynamic Query Building**: Build queries programmatically with `QueryBuilder`, compose field selections with `FieldSet`, and serialize/deserialize queries as JSON — for CMS dashboards, dynamic forms, and API-driven query construction.
- **Shape Classes (SHACL)**: TypeScript classes that generate SHACL shape metadata.
- **Object-Oriented Data Operations**: Query, create, update, and delete data using the same Shape-based API.
- **Full CRUD Operations**: Query, create, update, and delete data using the same Shape-based API — including expression-based updates, conditional mutations, and bulk operations.
- **Storage Routing**: `LinkedStorage` routes query objects to your configured store(s) that implement `IQuadStore`.
- **Automatic Data Validation**: SHACL shapes can be synced to your store for schema-level validation, and enforced at runtime by stores that support it.

@@ -28,10 +29,7 @@ npm install
npm run setup
```

`npm run setup` syncs `docs/agents` into local folders for agent tooling:

- `.claude/agents`
- `.agents/agents`
`npm run setup` installs agent skills and syncs tooling configuration.

## Related packages

102 changes: 102 additions & 0 deletions docs/ideas/016-aggregations.md
@@ -0,0 +1,102 @@
---
summary: Expose sum/avg/min/max aggregate methods in DSL and explore explicit groupBy API
packages: [core]
---

# Aggregations — Ideation

## Context

Linked currently exposes only `.size()` (COUNT) as an aggregate in the DSL. The IR, SPARQL algebra, and serializer layers already support `sum`, `avg`, `min`, `max` — they're just not wired to the query surface.

### What exists today

**DSL layer** — `SelectQuery.ts`:
- `QueryShapeSet.size()` (line 1138) and `QueryPrimitiveSet.size()` (line 1511) return a `SetSize` object
- `SetSize` class (lines 1517–1549) builds a `SizeStep` with count metadata

**IR layer** — `IntermediateRepresentation.ts`:
- `IRAggregateExpression` (lines 177–181) already defines all five aggregate names:
```typescript
type IRAggregateExpression = {
kind: 'aggregate_expr';
name: 'count' | 'sum' | 'avg' | 'min' | 'max';
args: IRExpression[];
};
```

**SPARQL algebra** — `SparqlAlgebra.ts`:
- `SparqlAggregateExpr` (lines 137–142) supports any aggregate name + `distinct` flag
- `SparqlSelectPlan` (lines 174–185) has `groupBy?: string[]`, `having?: SparqlExpression`, `aggregates?: SparqlAggregateBinding[]`

**IR → Algebra** — `irToAlgebra.ts`:
- Lines 423–500: projection handling detects aggregate expressions, builds aggregates array, infers GROUP BY from non-aggregate projected variables
- Lines 765–772: converts `aggregate_expr` IR nodes to `SparqlAggregateExpr`

**Serialization** — `algebraToString.ts`:
- Lines 134–140: serializes `count(...)`, `sum(...)`, etc. with optional DISTINCT prefix
- Lines 292–298: serializes GROUP BY clause

**Existing idea doc** — `docs/ideas/012-aggregate-group-filtering.md`:
- Discusses HAVING semantics and whether `.groupBy()` should be public or remain implicit
- Proposes `count().where(c => c.gt(10))` as aggregate-local filtering syntax

**Pipeline flow for `.size()`:**
```
DSL: p.friends.size()
→ FieldSetEntry { path: ['friends'], aggregation: 'count' }
→ DesugaredCountStep { kind: 'count_step', path: [...] }
→ IRProjectionItem { expression: { kind: 'aggregate_expr', name: 'count', args: [...] } }
→ SparqlAggregateExpr → "COUNT(?a0_friends)"
→ auto GROUP BY on non-aggregate variables
```
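The desugaring step in the middle of this pipeline can be sketched roughly as below. The interfaces are simplified assumptions (the real `FieldSetEntry` and `IRAggregateExpression` shapes in the codebase carry more fields, and `args` is the broader `IRExpression[]`):

```typescript
// Simplified stand-in for the real FieldSetEntry.
interface FieldSetEntry {
  path: string[];
  aggregation?: 'count';
}

// Simplified stand-in for IRAggregateExpression, with args
// narrowed to variable references for this sketch.
interface IRAggregateExpression {
  kind: 'aggregate_expr';
  name: 'count' | 'sum' | 'avg' | 'min' | 'max';
  args: { kind: 'var'; name: string }[];
}

// Sketch: desugar a counted field entry into an IR aggregate expression.
function desugarCount(entry: FieldSetEntry): IRAggregateExpression {
  return {
    kind: 'aggregate_expr',
    name: entry.aggregation ?? 'count',
    args: [{ kind: 'var', name: entry.path.join('_') }],
  };
}
```

Expanding `aggregation` beyond `'count'` in this shape is essentially the change proposed in the Notes below.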

**Test coverage:**
- `query-fixtures.ts`: `countFriends`, `countNestedFriends`, `countLabel`, `countValue`, `countEquals`
- Golden SPARQL tests confirm `(count(?a0_friends) AS ?a1)` with `GROUP BY ?a0`

### How other libraries do it

**SQLAlchemy:**
```python
select(func.count(User.id), func.avg(User.balance)).group_by(User.name).having(func.count() > 5)
```

**Drizzle:**
```typescript
db.select({ count: count(), avg: avg(users.age) }).from(users).groupBy(users.name)
```

**Prisma:**
```typescript
prisma.user.groupBy({ by: ['role'], _count: true, _avg: { balance: true }, having: { balance: { _avg: { gt: 100 } } } })
```

## Goals

- Expose `sum`, `avg`, `min`, `max` in the DSL alongside existing `size()` (count)
- Decide whether to add explicit `.groupBy()` or keep implicit grouping
- Maintain type safety — aggregate results should infer as `number`
- Keep the fluent expression style consistent with the rest of the DSL

## Open Questions

- [ ] Should aggregate methods live on collections (`.friends.age.avg()`) or as standalone Expr functions (`Expr.avg(p.friends.age)`)?
- [ ] Should `.size()` be aliased to `.count()` for consistency with sum/avg/min/max naming?
- [ ] Should explicit `.groupBy()` be introduced, or should grouping remain implicit from aggregate usage?
- [ ] How should aggregates on scalar properties work (e.g., `p.age.avg()` across all persons vs `p.friends.age.avg()` per person)?
- [ ] Should DISTINCT aggregates be supported (e.g., `p.friends.hobby.countDistinct()`)?
- [ ] How does this interact with the HAVING semantics from idea 012?

## Decisions

| # | Decision | Chosen | Rationale |
|---|----------|--------|-----------|

## Notes

- The IR and SPARQL layers are ready — this is primarily a DSL surface + desugaring task
- The `FieldSetEntry.aggregation` field currently only accepts `'count'` — would need to expand to `'sum' | 'avg' | 'min' | 'max'`
- `SetSize` class pattern could be generalized to a `SetAggregate` class
- SPARQL natively supports all five aggregates: `COUNT`, `SUM`, `AVG`, `MIN`, `MAX`, plus `GROUP_CONCAT` and `SAMPLE`
- Implicit GROUP BY (current behavior) keeps simple cases clean but may confuse when mixing aggregates with non-aggregate projections
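The `SetAggregate` generalization suggested above could look roughly like this. Everything here is a sketch — `SetAggregate`, `AggregateStep`, and `buildStep` are hypothetical names, not existing API:

```typescript
// Hypothetical generalization of the existing SetSize pattern.
type AggregateName = 'count' | 'sum' | 'avg' | 'min' | 'max';

interface AggregateStep {
  kind: 'aggregate_step';
  name: AggregateName;
  path: string[];
}

class SetAggregate {
  constructor(
    private readonly name: AggregateName,
    private readonly path: string[],
  ) {}

  // Mirrors how SetSize builds a SizeStep, parameterized by aggregate name.
  buildStep(): AggregateStep {
    return { kind: 'aggregate_step', name: this.name, path: this.path };
  }
}

// One class covers all five aggregates instead of a count-only SetSize.
const avgAge = new SetAggregate('avg', ['friends', 'age']);
avgAge.buildStep();
```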
112 changes: 112 additions & 0 deletions docs/ideas/017-upsert.md
@@ -0,0 +1,112 @@
---
summary: Introduce upsert (create-or-update) semantics to the mutation DSL
packages: [core]
---

# Upsert — Ideation

## Context

Linked currently supports `Person.create({...})` and `Person.update({...}).for({id})` as separate operations. There is no way to express "create if not exists, update if it does" in a single call.

### What exists today

**CreateBuilder** — `CreateBuilder.ts` (lines 1–147):
- `Person.create({ name: 'Alice' })` → builds `IRCreateMutation` → SPARQL INSERT DATA
- IR type (`IntermediateRepresentation.ts:189–193`):
```typescript
type IRCreateMutation = { kind: 'create'; shape: string; data: IRNodeData; };
```
- Always generates a new URI via `generateEntityUri()` in SparqlStore (line 84)

**UpdateBuilder** — `UpdateBuilder.ts` (lines 1–205):
- `Person.update({ name: 'Bob' }).for({ id: '...' })` → builds `IRUpdateMutation` → SPARQL DELETE/INSERT WHERE
- IR type (`IntermediateRepresentation.ts:201–207`):
```typescript
type IRUpdateMutation = { kind: 'update'; shape: string; id: string; data: IRNodeData; traversalPatterns?: IRTraversalPattern[]; };
```
- Supports expression-based updates: `Person.update(p => ({ age: p.age.plus(1) })).for({id})`
- Supports conditional updates: `.where(p => p.status.equals('pending'))`

**SparqlStore execution** — `SparqlStore.ts` (lines 78–120):
- `createQuery`: generates URI + INSERT DATA (line 88)
- `updateQuery`: DELETE/INSERT WHERE (lines 92–101)
- Each mutation is a single `executeSparqlUpdate(sparql)` call

**No upsert anywhere:**
- No "upsert" keyword in codebase
- No conditional create logic
- No SPARQL INSERT ... WHERE NOT EXISTS pattern

### How other libraries do it

**SQLAlchemy (PostgreSQL):**
```python
stmt = pg_insert(User).values(name='Alice', email='alice@example.com')
stmt = stmt.on_conflict_do_update(
index_elements=['email'],
set_={'name': stmt.excluded.name},
)
```

**Drizzle:**
```typescript
await db.insert(users).values({ email: 'x', name: 'Alice' })
.onConflictDoUpdate({ target: users.email, set: { name: 'updated' } });
```

**Prisma:**
```typescript
await prisma.user.upsert({
where: { email: 'alice@example.com' },
update: { name: 'Alice Updated' },
create: { email: 'alice@example.com', name: 'Alice' },
});
```

### RDF/SPARQL considerations

RDF doesn't have primary keys or unique constraints like SQL. Identity is by URI. This changes the upsert semantics:

- **SQL upsert**: "insert row; if unique constraint violated, update instead"
- **RDF upsert**: "ensure this node exists with these properties" — more naturally expressed as:
1. DELETE existing triples for the given properties
2. INSERT new triples
3. Optionally: INSERT the rdf:type triple if the node doesn't exist yet

SPARQL pattern for upsert:
```sparql
DELETE { <node> <prop> ?old }
INSERT { <node> rdf:type <Type> . <node> <prop> <newValue> }
WHERE { OPTIONAL { <node> <prop> ?old } }
```

This is actually what `updateQuery` already does (DELETE old + INSERT new), except it requires the node to already exist via `.for({id})`.

## Goals

- Single API call to create-or-update a node
- Works naturally with RDF's URI-based identity (no unique constraint concept)
- Integrates with existing CreateBuilder/UpdateBuilder patterns
- Supports both "known ID" and "match by properties" use cases

## Open Questions

- [ ] Should the API be `Person.upsert({...})` (Prisma-style split) or `Person.createOrUpdate({...})` (simpler)?
- [ ] For the "known ID" case, should it just be `Person.update({...}).for({id}).createIfNotExists()`?
- [ ] For the "match by properties" case, should we support matching on property values (like SQL's ON CONFLICT)?
- [ ] Should upsert always require an explicit ID, or should it support auto-generating one if the node doesn't exist?
- [ ] How should expression-based updates work in upsert? (e.g., `age: p.age.plus(1)` — what if node doesn't exist yet?)
- [ ] Should we add a new IR mutation kind (`'upsert'`) or compose from existing create + update IR?

## Decisions

| # | Decision | Chosen | Rationale |
|---|----------|--------|-----------|

## Notes

- The simplest implementation: `Person.update({...}).for({id})` already does DELETE/INSERT WHERE. Making it also insert rdf:type if missing would effectively make it an upsert for the known-ID case
- Prisma's three-way split (`where` / `create` / `update`) may be overengineered for RDF where identity = URI
- A simpler RDF-native API might be: `Person.ensure({ id: '...', name: 'Alice' })` — "make sure this node has these values"
- Expression-based updates in upsert are tricky — `p.age.plus(1)` has no value if node doesn't exist. Could require a `defaultValue` or reject expressions in upsert create path
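To make the DELETE/INSERT pattern from the RDF/SPARQL considerations section concrete, here is a string-building sketch for the known-ID, single-property case. The function name and signature are hypothetical; a real implementation would go through the IR and algebra layers rather than concatenating strings:

```typescript
// Hypothetical helper: builds the upsert pattern shown above
// for one node and one property.
function buildUpsertSparql(
  node: string,      // full node URI
  typeUri: string,   // class URI for the rdf:type triple
  prop: string,      // property URI
  newValue: string,  // pre-serialized value, e.g. '"Alice"'
): string {
  // 'a' is SPARQL shorthand for rdf:type; OPTIONAL makes the
  // DELETE a no-op when the node has no existing value.
  return [
    `DELETE { <${node}> <${prop}> ?old }`,
    `INSERT { <${node}> a <${typeUri}> . <${node}> <${prop}> ${newValue} }`,
    `WHERE { OPTIONAL { <${node}> <${prop}> ?old } }`,
  ].join('\n');
}
```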