Merged
18 commits
22e1573
feat: composable prompt system (recall#794)
laynepenney Apr 26, 2026
bfd64d1
fix: harden prompt system against adversarial inputs
laynepenney Apr 26, 2026
1375318
Merge pull request #2 from synapt-dev/feat/794-prompt-system
laynepenney Apr 26, 2026
ea6c274
feat: publish prep for npm + PyPI (recall#795)
laynepenney Apr 26, 2026
c0e2026
fix: publish blockers (readme, prompt assets, CI build, SHA pinning)
laynepenney Apr 26, 2026
615e6ff
Merge pull request #3 from synapt-dev/feat/795-publish
laynepenney Apr 26, 2026
2776748
fix: rename schemas/extraction/ to schemas/extract/ per locked spec
laynepenney Apr 26, 2026
56f95a0
Merge pull request #4 from synapt-dev/fix/schema-path-rename
laynepenney Apr 26, 2026
8538d47
fix: switch npm publish to OIDC trusted publishing
laynepenney Apr 26, 2026
f6681ab
fix: close schema-validator parity gaps (3 findings)
laynepenney Apr 26, 2026
6c45a3d
fix: tighten JSON schemas to match validator semantics
laynepenney Apr 26, 2026
50e0e13
fix: tighten validators to match JSON Schema constraints
laynepenney Apr 26, 2026
8bac94a
fix: validate embedding vector item types
laynepenney Apr 26, 2026
2da87d2
Merge pull request #6 from synapt-dev/fix/npm-oidc-publish
laynepenney Apr 26, 2026
2b56058
Merge pull request #8 from synapt-dev/fix/schema-validator-parity
laynepenney Apr 26, 2026
0096473
test: add ts parity suite for extract
laynepenney Apr 26, 2026
3b7fcb3
test: align extracted_at parity expectation
laynepenney Apr 26, 2026
5b2d120
Merge pull request #7 from synapt-dev/sentinel/ts-test-parity
laynepenney Apr 26, 2026
57 changes: 57 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,57 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  test-python:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: ${{ matrix.python-version }}

      - run: pip install pytest

      - run: pytest tests/python/ -v

  build-python:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: packages/python
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.12"

      - run: pip install build

      - run: test -d ../../prompts && cp -r ../../prompts src/synapt_extract/prompts || true

      - run: python -m build

  check-typescript:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: packages/ts
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
        with:
          node-version: "22"

      - run: npm ci

      - run: npx tsc --noEmit
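
Note on the `build-python` copy step: `test -d ... && cp -r ... || true` succeeds when `../../prompts` is missing, and it also succeeds when the copy itself fails, because `|| true` covers both cases. A rough Python equivalent is sketched here for comparison. `stage_prompts` is a hypothetical helper, not something in this repo, and unlike the shell one-liner it lets a real copy error surface.

```python
import shutil
from pathlib import Path


def stage_prompts(repo_root: Path, package_dir: Path) -> bool:
    """Copy repo_root/prompts into package_dir/prompts if the source exists.

    Hypothetical helper for illustration. Returns True when files were copied
    and False when there was nothing to copy. Copy errors propagate instead of
    being swallowed the way `|| true` swallows them.
    """
    src = repo_root / "prompts"
    dst = package_dir / "prompts"
    if not src.is_dir():
        # A missing source directory is tolerated, as in the workflow step.
        return False
    # dirs_exist_ok=True merges into an existing destination directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return True
```

Whether a failed copy should break the build is a design choice for the workflow author; the sketch only shows what the stricter variant looks like.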
29 changes: 29 additions & 0 deletions .github/workflows/publish-npm.yml
@@ -0,0 +1,29 @@
name: Publish @synapt-dev/extract to npm

on:
  release:
    types: [published]

permissions:
  contents: read
  id-token: write

jobs:
  publish:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: packages/ts
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
        with:
          node-version: "22"
          registry-url: "https://registry.npmjs.org"

      - run: npm ci

      - run: npm run build

      - run: npm publish --provenance --access public
33 changes: 33 additions & 0 deletions .github/workflows/publish-pypi.yml
@@ -0,0 +1,33 @@
name: Publish synapt-extract to PyPI

on:
  release:
    types: [published]

permissions:
  contents: read
  id-token: write

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi
    defaults:
      run:
        working-directory: packages/python
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.12"

      - run: pip install build

      - run: test -d ../../prompts && cp -r ../../prompts src/synapt_extract/prompts || true

      - run: python -m build

      - uses: pypa/gh-action-pypi-publish@76f52bc884231f62b54f3568f20e4e024f6eb07a # release/v1
        with:
          packages-dir: packages/python/dist/
3 changes: 3 additions & 0 deletions .gitignore
@@ -5,6 +5,9 @@ dist/
*.js.map
!packages/ts/src/**/*.ts

packages/ts/prompts/
packages/python/src/synapt_extract/prompts/

__pycache__/
*.pyc
*.egg-info/
117 changes: 117 additions & 0 deletions README.md
@@ -0,0 +1,117 @@
# @synapt-dev/extract

SynaptExtraction is the intermediate language (IL) for [synapt](https://synapt.dev)'s product stack. It is the universal exchange format between text extraction and intelligence operations.

```
Any text + Any LLM -> SynaptExtraction (IL) -> @synapt/memory (intelligence)
```

This repo contains the v1 schema, types, validators, finalization pipeline, and composable prompt system in both TypeScript and Python.

## Packages

| Package | Registry | Install |
|---------|----------|---------|
| `@synapt-dev/extract` | npm | `npm install @synapt-dev/extract` |
| `synapt-extract` | PyPI | `pip install synapt-extract` |

## Quick start

### TypeScript

```typescript
import {
  buildExtractionPrompt,
  finalizeExtraction,
  validateExtraction,
} from "@synapt-dev/extract";

// 1. Build a prompt for your LLM
const prompt = buildExtractionPrompt(text, {
  profile: "standard",
  categories: ["Health", "Family"],
});

// 2. Send to any LLM, parse JSON response
const llmOutput = JSON.parse(await llm.complete(prompt));

// 3. Finalize: inject client context, normalize, validate
const result = finalizeExtraction(llmOutput, {
  produced_by: "openai://gpt-4o-mini",
  user_id: userId,
  kind: "conversa/prayer",
});

console.log(result.extraction); // Complete SynaptExtraction
console.log(result.validation); // { valid: true, errors: [] }
```

### Python

```python
import json

from synapt_extract import (
    build_extraction_prompt,
    finalize_extraction,
    FinalizeContext,
)

# `text`, `llm`, and `user_id` are supplied by your application.

# 1. Build a prompt
prompt = build_extraction_prompt(text, profile="standard")

# 2. Send to any LLM, parse JSON response
llm_output = json.loads(llm.complete(prompt))

# 3. Finalize
result = finalize_extraction(llm_output, FinalizeContext(
    produced_by="openai://gpt-4o-mini",
    user_id=user_id,
    kind="conversa/prayer",
))

assert result.validation.valid
```

## Three-stage pipeline

SynaptExtraction documents are assembled in three stages:

1. **Stage 1 (LLM)**: The LLM extracts content fields (entities, goals, themes, etc.) from text
2. **Stage 2 (Client)**: Your application injects context the LLM can't know (produced_by, user_id, embeddings, extensions)
3. **Stage 3 (Library)**: `finalizeExtraction()` normalizes the document (version injection, capability detection, sub-schema versioning, validation)
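
The three stages can be pictured with plain dictionaries. This is a minimal sketch, not the library's implementation: `finalize_sketch` is a hypothetical stand-in for `finalizeExtraction()`, the field values and entity shape are made up, and the `"version"` key and the required-field check are assumptions chosen only to make the flow visible.

```python
from typing import Any

# Stage 1 (LLM): content fields only. Values and field shapes are made up.
llm_output: dict[str, Any] = {
    "entities": [{"name": "Alice"}],
    "themes": ["health"],
    "summary": "Alice asked for prayer about her recovery.",
}

# Stage 2 (client): context the LLM cannot know.
client_context: dict[str, Any] = {
    "produced_by": "openai://gpt-4o-mini",
    "user_id": "user-123",
    "kind": "conversa/prayer",
}


def finalize_sketch(content: dict[str, Any], context: dict[str, Any]) -> dict[str, Any]:
    """Stage 3 stand-in: merge both inputs, stamp a version, run a toy check.

    Assumptions: the "version" key name and the two required fields. The real
    finalizer is described as also doing capability detection and sub-schema
    versioning, which this sketch leaves out.
    """
    doc = {**content, **context, "version": "v1"}
    missing = [key for key in ("produced_by", "user_id") if key not in doc]
    return {
        "extraction": doc,
        "validation": {"valid": not missing, "errors": missing},
    }


result = finalize_sketch(llm_output, client_context)
print(result["validation"])
```

The point of the split is that the LLM never has to produce provenance or identity fields; those arrive from the client and are merged before validation.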

## Prompt profiles

| Profile | Model class | Capabilities |
|---------|------------|--------------|
| `minimal` | 3B-7B local | entities, entity_state, goals, themes, summary |
| `standard` | GPT-4o-mini, Haiku | + entity_context, goal_timing, facts, temporal_refs, sentiment, evidence_anchoring |
| `full` | GPT-4o, Sonnet, Opus | + entity_ids, goal_entity_refs, relations, relation_origin, assertion_signals, temporal_classes |
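
Reading the `+` in the table as "everything in the previous profile, plus these", the capability set for each profile can be computed as below. This is an illustrative sketch: `capabilities_for` is a hypothetical helper and is not the package's exported `resolve_capabilities`, whose signature is not shown here.

```python
# Capabilities each profile adds on top of the previous one, copied from the table.
PROFILE_ADDITIONS: dict[str, list[str]] = {
    "minimal": ["entities", "entity_state", "goals", "themes", "summary"],
    "standard": [
        "entity_context", "goal_timing", "facts",
        "temporal_refs", "sentiment", "evidence_anchoring",
    ],
    "full": [
        "entity_ids", "goal_entity_refs", "relations",
        "relation_origin", "assertion_signals", "temporal_classes",
    ],
}
PROFILE_ORDER = ["minimal", "standard", "full"]


def capabilities_for(profile: str) -> list[str]:
    """Return the cumulative capability list for a profile (assumed additive)."""
    if profile not in PROFILE_ADDITIONS:
        raise ValueError(f"unknown profile: {profile!r}")
    caps: list[str] = []
    for name in PROFILE_ORDER:
        caps.extend(PROFILE_ADDITIONS[name])
        if name == profile:
            break
    return caps


print(len(capabilities_for("minimal")), len(capabilities_for("full")))
```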

## JSON Schema

The canonical schema is hosted at:

```
https://synapt.dev/schemas/extract/v1.json
```

Sub-schemas: `source-ref/v1.json`, `embedding/v1.json`, `assertion-signals/v1.json`, `temporal-ref/v1.json`.

## Repo structure

```
extract/
  packages/
    ts/              # @synapt-dev/extract (TypeScript, npm)
    python/          # synapt-extract (Python, PyPI)
  schemas/           # JSON Schema files (language-agnostic)
  prompts/
    v1/              # Capability prompt fragments
    profiles/        # Profile definitions (minimal, standard, full)
  tests/
    python/          # Python test suite

## License

MIT
56 changes: 56 additions & 0 deletions packages/python/README.md
@@ -0,0 +1,56 @@
# synapt-extract

SynaptExtraction is the intermediate language (IL) for [synapt](https://synapt.dev)'s product stack. It is the universal exchange format between text extraction and intelligence operations.

```
Any text + Any LLM -> SynaptExtraction (IL) -> @synapt/memory (intelligence)
```

## Install

```bash
pip install synapt-extract
```

## Quick start

```python
import json

from synapt_extract import (
    build_extraction_prompt,
    finalize_extraction,
    FinalizeContext,
)

# `text`, `llm`, and `user_id` are supplied by your application.

# 1. Build a prompt
prompt = build_extraction_prompt(text, profile="standard")

# 2. Send to any LLM, parse JSON response
llm_output = json.loads(llm.complete(prompt))

# 3. Finalize
result = finalize_extraction(llm_output, FinalizeContext(
    produced_by="openai://gpt-4o-mini",
    user_id=user_id,
    kind="conversa/prayer",
))

assert result.validation.valid
```

## Prompt profiles

| Profile | Model class | Capabilities |
|---------|------------|--------------|
| `minimal` | 3B-7B local | entities, entity_state, goals, themes, summary |
| `standard` | GPT-4o-mini, Haiku | + entity_context, goal_timing, facts, temporal_refs, sentiment, evidence_anchoring |
| `full` | GPT-4o, Sonnet, Opus | + entity_ids, goal_entity_refs, relations, relation_origin, assertion_signals, temporal_classes |

## Links

- [Repository](https://github.com/synapt-dev/extract)
- [JSON Schema](https://synapt.dev/schemas/extract/v1.json)
- [TypeScript package](https://www.npmjs.com/package/@synapt-dev/extract)

## License

MIT
18 changes: 17 additions & 1 deletion packages/python/pyproject.toml
@@ -5,17 +5,33 @@ build-backend = "setuptools.build_meta"
[project]
name = "synapt-extract"
version = "0.1.0"
-description = "SynaptExtraction IL v1 schema, validation, and finalization"
+description = "SynaptExtraction IL v1 -- schema, validation, and finalization"
readme = "README.md"
license = "MIT"
requires-python = ">=3.10"
authors = [
    {name = "Layne Penney"},
]
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Software Development :: Libraries",
    "Typing :: Typed",
]

[project.urls]
Homepage = "https://synapt.dev"
Repository = "https://github.com/synapt-dev/extract"
Documentation = "https://synapt.dev/docs/extract"
Schema = "https://synapt.dev/schemas/extract/v1.json"

[tool.setuptools.packages.find]
where = ["src"]

[tool.setuptools.package-data]
synapt_extract = ["prompts/**/*.txt", "prompts/**/*.json"]
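
Note on `package-data`: once the prompt files ship inside the wheel, they can be located at runtime through `importlib.resources` rather than by filesystem path. How `synapt_extract` actually loads its prompts is not shown in this diff, so the following is only a sketch; `list_bundled_prompts` is a hypothetical helper that lists the `.txt` and `.json` files under a package's `prompts/` directory.

```python
from importlib.resources import files


def list_bundled_prompts(package: str = "synapt_extract") -> list[str]:
    """Return sorted relative paths of .txt/.json files under <package>/prompts.

    Hypothetical helper for illustration; returns an empty list when the
    package has no prompts directory.
    """
    root = files(package).joinpath("prompts")
    found: list[str] = []

    def walk(node, prefix: str) -> None:
        # Recurse through the resource tree, keeping only prompt assets.
        for child in node.iterdir():
            rel = f"{prefix}{child.name}"
            if child.is_dir():
                walk(child, rel + "/")
            elif child.name.endswith((".txt", ".json")):
                found.append(rel)

    if root.is_dir():
        walk(root, "prompts/")
    return sorted(found)
```

Running this against an installed build is one way to confirm the copy step and the `package-data` globs picked up the expected files.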
3 changes: 3 additions & 0 deletions packages/python/src/synapt_extract/__init__.py
@@ -13,6 +13,7 @@
)
from synapt_extract.validate import validate_extraction, ValidationResult, ValidationError
from synapt_extract.finalize import finalize_extraction, FinalizeContext, FinalizeResult
from synapt_extract.prompt import build_extraction_prompt, resolve_capabilities

__all__ = [
    "SynaptExtraction",
@@ -30,4 +31,6 @@
    "finalize_extraction",
    "FinalizeContext",
    "FinalizeResult",
    "build_extraction_prompt",
    "resolve_capabilities",
]