Corbin is the Python-side RAG indexing layer for Clariform.
At this stage, Corbin consumes already-normalized chunks from Swivel. Swivel handles source retrieval and chunk generation. Corbin is responsible for the stage after that: tokenization, embeddings, and vector database storage. So far only the Swivel ingestion path is implemented; the rest is planned.
Repository:
https://github.com/clariform/corbin
Current release:
v0.1.1
Corbin currently provides a source adapter for Swivel:
Swivel chunks → Corbin ingestion path
The old direct Notion code has been removed or deprecated. Notion retrieval now belongs to Swivel.
corbin
├── examples
│   └── retrieve_swivel_chunks.py
├── src
│   └── corbin
│       ├── sources
│       │   ├── __init__.py
│       │   └── swivel.py
│       ├── __init__.py
│       └── py.typed
├── tests
│   └── test_swivel_source.py
├── pyproject.toml
├── uv.lock
└── README.md
Swivel owns:
source retrieval → normalization → chunk generation
Corbin owns:
chunks → tokenization → embeddings → vector database storage
This separation keeps Corbin source-agnostic. Corbin does not need to know how Notion works. It only needs clean chunk records.
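One way to picture this boundary is a small structural interface. The sketch below is illustrative only: `ChunkSource`, `InMemorySource`, `get_chunks`, and `collect_texts` are hypothetical names and not part of Corbin's published API. It assumes nothing about a chunk record beyond the "text" key.

```python
from typing import Any, Protocol


class ChunkSource(Protocol):
    """Hypothetical interface: anything that can hand over a list of chunk dicts."""

    def get_chunks(self) -> list[dict[str, Any]]: ...


class InMemorySource:
    """Stand-in source for illustration; a Swivel-backed source could fit the same shape."""

    def __init__(self, chunks: list[dict[str, Any]]) -> None:
        self._chunks = chunks

    def get_chunks(self) -> list[dict[str, Any]]:
        # Return a copy so callers cannot mutate the stored records.
        return list(self._chunks)


def collect_texts(source: ChunkSource) -> list[str]:
    """Downstream code touches only chunk records, never the origin system."""
    return [chunk["text"] for chunk in source.get_chunks()]


texts = collect_texts(InMemorySource([{"text": "hello"}, {"text": "world"}]))
print(texts)
```

Because `collect_texts` depends only on the protocol, swapping the stand-in for a Notion-backed or file-backed source would not change the consuming code.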
Install the Swivel CLI:
cargo install swivelcli
Verify:
swivel --help
Install Python dependencies:
uv sync
Set Notion credentials for live Swivel-backed tests:
export NOTION_API_KEY="ntn_..."
export NOTION_VERSION="2026-03-11"
Use the source from Python:
from corbin import CorbinSwivelSource

source = CorbinSwivelSource(
    binary="swivel",
    use_cargo=False,
)
chunks = source.get_database_chunks("your-database-id")
print(len(chunks))
print(chunks[0]["text"])
Available retrieval methods:
source.get_database_chunks(database_id)
source.get_data_source_chunks(data_source_id)
source.get_page_chunks(page_id)
All three methods return a list of Swivel chunk dictionaries.
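Since the return value is a plain list of dictionaries, it can be inspected with ordinary Python. The helper below is a minimal sketch, not part of Corbin: `preview` is a hypothetical name, and the only chunk key it relies on is "text", the one used in the usage example. It also covers the empty-result case, where indexing `chunks[0]` directly would raise `IndexError`.

```python
from typing import Any


def preview(chunks: list[dict[str, Any]], width: int = 40) -> str:
    """Summarize a chunk list without assuming it is non-empty."""
    if not chunks:
        return "no chunks returned"
    # Use .get so a record without a "text" key does not raise KeyError.
    first_text = str(chunks[0].get("text", ""))
    return f"{len(chunks)} chunks; first: {first_text[:width]!r}"


# Fake records standing in for the result of a get_*_chunks call.
sample = [{"text": "Corbin indexes chunks."}, {"text": "Swivel produces them."}]
print(preview(sample))
print(preview([]))
```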
Run the example:
uv run python examples/retrieve_swivel_chunks.py
Expected output includes:
retrieved chunks: <count>
Run the tests:
uv run pytest
The live Swivel source test requires NOTION_API_KEY. If the environment variable is missing, the test is skipped.
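A skip guard of this kind can be sketched as follows. This is not the code in tests/test_swivel_source.py. It uses the standard library's `unittest.skipUnless` (pytest collects `unittest.TestCase` classes), and `has_notion_credentials`, `LiveSwivelSourceTest`, and the `CORBIN_TEST_DATABASE_ID` variable are hypothetical names introduced for illustration.

```python
import os
import unittest
from collections.abc import Mapping
from typing import Optional


def has_notion_credentials(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when NOTION_API_KEY is present and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get("NOTION_API_KEY"))


@unittest.skipUnless(has_notion_credentials(), "NOTION_API_KEY is not set")
class LiveSwivelSourceTest(unittest.TestCase):
    def test_database_chunks_is_a_list(self) -> None:
        # Imported lazily so this module still loads where Corbin is not installed.
        from corbin import CorbinSwivelSource

        source = CorbinSwivelSource(binary="swivel", use_cargo=False)
        # CORBIN_TEST_DATABASE_ID is a hypothetical variable holding a real database ID.
        chunks = source.get_database_chunks(os.environ["CORBIN_TEST_DATABASE_ID"])
        self.assertIsInstance(chunks, list)
```

Passing the environment in as a parameter keeps the credential check testable without touching the real process environment.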
Recommended local environment:
source ~/.venvs/corbin/bin/activate
uv sync
Run checks:
uv run pytest
uv run ruff check .
Optional type check:
uv run mypy src
Build:
uv build
Publish:
uv publish
Corbin is early and currently focused on consuming Swivel chunks. The next major step is to add tokenization, embedding generation, and vector database storage.