Corbin

Corbin is the Python-side RAG indexing layer for Clariform.

At this stage, Corbin consumes already-normalized chunks from Swivel. Swivel handles source retrieval and chunk generation. Corbin is responsible for the next stage (tokenization, embeddings, and vector database storage), although as of v0.1.1 only the Swivel ingestion path is implemented.

Repository:

https://github.com/clariform/corbin

Current release:

v0.1.1

Current scope

Corbin currently provides a source adapter for Swivel:

Swivel chunks → Corbin ingestion path

The old direct Notion code has been removed or deprecated. Notion retrieval now belongs to Swivel.

Project layout

corbin
├── examples
│   └── retrieve_swivel_chunks.py
├── src
│   └── corbin
│       ├── sources
│       │   ├── __init__.py
│       │   └── swivel.py
│       ├── __init__.py
│       └── py.typed
├── tests
│   └── test_swivel_source.py
├── pyproject.toml
├── uv.lock
└── README.md

Relationship to Swivel

Swivel owns:

source retrieval → normalization → chunk generation

Corbin owns:

chunks → tokenization → embeddings → vector database storage

This separation keeps Corbin source-agnostic. Corbin does not need to know how Notion works. It only needs clean chunk records.
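To illustrate what source-agnostic means in practice, here is a minimal sketch in which downstream code depends only on the shape of a chunk record. The `ChunkSource` protocol, the `InMemorySource` class, and `count_words` are hypothetical names invented for this example and are not part of Corbin. The only chunk field assumed is `text`, which the usage section reads.

```python
from typing import Any, Protocol


class ChunkSource(Protocol):
    """Hypothetical interface: anything that can hand back chunk dictionaries."""

    def get_page_chunks(self, page_id: str) -> list[dict[str, Any]]: ...


class InMemorySource:
    """Stand-in source for illustration; no Swivel or Notion involved."""

    def __init__(self, pages: dict[str, list[str]]) -> None:
        self._pages = pages

    def get_page_chunks(self, page_id: str) -> list[dict[str, Any]]:
        # Unknown page ids yield an empty list rather than an error.
        return [{"text": text} for text in self._pages.get(page_id, [])]


def count_words(source: ChunkSource, page_id: str) -> int:
    """Downstream code relies on the chunk shape, not on where chunks came from."""
    return sum(len(chunk["text"].split()) for chunk in source.get_page_chunks(page_id))


source = InMemorySource({"page-1": ["hello world", "clean chunk records"]})
print(count_words(source, "page-1"))
```

Any object with a matching `get_page_chunks` method could be passed to `count_words`, so swapping the origin of the chunks would not require changes downstream.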

Requirements

Install the Swivel CLI:

cargo install swivelcli

Verify:

swivel --help
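The same check can be done from Python before constructing a source. This is a small sketch using only the standard library; `find_binary` is a hypothetical helper name, not something Corbin provides.

```python
import shutil
from typing import Optional


def find_binary(name: str = "swivel") -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_binary()
    if path is None:
        print("swivel not found on PATH; try `cargo install swivelcli`")
    else:
        print(f"swivel found at {path}")
```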

Install Python dependencies:

uv sync

Set Notion credentials for live Swivel-backed tests:

export NOTION_API_KEY="ntn_..."
export NOTION_VERSION="2026-03-11"

Usage

from corbin import CorbinSwivelSource

source = CorbinSwivelSource(
    binary="swivel",
    use_cargo=False,
)

chunks = source.get_database_chunks("your-database-id")

# The result may be empty, so guard before indexing into it.
print(len(chunks))
if chunks:
    print(chunks[0]["text"])

Available source methods

source.get_database_chunks(database_id)
source.get_data_source_chunks(data_source_id)
source.get_page_chunks(page_id)

All three methods return a list of Swivel chunk dictionaries.
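Because every method returns the same list-of-dictionaries shape, one helper can post-process the result of any of them. The sketch below assumes only that each chunk may carry a `text` key, as in the usage example. `non_empty_texts` and `sample_chunks` are hypothetical, and the sample data is hand-written rather than real Swivel output.

```python
from typing import Any


def non_empty_texts(chunks: list[dict[str, Any]]) -> list[str]:
    """Collect stripped text from chunks, skipping ones with no usable text."""
    texts = []
    for chunk in chunks:
        text = chunk.get("text")
        if isinstance(text, str) and text.strip():
            texts.append(text.strip())
    return texts


# Hand-written stand-ins for what a source method might return.
sample_chunks = [
    {"text": "First paragraph."},
    {"text": "   "},
    {"text": "Second paragraph."},
]

print(non_empty_texts(sample_chunks))
```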

Example

Run:

uv run python examples/retrieve_swivel_chunks.py

Expected output includes:

retrieved chunks: <count>

Tests

Run:

uv run pytest

The live Swivel source test requires NOTION_API_KEY. If the environment variable is missing, the test is skipped.
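One way to express that skip condition is a small predicate over the environment. This is a sketch of the idea only; how the repository's test actually implements the skip is not shown here, and `live_tests_enabled` is a hypothetical name.

```python
import os
from collections.abc import Mapping


def live_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when a non-empty NOTION_API_KEY is present in `env`."""
    return bool(env.get("NOTION_API_KEY", "").strip())


# In a pytest module, this could drive a skip marker, for example:
#   pytestmark = pytest.mark.skipif(
#       not live_tests_enabled(), reason="NOTION_API_KEY is not set"
#   )
print("live tests enabled:", live_tests_enabled())
```

Passing the environment in as a mapping keeps the predicate easy to check without touching real credentials.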

Development

Recommended local environment:

source ~/.venvs/corbin/bin/activate
uv sync

Run checks:

uv run pytest
uv run ruff check .

Optional type check:

uv run mypy src

Packaging

Build:

uv build

Publish:

uv publish

Status

Corbin is early and currently focused on consuming Swivel chunks. The next major step is to add tokenization, embedding generation, and vector database storage.
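To make the planned stages concrete, here is a toy sketch of the chunks → tokenization → embeddings → storage flow using only the standard library. Everything in it is an assumption made for illustration: whitespace tokenization, a hashed bag-of-words vector, and a plain dictionary stand in for whatever tokenizer, embedding model, and vector database Corbin eventually adopts.

```python
import zlib


def tokenize(text: str) -> list[str]:
    """Toy tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def embed(tokens: list[str], dim: int = 8) -> list[float]:
    """Toy embedding: count tokens into `dim` buckets chosen by a CRC32 hash."""
    vector = [0.0] * dim
    for token in tokens:
        vector[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vector


def index_chunks(chunks: list[dict], store: dict[int, list[float]]) -> dict[int, list[float]]:
    """Toy storage: keep one vector per chunk in an in-memory dictionary."""
    for position, chunk in enumerate(chunks):
        store[position] = embed(tokenize(chunk["text"]))
    return store


store = index_chunks([{"text": "Hello hello world"}], {})
print(len(store), len(store[0]))
```

`zlib.crc32` is used instead of the built-in `hash` so that bucket assignment stays stable across interpreter runs.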
