EmbedKit

A unified interface for text and image embeddings, supporting multiple providers.

Installation

pip install embedkit

Quick Start

from embedkit import EmbedKit
from embedkit.classes import Model

# Initialize a provider
kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,
    api_key="your-api-key",
)

# Get document embeddings
result = kit.embed_document("Hello world")
print(result.objects[0].embedding.shape)  # 1D array

# Get query embeddings (for providers that support it)
result = kit.embed_query("Hello world")
print(result.objects[0].embedding.shape)  # 1D array

# Get image embeddings
result = kit.embed_image("path/to/image.png")
print(result.objects[0].embedding.shape)  # 1D array
print(result.objects[0].source_b64)  # Base64 encoded image
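The Base64 string in source_b64 can be turned back into raw bytes with the standard library. The sketch below assumes the field holds plain Base64 of the original file bytes with no data-URI prefix, which this README does not state. The helper name decode_source is hypothetical, and a stand-in string replaces a real result so the example runs without a provider.

```python
import base64


def decode_source(source_b64: str) -> bytes:
    """Decode a Base64 string back into the original bytes."""
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(source_b64, validate=True)


# Stand-in for result.objects[0].source_b64: the 8-byte PNG signature, encoded.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
example_b64 = base64.b64encode(PNG_SIGNATURE).decode("ascii")

raw = decode_source(example_b64)
is_png = raw.startswith(PNG_SIGNATURE)  # quick sanity check on the decoded bytes
```

With a real image result, the decoded bytes could be written to disk or passed to an image library.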

Supported Providers

Cohere

kit = EmbedKit.cohere(
    model=Model.Cohere.EMBED_V4_0,  # or EMBED_ENGLISH_V3_0, EMBED_MULTILINGUAL_V3_0, etc.
    api_key="your-api-key",
)

# Different embeddings for queries vs documents
query_result = kit.embed_query("What is the capital of France?")
doc_result = kit.embed_document("Paris is the capital of France.")
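The examples here pass the API key as a string literal. A common alternative is to read it from an environment variable so that it stays out of source control. This is a minimal sketch: the helper load_api_key and the variable name COHERE_API_KEY are illustrative assumptions, not part of EmbedKit.

```python
import os


def load_api_key(var_name: str) -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message rather than sending an empty key.
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Possible usage (COHERE_API_KEY is an assumed variable name):
# kit = EmbedKit.cohere(
#     model=Model.Cohere.EMBED_V4_0,
#     api_key=load_api_key("COHERE_API_KEY"),
# )
```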

Snowflake

kit = EmbedKit.snowflake(
    model=Model.Snowflake.ARCTIC_EMBED_L_V2_0,  # or ARCTIC_EMBED_M_V1_5
)

# Different embeddings for queries vs documents
query_result = kit.embed_query("What is the capital of France?")
doc_result = kit.embed_document("Paris is the capital of France.")
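Separate query and document embeddings are typically compared with cosine similarity to rank documents against a query. The sketch below is not part of EmbedKit: it uses toy vectors in place of real embeddings, and the helper names are hypothetical. The functions only iterate over their inputs, so 1D NumPy arrays such as result.objects[0].embedding should work as well as plain lists.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(
    query: Sequence[float], docs: Sequence[Sequence[float]]
) -> list[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query, doc) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs)  # [1, 2, 0]
```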

Qwen

# Lightweight model (0.6B parameters)
kit = EmbedKit.qwen(
    model=Model.Qwen.QWEN3_EMBEDDING_0_6B,
)

# Larger models (require more memory)
# kit = EmbedKit.qwen(
#     model=Model.Qwen.QWEN3_EMBEDDING_4B,
# )
# kit = EmbedKit.qwen(
#     model=Model.Qwen.QWEN3_EMBEDDING_8B,
# )

# Different embeddings for queries vs documents
query_result = kit.embed_query("What is the capital of France?")
doc_result = kit.embed_document("Paris is the capital of France.")

ColPali

import numpy as np

kit = EmbedKit.colpali(
    model=Model.ColPali.COLPALI_V1_3,  # or COLSMOL_256M, COLSMOL_500M
)

# Same embeddings for queries and documents
query_result = kit.embed_query("What is the capital of France?")
doc_result = kit.embed_document("Paris is the capital of France.")
assert np.array_equal(query_result.objects[0].embedding, doc_result.objects[0].embedding)

Jina

import numpy as np

kit = EmbedKit.jina(
    model=Model.Jina.CLIP_V2,
    api_key="your-api-key",
)

# Same embeddings for queries and documents
query_result = kit.embed_query("What is the capital of France?")
doc_result = kit.embed_document("Paris is the capital of France.")
assert np.array_equal(query_result.objects[0].embedding, doc_result.objects[0].embedding)

Response Format

from typing import List, Optional

import numpy as np

class EmbeddingResponse:
    model_name: str
    model_provider: str
    input_type: str  # "text", "search_query", "search_document", "query", "image"
    objects: List["EmbeddingObject"]

class EmbeddingObject:
    embedding: np.ndarray  # 1D array for everything except ColPali
    source_b64: Optional[str]  # Base64 encoded source for images and PDFs
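The comment on embedding says that ColPali is the exception to the 1D shape. ColPali-style models generally produce one vector per token or image patch, so the sketch below assumes a 2D layout (one row per vector), which this README does not spell out. Under that assumption, multi-vector embeddings are usually compared with late-interaction (MaxSim) scoring rather than a single cosine similarity. The function names are hypothetical and not part of EmbedKit.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(
    query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]
) -> float:
    """Sum, over query vectors, of the best dot product with any document vector."""
    if not doc_vectors:
        raise ValueError("doc_vectors must not be empty")
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Toy multi-vector embeddings: 2 query vectors and 3 document vectors, dimension 2.
query_vectors = [[1.0, 0.0], [0.0, 1.0]]
doc_vectors = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.25]]
score = maxsim_score(query_vectors, doc_vectors)  # 1.0 + 0.5 = 1.5
```

A higher score means that more of the query vectors found a close match somewhere in the document.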

Development

Running Tests

# Run all tests
pytest

# Run tests for specific providers
pytest -m cohere    # Run only Cohere tests
pytest -m colpali   # Run only ColPali tests
pytest -m jina      # Run only Jina tests
pytest -m snowflake # Run only Snowflake tests
pytest -m qwen      # Run only Qwen tests

# Additional options
pytest -v           # Verbose output
pytest -s           # Show print statements
pytest -x           # Stop on first failure

Requirements

Python 3.10+

License

MIT

GitHub

https://github.com/databyjp/embedkit
