Fine-tune any embedding model in under a minute. Even closed-source models from OpenAI, Cohere, Voyage, etc.
Weightgain works by training an adapter that sits on top of the model and transforms the embeddings after they're generated. This produces task-specific embeddings optimized for your RAG/retrieval use case.
With weightgain, you can train an adapter in just a couple of lines of code, even if you don't have a dataset.
> pip install weightgain
from weightgain import Dataset, Adapter
# Generate a dataset (or supply your own)
dataset = Dataset.from_synthetic_chunks(
prompt="Chunks of code from an arbitrary Python codebase.",
llm="openai/gpt-4o-mini",
)
# Train the adapter
adapter = Adapter("openai/text-embedding-3-large")
adapter.fit(dataset)
# Apply the adapter
new_embeddings = adapter.transform(old_embeddings)
Weightgain wraps LiteLLM, so you can fine-tune any embedding model supported by LiteLLM, including models from OpenAI, Cohere, Voyage, etc. Here's the full list of supported models.
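For example, to adapt a Cohere model instead of an OpenAI one, you'd just point the Adapter at a different LiteLLM model identifier (the identifier below is one example; use whichever model you have access to):
from weightgain import Adapter
# Any LiteLLM embedding model identifier works here
adapter = Adapter("cohere/embed-english-v3.0")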
To get started, you need a dataset of [query, chunk] pairs. A chunk is a retrieval result, e.g. a code snippet or an excerpt from a document, and the query is a string that should match that chunk in a vector search. You can either generate a synthetic dataset or supply your own.
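For example, a single pair for a code-retrieval use case might look like this (purely illustrative strings):
qa_pair = (
    "How do I read a JSON file in Python?",            # query
    "with open(path) as f:\n    data = json.load(f)",  # chunk
)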
If you already have chunks:
from weightgain import Dataset
chunks = [...] # list of strings
dataset = Dataset.from_chunks(
chunks,
llm="openai/gpt-4o-mini",
n_queries_per_chunk=1
)
This will use OpenAI's gpt-4o-mini (or whatever LiteLLM model you want) to generate 1 query per chunk.
If you don't have chunks:
dataset = Dataset.from_synthetic_chunks(
prompt="Chunks of code from an arbitrary Python codebase.",
llm="openai/gpt-4o-mini",
n_chunks=25,
n_queries_per_chunk=1
)
This will generate chunks using the prompt, and then generate 1 query per chunk.
If you have queries and chunks:
qa_pairs = [...] # list of (str, str) tuples
dataset = Dataset.from_pairs(qa_pairs)
Once you have a dataset, you can train the adapter:
from weightgain import Adapter
adapter = Adapter("openai/text-embedding-3-large")
adapter.fit(
    dataset,
    batch_size=25,
    max_epochs=50,
    learning_rate=100.0,
    dropout=0.0
)
After training, you can generate a report with various plots (training loss, cosine similarity distributions before/after training, etc.):
adapter.show_report()
Once trained, apply the adapter to your embeddings:
old_embeddings = [...] # list of vectors
new_embeddings = adapter.transform(old_embeddings)
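The old embeddings come from the same embedding model the adapter was trained for. As a rough sketch, if you generate them with LiteLLM directly (the model name and inputs below are just placeholders), applying the adapter looks like this:
import litellm
# Embed some example inputs with the base model (placeholder strings)
response = litellm.embedding(
    model="text-embedding-3-large",
    input=["def add(a, b): return a + b", "def sub(a, b): return a - b"],
)
old_embeddings = [item["embedding"] for item in response.data]
# Transform the raw embeddings with the trained adapter
new_embeddings = adapter.transform(old_embeddings)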
Behind the scenes, an adapter is just a matrix of weights that your embeddings are multiplied by. You can access this matrix like so:
adapter.matrix # returns numpy.ndarray
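In other words, applying the adapter manually is just a matrix multiplication. Here's a rough NumPy sketch, assuming row-vector embeddings that are right-multiplied by the matrix (adapter.transform handles this for you):
import numpy as np
embeddings = np.array(old_embeddings)         # shape: (n_embeddings, embedding_dim)
new_embeddings = embeddings @ adapter.matrix  # transform each row vector with the weight matrix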
Roadmap:
- Add an option to train an MLP instead of a linear layer
- Add a method for easy hyperparameter search