This repository has been archived by the owner on Apr 3, 2024. It is now read-only.
Problem
The default embeddings (e.g. OpenAI's Ada-002) are great generalists. However, they are not tailored to your specific use case.
Proposed Solution
🎉 Customizing Embeddings!
ℹ️ See my tutorial / lessons learned if you're interested in learning more, step-by-step, with screenshots and tips.
🎯 Specifically, LangChain Hub would provide a collection of pre-trained custom embeddings.
Similar to https://huggingface.co/models except focused on semantic embeddings.
List the known tasks so developers can search the available custom embeddings for each:
Hub provides a set of Tasks each with:
Modality (e.g. text, image, etc)
Embedding engine to use & # of dimensions (text=>ada-002 with 1536 dimensions, image=>CLIP...)
Expected prompt formats for documents and/or queries (i.e. what data should look like before being sent to embedding model)
e.g. Documents should look like X. Short form queries look like Y. Topic or objective is Z.
Pre-made Datasets for training on your own
Data preparation scripts
Pre-trained Matrices
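As a rough illustration of the task listing above, here is a minimal sketch of how a Hub task entry could be represented and searched. The `EmbeddingTask` class, its field names, the `find_tasks` helper, and the task names are all hypothetical and not part of any existing LangChain API; only the text => ada-002 with 1536 dimensions pairing comes from this proposal.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingTask:
    """Hypothetical metadata record for one task listed on the Hub."""
    name: str
    modality: str                      # e.g. "text", "image"
    engine: str                        # embedding engine the matrix was trained for
    dimensions: Optional[int] = None   # embedding size, if known
    document_format: str = ""          # how documents should look before embedding
    query_format: str = ""             # how queries should look before embedding


def find_tasks(tasks, modality=None, engine=None):
    """Return the tasks matching the given modality and/or engine."""
    return [
        t for t in tasks
        if (modality is None or t.modality == modality)
        and (engine is None or t.engine == engine)
    ]


# Illustrative entries only; the task names are made up for this sketch.
TASKS = [
    EmbeddingTask(
        name="example-text-task",
        modality="text",
        engine="ada-002",
        dimensions=1536,
        document_format="Documents should look like X.",
        query_format="Short form queries look like Y.",
    ),
    EmbeddingTask(name="example-image-task", modality="image", engine="CLIP"),
]

text_tasks = find_tasks(TASKS, modality="text")
```

A developer could then filter by modality or engine to find a pre-trained matrix that matches the embedding model they already use.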
Leverage LangChain's helpers to train and use the custom embedding matrix.
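To make "using" a custom embedding matrix concrete, here is a small sketch under stated assumptions: the matrix has already been trained, embeddings are plain lists of floats, and similarity is cosine similarity computed after projecting both embeddings through the matrix. The function names are hypothetical and this is not LangChain's API; the two-dimensional vectors are toy values chosen for readability.

```python
import math


def apply_matrix(embedding, matrix):
    """Project an embedding of length d through a d x k matrix (list of rows)."""
    return [
        sum(e * row[j] for e, row in zip(embedding, matrix))
        for j in range(len(matrix[0]))
    ]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def custom_similarity(a, b, matrix):
    """Compare two embeddings after projecting both through the same matrix."""
    return cosine_similarity(apply_matrix(a, matrix), apply_matrix(b, matrix))


# Toy example: two embeddings that are orthogonal in the original space.
a = [1.0, 1.0]
b = [1.0, -1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]     # leaves embeddings unchanged
keep_first = [[1.0, 0.0], [0.0, 0.0]]   # ignores the second dimension

baseline = custom_similarity(a, b, identity)
customized = custom_similarity(a, b, keep_first)
```

In this toy case the projection discards the dimension on which the two vectors disagree, so the customized similarity is higher than the baseline. A trained matrix would play the same role for dimensions that are irrelevant to a given task.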