Description: This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted, drop-in replacement for OpenAI. Because LocalAI can reuse OpenAI clients, the implementation mostly follows the OpenAI embeddings class. The one difference is in embedding documents: it sends plain strings instead of tokens, because token input is best-effort in LocalAI and depends on the model being used. Token IDs can also mismatch with the model, so sending strings is the safer choice here.

Partly related to: #5256
Dependencies: No new dependencies
Twitter: @mudler_it

---------

Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
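
To make the strings-versus-tokens point concrete, here is a minimal sketch of the request body an OpenAI-compatible embeddings endpoint would receive when raw strings are sent. The helper name `build_embeddings_payload` is hypothetical and is not part of this PR; the `model` and `input` field names are assumed from the OpenAI embeddings request format that LocalAI mirrors.

```python
import json
from typing import List


def build_embeddings_payload(model: str, texts: List[str]) -> str:
    """Serialize an OpenAI-style embeddings request that carries raw strings.

    Hypothetical helper for illustration only; it is not the code added
    by this PR.
    """
    # Reject token-ID lists: token IDs may not match the vocabulary of the
    # model that LocalAI is serving, so only plain strings are allowed.
    if not all(isinstance(t, str) for t in texts):
        raise TypeError("inputs must be plain strings, not token IDs")
    return json.dumps({"model": model, "input": texts})


payload = build_embeddings_payload(
    "embedding-model-name", ["This is a test document."]
)
print(payload)
```

Keeping the input as text leaves tokenization to the server, which is the only component that knows which tokenizer the configured model expects.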
Showing 3 changed files with 508 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "278b6c63", | ||
"metadata": {}, | ||
"source": [ | ||
"# LocalAI\n", | ||
"\n", | ||
"Let's load the LocalAI Embedding class. In order to use the LocalAI Embedding class, you need to have the LocalAI service hosted somewhere and configure the embedding models. See the documentation at https://localai.io/basics/getting_started/index.html and https://localai.io/features/embeddings/index.html." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"id": "0be1af71", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain.embeddings import LocalAIEmbeddings" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"id": "2c66e5da", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"embeddings = LocalAIEmbeddings(openai_api_base=\"http://localhost:8080\", model=\"embedding-model-name\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"id": "01370375", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"text = \"This is a test document.\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"id": "bfb6142c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"query_result = embeddings.embed_query(text)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"id": "0356c3b7", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"doc_result = embeddings.embed_documents([text])" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "bb61bbeb", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's load the LocalAI Embedding class with first-generation models (e.g. text-search-ada-doc-001/text-search-ada-query-001). Note: these models are not recommended; see [here](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings)." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c0b072cc", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain.embeddings import LocalAIEmbeddings" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "a56b70f5", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"embeddings = LocalAIEmbeddings(openai_api_base=\"http://localhost:8080\", model=\"embedding-model-name\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "14aefb64", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"text = \"This is a test document.\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "3c39ed33", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"query_result = embeddings.embed_query(text)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "e3221db6", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"doc_result = embeddings.embed_documents([text])" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "aaad49f8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import os\n", | ||
"\n", | ||
"# If you are behind an explicit proxy, set the OPENAI_PROXY environment variable to route requests through it\n", | ||
"os.environ[\"OPENAI_PROXY\"] = \"http://proxy.yourcompany.com:8080\"" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3.11.1 64-bit", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.1" | ||
}, | ||
"vscode": { | ||
"interpreter": { | ||
"hash": "e971737741ff4ec9aff7dc6155a1060a59a8a6d52c757dbbe66bf8ee389494b1" | ||
} | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
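
The notebook stops once `query_result` and `doc_result` are computed. As a rough sketch of how those values might be compared afterwards, the snippet below scores a query vector against document vectors with cosine similarity. The vectors are stand-in values, not real LocalAI output, and `cosine_similarity` is a hypothetical helper rather than something this PR provides. It assumes `embed_query` returns a flat list of floats and `embed_documents` returns a list of such lists.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in values shaped like the notebook's results (assumed shapes):
# a flat vector for the query and a list of vectors for the documents.
query_result: List[float] = [0.1, 0.3, 0.5]
doc_result: List[List[float]] = [[0.1, 0.3, 0.5], [0.5, 0.3, 0.1]]

scores = [cosine_similarity(query_result, doc) for doc in doc_result]
print(scores)
```

With real embeddings, the document with the highest score would be the closest match to the query under this metric.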