Allow ElasticsearchEmbeddings to create a connection with ES Client object (#5321)

This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class, allowing users to use Elasticsearch clusters outside of Elastic Cloud.

Users can create an Elasticsearch Client object and pass it to the new function. The returned object is identical to the one returned by calling `from_credentials`.

```
# Create Elasticsearch connection
es_connection = Elasticsearch(
    hosts=['https://es_cluster_url:port'],
    basic_auth=('user', 'password')
)

# Instantiate ElasticsearchEmbeddings using es_connection
embeddings = ElasticsearchEmbeddings.from_es_connection(
    model_id,
    es_connection,
)
```

I also added examples to the Elasticsearch Jupyter notebook.

Fixes #5239

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
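To illustrate the idea behind `from_es_connection` without a live cluster, here is a minimal, hypothetical sketch. `StubMlClient` and `SketchEmbeddings` are invented for this example and are not part of LangChain or the Elasticsearch client; the `inference_results` / `predicted_value` response shape is also an assumption made for the sketch. It only shows the pattern of handing a pre-built client to a constructor, so that connection details (hosts, auth) stay with the caller.

```python
from typing import Any, Dict, List


class StubMlClient:
    """Hypothetical stand-in for an ML client; makes no network calls."""

    def infer_trained_model(
        self, model_id: str, docs: List[Dict[str, str]]
    ) -> Dict[str, Any]:
        results = []
        for doc in docs:
            # Each doc carries a single text field; read its value.
            text = next(iter(doc.values()))
            # Fake 3-dimensional "embedding" derived from the text length.
            results.append({"predicted_value": [float(len(text)), 0.0, 1.0]})
        # Assumed response shape, chosen for this sketch only.
        return {"inference_results": results}


class SketchEmbeddings:
    """Illustrative wrapper; not the LangChain implementation."""

    def __init__(
        self, client: Any, model_id: str, input_field: str = "text_field"
    ) -> None:
        self.client = client
        self.model_id = model_id
        self.input_field = input_field

    @classmethod
    def from_connection(cls, model_id: str, connection: Any) -> "SketchEmbeddings":
        # The caller has already configured the connection; just reuse it.
        return cls(connection, model_id)

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        docs = [{self.input_field: text} for text in texts]
        response = self.client.infer_trained_model(
            model_id=self.model_id, docs=docs
        )
        return [r["predicted_value"] for r in response["inference_results"]]

    def embed_query(self, text: str) -> List[float]:
        # A query is embedded as a one-element batch.
        return self.embed_documents([text])[0]


embeddings = SketchEmbeddings.from_connection("your_model_id", StubMlClient())
document_embeddings = embeddings.embed_documents(["abc", "hello"])
query_embedding = embeddings.embed_query("hi")
```

Because the client is passed in rather than built from a cloud ID, the same wrapper can in principle be exercised against a stub like this one in tests and against a real cluster in production.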
1 parent 4b57854 commit 153df9d
Showing 2 changed files with 314 additions and 123 deletions.
docs/modules/models/text_embedding/examples/elasticsearch.ipynb (374 changes: 251 additions & 123 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,124 +1,252 @@ | ||
{ | ||
"nbformat": 4, | ||
"nbformat_minor": 0, | ||
"metadata": { | ||
"colab": { | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"name": "python3", | ||
"display_name": "Python 3" | ||
}, | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"cells": [ | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"!pip -q install elasticsearch langchain" | ||
], | ||
"metadata": { | ||
"id": "6dJxqebov4eU" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"import elasticsearch\n", | ||
"from langchain.embeddings.elasticsearch import ElasticsearchEmbeddings" | ||
], | ||
"metadata": { | ||
"id": "RV7C3DUmv4aq" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Define the model ID\n", | ||
"model_id = 'your_model_id'" | ||
], | ||
"metadata": { | ||
"id": "MrT3jplJvp09" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Instantiate ElasticsearchEmbeddings using credentials\n", | ||
"embeddings = ElasticsearchEmbeddings.from_credentials(\n", | ||
" model_id,\n", | ||
" es_cloud_id='your_cloud_id', \n", | ||
" es_user='your_user', \n", | ||
" es_password='your_password'\n", | ||
")\n" | ||
], | ||
"metadata": { | ||
"id": "svtdnC-dvpxR" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Create embeddings for multiple documents\n", | ||
"documents = [\n", | ||
" 'This is an example document.', \n", | ||
" 'Another example document to generate embeddings for.'\n", | ||
"]\n", | ||
"document_embeddings = embeddings.embed_documents(documents)\n" | ||
], | ||
"metadata": { | ||
"id": "7DXZAK7Kvpth" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Print document embeddings\n", | ||
"for i, embedding in enumerate(document_embeddings):\n", | ||
" print(f\"Embedding for document {i+1}: {embedding}\")\n" | ||
], | ||
"metadata": { | ||
"id": "K8ra75W_vpqy" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Create an embedding for a single query\n", | ||
"query = 'This is a single query.'\n", | ||
"query_embedding = embeddings.embed_query(query)\n" | ||
], | ||
"metadata": { | ||
"id": "V4Q5kQo9vpna" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"# Print query embedding\n", | ||
"print(f\"Embedding for query: {query_embedding}\")\n" | ||
], | ||
"metadata": { | ||
"id": "O0oQDzGKvpkz" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
} | ||
] | ||
} | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "1eZl1oaVUNeC" | ||
}, | ||
"source": [ | ||
"# Elasticsearch\n", | ||
"Walkthrough of how to generate embeddings using a hosted embedding model in Elasticsearch\n", | ||
"\n", | ||
"The easiest way to instantiate the `ElasticsearchEmbeddings` class is either\n", | ||
"- using the `from_credentials` constructor if you are using Elastic Cloud\n", | ||
"- or using the `from_es_connection` constructor with any Elasticsearch cluster" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "6dJxqebov4eU" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!pip -q install elasticsearch langchain" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "RV7C3DUmv4aq" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from elasticsearch import Elasticsearch\n", | ||
"from langchain.embeddings.elasticsearch import ElasticsearchEmbeddings" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "MrT3jplJvp09" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Define the model ID\n", | ||
"model_id = 'your_model_id'" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "j5F-nwLVS_Zu" | ||
}, | ||
"source": [ | ||
"## Testing with `from_credentials`\n", | ||
"This requires an Elastic Cloud `cloud_id`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "svtdnC-dvpxR" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Instantiate ElasticsearchEmbeddings using credentials\n", | ||
"embeddings = ElasticsearchEmbeddings.from_credentials(\n", | ||
" model_id,\n", | ||
" es_cloud_id='your_cloud_id', \n", | ||
" es_user='your_user', \n", | ||
" es_password='your_password'\n", | ||
")\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "7DXZAK7Kvpth" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create embeddings for multiple documents\n", | ||
"documents = [\n", | ||
" 'This is an example document.', \n", | ||
" 'Another example document to generate embeddings for.'\n", | ||
"]\n", | ||
"document_embeddings = embeddings.embed_documents(documents)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "K8ra75W_vpqy" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Print document embeddings\n", | ||
"for i, embedding in enumerate(document_embeddings):\n", | ||
" print(f\"Embedding for document {i+1}: {embedding}\")\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "V4Q5kQo9vpna" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create an embedding for a single query\n", | ||
"query = 'This is a single query.'\n", | ||
"query_embedding = embeddings.embed_query(query)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "O0oQDzGKvpkz" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Print query embedding\n", | ||
"print(f\"Embedding for query: {query_embedding}\")\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "rHN03yV6TJ5q" | ||
}, | ||
"source": [ | ||
"## Testing with an existing Elasticsearch client connection\n", | ||
"This can be used with any Elasticsearch deployment" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "GMQcJDwBTJFm" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create Elasticsearch connection\n", | ||
"es_connection = Elasticsearch(\n", | ||
" hosts=['https://es_cluster_url:port'], \n", | ||
" basic_auth=('user', 'password')\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "WTYIU4u3TJO1" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Instantiate ElasticsearchEmbeddings using es_connection\n", | ||
"embeddings = ElasticsearchEmbeddings.from_es_connection(\n", | ||
" model_id,\n", | ||
" es_connection,\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "4gdAUHwoTJO3" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create embeddings for multiple documents\n", | ||
"documents = [\n", | ||
" 'This is an example document.', \n", | ||
" 'Another example document to generate embeddings for.'\n", | ||
"]\n", | ||
"document_embeddings = embeddings.embed_documents(documents)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "RC_-tov6TJO3" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Print document embeddings\n", | ||
"for i, embedding in enumerate(document_embeddings):\n", | ||
" print(f\"Embedding for document {i+1}: {embedding}\")\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "6GEnHBqETJO3" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create an embedding for a single query\n", | ||
"query = 'This is a single query.'\n", | ||
"query_embedding = embeddings.embed_query(query)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "-kyUQAXDTJO4" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Print query embedding\n", | ||
"print(f\"Embedding for query: {query_embedding}\")\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"colab": { | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 1 | ||
} |