# Elasticsearch
## Running and connecting to Elasticsearch

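There are two main ways to set up an Elasticsearch instance:

- Elastic Cloud: Elastic Cloud is a managed Elasticsearch service. Sign up for a free trial.
- Local installation: get started by running Elasticsearch locally. The easiest way is to use the official Elasticsearch Docker image. For more information, see the [Elasticsearch Docker](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html) documentation.

To connect to an Elasticsearch instance that does not require login credentials (for example, a Docker instance started with security disabled), pass the Elasticsearch URL, the index name, and the embedding object to the constructor.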


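## Running Elasticsearch via Docker
Example: run a single-node Elasticsearch instance with security disabled. This is not recommended for production use.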
```shell
docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" -e "xpack.security.http.ssl.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.1
```
Once the Elasticsearch instance is running, you can connect to it by passing the Elasticsearch URL, the index name, and the embedding object to the constructor.

Example:


In [2]:
from dotenv import load_dotenv, find_dotenv
from langchain.globals import set_debug

load_dotenv(find_dotenv())
set_debug(False)

In [3]:
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()
elastic_vector_search = ElasticsearchStore(
    es_url="http://36.150.110.168:9200/",
    index_name="test_index",
    embedding=embedding
)

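For production, we recommend running with security enabled. To connect with login credentials, use either the `es_api_key` parameter or the `es_user` and `es_password` parameters.

Example: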

In [10]:
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings

embedding = OpenAIEmbeddings()
elastic_vector_search = ElasticsearchStore(
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    index_name="test_index",
    embedding=embedding,
    es_user="enterprise_search",
    es_password="123456"
)

In [5]:
embedding = OpenAIEmbeddings()
elastic_vector_search = ElasticsearchStore(
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    index_name="test_index",
    embedding=embedding,
    es_api_key="QWtvejVaQUJKVFV6WjA2Z2hRT1g6STFNaGpHa0RSX2lRRXlCSWwzVDBiUQ=="
)

API reference: [ElasticsearchStore](https://api.python.langchain.com/en/latest/vectorstores/langchain_elasticsearch.vectorstores.ElasticsearchStore.html)
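
### How to obtain the password for the default "elastic" user

To obtain your Elastic Cloud password for the default "elastic" user:

1. Log in to the Elastic Cloud console at https://cloud.elastic.co
2. Go to "Security" > "Users"
3. Locate the "elastic" user and click "Edit"
4. Click "Reset password"
5. Follow the prompts to reset the password

### How to obtain an API key

To obtain an API key:

1. Log in to the Elastic Cloud console at https://cloud.elastic.co
2. Open Kibana and go to "Stack Management" > "API Keys"
3. Click "Create API key"
4. Enter a name for the API key and click "Create"
5. Copy the API key and paste it into the `es_api_key` parameter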
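
Rather than pasting a key or password into source code, the credentials can be read from the environment. The following is a minimal sketch, not part of the library: the helper `es_auth_kwargs` and the variable names `ES_API_KEY`, `ES_USER`, and `ES_PASSWORD` are assumptions made for illustration. It only builds the keyword arguments that the constructor examples above already use.

```python
import os


def es_auth_kwargs(env=os.environ):
    """Build authentication keyword arguments for ElasticsearchStore.

    ES_API_KEY, ES_USER and ES_PASSWORD are hypothetical variable names.
    """
    api_key = env.get("ES_API_KEY")
    if api_key:
        # Prefer the API key when one is configured.
        return {"es_api_key": api_key}
    user, password = env.get("ES_USER"), env.get("ES_PASSWORD")
    if user and password:
        return {"es_user": user, "es_password": password}
    # No credentials found: only suitable for an instance with security disabled.
    return {}


print(es_auth_kwargs({"ES_API_KEY": "example-key"}))
print(es_auth_kwargs({"ES_USER": "elastic", "ES_PASSWORD": "example-password"}))
```

The result could then be unpacked into the constructor, for example `ElasticsearchStore(es_url=..., index_name=..., embedding=..., **es_auth_kwargs())`.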


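## Basic example
In this example, we load "state_of_the_union.txt" with TextLoader, split the text into chunks of 500 characters, and then index each chunk into Elasticsearch.

Once the data is indexed, we run a simple query to find the top 4 chunks most similar to the query "What did the president say about Ketanji Brown Jackson".

Elasticsearch is assumed to run locally on localhost:9200 with Docker. For details on connecting to Elasticsearch on Elastic Cloud, see the section on connecting with authentication above. The cells below use an Elastic Cloud URL and an API key; for a local instance, pass `es_url="http://localhost:9200"` instead.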


In [15]:
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings

API reference: ElasticsearchStore | OpenAIEmbeddings

In [16]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

loader = TextLoader("../data/state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
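
To make the chunking step more concrete, here is a simplified, dependency-free sketch of separator-based splitting. It is an approximation written for illustration, not the `CharacterTextSplitter` implementation: the function name `split_by_separator` is hypothetical, and details such as whitespace stripping and chunk overlap are left out.

```python
def split_by_separator(text, chunk_size=500, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the piece still fits, or it starts a new chunk
            # (an oversized single piece is kept whole).
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
print(split_by_separator(sample, chunk_size=10))
```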

In [15]:
db = ElasticsearchStore.from_documents(
    docs,
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    index_name="test-basic",
    embedding=embeddings,
    es_api_key="QWtvejVaQUJKVFV6WjA2Z2hRT1g6STFNaGpHa0RSX2lRRXlCSWwzVDBiUQ=="
)

db.client.indices.refresh(index="test-basic")

query = "What did the president say about Ketanji Brown Jackson"
results = db.similarity_search(query)
print(results)

[Document(metadata={'source': '../data/state_of_the_union.txt'}, page_content='One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.'), Document(metadata={'source': '../data/state_of_the_union.txt'}, page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system.'), Document(metadata={'source': '../data/state_of_the_union.txt'}, pa

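## Metadata
`ElasticsearchStore` supports storing metadata alongside each document. The metadata dictionary is stored in a `metadata` object field of the Elasticsearch document. Elasticsearch sets the mapping automatically by inferring the data type of each metadata value. For example, if a metadata value is a string, Elasticsearch maps the corresponding metadata field as a string type (by default `text` with a `keyword` subfield).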
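
The type inference described above can be approximated in a few lines. This is only a rough sketch of Elasticsearch's default dynamic mapping rules: `guess_field_type` is a hypothetical helper, and the date check is reduced to a plain `YYYY-MM-DD` pattern, which is narrower than the date formats Elasticsearch actually detects.

```python
import re

# Simplified stand-in for Elasticsearch's date detection (assumption).
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def guess_field_type(value):
    """Return the field type that dynamic mapping would likely pick for a value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "date" if DATE_PATTERN.match(value) else "text + keyword subfield"
    if isinstance(value, dict):
        return "object"
    return "unknown"


metadata = {"date": "2016-01-01", "rating": 2, "author": "John Doe"}
for field, value in metadata.items():
    print(f"metadata.{field}: {guess_field_type(value)}")
```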



In [17]:
# Adding metadata to documents
for i, doc in enumerate(docs):
    doc.metadata["date"] = f"{range(2010, 2020)[i % 10]}-01-01"
    doc.metadata["rating"] = range(1, 6)[i % 5]
    doc.metadata["author"] = ["John Doe", "Jane Doe"][i % 2]

db = ElasticsearchStore.from_documents(
    docs, 
    embedding=embeddings, 
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    es_api_key="QWtvejVaQUJKVFV6WjA2Z2hRT1g6STFNaGpHa0RSX2lRRXlCSWwzVDBiUQ==",
    index_name="test-metadata"
)

query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].metadata)

{'source': '../data/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}


### Example: filter by exact keyword
Note: we use the `keyword` subfield, which is not analyzed.

In [18]:
docs = db.similarity_search(
    query, filter=[{"term": {"metadata.author.keyword": "John Doe"}}]
)
print(docs[0].metadata)

{'source': '../data/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}


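### Example: filter by partial match

This example shows how to filter by partial match. It is useful when you don't know the exact value of a metadata field. For example, to filter on the `author` metadata field without knowing the author's exact value, you can use a partial match on the author's last name. Fuzzy matching is also supported: "Jon" matches "John Doe" because "Jon" is a close match to the "John" token.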

In [19]:
docs = db.similarity_search(
    query,
    filter=[{"match": {"metadata.author": {"query": "Jon", "fuzziness": "AUTO"}}}],
)
print(docs[0].metadata)

{'source': '../data/state_of_the_union.txt', 'date': '2016-01-01', 'rating': 2, 'author': 'John Doe'}
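
To see why "Jon" matches the "John" token, it helps to compute the edit distance directly. The sketch below is an illustration under stated assumptions: it uses plain Levenshtein distance (Elasticsearch also counts a transposition as a single edit), it assumes tokens are lowercased by the analyzer, and `auto_fuzziness` mirrors the documented default length thresholds for `"fuzziness": "AUTO"`. All function names are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Edits allowed for a term: 0 for 1-2 chars, 1 for 3-5 chars, 2 for longer terms."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query_term, indexed_term):
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)


# "John Doe" is assumed to be indexed as the tokens "john" and "doe".
for token in ["john", "doe"]:
    print(token, fuzzy_match("jon", token))
```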


### Example: filter by date range


In [20]:
docs = db.similarity_search(
    "Any mention about Fred?",
    filter=[{"range": {"metadata.date": {"gte": "2010-01-01"}}}],
)
print(docs[0].metadata)

{'source': '../data/state_of_the_union.txt', 'date': '2017-01-01', 'rating': 3, 'author': 'Jane Doe'}


### Example: filter by numeric range


In [21]:
docs = db.similarity_search(
    "Any mention about Fred?", filter=[{"range": {"metadata.rating": {"gte": 2}}}]
)
print(docs[0].metadata)

{'source': '../data/state_of_the_union.txt', 'date': '2017-01-01', 'rating': 3, 'author': 'Jane Doe'}


### Example: filter by geo distance
This requires an index in which `metadata.geo_location` is declared with a `geo_point` mapping.

In [24]:
docs = db.similarity_search(
    "Any mention about Fred?",
    filter=[
        {
            "geo_distance": {
                "distance": "200km",
                "metadata.geo_location": {"lat": 40, "lon": -70},
            }
        }
    ],
)
print(docs[0])

BadRequestError: BadRequestError(400, 'search_phase_execution_exception', 'failed to find geo field [metadata.geo_location]')
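
The error above is what Elasticsearch reports when no `geo_point` field is mapped. As a hedged sketch, the mapping fragment below declares `metadata.geo_location` as a `geo_point`, and a small haversine function illustrates what a 200 km `geo_distance` filter checks. The helper `haversine_km` and the sample coordinates are assumptions for illustration. An index created manually with this fragment would also need the text and vector field mappings that `ElasticsearchStore` normally sets up.

```python
import math

# Mapping fragment declaring metadata.geo_location as a geo_point.
geo_mapping = {
    "properties": {
        "metadata": {
            "properties": {
                "geo_location": {"type": "geo_point"},
            }
        }
    }
}


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points on a sphere, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


# Would a document located at (41, -71) pass a 200 km filter centred on (40, -70)?
print(haversine_km(40, -70, 41, -71) <= 200)
```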

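## Distance similarity algorithms
Elasticsearch supports the following vector distance similarity algorithms:

- Cosine
- Euclidean
- Dot product

Cosine similarity is the default.

You can select the similarity algorithm with the `distance_strategy` parameter.

Note: depending on the retrieval strategy, the similarity algorithm cannot be changed at query time. It must be set when the index mapping for the field is created. To change the similarity algorithm, delete the index and recreate it with the correct `distance_strategy`.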

In [None]:

db = ElasticsearchStore.from_documents(
    docs, 
    embeddings, 
    es_url="http://localhost:9200", 
    index_name="test",
    distance_strategy="COSINE"
    # distance_strategy="EUCLIDEAN_DISTANCE"
    # distance_strategy="DOT_PRODUCT"
)
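
For reference, the three measures can be written out in plain Python. This is a small illustrative sketch, independent of how Elasticsearch computes or normalizes its scores internally; the function names and sample vectors are made up for the example.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their lengths.
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(cosine_similarity(query, same_direction), cosine_similarity(query, orthogonal))
print(euclidean_distance(query, same_direction))
print(dot_product(query, same_direction))
```

Cosine ignores vector length and only compares direction, whereas the Euclidean distance and the dot product both depend on magnitude, which is why the choice has to be fixed when the field is mapped.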


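## Retrieval strategies

Elasticsearch has a big advantage over other vector-only databases because it supports a wide range of retrieval strategies. In this notebook, we configure `ElasticsearchStore` to support some of the most common ones. By default, `ElasticsearchStore` uses `DenseVectorStrategy` (called `ApproxRetrievalStrategy` before version 0.2.0).

### DenseVectorStrategy
This strategy returns the top k vectors most similar to the query vector. The `k` parameter is set when `ElasticsearchStore` is initialized. The default value is 10.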

In [5]:
from langchain_elasticsearch import DenseVectorStrategy

db = ElasticsearchStore.from_documents(
    docs,
    embeddings,
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    index_name="test",
    es_api_key="QWtvejVaQUJKVFV6WjA2Z2hRT1g6STFNaGpHa0RSX2lRRXlCSWwzVDBiUQ==",
    strategy=DenseVectorStrategy(),
)

docs = db.similarity_search(
    query="What did the president say about Ketanji Brown Jackson?", k=10
)


In [27]:
docs[0]

Document(metadata={'source': '../data/state_of_the_union.txt', 'date': '2017-01-01', 'rating': 3, 'author': 'Jane Doe'}, page_content='A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n\nAnd if we are to advance liberty and justice, we need to secure the Border and fix the immigration system.')

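### Example: hybrid retrieval with dense vector and keyword search
This example shows how to configure `ElasticsearchStore` to perform hybrid retrieval, combining approximate semantic search with keyword-based search.

We use RRF to balance the two scores from the different retrieval methods. To enable hybrid retrieval, set `hybrid=True` in the `DenseVectorStrategy` constructor.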

In [None]:

db = ElasticsearchStore.from_documents(
    docs, 
    embeddings, 
    es_url="https://740ce5a8b5904eacbc138def7cb1bde9.us-central1.gcp.cloud.es.io:443",
    index_name="test",
    es_api_key="QWtvejVaQUJKVFV6WjA2Z2hRT1g6STFNaGpHa0RSX2lRRXlCSWwzVDBiUQ==",
    strategy=DenseVectorStrategy(hybrid=True)
)

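When `hybrid` is enabled, the query that is executed combines approximate semantic search with keyword-based search. It uses RRF (Reciprocal Rank Fusion) to balance the two scores from the different retrieval methods. Note that RRF requires Elasticsearch 8.9.0 or later.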

```json
{
    "knn": {
        "field": "vector",
        "filter": [],
        "k": 1,
        "num_candidates": 50,
        "query_vector": [1.0, ..., 0.0],
    },
    "query": {
        "bool": {
            "filter": [],
            "must": [{"match": {"text": {"query": "foo"}}}],
        }
    },
    "rank": {"rrf": {}},
}
```
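
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines: each document receives `1 / (rank_constant + rank)` from every ranking it appears in, and the sums determine the final order. The code below is a minimal illustration, not Elasticsearch's implementation; the function name and the sample rankings are hypothetical, and `rank_constant=60` is assumed here as the commonly documented default.

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse several ranked lists of document ids into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


knn_ranking = ["a", "b", "c"]   # order returned by the vector search
bm25_ranking = ["b", "d", "a"]  # order returned by the keyword search
print(reciprocal_rank_fusion([knn_ranking, bm25_ranking]))
```

Because only ranks are used, the raw kNN and BM25 scores never need to be put on a common scale.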

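### Example: dense vector search with an embedding model in Elasticsearch
This example shows how to configure `ElasticsearchStore` to use an embedding model deployed in Elasticsearch for dense vector retrieval. To use it, specify the model ID through the `model_id` argument of the `DenseVectorStrategy` constructor. Note that the model must be deployed and running on an Elasticsearch ML node. For how to deploy a model with eland, see the notebook example.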

In [6]:
DENSE_SELF_DEPLOYED_INDEX_NAME = "test-dense-self-deployed"

# Note: no embedding function is specified here; instead, we use the embedding model deployed in Elasticsearch
db = ElasticsearchStore(
    es_cloud_id="e2c70ef1bb04406da4169b5a2af80afd:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ3NDBjZTVhOGI1OTA0ZWFjYmMxMzhkZWY3Y2IxYmRlOSQ5Zjc0YmUyZDA2NTg0ODcwODlmMzBhMWFhODJmMGE2MQ==",
    es_user="elastic",
    es_password="uBCEbXYCICojRp0ELUxIUCga",
    index_name=DENSE_SELF_DEPLOYED_INDEX_NAME,
    query_field="text_field",
    vector_query_field="vector_query_field.predicted_value",
    strategy=DenseVectorStrategy(model_id="sentence-transformers__all-minilm-l6-v2"),
)



In [7]:
# Set up an ingest pipeline that computes the embedding of the text field
db.client.ingest.put_pipeline(
    id="test_pipeline",
    processors=[
        {
            "inference": {
                "model_id": "sentence-transformers__all-minilm-l6-v2",
                "field_map": {"query_field": "text_field"},
                "target_field": "vector_query_field",
            }
        }
    ],
)



ObjectApiResponse({'acknowledged': True})

In [8]:
# Create a new index that uses the pipeline, without relying on LangChain to create the index
db.client.indices.create(
    index=DENSE_SELF_DEPLOYED_INDEX_NAME,
    mappings={
        "properties": {
            "text_field": {"type": "text"},
            "vector_query_field": {
                "properties": {
                    "predicted_value": {
                        "type": "dense_vector",
                        "dims": 384,
                        "index": True,
                        "similarity": "l2_norm",
                    }
                }
            },
        }
    },
    settings={"index": {"default_pipeline": "test_pipeline"}},
)



ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'test-dense-self-deployed'})

In [14]:
try:
    db.from_texts(
        ["hello world"],
        es_cloud_id="e2c70ef1bb04406da4169b5a2af80afd:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ3NDBjZTVhOGI1OTA0ZWFjYmMxMzhkZWY3Y2IxYmRlOSQ5Zjc0YmUyZDA2NTg0ODcwODlmMzBhMWFhODJmMGE2MQ==",
        es_user="elastic",
        es_password="uBCEbXYCICojRp0ELUxIUCga",
        index_name=DENSE_SELF_DEPLOYED_INDEX_NAME,
        query_field="text_field",
        vector_query_field="vector_query_field.predicted_value",
        strategy=DenseVectorStrategy(model_id="sentence-transformers__all-minilm-l6-v2"),
    )

    # Perform search
    db.similarity_search("hello world", k=10)
except Exception as e:
    print(e)

1 document(s) failed to index.


### SparseVectorStrategy (ELSER)
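This strategy uses Elasticsearch's sparse vector retrieval to retrieve the top-k results. Currently only Elastic's own "ELSER" embedding model is supported.

Note: this requires the ELSER model to be deployed and running on an Elasticsearch ML node.

To use it, specify `SparseVectorStrategy` (called `SparseVectorRetrievalStrategy` before version 0.2.0) in the `ElasticsearchStore` constructor. You need to provide a model ID.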

In [18]:
from langchain_elasticsearch import SparseVectorStrategy

# Note that this example doesn't have an embedding function. This is because we infer the tokens at index time and at query time within Elasticsearch.
# This requires the ELSER model to be loaded and running in Elasticsearch.
db = ElasticsearchStore.from_documents(
    docs,
    es_cloud_id="e2c70ef1bb04406da4169b5a2af80afd:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ3NDBjZTVhOGI1OTA0ZWFjYmMxMzhkZWY3Y2IxYmRlOSQ5Zjc0YmUyZDA2NTg0ODcwODlmMzBhMWFhODJmMGE2MQ==",
    es_user="elastic",
    es_password="uBCEbXYCICojRp0ELUxIUCga",
    index_name="test-elser",
    strategy=SparseVectorStrategy(model_id="elser_model_2"),
)

db.client.indices.refresh(index="test-elser")

results = db.similarity_search(
    "What did the president say about Ketanji Brown Jackson", k=4
)
print(results[0])

NotFoundError: NotFoundError(404, 'resource_not_found_exception', 'Could not find trained model [elser_model_2]')

### DenseVectorScriptScoreStrategy
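This strategy uses Elasticsearch's script score query to perform exact (brute-force) vector retrieval of the top-k results. (It was called `ExactRetrievalStrategy` before version 0.2.0.)
To use it, specify `DenseVectorScriptScoreStrategy` in the `ElasticsearchStore` constructor.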

In [None]:
from langchain_elasticsearch import DenseVectorScriptScoreStrategy

db = ElasticsearchStore.from_documents(
    docs, 
    embeddings, 
    es_url="http://localhost:9200", 
    index_name="test",
    strategy=DenseVectorScriptScoreStrategy(),
)

### BM25Strategy
Finally, you can use full-text keyword search.

To use it, specify `BM25Strategy` in the `ElasticsearchStore` constructor.

In [19]:
from langchain_elasticsearch import BM25Strategy

db = ElasticsearchStore.from_documents(
    docs, 
    es_url="http://36.150.110.168:9200", 
    index_name="test",
    strategy=BM25Strategy(),
)

#### BM25RetrievalStrategy

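This strategy lets you search with pure BM25, without vector search.

To use it, specify `BM25RetrievalStrategy` in the `ElasticsearchStore` constructor.

Note that the example below does not specify an embedding option, which indicates that the search runs without embeddings.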

In [2]:


from langchain_elasticsearch import ElasticsearchStore

db = ElasticsearchStore(
    es_url="http://36.150.110.168:9200",
    index_name="test_index",
    strategy=ElasticsearchStore.BM25RetrievalStrategy(),
)

db.add_texts(
    ["foo", "foo bar", "foo bar baz", "bar", "bar baz", "baz"],
)

results = db.similarity_search(query="foo", k=10)
print(results)

[Document(page_content='foo'), Document(page_content='foo'), Document(page_content='foo bar'), Document(page_content='foo bar'), Document(page_content='foo bar baz'), Document(page_content='foo bar baz')]
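
For intuition about why the shorter documents rank first, here is a simplified sketch of the classic BM25 formula applied to the same six texts. It is an illustration only: Lucene's implementation differs in details, so the numbers will not match Elasticsearch's `_score`. The function name is hypothetical, and `k1=1.2` and `b=0.75` are assumed as the usual defaults.

```python
import math


def bm25_scores(query_term, documents, k1=1.2, b=0.75):
    """Score each whitespace-tokenized document for a single query term."""
    tokenized = [doc.split() for doc in documents]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    containing = sum(1 for tokens in tokenized if query_term in tokens)
    # Rarer terms get a higher inverse document frequency.
    idf = math.log(1 + (len(tokenized) - containing + 0.5) / (containing + 0.5))
    scores = []
    for tokens in tokenized:
        tf = tokens.count(query_term)
        # Longer-than-average documents are penalized through the length norm.
        norm = k1 * (1 - b + b * len(tokens) / avg_len)
        scores.append(idf * tf * (k1 + 1) / (tf + norm) if tf else 0.0)
    return scores


corpus = ["foo", "foo bar", "foo bar baz", "bar", "bar baz", "baz"]
for text, score in zip(corpus, bm25_scores("foo", corpus)):
    print(f"{text!r}: {score:.4f}")
```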


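### Custom query
With the `custom_query` parameter at search time, you can adjust the query that is used to retrieve documents from Elasticsearch. This is useful if you want a more complex query, for example to support linear boosting of fields.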

In [7]:
# Example of a custom query that's just doing a BM25 search on the text field.
def custom_query(query_body: dict, query: str):
    """Custom query to be used in Elasticsearch.
    Args:
        query_body (dict): Elasticsearch query body.
        query (str): Query string.
    Returns:
        dict: Elasticsearch query body.
    """
    print("Query Retriever created by the retrieval strategy:")
    print(query_body)
    print()

    new_query_body = {"query": {"match": {"text": query}}}

    print("Query thats actually used in Elasticsearch:")
    print(new_query_body)
    print()

    return new_query_body

results = db.similarity_search(
    "foo",
    k=4,
    custom_query=custom_query,
)
print("Results:")
print(results)

Query Retriever created by the retrieval strategy:
{'query': {'bool': {'must': [{'match': {'text': {'query': 'foo'}}}], 'filter': []}}}

Query thats actually used in Elasticsearch:
{'query': {'match': {'text': 'foo'}}}

Results:
[Document(page_content='foo'), Document(page_content='foo'), Document(page_content='foo bar'), Document(page_content='foo bar')]
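
As a sketch of the field boosting mentioned above, a custom query function could weight a match on the text more heavily than a match on a metadata field. The function name `boosted_query`, the boost values, and the use of `metadata.author` are assumptions for illustration. Note that this version ignores `query_body`, so any kNN clause or filter produced by the retrieval strategy is dropped.

```python
def boosted_query(query_body: dict, query: str) -> dict:
    """Return a bool query that boosts matches on `text` over `metadata.author`."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"text": {"query": query, "boost": 2.0}}},
                    {"match": {"metadata.author": {"query": query, "boost": 0.5}}},
                ],
                # Require at least one of the clauses to match.
                "minimum_should_match": 1,
            }
        }
    }


print(boosted_query({}, "foo"))
```

It would be passed in the same way as in the cell above, for example `db.similarity_search("foo", k=4, custom_query=boosted_query)`.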


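### Custom document builder
With the `doc_builder` parameter at search time, you can adjust how a Document is built from the data retrieved from Elasticsearch. This is especially useful if your index was not created with LangChain.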




In [9]:
from typing import Dict

from langchain_core.documents import Document


def custom_document_builder(hit: Dict) -> Document:
    print("hit", hit)
    src = hit.get("_source", {})
    return Document(
        page_content=src.get("content", "Missing content!"),
        metadata={
            "page_number": src.get("page_number", -1),
            "original_filename": src.get("original_filename", "Missing filename!"),
        },
    )


results = db.similarity_search(
    "foo",
    k=4,
    doc_builder=custom_document_builder,
)
print("Results:")
print(results[0])

hit {'_index': 'test_index', '_id': '262b8336-910e-4444-aead-eb87293d2534', '_score': 0.8287628, '_source': {'metadata': {}, 'text': 'foo'}}
hit {'_index': 'test_index', '_id': '2ca5a718-028d-4b4f-b79d-c47a95984698', '_score': 0.8287628, '_source': {'metadata': {}, 'text': 'foo'}}
hit {'_index': 'test_index', '_id': '0b7a994d-5599-4548-b9fe-9a732dbe910f', '_score': 0.64072424, '_source': {'metadata': {}, 'text': 'foo bar'}}
hit {'_index': 'test_index', '_id': '74ab7279-6886-40fe-b981-776de99c2239', '_score': 0.64072424, '_source': {'metadata': {}, 'text': 'foo bar'}}
Results:
page_content='Missing content!' metadata={'page_number': -1, 'original_filename': 'Missing filename!'}
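
The output above falls back to "Missing content!" because the hits store their text under `text`, not `content`. The following sketch shows a builder that reads the fields actually present in those hits. To stay dependency-free it uses a small stand-in dataclass, `SimpleDocument`, in place of `langchain_core.documents.Document`; both the stand-in and the name `text_field_builder` are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Stand-in for langchain_core.documents.Document (illustration only)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def text_field_builder(hit: dict) -> SimpleDocument:
    source = hit.get("_source", {})
    return SimpleDocument(
        page_content=source.get("text", "Missing content!"),
        # Keep the stored metadata and add the relevance score of the hit.
        metadata={**source.get("metadata", {}), "score": hit.get("_score")},
    )


hit = {
    "_index": "test_index",
    "_id": "example-id",
    "_score": 0.8287628,
    "_source": {"metadata": {}, "text": "foo"},
}
print(text_field_builder(hit))
```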


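## FAQ

Question: I get timeout errors when indexing documents into Elasticsearch. How do I fix this?

One possible cause is that your documents take longer to index into Elasticsearch. `ElasticsearchStore` uses the Elasticsearch bulk API, which has a few defaults that you can adjust to reduce the chance of timeout errors. Adjusting them is also a good idea when you use `SparseVectorRetrievalStrategy`. The defaults are:

- `chunk_size`: 500
- `max_chunk_bytes`: 100MB

To adjust these, pass `chunk_size` and `max_chunk_bytes` to the `ElasticsearchStore` `add_texts` method through the `bulk_kwargs` argument.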

```python
vector_store.add_texts(
    texts,
    bulk_kwargs={
        "chunk_size": 50,
        "max_chunk_bytes": 200000000
    }
)
```
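
To illustrate what the two limits do, the sketch below batches a list of actions so that no batch exceeds a given number of items or a given serialized size. It is a simplified model written for this explanation, not the bulk helper's actual code; `chunk_actions` is a hypothetical name and the size estimate simply uses the JSON encoding of each action.

```python
import json


def chunk_actions(actions, chunk_size=500, max_chunk_bytes=100 * 1024 * 1024):
    """Yield batches limited by item count and by approximate serialized size."""
    chunk, chunk_bytes = [], 0
    for action in actions:
        size = len(json.dumps(action).encode("utf-8"))
        if chunk and (len(chunk) >= chunk_size or chunk_bytes + size > max_chunk_bytes):
            # Current batch is full: emit it and start a new one.
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(action)
        chunk_bytes += size
    if chunk:
        yield chunk


actions = [{"text": f"document {i}"} for i in range(120)]
batches = list(chunk_actions(actions, chunk_size=50))
print([len(batch) for batch in batches])
```

Smaller batches mean each bulk request finishes sooner, which is why lowering `chunk_size` tends to help with timeouts.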

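## Upgrading to ElasticsearchStore
If you are already using Elasticsearch in a LangChain-based project, you may be using the old implementations `ElasticVectorSearch` and `ElasticKNNSearch`, both of which are now deprecated. We have introduced a new implementation called `ElasticsearchStore`, which is more flexible and easier to use. This notebook guides you through upgrading to the new implementation.

The new implementation is a single class, `ElasticsearchStore`, which can be used through strategies for approximate dense vector, exact dense vector, sparse vector (ELSER), BM25, and hybrid retrieval.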

### I am using ElasticKNNSearch

In [None]:
# Old implementation
from langchain_community.vectorstores.elastic_vector_search import ElasticKNNSearch

db = ElasticKNNSearch(
  elasticsearch_url="http://localhost:9200",
  index_name="test_index",
  embedding=embedding
)


In [None]:
# New implementation

from langchain_elasticsearch import ElasticsearchStore, DenseVectorStrategy

db = ElasticsearchStore(
  es_url="http://localhost:9200",
  index_name="test_index",
  embedding=embedding,
  # if you use the model_id
  # strategy=DenseVectorStrategy(model_id="test_model")
  # if you use hybrid search
  # strategy=DenseVectorStrategy(hybrid=True)
)

### I am using ElasticVectorSearch


In [None]:
# Old implementation

from langchain_community.vectorstores.elastic_vector_search import ElasticVectorSearch

db = ElasticVectorSearch(
  elasticsearch_url="http://localhost:9200",
  index_name="test_index",
  embedding=embedding
)


In [None]:
# New implementation
from langchain_elasticsearch import ElasticsearchStore, DenseVectorScriptScoreStrategy

db = ElasticsearchStore(
  es_url="http://localhost:9200",
  index_name="test_index",
  embedding=embedding,
  strategy=DenseVectorScriptScoreStrategy()
)

## Deleting the indices



In [11]:
db.client.indices.delete(
    index="test-metadata,test-elser,test-basic",
    ignore_unavailable=True,
    allow_no_indices=True,
)

ConnectionTimeout: Connection timed out