# Elasticsearch LangChain playground

First, let's configure the connection to our Elasticsearch cluster in the cloud.


In [1]:
import os
#os.environ["LANGCHAIN_HANDLER"] = "langchain"
#os.environ["LANGCHAIN_TRACING"] = "true"
#os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.

from langchain import OpenAI, ESEngine, ESChain
from langchain.chat_models import ChatOpenAI

# Cluster endpoints; override the defaults through environment variables.
ES_URL = os.environ.get("ES_URL", "https://test1.es.europe-north1.gcp.elastic-cloud.com:443")
KB_URL = os.environ.get("KB_URL", "https://test1.kb.europe-north1.gcp.elastic-cloud.com:9243")

# The password is read from the environment so it never lands in the notebook.
username = "elastic"
password = os.environ.get("ES_PASSWORD", "")

# Connect with basic auth and HTTP compression enabled.
db = ESEngine.from_uri(
    ES_URL,
    connection_args={
        "basic_auth": (username, password),
        "http_compress": True,
    },
    sample_docs_in_index_info=0,
)
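The cell above falls back to an empty password when `ES_PASSWORD` is unset, so a missing variable only shows up later as an authentication error. A minimal sketch of an early check follows. `build_connection_args` is a hypothetical helper, not part of langchain; the dictionary keys simply mirror the `connection_args` used above.

```python
import os
from typing import Mapping, Optional


def build_connection_args(
    env: Optional[Mapping[str, str]] = None,
    username: str = "elastic",
) -> dict:
    """Return connection arguments, failing early if the password is missing.

    Hypothetical helper: it only assembles the same dictionary that the
    notebook passes as `connection_args`.
    """
    env = os.environ if env is None else env
    password = env.get("ES_PASSWORD", "")
    if not password:
        raise ValueError("ES_PASSWORD is not set; export it before running the notebook")
    return {
        "basic_auth": (username, password),
        "http_compress": True,
    }
```

The result could then be passed as `connection_args=build_connection_args()` in the `ESEngine.from_uri` call.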



Let's set up the Elasticsearch toolkit so that a LangChain agent can interact with the cluster.

In [2]:

from langchain.agents import create_elasticsearch_agent
from langchain.agents.agent_toolkits import ElasticsearchDatastoreToolkit
from langchain.llms.openai import OpenAI
from langchain.chat_models import ChatOpenAI

# Pick one LLM; temperature=0 keeps the answers as repeatable as possible.
#llm = OpenAI(temperature=0, verbose=True)    # use gpt3 model
llm = ChatOpenAI(temperature=0, verbose=True) # use gpt-3.5-turbo model
toolkit = ElasticsearchDatastoreToolkit(db=db, llm=llm)

agent_executor = create_elasticsearch_agent(
    llm=llm,
    toolkit=toolkit,
    top_k=0,
    verbose=True
)
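Once the executor exists, questions are sent one at a time through `agent_executor.run`. When trying several questions in a row, a small wrapper that records failures instead of stopping can be handy. The sketch below is an assumption-laden convenience, not part of langchain: `ask_all` is a hypothetical name, and it accepts any callable that maps a question string to an answer string.

```python
from typing import Callable, Dict, Iterable


def ask_all(run: Callable[[str], str], questions: Iterable[str]) -> Dict[str, str]:
    """Run each question and collect the answer, or the error text if the call fails."""
    answers: Dict[str, str] = {}
    for question in questions:
        try:
            answers[question] = run(question)
        except Exception as exc:  # an agent run may fail on parsing or network errors
            answers[question] = f"ERROR: {exc}"
    return answers
```

With the executor defined above, a call would look like `ask_all(agent_executor.run, ["What's the total revenue of my ecommerce shop?"])`.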


Below is a list of sample questions about the data stored in the cluster.
We use the sample datasets from Kibana.

In [3]:

#agent_executor.run("Who bought the most in e-commerce shop?")
#agent_executor.run("What's the total revenue of my ecommerce shop?")
#agent_executor.run("Can you find any trace of an attack to my website in the logs?")
#agent_executor.run("What is the top product sold this week?")

#agent_executor.run("Flights to which destination city are the most delayed?")
#agent_executor.run("Which IP sent the most of the bytes?")
agent_executor.run("What's median spending in the last 7 days in my ecommerce shop?")
#agent_executor.run("What's total revenue of my shop in the last 7 days?")
#agent_executor.run("How daily spending is increasing in my ecommerce shop over the last week?")



> Entering new AgentExecutor chain...
Action:
```
{
  "action": "list_indices_es_db",
  "action_input": ""
}
```
Observation: .fleet-files-agent-000001, kibana_sample_data_ecommerce, test-123, kibana_sample_data_logs, .fleet-file-data-agent-000001, kibana_sample_data_flights
Thought: I should use the `kibana_sample_data_ecommerce` index to answer this question. I need to check the schema of this index to see what fields are available for me to use in the query.
Action:
```
{
  "action": "schema_es_db",
  "action_input": "kibana_sample_data_ecommerce"
}
```

Observation: Index: kibana_sample_data_ecommerce
Fields: {'category': 'text', 'category.keyword': 'keyword', 'currency': 'keyword', 'customer_birth_date': 'date', 'customer_first_name': 'text', 'customer_first_name.keyword': 'keyword', 'customer_full_name': 'text', 'customer_full_name.keyword': 'keyword', 'customer_gender': 'keyword', 'customer_id': 'keyword', 

'The daily spending in the ecommerce shop over the last week is increasing. Here are the daily spending values for the last 7 days:\n\n- June 12th: 10328.91\n- June 13th: 10886.55\n- June 14th: 11008.98\n- June 15th: 13210.54\n- June 16th: 10263.96\n- June 17th: 11282.36\n- June 18th: 11489.23'
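To sanity-check answers like this one, it helps to see what the underlying searches might look like. The sketch below builds two Elasticsearch request bodies as plain dictionaries: a median through the `percentiles` aggregation, and a per-day total through `date_histogram` with a `sum` sub-aggregation. The field names `order_date` and `taxful_total_price` are assumptions (the schema output above is cut off before those fields appear), the helper names are hypothetical, and this is not necessarily the query the agent generated.

```python
def last_days_filter(date_field: str, days: int) -> dict:
    """Range filter covering the last `days` days, rounded down to the start of the day."""
    return {"range": {date_field: {"gte": f"now-{days}d/d"}}}


def median_query(date_field: str, value_field: str, days: int = 7) -> dict:
    """Request body for the (approximate) median of `value_field` over recent days."""
    return {
        "size": 0,  # only aggregation results are needed, no documents
        "query": last_days_filter(date_field, days),
        "aggs": {
            "median_spending": {
                "percentiles": {"field": value_field, "percents": [50]}
            }
        },
    }


def daily_sum_query(date_field: str, value_field: str, days: int = 7) -> dict:
    """Request body for the per-day sum of `value_field` over recent days."""
    return {
        "size": 0,
        "query": last_days_filter(date_field, days),
        "aggs": {
            "per_day": {
                "date_histogram": {"field": date_field, "calendar_interval": "day"},
                "aggs": {"daily_spending": {"sum": {"field": value_field}}},
            }
        },
    }


# Assumed field names for the kibana_sample_data_ecommerce index.
median_body = median_query("order_date", "taxful_total_price")
daily_body = daily_sum_query("order_date", "taxful_total_price")
```

Either body can be sent to the `_search` endpoint of `kibana_sample_data_ecommerce`, and the result compared with the agent's answer. Note that the `percentiles` aggregation returns an approximation, so small differences from an exact median are expected.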