# <h1 align="center"><font color="red">Chatbot using Qdrant</font></h1>

<font color="yellow">Senior Data Scientist: Dr. Eddy Giusepe Chirinos Isidro</font>

Study links:

* [Qdrant: OpenAI](https://qdrant.tech/documentation/embeddings/openai/)

* [Tutorial by Prince Krampah](https://ai.gopubby.com/building-a-smart-chatbot-using-qdrant-cloud-and-langchain-for-customer-support-tickets-504a3e17ac83)

* [Chatbot with Qdrant](https://github.com/Princekrampah/qdrant_csv_chatbot_tutorial/blob/master/chatbot.ipynb)

# <font color="green">Installation and setup</font>

In [1]:
# !pip install pandas langchain-openai qdrant-client langchain langchain-community langchain-qdrant
import openai
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
#openai.api_key  = os.environ['OPENAI_API_KEY']
#Eddy_key_openai  = os.environ['OPENAI_API_KEY']

#from openai import OpenAI
#client = OpenAI(api_key=Eddy_key_openai)

In [3]:
from decouple import config
import pandas as pd
import ast

df = pd.read_csv("./data/product_ticket_description.csv")
df.head()

| Unnamed: 0 | ID | product_purchased | ticket_description |
|---|---|---|---|
| 0 | 0 | GoPro Hero | I'm having an issue with the GoPro Hero. Pleas... |
| 1 | 1 | LG Smart TV | I'm having an issue with the LG Smart TV. Plea... |
| 2 | 2 | Dell XPS | I'm facing a problem with my Dell XPS. The Del... |
| 3 | 3 | Microsoft Office | I'm having an issue with the Microsoft Office.... |
| 4 | 4 | Autodesk AutoCAD | I'm having an issue with the Autodesk AutoCAD.... |


# <font color="green">Creating LangChain Docs from the CSV file</font>

In [4]:
from langchain_core.documents import Document

documents = []

for index, row in df[:250].iterrows():
    document = Document(
        page_content=row["ticket_description"],
        metadata={"product_name": row["product_purchased"]}
    )
    documents.append(document)

In [10]:
# Show only the first 5 documents:
documents[:5]

[Document(metadata={'product_name': 'GoPro Hero'}, page_content="I'm having an issue with the GoPro Hero. Please assist.Your billing zip code is: 71701.We appreciate that you have requested a website address.Please double check your email address. I've tried troubleshooting steps mentioned in the user manual, but the issue persists."),
 Document(metadata={'product_name': 'LG Smart TV'}, page_content="I'm having an issue with the LG Smart TV. Please assist.If you need to change an existing product.I'm having an issue with the LG Smart TV. Please assist.If The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."),
 Document(metadata={'product_name': 'Dell XPS'}, page_content="I'm facing a problem with my Dell XPS. The Dell XPS is not turning on. It was working fine until yesterday, but now it doesn't respond.1.8.3 I really I'm using the original charger that came with my Dell XPS, but it's not charging properly."),
 Document(metadata={'prod

In [11]:
print(documents[0].page_content)

I'm having an issue with the GoPro Hero. Please assist.Your billing zip code is: 71701.We appreciate that you have requested a website address.Please double check your email address. I've tried troubleshooting steps mentioned in the user manual, but the issue persists.


In [12]:
print(documents[0].metadata)

{'product_name': 'GoPro Hero'}


In [13]:
len(documents)

250

# <font color="green">Document UUID</font>

<font color="orange">Each document in the list we created above will get a `unique identifier`. Let's go ahead and create them:</font>

In [14]:
from uuid import uuid4
uuids = [str(uuid4()) for _ in range(len(documents))]
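
`uuid4` produces random identifiers, so re-running the cell gives every document a new ID. If stable IDs are preferred (for example, so that re-indexing the same ticket overwrites its existing point instead of adding a duplicate), one option is to derive them from the content with `uuid5`. The sketch below is an alternative, not what this notebook does; the namespace and the `stable_id` helper are hypothetical names.

```python
from uuid import NAMESPACE_URL, uuid5

# Hypothetical namespace for this project; any fixed UUID works.
TICKET_NAMESPACE = uuid5(NAMESPACE_URL, "customer_support_tickets")


def stable_id(product_name: str, ticket_description: str) -> str:
    """Return the same UUID string every time for the same ticket content."""
    return str(uuid5(TICKET_NAMESPACE, f"{product_name}\n{ticket_description}"))


first = stable_id("GoPro Hero", "I'm having an issue with the GoPro Hero.")
again = stable_id("GoPro Hero", "I'm having an issue with the GoPro Hero.")
other = stable_id("Dell XPS", "I'm facing a problem with my Dell XPS.")
print(first == again, first == other)  # True False
```

The trade-off is that two tickets with identical product and text would share one ID, which may or may not be desirable for this dataset.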

# <font color="green">Connecting to Qdrant Cloud</font>

<font color="orange">Now that the documents are ready, we need to embed them and store them in our `vector database`, in this case `Qdrant Cloud`. To do that, we first need to connect to the `Qdrant Cloud` instance we just created.

The following code blocks do that:</font>

In [19]:
from qdrant_client import QdrantClient

import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
qdrant_api_key  = os.environ['QDRANT_API_KEY']
qdrant_url  = os.environ['QDRANT_URL']


qdrant_client = QdrantClient(
    api_key=qdrant_api_key,
    url=qdrant_url
)


In [22]:
qdrant_client.get_collections()

CollectionsResponse(collections=[])

# <font color="green">Creating the `collection`</font>

In [21]:
from qdrant_client.http.models import Distance, VectorParams

COLLECTION_NAME ="customer_support_tickets"

In [23]:
qdrant_client.create_collection(collection_name=COLLECTION_NAME,
                                # os.getenv returns a string when the variable is set, so cast to int
                                vectors_config=VectorParams(size=int(os.getenv("QDRANT_VECTOR_DIMENSION", 1536)),
                                                            distance=Distance.COSINE
                                                           ),
                               )

True
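
`os.getenv` returns a string whenever the variable is set, while the vector `size` is an integer that has to match the embedding model's output dimension. A small helper can make the conversion and validation explicit. This is a minimal sketch: `vector_dimension_from_env` and the `DEMO_VECTOR_DIMENSION` variable are hypothetical names, and 1536 is simply the default used in the cell above.

```python
import os


def vector_dimension_from_env(name: str = "QDRANT_VECTOR_DIMENSION", default: int = 1536) -> int:
    """Read a vector dimension from the environment as a positive integer."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError as exc:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from exc
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value


# Demonstration with a throwaway variable name (an assumption for this example).
os.environ["DEMO_VECTOR_DIMENSION"] = "1536"
print(vector_dimension_from_env("DEMO_VECTOR_DIMENSION"))  # 1536
```

The result could then be passed as `size=` to `VectorParams`.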

# <font color="green">Connecting to the vector database</font>

In [26]:
from langchain_qdrant import QdrantVectorStore
from langchain_openai import OpenAIEmbeddings

# Fall back to "text-embedding-3-small" when EMBEDDING_MODEL is not set
embedding_model = OpenAIEmbeddings(model=os.getenv("EMBEDDING_MODEL", "text-embedding-3-small"))
    


`QdrantVectorStore` supports 3 modes for `similarity` searches. They can be selected with the `retrieval_mode` parameter when setting up the class.

* Dense Vector Search (default)
* Sparse Vector Search
* Hybrid Search
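
Dense search ranks stored vectors by their similarity to the query vector; the collection created earlier uses cosine distance. The toy sketch below shows that ranking in plain Python with made-up 3-dimensional vectors. It only illustrates the idea and is not how Qdrant implements search internally; `cosine_similarity` and `dense_search` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dense_search(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Made-up vectors standing in for ticket embeddings.
toy_vectors = {
    "camera ticket": [0.9, 0.1, 0.0],
    "tv ticket": [0.1, 0.9, 0.0],
    "laptop ticket": [0.0, 0.2, 0.9],
}
top = dense_search([1.0, 0.0, 0.0], toy_vectors, k=2)
print([name for name, _ in top])  # ['camera ticket', 'tv ticket']
```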

In [None]:
from langchain_qdrant import RetrievalMode

vector_store = QdrantVectorStore(
    client=qdrant_client,
    collection_name=COLLECTION_NAME,
    embedding=embedding_model,
    # retrieval_mode=RetrievalMode.DENSE
)
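
With the vector store in place, the usual next step is to load the documents with their UUIDs and run a similarity query. The sketch below wraps those two steps in hypothetical helpers (`index_documents`, `search_tickets`); it assumes the store exposes LangChain's `add_documents(documents=..., ids=...)` and `similarity_search(query, k=...)` methods. A small stand-in class is used so the example runs without credentials; in the notebook you would pass `vector_store`, `documents`, and `uuids` instead.

```python
def index_documents(vector_store, documents, ids):
    """Add documents under explicit IDs, refusing mismatched lengths."""
    if len(documents) != len(ids):
        raise ValueError("documents and ids must have the same length")
    return vector_store.add_documents(documents=documents, ids=ids)


def search_tickets(vector_store, query, k=3):
    """Return up to k stored documents for the query."""
    return vector_store.similarity_search(query, k=k)


class InMemoryStandIn:
    """Stand-in for a vector store, for demonstration only: no embeddings, no ranking."""

    def __init__(self):
        self.items = {}

    def add_documents(self, documents, ids):
        self.items.update(zip(ids, documents))
        return list(ids)

    def similarity_search(self, query, k=4):
        # Naive substring match in place of a real vector search.
        hits = [doc for doc in self.items.values() if query.lower() in doc.lower()]
        return hits[:k]


store = InMemoryStandIn()
added = index_documents(
    store,
    ["GoPro Hero will not charge", "Dell XPS is not turning on"],
    ["id-1", "id-2"],
)
print(added)                                # ['id-1', 'id-2']
print(search_tickets(store, "dell", k=1))   # ['Dell XPS is not turning on']
```

The length check is a small safeguard: a mismatch between `documents` and `uuids` would otherwise only surface during upload.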