#### This code develops a Retrieval Augmented Generation (RAG) based chatbot using LangChain, Pinecone and OpenAI with the following salient points
- Read the JSON question/answer pairs of a company
- Convert each question/answer pair into a separate text chunk using LangChain TextSplitter
- Convert the chunked question/answer pair into embeddings using OpenAI Embeddings
- Create an index in Pinecone and upsert the embedded chunks into it
- Take the query from the user, vectorize it, and find the matching text chunks in the Pinecone vector store
- Include the matching text chunks in the prompt sent to the LLM so that it can answer the user's specific question
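
The retrieval steps listed above can be illustrated with a toy, dependency-free sketch. The bag-of-words `embed` function, the in-memory `store`, and the prompt template are stand-ins invented for illustration; the notebook itself uses OpenAI embeddings and a Pinecone index.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, store: list, k: int = 1) -> list:
    """Return the k stored chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context_chunks: list) -> str:
    """Place the retrieved chunks in front of the question (hypothetical template)."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Q: What are the opening hours? A: We are open from nine to five.",
    "Q: Where is the office? A: The office is in the city centre.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]  # "upsert" into a toy store

user_query = "where is the office located"
top_chunks = retrieve(user_query, store, k=1)
prompt = build_prompt(user_query, top_chunks)
print(prompt)
```

The prompt built this way is what would be sent to the LLM; only the embedding model and the vector store change in the real pipeline.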

In [1]:
import json
import time
import hashlib
from pinecone import Pinecone, ServerlessSpec
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore
from langchain.chains import RetrievalQA
from langchain.chains.conversational_retrieval.base import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory


# Setting up the vector database in Pinecone

#### Reading the API keys for OpenAI and Pinecone

In [2]:
creds_file = "../credentials.json"
    
with open(creds_file, 'r') as file:
    creds_data = json.load(file)
    openai_api_key = creds_data['OPENAI_API_KEY']
    pinecone_api_key = creds_data['PINECONE_API_KEY']

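assert openai_api_key is not None, "OPENAI_API_KEY must not be null in the credentials file"
assert pinecone_api_key is not None, "PINECONE_API_KEY must not be null in the credentials file"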

#### Loading the data from the text file

In [3]:
file_loader = TextLoader(file_path='scalexi.txt', encoding="utf-8")
documents = file_loader.load()

#### Dividing the complete text into overlapping chunks

There are several ways to chunk the text. We can split on specific headings, or split at a fixed size with some overlap, as is done in the tutorial. The right strategy varies from application to application and depends on what context the model needs to answer the question.

In [5]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(documents)
chunked_text = [doc.page_content for doc in docs]
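
To make the idea of overlapping chunks concrete, here is a simplified sliding-window sketch. It is not how `CharacterTextSplitter` works internally (that class first splits on a separator, `"\n\n"` by default, and then merges pieces up to `chunk_size`); the `chunk_text` helper below is a hypothetical illustration that cuts plain character windows.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=4)
for piece in pieces:
    print(piece)
```

Each window repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.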

#### Defining the embeddings model to create the embeddings

In [6]:
embed_model = OpenAIEmbeddings(api_key=openai_api_key, model="text-embedding-ada-002")
chunked_embeddings = embed_model.embed_documents(chunked_text)

#### Combining the embeddings and chunked text

In [7]:
def generate_short_id(content: str) -> str:
    """
    Generate a deterministic ID from the content using a SHA-256 hash.

    Args:
    - content (str): The content for which the ID is generated.

    Returns:
    - short_id (str): The 64-character hexadecimal digest of the content.
    """
    return hashlib.sha256(content.encode("utf-8")).hexdigest()
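
One reason to derive the ID from a content hash is that it is deterministic: the same chunk always gets the same ID, so upserting it again overwrites the existing record instead of creating a duplicate. The sketch below shows that property with a plain dictionary standing in for the index; `content_id` is a hypothetical name that mirrors the function above.

```python
import hashlib


def content_id(content: str) -> str:
    """Hex SHA-256 digest of the content (same scheme as generate_short_id)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


first = content_id("ScaleX FAQ chunk")
second = content_id("ScaleX FAQ chunk")
other = content_id("ScaleX FAQ chunk, edited")

# A dict keyed by ID mimics upsert semantics: writing an existing ID overwrites it.
records = {}
for text in ["chunk A", "chunk B", "chunk A"]:
    records[content_id(text)] = text

print(first == second, first == other, len(first), len(records))
```

Note that the digest is 64 hexadecimal characters long. Truncating it would give a genuinely short ID at the cost of a higher collision risk.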

In [8]:
data_with_metadata = []

for doc_text, embedding in zip(chunked_text, chunked_embeddings):
    doc_id = generate_short_id(doc_text)
    
    data_item = {
        "id": doc_id,
        "values": embedding,
        "metadata": {"text": doc_text},  # include the text as metadata
    }
    
    data_with_metadata.append(data_item)

#### Creating the Pinecone index

In [9]:
# configure client
pc = Pinecone(api_key=pinecone_api_key)

# configure serverless spec
spec = ServerlessSpec(cloud='aws', region='us-east-1')

pc.list_indexes()

{'indexes': [{'deletion_protection': 'disabled',
              'dimension': 1536,
              'host': 'rag-chatbot-description-ott2zv7.svc.aped-4627-b74a.pinecone.io',
              'metric': 'dotproduct',
              'name': 'rag-chatbot-description',
              'spec': {'serverless': {'cloud': 'aws', 'region': 'us-east-1'}},
              'status': {'ready': True, 'state': 'Ready'}}]}

In [10]:
# check for and delete index if already exists
index_name = 'rag-chatbot-description'
if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# create a new index
pc.create_index(
    index_name,
    dimension=1536,  # dimensionality of text-embedding-ada-002
    metric='dotproduct',
    spec=spec,
)

# Wait until the index is ready
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1) 
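
The index above uses `metric='dotproduct'`. Ranking by dot product agrees with ranking by cosine similarity only when the vectors have unit length. OpenAI describes its embeddings as normalized to length 1, so that should hold here, but it is an assumption worth checking for other embedding models. A toy two-dimensional sketch of the relationship:

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def cosine(a, b):
    """Cosine similarity: dot product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = normalize([3.0, 4.0])
b = normalize([1.0, 2.0])
raw = [30.0, 40.0]  # same direction as `a`, but not unit length

print(dot(a, b), cosine(a, b))      # equal for unit vectors
print(dot(raw, b), cosine(raw, b))  # dot product is inflated by the length of `raw`
```

If the vectors were not normalized, longer vectors would score higher under dot product regardless of direction, and `metric='cosine'` would be the safer choice.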

#### Connecting to the index and upserting our knowledge base

In [11]:
# Connect to index
index = pc.Index(index_name)
time.sleep(1)
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

Upserting our dataset into the Pinecone index

In [12]:
index.upsert(vectors=data_with_metadata)

{'upserted_count': 6}
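
Six vectors fit comfortably in a single request. For a larger knowledge base, upserts are usually sent in batches, since requests have size limits (check the current Pinecone documentation for the exact values). A minimal batching sketch follows; the `batched` helper and the batch size of 100 are assumptions for illustration, and the real `index.upsert` call is left as a comment.

```python
from typing import Iterator, Sequence


def batched(items: Sequence[dict], batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Dummy records in the same shape as data_with_metadata.
records = [
    {"id": str(i), "values": [0.0], "metadata": {"text": f"chunk {i}"}}
    for i in range(7)
]
batches = list(batched(records, batch_size=3))
print([len(batch) for batch in batches])

# With the real index (not executed here):
# for batch in batched(data_with_metadata, batch_size=100):
#     index.upsert(vectors=batch)
```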

In [13]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}
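
The stats above still report `total_vector_count` as 0 although six vectors were upserted. A likely explanation is that Pinecone is eventually consistent, so the stats can lag briefly behind a write. A minimal polling sketch is shown below; `wait_for_count` and its parameters are hypothetical helpers, not part of the Pinecone client, and a fake counter is used so the example runs without an index.

```python
import time
from typing import Callable


def wait_for_count(get_count: Callable[[], int], expected: int,
                   timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Poll get_count() until it reaches `expected`; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Fake counter that reports 0 twice before the six vectors become visible.
readings = iter([0, 0, 6])
became_visible = wait_for_count(lambda: next(readings), expected=6,
                                timeout_s=5.0, interval_s=0.01)

# A counter that never catches up makes the helper give up after the timeout.
never_visible = wait_for_count(lambda: 0, expected=6,
                               timeout_s=0.05, interval_s=0.01)
print(became_visible, never_visible)

# With the real index (assumed usage, not executed here):
# wait_for_count(lambda: index.describe_index_stats()["total_vector_count"],
#                expected=len(data_with_metadata))
```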

# Answering queries using information present in the database

In [15]:
text_field = "text"  # metadata field that holds the source text in the Pinecone index
vectorstore = PineconeVectorStore(index=index, embedding=embed_model, text_key=text_field)

In [16]:
# query = "Tell me something about Scalex?"
# vectorstore.similarity_search(query, k=3)

In [17]:
# completion llm
llm = ChatOpenAI(
    openai_api_key=openai_api_key,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)

To answer each query directly, without remembering past exchanges, use `RetrievalQA.from_chain_type`

In [18]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

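query = "Tell me something about Scalex?"
# invoke() replaces the deprecated run(); RetrievalQA returns the answer under "result"
print(qa.invoke({"query": query})["result"])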

ScaleX Innovation is a pioneering leader in the realm of Generative AI and Large Language Models. They specialize in integrating transformative technologies into business strategies to enhance innovation and operational efficiency. ScaleX Innovation offers tailored solutions such as automating workflows, content analysis, and custom model implementations across multiple industry verticals. They are dedicated to ethical compliance and versatility, making them a trusted partner for businesses worldwide.


In [25]:
qa.memory

To retain the past conversation between the user and the chatbot, use `ConversationalRetrievalChain.from_llm` with `chat_history` as the memory key. Use this only when it is required, since carrying the history increases token usage.

In [19]:
memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True)

conversation_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
    memory=memory
)
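
The extra token cost comes from the history being resent with every new question. The sketch below approximates this with a whitespace word count as a crude proxy for tokens; `ToyBufferMemory` is an illustrative stand-in, not LangChain's `ConversationBufferMemory`, and the answers are made-up placeholders.

```python
class ToyBufferMemory:
    """Keeps every message, like a simple conversation buffer."""

    def __init__(self):
        self.messages = []  # list of (role, text) tuples

    def add_turn(self, question: str, answer: str) -> None:
        """Store one question/answer exchange."""
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def prompt_size(self, new_question: str) -> int:
        """Approximate prompt size in words: full history plus the new question."""
        history_words = sum(len(text.split()) for _, text in self.messages)
        return history_words + len(new_question.split())


memory = ToyBufferMemory()
sizes = []
turns = [
    ("Can you tell me about ScaleX Innovation?", "It is a company working on generative AI."),
    ("Please tell me more", "It offers workflow automation and custom models."),
    ("Which industries?", "Several industry verticals."),
]
for question, answer in turns:
    sizes.append(memory.prompt_size(question))  # size of the prompt for this turn
    memory.add_turn(question, answer)

print(sizes)
```

Even a very short follow-up question carries the whole earlier exchange with it, which is why the prompt grows with each turn.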

In [20]:
query = "Can you tell me about ScaleX Innovation?"
answer = conversation_chain.invoke({"question": query})["answer"]  # invoke() replaces the deprecated run()
print(answer)

ScaleX Innovation is a pioneering leader in the realm of Generative AI and Large Language Models. They specialize in integrating these transformative technologies into business strategies to enhance innovation and operational efficiency. ScaleX Innovation offers tailored solutions for automating workflows, content analysis, and custom model implementations. They are committed to bridging the gap between technology and business, with a focus on ethical compliance and versatility. Their expertise extends across multiple industry verticals, making them a trusted partner for businesses worldwide.


In [21]:
query = "Please tell me more"
answer = conversation_chain.invoke({"question": query})["answer"]  # invoke() replaces the deprecated run()
print(answer)

ScaleX Innovation is a pioneering leader in the realm of Generative AI and Large Language Models. They specialize in offering bespoke solutions that drive innovation, automate workflows, and enable unprecedented efficiencies for businesses. Their expertise extends across multiple industry verticals, ensuring that businesses can harness the power of AI-driven digital transformation. ScaleX Innovation is committed to bridging the gap between technology and business, offering specialized services like cross-domain consultation, business automation, and a client-centric approach. They are known for their adaptive AI solutions that can scale and adapt to diverse industrial requirements and challenges.
