## Chatbot without RAG

In [1]:
import os
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI

# read OPENAI_API_KEY from a local .env file or the environment;
# never hard-code the key in the notebook
load_dotenv()

chat = ChatOpenAI(
    openai_api_key=os.environ["OPENAI_API_KEY"],
    model='gpt-3.5-turbo'
)


In [None]:
Chats with OpenAI's `gpt-3.5-turbo` and `gpt-4` chat models are typically structured (in plain text) like this:

```
System: You are a helpful assistant.

User: Hi AI, how are you today?

Assistant: I'm great thank you. How can I help you?

User: I'd like to understand string theory.

Assistant:
```

The final `"Assistant:"` without a response is what would prompt the model to continue the conversation. In the official OpenAI `ChatCompletion` endpoint these would be passed to the model in a format like:

```python
[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi AI, how are you today?"},
    {"role": "assistant", "content": "I'm great thank you. How can I help you?"},
    {"role": "user", "content": "I'd like to understand string theory."}
]
```
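
To make the link between the two representations concrete, here is a minimal sketch that renders a list of role/content dictionaries as the plain-text transcript shown earlier. The helper `to_transcript` and the `ROLE_LABELS` mapping are hypothetical names used for illustration only; they are not part of the OpenAI or LangChain APIs.

```python
# Role names used by the OpenAI chat format, mapped to the labels
# used in the plain-text transcript.
ROLE_LABELS = {"system": "System", "user": "User", "assistant": "Assistant"}


def to_transcript(messages):
    """Render a list of {"role", "content"} dicts as a plain-text chat."""
    lines = [f"{ROLE_LABELS[m['role']]}: {m['content']}" for m in messages]
    # The trailing "Assistant:" is the open slot the model fills in.
    lines.append("Assistant:")
    return "\n\n".join(lines)


example_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi AI, how are you today?"},
]
transcript = to_transcript(example_messages)
print(transcript)
```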

In LangChain there is a slightly different format. We use three types of _message_ object, like so:

In [3]:
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand string theory.")
]
res = chat(messages)
res

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors..

RateLimitError: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.

In [None]:
print(res.content)

In [None]:
String theory is a theoretical framework in physics that aims to explain the fundamental particles and forces in the universe. It suggests that the basic building blocks of matter are not point-like particles, but tiny, vibrating strings. These strings can vibrate in different ways, giving rise to various particles and their properties.

Here are a few key points about string theory:

1. Fundamental particles: In string theory, particles such as electrons and quarks are not considered as point-like objects but rather as tiny vibrating strings. The different vibrational patterns of these strings correspond to different types of particles and their properties, such as mass and charge.

2. Extra dimensions: String theory also proposes that the universe has more than the three spatial dimensions (length, width, and height) that we are familiar with. It suggests the existence of additional compactified dimensions, which are curled up and not directly observable at our energy scales.

3. Unification of forces: One of the main goals of string theory is to unify all the fundamental forces of nature, including gravity, electromagnetism, and the strong and weak nuclear forces. This unification is achieved by treating these forces as different manifestations of the vibrations of the fundamental strings.

4. Mathematical framework: String theory requires a mathematical framework beyond classical physics called quantum mechanics. It incorporates concepts from both quantum mechanics and general relativity to describe the behavior of strings and their interactions.

5. String landscape: String theory predicts the existence of a vast number of possible solutions or configurations, often referred to as the "string landscape." Each configuration corresponds to a different universe with its own set of physical laws and properties. This idea has implications for the anthropic principle, suggesting that our universe may be just one among many possible universes.

It's important to note that string theory is still a highly theoretical and active area of research, and many aspects of it are still not fully understood. Scientists continue to explore and develop the theory to better understand the fundamental nature of our universe.

In [None]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Why do physicists believe it can produce a 'unified theory'?"
)
# add to messages
messages.append(prompt)

# send to chat-gpt
res = chat(messages)

print(res.content)
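
The cells above follow a simple pattern: the model keeps no state between calls, so every turn appends to the message list and sends the whole list again. A minimal sketch of that pattern, using plain dictionaries and a stand-in model rather than a real API call (`chat_turn` and `echo_model` are hypothetical names for illustration):

```python
def chat_turn(history, user_text, model):
    """Append the user message, call the model on the full history,
    then append and return the reply."""
    history.append({"role": "user", "content": user_text})
    reply = model(history)  # the model sees every earlier message
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_model(history):
    # Hypothetical stand-in for a real chat model: it only reports
    # how many messages it was given.
    return f"I received {len(history)} messages."


history = [{"role": "system", "content": "You are a helpful assistant."}]
first = chat_turn(history, "Hello", echo_model)
second = chat_turn(history, "And again", echo_model)
print(first)
print(second)
```

Because the history grows by two messages per turn, long conversations eventually need trimming or summarizing to stay within the model's context window.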

### Dealing with hallucinations

In [None]:
We have our chatbot, but as mentioned, the knowledge of LLMs can be limited. The reason is that LLMs learn everything they know during training. An LLM essentially compresses the "world" as seen in the training data into the internal parameters of the model. We call this knowledge the parametric knowledge of the model.

By default, LLMs have no access to the external world.

This limitation becomes obvious when we ask an LLM about information newer than its training data, such as the new (and very popular) Llama 2 LLM.


In [None]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="What is so special about Llama 2?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

In [None]:
print(res.content)

#### Source knowledge (feeding knowledge into the LLM)

In [None]:
llmchain_information = [
    "A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.",
    "Chains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.",
    "LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those types of applications."
]

source_knowledge = "\n".join(llmchain_information)

##### Feeding additional information to the LLM

In [None]:
query = "Can you tell me about the LLMChain in LangChain?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""

##### Feeding the above into the chatbot

In [None]:
# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

#### Importing data

In [None]:
from datasets import load_dataset

dataset = load_dataset(
    "jamescalam/llama-2-arxiv-papers-chunked",
    split="train"
)

dataset

#### Building the knowledge base

In [None]:
import pinecone

# get API key from app.pinecone.io and environment from console
pinecone.init(
    api_key=os.environ.get('PINECONE_API_KEY') or '',
    environment=os.environ.get('PINECONE_ENVIRONMENT') or 'gcp-starter'
)

##### Initialize the index (OpenAI's text-embedding-ada-002 model produces 1536-dimensional embeddings, so set dimension to 1536)

In [None]:
import time

index_name = 'llama-2-rag-exa'

if index_name not in pinecone.list_indexes():
    pinecone.create_index(
        index_name,
        dimension=1536,
        metric='cosine'
    )
    # wait for index to finish initialization
    while not pinecone.describe_index(index_name).status['ready']:
        time.sleep(1)

index = pinecone.Index(index_name)

In [None]:
# view the index stats
index.describe_index_stats()

#### Creating vector embeddings with the text-embedding-ada-002 model

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings

embed_model = OpenAIEmbeddings(
    model="text-embedding-ada-002",
    openai_api_key=os.environ["OPENAI_API_KEY"]
)

In [None]:
# texts = [
#     'this is the first chunk of text',
#     'then another second chunk of text is here'
# ]

# res = embed_model.embed_documents(texts)
# len(res), len(res[0])

#### Embedding and indexing the data (loop through the data in batches, embedding and inserting each batch)

In [None]:
from tqdm.auto import tqdm  # for progress bar

data = dataset.to_pandas()  # this makes it easier to iterate over the dataset

batch_size = 100

for i in tqdm(range(0, len(data), batch_size)):
    i_end = min(len(data), i+batch_size)
    # get batch of data
    batch = data.iloc[i:i_end]
    # generate unique ids for each chunk
    # use `_` for the row label so the outer loop variable `i` is not shadowed
    ids = [f"{x['doi']}-{x['chunk-id']}" for _, x in batch.iterrows()]
    # get text to embed
    texts = [x['chunk'] for _, x in batch.iterrows()]
    # embed text
    embeds = embed_model.embed_documents(texts)
    # get metadata to store in Pinecone
    metadata = [
        {'text': x['chunk'],
         'source': x['source'],
         'title': x['title']} for _, x in batch.iterrows()
    ]
    # add to Pinecone
    index.upsert(vectors=zip(ids, embeds, metadata))
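
The batching arithmetic in the loop above (`i` and `i_end`) can be checked in isolation. A minimal sketch, where `batch_ranges` is a hypothetical helper and the total of 250 rows is an arbitrary example, not the size of the dataset:

```python
def batch_ranges(total, batch_size):
    """Yield (start, end) index pairs covering range(total) in batches."""
    for start in range(0, total, batch_size):
        # the last batch is clipped so it never runs past the data
        yield start, min(total, start + batch_size)


ranges = list(batch_ranges(250, 100))
print(ranges)  # [(0, 100), (100, 200), (200, 250)]
```

The final batch is shorter than `batch_size` whenever the total is not an exact multiple, which is why the loop uses `min(len(data), i+batch_size)`.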

In [None]:
# check that the vectors were added to the index
index.describe_index_stats()

### RAG

#### Connecting knowledge base to chatbot

In [None]:
from langchain.vectorstores import Pinecone

text_field = "text"  # the metadata field that contains our text

# initialize the vector store object
vectorstore = Pinecone(
    index, embed_model.embed_query, text_field
)

#### Using the vectorstore

In [None]:
query = "What is so special about Llama 2?"

source_documents = vectorstore.similarity_search(query, k=3)
source_documents
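
Conceptually, a similarity search embeds the query and returns the `k` stored vectors that score highest under the index metric (cosine similarity for the index created above). The following is a brute-force sketch of that idea on toy two-dimensional vectors. It is not how Pinecone is implemented; vector databases typically rely on approximate nearest-neighbour search. `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k most similar doc vectors, best first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# toy "embeddings" standing in for real 1536-dimensional vectors
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
toy_query = [1.0, 0.1]
best = top_k(toy_query, toy_docs, k=2)
print(best)  # [0, 2]
```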

#### Parsing information with the LLM

In [None]:
def augment_prompt(query: str):
    # get top 3 results from knowledge base
    results = vectorstore.similarity_search(query, k=3)
    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.
    Contexts:
    {source_knowledge}
    Query: {query}"""
    return augmented_prompt

In [None]:
print(augment_prompt(query))

In [None]:
# pass the augmented prompt to the chat model to see how it performs
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat(messages)

print(res.content)

### RAGAS Evaluation

In [None]:
from ragas.langchain.evalchain import RagasEvaluatorChain
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_relevancy,
    context_recall,
)

In [None]:
ground_truths = 'Llama 2 is a collection of pretrained and fine-tuned large language models (LLMs) developed by the researchers.'
responses = {
    'query': query,
    'result': res.content,
    'source_documents': source_documents,
    'ground_truths': ground_truths
}