In [124]:
!pip3 install -qU \
    langchain==0.0.292 \
    openai==0.28.0 \
    datasets==2.10.1 \
    pinecone-client==2.2.4 \
    tiktoken==0.5.1

In [125]:
import os
from langchain.chat_models import ChatOpenAI
from dotenv import dotenv_values

config = dotenv_values('../../OpenAICourse/.env')

chat = ChatOpenAI(
    openai_api_key=config["OPENAI_API_KEY"],
    model='gpt-4'
)

In [126]:
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand string theory.")
]

In [127]:
res = chat(messages)
res

AIMessage(content="String theory is a theoretical framework in which the point-like particles of particle physics are replaced by one-dimensional objects called strings. It's one of the most promising candidates to reconcile general relativity (which describes gravity) and quantum mechanics (which describes the other three fundamental forces: electromagnetism, and the weak and strong nuclear forces).\n\nIn string theory, strings can vibrate at different frequencies, and the frequency at which a string vibrates determines the type of particle it is. For example, a string vibrating at one frequency might be an electron, while a string vibrating at another frequency might be a photon.\n\nString theory also predicts the existence of more than the three spatial dimensions that we experience in everyday life. Different versions of string theory require the existence of 10, 11, or even 26 dimensions.\n\nHowever, string theory is still a theory and has yet to be confirmed by experimental data.

In [128]:
print(res.content)

String theory is a theoretical framework in which the point-like particles of particle physics are replaced by one-dimensional objects called strings. It's one of the most promising candidates to reconcile general relativity (which describes gravity) and quantum mechanics (which describes the other three fundamental forces: electromagnetism, and the weak and strong nuclear forces).

In string theory, strings can vibrate at different frequencies, and the frequency at which a string vibrates determines the type of particle it is. For example, a string vibrating at one frequency might be an electron, while a string vibrating at another frequency might be a photon.

String theory also predicts the existence of more than the three spatial dimensions that we experience in everyday life. Different versions of string theory require the existence of 10, 11, or even 26 dimensions.

However, string theory is still a theory and has yet to be confirmed by experimental data. It's also mathematically

In [129]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Why do physicists believe it can produce a 'unified theory'?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

print(res.content)

The "unified theory," often referred to as the "theory of everything," is a theoretical framework that physicists hope will be able to explain all physical aspects of the universe. It aims to harmoniously unify the two major pillars of modern physics: general relativity, which describes gravity and large-scale phenomena, and quantum mechanics, which describes the other three fundamental forces - electromagnetism, weak nuclear force, and strong nuclear force - for small-scale particles.

The challenge is that general relativity and quantum mechanics are fundamentally different in how they describe the universe. General relativity is continuous, while quantum mechanics is discrete or quantized.

String theory is believed to have the potential to reconcile these two theories because in string theory, particles are not point-like, but are instead tiny, vibrating strings. These strings exist in multiple dimensions, and their vibration modes can correspond to the known particles, thus provid

## Dealing with Hallucinations

In [130]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="What is so special about Llama 2?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

In [131]:
print(res)

content='I believe you\'re referring to the research around the "Llama 2" antibody. This is a nanobody, a small kind of antibody found in llamas and other camelids, that has been found to neutralize the SARS-CoV-2 virus, which causes COVID-19.\n\nLlama 2 is special due to its potential application as a treatment for COVID-19. Researchers have reported that it can latch onto the spike protein of the SARS-CoV-2 virus, preventing the virus from entering and infecting cells. Because of its small size, it\'s easier to produce and potentially deliver into the body.\n\nHowever, while the results are promising, it\'s important to note that the research is still in early stages and further studies are needed to confirm the effectiveness and safety of this potential treatment.'


In [132]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Can you tell me about the LLMChain in LangChain?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

In [133]:
print(res)

content='I\'m sorry, but as of now, I couldn\'t find any specific information on "LLMChain" within the context of LangChain. LangChain is a decentralized AI training data platform built on blockchain, but there\'s no specific mention of "LLMChain" in the available resources.\n\nPerhaps you meant something else, or it could be a new development that hasn\'t been widely documented yet. If you have more context or details, I\'d be happy to help you search for more information.'


In [134]:
llmchain_information = [
    "A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.",
    "Chains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.",
    "LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those types of applications."
]

source_knowledge = "\n".join(llmchain_information)

In [135]:
query = "Can you tell me about the LLMChain in LangChain?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""

In [136]:
# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

In [137]:
print(res.content)

The LLMChain in LangChain is a common type of chain that has three main components: a PromptTemplate, a model (either an LLM or a ChatModel), and an optional OutputParser. Here's how it works:

1. It takes multiple input variables and uses the PromptTemplate to format these into a prompt.
2. This prompt is then passed to the model (either an LLM or a ChatModel).
3. If an OutputParser is provided, it's used to parse the output of the LLM into a final format.

This chain is part of the LangChain framework, which is designed to develop applications powered by language models. The framework aims to not only call out to a language model via an API, but also to connect a language model to other sources of data and allow a language model to interact with its environment. This makes it data-aware and agentic, enabling the development of powerful and differentiated applications.


In [None]:
from datasets import load_dataset

dataset = load_dataset(
    "jamescalam/llama-2-arxiv-papers-chunked",
    split="train"
)
dataset

## Building the Knowledge Base

In [139]:
import pinecone

config = dotenv_values('../../OpenAICourse/.env')

# get API key from app.pinecone.io and environment from console
pinecone.init(
    api_key=config["PINECONE_API_KEY"],
    environment='gcp-starter'
)

In [140]:
import time

index_name = 'llama-2-rag'

if index_name not in pinecone.list_indexes():
    pinecone.create_index(
        index_name,
        dimension=1536,
        metric='cosine'
    )
    # wait for index to finish initialization
    while not pinecone.describe_index(index_name).status['ready']:
        time.sleep(1)

index = pinecone.Index(index_name)

#### Confirm that we have connected to the index

In [141]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.04838,
 'namespaces': {'': {'vector_count': 4838}},
 'total_vector_count': 4838}

#### Create the embeddings

In [142]:
from langchain.embeddings.openai import OpenAIEmbeddings

embed_model = OpenAIEmbeddings(
    openai_api_key=config["OPENAI_API_KEY"],
    model="text-embedding-ada-002"
)

In [143]:
texts = [
    'this is the first chunk of text',
    'then another second chunk of text is here'
]

res = embed_model.embed_documents(texts)
len(res), len(res[0])

(2, 1536)
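The index above was created with `metric='cosine'`, so stored vectors are ranked by cosine similarity to the query vector. As a rough illustration only, here is a minimal pure-Python sketch of that measure. The helper name `cosine_similarity` and the toy 2-dimensional vectors are assumptions for this example; they are not real 1536-dimensional embeddings and not how Pinecone computes scores internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# toy vectors (hypothetical, for illustration only)
same_direction = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

print(same_direction)  # 1.0
print(orthogonal)      # 0.0
```

Vectors pointing the same way score 1.0 and unrelated (orthogonal) vectors score 0.0, which is why chunks with the highest scores are treated as the most relevant.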

In [144]:
from tqdm.auto import tqdm  # for progress bar

data = dataset.to_pandas()  # this makes it easier to iterate over the dataset

batch_size = 100

for i in tqdm(range(0, len(data), batch_size)):
    i_end = min(len(data), i+batch_size)
    # get batch of data
    batch = data.iloc[i:i_end]
    # generate unique ids for each chunk
    ids = [f"{x['doi']}-{x['chunk-id']}" for _, x in batch.iterrows()]
    # get text to embed
    texts = [x['chunk'] for _, x in batch.iterrows()]
    # embed text
    embeds = embed_model.embed_documents(texts)
    # get metadata to store in Pinecone
    # (use `_` for the row label so the outer loop variable `i` is not shadowed)
    metadata = [
        {'text': x['chunk'],
         'source': x['source'],
         'title': x['title']} for _, x in batch.iterrows()
    ]
    # add to Pinecone
    index.upsert(vectors=zip(ids, embeds, metadata))

100%|██████████| 49/49 [02:17<00:00,  2.81s/it]
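The progress bar shows 49 iterations because the loop walks the data in steps of `batch_size`. The sketch below reproduces just that slicing arithmetic, assuming 4838 rows (the `vector_count` reported by the index stats). The helper name `batch_ranges` is hypothetical and is not part of the notebook.

```python
def batch_ranges(n_items: int, batch_size: int):
    """Yield (start, end) index pairs that cover range(n_items) in order."""
    for start in range(0, n_items, batch_size):
        # the final batch is clipped so it never runs past the data
        yield start, min(n_items, start + batch_size)


# assumed row count, taken from the index stats shown in this notebook
ranges = list(batch_ranges(4838, 100))

print(len(ranges))  # 49
print(ranges[0])    # (0, 100)
print(ranges[-1])   # (4800, 4838)
```

The last batch holds only 38 rows, so the upsert loop has to tolerate a short final batch, which the `min(...)` call handles.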


In [145]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.04838,
 'namespaces': {'': {'vector_count': 4838}},
 'total_vector_count': 4838}

#### Retrieval Augmented Generation

In [None]:
from langchain.vectorstores import Pinecone

text_field = "text"  # the metadata field that contains our text

# initialize the vector store object
vectorstore = Pinecone(
    index, embed_model.embed_query, text_field
)

In [147]:
query = "What is so special about Llama 2?"

vectorstore.similarity_search(query, k=3)

[Document(page_content='Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang\nRoss Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang\nAngela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic\nSergey Edunov Thomas Scialom\x03\nGenAI, Meta\nAbstract\nIn this work, we develop and release Llama 2, a collection of pretrained and ﬁne-tuned\nlarge language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.\nOur ﬁne-tuned LLMs, called L/l.sc/a.sc/m.sc/a.sc /two.taboldstyle-C/h.sc/a.sc/t.sc , are optimized for dialogue use cases. Our\nmodels outperform open-source chat models on most benchmarks we tested, and based on\nourhumanevaluationsforhelpfulnessandsafety,maybeasuitablesubstituteforclosedsource models. We provide a detailed description of our approach to ﬁne-tuning and safety', metadata={'source': 'http://arxiv.org/pdf/2307.09288', 'title': 'Llama 2: Open Foundation and Fine-Tun

In [148]:
def augment_prompt(query: str):
    # get top 3 results from knowledge base
    results = vectorstore.similarity_search(query, k=3)
    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt

In [149]:
print(augment_prompt(query))

Using the contexts below, answer the query.

    Contexts:
    Alan Schelten Ruan Silva Eric Michael Smith Ranjan Subramanian Xiaoqing Ellen Tan Binh Tang
Ross Taylor Adina Williams Jian Xiang Kuan Puxin Xu Zheng Yan Iliyan Zarov Yuchen Zhang
Angela Fan Melanie Kambadur Sharan Narang Aurelien Rodriguez Robert Stojnic
Sergey Edunov Thomas Scialom
GenAI, Meta
Abstract
In this work, we develop and release Llama 2, a collection of pretrained and ﬁne-tuned
large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
Our ﬁne-tuned LLMs, called L/l.sc/a.sc/m.sc/a.sc /two.taboldstyle-C/h.sc/a.sc/t.sc , are optimized for dialogue use cases. Our
models outperform open-source chat models on most benchmarks we tested, and based on
ourhumanevaluationsforhelpfulnessandsafety,maybeasuitablesubstituteforclosedsource models. We provide a detailed description of our approach to ﬁne-tuning and safety
asChatGPT,BARD,andClaude. TheseclosedproductLLMsareheavilyﬁne-tunedtoalignwith
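Note that the printed prompt carries four leading spaces on `Contexts:` and on the first context line, because the f-string is indented inside the function body. One possible way to avoid this is sketched below. It is a variant, not the notebook's own function: `build_augmented_prompt` and `fake_retrieve` are hypothetical names, and `fake_retrieve` is a stand-in for a real retriever such as `vectorstore.similarity_search`.

```python
from typing import Callable, List


def build_augmented_prompt(query: str, retrieve: Callable[[str], List[str]]) -> str:
    """Build a RAG prompt from retrieved text chunks without stray indentation."""
    contexts = retrieve(query)
    source_knowledge = "\n".join(contexts)
    # adjacent string literals are concatenated, so no source-code
    # indentation leaks into the prompt text
    return (
        "Using the contexts below, answer the query.\n\n"
        "Contexts:\n"
        f"{source_knowledge}\n\n"
        f"Query: {query}"
    )


def fake_retrieve(query: str) -> List[str]:
    # hypothetical stand-in for a vector store lookup
    return ["first retrieved chunk", "second retrieved chunk"]


prompt_text = build_augmented_prompt("What is so special about Llama 2?", fake_retrieve)
print(prompt_text)
```

Passing the retriever in as an argument also makes the prompt builder easy to check without an API key or a live index.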

In [150]:
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat(messages)

print(res.content)

LLMChain in LangChain is a common type of chain used for developing applications powered by language models. It consists of a PromptTemplate, a language model (either an LLM or a ChatModel), and an optional output parser.

Here's the procedure:

1. The chain takes multiple input variables.
2. It uses the PromptTemplate to format these inputs into a prompt.
3. This prompt is then passed to the model.
4. The OutputParser (if provided) is used to parse the output of the LLM into a final format.

This sequence of modular components allows a specific task to be accomplished. 

LangChain, the framework in which LLMChain operates, is designed to enable applications, not just to call out to a language model, but also to connect a language model to other sources of data and allow the language model to interact with its environment, which can be very powerful for creating data-aware, interactive applications.


In [151]:
prompt = HumanMessage(
    content=augment_prompt(
        "what safety measures were used in the development of llama 2?"
    )
)

res = chat(messages + [prompt])
print(res.content)

The context provided doesn't directly specify the safety measures used in the development of Llama 2. However, it does mention that measures were taken to increase the safety of the models. This included safety-specific data annotation and tuning, conducting red-teaming (a strategy where internal teams simulate potential attacks or exploits to test the system), and employing iterative evaluations. The goal of these measures is to fine-tune the models to increase their safety, improve their alignment with human values, and reduce the chances of harmful or biased outputs.
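The last cell calls `chat(messages + [prompt])` instead of appending to `messages`, so the long augmented prompt is sent once but not kept in the chat history. The difference between the two patterns can be sketched with plain strings standing in for message objects (the variable names here are illustrative assumptions):

```python
# plain strings stand in for SystemMessage / HumanMessage / AIMessage objects
history = ["system: You are a helpful assistant.", "user: hi", "ai: hello"]
one_off = "user: <augmented prompt with retrieved contexts>"

# concatenation builds a new list; the stored history is left untouched
sent = history + [one_off]
print(len(history), len(sent))  # 3 4

# append mutates the history in place, so later calls would resend the prompt
history.append(one_off)
print(len(history))  # 4
```

Keeping large retrieved contexts out of the stored history is one way to limit how many tokens are resent on every later turn.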
