# Chat With Your Data

## Build prompt chains with retrievers

## Install libraries

In [None]:
pip install openai

In [None]:
pip install python-dotenv

In [None]:
pip install langchain

In [None]:
pip install langchain-openai

In [None]:
pip install pypdf

In [None]:
pip install faiss-cpu

In [None]:
pip install langchainhub

## Helper functions

In [1]:
def print_output(docs):
    """Print the text and metadata of each retrieved document."""
    for doc in docs:
        print('The output is: {}. \n\nThe metadata is {} \n\n'.format(doc.page_content, doc.metadata))

## Load OpenAI API Key

In [3]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

OPENAI_API_KEY=os.environ['OPENAI_API_KEY']
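Note that `os.environ['OPENAI_API_KEY']` raises a bare `KeyError` when the variable is not set. A minimal sketch of a more explicit check, assuming a hypothetical helper named `require_env` (it is not part of the original notebook or of python-dotenv):

```python
import os

def require_env(name):
    """Return the value of an environment variable or fail with a clear message.

    `require_env` is a hypothetical helper introduced for illustration only.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            "Environment variable {} is not set; add it to your .env file.".format(name)
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` after `load_dotenv(...)` would then either return the key or stop the notebook with a message that names the missing variable.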

## Prompt model with no knowledge of the Voynich manuscript

In [7]:
from langchain_openai import ChatOpenAI

# Initialize the LLM we'll use: OpenAI GPT-3.5 Turbo
llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-3.5-turbo-0125")

In [5]:
# Prompt the model with no knowledge of the Voynich manuscript beyond pretraining
llm.invoke("What are the medicinal insights from the Voynich manuscript?")

AIMessage(content='The Voynich manuscript is a mysterious 15th-century book written in an unknown script and containing illustrations of plants, astronomical diagrams, and human figures. While the manuscript has not been deciphered and its contents remain a mystery, some researchers have speculated that it may contain medicinal insights.\n\nSome interpretations suggest that the illustrations of plants in the manuscript could be botanical drawings and that the text may contain information about their medicinal properties and uses. Some researchers have also proposed that the manuscript may contain information about alchemy, astrology, or other esoteric practices that were believed to have medicinal benefits in medieval times.\n\nHowever, without a definitive translation of the text, it is impossible to know for certain what insights, if any, the Voynich manuscript may contain about medicine. The manuscript continues to intrigue researchers and scholars, and efforts to decipher its conte

In [6]:
llm.invoke("What is Aetherfloris Ventus?")

AIMessage(content='Aetherfloris Ventus is a term that does not have a specific meaning or definition in mainstream English or any known language. It may be a made-up or fictional term used in a specific context or fictional universe.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 16, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None})

## Load vector database from disk

In [22]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS


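# The saved index is loaded with pickle, so only enable
# allow_dangerous_deserialization for index files you created or trust.
db = FAISS.load_local("../faiss_index",
                      OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model="text-embedding-3-small"),
                      allow_dangerous_deserialization=True)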

## Configure retriever
### Use the similarity search capabilities of a vector store to facilitate retrieval

In [40]:
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 6})
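To make the idea of a top-`k` similarity search concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not what FAISS does internally: it uses cosine similarity on toy 2-dimensional vectors, whereas the real index may rank by a different metric (such as Euclidean distance) over high-dimensional embeddings. The names `cosine_similarity`, `top_k`, `toy_docs`, and `toy_query` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(doc_vecs)]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

# Toy "embeddings" standing in for document chunks (made-up values)
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
toy_query = [1.0, 0.0]

top_two = top_k(toy_query, toy_docs, k=2)
print(top_two)
```

With `search_kwargs={"k": 6}`, the retriever in this notebook returns the six closest chunks in the same spirit; a larger `k` gives the model more context at the cost of a longer prompt.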

## Implement a chain
### Chain together multiple calls in a logical sequence

In [57]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

In [58]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))])

In [61]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Combine multiple steps in a single chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()  # convert the chat message to a string
)
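The data flow through the chain can be sketched with plain functions. This is a simplified stand-in, not LangChain's implementation: `Doc`, `fake_retriever`, `fake_llm`, `build_prompt`, and `run_chain` are hypothetical names, and `TEMPLATE` is an abbreviated version of the hub prompt shown above.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str

def fake_retriever(question):
    # A real retriever would search the vector store; this returns fixed chunks.
    return [Doc("First chunk."), Doc("Second chunk.")]

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Abbreviated version of the rlm/rag-prompt template
TEMPLATE = "Question: {question} \nContext: {context} \nAnswer:"

def build_prompt(question):
    # Step 1: the question goes to the retriever, and the results are formatted.
    # Step 2: the question itself is passed through unchanged.
    inputs = {
        "context": format_docs(fake_retriever(question)),
        "question": question,
    }
    # Step 3: both values fill the prompt template.
    return TEMPLATE.format(**inputs)

def fake_llm(prompt_text):
    # A real model would generate an answer; this only reports the prompt size.
    return "stub answer ({} chars of prompt)".format(len(prompt_text))

def run_chain(question):
    # Step 4: the filled prompt is sent to the model, which returns a string.
    return fake_llm(build_prompt(question))

print(build_prompt("What is Aetherfloris Ventus?"))
```

The dictionary at the head of `rag_chain` plays the role of `inputs` here: the same question feeds both the retrieval branch and the pass-through branch before the prompt is filled.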

## Send LLM's response to the user

In [44]:
for chunk in rag_chain.stream("What are the medicinal insights from the Voynich manuscript?"):
    print(chunk, end="", flush=True)

The Voynich manuscript contains detailed anatomical diagrams of mythical beings with annotations on organ functions, possibly used for medicinal or alchemical purposes. The manuscript also features depictions of mysterious herbs with unique properties, suggesting their medicinal uses. Additionally, the manuscript includes illustrations of herbal brews and potions with specific effects, blending the alchemy of flavors with practical pharmacology.

In [48]:
for chunk in rag_chain.stream("What is Aetherfloris Ventus?"):
    print(chunk, end="", flush=True)

Aetherfloris Ventus is a delicate, ethereal plant with petals lighter than air, appearing to float freely. Its stem is nearly invisible, dancing with the breeze in a delicate ballet. Extracts from Aetherfloris Ventus are believed to induce levity in body and mind.

In [60]:
for chunk in rag_chain.stream("What's the most important part of the Voynich manuscript?"):
    print(chunk, end="", flush=True)

The most important part of the Voynich manuscript is the detailed anatomical diagrams of mythical beings, possibly used for medicinal or alchemical purposes. These illustrations offer a glimpse into ancient medical knowledge intertwined with fantasy and provide insights into the functions of various organs and their importance in practices like alchemy and medicine. The annotations accompanying the diagrams suggest practical applications for these fantastical anatomical structures.
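The streaming loops above print each chunk as it arrives instead of waiting for the full answer. A minimal sketch of that pattern, assuming a hypothetical generator `fake_stream` that stands in for `rag_chain.stream(...)` and yields word-sized chunks (real chunk boundaries are decided by the model API and may differ):

```python
def fake_stream(text):
    """Yield the text piece by piece, the way a streaming response arrives."""
    words = text.split(" ")
    for i, word in enumerate(words):
        # Re-attach the separating space to every chunk except the last one
        yield word if i == len(words) - 1 else word + " "

answer = "Extracts are believed to induce levity in body and mind."

# end="" avoids a newline after each chunk; flush=True shows it immediately
for chunk in fake_stream(answer):
    print(chunk, end="", flush=True)
print()
```

Concatenating the chunks reproduces the complete answer, so the same loop can also collect them into a list when the full text is needed afterwards.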