# Research Assistant 

In the previous exercise, we used a vector database to retrieve records relevant to natural language queries.
Now let's create an AI application that answers questions based on what is inside our database.


## Step 0 - Setup 

Make sure you have:

* A running Weaviate DB with data from the previous exercise 
* A running Jupyter notebook with everything below installed

In [None]:
# Install the required packages
%pip install -Uqq langchain-weaviate
%pip install langchain langchain_mistralai langchain_huggingface -q
%pip install -qU weaviate-client
%pip install sentence-transformers -q
%pip install transformers -q
%pip install python-dotenv -q  # provides `dotenv.load_dotenv`, used in Step I

Note: you may need to restart the kernel to use updated packages.


## Step I - Create your VectorStore client

First you need to create a VectorStore client that will call your Weaviate DB.

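Define your Mistral API key (`MISTRAL_API_KEY`) as an environment variable, for example in a `.env` file, alongside the `WEAVIATE_URL` and `WEAVIATE_API_KEY` values used below.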

In [14]:
# Retrieve Mistral API key from .env
from dotenv import load_dotenv

load_dotenv()

True
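The later cells read `WEAVIATE_URL` and `WEAVIATE_API_KEY` from the environment, and `ChatMistralAI` looks for `MISTRAL_API_KEY` when no key is passed explicitly. A minimal sketch of a check you could run right after `load_dotenv()` is shown here; the helper name `missing_env_vars` is hypothetical and not part of any library:

```python
import os

# Names used later in this notebook; MISTRAL_API_KEY is the variable
# that ChatMistralAI reads when no api_key argument is passed.
REQUIRED_VARS = ["WEAVIATE_URL", "WEAVIATE_API_KEY", "MISTRAL_API_KEY"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Catching a missing variable here gives a clearer message than the `KeyError` raised by `os.environ["WEAVIATE_URL"]` in the next cell.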

Connect to your Weaviate database.

In [15]:
from langchain_weaviate.vectorstores import WeaviateVectorStore
import weaviate
from weaviate.classes.init import Auth
import os

weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

# Connect to Weaviate Cloud
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_api_key),
)

# client = weaviate.connect_to_local(
#     #host="host.docker.internal",  # Use host.docker.internal if you are running it inside a docker container
#     port=8080,
#     grpc_port=50051,
# )

List all existing collections in the database:

In [16]:
client.collections.list_all()

{'LangChain_5bc7e27ecd0747218db36fbf82ce55b8': _CollectionConfigSimple(name='LangChain_5bc7e27ecd0747218db36fbf82ce55b8', description=None, generative_config=None, properties=[_Property(name='text', description=None, data_type=<DataType.TEXT: 'text'>, index_filterable=True, index_range_filters=False, index_searchable=True, nested_properties=None, tokenization=<Tokenization.WORD: 'word'>, vectorizer_config=None, vectorizer='none', vectorizer_configs=None), _Property(name='sources', description="This property was generated by Weaviate's auto-schema feature on Fri May 16 13:54:08 2025", data_type=<DataType.TEXT: 'text'>, index_filterable=True, index_range_filters=False, index_searchable=True, nested_properties=None, tokenization=<Tokenization.WORD: 'word'>, vectorizer_config=None, vectorizer='none', vectorizer_configs=None), _Property(name='chunk_id', description="This property was generated by Weaviate's auto-schema feature on Fri May 16 13:54:08 2025", data_type=<DataType.NUMBER: 'numbe

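Wrap the existing collection in a vector store object.

In [17]:
from langchain_huggingface import HuggingFaceEmbeddings

# Instantiate the embedding model. This should be the same model that was used
# to index the data, so that query vectors are comparable with the stored ones.
embeddings = HuggingFaceEmbeddings()

# The documents were already loaded in the previous exercise, so we pass an
# empty list: no new documents are added, and we only get a vector store
# object bound to the existing collection.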
vectorstore = WeaviateVectorStore.from_documents(
    [],
    client=client,
    embedding=embeddings,
    index_name="LangChain_5bc7e27ecd0747218db36fbf82ce55b8",
    use_multi_tenancy=True
)



## Step II - Build your RAG Application

Now that you have your client, let's build your AI application with RAG. Here are the steps:

1. Load an LLM
2. Define a retriever object from your vector store
3. Prepare a prompt into which you can insert a question and context
4. Prepare a RAG chain with the retriever, the prompt, the LLM, and an output parser
5. Invoke the chain on some example input such as: "Tell me everything I need to know about LLMs".

In [18]:
from langchain_mistralai import ChatMistralAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate

llm = ChatMistralAI(model="mistral-large-latest")

# Retrieve the 5 most similar chunks from the "knowledge_base_llm" tenant of the knowledge base.
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5, "tenant": "knowledge_base_llm"})

# Create prompt. 
# This can also be found at hub.pull("rlm/rag-prompt")
prompt = """
You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

Question: {question} 

Context: {context} 

Answer:
"""

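# This is the basic chat prompt template.
# You can later add a `MessagesPlaceholder` to give your LLM app memory.
# `from_messages` expects a list of (role, template) pairs; here the whole
# template is sent as a single human message.
prompt = ChatPromptTemplate.from_messages([("human", prompt)])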

# Helper that joins the page content of all retrieved documents into one
# string, which is inserted at {context} in the prompt above
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# The chain receives the user's question as input.
# "context": the question goes to `retriever`, and the retrieved documents are passed to `format_docs`.
# "question": `RunnablePassthrough` forwards the question unchanged.
# Both values fill the `prompt`, which is then sent to the `llm`.
# Finally, the output is parsed as a string.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("Tell me everything I need to know about LLMs")

'LLMs, or Large Language Models, are a type of neural network trained on vast amounts of text data to understand and generate natural language. They differ from traditional programming by learning to perform tasks rather than following explicit instructions. LLMs are built using transformers, which employ an attention mechanism to understand the context of words within a sentence. These models are trained through a process involving tokenization, embedding, and transforming tokens into vectors to predict the next word based on context.\n\nKey points about LLMs include their ability to handle tasks like summarization, text generation, and question-answering. They improve through fine-tuning, where pre-trained models are adapted to specific tasks or domains. Ethical considerations, such as bias and safety, are also crucial in their development and deployment.'
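To make the first stage of the chain concrete, here is a minimal sketch of the dictionary that reaches the prompt, written with plain functions instead of LangChain runnables. `FakeDoc`, `fake_retriever`, and `build_prompt_inputs` are hypothetical stand-ins, so the sketch runs without Weaviate or an API key:

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in exposing the same attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    # Same helper as in the cell above: join page contents with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


def fake_retriever(question):
    # Stand-in for `retriever`: always returns the same two chunks.
    return [
        FakeDoc("LLMs are neural networks."),
        FakeDoc("They are built on transformers."),
    ]


def build_prompt_inputs(question):
    # Mirrors {"context": retriever | format_docs, "question": RunnablePassthrough()}
    return {"context": format_docs(fake_retriever(question)), "question": question}


inputs = build_prompt_inputs("What is an LLM?")
print(inputs["context"])
print(inputs["question"])
```

The real chain does the same thing: the question is used twice, once to fetch context and once unchanged, and both values fill the `{context}` and `{question}` slots of the prompt.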

## Step III - Verify that your model actually used the knowledge base

To make sure, check which documents the retriever returns for the same question; these are the chunks that were inserted into the prompt as context:

In [19]:
retriever.invoke("Tell me everything I need to know about LLMs")

[Document(metadata={'chunk_id': 0.0, 'sources': 'Large Language Models (LLMs) - Everything You NEED To Know.m4a'}, page_content="This video is going to give you everything you need to go from knowing absolutely nothing about artificial intelligence and large language models to having a solid foundation of how these revolutionary technologies work. Over the past year, artificial intelligence has completely changed the world, with products like ChatGPT potentially appending every single industry and how people interact with technology in general. And in this video, I will be focusing on LLMs, how they work, ethical considerations, applications, and so much more. And this video was created in collaboration with an incredible program called AI Camp, in which high school students learn all about artificial intelligence. And I'll talk more about that later in the video. Let's go. So first, what is an LLM? Is it different from AI? And how is ChatGPT related to all of this? LLMs stand for larg
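Reading full chunks can be tedious, so one option is to summarize where the retrieved chunks come from. The sketch below counts chunks per value of the `sources` metadata key visible in the output above. The helper name `count_sources` and the file names in the stand-in documents are hypothetical:

```python
from collections import Counter
from types import SimpleNamespace


def count_sources(docs, key="sources"):
    """Count how many retrieved chunks come from each source."""
    return Counter(doc.metadata.get(key, "<unknown>") for doc in docs)


# Stand-in documents with the same shape as the retriever output
# (a `metadata` dict and a `page_content` string); file names are made up.
docs = [
    SimpleNamespace(metadata={"chunk_id": 0.0, "sources": "intro.m4a"}, page_content="..."),
    SimpleNamespace(metadata={"chunk_id": 1.0, "sources": "intro.m4a"}, page_content="..."),
    SimpleNamespace(metadata={"chunk_id": 0.0, "sources": "other.m4a"}, page_content="..."),
]

for source, n_chunks in count_sources(docs).most_common():
    print(f"{source}: {n_chunks} chunk(s)")
```

In the notebook, you would call `count_sources(retriever.invoke(question))` on the real retriever output instead of the stand-in list.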

Close the Weaviate connection:

In [20]:
client.close()
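If an earlier cell raises an exception, the `client.close()` call above never runs. A minimal sketch of one way to guarantee cleanup is `contextlib.closing`, which calls `.close()` on exit even when an error occurs. `FakeClient` is a hypothetical stand-in for the Weaviate client so that the sketch runs on its own:

```python
from contextlib import closing


class FakeClient:
    """Hypothetical stand-in for the Weaviate client; only tracks whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


fake_client = FakeClient()
try:
    with closing(fake_client):
        # Simulate a query that fails while the connection is open.
        raise RuntimeError("simulated failure in a query")
except RuntimeError:
    pass

# close() was still called when the `with` block exited.
print(fake_client.closed)
```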