# Introduction
This notebook implements a Retrieval-Augmented Generation (RAG) application orchestrated by LangChain.
There are 2 processes to implement: the setup and the RAG pipeline. \
The setup process is as follows:
1. Get the source documents or data
2. Embed and store the documents in a vector database

The quality of the embedding process is important because the embeddings act as the "middle man" between the retriever and the data that we have.

The RAG pipeline is as follows:
1. Get the input query
2. Retrieve data related to the query from the vector database (Query Translation)
3. Pass the query together with the relevant data into a Large Language Model (LLM)
4. The LLM generates an answer to the query based on the given relevant data.
5. Check if the generated answer is factually correct or found in the retrieved data
6. If step 5 fails, go back to step 2
7. Check if the generated answer answers the query.
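
The control flow of the steps above can be sketched in plain Python. This is a minimal sketch, not the notebook's implementation: `retrieve`, `generate`, `is_grounded`, and `answers_question` are hypothetical stand-ins for the retriever, the LLM, and the two checks. The retry limit, and retrying when step 7 fails, are assumptions added so the loop always terminates.

```python
from typing import Callable, List, Optional


def run_rag(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    answers_question: Callable[[str, str], bool],
    max_retries: int = 3,  # assumption: cap on retrieval retries
) -> Optional[str]:
    """Return a grounded answer, or None if none is found within max_retries."""
    for _ in range(max_retries):
        context = retrieve(query)             # step 2
        answer = generate(query, context)     # steps 3-4
        if not is_grounded(answer, context):  # step 5
            continue                          # step 6: go back to retrieval
        if answers_question(query, answer):   # step 7
            return answer
    return None


# Toy stand-ins so the sketch runs without an LLM or a vector database.
corpus = ["Lamu has blue flowers.", "Langs hunt Yangs."]


def toy_retrieve(query: str) -> List[str]:
    words = query.lower().split()
    return [doc for doc in corpus if any(word in doc.lower() for word in words)]


def toy_generate(query: str, context: List[str]) -> str:
    return context[0] if context else ""


def toy_grounded(answer: str, context: List[str]) -> bool:
    return answer in context


def toy_answers(query: str, answer: str) -> bool:
    return bool(answer)


result = run_rag("lamu flowers", toy_retrieve, toy_generate, toy_grounded, toy_answers)
print(result)
```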


The most critical step of this pipeline is retrieving the right data from the corpus. When the corpus is large, the retrieved snippets must be the ones most relevant to the query. There are 3 techniques of data retrieval implemented in this application, each with its own strengths and weaknesses:
1. Multi Query
2. RAG Fusion
3. Decomposition

The embedding and retrieval processes go hand in hand to create a quality RAG application.

## Embedding Model and LLM
For this application the models used are:\
Embedding: CohereEmbeddings\
LLM: Cohere Command-R

The team also tried Ollama Llama3 and Mistral as LLMs, and HuggingFace all-MiniLM-L6-v2, Ollama Mistral, and Ollama Llama3 as embedding models. HuggingFace all-MiniLM-L6-v2 did a decent job with the embeddings, while Ollama Mistral and Llama3 performed poorly. As an LLM, Llama3 was not able to follow directions from the user input, and it got worse as the input grew larger with the retrieved data.

Both of these models were run locally. Developing and debugging on subpar machines would be impossible, as one query could take from 5 minutes to 1 hour depending on the complexity of the pipeline.

If no powerful computer is on hand, the team suggests using cloud computing. OpenAI is a popular choice. The team was able to find Cohere, but it is only free for personal use and still has its limitations.

Cohere was also able to generate accurate results.

### Some considerations

The HuggingFace community hosts a leaderboard of top-performing embedding models. You may find it here: https://huggingface.co/spaces/mteb/leaderboard

# Dependencies
To start, install the required dependencies

In [None]:
!pip install langchain
!pip install langchain_cohere
!pip install ctransformers
!pip install ctransformers[cuda]
!pip install langchain_community
!pip install colab-xterm
!pip install sentence-transformers
!pip install chromadb
!pip install langgraph


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import os
from langchain.llms import Ollama
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import  SentenceTransformerEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain.prompts import ChatPromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain.load import dumps, loads
from operator import itemgetter

import uuid
from langchain_core.documents import Document
from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryByteStore
from langchain_cohere import CohereEmbeddings
from langchain_cohere import ChatCohere
import cohere

from langgraph.graph import END, StateGraph
from typing_extensions import TypedDict
from langchain_core.output_parsers import JsonOutputParser

Set up the LLM and embedding model with their API keys

In [None]:
# @title Setting up the LLM
os.environ['COHERE_API_KEY'] = "[INSERT YOUR API KEY HERE]"
llm = ChatCohere(model="command-r", temperature=0)

# Setup Process

## Sourcing Data

There are multiple ways to get the raw data. In this project, local files are accessed: every file with a .txt extension within a directory is loaded. The loaded files are stored in lists called docs and docs2.

In [None]:
loader = DirectoryLoader("/content/drive/MyDrive/LLM/corpus", glob="./*.txt", loader_cls=TextLoader)
docs = loader.load()
loader2  = DirectoryLoader("/content/drive/MyDrive/LLM/corpus/topic2", glob="./*.txt", loader_cls=TextLoader)
docs2 = loader2.load()

For further reading on different ways to source data, see: https://python.langchain.com/v0.1/docs/modules/data_connection/document_loaders/

## Embed and Store documents

There are multiple methods to embed documents in a vector database. The team tried 2 of them:
1. Split the documents into smaller chunks and embed each split
2. Summarize each document and embed these summaries in the vector database.

By the end of this step, you will have a retriever object that the application can invoke to get relevant data based on a query.
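
To make the idea of a retriever concrete, here is a minimal sketch of similarity-based retrieval in plain Python. It is not how Chroma or Cohere work internally: `toy_embed` is a hypothetical stand-in for an embedding model (it only counts words), and the documents are made up for illustration. The point is only that both the query and the documents are turned into vectors, and the closest documents are returned.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag of word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, documents: list, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    query_vector = toy_embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


documents = [
    "lamu plants have blue flowers",
    "time travel can cause temporal disorientation",
]
best = top_k("what color are lamu flowers", documents)
print(best)
```

A real embedding model captures meaning rather than exact word overlap, which is why the choice of model matters so much for retrieval quality.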

### Split and Embed

This process splits the documents into smaller chunks and embeds them in the vector database. **The length of each chunk is important to the output of the RAG application.** If chunks are too long, the LLM has difficulty processing the query because there is too much information. If they are too short, the LLM has difficulty connecting concepts together, since related data ends up far apart across the splits.\
The team decided on a chunk length of 2500 characters with a 20% overlap, which equals 500 characters. You can experiment with these values yourself.
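
The arithmetic behind the overlap can be illustrated with a simple sliding window. This is a simplified sketch under stated assumptions: `sliding_chunks` is a hypothetical helper, and unlike `RecursiveCharacterTextSplitter` it cuts at fixed character positions instead of preferring separators such as paragraph breaks.

```python
def sliding_chunks(text: str, chunk_size: int = 2500, overlap_ratio: float = 0.2) -> list:
    """Cut text into fixed-size chunks where consecutive chunks share an overlap."""
    overlap = round(chunk_size * overlap_ratio)  # 2500 * 0.2 = 500 characters
    step = chunk_size - overlap                  # each chunk starts 2000 characters after the previous one
    last_start = max(len(text) - overlap, 1)     # avoid a final chunk that is pure overlap
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


# A 6000-character toy text made of repeating digits.
text = "".join(str(i % 10) for i in range(6000))
chunks = sliding_chunks(text)
print([len(chunk) for chunk in chunks])
```

The last 500 characters of one chunk are repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.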

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2500, chunk_overlap=500)

splits = text_splitter.split_documents(docs)
splits2 = text_splitter.split_documents(docs2)


vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=CohereEmbeddings(),
    persist_directory='/content/drive/MyDrive/LLM/Chroma/db1'
)

vectorstore2 = Chroma.from_documents(
    documents=splits2,
    embedding=CohereEmbeddings(),
    persist_directory='/content/drive/MyDrive/LLM/Chroma/db2'
)

retriever = vectorstore.as_retriever()
retriever2 = vectorstore2.as_retriever()

In [None]:
question = "What are the physical characteristics of lamu?"

retrieved_docs = retriever.invoke(question)  # invoke() replaces the deprecated get_relevant_documents()
retrieved_docs



[Document(page_content='Lamu (Floribunda miraculum)\n\nClassification:  \nKingdom: Plantae  \nPhylum: Angiosperms  \nClass: Eudicots  \nOrder: Lamiales  \nFamily: Lamaceae  \nGenus: Floribunda  \nSpecies: F. miraculum\n\nPhysical Characteristics:  \nThe Lamu plant, also known as the Miracle Bloom, is characterized by its lush, verdant foliage and vibrant blue flowers that bloom twice annually. It typically reaches a height of 0.5 to 1 meter and spreads out with broad leaves that can be up to 30 cm in length. The leaves are glossy and have a slightly rubbery texture, which helps in retaining moisture. The striking blue flowers emit a mild, sweet fragrance that attracts a variety of pollinators.\n\nGrowth and Development:  \nLamu plants are hardy and can thrive in a range of soil types, though they prefer well-drained, fertile soil and partial shade conditions. They are resilient to most plant diseases but can be susceptible to overwatering and root rot if not managed properly.\n\nEcolog

In [None]:
question = "How does Time Travel work?"

retrieved_docs = retriever2.vectorstore.similarity_search_with_relevance_scores(question)
retrieved_docs



[(Document(page_content="The Time Traveler's Manual to Avoiding Temporal Disorientation\nMay 2024\n\nWelcome, Time Traveler!\n\nCongratulations on embarking on the extraordinary journey through time. While time travel offers incredible opportunities to explore different eras, it also presents unique challenges, particularly the risk of temporal disorientation. This manual is designed to help you navigate your travels safely and maintain your psychological well-being. Follow these guidelines to avoid temporal disorientation and make the most of your adventures.\n\n1. Preparation Before Time Travel\nA. Understand Temporal Disorientation:\n\nDefinition: Temporal disorientation is a psychological condition characterized by confusion or a distorted sense of time.\nSymptoms: Difficulty determining the current date or time, memory disturbances, feeling disconnected from the present, and emotional distress.\nB. Mental Health Check:\n\nConsult a Psychologist: Before your journey, have a thoroug

LangChain also offers a method that splits the documents into very small chunks (children) for embedding, but at retrieval time returns the larger chunks they came from (parent documents). This may ease the difficulty of connecting concepts that are far apart from one another. For further reading on this, see: https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/parent_document_retriever/

### Summarize and Embed

There's another technique that the team has implemented which summarizes each document. These summaries will then be embedded into the vector database. This helps the retriever understand the contents of a particular chunk or document which could help it retrieve more relevant information.

First, we need to summarize each document and store it in a variable called *summaries*.

In [None]:
chain = (
    {"doc": lambda x: x.page_content}
    | ChatPromptTemplate.from_template("Summarize the following document:\n\n{doc}")
    | llm
    | StrOutputParser()
)

summaries = chain.batch(docs, {"max_concurrency": 5})
summaries

For the setup process, the embedding model used is the Cohere embed-english-light model, with Chroma as the vector database.

In [None]:
# The vectorstore to use to index the child chunks
embeddings = CohereEmbeddings(model="embed-english-light-v3.0")
vectorstore = Chroma(collection_name="summaries", embedding_function=embeddings)

In [None]:
# The storage layer for the parent documents
store = InMemoryByteStore()

The retriever is then initialized. Each document is assigned a unique id so that each summary can be tied to its corresponding document.

In [None]:
# retriever initialization
id_key = "doc_id"

retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    byte_store=store,
    id_key=id_key,
)
doc_ids = [str(uuid.uuid4()) for _ in docs]

In [None]:
# Docs linked to summaries and being turned into LangChain Document type
summary_docs = [
    Document(page_content=s, metadata={id_key: doc_ids[i]})
    for i, s in enumerate(summaries)
]

In [None]:
# Adding of summaries
retriever.vectorstore.add_documents(summary_docs)
retriever.docstore.mset(list(zip(doc_ids, docs)))

In [None]:
question = "What are the physical characteristics of lamu?"

retrieved_docs = retriever.invoke(question)  # invoke() replaces the deprecated get_relevant_documents()
retrieved_docs
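
The link between summaries and full documents can be sketched with plain dictionaries. This is an illustration only: the word-overlap search is a hypothetical stand-in for the vector search done by Chroma, and the sample texts are made up. What it shows is the two-step lookup: search over the summaries, then follow the shared id to the full document.

```python
import uuid

# Made-up full documents and their summaries, in the same order.
documents = [
    "Full text describing the Lamu plant in detail.",
    "Full text of a manual for time travelers.",
]
summaries = [
    "Lamu is a plant with blue flowers",
    "A manual that helps time travelers",
]

# One id per document; the same id is attached to the matching summary.
doc_ids = [str(uuid.uuid4()) for _ in documents]
summary_index = [
    {"summary": summary, "doc_id": doc_id}
    for summary, doc_id in zip(summaries, doc_ids)
]
docstore = dict(zip(doc_ids, documents))


def lookup(query: str) -> str:
    """Find the best-matching summary, then return the full document it points to."""
    query_words = set(query.lower().split())

    def overlap(entry: dict) -> int:
        # Stand-in for vector similarity: number of shared words.
        return len(query_words & set(entry["summary"].lower().split()))

    best_entry = max(summary_index, key=overlap)
    return docstore[best_entry["doc_id"]]


found = lookup("lamu plant")
print(found)
```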

There is also an option to add the whole document into the vector database besides the summary.

In [None]:
# # We can also add the original chunks to the vectorstore if we so want
# for i, doc in enumerate(docs):
#     doc.metadata[id_key] = doc_ids[i]
# retriever.vectorstore.add_documents(docs)

This would make the retrieval process take much longer since, besides looking for similarity with the summaries, it would also look for similarity with each document. However, it could return more accurate documents.

### Some Considerations
Summarize and Embed is one implementation of the Multi Vector Retriever setup. There are other implementations: for example, instead of summaries, the LLM can generate questions for a specific document. You may find other techniques here: https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/multi_vector/

Techniques and implementations may be combined. For example, if each document contains a very large amount of data, we can split the documents into smaller chunks and then summarize each chunk. This way, we use both the splitting and the summarizing methods.

As the team's rough estimate, about 80% of the quality of the embeddings comes from the embedding model itself, and the remaining 20% from the techniques implemented. So, it is important to have a good embedding model available.

# Query Translation

## Preparation

In [None]:
import os
import cohere
from langchain_cohere import CohereEmbeddings
# Read the API key from the environment (set in the "Setting up the LLM" cell) instead of hardcoding it
co = cohere.Client(api_key=os.environ["COHERE_API_KEY"])


embeddings = CohereEmbeddings(model="embed-english-light-v3.0")
# The vectorstore to use to index the child chunks
vectorstore = Chroma(collection_name="summaries", embedding_function=embeddings)

# The storage layer for the parent documents
store = InMemoryByteStore()
id_key = "doc_id"

# The retriever
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    byte_store=store,
    id_key=id_key,
)
doc_ids = [str(uuid.uuid4()) for _ in docs]

# Docs linked to summaries
summary_docs = [
    Document(page_content=s, metadata={id_key: doc_ids[i]})
    for i, s in enumerate(summaries)
]

# Add
retriever.vectorstore.add_documents(summary_docs)
retriever.docstore.mset(list(zip(doc_ids, docs)))

question = "What is the Lang Yang Lamu symbiosis"
# sub_docs = vectorstore.similarity_search(question)

retrieved_docs = retriever.invoke(question)  # invoke() replaces the deprecated get_relevant_documents()
retrieved_docs

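Note: this cell needs `docs` and `summaries` from the Setup Process. Running it before they are defined raises `NameError: name 'summaries' is not defined`.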

## Multi Query
This technique lets the application create multiple queries that are related to a single query. These related queries are then sent to the retriever to get a more accurate picture of what the user is asking for.

For example, if we were to ask "What is a car", Multi Query would generate a set of questions like "What are the different types of cars?" or "What is the process of manufacturing a car?"

For each query, a set of documents is retrieved. The union of the documents retrieved for all the queries is then returned and passed on to the LLM.
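
The union step can be sketched as an order-preserving de-duplication. This is a minimal sketch, assuming each document is represented by a plain string id; the names `unique_union` and `results_per_query` are hypothetical. With real LangChain `Document` objects, some stable key such as the page content would be needed for the comparison.

```python
def unique_union(results_per_query: list) -> list:
    """Merge several result lists, keeping the first occurrence of each document."""
    seen = set()
    merged = []
    for results in results_per_query:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Made-up retrieval results for three generated queries.
results_per_query = [
    ["doc_a", "doc_b"],
    ["doc_b", "doc_c"],
    ["doc_a"],
]
merged = unique_union(results_per_query)
print(merged)
```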

### Query Prompt Making

We first need to set up the prompt that asks the LLM to generate a set of questions. This is where you would modify the type of questions it asks and how many it asks.
It's important that the LLM outputs only a list of questions and nothing else. Any other text that the LLM outputs would also be processed as a query.

In [None]:
query_template = PromptTemplate(
    input_variables=["question"],
    template="""Your task is to generate 5 different versions of the given user question to retrieve relevant
    documents from a vector database. By generating multiple perspectives on the user question, your goal is to
    help the user overcome some of the limitations of distance-based similarity search. Output in a bullet list.
    Just output the bullet list and nothing else. Not even an intro text.
    Original question: {question}"""
)

LangChain chains processes together with the "|" operator. In the code below, the query template is passed into the LLM, and the output of the LLM is passed into the output parser. The parser is defined first so that the chain can reference it.

In [None]:
from langchain_core.output_parsers import BaseOutputParser
from typing import List

class LineListOutputParser(BaseOutputParser[List[str]]):
    """Output parser that splits the LLM output into a list of lines."""

    def parse(self, text: str) -> List[str]:
        lines = text.strip().split("\n")
        # Drop empty lines so they are not sent to the retriever as queries
        return [line for line in lines if line.strip()]

# Instantiate the parser only after the class has been defined
output_parser = LineListOutputParser()

In [None]:
llm_chain = query_template | llm | output_parser
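
Since the prompt asks for a bullet list, each returned line may still start with a bullet marker or a number. The following is a possible refinement, sketched as a standalone function rather than a LangChain parser; `parse_question_list` and the sample text are hypothetical. It strips leading markers and drops empty lines so that only the question text reaches the retriever.

```python
import re


def parse_question_list(text: str) -> list:
    """Split LLM output into questions, removing bullet markers and blank lines."""
    questions = []
    for line in text.splitlines():
        # Remove a leading "-", "*", "•", "1." or "1)" together with surrounding spaces
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if cleaned:
            questions.append(cleaned)
    return questions


raw = "- What is a car?\n\n* How are cars made?\n3. What types of cars exist?"
questions = parse_question_list(raw)
print(questions)
```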

### Integrating all previous processes

To tie it all together, we pass these objects into a MultiQueryRetriever, where retriever is the base retriever discussed in the setup process.

In [None]:
from langchain.retrievers.multi_query import MultiQueryRetriever

# llm_chain already ends with the output parser, so no parser key is passed
retriever_multi_query = MultiQueryRetriever(
    retriever=retriever, llm_chain=llm_chain
)

### Chaining and output

We would then need a prompt for the question and answer portion.

In [None]:
answer_template = """Given a question or task: {question}, answer it using the context: {context}.
-- End of context --
If the answer is not in the context then say that you don't know and generate 1 question related to the given context.
Only generate a question if the answer is not in the context so that the human can ask good questions that are relevant.
All throughout your answer, instead of using the words "context provided" or "text provided", or "text data", or "provided context", use the word "database" instead.
Answer:
"""

prompt = ChatPromptTemplate.from_template(answer_template)

The chain executes as follows when the invoke method is called:
1. retriever_multi_query receives the input query and retrieves relevant data from the vector database. This data is then passed to format_docs, which joins the documents into a single readable string.
2. RunnablePassthrough() takes the input query and passes it unchanged to the "question" parameter.
3. The "context" and the "question" are then passed on to the prompt, where {context} and {question} are substituted respectively.
4. The filled-in prompt is passed to the LLM as input.
5. The LLM output is converted to a string by the parser and printed.

In [None]:
def format_docs(docs):
    return "\n".join(doc.page_content for doc in docs)

In [None]:
rag_chain = (
    {"context": retriever_multi_query | format_docs,  "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [None]:
query = "What are the physical characteristics of Lamu?"
print("question: " + query)
print("answer: " + rag_chain.invoke(query))
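
The data flow of `rag_chain` can be mimicked with ordinary functions, which may help when debugging each stage separately. This is only a sketch: `toy_retriever`, `fill_prompt`, and `toy_llm` are hypothetical stand-ins that return canned values, not LangChain components.

```python
def toy_retriever(query: str) -> list:
    """Hypothetical retriever: returns canned documents for any query."""
    return ["Lamu has glossy leaves.", "Lamu has blue flowers."]


def format_docs(docs: list) -> str:
    """Join the retrieved documents into one context string."""
    return "\n".join(docs)


def fill_prompt(inputs: dict) -> str:
    """Substitute the question and the context into a prompt string."""
    return f"Question: {inputs['question']}\nContext:\n{inputs['context']}"


def toy_llm(prompt_text: str) -> str:
    """Hypothetical LLM: reports how many context lines it received."""
    context_lines = prompt_text.split("Context:\n")[1].splitlines()
    return f"Answered using {len(context_lines)} context lines"


def toy_rag_chain(query: str) -> str:
    # The dict step: the same query feeds both the retriever and the "question" slot
    inputs = {"context": format_docs(toy_retriever(query)), "question": query}
    # prompt | llm | parser collapses into two function calls here
    return toy_llm(fill_prompt(inputs))


answer = toy_rag_chain("What are the physical characteristics of Lamu?")
print(answer)
```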

### Some considerations
The quality of the generated questions depends solely on the LLM. Because of this, it is hard to measure the quality of this technique. The team has used Llama3 and Cohere. We found that Llama3 would give a different set of questions each time for the same original question. Some questions would be good while others were not.

## RAG Fusion
This query translation technique creates search queries related to the original question to produce an answer that matches the context and brings more insights. The documents retrieved for these search queries are scored, fused into a single ranking, and then used as additional context when generating the final answer.

To demonstrate, the expected search queries that will be generated from the question "Describe a Lang" would be:

1.   Physical characteristics of a Lang
2.   Behavior of a Lang
3.   Lang's role in the Lang-Yang-Lamu symbiosis

The answer would then combine the results of these questions into one.





### Querying for search queries

The first step is to query for search queries related to our original question.

In [None]:
SEARCH_QUERY_TEMPLATE = """
    Your goal is to create multiple search queries based on what's given to you. Don't make a description for each query.
    Create 5 search queries based on this question: {question}

    Output (5 queries):
"""

question = "Describe the Lang Yang Lamu symbiosis"

multiple_query_prompt = ChatPromptTemplate.from_template(SEARCH_QUERY_TEMPLATE)

rag_chain = (
    multiple_query_prompt
    | llm
    | StrOutputParser()
    | (lambda x: x.split("\n"))
    )

generate_queries = (rag_chain.invoke(question))
[q for q in generate_queries]

['1. Lang Yang Lamu symbiosis explanation',
 "2. Lang Yang and Lamu's relationship",
 '3. What is Lang Yang Lamu symbiosis in ecology?',
 '4. Examples of Lang Yang Lamu symbiosis',
 '5. How does Lang Yang Lamu symbiosis benefit either party?']

### Reciprocal Rank Fusion

To score the results, we define a function that ranks the documents retrieved for the search queries based on their positions in each result list.

In [None]:
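def reciprocal_rank_fusion(results: list[list], k=60):
    """ Fuse multiple lists of ranked documents with Reciprocal Rank Fusion (RRF).
        k is the optional smoothing constant used in the RRF formula """

    # Maps each serialized document to its accumulated RRF score
    fused_scores = {}

    for docs in results:
        # rank is the zero-based position of the document in this result list
        for rank, doc in enumerate(docs):
            # Serialize the document so that it can be used as a dictionary key
            doc_str = dumps(doc)
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Documents near the top of a list receive a larger contribution
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort the documents by fused score, highest first
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]

    return reranked_results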

rag_fusion_chain = rag_chain | retriever.map() | reciprocal_rank_fusion
rag_fusion_chain


ChatPromptTemplate(input_variables=['question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['question'], template="\n    Your goal is to create multiple search queries based on what's given to you. Don't make a description for each query.\n    Create 5 search queries based on this question: {question}\n\n    Output (5 queries):\n"))])
| ChatCohere(client=<cohere.client.Client object at 0x7eb1dd80acb0>, async_client=<cohere.client.AsyncClient object at 0x7eb1dd80a440>, model='command-r', temperature=0.0, cohere_api_key=SecretStr('**********'))
| StrOutputParser()
| RunnableLambda(...)
| RunnableEach(bound=MultiVectorRetriever(vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x7eb1dd80a350>, byte_store=<langchain_core.stores.InMemoryBaseStore object at 0x7eb1dd808b20>, docstore=<langchain.storage.encoder_backed.EncoderBackedStore object at 0x7eb0f3945870>))
| RunnableLambda(lambda x: x[1])
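
As a small worked example of the scoring, the sketch below applies the same formula, score(d) = sum over result lists of 1 / (rank + k) with zero-based rank and k = 60, to made-up document ids instead of LangChain documents. The function name `rrf` and the three ranked lists are hypothetical.

```python
from collections import defaultdict


def rrf(ranked_lists: list, k: int = 60) -> list:
    """Return (doc_id, score) pairs sorted by fused score, highest first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1 / (rank + k)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up results for three search queries.
ranked_lists = [
    ["A", "B", "C"],
    ["B", "D"],
    ["B", "A"],
]
fused = rrf(ranked_lists)
order = [doc_id for doc_id, _ in fused]
print(order)
```

Document B appears in all three lists, twice at the top, so it accumulates 1/61 + 1/60 + 1/60 and ranks first. A follows with 1/60 + 1/61, while D (1/61) edges out C (1/62) because it sits one position higher in its only list.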

### Finalizing the output

Having compiled documents that correspond to the search queries made, we finalize the answer with the compiled context above.

In [None]:
ANSWER_TEMPLATE = """
Give an answer to question: {question}, using the context: {context}. Say that you do not know the answer if the question is outside of the context given.
"""
answer_prompt = ChatPromptTemplate.from_template(ANSWER_TEMPLATE)

answer_rag_chain = (
    {"context": rag_fusion_chain,
    "question": itemgetter("question")}
    | answer_prompt
    | llm
    | StrOutputParser()
)

print("\nFinal output: \n")
print(answer_rag_chain.invoke({"question": question}))


Final output: 





The Lang and Yang species have a complex predator-prey relationship. Langs experience an uncontrollable drive to hunt Yangs, but this predation is detrimental to both populations because Yang flesh is toxic to Langs. This dynamic has led to a decline in both species' populations.

However, a fascinating symbiosis has been observed involving the plant species Lamu. When Langs urinate on these plants, a chemical reaction occurs that appears to enhance the Lamu's nutritional value. This modified plant matter, when consumed by Yangs, results in waste that acts as a powerful fertilizer, boosting soil productivity significantly.

Researchers are investigating this phenomenon, aiming to understand the biochemical processes involved. They hope to discover the elements responsible for increased fertility and replicate this interaction for its potential revolutionary applications in agriculture. This research could lead to significant advances in ecological management and food production, making

## Decomposition
Another technique for query translation is decomposition. Decomposition breaks down the original question into multiple sub-questions which serve to answer the main question.

For example, if we used "What is the Lang Yang Lamu symbiosis", the expected sub-questions would be:

1.   What role does each organism play in the Lang Yang Lamu symbiosis?
2.   How does each organism of the Lang Yang Lamu symbiosis interact with the others?
3.   What benefit does this symbiosis have in the ecosystem?

Each sub-question is then answered to generate Q&A pairs, which serve as context for the final query that answers the original question.

### Breaking down the original question into sub-questions

In [None]:
BREAKDOWN_QUERY_TEMPLATE = """
    Your goal is to break down the given question into different sub-questions.
    Do not describe the questions. There's also no need for an introduction to your answer. Just give the questions.
    Generate 3 sub-queries from this question: {question}

    Output (3 queries):
"""


breakdown_query_prompt = ChatPromptTemplate.from_template(BREAKDOWN_QUERY_TEMPLATE)
breakdown_chain = (
    breakdown_query_prompt
    | llm
    | StrOutputParser()
    # Split into lines and drop the empty ones so that only real sub-questions remain
    | (lambda x: [line.strip() for line in x.split("\n") if line.strip()])
    )
sub_questions = breakdown_chain.invoke(question)
for q in sub_questions:
    print(q)

1. What is Lang Yang Lamu?
2. What does Lang Yang Lamu symbiosis involve?
3. Why is the Lang Yang Lamu symbiosis significant?

### Individual Retrieval
There are different ways to retrieve data. The first to be covered is individual retrieval, where each sub-question is asked one by one.

In [None]:
INDIV_DECOMPOSITION_QUERY_TEMPLATE = """
    Here's a Q&A to provide context for the question:
    {q_a_pairs}

    Create a synthesis that answers the question: {question}
"""

def indiv_retrieval(question:str, sub_queries:list[str], retriever, llm):
    """Sub-questions are asked one by one. Their answers are then used
    as context and synthesized into the final answer."""
    # The per-question chain is the same for every sub-query, so build it once
    prompt = ChatPromptTemplate.from_template(ANSWER_TEMPLATE)
    rag_chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    compiled_q_a = ""
    for count, sub_query in enumerate(sub_queries, start=1):
        answer = rag_chain.invoke(sub_query)
        print("Finished getting answers for sub-query " + str(count) + ".")
        compiled_q_a = compiled_q_a + "Question " + str(count) + " : " + sub_query + "\n"
        compiled_q_a = compiled_q_a + "Answer: " + answer + "\n\n"

    print("Synthesizing results...")
    final_prompt = ChatPromptTemplate.from_template(INDIV_DECOMPOSITION_QUERY_TEMPLATE)
    final_rag_chain = (
        final_prompt
        | llm
        | StrOutputParser()
    )

    final_answer = final_rag_chain.invoke({"q_a_pairs": compiled_q_a, "question": question})
    print(final_answer)

indiv_retrieval(question, sub_questions, retriever, llm)

Finished getting answers for sub-query 1.




Finished getting answers for sub-query 2.




Finished getting answers for sub-query 3.




Finished getting answers for sub-query 4.




Finished getting answers for sub-query 5.
Question 1 : 1. What is Lang Yang Lamu? 
Answer: Lang Yang Lamu refers to a fascinating symbiosis observed between two species, Lang and Yang, in an ecological research setting. The term specifically relates to the interaction where Langs urinate on Lamu plants, a plant species seemingly capable of transforming Yang's toxic effects. This chemical transformation enhances the Lamu plant's nutritional value. 

The context also hints at a folklore narrative centered around the Lang and Yang creatures, symbolizing the intricate dance between predator and prey. However, the story does not explicitly mention the Lamu plants or their role in the ecosystem.

Question 2 : 
Answer: The research being conducted at the Global Ecology Research Center in Geneva aims to understand the biochemical processes behind a fascinating ecological interaction between the Lang and Yang species. The study hopes to discover the transformative effects of Lang urine on Lamu 

### Dynamic/Recursive Retrieval
With this retrieval type, answers generated from previous questions are used as additional context. This method is slightly more complex but can create a more cohesive result.

A function `format_qa_pair` is defined to make the compiled Q&A pairs more readable.

In [None]:
def format_qa_pair(question, answer):
    """Format Q and A pair"""

    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()

The function below generates an answer for each sub-question, passing the cumulatively generated answers along as context. The answer to the last sub-question, which sees all earlier Q&A pairs, is printed as the final answer.

In [None]:
DYNAMIC_DECOMPOSITION_QUERY_TEMPLATE = """
    Answer this question: {question}
    To aid in answering the question, there are question and answer pairs that can be used as context: {q_a_pairs}
    Finally, here's extra context that might help: {context}
    With this, generate an answer to the question asked.
"""

def dynamic_retrieval(question:str, sub_queries:list[str], retriever, llm):
    """When an answer is generated from a sub-question, it is used as additional context for the next questions."""
    decomposition_prompt = ChatPromptTemplate.from_template(DYNAMIC_DECOMPOSITION_QUERY_TEMPLATE)

    q_a_pairs = ""
    print("Generating q_a pairs..")
    for count, q in enumerate(sub_queries):
        rag_chain = (
            {"context": itemgetter("question") | retriever,
            "question": itemgetter("question"),
            "q_a_pairs": itemgetter("q_a_pairs")}
            | decomposition_prompt
            | llm
            | StrOutputParser())

        answer = rag_chain.invoke({"question":q,"q_a_pairs":q_a_pairs})
        q_a_pair = format_qa_pair(q,answer)
        q_a_pairs = q_a_pairs + "\n---\n"+  q_a_pair
        print("q_a pair " + str(count) + " is complete.")
    print(answer)

dynamic_retrieval(question, sub_questions, retriever, llm)



Generating q_a pairs..




q_a pair 0 is complete.




q_a pair 1 is complete.




q_a pair 2 is complete.




q_a pair 3 is complete.
q_a pair 4 is complete.
The Lang Yang Lamu symbiosis is significant for several reasons. On a fundamental level, it represents an intricate and complex ecological relationship between different species, offering a fascinating insight into the intricate webs of predator-prey dynamics. The interplay between Lang, Yang, and the Lamu plants showcases how unexpected and profound interactions can occur within ecosystems. 

Furthermore, this symbiosis has captured the attention of researchers studying sustainable solutions for food production. The unique effect of Lang's urine on the Lamu plants and the subsequent benefits to Yang and the soil fertility presents a remarkable potential for agricultural advancement. By understanding and replicating this phenomenon, scientists at the Global Ecology Research Center in Geneva aim to develop sustainable methods to enhance crop yields and combat food scarcity. 

The Lang Yang Lamu symbiosis goes beyond mere ecological interes
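
To see how the context grows in dynamic retrieval without calling an LLM, the loop can be sketched with a stand-in answer function. Everything here is hypothetical (`accumulate_answers`, `toy_answer`, and the placeholder sub-questions); the sketch only mirrors the accumulation pattern of `dynamic_retrieval`, where each answer is generated with all earlier Q&A pairs as context.

```python
def accumulate_answers(sub_questions: list, answer_fn) -> tuple:
    """Answer sub-questions in order, feeding earlier Q&A pairs to later ones."""
    q_a_pairs = ""
    answer = ""
    for question in sub_questions:
        # The answer function sees every Q&A pair produced so far
        answer = answer_fn(question, q_a_pairs)
        q_a_pairs += f"\n---\nQuestion: {question}\nAnswer: {answer}"
    # The last answer was generated with the most context
    return answer, q_a_pairs


def toy_answer(question: str, context: str) -> str:
    """Stand-in for the LLM call: reports how many earlier pairs it received."""
    earlier_pairs = context.count("Question:")
    return f"answer to '{question}' (saw {earlier_pairs} earlier pairs)"


final, history = accumulate_answers(["Q1", "Q2", "Q3"], toy_answer)
print(final)
```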

## Stepback

Stepback is a query translation technique wherein the original question is abstracted into a simpler question, or into a question about the underlying concepts or theories needed to answer it.

For example, take this word problem:

A garden has the shape of a rectangle. The length of the garden is 6 meters longer than its width. If the area of the garden is 91 square meters, what are the dimensions of the garden?

Possible Step Back Questions:
1. What mathematical concept is used to solve for the dimensions of the garden, and how is it applied in this problem?
2. What is the quadratic equation, and how does it help in determining the dimensions of the garden?
3. What steps are involved in solving a quadratic equation, and why are these steps necessary for finding the dimensions of the garden?

Some More Examples:

Original: What are the implications of the Lang-Yang-Lamu symbiosis for the human population?
Ideal step back questions:
1. How do symbiotic relationships impact humans?

Original: Do Langs eat humans?
Ideal step back questions:
1. What is the typical diet of Langs?
2. Are Langs carnivorous?
3. Can a Lang kill a human?

One key point is that the LLM has no knowledge of Langs, Yangs, and Lamus. Because it does not know that they are animals and plants, it usually mistakes them for a Chinese family clan, a people group, a place, or a thing.

In this scenario, ideal step back questions need to fill the gaps in the LLM's knowledge.
For example:

Original: How do Langs help Yangs?

Ideal step back question:
What are Langs and Yangs, and how do they affect each other?

Original: What are the side effects of time travel?

Ideal step back question:
What is time travel and how does it affect users?


Insights:
Step back questions generated from simple questions tend to be mere rewordings of the original rather than a more abstract view of it.

### Stepback Generation

In [None]:
sb_prompt = PromptTemplate(
        input_variables=["question", "context"],
        template="""You are an AI assistant. Your task is to rephrase a given question into a more general, step-back question that is easier to understand and answer.
    Please follow the principles below, along with their examples:

    1. Identify the Underlying Concept:
       - Example:
         Original: How does photosynthesis occur in plants?
         Step Back: What is the basic process of photosynthesis, and why is it important for plant life?

    2. Simplify the Context:
       - Example:
         Original: What are the effects of the Philippine Clean Air Act on industrial pollution?
         Step Back: What is environmental regulation, and how does it control pollution?

    3. Generalize Specific Details:
       - Example:
         Original: What role does Atticus Finch play in "To Kill a Mockingbird"?
         Step Back: What is the significance of moral characters in literature?

    4. Explore the Purpose or Function:
       - Example:
         Original: How does the Philippine judicial system address human rights violations?
         Step Back: What is the role of the judiciary in protecting human rights?

    5. Connect to Broader Implications:
       - Example:
         Original: How do antibiotics affect bacterial infections?
         Step Back: What are antibiotics, and why are they crucial in treating bacterial infections?

    Please respond with only the rephrased, step-back question.

    The given question: {question}

    Here is additional context about the question to help in generating a step-back question:
    {context}
    """
    )

In [None]:
regularQuestions = [
    "How do vaccines work to provide immunity against diseases?",
    "How does Shakespeare use foreshadowing in \"Macbeth\"?",
    "What is the process of DNA replication in cells?"
]


for question in regularQuestions:
  print("Original: " + question)
  generate_queries_step_back = sb_prompt | llm | StrOutputParser()
  print("Step Back: " + generate_queries_step_back.invoke({"question": question, "context": "None"}))
  print()

Original: How do vaccines work to provide immunity against diseases?
Step Back: How do immune-boosting solutions like vaccines help prevent diseases and contribute to long-term immunity?

Original: How does Shakespeare use foreshadowing in "Macbeth"?
Step Back: In Shakespearean tragedy, how do authors employ foreshadowing to enhance the plot and engage the audience?

Original: What is the process of DNA replication in cells?
Step Back: What is the fundamental mechanism of DNA replication and its significance in cellular processes?



In [None]:
corpusQuestions = [
    "How do Langs help Yangs?",
    "What are the side effects of time travel?",
    "Do Langs eat humans?",
    "What is the significance of Miracle Blooms in human society?",
    "How did the Yangs and Langs Originate?"
]


# Baseline run: no retrieved context is passed, so the LLM must rely on its own knowledge.
generate_queries_step_back = sb_prompt | llm | StrOutputParser()
for question in corpusQuestions:
  print("Original: " + question)
  print("Step Back: " + generate_queries_step_back.invoke({"question": question, "context": "None"}))
  print()

Original: How do Langs help Yangs?
Step Back: In the context of the given scenario, what is the role of languages in supporting different cultures?

Original: What are the side effects of time travel?
Step Back: What are the potential consequences of altering temporal progression?

Original: Do Langs eat humans?
Step Back: Are predatory animals a threat to humans?

Original: What is the significance of Miracle Blooms in human society?
Step Back: What are the cultural and societal implications of remarkable phenomena like Miracle Blooms?

Original: How did the Yangs and Langs Originate?
Step Back: Where do family names originate from?



In [None]:
corpusQuestions = [
    "How do Langs help Yangs?",
    "What are the side effects of time travel?",
    "Do Langs eat humans?",
    "What is the significance of Miracle Blooms in human society?",
    "How did the Yangs and Langs Originate?"
]


# Same questions, this time with the retrieved documents passed in as context.
generate_queries_step_back = sb_prompt | llm | StrOutputParser()
for question in corpusQuestions:
  print("Original: " + question)
  context = retriever.get_relevant_documents(question)
  print("Step Back: " + generate_queries_step_back.invoke({"question": question, "context": context}))
  print()

Original: How do Langs help Yangs?
Step Back: What is the ecological and cultural significance of the relationship between Langs and Yangs, and how has it influenced conservation efforts over time?

Original: What are the side effects of time travel?
Step Back: What are the potential implications of time travel on the ecosystem and human society?

Original: Do Langs eat humans?
Step Back: Do mystical wolf-like creatures play a significant role in human folklore and mythology?

Original: What is the significance of Miracle Blooms in human society?
Step Back: What is the ecological and cultural significance of plants in human society, with respect to their impact on folklore, conservation, and sustainable development?

Original: How did the Yangs and Langs Originate?
Step Back: What is the cultural and historical significance of the Lang-Yang relationship, and how has it influenced conservation and agricultural practices?



#### Analysis


For the regular questions, the Cohere LLM generated workable step back questions without context. For questions about our corpus, however, the LLM did not know what Yangs and Langs are, so the step back question generator only worked well when it was given context documents.

### Incorporating Stepback in RAG

In [None]:
def sb_generator(question:str):
  sb_prompt = PromptTemplate(
        input_variables=["question", "context"],
        template="""You are an AI assistant. Your task is to rephrase a given question into a more general, step-back question that is easier to understand and answer.
    Please follow the principles below, along with their examples:

    1. Identify the Underlying Concept:
       - Example:
         Original: How does photosynthesis occur in plants?
         Step Back: What is the basic process of photosynthesis, and why is it important for plant life?

    2. Simplify the Context:
       - Example:
         Original: What are the effects of the Philippine Clean Air Act on industrial pollution?
         Step Back: What is environmental regulation, and how does it control pollution?

    3. Generalize Specific Details:
       - Example:
         Original: What role does Atticus Finch play in "To Kill a Mockingbird"?
         Step Back: What is the significance of moral characters in literature?

    4. Explore the Purpose or Function:
       - Example:
         Original: How does the Philippine judicial system address human rights violations?
         Step Back: What is the role of the judiciary in protecting human rights?

    5. Connect to Broader Implications:
       - Example:
         Original: How do antibiotics affect bacterial infections?
         Step Back: What are antibiotics, and why are they crucial in treating bacterial infections?

    Please respond with only the rephrased, step-back question.

    The given question: {question}

    Here is additional context about the question to help in generating a step-back question:
    {context}
    """
    )
  generate_queries_step_back = sb_prompt | llm | StrOutputParser()
  context = retriever.get_relevant_documents(question)
  return generate_queries_step_back.invoke({"question": question, "context": context})

In [None]:
sb_generator("How Langs help Yangs?")

'What is the ecological and cultural significance of the Lang-Yang relationship, and how does it influence conservation efforts?'

In [None]:
def rag(question:str):
  # Baseline RAG: build the answer prompt (no step back question is used here)
  prompt = PromptTemplate(
        input_variables=["question", "context"],
        template="""You are an AI assistant. Your task is to answer a given question based on the given context

    The given question: {question}

    Here is the context documents and a step back question with its answer
    Disregard all unrelated content regarding the given question.
    {context}
    """
    )
  generate_answer = prompt | llm | StrOutputParser()
  # Retrieve documents for the original question and concatenate their text
  x = retriever.get_relevant_documents(question)
  docs = ""
  for y in x:
    docs += y.page_content
  # Use the concatenated documents as the context for the answer
  context = docs
  return generate_answer.invoke({"question": question, "context": context})

In [None]:
def sb_rag(question:str):
  # Get step back question
  sb_question = sb_generator(question)
  context = retriever.get_relevant_documents(sb_question)
  prompt = PromptTemplate(
        input_variables=["question", "context"],
        template="""You are an AI assistant. Your task is to answer a given question based on the given context

    The given question: {question}

    Here is the context documents and a step back question with its answer
    Disregard all unrelated content regarding the given question.
    {context}
    """
    )
  generate_sb_answer = prompt | llm | StrOutputParser()
  # Answer the step back question using the documents retrieved for it
  answer = generate_sb_answer.invoke({"question": sb_question, "context": context})
  # Retrieve documents for the original question and concatenate their text
  x = retriever.get_relevant_documents(question)
  docs = ""
  for y in x:
    docs += y.page_content
  # Use the documents plus the step back question and its answer as context
  context = docs + "\n\nQuestion: " + sb_question + "\nAnswer: " + answer
  return generate_sb_answer.invoke({"question": question, "context": context})

In [None]:
print(rag("How Langs help Yangs?"))

Langs help Yangs indirectly in a symbiotic relationship. Langs consume the Lamu plant which then processes into urine, a key ingredient in a potent natural fertiliser. This fertiliser is beneficial for the growth of vegetation that serves as a food supply for the Yang species. In this way, the Langs, by virtue of their diet, aid the Yangs in having an ample food source and sustaining their species. 

The relationship between Langs and Yangs has profound implications, shaping the folklore, conservation efforts, and agricultural practices of human societies that have developed around them.


In [None]:
print(sb_rag("How Langs help Yangs?"))

The Langs help the Yangs indirectly in a symbiotic relationship. The Yangs, or Mystic Sheep, are herbivores that graze on plants, including the Lamu plant, which becomes a vital fertilizer after being modified by the urine of the Langs. This fertilizer is crucial for agricultural applications, stabilizing economies in regions facing agricultural challenges and contributing to food production. 

Furthermore, the cultural significance of this relationship adds another layer of importance to conservation efforts. The Lang-Yang dynamic is deeply rooted in Central Asian folklore and mythology, symbolizing themes of nature's harmony, innocence, and strength. This cultural reverence has led to ecological stewardship and a deeper understanding of the ecosystem among human societies. Artists, writers, and creatives also draw inspiration from this relationship, promoting ecological conservation and cultural appreciation. 

Understanding the Lang-Yang symbiosis has helped humans develop a holisti

## HyDE

Hypothetical Document Embeddings (HyDE) generates a hypothetical document that answers the question, then uses that document, rather than the question itself, to retrieve related documents.


In [None]:
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser


def HyDE_generator(question:str):
  template = """Please write a scientific paper passage to answer the question
  Question: {question}
  Passage:"""
  prompt_hyde = ChatPromptTemplate.from_template(template)

  generate_docs_for_retrieval = (
    prompt_hyde | llm | StrOutputParser()
  )


  return generate_docs_for_retrieval.invoke({"question":question})
# HyDE document generation

# Run
print(HyDE_generator("How does the process of photosynthesis in plants work?"))

Photosynthesis in plants is an intricate process that transforms luminous energy into chemical energy, enabling plants to thrive and grow. This fascinating mechanism enables plants to convert sunlight, water, and carbon dioxide into oxygen and glucose, which serves as a crucial energy source for the plant's survival. The process predominantly occurs in specialized organelles called chloroplasts, which are abundant in the leaves of plants. Chloroplasts contain a green pigment called chlorophyll, which plays a pivotal role in harvesting sunlight, thus initiating photosynthesis. 

The process can be divided into two primary phases: the light-dependent reactions and the light-independent reactions. During the light-dependent phase, light energy is absorbed by chlorophyll, exciting the electrons within the chloroplasts. This excitement triggers a series of reactions that transfer the electrons through a complex network of proteins called the electron transport chain. This chain of reactions

In [None]:
HyDE_generator("How do Langs help Yangs?")

'Langs help Yangs by acting as a protective barrier. Yangs are a type of particle that carry a positive charge, and Langs, their negatively charged counterparts, help to neutralize this charge. When Langs and Yangs come close to each other, they are drawn together by their opposite charges. This causes them to combine and form a stable compound, preventing the Yangs from interacting with other Yangs and causing unwanted reactions. In this way, Langs act as a protective shield, guiding Yangs away from harmful interactions and ensuring their safety. This phenomenon is a well-known example of electrostatic attraction, demonstrating the power of opposite charges to attract and aid one another. Further studies into the behavior of Langs and Yangs and their applications in controlling particle interactions are ongoing and hold promise for various practical uses.'

In [None]:
def HyDE_rag(question:str):
  # Generate a hypothetical document and retrieve with it instead of the question
  hyde = HyDE_generator(question)
  context = retriever.get_relevant_documents(hyde)
  prompt = PromptTemplate(
        input_variables=["question", "context"],
        template="""You are an AI assistant. Your task is to answer a given question based on the given context

    The given question: {question}

    Here is the context documents and a step back question with its answer
    Disregard all unrelated content regarding the given question.
    {context}
    """
    )
  generate_hyde_answer = prompt | llm | StrOutputParser()
  return generate_hyde_answer.invoke({"question": question, "context": context})

In [None]:
print(rag("How Langs help Yangs?"))

Langs help Yangs indirectly in a symbiotic relationship. Langs consume the Lamu plant which then processes into urine, a key ingredient in a potent natural fertiliser. This fertiliser is beneficial for the growth of vegetation that serves as a food supply for the Yang species. In this way, the Langs, by virtue of their diet, aid the Yangs in having an ample food source and sustaining their species. 

The relationship between Langs and Yangs has profound implications, shaping the folklore, conservation efforts, and agricultural practices of human societies that have developed around them.


In [None]:
print(HyDE_rag("How Langs help Yangs?"))

The given context discusses the symbiotic relationship between Langs and Yangs, two species that have impacted human societies throughout history. Langs are referred to as Canis mythicus, and Yangs as Ovis mystica. The connection between these species, deeply rooted in folklore and mythology, has influenced human interactions with nature and shaped conservation efforts over the centuries. 

Langs, identified as formidable predators, and Yangs, portrayed as gentle but poisonous prey, have their roles clearly defined in this ecological dance. Folktales and mythology have celebrated this predator-prey dynamic, emphasizing the cultural reverence for the natural balance they represent. 

As people became more scientifically aware, conservationists stepped in to protect both species from the negative consequences of their symbiotic relationship. This intervention was essential in the 19th and 20th centuries, as habitat loss and hunting threats endangered Langs and Yangs. By understanding the

# Routing

Routing is the process of choosing the correct database for a given question. For example, if the question is about dogs, the relevant documents are most likely to be found in the dog database.

## Logical Routing

Use the LLM to choose the database.

In [None]:
def Logical_Router(question:str):
  # Ask the LLM to label the question with one of the known topics
  prompt = PromptTemplate(
        input_variables=["question"],
        template = """
    Given the question "{question}":

    Answer "topic1" if the question relates more to Lang-Yang-Lamu symbiosis.
    Answer "topic2" if the question relates more to time travelers.
    If the question does not relate to either topic, return the number 0.

    Topic 1 is about the Lang-Yang-Lamu symbiosis. Lang's are wolf-like creatures that prey on Yang's. Yang's are sheep-like creatures that eat the Lamu Plant.
    Keywords to note about in topic1: Lang, Yang, Lamu, and Miracle Bloom.

    Topic 2 is about time travelers and time travel.
    """
    )
  generate_answer = prompt | llm | StrOutputParser()
  answer = generate_answer.invoke({"question":question})

  if answer in ["topic1", "topic2"]:
      return answer
  else:
      return 0

In [None]:
# Expect topic1
Logical_Router("How do Langs help Yangs?")

'topic1'

In [None]:
# Expect topic2
Logical_Router("What are the side effects of time travel?")

0

In [None]:
# Expect topic1
Logical_Router("What is the Miracle Bloom?")

0

In [None]:
# Expect 0
Logical_Router("What is the scientific name of Lilies?")

0

Since the LLM was not trained on the corpus data, it has trouble determining which database should be used for a given question.

This method tends to produce false negatives: two of the three in-scope questions above were routed to 0.
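Some of these false negatives could also come from the exact string comparison in `Logical_Router`, which returns 0 whenever the reply is not literally `topic1` or `topic2`. This is an assumption, since the raw LLM replies are not shown. The sketch below shows one way to normalize a reply before comparing it; `normalize_route` and `VALID_TOPICS` are hypothetical names, not part of the notebook.

```python
import re

# Labels used in the router prompt above.
VALID_TOPICS = ("topic1", "topic2")

def normalize_route(raw_answer: str):
    """Return 'topic1' or 'topic2' if the reply names exactly one topic, else 0."""
    text = raw_answer.strip().lower()
    # Collect every known label that appears as a whole word in the reply.
    found = {t for t in VALID_TOPICS if re.search(rf"\b{t}\b", text)}
    # An ambiguous reply (both labels) or an empty one is treated as out of scope.
    return found.pop() if len(found) == 1 else 0

print(normalize_route(' "Topic2"\n'))
print(normalize_route("I am not sure."))
```

The trade-off is that a looser match accepts replies with extra wording, so the prompt should still ask for the label only.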

## Semantic Routing

Semantic Routing chooses a database based on relevance scores, that is, on how close the embeddings of each database's documents are to the embedding of the question.

In [None]:
def Semantic_Router(question:str):
  topic1 = retriever.vectorstore.similarity_search_with_relevance_scores(question)
  topic2 = retriever2.vectorstore.similarity_search_with_relevance_scores(question)
  topic1Sum = 0
  topic2Sum = 0
  # get sum of relevance scores
  print("topic1")
  for x in topic1:
      print(x[1])
      topic1Sum += x[1]
  print("topic2")
  for x in topic2:
      print(x[1])
      topic2Sum += x[1]
  print("topic1 Sum: " + str(topic1Sum))
  print("topic2 Sum: " + str(topic2Sum))
  # Route to the topic with the higher total relevance; ties go to topic1
  if topic1Sum >= topic2Sum:
      return "topic1"
  return "topic2"

In [None]:
Semantic_Router("How does time travel work?")



topic1
-10063.691568332295
-10063.69157528368
-10151.05773673493
-10151.057741210803
topic2
-4680.173562067659
-4680.173563522808
-4724.178992835884
-4724.1789944385555
topic1 Sum: -40429.49862156171
topic2 Sum: -18808.705112864907




'topic2'

In [None]:
Semantic_Router("How do Langs help Yangs?")



topic1
-3925.60292210574
-3925.602924595053
-4558.260394914752
-4558.260395157848
topic2
-8773.16512790148
-8773.165128499311
-9003.637074693981
-9003.63707657551
topic1 Sum: -16967.726636773394
topic2 Sum: -35553.60440767028




'topic1'

In [None]:
Semantic_Router("How do birds fly?")



topic1
-9179.26338609948
-9179.263391935408
-9251.197121031902
-9251.197123438888
topic2
-9561.781085737439
-9561.781088335325
-9775.910522012293
-9775.91053171193
topic1 Sum: -36860.921022505674
topic2 Sum: -38675.38322779699




'topic1'

In [None]:
import os
import cohere

# Read the API key from the environment instead of hard-coding it in the notebook.
cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])
x = 'In response to this crisis, the government of Kyrgyzstan, in collaboration with international conservation groups, has announced a new initiative aimed at bolstering anti-poaching measures. These include increased patrols, the implementation of advanced surveillance technologies, and harsher penalties for those caught engaging in the illegal wildlife trade.\n\n"We must act now to ensure that future generations will also be able to witness the unique beauty of the Yang," Altin emphasized. "This is a call to the international community to join us in our efforts to protect these magnificent creatures and the natural wonders they help sustain."\n\nThe world watches as Central Asia confronts this urgent conservation challenge, hoping that these efforts will curb the illegal hunting activities and restore the balance so crucial to the region\'s ecological and cultural heritage.'
#cohere_client
response = cohere_client.embed(
    texts=["How do Langs help Yangs?", x], model="embed-english-v3.0", input_type="classification"
)
print(response.embeddings[0])
print(response.embeddings[1])

#cohere_embeddings = CohereEmbeddings()
#cohere_embeddings.embed("What are those?")

[0.024963379, -0.028808594, -0.020492554, -0.046569824, -0.018844604, 0.01776123, -0.01928711, 0.021347046, 0.028320312, 0.06738281, 0.030334473, 0.013389587, 0.020065308, -0.024795532, -0.023986816, 0.025756836, 0.013877869, -0.0134887695, -0.013084412, -0.02027893, 0.017349243, -0.020202637, -0.016143799, 0.049438477, 0.0049972534, 0.040100098, 0.02494812, -0.018814087, 0.01197052, -0.008354187, -0.0059776306, -0.016433716, 0.01890564, -0.011672974, 0.015640259, 0.043182373, 0.008926392, -0.015991211, 0.0134887695, 0.010597229, 0.041625977, -0.02545166, 0.014923096, 0.008804321, -0.015960693, 0.003194809, 0.012397766, -0.023025513, 0.049713135, 0.037017822, -0.027328491, -0.022979736, -0.046875, 0.040405273, -0.03250122, -0.029159546, -0.023727417, 0.045532227, 0.015434265, 0.049072266, -0.007385254, 0.05218506, -0.026062012, -0.05355835, 0.022216797, 0.0028305054, 0.026443481, -0.00083732605, 0.023025513, -0.00422287, -0.005718231, -0.011894226, 0.0079422, 0.0027236938, 0.025787354,

In [None]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from numpy.linalg import norm

vector1 = np.array(response.embeddings[0])
vector2 = np.array(response.embeddings[1])

vector1_reshaped = vector1.reshape(1, -1)
vector2_reshaped = vector2.reshape(1, -1)

similarity = cosine_similarity(vector1_reshaped, vector2_reshaped)[0][0]
sim = np.dot(vector1,vector2)/(norm(vector1)*norm(vector2))

print("Cosine Similarity:", similarity)
print(sim)

Cosine Similarity: 0.2923548473643105
0.29235484736431044


In [None]:
import os
import cohere

def getVectors(textList):
  # Embed a list of texts with Cohere; the API key is read from the environment
  cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])
  response = cohere_client.embed(
      texts=textList, model="embed-english-v3.0", input_type="classification"
  )
  return response

In [None]:
def cosineSimilarity(vector1, vector2):
  vector1 = np.array(vector1)
  vector2 = np.array(vector2)

  vector1_reshaped = vector1.reshape(1, -1)
  vector2_reshaped = vector2.reshape(1, -1)

  return cosine_similarity(vector1_reshaped, vector2_reshaped)[0][0]

In [None]:
topic1 = retriever.vectorstore.get()
#for x in topic1:
  #print(x)

None


In [None]:
topic1 = retriever.vectorstore.get()
question = "How do Yangs help Langs?"

qv = getVectors([question]).embeddings[0]
#print(qv)

question2 = "What is python code?"

qv2 = getVectors([question2]).embeddings[0]
#print(cosineSimilarity(qv, qv2))
vectordb = getVectors(topic1["documents"]).embeddings

documents = topic1["documents"]

data = []
for i, v in enumerate(vectordb):
  similarity = cosineSimilarity(qv, v)
  data.append({
      "vector": v,
      "cosine_similarity": similarity,
      "document": documents[i]
  })

data.sort(key=lambda item: item["cosine_similarity"], reverse=True)

for item in data[:5]:
  vector = item["vector"]
  similarity = item["cosine_similarity"]
  document = item["document"]
  print(vector)
  print(document)
  #print(f"Similarity: {similarity:.4f}, Document: {document}, Vector: {vector}")

[0.044158936, 0.01285553, 0.012748718, 0.0025787354, -0.0010719299, -0.00365448, 0.027709961, -0.07110596, 0.016418457, 0.032562256, 0.005405426, -0.015060425, 0.006767273, -0.021499634, -0.01260376, -0.01499939, 0.0211792, 0.030349731, 0.024749756, -0.029678345, -0.0049476624, -0.0113220215, -0.019134521, -0.015235901, 0.018615723, -0.018753052, -0.020004272, -0.019866943, 0.010124207, -0.0067367554, -0.0030498505, 0.03829956, 0.023590088, -0.020004272, 0.012084961, 0.007648468, 0.024291992, -0.014762878, -0.03756714, 0.054382324, 0.021774292, -0.0209198, -0.015731812, -0.044525146, -0.04486084, 0.036193848, -0.015235901, -0.0023002625, 0.035186768, -0.012680054, -0.001906395, -0.007545471, -0.00046992302, -0.020492554, -0.02658081, -0.0096588135, -0.027862549, 0.044555664, 0.04425049, -0.027328491, -0.04309082, -0.020828247, 0.02670288, -0.036071777, 0.018432617, -0.0012931824, 0.028564453, 0.023483276, 0.022277832, 0.022232056, -0.0044784546, -0.012046814, 0.014274597, -0.092285156,

In [None]:
retriever.vectorstore.similarity_search_with_relevance_scores(question)



[(Document(page_content="Habitat and Distribution:  \nYangs are versatile and can thrive in a variety of environments but are predominantly found in mountainous regions and grassy plains where there is ample food supply. While originally native to Central Asia, their range has expanded due to their adaptability and the shifting environmental conditions.\n\nConservation Status:  \nYangs are classified as a species of least concern but are monitored due to their unique ecological role and the effects of their interactions with the Lang. Conservation efforts are mainly directed towards habitat preservation and understanding the dynamics of their relationship with Langs to ensure both species' sustainability.\n\nCultural Significance:  \nIn cultural narratives, Yangs symbolize innocence and purity, often depicted as gentle and serene beings. They hold a significant place in local folklore and are sometimes believed to possess mystical properties due to their unique biological characteristi

Manually calculating the vector similarity seems to yield the expected outcome.

In [None]:
topic1 = retriever.vectorstore.similarity_search_with_relevance_scores(question)
for x in topic1:
  print(x)

(Document(page_content='In response to this crisis, the government of Kyrgyzstan, in collaboration with international conservation groups, has announced a new initiative aimed at bolstering anti-poaching measures. These include increased patrols, the implementation of advanced surveillance technologies, and harsher penalties for those caught engaging in the illegal wildlife trade.\n\n"We must act now to ensure that future generations will also be able to witness the unique beauty of the Yang," Altin emphasized. "This is a call to the international community to join us in our efforts to protect these magnificent creatures and the natural wonders they help sustain."\n\nThe world watches as Central Asia confronts this urgent conservation challenge, hoping that these efforts will curb the illegal hunting activities and restore the balance so crucial to the region\'s ecological and cultural heritage.', metadata={'source': '/content/drive/MyDrive/LLM/corpus/h.txt'}), -10106.973019996629)
(Do



In [None]:
topic1 = retriever.vectorstore.similarity_search_with_score(question)
for x in topic1:
  print(x)

(Document(page_content='In response to this crisis, the government of Kyrgyzstan, in collaboration with international conservation groups, has announced a new initiative aimed at bolstering anti-poaching measures. These include increased patrols, the implementation of advanced surveillance technologies, and harsher penalties for those caught engaging in the illegal wildlife trade.\n\n"We must act now to ensure that future generations will also be able to witness the unique beauty of the Yang," Altin emphasized. "This is a call to the international community to join us in our efforts to protect these magnificent creatures and the natural wonders they help sustain."\n\nThe world watches as Central Asia confronts this urgent conservation challenge, hoping that these efforts will curb the illegal hunting activities and restore the balance so crucial to the region\'s ecological and cultural heritage.', metadata={'source': '/content/drive/MyDrive/LLM/corpus/h.txt'}), 14294.832532980567)
(Doc

In [None]:
topic1 = retriever.vectorstore.get()
#print(topic1.['metadatas'])
#print(topic1.['metadatas'])
print(len(topic1["documents"]))

for x in topic1["documents"]:
  print(x)

NameError: name 'retriever' is not defined

In [None]:
retriever.vectorstore

<langchain_community.vectorstores.chroma.Chroma at 0x7830decec340>

In [None]:
vectorstore.documents

AttributeError: 'Chroma' object has no attribute 'documents'

Semantic Routing chooses between topics 1 and 2 much more accurately than Logical Routing. However, it cannot detect when a question belongs to neither topic: "How do birds fly?" was still routed to topic1.

This is because out-of-scope questions are detected through relevance scores: a question is usually treated as out of scope when its relevance score falls below a threshold such as 0.7. However, the test cases show relevance scores that are far outside the expected range and even negative. This may have to do with the embedding method we have chosen.
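One possible reading of these numbers: the two scores printed earlier for the same top document (`similarity_search_with_score` and `similarity_search_with_relevance_scores`) are consistent with a conversion of the form `relevance = 1 - distance / sqrt(2)`. The sketch below only checks that arithmetic against the printed values. That this is the conversion applied by the vector store is an assumption, and `euclidean_relevance` is a name made up for this check.

```python
import math

def euclidean_relevance(distance: float) -> float:
    """Map a distance to a relevance score as 1 - d / sqrt(2) (assumed conversion)."""
    return 1.0 - distance / math.sqrt(2)

# Values copied from the two outputs above for the same top document.
raw_distance = 14294.832532980567         # from similarity_search_with_score
reported_relevance = -10106.973019996629  # from similarity_search_with_relevance_scores

computed = euclidean_relevance(raw_distance)
print(computed)
print(math.isclose(computed, reported_relevance, rel_tol=1e-6))
```

If this reading is right, the relevance scores are negative because the raw distances are in the thousands. A distance that large suggests the stored vectors are far from unit length, so a fixed threshold such as 0.7 cannot be applied until the scores are on a bounded scale.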

# Hallucination and Fact Checking

The output of LLMs can vary. A layer of safety can be put in place so that the answer is factually correct with respect to the database and actually answers the user's question. This section has 2 parts:
1. Checking if the context generated by the LLM is factually correct and found in the retrieved documents.
2. Checking if the final answer of the LLM makes sense and answers the original question of the user.

## LangGraph

A loop is required to handle the checking of the answers generated by the LLM.
1. Retrieve relevant documents
2. Generate a context answer based on the retrieved documents.
3. If the generated context is not factually grounded in the retrieved documents, go back to step 2.

This can be done with a plain while loop, but it is easier to implement and extend with LangGraph.
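For comparison, the while-loop version of the three steps above could look like the following minimal sketch. The function `answer_with_retry`, its callable parameters, and the `max_attempts` cap are all hypothetical names introduced here for illustration. The retriever, generator, and grader are replaced by toy stand-ins so the block runs on its own.

```python
from typing import Callable, List


def answer_with_retry(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    max_attempts: int = 3,  # assumption: cap retries so the loop always ends
) -> str:
    # Step 1: retrieve relevant documents once.
    documents = retrieve(question)

    attempts = 0
    while attempts < max_attempts:
        # Step 2: generate an answer from the retrieved documents.
        generation = generate(question, documents)
        # Step 3: accept the answer only if it is grounded in the documents.
        if is_grounded(generation, documents):
            return generation
        attempts += 1

    raise RuntimeError("no grounded answer within max_attempts tries")


# Toy stand-ins: the first draft is rejected, the second one is accepted.
toy_documents = ["grounded answer"]
drafts = iter(["made-up answer", "grounded answer"])

result = answer_with_retry(
    "toy question",
    retrieve=lambda q: toy_documents,
    generate=lambda q, docs: next(drafts),
    is_grounded=lambda gen, docs: gen in docs,
)
print(result)
```

The loop works, but every extra check or branch has to be threaded through it by hand, which is the part a graph makes easier to manage.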

There are different types of graphs that can be used in LangGraph. You may find them [here](https://python.langchain.com/v0.1/docs/langgraph/). For this implementation, the State Graph is the most appropriate.

### State Graph

With a State Graph, you have an object that holds a set of variables. The current values of those variables are the current state of the object. This object is passed from node to node within the graph, and each node can change its state.

With this, we can keep the user's question, the answer generated by the LLM, and the retrieved documents in a single state object.
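The idea of passing one state object through several nodes can be sketched without LangGraph. In the toy version below, each node returns only the keys it wants to change and a small runner merges them into the state. `ToyState`, `run_nodes`, and both node functions are hypothetical names used for illustration. LangGraph's own scheduling is more involved than this loop.

```python
from typing import Callable, Dict, List, TypedDict


class ToyState(TypedDict, total=False):
    question: str
    generation: str
    documents: List[str]


def retrieve_node(state: ToyState) -> Dict:
    # Returns only the key this node updates.
    return {"documents": ["doc A", "doc B"]}


def generate_node(state: ToyState) -> Dict:
    # Reads what the previous node wrote and adds a new key.
    count = len(state["documents"])
    return {"generation": f"answer from {count} documents"}


def run_nodes(state: ToyState, nodes: List[Callable[[ToyState], Dict]]) -> ToyState:
    # Each update overwrites the matching keys and leaves the rest untouched.
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_nodes({"question": "toy question"}, [retrieve_node, generate_node])
print(final_state)
```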

## Hallucination Prompts

We first define the prompts to be fed into the LLM. We then create a chain for each prompt that parses the LLM's answer as JSON.

### Is the context generated factual?

In [None]:
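# Imports repeated here so the cell can run on its own
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate(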
    template=""" You are a grader assessing whether
    an answer is grounded in / supported by a set of facts. Give a binary 'yes' or 'no' score to indicate
    whether the answer is grounded in / supported by a set of facts. Provide the binary score as a JSON with a
    single key 'score' and no preamble or explanation.
    Here are the facts:
    \n ------- \n
    {documents}
    \n ------- \n
    Here is the answer: {generation} """,
    input_variables=["generation", "documents"],
)
hallucination_grader = prompt | llm | JsonOutputParser()

### Does the generated answer answer the original question?

In [None]:
prompt = PromptTemplate(
    template="""You are a grader assessing whether an
    answer is useful to resolve a question. Give a binary score 'yes' or 'no' to indicate whether the answer is
    useful to resolve a question. Provide the binary score as a JSON with a single key 'score' and no preamble or explanation.
    Here is the answer:
    \n ------- \n
    {generation}
    \n ------- \n
    Here is the question: {question}""",
    input_variables=["generation", "question"],
)
answer_grader = prompt | llm | JsonOutputParser()

## Function Implementation

As noted above, the graph passes around a state object that holds variables, and the graph's nodes change the values of those variables.

For this implementation, the state is described by the `GraphState` class, and the graph built on top of it is stored in a variable called *workflow*.

In [None]:
from typing import List, TypedDict

from langgraph.graph import StateGraph


class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        documents: list of documents
    """

    question: str
    generation: str
    documents: List[str]


workflow = StateGraph(GraphState)

### Nodes

Each node of a graph can have a method that is called when the node runs. For this implementation, we have 2 nodes. One node retrieves documents while the other generates answers from the LLM. The methods take the state as input because these nodes change the state object.

The retrieve method sets the state's question and documents fields.

In [None]:
def retrieve(state):
    """
    Retrieve documents from vectorstore

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): A value for the documents and question key for the state dict
    """
    print("---RETRIEVE---")
    question = state["question"]

    # Retrieval
    documents = retriever.invoke({"question":question})
    return {"documents": documents, "question": question}

The generate method sets the state's generation field. The question and documents remain unchanged.

In [None]:
def generate(state):
    """
    Generate answer using RAG on retrieved documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]

    # RAG generation
    generation = rag_chain.invoke({"context": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}

### Edges
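Edges decide which node runs next. A conditional edge can reuse the two graders defined earlier to route the flow after generation. The sketch below is one possible shape for such a routing function, not the notebook's confirmed implementation. The function name `grade_generation`, the labels `"useful"`, `"not useful"`, and `"not supported"`, and the `StubGrader` class are hypothetical. The stub stands in for `hallucination_grader` and `answer_grader` so the block runs without an LLM.

```python
from typing import Any, Dict


class StubGrader:
    """Stand-in for a grader chain: always returns the same {'score': ...}."""

    def __init__(self, score: str):
        self.score = score

    def invoke(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        return {"score": self.score}


def grade_generation(state: Dict[str, Any], hallucination_grader, answer_grader) -> str:
    """Return a routing label based on the two grader verdicts."""
    # Check 1: is the generation grounded in the retrieved documents?
    grounded = hallucination_grader.invoke(
        {"documents": state["documents"], "generation": state["generation"]}
    )
    if str(grounded.get("score", "")).strip().lower() != "yes":
        return "not supported"

    # Check 2: does the generation answer the original question?
    useful = answer_grader.invoke(
        {"question": state["question"], "generation": state["generation"]}
    )
    if str(useful.get("score", "")).strip().lower() != "yes":
        return "not useful"

    return "useful"


toy_state = {"question": "q", "documents": ["d"], "generation": "g"}
print(grade_generation(toy_state, StubGrader("yes"), StubGrader("yes")))
print(grade_generation(toy_state, StubGrader("no"), StubGrader("yes")))
print(grade_generation(toy_state, StubGrader("yes"), StubGrader("no")))
```

In the notebook, the real grader chains would be bound in (for example with `functools.partial`) and each returned label mapped to the next node when the conditional edge is registered on `workflow`.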

## Graph Connection and Output