# LangChain vs LlamaIndex

## Implementing RAG using LangChain, OpenAI and FAISS

In [None]:
# !pip install langchain openai faiss-cpu langchain_community
# ! pip install langchain-openai

In [88]:
import os
from IPython.display import Markdown, display  # used below to render results
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.text_splitter import CharacterTextSplitter

In [91]:
# Load the data
loader = TextLoader('Politics.txt')
documents = loader.load()
OPENAI_API_KEY = "YOUR_API_KEY"

In [92]:
# Split the documents into smaller chunks
text_splitter = CharacterTextSplitter(chunk_size=1038, chunk_overlap=100)
texts = text_splitter.split_documents(documents)

# Create embeddings
embeddings = OpenAIEmbeddings(api_key=OPENAI_API_KEY)

# Build the vector store from documents
vector_store = FAISS.from_documents(texts, embeddings)
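
The `chunk_size` and `chunk_overlap` arguments above control how much text each chunk holds and how much neighbouring chunks share. The following is a minimal sketch of that idea using a plain sliding window over characters. It is a simplification: `CharacterTextSplitter` first splits on a separator and then merges the pieces, so its chunk boundaries will differ. The helper name `chunk_text` and the sample string are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"  # hypothetical stand-in for the document text
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk begins with the last three characters of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.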


In [93]:
retriever = vector_store.as_retriever(search_type='similarity', search_kwargs={'k': 3})
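
Conceptually, a similarity retriever with `k=3` embeds the query, scores it against every stored chunk vector, and returns the three best matches. The sketch below illustrates this with hand-written two-dimensional vectors and cosine similarity. The vectors and the helper names `cosine_similarity` and `top_k` are hypothetical, and the metric used by the real FAISS index may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


# Toy embeddings standing in for the real 1536-dimensional vectors
toy_chunk_vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
toy_query_vector = [1.0, 0.1]

top_ids = top_k(toy_query_vector, toy_chunk_vectors, k=3)
print(top_ids)
```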

In [94]:
llm = OpenAI(openai_api_key=OPENAI_API_KEY, temperature=0)

In [95]:
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=retriever,
    return_source_documents=True
)
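
With `chain_type='stuff'`, all retrieved chunks are placed into a single prompt and sent to the LLM in one call. The sketch below shows roughly what that assembly looks like. The prompt wording and the helper `build_stuff_prompt` are hypothetical and are not LangChain's actual template.

```python
def build_stuff_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved chunks
retrieved = ["The G20 is a forum of major economies.", "Japan hosted the 2019 summit."]
prompt = build_stuff_prompt("Who hosted the 2019 summit?", retrieved)
print(prompt)
```

Because everything goes into one prompt, this approach only works while the combined chunks fit in the model's context window, which is one reason `k` and `chunk_size` are kept modest here.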

In [98]:
query = "What is the main topic of discussion in the document and summarize it in 250 words?"
result_1 = rag_chain.invoke({"query": query})

# print("Answer:")
# print(result_1['result'])
display(Markdown(f"<b>{result_1}</b>"))

print("\nSource Documents:")
for doc in result_1['source_documents']:
    print(doc.page_content[:200] + '...')


<b>{'query': 'What is the main topic of discussion in the document and summarize it in 250 words?', 'result': "\nThe main topic of discussion in the document is the G20 Summit and its role in promoting global economic growth, protecting the planet, and cooperating with low-income and developing countries. The G20 is a forum of the world's largest economies, and its agenda has shifted towards the interests and priorities of developing countries. The G20 Summit in Hangzhou in 2016 resulted in an action plan and high-level principles document to help implement the 2030 Agenda for Sustainable Development. Japan hosted the 2019 summit, and the 2020 summit was held virtually due to the COVID-19 pandemic. The 2021 G20 Rome summit focused on addressing global economic challenges, development, climate change, and urgent issues like terrorism and refugees. The G7 recognized the need for a wider international partnership during the 2008 financial crisis, leading to the elevation of the G20 forum to the summit level. However, there have been tensions within the G20, such as Australia's proposal to ban Russia from the 2014 summit over its annexation of Ukrainian Crimea. The 2015 G20 Summit in Antalya, Turkey, resulted in the Antalya Action Plan and commitments to financial stability, tax regulation, and energy policy. Overall, the G20 Summit", 'source_documents': [Document(metadata={'source': '/content/data/Politics.txt'}, page_content="In 2016, the G20 framed its commitment to the 2030 Agenda, Sustainable Development Goals in three key themes; the promotion of strong sustainable and balanced growth; protection of the planet from degradation; and furthering co-operation with low-income and developing countries. 
At the G20 Summit in Hangzhou, members agreed on an action plan and issued a high level principles document to member countries to help facilitate the agenda's implementation.[34][35]\n\nJapan hosted the 2019 summit,[36] The 2020 summit was to be held in Saudi Arabia,[37] but was instead held virtually on 21–22 November 2020 due to the COVID-19 pandemic under the presidency of Saudi Arabia. 2021 G20 Rome summit which was held in Rome, the capital city of Italy, on 30–31 October 2021."), Document(metadata={'source': '/content/data/Politics.txt'}, page_content='The G7 recognised that they could not manage the 2008 financial crisis on their own and needed a wider international partnership, but one under their aegis. With this in mind, the G20 forum hitherto at the finance minister level was raised to the summit level. The G20 agenda is, however, shifting increasingly towards the interests and priorities of the developing countries (now being referred to as the Global South). During India’s G20 presidency, with India holding the Voice of the Global South summits before presiding over the G20 and at the conclusion of its work, and with the inclusion of the African Union as a G20 permanent member at India’s initiative, the pro-Global South content of the G20 agenda has got consolidated.'), Document(metadata={'source': '/content/data/Politics.txt'}, page_content='In March 2014, the former Australian foreign minister Julie Bishop, when Australia was hosting the 2014 G20 summit in Brisbane, proposed to ban Russia from the summit over its annexation of Ukrainian Crimea.[31] The BRICS foreign ministers subsequently reminded Bishop that "the custodianship of the G20 belongs to all Member States equally and no one Member State can unilaterally determine its nature and character."\n\nThe 2015 G20 Summit in Antalya, Turkey, focused on "Inclusiveness, Investment, and Implementation," gathering leaders to address global economic challenges, development, climate change, and urgent 
issues like terrorism and refugees. Key outcomes included the Antalya Action Plan and commitments to financial stability, tax regulation, and energy policy.[32][33]')]}</b>


Source Documents:
In 2016, the G20 framed its commitment to the 2030 Agenda, Sustainable Development Goals in three key themes; the promotion of strong sustainable and balanced growth; protection of the planet from deg...
The G7 recognised that they could not manage the 2008 financial crisis on their own and needed a wider international partnership, but one under their aegis. With this in mind, the G20 forum hitherto a...
In March 2014, the former Australian foreign minister Julie Bishop, when Australia was hosting the 2014 G20 summit in Brisbane, proposed to ban Russia from the summit over its annexation of Ukrainian ...


In [99]:
result_1

{'query': 'What is the main topic of discussion in the document and summarize it in 250 words?',
 'result': "\nThe main topic of discussion in the document is the G20 Summit and its role in promoting global economic growth, protecting the planet, and cooperating with low-income and developing countries. The G20 is a forum of the world's largest economies, and its agenda has shifted towards the interests and priorities of developing countries. The G20 Summit in Hangzhou in 2016 resulted in an action plan and high-level principles document to help implement the 2030 Agenda for Sustainable Development. Japan hosted the 2019 summit, and the 2020 summit was held virtually due to the COVID-19 pandemic. The 2021 G20 Rome summit focused on addressing global economic challenges, development, climate change, and urgent issues like terrorism and refugees. The G7 recognized the need for a wider international partnership during the 2008 financial crisis, leading to the elevation of the G20 forum to

## Implementing RAG using LlamaIndex and OpenAI

In [13]:
import os
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY

In [39]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, SimpleKeywordTableIndex
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

from IPython.display import Markdown, display

In [80]:
documents = SimpleDirectoryReader("data").load_data()

index = VectorStoreIndex.from_documents(documents)

In [84]:
query_engine = index.as_query_engine()
response = query_engine.query(query)
display(Markdown(f"<b>{response}</b>"))

<b>The main topic of discussion in the document is the G20 summits and their key outcomes and themes from various years, starting from the recognition of the impact of Travel & Tourism in 2012 to the more recent summits in 2021 and 2023. The document highlights how different G20 summits have focused on various global economic challenges, development issues, climate change, terrorism, and refugees. It also mentions the commitments made by G20 leaders towards financial stability, tax regulation, energy policy, and sustainable development goals. Additionally, it covers the presidency themes of different countries hosting the G20 summits, such as Indonesia's focus on global health architecture, digital transformations, and sustainable energy transitions in 2022, and India's theme of "One Earth, One Family, One Future" in 2023. The document also touches upon the inclusion of the African Union in the G20 and the launch of the G20 Social initiative by the Brazilian presidency to involve civil society in summit discussions. Overall, the document provides a comprehensive overview of the G20 summits, their themes, outcomes, and the evolving agenda under different presidencies.</b>

In [85]:
query_engine_2 = index.as_query_engine(similarity_top_k=4)
response_2 = query_engine_2.query(query)
display(Markdown(f"<b>{response_2}</b>"))

<b>The main topic of discussion in the document is the G20 summits and their evolution over the years. It covers the history of G20 summits, key outcomes and themes of various summits, the rotation of the G20 presidency among member nations, and the significance of the G20 in global governance. The document highlights how the G20 has become a platform for addressing global economic challenges, development issues, climate change, terrorism, and other urgent matters. It also mentions the role of the G20 in promoting sustainable development goals, cooperation with low-income countries, and facilitating global economic stability. Additionally, it discusses the expansion of the G20 agenda to include priorities of the Global South, such as climate change, debt restructuring, and regulation of global cryptocurrencies. The document emphasizes the importance of the G20 in bringing together leaders from major economies to collaborate on addressing pressing global issues and fostering inclusive and sustainable growth.</b>

In [None]:
retrieved_chunks = query_engine_2.retrieve(query)
retrieved_chunks


In [140]:
[info.text for info in retrieved_chunks]


['As a result of this meeting and The World Travel & Tourism Council\'s Visa Impact Research, later on the Leaders of the G20, convened in Los Cabos on 18–19 June, would recognise the impact of Travel & Tourism for the first time. That year, the G20 Leaders Declaration added the following statement: "We recognise the role of travel and tourism as a vehicle for job creation, economic growth and development, and, while recognizing the sovereign right of States to control the entry of foreign nationals, we will work towards developing travel facilitation initiatives in support of job creation, quality work, poverty reduction and global growth."[30]\r\n\r\nIn March 2014, the former Australian foreign minister Julie Bishop, when Australia was hosting the 2014 G20 summit in Brisbane, proposed to ban Russia from the summit over its annexation of Ukrainian Crimea.[31] The BRICS foreign ministers subsequently reminded Bishop that "the custodianship of the G20 belongs to all Member States equall

In [120]:
query_engine_2.get_prompts().keys()

dict_keys(['response_synthesizer:text_qa_template', 'response_synthesizer:refine_template'])

In [141]:
query_engine_2.get_prompts()['response_synthesizer:text_qa_template'].default_template.template

'Context information is below.\n---------------------\n{context_str}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {query_str}\nAnswer: '

In [136]:
query_engine_2.get_prompts()['response_synthesizer:refine_template'].conditionals

[(<function llama_index.core.prompts.utils.is_chat_model(llm: llama_index.core.base.llms.base.BaseLLM) -> bool>,
  ChatPromptTemplate(metadata={'prompt_type': <PromptType.CUSTOM: 'custom'>}, template_vars=['context_msg', 'query_str', 'existing_answer'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, message_templates=[ChatMessage(role=<MessageRole.USER: 'user'>, content="You are an expert Q&A system that strictly operates in two modes when refining existing answers:\n1. **Rewrite** an original answer using the new context.\n2. **Repeat** the original answer if the new context isn't useful.\nNever reference the original answer or context directly in your answer.\nWhen in doubt, just repeat the original answer.\nNew Context: {context_msg}\nQuery: {query_str}\nOriginal Answer: {existing_answer}\nNew Answer: ", additional_kwargs={})]))]

In [139]:
query_engine_2.get_prompts()['response_synthesizer:refine_template'].default_template.template

"The original query is as follows: {query_str}\nWe have provided an existing answer: {existing_answer}\nWe have the opportunity to refine the existing answer (only if needed) with some more context below.\n------------\n{context_msg}\n------------\nGiven the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.\nRefined Answer: "
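
The two default templates printed above suggest a create-then-refine strategy: answer from the first chunk with the QA template, then revise that answer once per additional chunk with the refine template. The sketch below illustrates that loop with a stub in place of the model. The names `refine_answer` and `fake_llm` are hypothetical, and the real response synthesizer may pack several chunks into one call rather than making one call per chunk.

```python
TEXT_QA_TEMPLATE = (
    "Context information is below.\n---------------------\n{context_str}\n"
    "---------------------\nGiven the context information and not prior knowledge, "
    "answer the query.\nQuery: {query_str}\nAnswer: "
)
REFINE_TEMPLATE = (
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer (only if needed) "
    "with some more context below.\n------------\n{context_msg}\n------------\n"
    "Given the new context, refine the original answer to better answer the query. "
    "If the context isn't useful, return the original answer.\nRefined Answer: "
)


def refine_answer(llm, query_str, chunks):
    """Answer from the first chunk, then refine the answer once per remaining chunk."""
    answer = llm(TEXT_QA_TEMPLATE.format(context_str=chunks[0], query_str=query_str))
    for chunk in chunks[1:]:
        answer = llm(
            REFINE_TEMPLATE.format(
                query_str=query_str, existing_answer=answer, context_msg=chunk
            )
        )
    return answer


prompts_seen = []


def fake_llm(prompt):
    """Stub model (an assumption for illustration): records the prompt, returns a numbered draft."""
    prompts_seen.append(prompt)
    return f"draft {len(prompts_seen)}"


final = refine_answer(fake_llm, "What is the G20?", ["chunk A", "chunk B", "chunk C"])
print(final)
```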

# LlamaIndex with FAISS

In [None]:
# !pip install llama-index-vector-stores-faiss

In [101]:
from llama_index.core import (
    SimpleDirectoryReader,
    load_index_from_storage,
    VectorStoreIndex,
    StorageContext,
)
from llama_index.vector_stores.faiss import FaissVectorStore

import faiss
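# Vector dimension; must match the embedding model's output size
# (1536 for OpenAI's text-embedding-ada-002)
d = 1536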
faiss_index = faiss.IndexFlatL2(d)
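
`IndexFlatL2` is an exact, brute-force index: it compares the query with every stored vector and ranks them by squared Euclidean distance. The pure-Python sketch below mirrors that behaviour on tiny two-dimensional vectors. The helper names and the data are hypothetical, and the real index works on contiguous float32 arrays and is far faster.

```python
def l2_squared(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(query, vectors, k):
    """Exhaustively score every stored vector and return the k nearest as (distance, index)."""
    distances = sorted((l2_squared(query, v), i) for i, v in enumerate(vectors))
    return distances[:k]


# Toy vectors standing in for d = 1536 dimensional embeddings
stored = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
hits = flat_l2_search([2.0, 2.0], stored, k=2)
print(hits)
```

A flat index needs no training and gives exact results, which suits a small corpus like this one; approximate index types only become worthwhile at much larger scales.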

In [102]:
documents = SimpleDirectoryReader("data").load_data()

In [103]:
vector_store = FaissVectorStore(faiss_index=faiss_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

In [104]:
index.storage_context.persist()

In [105]:
# load index from disk
vector_store = FaissVectorStore.from_persist_dir("./storage")
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir="./storage"
)
index = load_index_from_storage(storage_context=storage_context)

In [106]:
# Query the index that was reloaded from disk
query_engine = index.as_query_engine()
response_3 = query_engine.query(query)

In [107]:
display(Markdown(f"<b>{response_3}</b>"))

<b>The main topic of discussion in the document is the G20 summits and their key outcomes and themes from various years, starting from the recognition of the impact of Travel & Tourism in 2012 to the more recent summits in 2021 and 2022. The document highlights how different G20 summits focused on various global economic challenges, development issues, climate change, terrorism, and refugees. It also mentions the commitments made by G20 leaders regarding financial stability, tax regulation, energy policy, and sustainable development goals. Additionally, it covers the rotation system for selecting the G20 Presidency each year, where member nations negotiate among themselves based on groupings to determine the next G20 President. The document provides insights into the evolving agenda of the G20 summits, including the inclusion of the African Union in 2023 and the launch of the G20 Social platform by the Brazilian presidency to involve civil society in discussions and policy formulations. Overall, the document showcases the G20 summits as platforms for global cooperation and decision-making on critical issues impacting the world.</b>

# Leveraging the power of both LangChain and LlamaIndex

In [51]:
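from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from langchain_openai import ChatOpenAI

# LangChain chat model that will synthesize the answers
llm = ChatOpenAI(model="gpt-4o")

documents = SimpleDirectoryReader("data").load_data()

index = VectorStoreIndex.from_documents(documents)

# Pass the LangChain model to the query engine, which is where answers are generated
# (wrapping a LangChain model requires the llama-index-llms-langchain package)
query_engine = index.as_query_engine(llm=llm)

response = query_engine.query(query)

print(response)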

The main topic of discussion in the document is the G20 summits, including details about past summits, key outcomes, themes, and priorities of different presidencies.


In [None]:
response2 = query_engine.query(query)
display(Markdown(f"<b>{response2}</b>"))

In [53]:
display(Markdown(f"<b>{response2}</b>"))

<b>The main topic of discussion in the document is the G20 summits and their key outcomes and themes from various years, starting from the recognition of the impact of Travel & Tourism in 2012 to the more recent summits in 2021 and 2023. The document highlights how the G20 Leaders Declaration acknowledged the role of travel and tourism in job creation and economic growth, as well as the focus on issues like global economic challenges, development, climate change, terrorism, and refugees in later summits. It also mentions the commitment to the 2030 Agenda and Sustainable Development Goals, along with initiatives related to financial stability, tax regulation, and energy policy. Furthermore, it discusses the presidency themes of different countries hosting the G20 summits, such as Indonesia's focus on global health architecture, digital transformations, and sustainable energy transitions in 2022, and India's emphasis on a human-centric development approach addressing climate change, debt restructuring, and global cryptocurrencies in 2023. The document also touches upon the inclusion of the African Union in the G20 and the launch of G20 Social by the Brazilian presidency to involve civil society in summit discussions and policy formulations.</b>

# With prompt template

In [54]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from langchain_openai import ChatOpenAI

# Initialize the language model
llm = ChatOpenAI(model="gpt-4o")

# Load documents from the specified directory
documents = SimpleDirectoryReader("data").load_data()

# Create the vector store index from the documents
index = VectorStoreIndex.from_documents(documents)

# Create the query engine from the index, answering with the LangChain LLM
query_engine = index.as_query_engine(llm=llm)

# Define a prompt template
prompt_template = """
You are a helpful assistant. Based on the following document, please answer the question:

Document: {document}
Question: {query}

Answer:
"""

# Create a function to format and execute the query with the prompt template
def query_with_template(query, document):
    # Format the prompt with the document and the user query
    formatted_prompt = prompt_template.format(document=document, query=query)
    # Execute the query using the formatted prompt
    response = query_engine.query(formatted_prompt)
    return response

# Run the query through the template, using the first loaded document as an example
# query = "What is the main topic of discussion in the document?"
response = query_with_template(query, documents[0])

# Display the templated response
display(Markdown(f"<b>{response}</b>"))