# Task

This project compares the answers an OpenAI model gives with and without the additional context supplied through LangChain.

In [1]:
from langchain.llms import OpenAI

In [2]:
from langchain.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WikipediaLoader

In [3]:
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader

In [4]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

In [5]:
# Read the key, close the file, and strip any trailing newline.
with open("../../api_key.txt") as f:
    api_key = f.read().strip()

## Simple LLM Model

In [6]:
llm = OpenAI(openai_api_key=api_key)

In [7]:
print(llm('Who won the Fifa World Cup 2022'))



The 2022 FIFA World Cup is scheduled to take place in Qatar from 21 November to 18 December 2022. A winner has yet to be determined.


## LangChain Model with Document Loader

In [8]:
model = ChatOpenAI(openai_api_key=api_key)

In [9]:
def answer_question_about(topic, question):
    '''
    Load the Wikipedia article that best matches `topic` and pass it to the
    chat model as additional context for answering `question`.
    '''

    # Fetch only the top-matching article to keep the prompt short.
    loader = WikipediaLoader(query=topic, load_max_docs=1)
    data = loader.load()
    
    human_template = "Answer the following question\n{question}, here is some additional helpful context\n{document}"
    human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
    
    chat_prompt = ChatPromptTemplate.from_messages([human_message_prompt])
    
    model_request = chat_prompt.format_prompt(
        question=question,
        document=data[0].page_content
    ).to_messages()
    
    result = model(model_request)
    
    print(result.content)

In [10]:
answer_question_about("Fifa World Cup 2022","Who won the World Cup?")

Argentina won the 2022 FIFA World Cup.
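
The function above works by placing the loaded article directly into the prompt. The sketch below reproduces only that prompt-building step in plain Python, without calling Wikipedia or OpenAI, so the text sent to the model can be inspected. `build_prompt` and `max_chars` are hypothetical names introduced for illustration, and the truncation is an assumption for keeping long articles within a model's context limit, not something the notebook does.

```python
# Same wording as the human_template used in answer_question_about.
HUMAN_TEMPLATE = (
    "Answer the following question\n{question}, "
    "here is some additional helpful context\n{document}"
)


def build_prompt(question: str, document: str, max_chars: int = 2000) -> str:
    """Fill the template, keeping at most max_chars characters of context."""
    return HUMAN_TEMPLATE.format(question=question, document=document[:max_chars])


# Context sentence taken from the Wikipedia article shown in this notebook.
context = (
    "Argentina were crowned the champions after winning the final "
    "against the title holder France."
)
prompt = build_prompt("Who won the World Cup?", context)
print(prompt)
```

Because the answer is already present in the context, the model only has to read it back rather than recall it from training data.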


## LangChain with Vector Store

In [11]:
query = "Fifa World Cup 2022"
loader = WikipediaLoader(query=query, load_max_docs=1)
documents = loader.load()
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=500)
docs = text_splitter.split_documents(documents=documents)

In [12]:
docs

[Document(page_content="The 2022 FIFA World Cup was the 22nd FIFA World Cup, the world championship for national football teams organized by FIFA. It took place in Qatar from 20 November to 18 December 2022, after the country was awarded the hosting rights in 2010. It was the first World Cup to be held in the Arab world and Muslim world, and the second held entirely in Asia after the 2002 tournament in South Korea and Japan.This tournament was the last with 32 participating teams, with the number of teams being increased to 48 for the 2026 edition. To avoid the extremes of Qatar's hot climate, the event was held during November and December. It was held over a reduced time frame of 29 days with 64 matches played in eight venues across five cities. Qatar entered the event—their first World Cup—automatically as the host's national team, alongside 31 teams determined by the qualification process.\nArgentina were crowned the champions after winning the final against the title holder France
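
The splitting step can be pictured as greedily merging pieces of text until a size limit is reached. The sketch below is a simplified stand-in, not LangChain's implementation: it counts characters rather than tiktoken tokens, and the function name `split_into_chunks` and the `"\n"` separator are assumptions made for illustration.

```python
def split_into_chunks(text: str, chunk_size: int = 500, separator: str = "\n") -> list[str]:
    """Merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole rather than cut.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits (or nothing collected yet): keep growing this chunk.
            current = candidate
        else:
            # Adding the piece would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\nbbbb\ncccc"
chunks = split_into_chunks(sample, chunk_size=9)
print(chunks)
```

With a limit of 9 characters, the first two lines fit together and the third starts a new chunk.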

In [13]:
embedding_function = OpenAIEmbeddings(openai_api_key=api_key)

In [14]:
db = Chroma.from_documents(docs, embedding_function, persist_directory='./fifa_world_cup_2022')

In [15]:
db.persist()

### Asking Questions

In [16]:
retriever = db.as_retriever()

In [17]:
results = retriever.get_relevant_documents('Who was the captain of Argentina')

Number of requested results 4 is greater than number of elements in index 2, updating n_results = 2


In [18]:
print(results[0].page_content)

The 2022 FIFA World Cup was the 22nd FIFA World Cup, the world championship for national football teams organized by FIFA. It took place in Qatar from 20 November to 18 December 2022, after the country was awarded the hosting rights in 2010. It was the first World Cup to be held in the Arab world and Muslim world, and the second held entirely in Asia after the 2002 tournament in South Korea and Japan.This tournament was the last with 32 participating teams, with the number of teams being increased to 48 for the 2026 edition. To avoid the extremes of Qatar's hot climate, the event was held during November and December. It was held over a reduced time frame of 29 days with 64 matches played in eight venues across five cities. Qatar entered the event—their first World Cup—automatically as the host's national team, alongside 31 teams determined by the qualification process.
Argentina were crowned the champions after winning the final against the title holder France 4–2 on penalties followi
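
The retriever ranks stored chunks by how similar their embeddings are to the query embedding. The sketch below illustrates that idea with a toy bag-of-words vector and cosine similarity. It is an assumption-laden stand-in for `OpenAIEmbeddings` and Chroma, and `embed`, `cosine`, and `top_k` are hypothetical helper names. It also clamps `k` to the number of stored chunks, which is what the warning above reports: four results were requested but the index holds only two.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real embedding model returns dense vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    k = min(k, len(chunks))  # cannot return more chunks than are stored
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


stored = [
    "The tournament took place in Qatar from 20 November to 18 December 2022",
    "Argentina were crowned the champions after winning the final against France",
]
results = top_k("Who was the captain of Argentina", stored)
print(results[0])
```

In the notebook itself, passing `search_kwargs={"k": 2}` to `db.as_retriever` should request only two results and avoid the warning, assuming the installed LangChain version supports that argument.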

### Context Compression

In [19]:
compressor = LLMChainExtractor.from_llm(model)

In [20]:
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=db.as_retriever()
)
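
Contextual compression keeps only the parts of a retrieved chunk that bear on the query. `LLMChainExtractor` asks a language model to do this. The sketch below substitutes a much cruder keyword-overlap filter purely to show the shape of the operation. The function `compress` and its stop-word list are hypothetical, and the sample sentences are taken from the article text shown in this notebook.

```python
import re

# Small, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"who", "was", "the", "of", "a", "an", "in", "to"}


def compress(document: str, query: str) -> str:
    """Keep only the sentences that share a non-stop word with the query."""
    keywords = {w for w in re.findall(r"\w+", query.lower()) if w not in STOP_WORDS}
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    kept = [s for s in sentences if keywords & set(re.findall(r"\w+", s.lower()))]
    return " ".join(kept)


document = (
    "The 2022 FIFA World Cup was the 22nd FIFA World Cup. "
    "It took place in Qatar from 20 November to 18 December 2022. "
    "Argentine captain Lionel Messi was voted the tournament's best player, "
    "winning the Golden Ball."
)
compressed = compress(document, "Who was the captain of Argentina")
print(compressed)
```

Only the sentence mentioning the captain survives, so a downstream model receives far less irrelevant text than the full chunk.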

In [21]:
docs = db.similarity_search("Who was the captain of Argentina")
docs[0].page_content

Number of requested results 4 is greater than number of elements in index 2, updating n_results = 2


"The 2022 FIFA World Cup was the 22nd FIFA World Cup, the world championship for national football teams organized by FIFA. It took place in Qatar from 20 November to 18 December 2022, after the country was awarded the hosting rights in 2010. It was the first World Cup to be held in the Arab world and Muslim world, and the second held entirely in Asia after the 2002 tournament in South Korea and Japan.This tournament was the last with 32 participating teams, with the number of teams being increased to 48 for the 2026 edition. To avoid the extremes of Qatar's hot climate, the event was held during November and December. It was held over a reduced time frame of 29 days with 64 matches played in eight venues across five cities. Qatar entered the event—their first World Cup—automatically as the host's national team, alongside 31 teams determined by the qualification process.\nArgentina were crowned the champions after winning the final against the title holder France 4–2 on penalties follo

In [22]:
compressed_docs = compression_retriever.get_relevant_documents("Who was the captain of Argentina")
compressed_docs[0].page_content

Number of requested results 4 is greater than number of elements in index 2, updating n_results = 2


"Argentine captain Lionel Messi was voted the tournament's best player, winning the Golden Ball."