# Retrieval Augmented Generation

### Setup

Load the Weaviate instance used for document retrieval by vector similarity search.
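As a rough illustration of what "vector similarity search" means here, the pure-Python sketch below ranks a few toy embeddings by cosine similarity to a query vector and keeps the top `k`. The vectors, the chunk descriptions, and the `top_k` helper are made up for illustration; Weaviate computes and indexes embeddings itself and does not use this code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=5):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
toy_store = [
    ("chunk about national service", [0.9, 0.1, 0.0]),
    ("chunk about the European Union", [0.1, 0.9, 0.1]),
    ("chunk about taxation", [0.0, 0.2, 0.9]),
]

hits = top_k([1.0, 0.0, 0.0], toy_store, k=2)
print([text for text, _ in hits])
```

A real embedding model produces vectors with hundreds or thousands of dimensions, but the ranking idea is the same.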

In [2]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chains.query_constructor.base import AttributeInfo

from langchain_community.chat_models import ChatOpenAI
from langchain_community.llms import OpenAI

from manifesto_qa.app import GENERATIVE_MODEL, vector_db

In [2]:
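# The number of results is passed through search_kwargs rather than as a bare k argument.
retriever = vector_db.instance.as_retriever(search_type="similarity", search_kwargs={"k": 5})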
llm = ChatOpenAI(model=GENERATIVE_MODEL, temperature=0)

### Prompt template

In [4]:
template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
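To make the two placeholders concrete, here is a minimal sketch using plain `str.format` instead of `ChatPromptTemplate`. The shortened `TEMPLATE`, the `format_docs` helper, and the example chunks are assumptions for illustration; joining chunk texts into a single string is one common way to build `{context}`, not necessarily what this notebook's chain does.

```python
# Abbreviated stand-in for the notebook's template (same two placeholders).
TEMPLATE = "Question: {question}\nContext: {context}\nAnswer:\n"

def format_docs(chunk_texts):
    """Join retrieved chunk texts into one context string."""
    return "\n\n".join(chunk_texts)

def fill_prompt(question, chunk_texts):
    """Substitute the question and the joined chunks into the template."""
    return TEMPLATE.format(question=question, context=format_docs(chunk_texts))

filled = fill_prompt("What is the policy on X?", ["First chunk.", "Second chunk."])
print(filled)
```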

### Construct the RAG chain

In [4]:
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
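The data flow through the piped chain can be sketched with ordinary functions. Everything below is a stand-in: `fake_retriever` and `fake_llm` are hypothetical stubs, and `run_chain` only imitates how each step's output becomes the next step's input. It is not how LangChain implements runnables.

```python
def fake_retriever(question):
    """Stand-in for the vector store retriever (hypothetical stub)."""
    return ["retrieved chunk 1", "retrieved chunk 2"]

def fan_out(question):
    # Mirrors {"context": retriever, "question": RunnablePassthrough()}:
    # both entries receive the same input; the passthrough returns it unchanged.
    return {"context": fake_retriever(question), "question": question}

def render_prompt(inputs):
    """Stand-in for the prompt template step."""
    return f"Question: {inputs['question']}\nContext: {inputs['context']}\nAnswer:"

def fake_llm(prompt_text):
    """Stand-in for the chat model; returns a message-like dict."""
    return {"content": f"(model reply to a prompt of {len(prompt_text)} characters)"}

def parse_output(message):
    """Stand-in for StrOutputParser: keep only the text content."""
    return message["content"]

def run_chain(question, steps):
    """Feed the output of each step into the next one."""
    value = question
    for step in steps:
        value = step(value)
    return value

answer = run_chain("example question", [fan_out, render_prompt, fake_llm, parse_output])
print(answer)
```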

In [5]:
rag_chain.invoke("What is conservative policy on national service?")

'The conservative policy on national service involves reinventing it for the 21st century to give young people valuable life skills and build a stronger national culture. National service will be compulsory for every 18-year-old, with a choice between civic service or military service. The military service option will be competitive and paid to ensure recruitment of the brightest and best for the armed forces.'

### Using self query

In [9]:
self_query_retriever = SelfQueryRetriever.from_llm(
    llm=OpenAI(model="gpt-3.5-turbo-instruct", temperature=0),
    vectorstore=vector_db.instance,
    document_contents="Manifestos",
    metadata_field_info=[
        AttributeInfo(
            name="source",
            description="The manifesto PDF the chunk is from",
            type="string",
        ),
        AttributeInfo(
            name="page",
            description="The page from the manifesto",
            type="integer",
        ),
    ],
    verbose=True,
    search_type="similarity",
    search_kwargs={"k": 5},
)
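Conceptually, a self-query retriever asks the LLM to split the user's question into a semantic query plus a metadata filter over the declared attributes (`source` and `page` here). The sketch below imitates only the filtering half in plain Python. The `Chunk` class, the file names, and the hand-written `structured` dict are hypothetical; the real retriever builds its own structured query object and translates it into the vector store's filter syntax.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str
    page: int

def apply_filter(chunks, filters):
    """Keep chunks whose metadata equals every (field, value) pair in filters."""
    return [
        chunk for chunk in chunks
        if all(getattr(chunk, field) == value for field, value in filters.items())
    ]

# Hypothetical chunks with the two metadata fields declared above.
chunks = [
    Chunk("Text about the EU.", source="reform.pdf", page=12),
    Chunk("Text about national service.", source="conservative.pdf", page=3),
    Chunk("More text about the EU.", source="conservative.pdf", page=40),
]

# Hand-written example of what the LLM step might produce for a question
# such as "What is reform's policy on the european union?".
structured = {"query": "policy on the European Union", "filter": {"source": "reform.pdf"}}

candidates = apply_filter(chunks, structured["filter"])
print([chunk.text for chunk in candidates])
```

Similarity search on `structured["query"]` would then run only over the filtered candidates.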

In [10]:
llm = ChatOpenAI(model=GENERATIVE_MODEL, temperature=0)

In [11]:
rag_chain = (
    {"context": self_query_retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model=GENERATIVE_MODEL, temperature=0)
    | StrOutputParser()
)
rag_chain.invoke("What is reform's policy on the european union?")

"Reform's policy on the European Union includes legislating to scrap EU regulations, leaving the European Convention on Human Rights, and preparing for renegotiations on the EU Trade and Cooperation Agreement. They aim to ensure British laws and judges are not overruled by foreign courts and to protect the country's sovereignty from EU influence."

### Retrieval with sources

In [39]:
chain = RetrievalQAWithSourcesChain.from_llm(
    llm=ChatOpenAI(model=GENERATIVE_MODEL, temperature=0),
    retriever=self_query_retriever,
    question_prompt=prompt,
)
results = chain.invoke(
    {"question": "What is reform's policy on the european union?"},
    return_only_outputs=False,
)

In [36]:
print(f"{results['answer']}\nSources: {results['sources']}")

Reform's policy on the European Union includes leaving the European Convention on Human Rights, ensuring British laws and judges are not overruled by foreign courts, and protecting UK courts from EU arrest warrants. Additionally, the policy involves seeking independence for Britain's Armed Forces, renegotiating the EU Trade and Cooperation Agreement, and abandoning the Windsor Framework.

Sources: /Users/longbe01/Documents/projects/llm-rag/data/Reform_UK_Contract_With_The_People.pdf


In [40]:
results.keys()

dict_keys(['question', 'answer', 'sources'])
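The `sources` value above is printed as a full file path. A small helper like the following could tidy it for display. This is a sketch under the assumption that `sources` is a string of one or more comma-separated POSIX-style paths; the `source_names` helper and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

def source_names(sources):
    """Split a comma-separated sources string into unique file names, in order."""
    names = []
    for part in sources.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty fragments such as a trailing comma
        name = PurePosixPath(part).name
        if name not in names:
            names.append(name)
    return names

# Hypothetical result dict with the same keys as the chain output above.
example = {
    "question": "example question",
    "answer": "example answer",
    "sources": "/data/a.pdf, /data/b.pdf, /data/a.pdf",
}
print(source_names(example["sources"]))
```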

### Including memory