# Retrieval-Augmented Generation (RAG) with [LangChain](https://python.langchain.com/docs)

## 1. Install Dependencies

In [None]:
%pip install langchain openai pypdf chromadb pysqlite3-binary tiktoken

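# Chroma needs a newer SQLite than some environments ship with, so swap the
# standard-library sqlite3 module for the pysqlite3 build installed above.
__import__('pysqlite3')
import sys
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')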

## 2. Set API access

LangChain integrates with many model [providers](https://python.langchain.com/docs/integrations). The example below uses OpenAI models through [Azure](https://oai.azure.com/portal); replace the placeholder endpoint and key with your own values.

In [None]:
%env AZURE_OPENAI_ENDPOINT=https://<your-endpoint>.openai.azure.com/
%env AZURE_OPENAI_API_KEY=<YOUR_AZUREOPENAI_KEY>
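
Before running the later cells, it can help to confirm that both variables are actually set. The following is a minimal sketch and not part of LangChain: the helper name `missing_settings` is hypothetical, and it checks only the two variable names used above.

```python
import os

# The two variables set in the cell above.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def missing_settings(env, required=REQUIRED):
    """Return the required names that are absent or blank in `env` (hypothetical helper)."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All settings present.")
```

Passing the environment in as an argument keeps the check easy to try against a plain dictionary.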

## 3. Read data

In [None]:
from langchain.document_loaders import PyPDFLoader

# Load the PDF from the URL and split its text into chunks for embedding.
loader = PyPDFLoader("https://ihl-databases.icrc.org/assets/treaties/365-GC-I-EN.pdf")
pages = loader.load_and_split()
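
`load_and_split()` breaks the loaded document into smaller chunks before they are embedded. As a rough illustration of the idea only, here is a toy fixed-size splitter with overlap. LangChain's own splitter is more sophisticated, and the function name and sizes below are assumptions chosen for illustration.

```python
def split_with_overlap(text, chunk_size=20, overlap=5):
    """Cut `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters, so text near a chunk
    boundary also appears with some of its neighbouring context.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij']
```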


## 4. Create vector store retriever

In [None]:
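from langchain.vectorstores import Chroma
from langchain.embeddings import AzureOpenAIEmbeddings
from chromadb.config import Settings

# Embed every chunk and index the vectors in a Chroma collection
# created with Chroma's default client settings.
embeddings = AzureOpenAIEmbeddings()
vectorstore = Chroma.from_documents(
    documents=pages,
    embedding=embeddings,
    client_settings=Settings()
)
# Wrap the store so a chain can ask it for the chunks most similar to a query.
retriever = vectorstore.as_retriever()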
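
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The toy example below illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. It is not how Chroma is implemented; the vectors, the chunk labels, and the `top_k` helper are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the k texts whose vectors are most similar to `query_vec`."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up (text, vector) pairs standing in for embedded chunks.
store = [
    ("chunk A", [0.9, 0.1, 0.0]),
    ("chunk B", [0.1, 0.9, 0.1]),
    ("chunk C", [0.0, 0.1, 0.9]),
]

# A made-up query vector that points mostly along the first axis.
print(top_k([1.0, 0.2, 0.0], store, k=2))
# prints ['chunk A', 'chunk B']
```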

## 5. Create model

In [None]:
from langchain.llms import AzureOpenAI

# A low temperature makes the model's output less random.
temperature = 0.2
llm = AzureOpenAI(temperature=temperature)

## 6. Perform Q&A

In [None]:
from langchain.chains import RetrievalQA

# chain_type="stuff" puts all retrieved chunks into a single prompt for the model.
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
query = "What is the emblem of the Geneva Convention?"
qa.run(query)
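
The `"stuff"` chain type places all retrieved chunks into one prompt together with the question. Below is a minimal sketch of that assembly step. It assumes a made-up prompt template (LangChain's actual template differs) and a hypothetical `build_stuff_prompt` helper, and the chunk texts are placeholders rather than real retrieval results.

```python
def build_stuff_prompt(question, chunks):
    """Join all chunks into one context block and append the question (hypothetical helper)."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "What is the emblem of the Geneva Convention?",
    ["First retrieved chunk.", "Second retrieved chunk."],  # placeholder chunks
)
print(prompt)
```

Because every chunk goes into the same prompt, this approach only works while the combined text fits within the model's context window.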