# 1. Simple RAG

<img src="https://miro.medium.com/v2/resize:fit:1400/1*J7vyY3EjY46AlduMvr9FbQ.png" height=300px>

In [6]:
!pip install --quiet \
    langchain \
    langchain-core \
    langchain-google-genai \
    langchain-groq \
    langchain-community \
    faiss-cpu \
    pypdf \
    PyMuPDF

### Importing necessary modules

In [7]:
import os
from langchain_groq import ChatGroq
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
import time

### Setting up everything

In [8]:
# Setting up the Groq and Google API keys
groq_api_key = ""
google_api_key = ""

# Initializing the model and prompt template
llm = ChatGroq(groq_api_key=groq_api_key, model_name="Llama3-8b-8192")
prompt = ChatPromptTemplate.from_template("""
Answer the questions based on the provided context only.
Please provide the most accurate response based on the question.
<context>
{context}
</context>
Question: {input}
""")
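
To make the two placeholders concrete, here is a minimal plain-Python sketch of how a template shaped like the one above gets filled: the retrieved chunks go into `{context}` and the user's question into `{input}`. The names `TEMPLATE` and `build_prompt` are hypothetical, and joining chunks with a blank line is an assumption about how the chain concatenates documents, not something this notebook configures.

```python
# Illustration only: a plain string template with the same two placeholders.
TEMPLATE = """Answer the question based on the provided context only.
<context>
{context}
</context>
Question: {input}"""


def build_prompt(chunks: list[str], question: str) -> str:
    """Fill the template with retrieved chunks and the user's question."""
    # Assumption: chunks are concatenated with a blank line between them.
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, input=question)


prompt_text = build_prompt(
    ["Chunk one.", "Chunk two."],
    "What are some climate-related risks?",
)
print(prompt_text)
```

Everything the model knows about the PDFs arrives through `{context}`, so the quality of the answer depends on which chunks the retriever selects.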


In [9]:
# Build a FAISS vector store from the PDFs in ./pdf
def vector_embedding():
    embeddings = GoogleGenerativeAIEmbeddings(
        model="models/embedding-001", google_api_key=google_api_key
    )
    loader = PyPDFDirectoryLoader("./pdf")  # Data ingestion
    docs = loader.load()  # One Document per PDF page
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)  # Chunk creation
    final_documents = text_splitter.split_documents(docs[:20])  # Split only the first 20 pages
    vectors = FAISS.from_documents(final_documents, embeddings)  # Embed the chunks and index them in FAISS
    return vectors
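
The effect of `chunk_size=1000` and `chunk_overlap=200` can be sketched with a simplified fixed-window splitter. This is not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break at separators such as paragraph and line breaks, so its chunks are usually shorter than `chunk_size`. The function name `chunk_text` is hypothetical and only illustrates how consecutive chunks share text.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# A 2500-character dummy string stands in for one PDF page.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.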

In [10]:

# Input
question = "What are some climate-related risks?"

vectors = vector_embedding()
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = vectors.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

start = time.process_time()
response = retrieval_chain.invoke({'input': question})
print("Answer:", response['answer'])

Answer: According to the provided context, some climate-related risks include:

* Many climate-related risks are higher than assessed in AR5, and projected long-term impacts are up to multiple times higher than currently observed (high confidence).
* Risks and projected adverse impacts and related losses and damages from climate change escalate with every increment of global warming (very high confidence).
* Climatic and non-climatic risks will increasingly interact, creating compound and cascading risks that are more complex and difficult to manage (high confidence).
* Compound heatwaves and droughts are projected to become more frequent, including concurrent events across multiple locations (high confidence).
* Adverse climate impacts can reduce the availability of financial resources by incurring losses and damages and through impeding national economic growth, thereby further increasing financial constraints for adaptation, particularly for developing and least developed countries 
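
Besides `answer`, the dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved documents under the `context` key, which helps when checking where an answer came from. The helper below is a hedged sketch: `summarize_sources` is a hypothetical name, the `source` and `page` metadata keys are what PDF loaders typically set, and the `fake_response` object (including its file name) is a made-up stand-in so the snippet runs without API keys.

```python
from types import SimpleNamespace


def summarize_sources(response: dict, preview_chars: int = 80) -> list[str]:
    """Return one line per retrieved chunk: source file, page, and a short preview."""
    lines = []
    for doc in response.get("context", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        # Collapse whitespace so the preview stays on one line.
        preview = " ".join(doc.page_content.split())[:preview_chars]
        lines.append(f"{source} (page {page}): {preview}")
    return lines


# Stand-in for a real chain response; the file name is hypothetical.
fake_response = {
    "input": "What are some climate related risks?",
    "answer": "...",
    "context": [
        SimpleNamespace(
            page_content="Risks escalate\nwith every increment of global warming.",
            metadata={"source": "pdf/report.pdf", "page": 14},
        )
    ],
}

for line in summarize_sources(fake_response):
    print(line)
```

With the real chain, `summarize_sources(response)` would be called on the `response` object from the cell above.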

### Testing on questions not clearly covered in the docs

In [11]:
question = "Who are Policymakers?"
response = retrieval_chain.invoke({'input': question})
print("Answer:", response['answer'])

Answer: Based on the provided context, Policymakers are not explicitly defined. However, it can be inferred that Policymakers refer to individuals or groups involved in making policies related to climate change, such as government officials, politicians, or decision-makers in international organizations.
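
Here the model infers an answer even though the context does not define the term. One way to notice such cases is to look at how close the retrieved chunks are to the question. The sketch below shows the idea with cosine similarity over toy two-dimensional vectors and a hypothetical `min_score` cutoff. It is an illustration of similarity search in general, not a description of what the FAISS index in this notebook does internally, and the vectors are invented rather than real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=4, min_score=0.0):
    """Return (chunk_index, score) for the k most similar chunks scoring at least min_score."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, s) for s, i in scored[:k] if s >= min_score]


# Toy vectors standing in for chunk embeddings.
toy_chunks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
hits = top_k(query, toy_chunks, k=2, min_score=0.5)
print(hits)
```

If no chunk clears the cutoff, the application could return "not found in the documents" instead of passing weak context to the model.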

