# RAG System Test Environment

### Load & Process Documents

In [None]:
# Step 1: load the PDF and split it into overlapping chunks
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = PyPDFLoader("knowledge-base/Company Profile.pdf")
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

In [2]:
len(chunks)

40
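
The 40 chunks above come from `chunk_size=500` and `chunk_overlap=50`. As a rough illustration of what those two parameters mean, here is a minimal sliding-window sketch. It is a simplification, not the library's algorithm: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and newlines, so its chunk boundaries will differ. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "x" * 1200
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])  # [500, 500, 300]
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.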

### Embed Chunks + Store in Vector DB (FAISS)

In [3]:
# Step 2: embed the chunks and store them in a FAISS vector index
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')
vectorstore = FAISS.from_documents(chunks, embeddings)
vectorstore.save_local("my_faiss_index")

In [4]:
vectorstore

<langchain_community.vectorstores.faiss.FAISS at 0x12fdf1060>
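
To make it clearer what the vector store does at query time, here is a small pure-Python sketch of nearest-neighbour search. It assumes an exhaustive (flat) comparison by Euclidean distance. The three-dimensional vectors and the names `l2_distance`, `nearest_chunks`, and `toy_index` are made up for illustration; the real embeddings from `all-MiniLM-L6-v2` have 384 dimensions and FAISS does the comparison in optimized native code.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_chunks(query_vec, indexed, k=4):
    """indexed is a list of (chunk_text, vector); return the k closest texts, nearest first."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy index: hypothetical chunk labels with hand-written vectors.
toy_index = [
    ("core values", [0.9, 0.1, 0.0]),
    ("office locations", [0.0, 0.8, 0.2]),
    ("payment products", [0.1, 0.1, 0.9]),
]
top = nearest_chunks([1.0, 0.0, 0.0], toy_index, k=2)
print(top)  # ['core values', 'payment products']
```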

### Querying (RAG Loop)

In [13]:
# Step 3: query the RAG chain (smoke test with a simple question)
from langchain.chains import RetrievalQA
from langchain_community.llms import Ollama

llm = Ollama(model="tinyllama")
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
response = qa_chain.run("What are the company core values?")
print(response)

Yes, here's the answer to the question "What are the company core values?" The company's core values are:

1. Integrity - We uphold transparency, honesty, and ethical practices in all our dealings.
2. Innovation - We continuously develop and deploy forward-thinking financial technologies.
3. Customer-Centricity - We prioritize our clients' needs and success in driving impact.
4. Security - We invest heavily in the protection of our clients' data and transaction records.
5. Collaboration - We foster strong partnerships internally and externally to drive impact.
6. Diversification and Inclusion - We strive for a diverse and inclusive culture that values global diversity.
7. Simplifying cross-border payment solutions through advanced technology, regulatory compliance, and exceptional customer support. 
8. Advance technologies and regulatory adherence to ensure the company's tech stack is scalable and resilient. The company also coordinates with legal teams to remain compliant with global 
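
`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which pastes all retrieved chunks into one prompt. The sketch below shows that idea only. The prompt wording and the function name `build_stuffed_prompt` are assumptions for illustration, not LangChain's actual template.

```python
def build_stuffed_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join the retrieved chunks into one context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical chunk texts standing in for retriever output.
prompt = build_stuffed_prompt(
    "What are the company core values?",
    ["Integrity: we uphold transparency.", "Innovation: we build new technology."],
)
print(prompt)
```

Because every retrieved chunk lands in the prompt, a small model such as `tinyllama` can blend unrelated chunks into its answer, which may explain why items 7 and 8 above read like mission text rather than core values.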

In [9]:
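# Step 4: measure how long retrieval and generation take
import time

query = "What is the name of the company?"

# Retrieval only: embed the query and look up the nearest chunks
start = time.perf_counter()
retrieved_docs = vectorstore.similarity_search(query)
print(f"Retrieval Time: {time.perf_counter() - start:.2f} sec")

# qa_chain.run() performs its own retrieval before calling the LLM,
# so this figure covers retrieval plus generation
start = time.perf_counter()
response = qa_chain.run(query)
print(f"Generation Time: {time.perf_counter() - start:.2f} sec")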

print("\nAnswer:", response)

Retrieval Time: 0.43 sec
Generation Time: 122.14 sec

Answer: The question asks for the name of a company. The correct answer is GlobalPay Financial Services, as stated in the given context.
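
The timing pattern above repeats the same start/stop bookkeeping for each stage. One possible way to tidy it is a small context manager built on `time.perf_counter`. The helper name `timed` and the `timings` dictionary are hypothetical, and `time.sleep` stands in for the real retrieval or generation call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the with-block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed calls are still timed.
        results[label] = time.perf_counter() - start


timings = {}
with timed("sleep", timings):
    time.sleep(0.05)  # placeholder for vectorstore.similarity_search(query), etc.

for label, seconds in timings.items():
    print(f"{label}: {seconds:.2f} sec")
```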


### UI with Streamlit

In [10]:
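# Note: Streamlit does not render inside a notebook; save this cell as a
# script and launch it with `streamlit run` (see the warning in the output)
import streamlit as st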

st.title("Free RAG Assistant")
query = st.text_input("Ask a question:")
if query:
    result = qa_chain.run(query)
    st.write(result)


2025-04-15 17:26:56.847 
  command:

    streamlit run /Users/user/RAG model/RAG-model/.venv/lib/python3.10/site-packages/ipykernel_launcher.py [ARGUMENTS]
2025-04-15 17:26:56.864 Session state does not function when running a script without `streamlit run`


### Logging Queries

In [None]:
# Step 5: log each user query to a file
import logging

logging.basicConfig(filename='queries.log', level=logging.INFO)
logging.info("User asked: %s", query)
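
`logging.basicConfig` does nothing if the root logger already has handlers, which can happen in a notebook session. A possible alternative is a dedicated logger with its own file handler and timestamps. This is a sketch under assumptions: the logger name `rag.queries`, the helper names, and the logged fields are hypothetical choices, and the demo writes to a temporary directory rather than the `queries.log` used above.

```python
import logging
import os
import tempfile


def make_query_logger(path="queries.log"):
    """Return a named logger that writes timestamped lines to path."""
    logger = logging.getLogger("rag.queries")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on cell re-runs
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_interaction(logger, query, answer, seconds):
    """Record the question, the answer length, and how long the call took."""
    logger.info("query=%r answer_chars=%d seconds=%.2f", query, len(answer), seconds)


# Demo using the question, answer, and timing from the cells above.
log_path = os.path.join(tempfile.mkdtemp(), "queries.log")
logger = make_query_logger(log_path)
log_interaction(logger, "What is the name of the company?", "GlobalPay Financial Services", 122.14)

with open(log_path, encoding="utf-8") as f:
    logged = f.read()
print(logged)
```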