## RAG with LangChain on a Local Ollama LLM Deployment
Author: **Peeyush Sharma**; Feedback: **PSharma3@gmail.com**


In [11]:
DOC_DIR = "../../../00_Data"
EMBEDDING_MODEL_NAME = "all-MiniLM-L6-v2"
LLM_MODEL_NAME = "llama3.2"

In [12]:
from langchain_community.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME)

In [13]:
from langchain_chroma import Chroma
import chromadb
collection_name = "rag_collection"

chroma_client = chromadb.PersistentClient()
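# Start from a clean collection: drop any copy left over from a previous run.
try:
    chroma_client.delete_collection(collection_name)
except Exception:
    # Nothing to delete yet (the exception type differs across chromadb versions).
    pass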

collection = chroma_client.get_or_create_collection(collection_name)

vector_store = Chroma(
    client=chroma_client,
    collection_name=collection_name,
    embedding_function=embeddings
)

In [14]:
import os
from uuid import uuid4
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the document

loader = PyPDFLoader(os.path.join(DOC_DIR, "BREIT-Prospectus-with-previous-supplements.pdf"))
documents = loader.load()

# Split the document into chunks
text_splitter = CharacterTextSplitter(chunk_size=4096, chunk_overlap=256, separator="\n")
docs = text_splitter.split_documents(documents=documents)
uuids = [str(uuid4()) for _ in range(len(docs))]

print(len(uuids))

1250
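
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a much simpler splitter than the one used above. The sketch below is a plain character sliding window, not LangChain's `CharacterTextSplitter` algorithm (which splits on the separator first and then merges pieces); the function name `sliding_chunks` and the sample string are hypothetical, chosen only for illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"  # hypothetical stand-in for a document
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one, so a sentence that straddles a boundary is still seen whole by at least one chunk.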


In [15]:
print(docs[10])

page_content='his knowledge and experience in internal and external risk oversight, and his experience as a member of the board of directors of five public 
REITs, including chairman of two. 
Field Griffith has been a director of the Company since July 2016. He also currently serves as a non-executive director on the board of The 
Forest Company Limited and as a director for the Prime Property Fund LLC, positions he has held since March 2017 and February 2018, 
respectively. Mr. Griffith was most recently employed full time as the Director of Real Assets Investments for the Virginia Retirement 
System from 2004 to 2016 where he was responsible for managing all aspects of the System’s global real estate, infrastructure and natural 
resource portfolios. The global real estate portfolio consisted of publicly- and privately-traded equity and debt investments in the form of 
separate accounts, joint ventures, closed-end funds and open-end funds. Mr. Griffith was also a member of the managem

In [16]:
vector_store.add_documents(documents=docs, ids=uuids)

['939f0788-90c8-4bab-983b-02c93a5719dd',
 '7cc2aa06-fcbd-400e-b30c-562a0fa203ee',
 'd3199709-8201-495a-a593-2cf83d255733',
 '4c323498-7f65-41de-b22c-60c0b151af71',
 '01ec74a8-81e6-423d-87d9-80af19930915',
 '7869d4db-1d40-4f26-950f-bc0c4c419eb5',
 'f8fa5ff3-75e0-49b8-8992-157e1e0f954c',
 '9bda8f11-d6bc-415b-96a7-81e7cbf4bf56',
 'e973ff1d-0fae-4a25-927a-3b8b28fb1749',
 '24fa7d09-4a42-4012-afad-1a233dd83255',
 'ec3c6716-43ad-409d-8959-e75a0cbabd32',
 '6df3a23e-c269-4d37-a092-b7303a9c51a5',
 'fe3cdc70-e73c-42eb-8b15-db04e8d669e6',
 'cbee0ab2-b02a-456a-b144-51c3f36df85e',
 'd5636fa5-cfcb-4747-8330-05df16bfc558',
 '12062c07-0206-44f4-84d3-004eb1c44fe3',
 'fec0e009-3c63-4877-8c12-d622b9117fbd',
 '06b71a55-2ad5-411c-8012-3fe37977bdc1',
 '5b40650c-b37a-4782-9002-8d86c0e8d2c4',
 '96d2bdb8-5d1d-40ea-a0ba-8d5481d64160',
 'eb9c5044-97c3-408b-915a-891a3483699c',
 '4072384e-8210-4215-aa6b-4996bd9e58df',
 '04dbe93b-e351-42e0-967c-598406ad9bef',
 'c985a19b-b4bd-4e20-9e78-243af6eaee63',
 '81ae474d-1fe5-

In [17]:
lst_tuple_doc_score = vector_store.similarity_search_with_score("What is the fund name?")
# Uncomment to inspect the retrieved chunks and their scores:
# for doc, score in lst_tuple_doc_score:
#     print(f"Score: {score}")
#     print(f"Text: {doc.page_content}")
#     print("==================")
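
With Chroma, the score returned by `similarity_search_with_score` is a distance, so a lower value means a closer match. The sketch below shows that ranking idea on toy data using plain Euclidean distance. It is an assumption-laden illustration: the two-dimensional vectors, the chunk labels, and the helper names `l2_distance` and `top_k` are all hypothetical, and the exact distance function Chroma applies depends on how the collection is configured.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


# Hypothetical 2-D "embeddings"; real ones from all-MiniLM-L6-v2 have many more dimensions.
toy_index = [
    ("chunk about the fund name", [0.9, 0.1]),
    ("chunk about directors", [0.1, 0.9]),
    ("chunk about fees", [0.5, 0.5]),
]

results = top_k([1.0, 0.0], toy_index, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The chunk whose vector lies nearest to the query vector comes first, which is the same ordering to expect from the list of `(document, score)` tuples above.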


In [18]:
from langchain.chains import RetrievalQA
from langchain_community.llms import Ollama

# Initialize the local model
llm = Ollama(model=LLM_MODEL_NAME)

# Create RetrievalQA
retriever = vector_store.as_retriever(search_type="similarity",
                search_kwargs={'k': 10})
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)

queries = [
    "What is the fund name?",
    "What is the jurisdiction of the fund?",
    "What is the investment strategy of the fund?",
    "What is the minimum investment threshold of the fund?",
]

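for query in queries:
    print(query)
    # invoke() replaces the deprecated run(); the answer is under the "result" key
    result = qa.invoke({"query": query})["result"]
    print(result)
    print("=============")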


What is the fund name?
The fund name is Blackstone Real Estate Income Trust, Inc. (BREIT).
What is the jurisdiction of the fund?
There is no specific information provided in the text about the jurisdiction of the fund. However, based on the context, it appears that the fund is likely organized under the laws of the state where Blackstone Real Assets Advisors LLC (the Advisor) is headquartered, which is likely Massachusetts or Delaware (common jurisdictions for private funds).
What is the investment strategy of the fund?
The investment strategy of the fund is to invest primarily in stabilized, income-generating commercial real estate in the United States. The fund aims to bring Blackstone's leading institutional-quality real estate investment platform to income-focused investors and conduct operations as a real estate investment trust (REIT) for U.S. federal income tax purposes.
What is the minimum investment threshold of the fund?
The text does not explicitly state the minimum investme
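
The `"stuff"` chain type used above places all retrieved chunks into a single prompt. A minimal sketch of that idea follows; the template wording, the function name `build_stuffed_prompt`, and the placeholder chunks are hypothetical and are not LangChain's actual prompt. The final lines compute the worst-case context size for the settings in this notebook (`k=10`, `chunk_size=4096`), which is worth comparing against the context window the local model is configured with.

```python
def build_stuffed_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one prompt ahead of the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Hypothetical placeholders standing in for retrieved document chunks.
retrieved = ["First retrieved chunk.", "Second retrieved chunk."]
prompt = build_stuffed_prompt("What is the fund name?", retrieved)
print(prompt)

# Upper bound on stuffed context for this notebook's retriever and splitter settings.
k, chunk_size = 10, 4096
print(k * chunk_size)  # 40960 characters
```

If answers look incomplete or ignore relevant passages, lowering `k` or `chunk_size` is one way to check whether the stuffed context is being cut off by the model's context limit.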