# 🧠 Local RAG QA Bot (Ollama + LangChain + ChromaDB)
This notebook runs a local Retrieval-Augmented Generation (RAG) chatbot using open-source components. No API keys are needed, and no internet connection is required once the packages and models have been downloaded.

**Components used:**
- LangChain
- ChromaDB
- Ollama (local LLM runner)
- sentence-transformers for embeddings

**Tested on macOS (M-series & Intel) with Python 3.10+.**

In [1]:
!pip install -U langchain langchain-community langchain-chroma langchain-ollama langchain-text-splitters chromadb sentence-transformers pypdf




In [2]:
pip install langchain==0.0.307 langsmith==0.0.40


Collecting langsmith==0.0.40
  Downloading langsmith-0.0.40-py3-none-any.whl.metadata (10 kB)
Downloading langsmith-0.0.40-py3-none-any.whl (39 kB)
Installing collected packages: langsmith
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.4.37
    Uninstalling langsmith-0.4.37:
      Successfully uninstalled langsmith-0.4.37
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-classic 1.0.0 requires langsmith<1.0.0,>=0.1.17, but you have langsmith 0.0.40 which is incompatible.
langchain-community 0.4 requires langsmith<1.0.0,>=0.1.125, but you have langsmith 0.0.40 which is incompatible.
langchain-core 1.0.0 requires langsmith<1.0.0,>=0.3.45, but you have langsmith 0.0.40 which is incompatible.
Successfully installed langsmith-0.0.40
Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install -U langchain-ollama

Collecting langsmith<1.0.0,>=0.3.45 (from langchain-core<2.0.0,>=1.0.0->langchain-ollama)
  Using cached langsmith-0.4.37-py3-none-any.whl.metadata (14 kB)
Using cached langsmith-0.4.37-py3-none-any.whl (396 kB)
Installing collected packages: langsmith
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.0.40
    Uninstalling langsmith-0.0.40:
      Successfully uninstalled langsmith-0.0.40
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain 0.0.307 requires langsmith<0.1.0,>=0.0.40, but you have langsmith 0.4.37 which is incompatible.
Successfully installed langsmith-0.4.37
Note: you may need to restart the kernel to use updated packages.
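
The install logs above report conflicting `langsmith` requirements, so after restarting the kernel it can help to see which versions actually ended up installed. The following is a minimal sketch using only the standard library; `PACKAGES` and `installed_versions` are illustrative names, not part of the notebook, and the package list is simply copied from the install cells.

```python
from importlib import metadata

# Package names copied from the install cells; adjust to match your environment.
PACKAGES = [
    "langchain", "langchain-community", "langchain-chroma", "langchain-ollama",
    "langchain-text-splitters", "langsmith", "chromadb",
    "sentence-transformers", "pypdf",
]


def installed_versions(names):
    """Return {package name: installed version, or None if it is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so mismatched pins are easy to spot.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name:28s} {version or 'not installed'}")
```

If `langchain` and `langsmith` show versions from very different release lines, the pins in the install cells are probably fighting each other.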


In [2]:
!pip install -U langchain==0.0.307 langchain-community langchain-chroma langchain-ollama langchain-text-splitters chromadb sentence-transformers pypdf
!pip install langsmith==0.0.40


Collecting langsmith<0.1.0,>=0.0.40 (from langchain==0.0.307)
  Using cached langsmith-0.0.92-py3-none-any.whl.metadata (9.9 kB)
INFO: pip is looking at multiple versions of langchain-community to determine which version is compatible with other requirements. This could take a while.
Collecting langchain-community
  Using cached langchain_community-0.4-py3-none-any.whl.metadata (3.0 kB)
  Using cached langchain_community-0.3.31-py3-none-any.whl.metadata (3.0 kB)
  Using cached langchain_community-0.3.30-py3-none-any.whl.metadata (3.0 kB)
  Using cached langchain_community-0.3.29-py3-none-any.whl.metadata (2.9 kB)
INFO: pip is still looking at multiple versions of langchain-community to determine which version is compatible with other requirements. This could take a while.
  Using cached langchain_community-0.3.28-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-core<1.0.0,>=0.3.74 (from langchain-community)
  Downloading langchain_core-0.3.79-py3-none-any.whl.metadata (3.2 kB)
C

In [3]:
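from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_community.llms import Ollama

# RetrievalQA lives in langchain.chains on 0.x releases; LangChain 1.0 moved
# the legacy chains into the separate langchain-classic package.
try:
    from langchain.chains import RetrievalQA
except ImportError:
    from langchain_classic.chains import RetrievalQA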




## Step 1 — Load your document(s)
Enter the path to a PDF or `.txt` file when prompted. The cell below loads one file per run.

In [5]:
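# Strip whitespace and any surrounding quotes (common when a path is pasted in).
file_path = input('Enter path to your document (PDF or TXT): ').strip().strip('\'"')

# Compare the extension case-insensitively so 'Report.PDF' is accepted too.
lowered = file_path.lower()
if lowered.endswith('.pdf'):
    loader = PyPDFLoader(file_path)
elif lowered.endswith('.txt'):
    loader = TextLoader(file_path)
else:
    raise ValueError('Please provide a .pdf or .txt file.')

# PyPDFLoader returns one Document per page; TextLoader returns a single Document.
documents = loader.load()
print(f'✅ Loaded {len(documents)} document(s) from {file_path}')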

ValueError: Please provide a .pdf or .txt file.
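
To index several files rather than one, a possible first step is to gather every supported file from a folder. This is a sketch under assumptions: `collect_supported_files` and `SUPPORTED_SUFFIXES` are hypothetical names introduced here, and only files directly inside the folder are considered.

```python
from pathlib import Path

# Extensions the loader cell knows how to handle.
SUPPORTED_SUFFIXES = {".pdf", ".txt"}


def collect_supported_files(folder):
    """Return the sorted .pdf/.txt files directly inside `folder`.

    The suffix check is case-insensitive, so 'Report.PDF' is included.
    """
    root = Path(folder).expanduser()
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Each returned path could then be handed to `PyPDFLoader` or `TextLoader` as in the cell above, extending a single `documents` list with every `loader.load()` result.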

## Step 2 — Split into chunks & create embeddings

In [None]:
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = splitter.split_documents(documents)
print(f'✅ Split into {len(texts)} chunks')

embedding_model = SentenceTransformerEmbeddings(model_name='all-MiniLM-L6-v2')
vectordb = Chroma.from_documents(texts, embedding=embedding_model, persist_directory='db')
retriever = vectordb.as_retriever(search_kwargs={'k': 3})
print('✅ Vector database built and ready!')
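
To build intuition for `chunk_size=1000` and `chunk_overlap=200`, here is a simplified fixed-window splitter in plain Python. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that class prefers to break on separators such as paragraphs and spaces, so its chunk boundaries will differ); `sliding_chunks` is a hypothetical helper that only illustrates how overlap makes consecutive chunks share text.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split `text` into windows of `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so neighbours share `chunk_overlap` characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# A 2500-character text with the notebook's settings yields three windows.
print([len(c) for c in sliding_chunks("x" * 2500)])  # → [1000, 1000, 900]
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.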

## Step 3 — Initialize Local Ollama Model

In [None]:
model_name = input('Enter Ollama model name (default: llama3): ').strip() or 'llama3'
llm = Ollama(model=model_name)
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
print(f'✅ Using local Ollama model: {model_name}')
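
Creating the `Ollama` object does not by itself confirm that the model is available; a missing model typically only surfaces when the first question is asked. The sketch below checks up front. It assumes the Ollama server listens on its default local address and that its `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` field; `fetch_tags` and `model_is_available` are hypothetical helpers.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama is running locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Download the list of locally pulled models as a dict."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def model_is_available(tags_payload, model_name):
    """Return True if `model_name` appears in an /api/tags style payload.

    A bare name such as 'llama3' is treated as 'llama3:latest'; a name
    that already carries a tag must match exactly.
    """
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    wanted = model_name if ":" in model_name else f"{model_name}:latest"
    return wanted in names


# Example usage (requires a running Ollama server, so it is left commented out):
# if not model_is_available(fetch_tags(), "llama3"):
#     print("Model not found locally; try `ollama pull llama3` in a terminal.")
```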

## Step 4 — Ask Questions

In [None]:
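print('💬 Ask questions about your document. Type exit, quit, or q to stop.')
while True:
    query = input('Question: ').strip()
    if query.lower() in ['exit', 'quit', 'q']:
        print('👋 Goodbye!')
        break
    if not query:
        continue  # ignore empty input instead of sending it to the model
    # RetrievalQA expects its input under 'query' and returns the answer under 'result'.
    answer = qa_chain.invoke({'query': query})['result']
    print('Answer:', answer)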
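
Conceptually, each question goes through two steps: the retriever picks the `k` most similar chunks (`k=3` in Step 2), and the chain places those chunks into one prompt for the model. The sketch below mimics both steps in plain Python. Cosine similarity is used for illustration only (the metric Chroma applies depends on how the collection is configured), the prompt wording is made up rather than LangChain's actual template, and `cosine`, `top_k`, and `build_prompt` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=3):
    """Return the indices of the k chunks most similar to the query, best first."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks):
    """Place the retrieved chunks into a single prompt ahead of the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Because all retrieved chunks share one prompt, raising `k` or `chunk_size` increases the amount of text the model has to read for every question.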