# 🔧 Step 1: Install Ollama

**What is Ollama?**
- Local model server (runs on your machine), used here to generate embeddings
- With the nomic-embed-text model, converts text into 768-dimensional vectors
- Completely free, no API costs

**What this cell does:**
- Installs Ollama software (~2 minutes)
- Detects GPU if available
- Sets up the server infrastructure
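**Optional check (sketch):** After the install cell finishes, it can help to confirm that the `ollama` executable is actually on the PATH before moving on. The helper names below (`ollama_binary_path`, `ollama_version`) are made up for this sketch and are not part of Ollama itself; the only Ollama command used is `ollama --version`.

```python
import shutil
import subprocess

def ollama_binary_path():
    """Return the full path of the `ollama` executable, or None if it is not on PATH."""
    return shutil.which("ollama")

def ollama_version():
    """Return the text printed by `ollama --version`, or None when Ollama is missing."""
    path = ollama_binary_path()
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # some builds print warnings on stderr, so fall back to it if stdout is empty
    return (result.stdout or result.stderr).strip()

version = ollama_version()
print(version if version else "ollama not found on PATH; re-run the install cell")
```

If this prints the "not found" message, the install script most likely failed and its output is worth reading before continuing.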




In [34]:
!sudo apt update
!sudo apt-get install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh
print("✅ ollama installed successfully")


Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:2 https://cli.github.com/packages stable InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:4 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:5 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:6 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:7 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:8 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:10 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:11 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
57 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: Skipping acquire

# 🚀 Step 2: Start Ollama Server

**Why do we need this?**
- Ollama must be running before we can generate embeddings
- This starts a background server on port 11434
- Server stays active until you restart the runtime

**What happens:**
- Server starts in background
- Waits 5 seconds to initialize
- Ready to generate embeddings
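**Alternative to the fixed wait (sketch):** A fixed 5-second sleep can be too short on a slow runtime and too long on a fast one. One option is to poll the server until it answers. The helpers `http_probe` and `wait_until_ready` are hypothetical names for this sketch; the URL assumes the default port 11434 and the `/api/tags` endpoint that the troubleshooting cell later in this notebook also uses.

```python
import time
import urllib.error
import urllib.request

def http_probe(url, timeout=2.0):
    """Return True if an HTTP GET to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused, timeout, or a non-2xx response
        return False

def wait_until_ready(probe, attempts=10, delay=1.0):
    """Call `probe()` until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False

# Usage in the notebook, after `ollama serve` has been started:
# ready = wait_until_ready(lambda: http_probe("http://localhost:11434/api/tags"))
# print("server ready" if ready else "server did not respond in time")
```

Passing the probe in as a function keeps the retry loop independent of Ollama, so the same loop can be reused for other background services.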




In [35]:
import subprocess
import time

def start_ollama_server():
    subprocess.Popen(['ollama', 'serve'],
                     stdout=subprocess.DEVNULL,
                     stderr=subprocess.DEVNULL)
    print("🔄 ollama server starting...")
    time.sleep(5)
    print("✅ ollama server is running on port 11434")

start_ollama_server()


🔄 ollama server starting...
✅ ollama server is running on port 11434


# 📥 Step 3: Download Embedding Model

**What is nomic-embed-text?**
- 768-dimensional embedding model
- 274MB size
- Optimized for semantic search
- Converts text → numbers

**Why embeddings?**
- Enables similarity search
- "machine learning" and "ML" have similar embeddings
- Comparing vectors numerically (for example, by cosine similarity) makes semantic search fast
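**Worked example (sketch):** To make "similar embeddings" concrete, here is cosine similarity computed by hand. The three vectors are invented 3-dimensional toy values, not real model output (real nomic-embed-text vectors have 768 values), and `cosine_similarity` is a helper written for this illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

# toy vectors, made up for illustration only
vec_machine_learning = [0.9, 0.1, 0.3]
vec_ml = [0.8, 0.2, 0.3]
vec_cooking = [0.1, 0.9, 0.0]

print("machine learning vs ML:     ", cosine_similarity(vec_machine_learning, vec_ml))
print("machine learning vs cooking:", cosine_similarity(vec_machine_learning, vec_cooking))
```

The first pair points in nearly the same direction and scores close to 1, while the unrelated pair scores much lower.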



In [36]:
# download the nomic-embed-text model for generating embeddings
# this is a 274 MB model optimized for semantic search
# it only needs to download once; after that it's cached locally
!ollama pull nomic-embed-text

print("embedding model downloaded")


embedding model downloaded


# 📦 Step 4: Install Python Packages

**What we're installing:**
- **LangChain:** Framework for building RAG systems
- **langchain-groq:** Fast LLM inference via Groq API
- **ChromaDB:** Vector database for storing embeddings
- **PyPDF2:** Extract text from PDF files
- **Streamlit:** Web interface for the chatbot
- **pyngrok:** Create public URL for Colab
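**Optional check (sketch):** Because the install cell runs pip in quiet mode, a failed install is easy to miss. The snippet below lists which of the packages are installed and at what version, using the standard-library `importlib.metadata`. The function name `installed_versions` is a hypothetical helper for this sketch.

```python
from importlib import metadata

def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# the same distribution names that the install cell passes to pip
packages = [
    "langchain", "langchain-community", "langchain-groq",
    "langchain-text-splitters", "langchain-chroma", "chromadb",
    "PyPDF2", "streamlit", "pyngrok",
]

for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any line that reports `NOT INSTALLED` points to the package that needs reinstalling before the import cell will work.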



In [37]:
# install all the packages needed for the rag system
# langchain is the main framework for building llm applications
# langchain-community has integrations like ollama and chroma
# langchain-groq lets us use groq api for fast llm inference
# langchain-text-splitters handles chunking documents
# langchain-chroma and chromadb are for vector storage
# pypdf2 extracts text from pdf files
!pip install -qU \
    langchain \
    langchain-community \
    langchain-groq \
    langchain-text-splitters \
    langchain-chroma \
    chromadb \
    PyPDF2 \
    streamlit \
    pyngrok

print("✅ all packages installed successfully")


✅ all packages installed successfully


# 📚 Step 5: Import Libraries

**What we're importing:**
- Embedding and LLM classes
- Vector database tools
- Text processing utilities
- PDF readers

**If this fails:** The package installation in Step 4 had an error; re-run that cell first


In [38]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_groq import ChatGroq
from langchain_chroma import Chroma
from langchain_text_splitters import CharacterTextSplitter
from PyPDF2 import PdfReader
from google.colab import userdata, drive

print("✅ libraries imported successfully")


✅ libraries imported successfully


# 🔑 Step 6: Test Groq API Key

**What is Groq?**
- Ultra-fast LLM inference service
- 500+ tokens/second speed
- Free tier: 14,400 requests/day

**This cell verifies:**
- Your API key is valid
- Can connect to Groq servers
- LLM responds correctly

**If this fails:**
1. Get key from console.groq.com/keys
2. Add to Colab Secrets as `GROQ_API_KEY`
3. Make sure it starts with `gsk_`
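**Offline pre-check (sketch):** The checklist above can be partly automated before any network call is made. `check_groq_key` and `mask_key` are hypothetical helpers written for this notebook; they only test the `gsk_` prefix mentioned above and cannot tell whether the key is actually valid on Groq's side.

```python
def check_groq_key(key):
    """Return a list of problems with the key string; an empty list means it looks plausible."""
    if not key:
        return ["key is empty: add it to Colab Secrets as GROQ_API_KEY"]
    problems = []
    if key != key.strip():
        problems.append("key has leading or trailing whitespace")
    if not key.strip().startswith("gsk_"):
        problems.append("key does not start with gsk_")
    return problems

def mask_key(key, visible=4):
    """Keep only the first `visible` characters so the key is safe to print or share."""
    return key[:visible] + "*" * max(len(key) - visible, 0)

# example with a placeholder string, not a real key
example = "gsk_placeholder"
print(mask_key(example), check_groq_key(example))
```

Printing a masked key instead of a long preview also avoids leaking most of the secret into saved notebook output.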


In [39]:
from google.colab import userdata
from langchain_groq import ChatGroq

groq_key = userdata.get('GROQ_API_KEY').strip()

print("🔍 checking your groq api key:")
print(f"   length: {len(groq_key)} characters")
print(f"   preview: {groq_key[:10]}...{groq_key[-6:]}")

llm = ChatGroq(
    groq_api_key=groq_key,
    model_name="llama-3.3-70b-versatile",
    temperature=0
)

try:
    response = llm.invoke("say hello in one word")
    print(f"✅ groq test successful! response: {response.content}")
except Exception as e:
    print(f"❌ error with groq: {str(e)}")
    print("\nif this fails check:")
    print("   1. copied entire key from console.groq.com/keys")
    print("   2. key starts with gsk_")
    print("   3. added to colab secrets as GROQ_API_KEY")


🔍 checking your groq api key:
   length: 56 characters
   preview: gsk_FpWrcH...Spkgiq
✅ groq test successful! response: Hello


# ⚙️ Step 7: Initialize Both Models

**What this does:**
- Sets up Ollama for embeddings (local)
- Sets up Groq for LLM responses (cloud)
- Tests both to make sure they work

**Quick test:**
- Generates a test embedding (768 numbers)
- Gets a test response from Groq

**You'll see:**
- Embedding dimension: 768
- Groq response: "ok"


In [40]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_groq import ChatGroq
from google.colab import userdata

# setup ollama embeddings
embedding = OllamaEmbeddings(
    base_url="http://localhost:11434",
    model="nomic-embed-text"
)

# setup groq llm
groq_key = userdata.get('GROQ_API_KEY').strip()
llm = ChatGroq(
    groq_api_key=groq_key,
    model_name="llama-3.3-70b-versatile",
    temperature=0
)

print("✅ ollama and groq initialized\n")

# test ollama embeddings
test_vec = embedding.embed_query("harrypotter")
print(f"🔢 ollama test: generated {len(test_vec)}-dimensional embedding")
print(f"📊 first 10 values: {test_vec[:10]}")

# test groq llm
test_response = llm.invoke("say ok")
print(f"\n💬 groq test: {test_response.content}")


✅ ollama and groq initialized

🔢 ollama test: generated 768-dimensional embedding
📊 first 10 values: [-0.6045737266540527, 0.39139172434806824, -3.717906951904297, -0.23124998807907104, 0.6087762117385864, 0.7964234948158264, -0.8758982419967651, 0.7451996803283691, -0.7282124161720276, 0.14450234174728394]

💬 groq test: ok


# 📄 Step 8: Test PDF Processing with Visual Outputs

**What RAG needs:**
1. **Extract text** from PDF
2. **Split into chunks** (800 chars each with 200 overlap)
3. **Generate embeddings** for each chunk
4. **Store in vector database**

**Why chunking?**
- LLMs have token limits
- Smaller chunks = more precise retrieval
- Overlap preserves context across boundaries

**This cell shows you:**
- How many chunks were created
- Example chunks from your PDF
- Example embeddings (the actual numbers!)
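**How overlap works (sketch):** The function below is a deliberately simplified fixed-window splitter that shows what "800 chars with 200 overlap" means. It is not how `CharacterTextSplitter` is implemented: that class first splits on the separator and then merges pieces up to the size limit, so its chunk boundaries fall on line breaks. `chunk_text` is a hypothetical name for this illustration.

```python
def chunk_text(text, chunk_size=800, chunk_overlap=200):
    """Cut `text` into windows of `chunk_size` chars, each starting
    `chunk_size - chunk_overlap` chars after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

# synthetic 2000-character text, just to show the window arithmetic
sample = "".join(str(i % 10) for i in range(2000))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
# the last 200 chars of one chunk reappear at the start of the next
print(chunks[0][-200:] == chunks[1][:200])
```

With a step of 600 characters, a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.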


In [41]:
from google.colab import drive
from PyPDF2 import PdfReader
from langchain_text_splitters import CharacterTextSplitter
from langchain_chroma import Chroma
import os
import shutil

db_path = "/content/chroma_db"
if os.path.exists(db_path):
    shutil.rmtree(db_path)
print("✅ database directory ready\n")

# mount drive
drive.mount('/content/drive')

# load your pdf
pdf_path = '/content/drive/MyDrive/csesch.pdf'  # CHANGE THIS PATH
pdfreader = PdfReader(pdf_path)

# extract text
raw_text = ''
for page in pdfreader.pages:
    content = page.extract_text()
    if content:
        raw_text += content

print(f"✅ extracted {len(raw_text):,} characters from {len(pdfreader.pages)} pages")
print(f"\n📖 preview of extracted text:")
print(f"{raw_text[:300]}...\n")

# split into chunks
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=800,
    chunk_overlap=200,
    length_function=len,
)

texts = text_splitter.split_text(raw_text)
print(f"✂️ split into {len(texts)} chunks\n")

# show example chunks
print("="*70)
print("📦 EXAMPLE CHUNKS (showing first 3):")
print("="*70)
for i, chunk in enumerate(texts[:3], 1):
    print(f"\n--- Chunk {i} ({len(chunk)} chars) ---")
    print(chunk)
    print()

# generate embeddings WITHOUT PERSISTENCE (for testing)
print("="*70)
print("🧠 GENERATING EMBEDDINGS...")
print("="*70)
print("converting each chunk into 768-dimensional vector...")
print("this takes 1-2 minutes for multiple chunks\n")

# CREATE WITHOUT PERSIST DIRECTORY (in-memory only)
vectorstore = Chroma.from_texts(
    texts=texts,
    embedding=embedding
    # NO persist_directory for testing!
)

print(f"✅ created embeddings for all {len(texts)} chunks")
print(f"\n🔍 let's look at one example embedding:")
example_embedding = embedding.embed_query(texts[0])
print(f"   dimension: {len(example_embedding)}")
print(f"   first 20 values: {example_embedding[:20]}")
print(f"   these numbers represent the semantic meaning of the text!")

print(f"\n✅ vector store ready with {len(texts)} documents")
print("\n💡 NOTE: this is in-memory storage (for testing only)")
print("   the streamlit app builds its own vector store from uploaded pdfs")


✅ database directory ready

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✅ extracted 24,497 characters from 12 pages

📖 preview of extracted text:
06112022 /V5  Tentaive scheme for Computer Science and Engineering and allied branches (CSE/ISE and BT) 
1 
 Visvesvaraya Technological University, Belagavi 
Scheme of Teaching and Examinations- 2022  
Outcome-Based Education(OBE)and Choice Based Credit System(CBCS) 
(Effective   from the academic y...

✂️ split into 41 chunks

📦 EXAMPLE CHUNKS (showing first 3):

--- Chunk 1 (778 chars) ---
06112022 /V5  Tentaive scheme for Computer Science and Engineering and allied branches (CSE/ISE and BT) 
1 
 Visvesvaraya Technological University, Belagavi 
Scheme of Teaching and Examinations- 2022  
Outcome-Based Education(OBE)and Choice Based Credit System(CBCS) 
(Effective   from the academic year 2022- 23) 
I Semester   (CSE    Streams)                

In [42]:
def ask_question(query, k=4):
    """
    main rag function that retrieves context and generates answer
    query: the user's question
    k: number of most relevant chunks to retrieve (default 4)
    """

    # step 1: search vector database for most similar chunks to the query
    # compares the query embedding with stored embeddings by vector distance
    # (chroma defaults to L2 distance unless the collection is set to cosine)
    docs = vectorstore.similarity_search(query, k=k)

    # step 2: combine all retrieved chunks into one context string
    # separating with --- makes it clear where each chunk starts
    context = "\n\n---\n\n".join([doc.page_content for doc in docs])

    # step 3: build the prompt for the llm
    # we give it context first, then the question
    # important: we tell it to only use the context provided
    prompt = f"""answer the question based on the context below. be specific and cite information from the context. if you cannot answer based on the context, say you don't have enough information in the provided documents.

context:
{context}

question: {query}

answer:"""

    # step 4: send prompt to groq and get response
    response = llm.invoke(prompt)

    # return both the answer and the source chunks
    # sources help with transparency and debugging
    return {
        'answer': response.content,
        'sources': docs
    }

print("rag query function ready")


rag query function ready


In [43]:
print("rag system ready, ask questions about your document\n")

# interactive question and answer loop
# keeps running until user types quit
while True:
    query = input("your question (or type quit to exit): ").strip()

    # check if user wants to quit
    if query.lower() in ['quit', 'exit', 'q']:
        print("goodbye")
        break

    # skip empty inputs
    if not query:
        continue

    print(f"\n{'='*60}")
    print(f"question: {query}")
    print(f"{'='*60}\n")

    # get answer from rag system
    result = ask_question(query)

    print(f"answer:\n{result['answer']}\n")

    # show which chunks were used to generate the answer
    # helps with transparency and debugging
    print(f"sources used (top {len(result['sources'])} relevant chunks):")
    for i, doc in enumerate(result['sources'], 1):
        print(f"\n   [{i}] {doc.page_content[:150]}...")

    print(f"\n{'='*60}\n")


rag system ready, ask questions about your document



KeyboardInterrupt: Interrupted by user

In [44]:
# Install Streamlit and tunneling
!pip install -q streamlit pyngrok


print("Streamlit installed!")


Streamlit installed!


# 🎨 Step 9: Create Streamlit App

**What is Streamlit?**
- Web framework for Python
- Creates chat interfaces easily
- No HTML/CSS/JavaScript needed

**This cell creates `app.py` with:**
- Sidebar for configuration
- Multiple PDF upload
- Chat interface
- Source citations

**The app includes:**
- Groq API key input
- Drag-and-drop PDF upload
- Process documents button
- Chat history with sources


In [45]:
%%writefile app.py
import streamlit as st
from langchain_community.embeddings import OllamaEmbeddings
from langchain_groq import ChatGroq
from langchain_chroma import Chroma
from langchain_text_splitters import CharacterTextSplitter
from PyPDF2 import PdfReader
import tempfile
import os
import shutil

# setup the page configuration
st.set_page_config(
    page_title="rag chatbot",
    page_icon="🤖",
    layout="wide"
)

# initialize session state variables
if 'messages' not in st.session_state:
    st.session_state.messages = []
if 'vectorstore' not in st.session_state:
    st.session_state.vectorstore = None
if 'embeddings' not in st.session_state:
    st.session_state.embeddings = None
if 'llm' not in st.session_state:
    st.session_state.llm = None
if 'doc_count' not in st.session_state:
    st.session_state.doc_count = 0

# sidebar for configuration
with st.sidebar:
    st.title("configuration")

    groq_key = st.text_input("enter your groq api key", type="password", value="")

    if groq_key and not st.session_state.llm:
        try:
            st.session_state.embeddings = OllamaEmbeddings(
                base_url="http://localhost:11434",
                model="nomic-embed-text"
            )

            st.session_state.llm = ChatGroq(
                groq_api_key=groq_key.strip(),
                model_name="llama-3.3-70b-versatile",
                temperature=0
            )
            st.success("models initialized successfully")
        except Exception as e:
            st.error(f"error initializing models: {str(e)}")

    st.divider()

    st.subheader("upload your documents")
    uploaded_files = st.file_uploader(
        "drag and drop multiple pdfs here",
        type=['pdf'],
        accept_multiple_files=True
    )

    if uploaded_files and st.session_state.embeddings:
        if st.button("process all documents"):
            with st.spinner("processing your documents..."):
                try:
                    all_texts = []
                    total_pages = 0
                    total_chars = 0

                    for uploaded_file in uploaded_files:
                        with tempfile.NamedTemporaryFile(delete=False, suffix='.pdf') as tmp_file:
                            tmp_file.write(uploaded_file.read())
                            tmp_path = tmp_file.name

                        pdfreader = PdfReader(tmp_path)
                        raw_text = ''
                        for page in pdfreader.pages:
                            content = page.extract_text()
                            if content:
                                raw_text += content

                        total_pages += len(pdfreader.pages)
                        total_chars += len(raw_text)

                        text_splitter = CharacterTextSplitter(
                            separator="\n",
                            chunk_size=800,
                            chunk_overlap=200,
                        )
                        texts = text_splitter.split_text(raw_text)
                        all_texts.extend(texts)

                        os.unlink(tmp_path)

                    # COMPLETE FRESH START - delete old db
                    db_path = "/content/chroma_db"
                    if os.path.exists(db_path):
                        shutil.rmtree(db_path)

                    # wait a moment
                    import time
                    time.sleep(1)

                    # create fresh directory with full permissions
                    os.makedirs(db_path, mode=0o777)

                    # create the vector store in memory (no persist_directory is
                    # passed, so nothing is written to db_path)
                    st.session_state.vectorstore = Chroma.from_texts(
                        texts=all_texts,
                        embedding=st.session_state.embeddings
                    )

                    st.session_state.doc_count = len(uploaded_files)

                    st.success(f"processed {len(uploaded_files)} documents successfully")
                    st.info(f"total: {total_pages} pages, {total_chars:,} characters, {len(all_texts)} chunks")

                except Exception as e:
                    st.error(f"error processing pdfs: {str(e)}")

    if st.session_state.vectorstore:
        st.success(f"ready to answer questions from {st.session_state.doc_count} documents")

    st.divider()

    if st.button("clear chat history"):
        st.session_state.messages = []
        st.rerun()

# main chat interface
st.title("rag chatbot with ollama and groq")
st.caption("upload multiple pdfs and ask questions about them")

for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])
        if "sources" in message:
            with st.expander("view sources"):
                for i, source in enumerate(message["sources"], 1):
                    st.text(f"[{i}] {source[:200]}...")

if prompt := st.chat_input("ask anything about your documents..."):

    if not st.session_state.vectorstore:
        st.error("please upload and process at least one pdf first")
    elif not st.session_state.llm:
        st.error("please enter your groq api key in the sidebar")
    else:
        st.session_state.messages.append({"role": "user", "content": prompt})

        with st.chat_message("user"):
            st.markdown(prompt)

        with st.chat_message("assistant"):
            with st.spinner("thinking..."):
                try:
                    docs = st.session_state.vectorstore.similarity_search(prompt, k=4)

                    context = "\n\n---\n\n".join([doc.page_content for doc in docs])

                    full_prompt = f"""answer the question based only on the context below. be specific and cite information from the context. if you cannot answer based on the context, say you don't have enough information.

context from documents:
{context}

user question: {prompt}

answer:"""

                    response = st.session_state.llm.invoke(full_prompt)
                    answer = response.content

                    st.markdown(answer)

                    sources = [doc.page_content for doc in docs]
                    with st.expander("view sources"):
                        for i, source in enumerate(sources, 1):
                            st.text(f"[{i}] {source[:200]}...")

                    st.session_state.messages.append({
                        "role": "assistant",
                        "content": answer,
                        "sources": sources
                    })

                except Exception as e:
                    error_msg = f"sorry, encountered an error: {str(e)}"
                    st.error(error_msg)
                    st.session_state.messages.append({
                        "role": "assistant",
                        "content": error_msg
                    })


Overwriting app.py


# 🌐 Step 10: Launch Streamlit with Ngrok

**What is ngrok?**
- Creates public URL for your Colab app
- Anyone with the link can access it
- Free tier available

**What happens:**
1. Starts Streamlit server on port 8501
2. Creates ngrok tunnel
3. Gives you a public URL

**Share the URL with:**
- Your juniors for the workshop
- Anyone who wants to test the chatbot


**To stop:** Runtime → Interrupt execution
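**Waiting for the port (sketch):** Before opening the tunnel, it can be safer to wait until something is actually listening on port 8501 rather than sleeping for a fixed time. `port_is_open` and `wait_for_port` are hypothetical helpers for this sketch, built only on the standard-library `socket` module.

```python
import socket
import time

def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, unreachable, or timed out
        return False

def wait_for_port(host, port, max_wait=30.0, interval=0.5):
    """Poll host:port until it accepts connections or `max_wait` seconds pass."""
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        if port_is_open(host, port):
            return True
        time.sleep(interval)
    return False

# Usage in the launch cell, after starting the Streamlit subprocess:
# if not wait_for_port("localhost", 8501):
#     print("streamlit did not start; check app.py for errors")
```

If the wait fails, creating the ngrok tunnel would only produce a public URL that points at nothing, so it is worth stopping there and inspecting `app.py`.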


In [46]:
from pyngrok import ngrok
import subprocess
import time
from google.colab import userdata

# get ngrok token from colab secrets
ngrok_token = userdata.get('NGROK_AUTH_TOKEN')
ngrok.set_auth_token(ngrok_token)
ngrok.kill()  # kill any existing tunnels first

# start streamlit server in background
# runs on port 8501 which is streamlit's default
# output goes to DEVNULL: an unread PIPE can fill up and stall the server
process = subprocess.Popen(
    ["streamlit", "run", "app.py", "--server.port", "8501"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL
)

print("starting streamlit server...")
time.sleep(8)  # give it time to fully start up

# create public ngrok tunnel to the streamlit port
public_url = ngrok.connect(8501, bind_tls=True)

print("\n" + "="*70)
print("your rag chatbot is live")
print("="*70)
print(f"\npublic url: {public_url.public_url}")
print(f"\ninstructions:")
print("   1. click the url above to open in new tab")
print("   2. enter your groq api key in the sidebar")
print("   3. drag and drop your pdf files")
print("   4. click process all documents")
print("   5. start asking questions")
print(f"\nto stop the app: runtime menu > interrupt execution")
print("="*70)


starting streamlit server...

your rag chatbot is live

public url: https://unblinking-gushily-starr.ngrok-free.dev

instructions:
   1. click the url above to open in new tab
   2. enter your groq api key in the sidebar
   3. drag and drop your pdf files
   4. click process all documents
   5. start asking questions

to stop the app: runtime menu > interrupt execution


# 🔧 Step 11: Troubleshooting Cell (Run Only If Errors)

**When to run this:**
- If you get "connection refused" errors
- If processing documents fails
- If ollama stops responding

**What it does:**
- Kills any stuck ollama processes
- Restarts ollama server cleanly
- Verifies it's running on port 11434

**Don't run this unless you have problems!**


In [47]:
import subprocess
import time
import requests

# kill existing ollama
!pkill -9 ollama
time.sleep(2)

# restart ollama
print("🔄 restarting ollama server...")
subprocess.Popen(['ollama', 'serve'],
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL)
time.sleep(10)

# verify it's working
try:
    response = requests.get("http://localhost:11434/api/tags", timeout=3)
    if response.status_code == 200:
        print("✅ ollama server is running successfully")
        print("✅ go back to streamlit and try again")
    else:
        print(f"❌ ollama responded with status: {response.status_code}")
except requests.exceptions.ConnectionError:
    print("❌ ollama is not responding - try running cell 2 again")
except Exception as e:
    print(f"❌ error: {e}")


🔄 restarting ollama server...
✅ ollama server is running successfully
✅ go back to streamlit and try again


In [48]:
# check if ollama process is running
!ps aux | grep ollama

root       36848  0.4  0.2 1782432 32296 ?       Sl   07:34   0:00 ollama serve
root       36909  0.0  0.0   7376  3544 ?        S    07:34   0:00 /bin/bash -c ps aux | grep ollama
root       36911  0.0  0.0   6484  2468 ?        S    07:34   0:00 grep ollama


In [49]:
# check if anything is listening on port 11434
!netstat -tuln | grep 11434


tcp        0      0 127.0.0.1:11434         0.0.0.0:*               LISTEN     


In [None]:
import subprocess
import time
import requests

# kill any existing ollama
!pkill -9 ollama
time.sleep(2)

# start ollama in background properly (using Popen not run)
print("starting ollama server in background...")
subprocess.Popen(['ollama', 'serve'],
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL)

# wait for it to initialize
time.sleep(10)

# test if its working
try:
    response = requests.get("http://localhost:11434/api/tags", timeout=3)
    if response.status_code == 200:
        print("✓ ollama server is running successfully on port 11434")
        print("✓ go back to streamlit and click process all documents")
    else:
        print(f"× ollama responded but with status: {response.status_code}")
except requests.exceptions.ConnectionError:
    print("× ollama is not responding on port 11434")
    print("× try running: !ollama --version to check if it's installed")
except Exception as e:
    print(f"× error: {e}")


In [None]:
import os
import shutil

# completely remove old database
!rm -rf /tmp/chroma_db

# create fresh directory with full permissions
os.makedirs("/tmp/chroma_db", mode=0o777, exist_ok=True)

# verify it was created
if os.path.exists("/tmp/chroma_db"):
    print("✓ database directory created successfully")
    print(f"✓ permissions: {oct(os.stat('/tmp/chroma_db').st_mode)[-3:]}")
else:
    print("✗ failed to create directory")


In [None]:
import shutil
import os

# delete old locked database
!rm -rf /tmp/chroma_db

# create completely fresh one
os.makedirs("/tmp/chroma_db", exist_ok=True)
os.chmod("/tmp/chroma_db", 0o777)  # full permissions

print("fresh database created with full write permissions")


In [None]:
!mkdir -p /content/chroma_db
!chmod 777 /content/chroma_db


In [None]:
import subprocess
import time
import requests

# kill any stuck ollama
!pkill -9 ollama
time.sleep(2)

# restart ollama server
print("🔄 restarting ollama server...")
subprocess.Popen(['ollama', 'serve'],
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL)
time.sleep(10)

# verify it's running
try:
    response = requests.get("http://localhost:11434/api/tags", timeout=3)
    if response.status_code == 200:
        print("✅ ollama server is running successfully on port 11434")
        print("✅ go back to streamlit and click 'process all documents' again")
    else:
        print(f"❌ ollama responded with status: {response.status_code}")
except requests.exceptions.ConnectionError:
    print("❌ ollama is not responding - try running cell 2 again")
except Exception as e:
    print(f"❌ error: {e}")


In [None]:
# FIX: Clean up and recreate database directory
import os
import shutil
import stat

db_path = "/content/chroma_db"

# Remove old database completely
if os.path.exists(db_path):
    shutil.rmtree(db_path)
    print("✅ Removed old database")

# Create fresh directory with full permissions
os.makedirs(db_path, mode=0o777)
print("✅ Created fresh database directory")

# Verify permissions
perms = oct(os.stat(db_path).st_mode)[-3:]
print(f"✅ Permissions: {perms} (should be 777)")