# 🏛️ Professional Enterprise RAG Pipeline
### Features: Parallel PDF Ingestion, Weaviate Hybrid Search, Llama 4

**Requirements:**
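1. **Weaviate** running in Docker (`localhost:8080`), or the embedded instance that the Colab cells in this notebook start.
2. **Ollama** running with a model pulled, such as `llama4:scout`. The cells in this notebook pull and use `wizardlm2:7b`.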
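
Before running the rest of the notebook, it can help to confirm that both services accept connections. The following is a minimal sketch using only the standard library. The Weaviate port (8080) and the Ollama port (11434) are the ones mentioned in this notebook; the helper names `is_port_open` and `check_services` are hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Ports taken from this notebook; adjust them if your setup differs.
SERVICES = {"Weaviate": ("localhost", 8080), "Ollama": ("127.0.0.1", 11434)}


def check_services(services=SERVICES):
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only shows that something is listening there; it does not prove the service is healthy or that the model has been pulled.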

In [11]:

import subprocess
import time
import os

# 1. Install Ollama (this step is blocking by nature)
print("Installing Ollama...")
!curl -fsSL https://ollama.com/install.sh | sh

Installing Ollama...
>>> Installing ollama to /usr/local
>>> Downloading ollama-linux-amd64.tgz
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.


In [15]:
import subprocess
import time
import asyncio
import nest_asyncio

# Necessary for Jupyter/Notebook environments
nest_asyncio.apply()
def setup_ollama():
    # 1. Start the Ollama server in the background
    print("Starting Ollama server...")
    ollama_log = open("ollama_server.log", "w")

    # Use start_new_session to ensure the server keeps running independently
    subprocess.Popen(
        ["ollama", "serve"],
        stdout=ollama_log,
        stderr=ollama_log,
        start_new_session=True
    )

    # 2. Give the server a moment to initialize
    time.sleep(5)

    # 3. Pull the model (Blocking call to ensure it's ready before use)
    print("Downloading wizardlm2:7b... Please wait, this may take a few minutes.")
    try:
        subprocess.run(["ollama", "pull", "wizardlm2:7b"], check=True)
        print("✅ Model downloaded and server is ready!")
    except subprocess.CalledProcessError as e:
        print(f"❌ Error pulling model: {e}")

if __name__ == "__main__":
    setup_ollama()

Starting Ollama server...
Downloading wizardlm2:7b... Please wait, this may take a few minutes.
✅ Model downloaded and server is ready!
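
The setup cell waits a fixed five seconds for the server to start. A possible alternative, sketched here under the assumption that the Ollama server answers plain HTTP requests at `http://127.0.0.1:11434` (the address printed by the installer above), is to poll until the server responds. The helper name `wait_for_http` is hypothetical.

```python
import time
import urllib.request


def wait_for_http(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll url until it answers with HTTP 200 or the timeout elapses."""
    # The server is local, so skip any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=2.0) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError, and socket timeouts are all OSError subclasses.
            pass
        time.sleep(interval)
    return False


if __name__ == "__main__":
    ready = wait_for_http("http://127.0.0.1:11434", timeout=5.0)
    print("Ollama is ready" if ready else "Ollama did not respond in time")
```

Polling returns as soon as the server is up and reports a clear failure when it never starts, whereas a fixed sleep can be both too long and too short.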


In [16]:
# Check if the Ollama server is awake and sees your model
!ollama list

NAME            ID              SIZE      MODIFIED               
wizardlm2:7b    c9b1aff820f2    4.1 GB    Less than a second ago    
gpt-oss:20b     17052f91a42e    13 GB     25 seconds ago            


In [1]:
# The Weaviate Python client is distributed on PyPI as `weaviate-client`.
pip install weaviate-client langchain_huggingface langchain_community langchain_weaviate langchain_text_splitters langchain_core langchain_ollama pypdf

Collecting langchain_huggingface
  Downloading langchain_huggingface-1.2.0-py3-none-any.whl.metadata (2.8 kB)
Collecting langchain_community
  Downloading langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain_weaviate
  Downloading langchain_weaviate-0.0.6-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain_text_splitters
  Downloading langchain_text_splitters-1.1.0-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain_ollama
  Downloading langchain_ollama-1.0.1-py3-none-any.whl.metadata (2.5 kB)
Collecting pypdf
  Downloading pypdf-6.6.0-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-classic<2.0.0,>=1.0.0 (from langchain_community)
  Downloading langchain_classic-1.0.1-py3-none-any.whl.metadata (4.2 kB)
Collecting requests<3.0.0,>=2.32.5 (from langchain_community)
  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting dataclasses-json<0

In [None]:
# 1. INSTALL LATEST LIBRARIES
# !pip install -U langchain-ollama langchain-weaviate langchain-huggingface weaviate-client pypdf

import os
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from google.colab import files

import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_ollama import OllamaLLM
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

print("✅ Core modules loaded.")

## ⚙️ 1. Infrastructure Setup

In [None]:
import weaviate
print(weaviate.__version__)


In [None]:
import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_huggingface import HuggingFaceEmbeddings

def initialize_colab_rag():
    print("🚀 Starting Embedded Weaviate (Colab, v4)...")

    embeddings = HuggingFaceEmbeddings(
        model_name="all-MiniLM-L6-v2"
    )

    # Start and connect to an embedded instance (v4 client API)
    client = weaviate.connect_to_embedded(
        persistence_data_path="./weaviate_data"
    )

    vectorstore = WeaviateVectorStore(
        client=client,
        index_name="EnterpriseDocs",
        embedding=embeddings,
        text_key="content"
    )

    print("✅ SUCCESS: Embedded Weaviate is live.")
    return vectorstore, client


vectorstore, client = initialize_colab_rag()


## We use a small embedding model for retrieval and a larger LLM, served by Ollama, to answer queries.
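
To make the retrieval half of this split concrete, here is a minimal sketch of similarity search over embeddings. It is an illustration only: the three-dimensional vectors are made up (`all-MiniLM-L6-v2` produces 384-dimensional vectors), the function names are hypothetical, and a vector database such as Weaviate uses an index rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return (doc_id, score) pairs for the k vectors most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors, invented for illustration.
toy_docs = {
    "sql_joins": [0.9, 0.1, 0.0],
    "python_profiling": [0.1, 0.9, 0.2],
    "sql_indexes": [0.8, 0.0, 0.3],
}
toy_query = [1.0, 0.0, 0.1]

for doc_id, score in top_k(toy_query, toy_docs):
    print(f"{doc_id}: {score:.3f}")
```

The retrieved chunks are then passed as context to the larger model, which only has to read a handful of passages instead of the whole corpus.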

In [26]:
import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_ollama import OllamaLLM

try:
    # Small Searcher (CPU)
    embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')

    # Large Brain (Ollama 0.3+)
    llm = OllamaLLM(model='wizardlm2:7b')

    # Connect to the already running embedded instance
    #client = weaviate.connect_to_local(port=8079, grpc_port=50050)

    vectorstore = WeaviateVectorStore(
        client=client,
        index_name='EnterpriseDocs',
        embedding=embeddings,
        text_key='content'
    )
    print("✅ Infrastructure connected successfully.")

except Exception as e:
    print(f"❌ Connection Error: {e}")


✅ Infrastructure connected successfully.


## 📂 2. Parallel Ingestion Swarm
Upload your PDFs here. Worker threads (one per CPU core) load and split the files in parallel.

In [21]:
# 1. Install faster PDF library
!pip install -q pymupdf langchain-community

import multiprocessing
from concurrent.futures import ThreadPoolExecutor
from google.colab import files
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Use PyMuPDF (fitz) - it's much faster than PyPDFLoader
def pdf_agent(file_path):
    try:
        loader = PyMuPDFLoader(file_path)
        # We split here so the heavy lifting is done in parallel
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
        return loader.load_and_split(splitter)
    except Exception as e:
        print(f"Error processing {file_path}: {e}")
        return []

print("Select your PDF documents:")
uploaded = files.upload()
file_list = list(uploaded.keys())

if file_list:
    # Colab usually has 2 cores; threads work better for I/O bound tasks
    cores = multiprocessing.cpu_count()
    print(f"🚀 Scaling to {cores} threads using PyMuPDF...")

    all_chunks = []

    # ThreadPoolExecutor reduces the overhead of "pickling" data between processes
    with ThreadPoolExecutor(max_workers=cores) as executor:
        results = list(executor.map(pdf_agent, file_list))

    # Flatten the list of lists
    for sublist in results:
        all_chunks.extend(sublist)
    vectorstore = WeaviateVectorStore(
        client=client,
        index_name="EnterpriseDocs",
        embedding=embeddings,
        text_key="content"
    )
    # Embed the chunks and write them to the EnterpriseDocs collection
    vectorstore.add_documents(all_chunks)

    print(f"✅ Successfully indexed {len(all_chunks)} chunks.")
else:
    print("No files selected.")

Select your PDF documents:


Saving SQL CheatSheet Deeplytic Technologies .pdf to SQL CheatSheet Deeplytic Technologies  (1).pdf
Saving high-performance-python-practical-performant-programming-for-humans-2nbsped-1492055026-9781492055020_compress.pdf to high-performance-python-practical-performant-programming-for-humans-2nbsped-1492055026-9781492055020_compress (1).pdf
🚀 Scaling to 2 threads using PyMuPDF...
✅ Successfully indexed 1366 chunks.
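
The ingestion cell splits text with `chunk_size=1000` and `chunk_overlap=150`. The sketch below shows what those two numbers mean using a plain character window. It is a simplification, not the splitter's algorithm: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, while this hypothetical `sliding_window_chunks` helper cuts at fixed offsets.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 150):
    """Cut text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2000))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.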


In [24]:
# Check the count of objects in the index
response = client.collections.get("EnterpriseDocs").aggregate.over_all(total_count=True)
print(f"Total documents in Weaviate: {response.total_count}")

Total documents in Weaviate: 3746
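
The total above (3746) is larger than the 1366 chunks reported by the ingestion cell. One possible explanation, which the notebook does not confirm, is that earlier runs wrote to the same persisted collection. A common way to make re-runs idempotent is to derive each object's ID from its content. The sketch below shows only the ID derivation; the `chunk_id` helper is hypothetical, and whether the vector store accepts caller-supplied IDs should be checked against the `langchain_weaviate` documentation.

```python
import uuid


def chunk_id(source: str, page: int, content: str) -> str:
    """Derive a stable UUID from where a chunk came from and what it contains."""
    key = f"{source}|{page}|{content}"
    # uuid5 is deterministic: the same namespace and key always give the same UUID.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


first = chunk_id("cheatsheet.pdf", 1, "SELECT retrieves rows.")
again = chunk_id("cheatsheet.pdf", 1, "SELECT retrieves rows.")
other = chunk_id("cheatsheet.pdf", 2, "SELECT retrieves rows.")
print(first == again, first == other)
```

With stable IDs, uploading the same file twice would map to the same objects instead of adding new ones, provided the store overwrites or rejects objects whose ID already exists.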


## 🧠 3. Intelligence Chain
Ask questions about your uploaded documents using the LangChain Expression Language (LCEL).
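
The retriever in the next cell passes `alpha: 0.5` to Weaviate's hybrid search. In Weaviate, `alpha` balances the two signals: 1 means pure vector search, 0 means pure keyword (BM25) search. The sketch below illustrates the idea with a min-max normalised weighted blend. It is a simplified model with made-up scores and a hypothetical `hybrid_scores` helper, not Weaviate's implementation.

```python
def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Blend two score dicts: alpha weights vector scores, (1 - alpha) keyword scores."""

    def normalize(scores):
        # Rescale to [0, 1] so the two score types are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {key: 1.0 for key in scores}
        return {key: (value - lo) / (hi - lo) for key, value in scores.items()}

    vec = normalize(vector_scores)
    kw = normalize(keyword_scores)
    ids = set(vec) | set(kw)
    return {i: alpha * vec.get(i, 0.0) + (1 - alpha) * kw.get(i, 0.0) for i in ids}


# Made-up scores for three chunks.
vector_demo = {"a": 1.0, "b": 0.5, "c": 0.0}
keyword_demo = {"a": 0.0, "b": 4.0, "c": 2.0}

for alpha in (0.0, 0.5, 1.0):
    blended = hybrid_scores(vector_demo, keyword_demo, alpha)
    best = max(blended, key=blended.get)
    print(f"alpha={alpha}: best={best}")
```

A value of 0.5 gives both signals equal weight, which tends to help when queries mix exact terms (such as SQL keywords) with looser natural-language phrasing.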

In [28]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

retriever = vectorstore.as_retriever(search_kwargs={'alpha': 0.5, 'k': 5})
prompt = ChatPromptTemplate.from_template("Context: {context}\n\nQuestion: {question}\n\nAnswer:")

rag_chain = (
    {'context': retriever, 'question': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

query = "Tell me full tutorial of SQL cheatsheet make it understandable."
print("Assistant is thinking...\n")
print(rag_chain.invoke(query))

Assistant is thinking...

 Certainly! The SQL CheatSheet provided by Deeplytic Technologies is a comprehensive guide for beginners looking to understand the fundamentals of SQL and its various commands, functions, and concepts. Below is a full tutorial based on the content of the cheatsheet, structured in an understandable manner:

### Introduction to SQL and Databases
- **SQL (Structured Query Language)**: A standard language for accessing and manipulating databases.
- **Databases**: Collections of data that are stored and accessed electronically. They allow for efficient data storage, retrieval, update, and management.

### Basic SQL Commands
1. **SELECT**: Retrieves data from one or more tables.
   - Example: `SELECT * FROM Employees;`
2. **INSERT**: Inserts new records into a table.
3. **UPDATE**: Modifies existing records in a table.
4. **DELETE**: Removes records from a table.
5. **DROP, TRUNCATE, DELETE**: All delete data but differ in how they do it.
   - **DROP**: Permanently 
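
In the chain above, the retriever's output goes straight into the prompt, so `{context}` is filled with the string form of a list of documents, metadata included. A common refinement is to join only the text of each document first. The sketch below uses a hypothetical `Doc` dataclass as a stand-in for LangChain's `Document` so that it runs without any dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only the fields used here)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs, separator="\n\n"):
    """Join the text of the retrieved documents, leaving metadata out of the prompt."""
    return separator.join(doc.page_content for doc in docs)


demo_docs = [
    Doc("SELECT retrieves rows.", {"page": 1}),
    Doc("INSERT adds rows.", {"page": 2}),
]
print(format_docs(demo_docs))
```

In the chain, this would likely be wired in as `{'context': retriever | format_docs, 'question': RunnablePassthrough()}`, since LCEL can wrap a plain function when it is piped after a runnable.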

## Streaming version of the RAG chain

In [30]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Reuses `llm`, `retriever`, and `query` from the cells above.
# The LLM must support streaming for tokens to arrive incrementally.

# Prompt template
prompt = ChatPromptTemplate.from_template(
    "Context: {context}\n\nQuestion: {question}\n\nAnswer:"
)

# RAG chain
rag_chain = (
    {'context': retriever, 'question': RunnablePassthrough()}
    | prompt
    | llm   # make sure your LLM has streaming enabled
)

# Streaming invocation
for token in rag_chain.stream(query):
    print(token, end="", flush=True)  # prints tokens as they arrive


 Certainly! Below is a comprehensive guide based on the SQL CheatSheet provided in the document metadata. This guide will cover various aspects of SQL, from basic concepts to advanced topics, including commands, functions, and operations. Each section will be explained in a clear and understandable manner.

### Introduction to SQL (Structured Query Language)

SQL is the standard language for managing and manipulating data in a relational database management system (RDBMS). It allows you to perform tasks such as querying, updating, and maintaining database structures.

### Basic Concepts

1. **Database**: A structured set of data held in a computer. Databases are organized so that data can be easily accessed, managed, and used by database systems.

2. **Table**: The basic logical unit of data storage in a relational database. Tables consist of rows (records) and columns (attributes).

3. **Primary Key**: A unique column or attribute that identifies each row uniquely in a table. No two r
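
The streaming loop prints each chunk and then discards it. If the full answer is also needed afterwards (for logging or evaluation, say), the chunks can be collected while they are displayed. This is a minimal sketch: `fake_stream` is a hypothetical stand-in for `rag_chain.stream(query)`, and it assumes the chain yields plain strings, as the loop above does.

```python
import sys


def stream_and_collect(chunks, out=sys.stdout):
    """Write each chunk as it arrives and return the full concatenated text."""
    parts = []
    for chunk in chunks:
        out.write(chunk)
        out.flush()  # show the chunk immediately instead of waiting for a newline
        parts.append(chunk)
    return "".join(parts)


def fake_stream():
    # Hypothetical stand-in for rag_chain.stream(query).
    yield from ["SQL ", "is ", "a ", "query ", "language."]


answer = stream_and_collect(fake_stream())
```

Collecting into a list and joining once at the end avoids repeated string concatenation for long answers.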

## Congratulations 🎉 You have completed the course on building your own local RAG pipeline.