
# Demo 1 - RAG Pipeline with Chained Prompt Processing
By: [Lior Gazit](https://github.com/LiorGazit).  
Repo: [Agents-Over-The-Weekend](https://github.com/PacktPublishing/Agents-Over-The-Weekend/tree/main/Lior_Gazit/workshop_september_2025/)  
Running LLMs locally for free: this code uses [`LLMPop`](https://pypi.org/project/llmpop/), a package for spinning up local or remote LLMs through a unified, modular syntax.  

<a target="_blank" href="https://colab.research.google.com/github/PacktPublishing/Agents-Over-The-Weekend/blob/main/Lior_Gazit/workshop_september_2025/codes_for_Lior_Bootcamp_talk_sept2025_demo1.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a> (choose a GPU Colab runtime for the fastest execution)  

```
Disclaimer: The content and ideas presented in this notebook are solely those of the author, Lior Gazit, and do not represent the views or intellectual property of the author's employer.
```

Installing:

In [6]:
%pip -q install llmpop
%pip -q install sentence-transformers faiss-cpu langchain langchain_core openai # optional extras: tiktoken langsmith langchain_openai

Imports:

In [7]:
import os
import requests

Setting up the Vector Store and RAG pipeline:

In [8]:
# Import required libraries
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

# Example documents (could be clinical notes, financial filings, etc.)
documents = [
    "Patient has diabetes type 2 and shows high glucose levels.",
    "Recent financial filings show revenue growth despite supply chain issues.",
    "Patient diagnosed with hypertension, recommended lifestyle changes.",
    "The company's earnings call mentioned concerns over increased production costs.",
]

# Step 1: Create embeddings using SentenceTransformer
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')  # lightweight embedding model

# Generate embeddings for the documents
document_embeddings = embedding_model.encode(documents)

# Step 2: Setup FAISS vector store
dimension = document_embeddings.shape[1]
faiss_index = faiss.IndexFlatL2(dimension)
faiss_index.add(document_embeddings)

# Step 3: Define the retriever(query) function
def retriever(query, top_k=2):
    # Generate embedding for the query
    query_embedding = embedding_model.encode([query])

    # Perform the similarity search in the FAISS index
    distances, indices = faiss_index.search(query_embedding, top_k)

    # Retrieve the top_k most similar documents
    retrieved_docs = [documents[idx] for idx in indices[0]]

    return retrieved_docs

# Example usage of the retriever
query = "What did the company say about production costs?"
context_docs = retriever(query)
print("\n\nRetrieved documents for context:\n", context_docs)



Retrieved documents for context:
 ["The company's earnings call mentioned concerns over increased production costs.", 'Recent financial filings show revenue growth despite supply chain issues.']
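
To make the retrieval step less opaque, here is a minimal pure-Python sketch of what an exact (flat) L2 index computes conceptually: the squared Euclidean distance from the query to every stored vector, followed by a sort. The helper name `l2_search` and the toy 2-D vectors are assumptions made up for illustration; FAISS does the same comparison in optimized, vectorized code over the real 384-dimensional embeddings.

```python
def l2_search(query_vec, doc_vecs, top_k=2):
    """Exact nearest-neighbour search by squared L2 distance (hypothetical helper)."""
    scored = []
    for idx, vec in enumerate(doc_vecs):
        # Squared Euclidean distance between the query and this stored vector
        dist = sum((q - d) ** 2 for q, d in zip(query_vec, vec))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:top_k]
    return [d for d, _ in top], [i for _, i in top]

# Toy 2-D "embeddings" (made up for this sketch)
toy_docs = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = l2_search([0.9, 0.1], toy_docs, top_k=2)
print(indices)
```

As with `faiss_index.search`, the result is a list of distances and a parallel list of positions that can be used to look up the original documents.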


Prompting using a locally hosted LLM via Ollama:

In [9]:
from llmpop import init_llm
from langchain_core.prompts import ChatPromptTemplate

query = "What is the patient's diagnosis given these notes?"
context_docs = retriever(query)
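# Note: context_docs is a Python list, so the f-string embeds its repr (brackets and quotes included)
question = f"Using the following context, answer the question:\n\n{context_docs}\n\nQ: {query}\n\n---\nA:"
local_llm = init_llm(model="llama3.2:1b", provider="ollama")

# The template wraps the already formatted question in a second "Q: ... A:" pair
prompt = ChatPromptTemplate.from_template("Q: {question}\nA:")
# Pipe the prompt into the model and print the text of the reply
print((prompt | local_llm).invoke({"question":question}).content)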

🚀 Starting Ollama server...
→ Ollama PID: 25002
⏳ Waiting for Ollama to be ready…
Ready!

🚀 Pulling model 'llama3.2:1b'…
All done setting up Ollama (ChatOllama).

Based on the provided context, it appears that the patient's primary diagnosis is insulin resistance or type 2 diabetes. The mention of "high glucose levels" supports this diagnosis, as people with type 2 diabetes typically have elevated blood sugar readings.

The additional note about hypertension (high blood pressure) is also relevant to the patient's overall health, but it does not directly indicate a specific diagnosis related to their primary condition (diabetes). It may be worth considering in conjunction with the diabetes diagnosis or referring the patient for further testing to rule out other conditions.
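
The cell above places the retrieved list directly into the f-string, so the model sees Python list syntax. One possible alternative, sketched here under assumptions, is to format the documents as a numbered list before building the prompt. The helper `build_prompt` is hypothetical and not part of the notebook or of LLMPop, and `example_docs` stands in for whatever `retriever(query)` returns.

```python
def build_prompt(query, docs):
    """Format retrieved documents as a numbered list ahead of the question (hypothetical helper)."""
    context = "\n".join(f"{i}. {doc}" for i, doc in enumerate(docs, start=1))
    return (
        "Using the following context, answer the question:\n\n"
        f"{context}\n\n"
        f"Q: {query}\nA:"
    )

# Stand-in for the retriever output (two entries copied from the documents list)
example_docs = [
    "Patient has diabetes type 2 and shows high glucose levels.",
    "Patient diagnosed with hypertension, recommended lifestyle changes.",
]
prompt_text = build_prompt("What is the patient's diagnosis given these notes?", example_docs)
print(prompt_text)
```

Whether this changes the quality of the answer depends on the model; the sketch only removes the list brackets and quotes from the context.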


Prompting via OpenAI's API (the paid route):

In [10]:
# In Colab, use getpass to securely prompt for your API key
from getpass import getpass
import openai

openai.api_key = getpass("Paste your OpenAI API key: ")

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role":"system","content":"You are a medical assistant."},
        {"role":"user",  "content": question}
    ]
)

answer_api = response.choices[0].message.content
print("\n\n")
print(answer_api)

Paste your OpenAI API key: ··········



The patient's diagnoses are diabetes type 2 with high glucose levels and hypertension.
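
When the notebook is rerun often, pasting the key each time gets tedious. A minimal sketch, assuming the key may already be stored in an environment variable named `OPENAI_API_KEY`, is to check the environment first and fall back to `getpass` only when nothing is set. The helper name `get_api_key` is hypothetical.

```python
import os
from getpass import getpass

def get_api_key(env_var="OPENAI_API_KEY"):
    """Return the key from the environment, prompting only if it is missing (hypothetical helper)."""
    key = os.environ.get(env_var)
    if not key:
        # Fall back to an interactive, non-echoing prompt
        key = getpass("Paste your OpenAI API key: ")
    return key
```

In the cell above, the assignment would then read `openai.api_key = get_api_key()`.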
