# Embeddings / RAG

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/9juhnke/llm-api-gwdg-saia/main?filepath=embeddings_rag.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/9juhnke/llm-api-gwdg-saia/blob/main/embeddings_rag.ipynb)

This notebook demonstrates how to generate embeddings via the SAIA API (see [README.md](./README.md)) and how to use them to build a simple Retrieval-Augmented Generation (RAG) application with ChromaDB, including indexing and querying your own documents.

In [None]:
# To install the required packages, uncomment the next line
# !pip install openai chromadb

Embeddings are available only via the API, which is compatible with the [OpenAI Embeddings API](https://platform.openai.com/docs/guides/embeddings).

See the following minimal example:

In [None]:
from openai import OpenAI

# API configuration
api_key = "<Your API_KEY>" # Insert your API Key
base_url = "https://chat-ai.academiccloud.de/v1"
model="e5-mistral-7b-instruct" # This is the embeddings model

# Start OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# Get embeddings
response = client.embeddings.create(
    input="The food was delicious and the waiter ...",
    model=model,
    encoding_format="float"
)

# Print the embedding vector (a list of floats) for the first input
print(response.data[0].embedding)

[0.01216933410614729, 0.01970178447663784, 0.004577118903398514, 0.010902555659413338, -0.0014405702240765095, -0.022289060056209564, -0.009196390397846699, 0.02913280390202999, 0.03710417076945305, 0.003584495047107339, -0.000584330060519278, -0.020435428246855736, 0.0288788303732872, 0.0014871652238070965, 0.0061538806185126305, -0.015730837360024452, 0.023429524153470993, 0.006835825741291046, -0.004457361530512571, -0.02554476261138916, -0.007348496001213789, -0.016075100749731064, 0.00915249902755022, 0.0006792726344428957, -0.001541088568046689, 0.006771501153707504, 0.011100951582193375, -0.021443340927362442, 0.006966174114495516, -0.0014100756961852312, 0.012409809045493603, 0.008255956694483757, 0.005294600501656532, 0.0025935526937246323, -0.01187890488654375, -0.011508866213262081, -0.010075084865093231, -0.0007584418635815382, -0.006006816867738962, 0.018684489652514458, -0.015086563304066658, -0.016364332288503647, 0.03117881342768669, 0.0014257614966481924, 0.01077072694
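
An embedding vector like the one above becomes useful once it is compared with other vectors. One common way to do this is cosine similarity. The following is a minimal sketch, not part of the SAIA API: the helper `cosine_similarity` and the toy three-dimensional vectors are made up for illustration and stand in for real embeddings, which have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

# Toy vectors (assumption: invented values, not real model output)
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc-a": [1.0, 0.0, 1.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [1.0, 1.0, 0.0],
}

# Rank document ids from most to least similar to the query
ranked = sorted(
    doc_vecs,
    key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
    reverse=True,
)
print(ranked)
```

In a real application the vectors would come from `response.data[i].embedding`, and a vector store such as ChromaDB would handle the ranking.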

The following example builds a simple RAG application with ChromaDB as a persistent vector store:


In [None]:
from openai import OpenAI
import chromadb
from chromadb.utils import embedding_functions

# API configuration
api_key = "<Your API_KEY>" # Insert your API Key
base_url = "https://chat-ai.academiccloud.de/v1"
embedding_model = "e5-mistral-7b-instruct" # This is the embeddings model
chat_model = "meta-llama-3.1-8b-instruct" # Choose any available chat (text) model

# Path configuration
db_path = "./chroma_db"

# Embedding function that ChromaDB uses to call the embeddings endpoint
embedding_client = embedding_functions.OpenAIEmbeddingFunction(
    api_key=api_key,
    api_base=base_url,
    model_name=embedding_model
)

# Initialize Chroma collection
collection = chromadb.PersistentClient(path=db_path).get_or_create_collection(
    name="my_rag_collection",
    embedding_function=embedding_client
)

# Define example documents and add them to the collection (they are embedded on insert)
docs = [
    "The Campus Garden Project at Westhill University was launched in 2023 as a student-led initiative to grow herbs and vegetables for the cafeteria. Located behind the physics building, it covers 150 square meters and uses a rainwater collection system. Volunteers meet every Thursday afternoon.",
    "The coffee machine on the third floor of the Informatics Department only accepts coins and was out of order from April 3rd to May 7th. It now operates again, but only offers black coffee and espresso. A card-based replacement is planned for the next semester.",
    "The 'Ethics in Data Science' seminar takes place every second Monday at 4 PM in Room B204. It is part of the Digital Society series and hosted by Dr. Lena Kovács. Topics include algorithmic bias, data ownership, and AI transparency. Registration is via the university intranet.",
]
ids = ["campus-garden", "coffee-machine", "seminar-schedule"]
collection.add(documents=docs, ids=ids)

# User query: retrieve the two most similar documents and join them as context
query_text = "When does the 'Ethics in Data Science' seminar take place?"
results = collection.query(query_texts=[query_text], n_results=2)
context = "\n".join(results["documents"][0])

# Chat completion request via the OpenAI-compatible client
client = OpenAI(api_key=api_key, base_url=base_url)
response = client.chat.completions.create(
    model=chat_model,
    messages=[{
        "role": "user",
        "content": f"""Here is the context:
{context}

Question: {query_text}
Answer:"""
    }]
)

# Print only the generated answer text
print(response.choices[0].message.content)

Every second Monday at 4 PM in Room B204.
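
The example above indexes three short strings. Longer documents of your own usually need to be split into smaller pieces before indexing. The following is a minimal sketch, assuming fixed-size character chunks with overlap. The helper `chunk_text`, its parameters, and the `my-doc-` ID prefix are hypothetical names chosen for illustration, and the tiny sizes only keep the output readable.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character chunks of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

# Tiny demonstration input (assumption: placeholder text, not a real document)
long_doc = "abcdefghij"
chunks = chunk_text(long_doc, chunk_size=4, overlap=1)

# One unique ID per chunk, as ChromaDB requires unique IDs
ids = [f"my-doc-{i}" for i in range(len(chunks))]
print(chunks)
print(ids)
```

The resulting `chunks` and `ids` could then be passed to `collection.add(documents=chunks, ids=ids)` in the same way as the three example documents. Suitable chunk sizes depend on the documents and the embedding model.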
