# Hack Night at MSFT - May 20, 2025

![Image](https://images.lumacdn.com/cdn-cgi/image/format=auto,fit=cover,dpr=2,background=white,quality=75,width=100,height=100/event-covers/gu/98d09ce6-a834-45bb-b82b-f5d91c629d84.png)

Today, we're going to explore observability for RAG applications. [Weaviate](https://weaviate.io/) provides the retrieval layer, [FriendliAI](https://friendli.ai/) provides the inference layer, and [Comet Opik](https://comet.com/opik) is our observability layer.

This simple example will get you started with using Opik, Weaviate, and Friendli Serverless Endpoints to build a RAG system.

To use this notebook successfully, you'll need an account with Comet, Friendli and Weaviate.


**Note:** A Weaviate cluster is already set up, so you don't need to create a new one; you can simply read from the existing cluster. If you want to learn more about how this cluster was set up, check out the `weaviate-embeddings-and-friendliai` directory in [this repository](https://github.com/weaviate/BookRecs/tree/main/data-pipeline).

You can create free accounts on all platforms.


# Set up your Environment with Comet Opik

[Comet](https://www.comet.com/) provides a hosted version of the Opik platform. Simply [create a free account](https://www.comet.com/site/products/opik/) and grab your API key from the UI.

First, we need to pip install the `opik` and `openai` libraries.

In [1]:
%pip install -U opik openai --quiet

Now, we'll configure Opik and FriendliAI with our respective API keys.

In [2]:
import opik

opik.configure(use_local=False)

OPIK: Your Opik API key is available in your account settings, can be found at https://www.comet.com/api/my/settings/ for Opik cloud


Please enter your Opik API key:··········
Do you want to use "ramchandra3101" workspace? (Y/n)Y


OPIK: Configuration saved to file: /root/.opik.config


# FriendliAI Inference

Set up FriendliAI and get a token:

1.   Head to [FriendliAI](https://friendli.ai/get-started/serverless-endpoints), and create an account.
2.   Grab a [**`FRIENDLI_TOKEN`**](https://friendli.ai/suite/setting/tokens) to use Friendli Serverless Endpoints for LLM calls.

In [3]:
import getpass
import os

if not os.environ.get("FRIENDLI_TOKEN"):
    os.environ["FRIENDLI_TOKEN"] = getpass.getpass("Enter your Friendli Token: ")

Enter your Friendli Token: ··········
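If you later move this notebook into a plain script, the interactive `getpass` prompt is not always available. As a minimal sketch, a pre-flight check like the one below can report which credentials are missing before any client is created. The helper name `missing_credentials` is hypothetical; the variable names are the ones this notebook reads from the environment.

```python
import os


def missing_credentials(names, env=None):
    """Return the names from `names` that are unset or blank in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not (env.get(name) or "").strip()]


# Environment variables this notebook reads. The Weaviate ones fall back to
# the shared read-only cluster in the notebook, so treat them as optional.
missing = missing_credentials(["FRIENDLI_TOKEN", "WEAVIATE_CLUSTER_URL", "WEAVIATE_API_KEY"])
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("All credentials are set.")
```

Returning a list instead of raising keeps the decision (prompt, fall back to a default, or stop) with the caller.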


Traces will now be automatically logged to the Opik UI, where you can inspect the inputs and outputs and configure evaluation metrics. After you run a traced cell, follow the link printed in its output to the Comet UI to see your traces.

# Set up Weaviate Client

Weaviate is a vector database that supports billion-scale vector search with sub-50ms query times. We'll use Weaviate to query for books in this example.

In [4]:
%pip install -U weaviate-client --quiet

In [5]:
import os
import weaviate
from weaviate.classes.init import Auth


WEAVIATE_CLUSTER_URL = os.getenv('WEAVIATE_CLUSTER_URL') or 'https://zxzyqcyksbw7ozpm5yowa.c0.us-west2.gcp.weaviate.cloud'
WEAVIATE_API_KEY = os.getenv('WEAVIATE_API_KEY') or 'n6mdfI32xrXF3DH76i8Pwc2IajzLZop2igb6' # This is a read key

weaviate_client = weaviate.connect_to_weaviate_cloud(
    cluster_url=WEAVIATE_CLUSTER_URL,
    auth_credentials=Auth.api_key(WEAVIATE_API_KEY),
    headers={"X-Friendli-Token": os.getenv('FRIENDLI_TOKEN')},
)

print(weaviate_client.is_connected())

book_collection = weaviate_client.collections.get(name="WeaviateEmbeddingBooks")

True


# Write a RAG app with Friendli, Weaviate and Opik Traces

Next, we will build a very simple LLM reasoning application and log the trace data to Opik where we can apply additional evaluation metrics and debug the LLM response.

We will use FriendliAI as our inference provider to get fast, low-cost results from open source models. In this example, we're using Friendli's serverless endpoints, which require no infrastructure setup and are ideal for quick prototyping and experimentation. Just set the client's base URL to `https://api.friendli.ai/serverless/v1`.
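To make the base URL less of a black box, here is a rough sketch of the HTTP request an OpenAI-style client sends for a chat completion. It only builds the request with the standard library and never sends it. The `/chat/completions` route and Bearer-token header follow the OpenAI API convention, which this notebook assumes Friendli's serverless endpoint accepts since it is called through the OpenAI client; `build_chat_request` is a hypothetical helper, not part of any SDK.

```python
import json
import urllib.request

FRIENDLI_BASE_URL = "https://api.friendli.ai/serverless/v1"


def build_chat_request(token, messages, model="meta-llama-3.3-70b-instruct",
                       base_url=FRIENDLI_BASE_URL):
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # route assumed from the OpenAI convention
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token; nothing is sent over the network here.
request = build_chat_request("YOUR_FRIENDLI_TOKEN", [{"role": "user", "content": "Hello"}])
print(request.get_method(), request.full_url)
```

In the notebook itself the OpenAI client handles all of this, so the sketch is only useful for debugging or for understanding what the traces record.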

For production use or custom models, [Friendli Dedicated Endpoints](https://friendli.ai/products/dedicated-endpoints) offers personal deployments of over 100k models on Hugging Face.

We will use Opik to collect traces to inspect the inputs and outputs of the reasoning tasks, and to create evaluation metrics for hallucinations and other common or custom issues you want to detect.

Opik integrates with OpenAI to provide a simple way to log traces for all OpenAI LLM calls. This works for all OpenAI models, including when you use the streaming API.

In [6]:
from opik.integrations.openai import track_openai
from openai import OpenAI

# Name your project; this appears as the project name in the Opik UI
os.environ["OPIK_PROJECT_NAME"] = "rag-project"

# Wrap the OpenAI-compatible client so each LLM call is logged to Opik
friendli_client = track_openai(OpenAI(
    base_url="https://api.friendli.ai/serverless/v1",
    api_key=os.getenv('FRIENDLI_TOKEN')
))

@opik.track
def call_llm(client, messages):
    # Use the client that was passed in rather than the global one
    response = client.chat.completions.create(
      model="meta-llama-3.3-70b-instruct",
      messages=messages
    )
    return response

In [7]:
user_query = input("What would you like to query for in the BookRecs dataset? ")

response = book_collection.query.near_text(
        query=user_query,
        limit=3
    )

What would you like to query for in the BookRecs dataset? Non fiction books


In [8]:
for book in response.objects:
    print(book.properties['title'])

Non-Fiction
The Puffin Book of Nonsense Verse
Species of Spaces and Other Pieces
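Conceptually, `near_text` turns the query into an embedding and returns the stored objects whose vectors are closest to it. The toy sketch below shows that idea with brute-force cosine similarity over made-up 3-dimensional vectors. It is not how Weaviate runs the search internally (a real cluster uses an approximate index and whatever distance metric the collection is configured with), and the book names and vectors are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, items, k=3):
    """Return the titles of the k items most similar to query_vector.

    `items` is a list of (title, vector) pairs.
    """
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Hypothetical embeddings, invented for this example only.
toy_books = [
    ("Book A", [0.9, 0.1, 0.0]),
    ("Book B", [0.0, 1.0, 0.0]),
    ("Book C", [0.7, 0.3, 0.1]),
    ("Book D", [0.0, 0.0, 1.0]),
]
print(top_k([1.0, 0.0, 0.0], toy_books, k=2))
```

The `limit=3` argument in the notebook plays the same role as `k` here.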


We are using the `@opik.track` decorator and the OpenAI logging integration to automatically log our traces and spans. Learn more in the Opik docs: https://www.comet.com/docs/opik/tracing/log_traces#using-an-integration

In [9]:
@opik.track
def retrieve_context(user_query):
    # Semantic Search
    response = book_collection.query.near_text(
        query=user_query,
        limit=3
    )

    recommended_books = []
    for book in response.objects:
        recommended_books.append(book.properties['title'])
    return recommended_books

In [10]:
@opik.track
def generate_response(user_query, recommended_books):
  prompt = f"""
  You're a helpful assistant, reply to a chatbot message for someone inquiring for
  book recommendations. The user query was {user_query}


  These were the books that were extracted from the vector
  search:

  {recommended_books}
  """

  messages=[
      {
          "role": "user",
          "content": prompt
      }
  ]
  response = call_llm(friendli_client, messages)


  return response.choices[0].message.content
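The prompt above interpolates the Python list directly, so the model sees something like `['Title 1', 'Title 2']`. One possible variation, sketched here with hypothetical helpers `format_book_list` and `build_prompt`, renders the titles as a numbered list and handles the case where retrieval returns nothing. Whether this improves the answers is an assumption you would want to check in your Opik traces.

```python
def format_book_list(titles):
    """Render titles as a numbered list, or a fallback line when the list is empty."""
    if not titles:
        return "(no books were retrieved)"
    return "\n".join(f"{i}. {title}" for i, title in enumerate(titles, start=1))


def build_prompt(user_query, recommended_books):
    """Assemble the recommendation prompt from the query and the retrieved titles."""
    return (
        "You're a helpful assistant. Reply to a chatbot message from someone "
        "asking for book recommendations.\n"
        f"The user query was: {user_query}\n\n"
        "These were the books returned by the vector search:\n"
        f"{format_book_list(recommended_books)}"
    )


print(build_prompt("Non fiction books", ["Non-Fiction", "Species of Spaces and Other Pieces"]))
```

Keeping prompt construction in its own function also makes it easy to compare prompt versions side by side.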

In [11]:
@opik.track(name="rag-example")
def llm_chain(user_query):
    context = retrieve_context(user_query)
    response = generate_response(user_query, context)
    return response

In [12]:
# Use the LLM chain
user_query = input("What types of books are you looking for? ")
result = llm_chain(user_query)
print(result)

What types of books are you looking for? movies


OPIK: Started logging traces to the "rag-project" project at https://www.comet.com/opik/api/v1/session/redirect/projects/?trace_id=0196f008-7e25-7f39-aad8-0746e0c39f38&path=aHR0cHM6Ly93d3cuY29tZXQuY29tL29waWsvYXBpLw==.


It seems like you were looking for movie recommendations, but we found some great book titles that might interest you instead. If you're open to exploring some book options, we found:

1. 'For Keeps' - a romantic novel that might have a cinematic feel to it.
2. 'The Lord of the Rings' - a classic fantasy series that was actually adapted into a movie trilogy, so you might enjoy the book version.
3. 'The Art of Alfred Hitchcock' - a non-fiction book about the legendary film director, which could give you insights into the world of movies.

However, if you'd still like some movie recommendations, please let me know what type of movies you're in the mood for (e.g. action, comedy, horror, etc.) and I'd be happy to provide some suggestions!
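Once traces are flowing, a natural next step is a custom check on the answers. As a minimal sketch, the heuristic below measures what fraction of the retrieved titles the answer actually mentions. It is a plain Python function, not one of Opik's built-in metrics, and `title_coverage` is a hypothetical name. The titles in the usage example are assumed to be what retrieval returned for the run above, since the notebook does not print them.

```python
def title_coverage(answer, retrieved_titles):
    """Fraction of retrieved titles mentioned (case-insensitively) in the answer.

    Returns 0.0 when no titles were retrieved.
    """
    if not retrieved_titles:
        return 0.0
    answer_lower = answer.lower()
    hits = sum(1 for title in retrieved_titles if title.lower() in answer_lower)
    return hits / len(retrieved_titles)


# Assumed retrieval result for the "movies" query; the answer is shortened.
titles = ["For Keeps", "The Lord of the Rings", "The Art of Alfred Hitchcock"]
answer = "You might enjoy 'For Keeps' or 'The Lord of the Rings'."
print(title_coverage(answer, titles))
```

A low score suggests the model ignored part of the retrieved context. It says nothing about titles the model invented, so it complements a hallucination check rather than replacing one.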
